I haven’t checked on Roblox recently, but afaik it doesn’t really allow complete creative freedom or the ability to have a picture and say “make the world look like this, and make the character textures match the vibe” and have it happen. Don’t they still have a unified world experience or can you really customize things that deeply now?
Can you make a basically indistinguishable copy of other games in Roblox? If so, that’s pretty cool, even without AI integration.
Roblox can't beat Google in AI. Roblox has network effects with users, but on an old school tech platform where users can't magic things into existence.
I've seen Roblox's creative tools, even their GenAI tools, but they're bolted on. It's the steam-powered horse problem.
It’s also possible for LLMs to be inevitable, generate massive amounts of wealth and still be mostly fluff in terms of objective human progress.
The major change from my perspective is new consumer behavior: people simply enjoy talking to and building with LLMs. This fact alone is generating a lot of (1) new spend and (2) content to consume.
The most disappointing outcome of the LLM era would be increasing the amount of fake, meaningless busywork humans have to do sifting through LLM-generated noise just to find signal. And indeed there are probably great products to be built that help you do just that; and there is probably a lot of great signal to be found! But the motion-to-progress ratio concerns me.
For example, I love Cursor. Especially for boilerplating. But SOTA models with tons of guidance still can't reliably implement features in my larger codebases within the timeframe it would take me to do it myself. Test-time compute and reasoning make things even slower.
> For example, I love Cursor. Especially for boilerplating. But SOTA models with tons of guidance still can't reliably implement features in my larger codebases within the timeframe it would take me to do it myself. Test-time compute and reasoning make things even slower.
Importantly, it also takes you guiding it to complete the task, meaning you still need to pay both the human and the LLM. So it's slower and a bit more expensive.
I am also not convinced that AI working on complex programming tasks can be guided by less-skilled devs, meaning you still need to pay the skilled dev.
In my experience so far, the cost analysis doesn't work out for more complex application development. Even if the LLM were free, it is often wasting the skilled dev's time.
All these metrics will change over the years, and maybe the math works out eventually, or in specific circumstances, and I foresee LLMs assisting in development well into the future.
But at this stage, I am not seeing the cataclysmic wholesale replacement of humans in the workforce that some are predicting.
I think it's likely we learn to develop healthier relationships with these technologies. The timeframe? I'm not sure. May take generations. May happen quicker than we think.
It's clear to me that language models are a net accelerant. But if they make the average person more "loquacious" (first word that came to mind, but also lol) then the signal for raw intellect will change over time.
Nobody wants to be in a relationship with a language model. But language models may be able to help people who aren't otherwise equipped to handle major life changes and setbacks! So it's a tool - if you know how to use it.
Let's use a real-life example: relationship advice. Over time I would imagine that "ChatGPT-guided relationships" will fall into two categories: "copy-and-pasters", who are just adding a layer of complexity to communication that was subpar to begin with ("I just copied what ChatGPT said"), and "accelerators", who use ChatGPT to analyze their own and their partners' motivations to find better solutions to common problems.
It still requires a brain and empathy to make the correct decisions about the latter. The former will always end in heartbreak. I have faith that people will figure this out.
>Nobody wants to be in a relationship with a language model.
I'm not sure about that. I don't have first- or second-hand experience with this, but I've been hearing about a lot of cases of people really getting into a sort of relationship with an AI, and I can understand a bit of the appeal. You can "have someone" who's entirely non-judgmental, who's always there for you when you want to chat about your stuff, and isn't ever making demands of you. It's definitely nothing close to a real relationship, but I do think it's objectively better than the worst of human relationships, and is probably better for your psyche than being lonely.
For better or for worse, I imagine that we'll see rapid growth in human-AI relationships over the coming decade, driven by improvements in memory and long-term planning (and possibly robotic bodies) on the one hand, and a growth of the loneliness epidemic on the other.
I am building a company in this space, so can hopefully give some insight [0].
The issue right now is that both (1) function calling and (2) codegen just aren't very good. The hype train far exceeds capabilities. Great demos like fetching some Stripe customers, generating an email, or getting the weather work flawlessly. But anything more sophisticated goes off the rails very quickly. It's difficult to get models to reliably call functions with the right parameters, to set up multi-step workflows, and more.
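To make "call functions with the right parameters" concrete, here's a minimal sketch of an OpenAI-style tool definition (the function name and the Stripe-ish scenario are illustrative, not from a real integration). Even with a schema this explicit, models will still sometimes pass malformed arguments or pick the wrong tool:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# JSON Schema describing the one function the model is allowed to call.
tools = [{
    "type": "function",
    "function": {
        "name": "get_customer",
        "description": "Fetch a customer record by id.",
        "parameters": {
            "type": "object",
            "properties": {
                "customer_id": {
                    "type": "string",
                    "description": "Customer id, e.g. cus_123",
                },
            },
            "required": ["customer_id"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # any tool-capable model; just an example
    messages=[{"role": "user", "content": "Look up customer cus_123"}],
    tools=tools,
)

# The model *should* return exactly one tool call with valid arguments;
# in practice you have to validate this before executing anything.
print(resp.choices[0].message.tool_calls)
```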
Add codegen into the mix and it's hairier. You need a deployment and testing apparatus to make sure the code actually works... and then what is it doing? Does it need secret keys to make web requests to other services? Should we rely on functions for those?
The price / performance curve is a consideration, too. Good models are slow and expensive, which means their utility has to be high enough to justify what you charge the customer to cover the costs, but they also take a lot longer to respond to requests, which reduces the perceived value. Codegen is even slower in this case. So there's a lot of alpha in finding the right "mixture of models" that can plan and execute functions quickly and accurately.
For example, OpenAI's GPT-4.1-nano is the fastest function calling model on the market. But it routinely tries to execute the same function twice in parallel. So if you combine it with another fast model, like Gemini Flash, you can reduce error rates - e.g. 4.1-nano does planning, Flash executes. But this is non-obvious to anybody building these systems until they've tried and failed countless times.
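The shape of that split is roughly the sketch below. `call_model` is a hypothetical wrapper over whichever provider SDKs you use; the point is the division of labor, not the exact APIs:

```python
import json

def call_model(model: str, prompt: str) -> str:
    # Hypothetical stand-in: in a real system this routes to the provider
    # SDK for `model` (OpenAI, Gemini, etc.) and returns the text response.
    return f"[{model}] {prompt[:60]}"

def run_task(task: str, tools: list[dict]) -> list[str]:
    # Step 1: a fast, cheap model produces the plan (which tools, what order).
    plan = call_model(
        "planner-fast-model",
        f"Tools available: {json.dumps(tools)}\nPlan the calls for: {task}",
    )
    # Step 2: a *different* fast model executes each step. Splitting the two
    # roles reduces correlated failure modes, like the duplicate parallel
    # calls mentioned above.
    return [
        call_model("executor-fast-model", step)
        for step in plan.splitlines()
        if step.strip()
    ]
```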
I hope to see capabilities improve and costs and latency trend downwards, but what you're suggesting isn't quite feasible yet. That said, I (and many others) am interested in making it happen!
Well in the meantime we could just have the LLM shoot Jira tickets at human developers to build out new tools it requires ASAP? And until it's done, have a placeholder message returned to the client? Could be a good way to keep developers working constantly. And eventually, when the tech is good, you replace the human devs with LLMs.
I think this is a cop-out. OpenAI literally published a better integration spec two years ago, served at `/.well-known/ai-plugin.json`. It just gave a summary of an OpenAPI spec, which ChatGPT could consume and then run your functions.
It was simple and elegant, the timing was just off. So the first shot at this problem actually looked quite good, and we're currently in a regression.
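For reference, the manifest looked roughly like this (reconstructed from memory, so treat the field names as approximate; shown as a Python dict for readability, though the real file was JSON served at that well-known path):

```python
# Approximate shape of OpenAI's old plugin manifest. The whole integration
# was this manifest plus a standard OpenAPI spec.
ai_plugin = {
    "schema_version": "v1",
    "name_for_human": "Todo List",
    "name_for_model": "todo",
    "description_for_human": "Manage your todo list.",
    "description_for_model": "Plugin for creating, reading and deleting a user's todos.",
    "auth": {"type": "none"},
    # Pointer to the OpenAPI spec; ChatGPT consumed a summary of it and
    # could then call your endpoints directly.
    "api": {"type": "openapi", "url": "https://example.com/openapi.yaml"},
    "logo_url": "https://example.com/logo.png",
    "contact_email": "support@example.com",
    "legal_info_url": "https://example.com/legal",
}
```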
MCP as a spec is really promising: a universal way to connect LLMs to tools. But in practice you hit a lot of edge cases really quickly. To name a few: auth, streaming of tool responses, custom instructions per tool, and verifying tool authenticity (is the server I'm using trustworthy?). It's still not entirely clear (*for remote servers*) to me what you can do with MCP that you can't do with just a REST API, the latter being a much more straightforward integration path.
If other vendors do adopt MCP (OpenAI and Gemini have promised to), the problem they're going to run into very quickly is that they want to do things (provide UI elements, interaction layers) that go beyond the MCP spec. And a huge number of MCP server integrations will just be lackluster at best; perhaps I'm wrong -- but if I'm { OpenAI, Anthropic, Google } I don't want a consumer installing Bob's Homegrown Stripe Integration from a link they found on 10 Best MCP Integrations, sharing their secret key, and getting (A) a broken experience that doesn't match the brand or, worse yet, (B) credentials stolen.
I anticipate alignment issues as well. Anthropic is building MCP to make the Anthropic experience great. But Anthropic's traffic is fractional compared to ChatGPT - 20M monthly vs 400M weekly. Gemini claims 350M monthly. The incentive structure is all out of whack; how long are OpenAI and Google going to let an Anthropic team (or even a committee?) drive an integration spec?
Consumers have barely interacted with these things yet. They did once, with ChatGPT Plugins, and it failed. It doesn't entirely make sense to me that OpenAI is okay doing this again while letting another company lead the charge and define the limitations of the end-user experience (because that's what the spec ultimately does: dictates how prompts and function responses are transported), when the issue wasn't the engineering effort (ChatGPT's integration model was objectively more elegant) but a consumer experience issue.
The optimistic take on this is the community is strong and motivated enough to solve these problems as an independent group, and the traction is certainly there. I am interested to see how it all plays out!
OpenAI takes the back seat and waits until something stable/usable comes out of it and gains traction, then takes it over. Old classic playbook: let others make the mistakes and profit from it…
> It's still not entirely clear (for remote servers) to me what you can do with MCP that you can't do with just a REST API,
Nothing, as far as I can tell.
> the latter being a much more straightforward integration path.
The (very) important difference is that the MCP protocol has built in method discovery. You don't have to 'teach' your LLM about what REST endpoints are available and what they do. It's built into the protocol. You write code, then the LLM automatically knows what it does and how to work with it, because you followed the MCP protocol. It's quite powerful in that regard.
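To illustrate, here's a minimal server sketch using what I understand to be the official MCP Python SDK's FastMCP helper (the tool itself is a toy; the point is that the docstring and type hints become the discoverable description and input schema):

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

@mcp.tool()
def get_invoice_total(invoice_id: str) -> float:
    """Return the total for an invoice.

    Any MCP client can discover this tool, its description, and its
    input schema via the protocol's tools/list method; no separate
    API docs need to be fed to the LLM.
    """
    return 42.0  # stand-in for real data access

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```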
But otherwise, yeah, it's not anything particularly special. In the same way that all of the API design formats prior to REST could do everything a REST API can do.
In the grand scheme of things I think we are still very early. MCP might be the thing, which is why I'd rather try and contribute if I can; it does have a grassroots movement I haven't seen in a while. But the wonderful thing about the market is that incentives, e.g. good customer experiences that people pay for, will probably win. This means that MCP, if it remains the focal point for this sort of work, will become a lot better regardless of whether early pokes and prods by folks like us are successful. :)
I've been programming since I was eight, but truly fell in love with biology in 12th grade chemistry: the first introduction to organic chemistry and biochemistry. It was the first time I truly started grokking the application of systems-level thinking to the biological world; how do trees "know" to turn red in the autumn? How do fetuses assemble themselves from two cells?
I decided to pursue a double major in biochemistry and evolutionary biology, and it was one of the best decisions I've made in my life. The perspective you gain from understanding all life in terms of both networks and population dynamics of atoms, molecules, cells, tissues, organisms and populations -- and how every layer reflects the layer both underneath and above it in a fractal pattern -- is mind-expanding in a way I think you just don't and can't get designing software systems alone.
I work as a software engineer / founder now, but always reflect wistfully on my time as a biologist. I hope to get back to it some day in some way, and think what the Arc Institute team is doing is inspirational [0].
Has anyone seen content that used this multiscale networking and population dynamics as an instructional approach?
For a small example, there was a Princeton(?) coffee-table book which used "everyday" examples to illustrate cell/embryonic organizational techniques - like birds equally spacing themselves along a wire. Or compartmentalization, as a cross-cutting theme from molecules to ecosystems.
I have an odd hobby interest in exploring what science education content might look like if incentives were vastly different, and massive collaborative domain expertise was allocated to crafting an insightful, powerful, rough-quantitative, richly interwoven tapestry.