Mar 2, 2025
I think this is a strong take on the consequences of the recent RL breakthroughs from Alexander Doria:
I think it’s time to call it: the model is the product.
All current factors in research and market development push in this direction.
Generalist scaling is stalling. This was the whole message behind the release of GPT-4.5: capabilities are growing linearly while compute costs are on a geometric curve. Even with all the efficiency gains in training and infrastructure of the past two years, OpenAI can’t deploy this giant model at remotely affordable pricing.

Opinionated training is working much better than expected. The combination of reinforcement learning and reasoning means that models are suddenly learning tasks. It’s not machine learning, it’s not a base model either, it’s a secret third thing. It’s even tiny models getting suddenly scary good at math. It’s coding models no longer just generating code but managing an entire code base by themselves. It’s Claude playing Pokémon with very poor contextual information and no dedicated training.

Inference costs are in free fall. The recent optimizations from DeepSeek mean that all the available GPUs could cover a demand of 10k tokens per day from a frontier model for… the entire earth population. There is nowhere near this level of demand. The economics of selling tokens no longer works for model providers: they have to move higher up in the value chain.

This is also an uncomfortable direction. All investors have been betting on the application layer. In the next stage of AI evolution, the application layer is likely to be the first to be automated and disrupted.
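To get a feel for that "10k tokens per day for the entire earth population" figure, here's a rough back-of-envelope calculation. The population and per-person numbers come from the quote above; the per-GPU throughput is my own illustrative assumption, not a figure from the article.

```python
# Back-of-envelope sketch of the "10k tokens/day for the whole planet" claim.
# The per-GPU throughput below is an illustrative assumption, not a quoted figure.

world_population = 8_000_000_000          # ~8 billion people
tokens_per_person_per_day = 10_000        # the figure quoted above

total_tokens_per_day = world_population * tokens_per_person_per_day
print(f"Demand: {total_tokens_per_day:.2e} tokens/day")   # 8.00e+13

# Assume a single inference GPU sustains ~2,000 tokens/second on an
# optimized frontier-class model (a rough, hypothetical throughput).
tokens_per_gpu_per_day = 2_000 * 86_400   # ~1.7e8 tokens/day per GPU

gpus_needed = total_tokens_per_day / tokens_per_gpu_per_day
print(f"GPUs needed: {gpus_needed:,.0f}")  # ~460,000 GPUs
```

Even with conservative throughput assumptions, serving that hypothetical planet-wide demand lands in the low hundreds of thousands of GPUs, well within the installed base, which is why selling raw tokens alone looks like a shrinking business.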
Deep research from OpenAI is the first example of this trend:
DeepResearch is not a standard LLM, nor a standard chatbot. It’s a new form of research language model, explicitly designed to perform search tasks end to end. The difference is immediately striking to everyone using it seriously: the model generates lengthy reports with a consistent structure and an underlying source analysis process.
This is the actual meaning of building agents, contrary to what most startups are doing today:
What most agent startups are currently building is not agents, it’s workflows, that is “systems where LLMs and tools are orchestrated through predefined code paths.” Workflows may still bring some value, especially for vertical adaptations. Yet, to anyone currently working in the big labs, it’s strikingly obvious that all major progress in autonomous systems will come through redesigning the models in the first place.
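A minimal sketch of that distinction, assuming a hypothetical call_model helper standing in for any LLM API: in the workflow the code path is fixed in advance, while in the agent loop the model chooses each next action and the surrounding code merely executes it.

```python
# Illustrative sketch only; `call_model` is a hypothetical stand-in for an
# LLM API call, not a real library function.

def call_model(prompt: str) -> str:
    """Placeholder for a chat-completion call to some hosted model."""
    raise NotImplementedError

# A "workflow": the LLM is slotted into a predefined code path.
def summarize_ticket_workflow(ticket: str) -> str:
    category = call_model(f"Classify this support ticket: {ticket}")
    draft = call_model(f"Draft a reply for a {category} ticket: {ticket}")
    return call_model(f"Polish this reply for tone: {draft}")

# An "agent": the model decides which step to take next, and the
# surrounding code is just a thin loop executing its chosen actions.
def resolve_ticket_agent(ticket: str, tools: dict, max_steps: int = 10) -> str:
    context = f"Resolve this ticket, using tools when needed: {ticket}"
    for _ in range(max_steps):
        action = call_model(context)              # model picks the next action
        if action.startswith("FINAL:"):
            return action.removeprefix("FINAL:")
        tool_name, _, arg = action.partition(" ")
        context += f"\n{action}\n{tools[tool_name](arg)}"  # feed result back in
    return "Gave up after max_steps"
```

The orchestration logic in the first function is the product of the startup; in the second it has mostly migrated into the model's training.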
We had a very concrete demonstration of this with the release of Claude 3.7, a model primarily trained with complex code use cases in mind.
This means the value of agents is captured in the model layer, not the application layer:
What this all means in practice: displacing complexity. Training anticipates a wide range of actions and edge cases, so that deployment becomes much simpler. But in this process most of the value is now created and, likely in the end, captured by the model trainer.
There’s a prediction worth following towards the bottom of the post. It will tell us which of these modes we’re running in:
Lately, there has been some palpable irritation at the lack of “vertical RL” in the current Silicon Valley startup landscape.