The race to the bottom is beginning.
- OpenAI’s smaller GPT-4o Mini promises to kick off a race to the bottom on pricing. It is only slightly inferior to GPT-4o but is 60% cheaper, which is likely to trigger a spiral of price cuts as users realise that there is not much separating these models from one another in terms of performance.
- GPT-4o Mini costs 60% less in terms of $ per token, and that price can be cut in half again if the user’s enquiry is not time-sensitive (see the short calculation after this list).
- The model is, in all likelihood, much smaller than its big brother (I would estimate around half the size), but on the benchmarks it is only slightly inferior to its much larger peers.
- To my mind, this is a strong indication that the scaling laws that keep everyone building bigger and bigger models that use more compute and more power have almost reached the end of their useful life.
- There is plenty of evidence that suggests that the best way to improve AI from here is not to build bigger but to build smarter or to train smaller models with more data.
- At the moment, this view is very much in the minority as the mad rush to superintelligent machines has everyone spending tens of billions on data centres (AI Factories) and begging Nvidia to sell them some silicon.
- The problem is that I see no evidence whatsoever that this approach will produce superintelligent machines, meaning that a reset of expectations is needed.
- I have long argued that this will begin when the price of using these models begins to fall and the launch of GPT-4o Mini is another indication that this is starting to happen.
- Users of the ChatGPT Enterprise service have also informed me that they are seeing price cuts, indicating that there is now real competitive pressure in the market for these services.
- Combine this with the fact that the differences in performance from one foundation model to another are no longer very large, and one can see how these models could rapidly become a commodity, triggering a race to the bottom on price.
- While this does not mean very much for the big players who have almost infinite resources, it will be existential for start-ups who have raised money from VCs and will inevitably need more.
- The problem they will face is that their initial valuations were based on a pricing level that is no longer realistic, meaning that break-even will be pushed out and long-term revenue assumptions will be cut.
- This will lead to big cuts to valuations and, just like in autonomous driving, this will be the catalyst that triggers the reset.
- I don’t think that the reset in generative AI will be anything like as bad as it was in autonomous driving because generative AI has good products that generate real revenues now.
- It is just that these products will never produce superintelligence and they won’t generate as much revenue as the Kool-Aid drinkers think that they will.
- With lower or negative returns on investment, the flow of capital will slow and become more discerning, meaning that only the best projects will get funded.
- This will lead to heavy consolidation where OpenAI gets acquired by Microsoft, Anthropic by Amazon and so on.
- This will hurt Nvidia, but given that it is pretty much the only one with real profits in this sector at the moment, it will probably hurt it much less than anyone else.
- Hence, if I had to own anything in generative AI, I would own Nvidia, but I continue to be positioned more laterally, where inference at the edge and nuclear power are two great adjacencies.
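For those who like to see the arithmetic, the short sketch below works through the compounded discount referenced in the pricing bullet above. The baseline price is a hypothetical placeholder rather than OpenAI’s published rate; the takeaway is simply that a 60% cut followed by a further 50% off for non-time-sensitive requests leaves roughly 20% of the original per-token price.

```python
# Back-of-the-envelope sketch of the compounded discount described above.
# The baseline price per 1M tokens is an assumed, illustrative figure only.

BASELINE_PER_1M_TOKENS = 5.00   # hypothetical $ per 1M tokens for the larger model
MINI_DISCOUNT = 0.60            # Mini priced 60% below the larger model (per the post)
BATCH_DISCOUNT = 0.50           # further 50% off when the enquiry is not time-sensitive

mini_price = BASELINE_PER_1M_TOKENS * (1 - MINI_DISCOUNT)
batched_price = mini_price * (1 - BATCH_DISCOUNT)

print(f"Larger model:             ${BASELINE_PER_1M_TOKENS:.2f} per 1M tokens")
print(f"Mini (60% cheaper):       ${mini_price:.2f} per 1M tokens")
print(f"Mini, non-time-sensitive: ${batched_price:.2f} per 1M tokens "
      f"({batched_price / BASELINE_PER_1M_TOKENS:.0%} of the original)")
```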
Blog Comments
Andrew
July 19, 2024 at 10:16 pm
Disclosure: I am employed by Microsoft, but I do not work directly on AI products, so I am asking this in a personal capacity. Big fan of the blog, and I have seen you reference a number of times that you believe it is inevitable that Microsoft will acquire OpenAI. Does the recent news from the CMA raise any challenges in your mind?