OpenAI – Strawberries and Cream

Project Strawberry unlikely to get the cream.

  • OpenAI has come up with a system to rank its progress towards artificial general intelligence (AGI), but apart from the first two, the levels are so vague that they will be of little use other than helping the company raise more money to spend on compute.
  • This goes hand in hand with Project Strawberry, a new reasoning technology that OpenAI hopes will help move its models along the road towards true machine intelligence as opposed to statistical pattern recognition.
  • Strawberry is yet another mysterious project, like Q* before it, which was also supposed to provide something similar but has yet to see the light of day.
  • Both of these projects contradict Mira Murati’s comments, which strongly imply that what OpenAI has in its labs is only an incremental improvement over what is currently available in the market.
  • OpenAI’s five levels on the journey to superintelligence, or AGI, where machines replace humans at most economic tasks, are as follows:
    • Level 1, Chatbots with conversational language.
    • This is a pretty easy one and it is safe to say that everyone has comfortably met this standard.
    • Level 2, Reasoners which means human-level problem solving.
    • This is also pretty clear and OpenAI (and I suspect many others) would argue that they are very close to achieving this standard.
    • Here, we see the first divergence of opinion with a number of well-known sceptics and RFM arguing that there is no evidence whatsoever that these machines can reason.
    • In fact, the empirical evidence points to the exact opposite and even some of OpenAI’s own data from 2020 strongly supports this view.
    • Our position is that statistically based systems can’t reason and that a different technology will be required to reach Level 2.
    • Level 3, Agents, which are systems that can spend several days taking actions on a user’s behalf.
    • What this means is unclear which leaves it open to spurious claims.
    • Level 4, Innovators, which means AI that can aid innovation.
    • Again, this is vague as the current crop of chatbots can be useful in the process of innovation today.
    • This is because they can make connections in patterns of data that humans might not otherwise have made and help point humans in the right direction.
    • Level 5, Organisations, which I presume means that one could have an entire company providing goods and services entirely staffed by AI.
    • Again, this is pretty vague as, in theory, one could do this today, although the AI-staffed company would probably quickly go bankrupt.
  • To muddy the waters further, these levels are open to review, meaning that the standard could be altered to match reality rather than the other way around.
  • Instead, I would propose a much simpler test: whether the machines understand causality, which can be tested in two ways.
  • First, the ability to reason to a conclusion that is not present in the data set as opposed to randomly making things up.
  • Second, the ability to distinguish a relationship that is causal as opposed to one that is merely correlated.
  • In my opinion, all AI that is based on deep learning (which is statistical in nature) fails both of these tests and I have seen no evidence to the contrary.
  • Furthermore, I do not expect to see real progress beyond OpenAI’s Level 1 until the industry stops focusing on making the models bigger and instead focuses on an architecture that combines different technologies that complement each other’s weaknesses.
  • Of this, there is no real sign and so I continue to think that the stage is set for a reset of expectations from “super intelligence is almost here” to a “new technology that does a few things exceptionally well but not much else”.
  • This reset will most hurt those that have multibillion-dollar valuations but very little to show in terms of revenues or profits.
  • Nvidia is not one of these, as AI is now a $100bn business for it with 80% gross margins, but it is firmly on the AI rollercoaster and there is no way it can get off.
  • Hence, it will correct when the reset comes but much less than the likes of OpenAI, Anthropic, Mistral and so on.
  • I continue to prefer the lateral plays of inference at the edge or nuclear power (to run data centres, EVs etc.) where I have positions in both.
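The second causality test above, distinguishing a causal relationship from a merely correlated one, can be illustrated with a toy sketch. All variable names and numbers here are hypothetical: two quantities driven by a common confounder (temperature) end up strongly correlated even though neither causes the other, and the correlation coefficient, which is all a purely statistical learner sees, cannot tell the difference.

```python
import random

# Hypothetical illustration: two variables driven by a common cause
# (temperature) are strongly correlated, yet neither causes the other.
random.seed(0)
temperature = [random.uniform(0, 30) for _ in range(1000)]
ice_cream_sales = [2.0 * t + random.gauss(0, 3) for t in temperature]
drownings = [0.5 * t + random.gauss(0, 1) for t in temperature]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / n) ** 0.5
    return cov / (sx * sy)

r = pearson(ice_cream_sales, drownings)
print(f"correlation between sales and drownings: {r:.2f}")
# The correlation is high, but banning ice cream would not reduce
# drownings: the relationship is confounded, not causal.
```

A system that understands causality would recognise that intervening on one variable leaves the other unchanged; a system that only fits the statistics of the data set cannot make that distinction.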

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the global technology sector.