Amazon & AI – A Question of Time

No one is going to dislodge Nvidia yet.

  • Amazon is readying the next version of its in-house AI training and inference chips, but until CUDA is no longer the control point of AI training and Amazon iterates much faster, these chips are going nowhere (except in-house and with Anthropic).
  • Trainium is Amazon’s in-house data centre chip, designed to train large language models of more than 100bn parameters, and was made generally available in late 2022.
  • Inferentia is Trainium’s sibling and is optimised for inference, where Amazon claims it is 40% cheaper to execute requests but does not say compared to what.
  • This is where the whole benchmark game gets very murky and, in my experience, it’s pretty easy to get a benchmark to say anything you want it to, so I always treat them with a pinch of salt.
  • Amazon is working on Trainium 2, which I expect will be launched at AWS re:Invent with a whole host of benchmarks showcasing why it’s better, faster and cheaper, but I very much doubt that it will make much difference.
  • Dave Brown, vice-president of compute and networking services at AWS, gives the game away when he says to the FT “We want to be absolutely the best place to run Nvidia, but at the same time we think it’s healthy to have an alternative”.
  • In plain English this means: customers, please don’t worry we will still give you as much Nvidia as you want, but please consider looking at our in-house alternative.
  • This is indicative of Nvidia’s market power: all of its biggest customers are trying to reduce their dependence on it for training and inference and, so far, they are having very little success.
  • What is more, it looks to me like they will not succeed until the nature of the market changes and the control point migrates away from silicon development, further up the technology stack.
  • This is beginning to happen, but it is quite a slow process.
  • For example, we are seeing developers beginning to move towards developing their services on top of a foundation model and using the tools of the model’s creator.
  • If this becomes the standard way to develop AI services in the future, then customers will care much less about the silicon that sits at the bottom of the stack as the foundation model’s owner will have taken care of that.
  • These companies have a strong incentive to ensure that their models are optimised for multiple silicon providers as then they will win some pricing power back from Nvidia and reduce their cost of doing business.
  • Until this becomes the mainstream, I can’t see anyone really making a dent in Nvidia’s position as its platform is already the most mature with the widest range of tools and has become the industry standard.
  • Furthermore, its product cadence means that it is always at least one generation ahead of everyone else, which allows it to credibly claim that it is cheaper to use Nvidia even while it earns 75%+ gross margins.
  • The net result is that Amazon will be able to use Trainium 2 for its in-house models, but only Anthropic (in which Amazon is a big investor and likely acquirer) is likely to make the effort to ensure its models are fully optimised for Trainium.
  • It is not until silicon development matters less that Nvidia’s grip may weaken, but it has seen this coming, which is why it is already making a play further up the AI technology stack.
  • Even if this does not work, it will be some time before Nvidia’s grip is weakened, meaning that in the short term nothing is going to change.
  • Hence, I continue to think that Nvidia is the only way to play the generative AI craze directly, if I were forced to go that way.
  • However, I continue to prefer the adjacencies of inference at the edge (Qualcomm) and nuclear power as their valuations are far more reasonable and both will still perform well even if generative AI fails to live up to expectations.
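The claim above that Nvidia can be cheaper to use even at 75%+ gross margins comes down to performance per dollar: a chip that costs more can still be cheaper per unit of training work if it is sufficiently faster. A minimal sketch of that arithmetic, using entirely hypothetical numbers (none of these are real Nvidia or Trainium prices or speeds):

```python
# Hypothetical illustration of the performance-per-dollar argument.
# All figures are invented for this sketch, not real chip data.

def cost_per_unit_of_work(chip_price_usd: float, relative_speed: float) -> float:
    """Cost to complete one fixed unit of training work.

    relative_speed is throughput relative to a baseline chip (1.0 = baseline).
    """
    return chip_price_usd / relative_speed

# Baseline chip: cheaper to buy, but slower (hypothetical numbers).
baseline = cost_per_unit_of_work(chip_price_usd=10_000, relative_speed=1.0)

# Premium chip: 3x the price but 4x the throughput (hypothetical numbers).
premium = cost_per_unit_of_work(chip_price_usd=30_000, relative_speed=4.0)

print(baseline)  # 10000.0
print(premium)   # 7500.0 -> cheaper per unit of work despite the higher price
```

The same logic explains why a one-generation lead matters: the margin a vendor charges can be more than offset by the throughput gap over the alternative.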

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the Global Technology sector.