Nvidia AI Foundry – CUDA II

Game changer that will be largely ignored.

  • Nvidia AI Foundry is a consequential announcement as it looks like it has been designed to maintain Nvidia’s dominance in AI when CUDA is no longer the control point in the AI ecosystem.
  • However, I am pretty certain that the timing of the announcement and the complexity of the topic will ensure that no one pays very much attention to it.
  • The best way to understand Nvidia AI Foundry is to compare it with TSMC (as Nvidia does in its materials).
  • TSMC manufactures chips for fabless (no factory) semiconductor companies that design their own chips and send those designs to TSMC, which manufactures them on their behalf.
  • Nvidia AI Foundry is directly analogous to TSMC in that it provides tools and services for AI developers to create their own models, which are then packaged into NIMs (Nvidia Inference Microservices) optimised for the inference architecture that the developer wishes to run.
  • A NIM is a customised model (or models) inside a wrapper containing a standardised API, meaning that: 1) multiple NIMs can easily be plugged together to create a specific service, and 2) they are optimised to run on Nvidia inference hardware.
  • The NIMs are “manufactured” in a NIM Factory which is the equivalent of one of TSMC’s fabs and then the finished NIM is delivered to the developer to use in the service it wishes to offer to customers.
  • The key to this is that while NIMs will run on non-Nvidia hardware, they will run so much faster and so much more efficiently on Nvidia hardware that there is no point in using anything else.
  • This means that if NIMs become the standard way to create and deploy AI services (generative and other), then Nvidia will have replaced the control point of CUDA with the control point of NIMs as nobody using a NIM is going to use non-Nvidia hardware.
  • Hence, with NIMs as an industry standard, Nvidia will hold onto its 85% market share in training and inference as well as its 75%+ gross margins.
  • Nvidia AI Foundry is already off to a strong start with heavyweight endorsements from Meta Platforms and Accenture, with Meta Platforms timing the release of its new model, Llama 3.1, to coincide with the Nvidia AI Foundry launch.
  • Accenture has been tinkering with NIMs as an early partner but will now use the AI Foundry to create NIMs on behalf of its clients.
  • Given the size of Accenture, this is a big endorsement, and Accenture's CEO, Julie Sweet, was quoted directly in the announcement.
  • Nvidia AI Foundry is launching with Meta's Llama 3.1 update, which is available now in 8bn and 70bn parameter sizes and, for the first time, 405bn.
  • What is new here is that this is the first time that Meta has made its flagship model available as open source and, if Meta is to be believed, it ranks first in 7 out of 15 evaluations when compared to Nvidia's Nemotron 4, GPT-4, GPT-4o and Claude 3.5 Sonnet.
  • This is relevant because none of the other flagship models (other than Nemotron 4) is available as open source, which, combined with being available on AI Foundry from day one, should provide a draw for developers.
  • Llama 3.1 405bn's initial use case in the AI Foundry is synthetic data generation, but I suspect that those who wish to compete with OpenAI or Anthropic directly will quickly create NIMs that do precisely that.
  • This is really bad news for OpenAI, Anthropic, Google and so on as all of them are hoping to make money from something Meta has just given away for free.
  • This could potentially accelerate the race to the bottom in terms of generative AI service pricing which is what I think will trigger the reset from AI hype to reality.
  • This announcement also makes Nvidia’s strategy much clearer which is to remain the go-to place to develop AI services even when developers are no longer so focused on CUDA.
  • This is what RFM Research refers to as AI Ecosystem 3.0, which is way ahead of where we are today (AI Ecosystem 1.0), where the control point is the silicon development platform.
  • Nvidia’s current dominance in this space provides it with an opportunity to migrate loyalty and preference for CUDA into loyalty and preference for the AI Foundry, which is why I refer to it as CUDA II.
  • Ironically, the announcement was made on the day that Nvidia’s shares tanked 8% in line with the rest of the technology sector, which underlines that the level of attention to, and understanding of, what AI Foundry implies is pretty much zero.
  • Hence, if there is a big correction in AI-related stocks as reality bursts the hype bubble and AI Foundry really takes off, then there will be another opportunity to invest in Nvidia for those who missed it the first time around (like me).
  • For now, I am sitting tight in my AI adjacencies of inference at the edge and nuclear power.
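
The composability point above (a standardised API wrapper letting NIMs be plugged together) can be sketched in code. This is a hypothetical illustration, not taken from Nvidia's documentation: the model names, endpoint shape and chaining helper below are assumptions, modelled on the widely used chat-completions request format.

```python
import json

# Hypothetical sketch: every NIM exposes the same chat-completions-style
# interface, so any NIM can be addressed with an identically shaped request
# regardless of the model inside the wrapper. Model names are illustrative.

def build_nim_request(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Build the JSON body for a chat-completions call to a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def chain_nims(prompt: str, models: list[str]) -> list[dict]:
    """Illustrate why a standardised API makes NIMs composable: the same
    request shape works for every model in the chain."""
    requests = []
    for model in models:
        requests.append(build_nim_request(model, prompt))
        # In a real deployment, `prompt` would be replaced here by the
        # previous NIM's response text before calling the next NIM.
    return requests

reqs = chain_nims(
    "Summarise NIMs in one line.",
    ["meta/llama-3.1-405b-instruct", "nvidia/nemotron-4-340b"],
)
print(json.dumps(reqs[0], indent=2))
```

The point of the sketch is that the service-building work moves to the level of wiring identically shaped requests together, which is exactly where a control point forms if those requests only run economically on Nvidia hardware.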

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the Global Technology sector.