Nvidia GTC 2024 Day 1 – Entrenchment

Nvidia knows exactly what it is doing.

  • Nvidia once again distanced itself from its rivals, revealing yet more hardware advances and new features for its platforms that aim to deepen its entrenchment, which will support its revenues and gross margins for some time yet.
  • The highlights were:
    • First, Blackwell: which replaces Hopper and once again puts Nvidia at the cutting edge of AI training and, if Nvidia gets its way, inference in the cloud as well.
    • Blackwell is two separate dies made on TSMC’s 4nm process that are joined together in such a way that they behave as a single piece of silicon.
    • This, combined with the new NVLink Switch chip, allows every GPU to communicate with every other GPU in real time, meaning that they can work as a single system.
    • Blackwell has been designed for trillion-parameter models, which is effectively a leak of OpenAI’s and everyone else’s latest model architectures (see below).
    • This is significant because the biggest problem with multi-GPU training systems is that they are limited by the speed at which data moves on and off the individual chips and by how well the GPUs can work together (a toy sketch of this all-to-all gradient traffic follows the list below).
    • This, combined with further iterations of the InfiniBand technology it acquired from Mellanox, re-establishes the lead it took two years ago when it launched Hopper.
    • Rivals have made some progress in closing this gap only to have Nvidia open it up again.
    • The take-home message on Blackwell vs Hopper is that it can do the same job as Hopper with ¼ of the GPUs and ¼ of the power consumption.
    • If one assumes that the price of Blackwell will be double that of Hopper, then the cost to the client of training large language models still drops by 50% compared to Hopper (see the back-of-the-envelope arithmetic after the list below).
    • This, combined with the complete lock that Nvidia has on the development platform, is how revenues grow, gross margins hold steady and a lot of cash is generated.
    • Second, LLM Architectures: the Blackwell specification has given the game away on the latest generation of LLMs, where everyone is playing their cards very close to their chest in terms of what the architectures are, how big the models are and how they are trained.
    • If everyone had stopped growing the size of their models, then the Blackwell architecture would look very different; instead, Blackwell has been specifically designed for trillion-parameter models.
    • Nvidia cited GPT-4 as being 1.8tn parameters (roughly 10x the size of GPT-3’s 175bn), but I can find no reference anywhere to a disclosure by OpenAI of this model’s size, so I suspect that Nvidia is using the current market estimate.
    • However, there is no way that Nvidia would have designed Blackwell the way that it has unless its customers were asking for systems that can support models with trillions of parameters, meaning that the industry is still following Kaplan’s bigger-is-better approach.
    • There is some evidence to suggest that this is not always the case, but for now, the stage is set for massive models to become even more massive over the next 12 months at least.
    • This is excellent news for Nvidia because bigger means more compute, which means more chips at higher prices and fat 70% gross margins.
    • Third, inference in the cloud: following on from its Q4 results, inference again played an increasingly important role.
    • Nvidia’s partners are talking about a compute requirement for inference 4-5x the size of that for training, and inference already accounts for 50% of the capacity that Nvidia has deployed.
    • Blackwell has also been designed with inference in mind and as such offers a 30x improvement over Hopper using a combination of 4-bit processing and the new NVLink Switch chip (a rough sketch of what 4-bit quantisation involves follows the list below).
    • This is a very large improvement and a reflection of my view that 2024 is the year of inference, whereas 2023 was all about training.
    • However, it does not change RFM’s view that the real market for inference will be at the edge, which will remain an economic no-brainer for any provider of a generative AI service at scale, regardless of what Nvidia does in the cloud (see here).
    • Fourth, Nvidia Inference Microservices (NIMs): which reinforce Nvidia’s move towards inference and aim to preserve its appeal should the focus in AI move away from silicon.
    • NIMs are pre-packaged, pre-trained generative AI models that can be deployed anywhere in the ecosystem where CUDA is being used.
    • I thought that Nvidia might launch a foundation model for developers to push back against the threat of losing its control point as developers move from developing on silicon to developing on foundation models.
    • This approach is different but is likely to have a similar effect if it proves to be very popular.
    • Nvidia will create a library of NIMs with specific pre-trained capabilities such as chip design, medical expertise, drug discovery and so on that customers can license and use to improve their business performance.
    • If these prove to be popular, they could help to shore up the stickiness of Nvidia silicon if this comes under pressure from developers moving towards foundation models as their preferred development platform.
    • Fifth, partners: Nvidia is moving to lock up the AI industry before any of its potential rivals get a look-in.
    • This is classic first-mover advantage with the idea being that by the time competitors have equivalent products, everyone is already on Nvidia’s platforms and can’t be bothered to switch.
    • Critically, Nvidia announced extensions and enlargements to its partnerships with Microsoft, Google and AWS which are significant as all three already have competing silicon or are actively developing it.
    • The problem for them is that the AI freight train is leaving the station and currently it has Nvidia’s badge on it, meaning that, right now, one has to climb on or get left behind.
    • While the development platform for AI remains CUDA, there is very little that anyone can do to reduce their dependence on Nvidia.
    • Sixth, Omniverse: where Nvidia is breathing some life into what has become a moribund segment of the technology industry: The Metaverse.
    • Generative AI fits really well with The Metaverse as one can use AI to generate both the 3D environment and the scenarios and simulations that make digital twins so effective in reducing the development and maintenance costs of large, complex assets like factories.
    • Here, once again, partnerships were crucial, with Nvidia getting Siemens and SAP on board, two of the world’s most important suppliers to industry.
    • Nvidia will be powering Siemens Xcelerator, giving it access to all of Siemens’ industrial ecosystem and further cementing its position.
    • Nvidia will also be powering SAP’s AI offerings, again aimed at increasing its stickiness and helping it to become further entrenched.
  • Nvidia’s strategy is very simple: leverage the popularity of CUDA with developers to remain the go-to place to develop and run AI in the cloud.
  • Combine this with a product cadence that keeps it ahead in pure hardware horsepower and performance, and its competitors remain very far from laying a glove on its market position.
  • Hence, I don’t see any threats to its profitability for some time, but its market share is now so high that growth is pretty much out of its control and will be determined by the market rather than anything that Nvidia can do.
  • Nvidia’s shares are expensive, but compared to the valuations being given to the generative AI companies that use its products, it is not hard to argue that they are cheap.
  • However, the easy money in Nvidia has been made, and to go further from here requires the AI bubble to continue unchecked.
  • There are plenty of signs that this will be the case for a while but at some stage, there will be a reset and at that time there will be almost nothing that Nvidia will be able to do about it.
  • The good news is that for the moment, it looks like the run will continue.
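
As a rough illustration of the multi-GPU bottleneck described in the Blackwell section above, the toy numpy sketch below simulates the all-to-all gradient averaging (an all-reduce) that every training step requires. The cluster size and gradient count are made-up figures and this is in no way Nvidia's implementation; it is merely a picture of the traffic that NVLink and the NVLink Switch are built to carry.

```python
import numpy as np

# Toy illustration (not Nvidia's implementation): in data-parallel training,
# every GPU computes gradients on its own shard of data, and those gradients
# must then be averaged across ALL GPUs before the next step can begin.
NUM_GPUS = 8            # illustrative cluster size
GRAD_SIZE = 1_000_000   # gradient values per GPU (toy figure)

rng = np.random.default_rng(0)

# Each "GPU" holds its own locally computed gradients.
local_grads = [rng.standard_normal(GRAD_SIZE).astype(np.float32)
               for _ in range(NUM_GPUS)]

# The all-reduce: sum everyone's gradients and divide by the number of GPUs,
# so every GPU ends up holding the same averaged gradient.
averaged = sum(local_grads) / NUM_GPUS
print("all GPUs now hold identical averaged gradients, e.g.:", averaged[:3])

# Naive estimate of the bytes each GPU must exchange per step (fp32 values);
# this is the traffic that interconnect bandwidth has to absorb.
bytes_per_step = GRAD_SIZE * 4 * (NUM_GPUS - 1)
print(f"~{bytes_per_step / 1e6:.0f} MB of gradient traffic per step for this toy model")
```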
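The 50% cost claim above is simple arithmetic; here it is worked through explicitly, with the ¼ GPU count taken from Nvidia's presentation and the 2x price per GPU being my assumption rather than a disclosed figure.

```python
# Back-of-the-envelope arithmetic for the Blackwell vs Hopper training cost claim.
# The 1/4 GPU count comes from Nvidia's presentation; the 2x price per GPU is an
# assumption, not a disclosed figure, and the cluster size is purely illustrative.
hopper_gpus = 8000                  # illustrative Hopper cluster for a large training run
hopper_price = 1.0                  # normalised price per Hopper GPU

blackwell_gpus = hopper_gpus / 4    # same job with 1/4 of the GPUs
blackwell_price = hopper_price * 2  # assume Blackwell costs twice as much per GPU

hopper_cost = hopper_gpus * hopper_price
blackwell_cost = blackwell_gpus * blackwell_price

print(f"Relative training cost: {blackwell_cost / hopper_cost:.0%} of Hopper")
# -> 50%, i.e. the cost to the client of the same training run halves.
```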
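Finally, the "4-bit processing" behind the 30x inference claim refers to very low-precision number formats. The sketch below shows a crude symmetric 4-bit quantisation of a handful of weights in numpy; Blackwell uses its own FP4 format and dedicated hardware, so this is purely illustrative of why quartering the bits per weight cuts memory traffic and lifts inference throughput.

```python
import numpy as np

# Crude sketch of 4-bit quantisation (illustrative only; this is NOT Nvidia's
# FP4 format). The point is simply that squeezing weights from 16 bits down to
# 4 bits quarters the memory traffic per weight, which is a large part of
# where low-precision inference speed-ups come from.
rng = np.random.default_rng(0)
weights_fp16 = rng.standard_normal(16).astype(np.float16)

# Symmetric quantisation to 4-bit signed integers in the range [-8, 7].
scale = np.abs(weights_fp16).max() / 7.0
q4 = np.clip(np.round(weights_fp16 / scale), -8, 7).astype(np.int8)  # representable in 4 bits

# Dequantise to see the (lossy) reconstruction.
recovered = q4.astype(np.float16) * scale

print("original :", np.round(weights_fp16.astype(np.float32), 3))
print("4-bit    :", q4)
print("recovered:", np.round(recovered.astype(np.float32), 3))
```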

RICHARD WINDSOR

Richard is founder and owner of the research company Radio Free Mobile. He has 16 years of experience working in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on the equity coverage of the Global Technology sector.