Nvidia – Chat with RTX

Nvidia needs to go further.

  • Nvidia has released an LLM-based chatbot for its graphics cards, but by not using its own foundation model, it is still leaving itself open to pressure as the AI ecosystem develops and the control point moves up through the technology stack.
  • The new chatbot is called Chat with RTX and is only available on PCs that have a GeForce RTX 30- or 40-series Nvidia graphics card with at least 8GB of VRAM.
  • These are pretty high-end cards, once again underlining that the problems with running LLMs on edge devices as opposed to the cloud are very far from solved.
  • However, there is plenty of low-hanging fruit in this area and a great deal of research aimed at picking it, but as it stands today, the power and compute requirements of LLMs remain a problem for edge devices (a back-of-the-envelope calculation after this list illustrates why).
  • Chat with RTX is essentially a framework by which the user can select a generic open-source foundation model and then use it to analyse local documents, as well as combine that analysis with data available on the Internet (a minimal sketch of this pattern follows the list).
  • This is essentially a consumer version of the enterprise use case that RFM Research has concluded could have a substantial impact on how enterprises store and use data.
  • The installation process is lengthy, taking 30 minutes on the fastest PC, as the app is a colossal 40GB in size and gobbles 3GB of RAM when it is running.
  • This tells me that the app contains both the Llama and Mistral open-source models, which the user can choose between when the app is running.
  • The advantage of this is that all of the inference is carried out on the RTX graphics card, meaning that the model can be adapted for the specific user and the data never leaves the device.
  • RFM Research has concluded that security and privacy are two reasons why most inference will eventually be carried out locally (the other is economics), and so from this perspective, Nvidia has made the right choice.
  • However, by using third-party foundation models, it is allowing potential competitors to put pressure on its ability to control the market for AI training.
  • This is because there are signs that the platform of choice for the development of generative AI will move from silicon development platforms to the foundation model.
  • If developers migrate from developing gen AI services on CUDA to developing them on GPT, Gemini, Llama, Mistral and so on, then Nvidia is at risk of losing its grip on the AI ecosystem.
  • The foundation model owners have no interest in being limited to Nvidia and are likely to support multiple silicon platforms, giving Nvidia’s competitors chances to gain some market share.
  • Furthermore, developers will have become abstracted from the silicon, meaning that they will care less about which silicon is used to fine-tune their services using the foundation model that they have chosen.
  • If the development platform of choice becomes the foundation model, then Nvidia needs to go further up the stack and offer one of its own to compete against OpenAI, Microsoft, Google et al.
  • At some point, developers may need to be choosing Nvidia at the foundation model level for the company to fully retain its grip on the AI ecosystem, and offering the foundation models of others does not achieve this.
  • This will take a long time to manifest itself, as the Apple App Store took more than four years to define the iOS ecosystem, which gives Nvidia time to counter.
  • Hence, for now, there appears to be no one who can lay a glove on Nvidia, but in a few years, the picture may look very different.
  • Nvidia needs to go further up the technology stack to counter this long-term threat.
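
To see why 8GB of VRAM is a plausible floor and why the edge-compute problem is far from solved, here is the back-of-the-envelope calculation referred to above. It is a minimal sketch under my own illustrative assumptions (not figures published by Nvidia): an LLM's weights occupy roughly the parameter count multiplied by the bytes per weight, before counting the KV cache and the runtime's own overhead.

```python
# Back-of-the-envelope VRAM arithmetic (illustrative assumptions only,
# not Nvidia's published figures).
def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GB for a model with the given parameter count."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4):
    print(f"7B model at {bits}-bit: ~{weights_gb(7, bits):.1f} GB of weights")
# Prints ~14.0, ~7.0 and ~3.5 GB respectively. Add the KV cache, activations
# and the runtime itself, and an 8GB card is a credible minimum for a 7B model.
```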
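As for what the framework itself is doing, Chat with RTX follows the retrieval-augmented generation (RAG) pattern: embed the user's local documents, retrieve the passages most relevant to a query and hand them to a locally running open-source model. Below is a minimal sketch of that pattern, assuming the llama-cpp-python and sentence-transformers libraries rather than Nvidia's actual TensorRT-LLM stack; the model and document paths are hypothetical placeholders.

```python
# Minimal local RAG sketch. Illustrative only: Chat with RTX is built on
# Nvidia's TensorRT-LLM stack, whereas this assumes llama-cpp-python and
# sentence-transformers. Paths and file names are hypothetical placeholders.
from pathlib import Path

import numpy as np
from llama_cpp import Llama
from sentence_transformers import SentenceTransformer

# 1. Load an open-source model entirely on-device; all inference runs locally.
llm = Llama(model_path="models/mistral-7b-instruct.Q4_K_M.gguf", n_gpu_layers=-1)

# 2. Embed the user's local documents so they can be searched semantically.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
docs = [p.read_text(encoding="utf-8") for p in Path("my_documents").glob("*.txt")]
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

def answer(question: str, top_k: int = 3) -> str:
    """Retrieve the most relevant local passages and ask the local model."""
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec  # cosine similarity, since the vectors are normalised
    context = "\n---\n".join(docs[i] for i in np.argsort(scores)[-top_k:])
    prompt = (f"Answer using only the context below.\n\nContext:\n{context}\n\n"
              f"Question: {question}\nAnswer:")
    return llm(prompt, max_tokens=256)["choices"][0]["text"].strip()

print(answer("What do my notes say about Q4 revenue?"))
```

The property that matters for the privacy argument above is that both the documents and the inference stay on the device; switching between Llama and Mistral changes only the model file that is loaded.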

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience working in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on the equity coverage of the Global Technology sector.