Artificial Intelligence – Desperately Seeking Independence

DeepSeek’s techniques need to be independently verified.

  • It is not DeepSeek R1’s performance that has sent shockwaves through the AI industry and rattled the cage of the stock market, but how cheaply the company was able to train its model.
  • Hardly anyone paid attention to DeepSeek when its R1 model was conveniently launched on the same day as the US presidential inauguration (see here), but now that it has topped the app store charts, panic has set in.
  • DeepSeek R1’s abilities are unremarkable in that it competes head-to-head with everyone else, as the field of models has clearly been tightening for some time.
  • What is remarkable is DeepSeek’s claim that its model cost only $6m to train, compared to the $500m that OpenAI is thought to have spent training GPT-4o.
  • However, this 98.8% cost reduction needs to be taken with a large dose of scepticism as:
    • Total cost: the $6m figure is DeepSeek’s own theoretical estimate (given in its paper) of the cost to train V3, not R1, and is not what the company actually spent.
    • The estimates for OpenAI are based on its capital needs and so implicitly include everything that the company needs to function, whereas DeepSeek’s number is reverse-engineered down to just the cost of running the H800 chips.
    • Subsidy: DeepSeek is clearly very popular with the Chinese state at the moment, and it is not clear what assistance the company received in terms of grants and subsidies.
    • The fact that the company released its model to coincide with the presidential inauguration is a sign that DeepSeek is working with the Chinese state on PR efforts to advance China’s reputation as a technology superpower.
    • Hence, while DeepSeek’s claims are very impressive, I put no faith in them until they have been independently verified.
    • Independent reproduction: this is a foundational pillar of scientific enquiry, and one of the purposes of scientific publication is to enable others to replicate results.
    • DeepSeek’s release of R1 under the very relaxed MIT licence and its disclosure of its methods puts it far ahead of OpenAI and everyone else (except Meta) in terms of openness, but whether its results are reproducible remains to be seen.
    • The last thing the Chinese state wants is to invent a way to train leading-edge AI 30x cheaper than anyone else and then give that innovation away.
    • Hence, I fear that the devil will be in the details and that there are fundamental elements of these techniques (assuming they are real) that DeepSeek is keeping to itself.
    • We are now likely to see attempts by Western companies to test DeepSeek’s techniques and if they work, rapidly employ them in their own workflows to close the efficiency gap that DeepSeek claims to have opened.
  • RFM Research (see here) has long held the opinion that AI training is incredibly inefficient and that the endless wave of money pouring into the sector has done nothing to improve the situation.
  • I have also been of the opinion for some time (see here) that limiting China’s access to advanced technology would force it to innovate in terms of producing AI more efficiently.
  • This appears to be precisely what has happened, as demonstrated by DeepSeek going beyond CUDA to reprogram some of the processing units to handle cross-chip communication, thereby getting around the memory bandwidth limitations imposed by US sanctions.
  • The net result is that I don’t think we are about to witness a rush to switch out GPT, Grok, Claude or Gemini for DeepSeek R1: R1 is a Chinese model and, with 700bn parameters, it will be impossible to ensure that there are no backdoors or gaps that would allow the Chinese state access to user data.
  • However, what we are going to see is customers questioning the price they are being charged to access these models, especially if it proves possible to train them much more cheaply and with far fewer resources.
  • The main beneficiaries here are the Chinese state (for obvious reasons), Apple, Qualcomm and MediaTek, which will benefit from using these techniques (if proven) to deploy more advanced models at the edge, where they will consume fewer resources.
  • This is why the picks and shovels of AI have been hammered, and the end users have emerged relatively unscathed.
  • Meta has also emerged unscathed as it was the first proponent of open source for AI and I suspect that if DeepSeek’s techniques are verified, they will appear in Llama-based models first.
  • It is in these sorts of environments that one wants to own the value end of the AI investment spectrum, and here I remain comfortable with my position in Qualcomm and nuclear power.
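
For what it is worth, the headline arithmetic is internally consistent: taking the claimed $6m against the estimated $500m (both unverified figures quoted above), the reduction works out to 98.8%. A minimal sketch:

```python
# Sanity-check of the cost figures cited in the text. Both inputs are the
# claims quoted above, not verified numbers.
deepseek_claim_usd = 6e6      # DeepSeek's stated (theoretical) V3 training cost
openai_estimate_usd = 500e6   # estimated OpenAI spend on training GPT-4o

reduction_pct = (1 - deepseek_claim_usd / openai_estimate_usd) * 100
multiple = openai_estimate_usd / deepseek_claim_usd

print(f"Cost reduction: {reduction_pct:.1f}%")  # -> 98.8%
print(f"Implied cost multiple: {multiple:.0f}x")  # -> 83x
```

Note that the implied multiple depends entirely on which of DeepSeek’s and OpenAI’s numbers one chooses to compare, which is precisely why independent verification matters.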

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the global technology sector.
