Google vs. DeepSeek – Trading Silver Bullets

Another set of unsubstantiated claims

  • Google has come out swinging against DeepSeek, claiming that Gemini is more efficient in both training and inference, but its comments are even more nebulous than DeepSeek’s, leading me to think that there is something in what DeepSeek has been claiming.
  • In an interview with Bloomberg (see here), Demis Hassabis makes a series of observations and claims with regard to Gemini vs. DeepSeek, some of which are credible and some of which need to be substantiated before they can be believed.
    • First, Gemini is more efficient: Mr Hassabis claimed that Gemini is more efficient than DeepSeek R1 both in terms of “training to performance and its cost to performance”, but he failed to offer any evidence to support this claim.
    • Given that the top models these days all perform within spitting distance of one another, Mr Hassabis is basically saying that Gemini is cheaper to train and cheaper to run than DeepSeek R1.
    • Unfortunately, the only substantiation offered was the statement “We don’t talk about that very much”, which, in my opinion, is worth nothing.
    • DeepSeek R1 is the first model I have come across where the fine-tuning step is fully automated, which, combined with generally lower costs in China and Mr Hassabis’ other comments, leads me to think that DeepSeek has achieved more than is being admitted to here.
    • Second, no silver bullets: While Mr Hassabis admits that DeepSeek is the “best team that I have seen come out of China” and that the model is “very impressive”, he is correct to point out that there is nothing particularly new about what DeepSeek has done.
    • This concurs with RFM Research, which has concluded that DeepSeek has not used any novel techniques but has applied existing ones in a way that has produced potentially very interesting results.
    • Across the technology industry, there are many examples of existing techniques being used in novel ways resulting in a large step forward in terms of performance and efficiency.
    • Consequently, the fact that DeepSeek has not invented a new technique does not detract from the possibility that the claims made have some basis.
    • In fact, the defensive nature of Google’s statements, which had clearly been prepared in advance, leads me to think that, behind the scenes, DeepSeek has rattled Google and that it is working to verify precisely what DeepSeek has achieved.
    • Third, DeepSeek is exaggerating: this is almost certainly true.
    • DeepSeek very cleverly implanted a $6m training figure in its press release (dressed up to look like a scientific paper), which everyone immediately latched onto and assumed was the total cost.
    • This was then compared with OpenAI’s rumoured $500m cost to develop GPT-4, resulting in the misconception that DeepSeek is 100x cheaper to train.
    • This is not a fair comparison, as the $6m refers only to the final training run that DeepSeek carried out and does not include all of the other training runs, design and build costs, and company overheads that were incurred to get the model to its published state.
    • RFM’s back-of-the-envelope calculation suggests that, if all of DeepSeek’s claims are true, it could be something like 7x cheaper than OpenAI (see the arithmetic sketch after this list).
    • This makes sense to me, as OpenAI is a voracious consumer of compute, believing that bigger models and more compute will make its models better.
    • With an almost endless supply of money, it has never had to worry about efficiency, whereas China has been cut off from cutting-edge compute for a few years.
    • Back in October 2024, I postulated that a silver lining of the resource constraints aimed at containing China’s AI development would be to force it to innovate in this area (see here), which appears to be exactly what has happened.
    • Hence, the most likely source of innovations around efficiency was always going to be China rather than the West, where the magic money tree (see here) is still showering its bounty upon anyone working on AI.
  • The net result is that I think that Google is making some valid claims with regard to DeepSeek, but is underplaying the advances that DeepSeek has made when it comes to efficiency.
  • What Google (and everyone else) will now do is attempt to replicate DeepSeek’s techniques and, if they work, incorporate them into their workflows.
  • DeepSeek is at a minimum working with the explicit approval of the Chinese state, and I suspect that there is more state involvement than DeepSeek is letting on.
  • Hence, if the advances are as good as DeepSeek says they are, it makes no sense for China to give the innovations away and receive nothing in return for them.
  • This is why I think that DeepSeek’s innovations are in how it has modified and implemented existing techniques to train its models rather than in the model itself.
  • These it can easily keep to itself despite releasing the model as open source, and China can then benefit from all of the PR buzz without having to give the real intellectual property away.
  • When the magic money tree runs out of leaves and the correction comes, everyone will be forced to do more with less, and in this circumstance China may find itself ahead in this particular area.
  • RFM and Alavan Independent have always rated China as a force to be reckoned with in AI, where it will put up much more of a fight than it has in semiconductors.
  • This looks certain to set the tone of the ideological struggle for 2025.
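
To put the arithmetic referred to above in concrete terms, the sketch below shows how the headline figures relate to one another. The ~$70m number is not a disclosed cost; it is simply what RFM’s ~7x estimate would imply against the rumoured $500m for GPT-4.

\[
\frac{\$500\text{m (rumoured GPT-4 cost)}}{\$6\text{m (final training run only)}} \approx 83\times \quad\Rightarrow\quad \text{the “100x cheaper” headline}
\]
\[
\frac{\$500\text{m}}{\sim 7} \approx \$70\text{m} \quad\Rightarrow\quad \text{the implied all-in DeepSeek cost under RFM’s estimate}
\]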

RICHARD WINDSOR

Richard is the founder and owner of the research company Radio Free Mobile. He has 16 years of experience working in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on the equity coverage of the Global Technology sector.
