Meta Platforms & AI – Size Does Matter

Meta Platforms is implying that its models are the most efficient.

  • In the ever-expanding world of huge AI models, Meta is hinting that it is going the other way with Llama 3, meaning that if it can match its peers on performance, it will quickly become the model of choice for almost everyone.
  • Meta has launched its latest foundation model, Llama 3, which will be made available to the open-source community in 2 sizes, one at 70bn parameters and the other at 8bn.
  • This is surprising, as Llama 2’s largest size is already 70bn, representing no increase in size over the old version.
  • This is a huge disparity with what OpenAI (and I suspect everyone else) is doing, as GPT-4 is thought to have 1.8tn parameters, representing a 10x increase on GPT-3.
  • Given that Nvidia’s product roadmap is designed around bigger and bigger models, it would seem that the industry’s march towards larger models may continue, but Meta appears to be bucking this trend.
  • Llama 3 is available in just 2 sizes compared to Llama 2, which was available in 5, implying that these are the only sizes that saw any real take-up in the market.
  • This makes sense because when one has a data centre or a powerful device, one wants the best performance, but when running a model on a battery-powered device, one wants the smallest.
  • Hence, I don’t think that this reduction in the number of sizes is going to cause a decline in the popularity or usage of Llama as the standard model for open source.
  • Ever since Kaplan’s famous research paper in 2020, which demonstrated that the bigger a model is, the better it performs, the trend in the industry has been to create larger and larger models in the hope that they will deliver better performance.
  • According to Nvidia, the industry standard is now well over 1tn parameters, which is unimaginably large and is one reason why so much money is being invested in compute resources.
  • It is the 70bn version of Llama 3 that is used to compare against Gemini, GPT-4, Claude 3 and so on, and, by and large, it manages to hold its ground on the standard benchmarks.
  • This is highly significant because the focus is moving away from training towards inference, and a 70bn model will be far cheaper to store and run than one that is more than 1tn parameters in size (see the back-of-envelope arithmetic after this list).
  • The benchmarks that Meta is showing (see here) are against smaller models, but many of those models are still 50% to 100% larger than Llama 3’s 70bn.
  • Instead of size, Meta has put more of its training budget into training data, training Llama 3 on 15tn tokens, seven times more than was used for Llama 2.
  • This is a technique that was demonstrated by DeepMind (see here), which showed that training for longer and with more data can allow smaller models to outperform much larger ones (see the training-compute sketch after this list).
  • This has significant implications when it comes to the costs associated with running these models for commercial applications and also on battery-powered devices.
  • Furthermore, shifting the cost into training as opposed to inference gives a large benefit when it comes to running models at scale.
  • Generative AI models that have hundreds of millions of users will incur inference costs that are many times greater than the cost to train them.
  • The training cost is fixed while the inference cost is variable as it depends on usage, and so smaller models with the same performance as larger ones will be far better at offering their operators a high return on investment (see the toy cost model after this list).
  • Right now, no one is paying a lot of attention to this, but when investors start to ask for returns, this could be an advantage that plays strongly in Meta’s favour.
  • I expect that Meta, Google et al. will follow OpenAI’s and Baidu’s lead in launching tools with which developers can create generative AI services using their foundation models, which look set to become the OS of generative AI.
  • If Meta can make an efficiency argument, this will greatly help its reach with developers, many of whom are already tinkering with its models in the open-source community.
  • At 25x 2024 PER, the easy money in Meta has been made, but compared to other AI players, Meta, along with Google, looks to be more fairly priced.
  • However, the greatest AI bargain of all is Baidu, which, at less than 10x earnings and with no competition in China, looks set to dominate.
  • That said, one has to be willing to stomach the demonstrably large risk of Chinese state interference in the private sector, which has hammered sentiment and the willingness to invest money in China.
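
To put the inference point into numbers, here is a back-of-envelope sketch comparing the rough weight-memory footprint and per-token compute of a 70bn-parameter model with a rumoured 1.8tn-parameter one. It assumes 16-bit (2-byte) weights and the common approximation of roughly 2 FLOPs per parameter per generated token; if the larger model is a mixture-of-experts that activates only part of its parameters per token, the gap would be smaller. All figures are illustrative.

    # Back-of-envelope comparison of inference footprint and per-token compute.
    # Assumptions (illustrative): 16-bit (2-byte) weights, ~2 FLOPs per parameter per token.
    BYTES_PER_PARAM = 2
    FLOPS_PER_PARAM_PER_TOKEN = 2

    def weights_gb(params: float) -> float:
        """Approximate memory needed just to hold the weights, in GB."""
        return params * BYTES_PER_PARAM / 1e9

    def flops_per_token(params: float) -> float:
        """Approximate compute per generated token."""
        return params * FLOPS_PER_PARAM_PER_TOKEN

    llama3_70b = 70e9
    gpt4_rumoured = 1.8e12  # rumoured figure, not confirmed by OpenAI

    print(f"70bn model:  ~{weights_gb(llama3_70b):.0f} GB of weights, ~{flops_per_token(llama3_70b):.1e} FLOPs per token")
    print(f"1.8tn model: ~{weights_gb(gpt4_rumoured):.0f} GB of weights, ~{flops_per_token(gpt4_rumoured):.1e} FLOPs per token")
    print(f"Rough per-token cost ratio: ~{gpt4_rumoured / llama3_70b:.0f}x")

On these assumptions, the 70bn model needs roughly 140GB of weights against roughly 3,600GB for the rumoured 1.8tn, i.e. around 26x less memory and compute for every token served.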
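
The DeepMind result referred to above is usually summarised with the rule of thumb that training compute is roughly 6 x parameters x training tokens, while inference cost scales with parameters alone. The sketch below uses that approximation, together with the token counts cited in this note (15tn for Llama 3 versus roughly 2tn for Llama 2), to show where the extra training budget went; it is a rough illustration, not Meta’s disclosed numbers.

    # Rough training-compute comparison using the common C ~= 6 * N * D approximation
    # (N = parameters, D = training tokens), popularised by the DeepMind scaling work.
    def train_flops(params: float, tokens: float) -> float:
        return 6 * params * tokens

    llama2_70b = train_flops(70e9, 2e12)    # Llama 2: 70bn params, roughly 2tn tokens
    llama3_70b = train_flops(70e9, 15e12)   # Llama 3: 70bn params, 15tn tokens

    print(f"Llama 2 70B training compute: ~{llama2_70b:.1e} FLOPs")
    print(f"Llama 3 70B training compute: ~{llama3_70b:.1e} FLOPs ({llama3_70b / llama2_70b:.1f}x more)")
    # The extra compute is spent once, at training time; the per-token inference cost
    # still scales with the unchanged 70bn parameter count.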
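
Finally, the fixed-versus-variable cost point can be illustrated with a toy total-cost model: training is paid once, while the inference bill grows with every query served. All of the dollar figures and query counts below are hypothetical and chosen purely for illustration.

    # Toy total-cost-of-ownership model: fixed training cost plus usage-driven inference cost.
    # Every figure here is hypothetical and for illustration only.
    def total_cost(train_cost_usd: float, cost_per_1k_queries_usd: float, queries: float) -> float:
        return train_cost_usd + cost_per_1k_queries_usd * queries / 1_000

    QUERIES = 1e12  # e.g. ~300mn users making ~10 queries a day for a year

    smaller_model = total_cost(train_cost_usd=100e6, cost_per_1k_queries_usd=0.5, queries=QUERIES)
    larger_model = total_cost(train_cost_usd=60e6, cost_per_1k_queries_usd=13.0, queries=QUERIES)

    print(f"Smaller, heavily trained model: ~${smaller_model / 1e9:.2f}bn total")
    print(f"Much larger model:              ~${larger_model / 1e9:.2f}bn total")
    # At this scale the variable inference cost dwarfs the fixed training cost,
    # which is where the return-on-investment argument for efficient models comes from.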

RICHARD WINDSOR

Richard is founder and owner of the research company Radio Free Mobile. He has 16 years of experience working in sell-side equity research. During his 11-year tenure at Nomura Securities, he focused on equity coverage of the Global Technology sector.