Mark Zuckerberg, founder and CEO of Meta, announced the Llama3-enabled Meta AI assistant in a Facebook post on April 18, 2024.

Alex Zhavoronkov, PhD

As OpenAI, Anthropic, Mistral, Google, AWS, 01.ai, and other LLM players make headlines by releasing new and more capable models, overtaking one another overnight on certain benchmarks, many have questioned Meta’s strategy in AI. Its heavy investment in fundamental research, led by one of the “fathers of deep learning,” Dr. Yann LeCun, and its open-source approach puzzled many industry analysts. How would the company make money? Why enable so many competitors to take Llama, build on top of it, and beat Llama itself on benchmarks?

Last week, Meta definitively answered many of these questions by announcing a range of highly capable Llama3 models that leave competitors’ benchmarks in the dust. These models are also open source, so if you want to build on top of them and create an application within your organization, you certainly can. But, for the first time, Meta did something else: it released Llama-based assistants to consumers via its many channels.

Think about it. While hundreds of millions of people got onto ChatGPT, Microsoft’s Bing, and even xAI’s Grok, Meta has billions of users on WhatsApp, Instagram, and Facebook Messenger. These are messengers that can seamlessly support conversational AI. And the Meta AI assistant seems to be amazing. “You can use Meta AI on Facebook, Instagram, WhatsApp and Messenger to get things done, learn, create and connect with the things that matter to you.” Just tag @Meta AI in a chat and start conversing. It is also available via the Meta.AI website and can generate both text and images.

Making highly capable Llama tools free on its messengers will likely demonetize most of the offerings from other LLM vendors. And since most of those other tools are based on models that are not open source, the level of public trust in Meta’s tools should be significantly higher. The open-source approach is strategic: it helps the entire industry grow while serving one clear purpose, building trust and achieving world-scale validation of Meta’s internal models, so the company can avoid the many fiascos experienced by Google and others in the past.

PARIS, FRANCE – JUNE 14: Vice President and Chief AI Scientist at Meta, Yann LeCun attends the Viva Technology conference at Parc des Expositions Porte de Versailles on June 14, 2023 in Paris, France. Yann LeCun is a French artificial intelligence (AI) and artificial vision researcher. (Photo by Chesnot/Getty Images)


Currently, Meta’s Llama3 outperforms other published models on most benchmarks. This could very well be the dusk of the era of hot, overvalued startups developing their own LLMs and the dawn of a new era in which all consumer-facing LLMs belong to only a few prominent players, just as we saw with search. And Meta will be the dominant player in this game.

I contacted Yann LeCun for a comment, and he disagreed. “Quite the opposite: high-performance open source models open the door to a large variety of players who can fine-tune those models for particular languages, cultures, value systems, political leanings, and centers of interest.”

And I hope he is right and Meta continues to open source its powerful models for the community to thrive.

The King is Dead, Long Live the King!

Llama3 rocks in most benchmarks. As its developers reported on X, it outperforms all other open-source models and is likely to outperform most of the top proprietary models on many benchmarks.

Ahmad Al-Dahle, Vice President of GenAI at Meta, shared benchmarks comparing the leading open-source models.

Dr. Aston Zhang, a research scientist at Meta working on Llama and an author of the open-source AI book “Dive into Deep Learning,” tweeted the benchmarking data with commentary.

Llama3 can even run locally on a laptop. I run Llama3:Instruct on my MacBook Pro with an M1 chip. Here is how easy it is with Open WebUI.
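As a minimal sketch of one local setup, these are the commands for running Llama3 with Ollama as the runtime and Open WebUI as a browser front end (the article mentions Open WebUI but not Ollama; that pairing, and the macOS/Homebrew install path, are assumptions on my part):

```shell
# Install the Ollama runtime (macOS via Homebrew; see ollama.com for other platforms)
brew install ollama

# Download the instruction-tuned Llama3 model and start a chat in the terminal
ollama pull llama3:instruct
ollama run llama3:instruct

# Optional: add a ChatGPT-style browser interface with Open WebUI (requires Docker);
# it auto-detects the local Ollama server and serves the UI at http://localhost:3000
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```

The `llama3:instruct` tag pulls the 8B instruction-tuned variant by default, which fits comfortably in the memory of an Apple Silicon laptop; larger variants need substantially more RAM.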

So what should AI startups be focusing on?

In my opinion, they should be using the most powerful LLMs from the larger players and focusing on AI models where Meta, Microsoft, Amazon, and Google have little or no expertise, and where validation of the model output requires significant domain expertise and experimentation. For example, at Insilico Medicine, we developed multiple multimodal, multiomics LLMs for chemistry and biology, including the PreciousGPT series for aging research. It is virtually impossible to generate the data and validate these models at scale without a high-throughput, fully automated experimental laboratory and significant experience in these specialist domains. Transformers trained on text and images cannot solve these domain-specific tasks, but they can help plan, execute, and analyze the work of domain-specific models trained on biology and chemistry data types.