Meta, formerly known as Facebook, has released a new “state-of-the-art AI large language model” called LLaMA (Large Language Model Meta AI), which the company hopes will help democratize access to AI research.
Amid the flurry of AI chatbots from Microsoft, OpenAI, and Google, Meta should not be overlooked: the company has done, and continues to do, significant research in this area.
LLaMA consists of four language models ranging from 7 billion to 65 billion parameters (7B, 13B, 33B, and 65B). Meta says the second-smallest version, LLaMA-13B, outperforms OpenAI’s GPT-3 despite being more than ten times smaller, and that the largest model, LLaMA-65B, is competitive with the best models from DeepMind and Google, such as Chinchilla-70B and PaLM-540B.
Meta has trained the LLaMA model using publicly available datasets such as Common Crawl, Wikipedia, and C4, meaning the company may be able to open-source the model and its weights, which would be a significant development in an industry dominated by big tech companies.
Meta positions LLaMA as a “foundation model,” a base on which more specialized AI models can be built in the future, just as OpenAI built ChatGPT on top of GPT-3. The company hopes LLaMA will be useful in natural language research and could power applications such as question answering, natural language understanding, and reading comprehension.
The most notable member of the family is LLaMA-13B, which is said to surpass GPT-3 while running on a single GPU, opening the door to ChatGPT-like performance on consumer-level hardware in the near future. A model’s parameter count strongly influences its capability, but it also determines how much memory the model occupies and how much computation it requires. A model that matches a larger one’s results with far fewer parameters is therefore significantly more efficient.
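To see why the parameter count matters in practice, here is a back-of-envelope calculation (a rough sketch, not a benchmark: it counts only the memory needed to store the weights in 16-bit precision and ignores activations, caches, and runtime overhead):

```python
def weight_memory_gb(n_params: float, bytes_per_param: float = 2) -> float:
    """Approximate memory needed just to store a model's weights.

    bytes_per_param = 2 assumes 16-bit (fp16/bf16) weights;
    use 4 for fp32 or 1 for 8-bit quantized weights.
    """
    return n_params * bytes_per_param / 1e9

# LLaMA-13B vs. GPT-3 (175 billion parameters), fp16 weights only:
print(f"LLaMA-13B: ~{weight_memory_gb(13e9):.0f} GB")   # ~26 GB
print(f"GPT-3:     ~{weight_memory_gb(175e9):.0f} GB")  # ~350 GB
```

By this estimate, LLaMA-13B’s weights fit comfortably on a single 40 GB or 80 GB data-center GPU (and, with 8-bit quantization, come close to fitting in the 24 GB of a high-end consumer card), while GPT-3-scale weights require a multi-GPU setup.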
A simplified version of LLaMA, the inference code without the trained weights, is now available on GitHub, and interested researchers can request access to the full model and weights via a form provided by Meta. The company has not announced plans for a broader release at this time.