Microsoft Unveils Phi-2: A Small Language Model Outshines Giants LLaMA 2 and Gemini


Microsoft has unveiled Phi-2, a small language model (SLM) that outperforms larger competitors such as Meta's LLaMA 2 on several benchmarks. Developed by Microsoft Research, Phi-2 shows strong reasoning and language comprehension abilities, surpassing models from tech giants like Meta and Google in certain tasks.

Phi-2 belongs to a family of compact transformer-based models. It has only 2.7 billion parameters, far fewer than the roughly 1.7 trillion that GPT-4 is reported (but not confirmed) to have, yet it matches or exceeds models up to 25 times its size. In tests involving mathematics and programming, Phi-2 outperformed Meta's LLaMA 2. It also tackled physics problems, performing comparably to Google's Gemini Nano 2.

The developers noted, “With only 2.7 billion parameters, Phi-2 surpasses the performance of Mistral and Llama-2 models at 7B and 13B parameters on various aggregated benchmarks. Notably, it achieves better performance compared to 25x larger Llama-2-70B model on multi-step reasoning tasks, i.e., coding and math.”
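The "25 times larger" figure can be checked with simple arithmetic. The sketch below is only illustrative: the parameter counts are the rounded figures quoted in this article, and the names `PHI_2_PARAMS_B`, `COMPARISON_MODELS_B`, and `size_ratio` are made up for this example.

```python
# Parameter counts in billions, using the rounded figures quoted in the article.
PHI_2_PARAMS_B = 2.7
COMPARISON_MODELS_B = {
    "7B model": 7.0,
    "13B model": 13.0,
    "LLaMA 2-70B": 70.0,
}


def size_ratio(larger_b: float, smaller_b: float = PHI_2_PARAMS_B) -> float:
    """Return how many times larger one model is than another, by parameter count."""
    return larger_b / smaller_b


for name, params_b in COMPARISON_MODELS_B.items():
    print(f"{name}: about {size_ratio(params_b):.1f}x the size of Phi-2")
```

With these rounded inputs, the 70B model comes out at just under 26 times the size of Phi-2, which is consistent with the "25 times" claim in the quote.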

A key to Phi-2’s success lies in its training data. The model was trained on a mix that includes synthetic NLP texts, code subsets, Stack Overflow content, programming competition material, and more. Microsoft emphasized the importance of training data quality, setting its approach apart from that of models such as GPT-4 by curating web data filtered for educational value. The company describes the training set as “textbook quality,” a strategy it has followed since the first version of Phi.

Training Phi-2 took 14 days on 96 NVIDIA A100 GPUs. According to Microsoft, the resulting model produces less toxic and less biased responses than LLaMA 2. Microsoft Research evaluated it extensively using academic benchmarks and internal tools.
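For a sense of scale, the training figures above can be converted into GPU-hours. This is a rough back-of-the-envelope sketch that assumes all 96 GPUs ran continuously for the full 14 days, which the article does not state; the function name `gpu_hours` is hypothetical.

```python
HOURS_PER_DAY = 24


def gpu_hours(days: int, gpus: int) -> int:
    """Total GPU-hours, assuming every GPU runs for the whole period (an assumption)."""
    return days * HOURS_PER_DAY * gpus


# Figures quoted in the article: 14 days on 96 A100 GPUs.
total = gpu_hours(days=14, gpus=96)
print(f"Approximate training compute: {total:,} A100 GPU-hours")
```

Under that assumption the total is 32,256 A100 GPU-hours, an upper bound if the cluster was not fully utilized.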

However, Phi-2 will be available only for research purposes. It will be offered through Azure AI Studio to encourage language model development, but its current license does not permit use in commercial applications such as ChatGPT.

Vishak is a skilled Editor-in-chief at Code and Hack with a passion for AI and coding. He has a deep understanding of the latest trends and advancements in the fields of AI and Coding. He creates engaging and informative content on various topics related to AI, including machine learning, natural language processing, and coding. He stays up to date with the latest news and breakthroughs in these areas and delivers insightful articles and blog posts that help his readers stay informed and engaged.
