Stability AI and Carper AI Labs Unveil FreeWilly Models: Advancing Open-Source LLMs

Stability AI and its CarperAI lab have released two new large language models, FreeWilly1 and FreeWilly2. Both are built on Meta’s Llama models and aim to narrow the gap between smaller open models and much larger ones, with strong results on reasoning tasks and natural language understanding.

FreeWilly1 is derived from Llama 65B, while FreeWilly2 builds on Llama 2 70B. To improve on the base models, Stability AI fine-tuned both with supervised fine-tuning (SFT) on a synthetically generated dataset.

Training followed the approach described in Microsoft’s paper “Orca: Progressive Learning from Complex Explanation Traces of GPT-4.” Rather than teaching a smaller model to imitate only the output style of a larger one, the Orca method trains it on the larger model’s step-by-step explanations, so that it learns the reasoning process itself. The goal is for the smaller model to approach the performance of its larger counterpart.
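As a rough illustration of what explanation-based training data can look like, the sketch below pairs a question with a teacher model’s step-by-step answer and flattens the pair into one SFT training string. It is a minimal sketch under stated assumptions, not the pipeline Stability AI used: the system instruction, the section markers, and the names TrainingExample, build_example, and to_sft_text are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical system instruction that asks the teacher model to explain itself.
EXPLAIN_SYSTEM_PROMPT = (
    "You are a helpful assistant. Think step by step and justify your answer."
)


@dataclass
class TrainingExample:
    """One supervised fine-tuning record: instruction, question, explained answer."""
    system: str
    prompt: str
    response: str


def build_example(question: str, teacher_explanation: str,
                  system: str = EXPLAIN_SYSTEM_PROMPT) -> TrainingExample:
    # The response holds the teacher's full reasoning trace, not just the final answer.
    return TrainingExample(system=system, prompt=question,
                           response=teacher_explanation)


def to_sft_text(example: TrainingExample) -> str:
    # Illustrative template only; the section markers are an assumption.
    return (
        f"### System:\n{example.system}\n\n"
        f"### User:\n{example.prompt}\n\n"
        f"### Assistant:\n{example.response}"
    )


if __name__ == "__main__":
    ex = build_example(
        "What is 12 * 3?",
        "12 * 3 means adding 12 three times: 12 + 12 + 12 = 36. The answer is 36.",
    )
    print(to_sft_text(ex))
```

The point of the design is that the student model is trained to reproduce the explanation as well as the answer, which is what separates this approach from plain output imitation.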

To assess model performance, Stability AI used EleutherAI’s lm-eval-harness, to which it added AGIEval, a human-centric benchmark for evaluating foundation models. FreeWilly2 clearly outperforms FreeWilly1 and scores about 4 points higher on average than Llama 2 across the benchmarks reported. Although FreeWilly2 leads on most of them, Llama 2 remains ahead on MMLU, a widely used general language understanding benchmark.

Stability AI emphasizes responsible release practices for the FreeWilly models. An internal red team has tested the models for potential harms, and the company is seeking external feedback to further improve its safety measures.
