Stability AI Releases StableVicuna: The First Open-Source Chatbot Trained with Human Feedback

Stability AI has released StableVicuna, the first large-scale open-source chatbot trained with human feedback. It builds on Vicuna, a chatbot introduced in early April that is a 13-billion-parameter LLaMA model fine-tuned following the Alpaca recipe. What sets StableVicuna apart from Vicuna is that it was further improved using a process called “Reinforcement Learning from Human Feedback” (RLHF).

According to Stability AI, StableVicuna can do more than generate text: it can also write code and solve simple math problems. Although StableVicuna performs similarly to previously released open-source chatbots, Stability AI plans to develop it further and launch it on Discord soon. A demo is currently available on Hugging Face, and StableVicuna is also due to be offered through a chat interface.

On Hugging Face, developers can download the model’s weights as a delta relative to the original LLaMA model. Users who want to run StableVicuna themselves therefore also need the original LLaMA weights, which must be obtained separately from Meta. Commercial use is not permitted.
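To illustrate what distributing weights "as a delta" means, here is a minimal sketch in which each final parameter is the original parameter plus the published difference. It uses plain Python lists as stand-ins for tensors; the function name `apply_delta` and the tensor names are hypothetical and are not taken from Stability AI's tooling, which may work differently.

```python
def apply_delta(base, delta):
    """Return base + delta, tensor by tensor.

    Tensors are modelled as flat lists of floats (an assumption made
    for illustration; real checkpoints hold multi-dimensional arrays).
    """
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same tensor names")
    merged = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"size mismatch for tensor {name!r}")
        # Each fine-tuned weight is the original weight plus its delta.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Toy example with hypothetical tensor names.
base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
delta = {"layer.weight": [0.25, -0.5], "layer.bias": [0.25]}
merged = apply_delta(base, delta)
print(merged)
```

Because the delta is meaningless without the base weights, this scheme lets the fine-tuned model be shared without redistributing the original LLaMA weights themselves.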

One risk with open-source chatbots fine-tuned on chatbot-generated data is an echo chamber, in which AI models reinforce their existing flaws and biases with each new training cycle. Furthermore, fine-tuning data can reinforce hallucinations if it contains information that was not present in the original model.

The success of ChatGPT, a large language model built on the GPT architecture, is largely attributed to reinforcement learning from human feedback (RLHF). RLHF was also used in the development of StableVicuna. Thousands of people provided small pieces of feedback on the usefulness of tens of thousands of chat outputs, and that feedback was used to tune the chatbot toward giving appropriate responses.
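The article does not say how human feedback is turned into a training signal. One common approach, assumed here purely for illustration, is to train a reward model on pairs of responses where a person preferred one over the other, using a pairwise loss that is small when the preferred response scores higher. The function name `preference_loss` and the example scores are hypothetical.

```python
import math


def preference_loss(reward_chosen, reward_rejected):
    """Pairwise preference loss: -log(sigmoid(reward_chosen - reward_rejected)).

    The two branches compute the same value; splitting on the sign of the
    margin avoids overflow in math.exp for large magnitudes.
    """
    margin = reward_chosen - reward_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The reward model agrees with the human: small loss.
agree = preference_loss(2.0, 0.0)
# The reward model disagrees with the human: larger loss.
disagree = preference_loss(0.0, 2.0)
print(agree, disagree)
```

In a full RLHF pipeline, a reward model trained this way would then score the chatbot's outputs during a reinforcement learning stage; that stage is not shown here.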

RLHF also helps keep the chatbot’s output in line with social norms. Without RLHF, GPT-4 would be far harder to use and might produce extreme material that encourages crime or suggests the systematic destruction of humanity.

The release of StableVicuna marks a significant step forward in developing open-source chatbots that incorporate human feedback. By continuing to refine and develop these models, researchers and developers can help ensure that the technology remains ethical and responsible.
