
BloombergGPT: Bloomberg Develops Domain-Specific Language Model for Financial NLP Tasks


Bloomberg, a leading financial data company, has developed a large language model (LLM) called BloombergGPT, designed specifically for natural language processing (NLP) tasks in the financial industry.

With its complex terminology and unique language requirements, the financial sector has long required a domain-specific language model, and BloombergGPT represents a significant step in its development and application.

According to Shawn Edwards, Bloomberg’s Chief Technology Officer, BloombergGPT is the first LLM dedicated to the financial sector. The model is designed to improve existing financial NLP tasks such as sentiment analysis, named entity recognition, and news classification, while drawing on the massive quantity of data available on the Bloomberg Terminal to unlock the full potential of AI in finance.

To create BloombergGPT, the company’s ML product and research team drew on its 40 years of experience collecting and maintaining financial documents to build domain-specific datasets. They assembled a 363-billion-token dataset of English financial documents and combined it with a 345-billion-token public dataset, yielding a training corpus of over 700 billion tokens. The team used a portion of this corpus to train a decoder-only causal language model with 50 billion parameters.
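BloombergGPT itself is not publicly released, and a 50-billion-parameter transformer is far beyond a snippet, but the core idea of a decoder-only *causal* language model — predicting each token only from the tokens that precede it — can be sketched with a toy bigram model in Python. Everything here (the sample corpus, function names) is illustrative, not Bloomberg’s actual method:

```python
from collections import Counter, defaultdict

def train_bigram_lm(tokens):
    """Toy 'causal' model: count, for each token, which tokens follow it.
    Each prediction conditions only on preceding context."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(model, token):
    """Greedy decoding: pick the most frequent continuation of `token`."""
    if token not in model:
        return None
    return model[token].most_common(1)[0][0]

def generate(model, start, n=5):
    """Autoregressively extend a sequence, one token at a time."""
    out = [start]
    for _ in range(n):
        nxt = predict_next(model, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return out

# Illustrative mini-corpus of finance-flavored text
corpus = ("the fed raised rates . the fed held rates steady . "
          "the market rallied after the fed raised rates .").split()
model = train_bigram_lm(corpus)
print(generate(model, "the", n=4))  # ['the', 'fed', 'raised', 'rates', '.']
```

A real decoder-only LLM replaces the bigram counts with a transformer conditioning on the whole preceding context, but the autoregressive, left-to-right prediction loop is the same.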

BloombergGPT has been evaluated on a suite of finance-specific NLP benchmarks, Bloomberg’s internal benchmarks, and broad categories of general-purpose NLP tasks. The model outperformed similarly sized open models on financial tasks while performing on par with or better than them on general NLP benchmarks.

According to Gideon Mann, head of Bloomberg’s ML product and research team, the quality of machine learning and NLP models depends on the data they are trained on. Thanks to Bloomberg’s extensive collection of financial documents, the team was able to carefully build large, clean, domain-specific datasets for training LLMs best suited to financial use cases.


Written by Vishak

Vishak is Editor-in-Chief at Code and Hack, with a passion for AI and coding. He follows the latest trends and breakthroughs in machine learning, natural language processing, and software development, and writes engaging, informative articles that keep his readers up to date.

