According to The Information, Microsoft has been building its own artificial intelligence processor, “Athena,” since 2019. The chip is designed specifically for training large language models (LLMs), similar to the Tensor Processing Unit (TPU) developed in-house by Google and the Trainium and Inferentia processors developed by Amazon.
The initial Athena device will likely be built on Taiwan Semiconductor’s (TSMC) 5-nanometer process, and Microsoft might have it in-house as soon as next year. Given that the scale of sophisticated generative AI models is expanding faster than the processing capacity available to train them, this move should be viewed as a natural trend.
According to Jon Peddie Research, NVIDIA is currently the market leader in AI chips, with a share of around 88%. However, Microsoft’s decision to pursue a custom AI accelerator for LLM training suggests that hyperscalers increasingly see a need to develop their own silicon.
While NVIDIA makes very powerful general-purpose AI chips and is particularly strong in ML training thanks to its CUDA parallel computing platform, it may be more cost-effective for hyperscalers to develop their own chips for inference workloads. This is especially true for customers who don’t need the expensive NVIDIA option.
Microsoft’s move to develop its own AI chip for LLM training is expected to accelerate its generative AI strategy while cutting costs. Other companies are likewise expected to keep developing their own silicon to compete with NVIDIA and Intel in general-purpose cloud computing.
In conclusion, the trend toward custom AI silicon is set to continue as hyperscalers seek to keep pace with the rapid scaling of advanced generative AI models.