Google has shared details about its powerful supercomputer based on its custom Tensor Processing Unit (TPU) chips that the company uses for training its artificial intelligence (AI) systems. According to Google, its TPU chips are faster and more efficient than the NVIDIA A100 accelerators many other tech companies use.
The new system uses more than 4,000 fourth-generation TPU chips that Google developed in-house. These chips are used in over 90% of the company’s AI training tasks, including training chatbots that can communicate almost like humans and systems that generate images.
Google’s engineers used the supercomputer to train PaLM, the company’s largest publicly disclosed language model. The model was trained over 50 days across two 4,000-chip supercomputers, and the system’s interconnect allowed connections between chips to be reconfigured on the fly to route around failures and improve performance.
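The idea of reconfiguring chip-to-chip connections around failures can be illustrated with a minimal sketch. This is not Google’s implementation; the `build_ring` function and the ring topology are assumptions chosen purely to show how a switchable interconnect can reconnect surviving chips instead of taking the whole job down:

```python
# Illustrative sketch only, not Google's actual design: model a
# reconfigurable interconnect that links healthy chips into a ring
# and rebuilds the ring when a chip fails.

def build_ring(chips):
    """Return the list of links connecting healthy chips into a ring.

    `chips` maps a chip id to True (healthy) or False (failed).
    """
    healthy = [c for c in chips if chips[c]]
    return [(healthy[i], healthy[(i + 1) % len(healthy)])
            for i in range(len(healthy))]

# Four chips, all healthy: ring 0-1-2-3-0
chips = {0: True, 1: True, 2: True, 3: True}
print(build_ring(chips))  # [(0, 1), (1, 2), (2, 3), (3, 0)]

# Chip 2 fails; the "switch" reconnects the survivors: 0-1-3-0
chips[2] = False
print(build_ring(chips))  # [(0, 1), (1, 3), (3, 0)]
```

The point of the sketch is only that the topology is recomputed in software rather than being fixed in hardware, which is what lets training continue past a chip failure.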
Google claims that its fourth-generation TPU chip is 1.7 times faster and 1.9 times more energy-efficient than the NVIDIA A100 accelerator, which was on the market at the same time. However, Google did not compare its chip with the more modern NVIDIA H100, which was released later and is built on newer technology.
Google notes that the supercomputer was first launched in 2020 in an Oklahoma data centre, where it was used by the startup Midjourney to train its image-generating model, a neural network that creates images from text descriptions. The company claims its supercomputer is almost twice as fast as a comparable system based on the NVIDIA A100 GPU in AI training tasks.
Google also hinted that it is working on a new TPU to compete with the NVIDIA H100, but no details were provided.