Google Steps Up AI Game with Specialized TPU 8 Chips, Challenging Nvidia’s Dominance

On April 22, at the Google Cloud Next conference, Google Cloud unveiled its eighth-generation Tensor Processing Units (TPUs), marking a strategic shift toward specialized AI hardware. For the first time, Google has split its TPU architecture into two distinct chips: the TPU 8t, optimized for training large AI models, and the TPU 8i, designed for high-speed inference workloads.

The TPU 8t training chip scales to superpods of 9,600 chips delivering 121 exaflops of compute, nearly three times the peak of the previous Ironwood generation. Google asserts that this will cut frontier model development cycles from months to weeks.
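For context, here is a quick back-of-the-envelope sketch using only the figures quoted above. Google does not state the numeric precision behind the exaflops figure (pod-scale numbers are often quoted at low precision such as FP8), so treat the per-chip result as a rough estimate rather than a spec:

```python
# Back-of-the-envelope check of the superpod figures quoted above.
superpod_chips = 9_600            # TPU 8t chips per superpod (as claimed)
superpod_exaflops = 121           # peak superpod compute (precision unstated)

# 1 exaflop = 1,000 petaflops
per_chip_pflops = superpod_exaflops * 1_000 / superpod_chips
print(f"~{per_chip_pflops:.1f} PFLOPS per TPU 8t chip")  # ~12.6 PFLOPS

# "Nearly three times" the prior generation implies Ironwood superpods
# delivered on the order of 121 / 3 ≈ 40 exaflops.
print(f"Implied Ironwood superpod scale: ~{superpod_exaflops / 3:.0f} exaflops")
```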

The TPU 8i inference chip, meanwhile, offers 80% better performance-per-dollar than its predecessor. It pairs 288GB of high-bandwidth memory with 384MB of on-chip SRAM, enabling faster AI responses.
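A gain like this is easy to misread as "80% cheaper." Assuming the claim means 1.8x work per dollar (Google's exact basis for the figure is not given here), the cost math works out as follows:

```python
# "80% better performance-per-dollar" is often misread as "80% cheaper".
# Assuming the claim means 1.8x work per dollar, the same inference
# workload costs about 44% less, not 80% less.
perf_per_dollar_gain = 1.80               # TPU 8i vs. predecessor (as claimed)
relative_cost = 1 / perf_per_dollar_gain  # cost per unit of work, relative

print(f"Relative cost per unit of work: {relative_cost:.0%}")  # 56%
print(f"Effective savings: {1 - relative_cost:.0%}")           # 44%
```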

Both chips are paired with Google’s custom Arm-based Axion CPUs and use fourth-generation liquid cooling, delivering up to twice the performance-per-watt of previous generations. The new processors will be generally available later this year as part of Google’s AI Hypercomputer platform, which combines custom hardware, software frameworks, and cloud services into an integrated stack.

The launch of these specialized chips intensifies competition with Nvidia, which currently holds over 92% of the data center GPU market. Google is betting that chips tailored to specific AI workloads, rather than general-purpose GPUs, will give cloud customers superior performance and cost efficiency in the emerging era of AI agents.

Source: Google Blog
