Google has unveiled details of a new version of its data center artificial intelligence chips and announced an Arm-based central processor.
An Arm processor is a CPU built on the RISC (reduced instruction set computing) architecture, which uses a smaller, simpler set of instructions than x86 designs. Google’s Tensor Processing Units (TPUs) are one of the few alternatives to Nvidia’s advanced AI chips. However, developers can only use them through Google’s Cloud Platform and cannot purchase them directly.
Google’s new Axion CPU will first support the company’s own AI operations before becoming available to Google Cloud’s business customers later this year. The company stated that Axion outperforms both x86 chips and general-purpose Arm chips in the cloud. Axion chips will be used to run YouTube ads, power Google Earth Engine, and support various other Google services.
“And @ThomasorTK is announcing #Axion processors – @Google's first @ARM based chip – sees already 60% better energy efficiency than x86 #GoogleCloudNext”

— Holger Müller (@holgermu), April 9, 2024
Google says the Arm-based Axion CPU will deliver 30 per cent better performance than “general-purpose Arm chips” and outperform Intel’s current processors by 50 per cent.
“We’re making it easy for customers to bring their existing workloads to Arm,” Google Cloud’s vice president and general manager of compute and machine learning infrastructure, Mark Lohmeyer, told Reuters. “Axion is built on open foundations but customers using Arm anywhere can easily adopt Axion without re-architecting or re-writing their apps.”
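Because most interpreted and architecture-neutral code runs unchanged on Arm, a first migration check can be as simple as confirming the CPU architecture at runtime. The snippet below is a minimal sketch assuming a Linux VM; it is illustrative, not part of Google’s tooling.

```python
import platform

# On an Arm-based Linux VM (such as an Axion instance), the kernel reports
# "aarch64"; on an x86 VM it reports "x86_64". Pure-Python code runs on both.
arch = platform.machine()

if arch == "aarch64":
    print("Running on an Arm CPU; pure-Python code needs no re-architecting.")
elif arch == "x86_64":
    print("Running on x86; the same code would run as-is on an Arm instance.")
else:
    print(f"Unrecognized architecture: {arch}")
```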
Lohmeyer also wrote in a blog post that the tech giant is improving its TPU AI chips: “TPU v5p is a next-generation accelerator that is purpose-built to train some of the largest and most demanding generative AI models.” The Alphabet subsidiary said the new TPU v5p chip is designed to operate in pods of 8,960 chips, delivering twice the raw performance of the previous TPU generation. TPU v5p is already available via Google’s cloud.
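As a rough illustration of what such a pod slice looks like from software, the JAX sketch below enumerates the accelerator cores visible to a single host and shards a trivial computation across them. It assumes a Cloud TPU VM with JAX installed; it is not Google’s training code.

```python
import jax
import jax.numpy as jnp

# On a Cloud TPU VM, jax.devices() lists the TPU cores attached to this host;
# a full v5p pod spans many such hosts (8,960 chips in total, per Google).
print(f"{jax.device_count()} cores visible, platform: {jax.devices()[0].platform}")

# Shard a toy computation across all local cores with pmap.
@jax.pmap
def scaled_sum(x):
    return jnp.sum(x * 2.0)

batch = jnp.ones((jax.local_device_count(), 1024))
print(scaled_sum(batch))  # one partial result per core
```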
New features in Google’s cloud AI Hypercomputer architecture
Google stated it has made significant improvements to its AI Hypercomputer architecture, focusing on performance-optimized hardware. These include the general availability of Cloud TPU v5p and of A3 Mega VMs powered by NVIDIA H100 Tensor Core GPUs, updates Google says offer higher performance for large-scale training along with enhanced networking capabilities.
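For scale, the sketch below shows how a customer might request one of these VMs with the google-cloud-compute Python client. The project, zone, and image are placeholders, and the a3-megagpu-8g machine-type name is an assumption based on Google’s A3 Mega naming; treat it as illustrative rather than a verified recipe.

```python
from google.cloud import compute_v1

project, zone = "my-project", "us-central1-a"  # hypothetical placeholders

# a3-megagpu-8g is assumed here to be the A3 Mega machine type (8x H100).
instance = compute_v1.Instance(
    name="a3-mega-demo",
    machine_type=f"zones/{zone}/machineTypes/a3-megagpu-8g",
    disks=[
        compute_v1.AttachedDisk(
            boot=True,
            auto_delete=True,
            initialize_params=compute_v1.AttachedDiskInitializeParams(
                source_image="projects/debian-cloud/global/images/family/debian-12",
                disk_size_gb=200,
            ),
        )
    ],
    network_interfaces=[compute_v1.NetworkInterface(network="global/networks/default")],
    # GPU VMs cannot live-migrate, so host maintenance must terminate the VM.
    scheduling=compute_v1.Scheduling(on_host_maintenance="TERMINATE"),
)

operation = compute_v1.InstancesClient().insert(
    project=project, zone=zone, instance_resource=instance
)
operation.result()  # block until the create operation completes
```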
Google has also optimized its storage portfolio for AI workloads with Hyperdisk ML, a new block storage service designed for AI inference and serving workloads. New caching capabilities in Cloud Storage FUSE and Parallelstore improve throughput and latency for both training and inference.
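To make the caching benefit concrete, here is a minimal sketch that reads the same model file twice from a bucket mounted with Cloud Storage FUSE. The mount point and file path are hypothetical, and the faster second read assumes the FUSE file cache is enabled.

```python
import time
from pathlib import Path

# Hypothetical path: a GCS bucket mounted at /gcs/models via Cloud Storage FUSE.
weights = Path("/gcs/models/gemma-7b/weights.bin")

for attempt in (1, 2):
    start = time.perf_counter()
    size = len(weights.read_bytes())  # second read should hit the local cache
    print(f"read {attempt}: {size} bytes in {time.perf_counter() - start:.2f}s")
```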
On the software front, Google has introduced several open-source developments, including JetStream, a throughput- and memory-optimized inference engine for large language models (LLMs) that Google says delivers higher performance per dollar on open models such as Gemma 7B.
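As a sketch of what querying such a serving endpoint might look like, the snippet below sends a generation request over HTTP. The endpoint path and payload fields are illustrative assumptions, not JetStream’s actual API.

```python
import json
import urllib.request

# Hypothetical request shape for an LLM serving endpoint; the field names and
# the /generate path are assumptions for illustration only.
payload = {"prompt": "Summarize TPU v5p in one sentence.", "max_tokens": 64}

request = urllib.request.Request(
    "http://localhost:9000/generate",  # assumed local serving address
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    print(json.load(response))
```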
Google is also introducing new flexible consumption options to accommodate varying workload needs. These include the Dynamic Workload Scheduler, which offers a calendar mode for start-time assurance and a flex start mode for optimized economics, adding efficiency and flexibility to Google’s cloud offerings.
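To illustrate the difference between the two modes, the sketch below shows what a request for each might look like. The field names are assumptions for illustration and do not reflect the Dynamic Workload Scheduler’s actual API.

```python
# Hypothetical request shapes; all field names are illustrative assumptions.

# Calendar mode: reserve accelerators for a fixed, guaranteed start time.
calendar_request = {
    "mode": "calendar",
    "start_time": "2024-05-01T08:00:00Z",  # assured start
    "duration_days": 7,
    "accelerator": "nvidia-h100",
    "count": 16,
}

# Flex start mode: queue the job and start whenever capacity is available,
# trading start-time certainty for better economics.
flex_start_request = {
    "mode": "flex-start",
    "max_wait_hours": 24,
    "duration_days": 3,
    "accelerator": "nvidia-h100",
    "count": 16,
}

for req in (calendar_request, flex_start_request):
    print(req["mode"], "->", req)
```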
Featured image: Canva