Google has announced that it is expanding its AI-optimised infrastructure portfolio with Cloud TPU v5e. The company claims the chip is “the most cost-efficient, versatile, and scalable Cloud TPU to date.” With the new tensor processing unit (TPU), Google aims to address computing infrastructure that cannot keep up with growing workloads such as generative AI and large language models (LLMs).
“The number of parameters in LLMs has increased by 10x per year over the past five years. As a result, customers need AI-optimised infrastructure that is both cost-effective and scalable,” Google said.
“We offer a complete solution for AI, from computing infrastructure optimised for AI to the end-to-end software and services that support the full lifecycle of model training, tuning, and serving at global scale,” it added.
TPU v5e features, specs
According to Google, Cloud TPU v5e is purpose-built to bring the cost-efficiency and performance required for medium- and large-scale training and inference. It is claimed to deliver “up to 2x higher training performance per dollar and up to 2.5x inference performance per dollar for LLMs and gen AI models compared to Cloud TPU v4.”
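To put the performance-per-dollar claims in cost terms, here is a minimal sketch. It assumes the quoted "up to" figures are fully realised and that the cost of a fixed workload scales inversely with performance per dollar; the function name `relative_cost` is illustrative and not from Google.

```python
def relative_cost(perf_per_dollar_gain: float) -> float:
    """Cost of a fixed workload relative to the baseline (here, Cloud TPU v4),
    assuming cost scales inversely with performance per dollar."""
    if perf_per_dollar_gain <= 0:
        raise ValueError("gain must be positive")
    return 1.0 / perf_per_dollar_gain


# Best-case figures quoted by Google ("up to" values, not guarantees).
training_cost = relative_cost(2.0)
inference_cost = relative_cost(2.5)

print(f"Training cost vs v4:  {training_cost:.0%}")
print(f"Inference cost vs v4: {inference_cost:.0%}")
```

Under these assumptions, the same training job would cost about half as much as on TPU v4, and the same inference workload about 40% as much. Actual savings depend on the model and configuration.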
Google said the new chip combines performance and flexibility with cost benefits.
“We balance performance, flexibility, and efficiency with TPU v5e pods, allowing up to 256 chips to be interconnected with an aggregate bandwidth of more than 400 Tb/s and 100 petaOps of INT8 performance,” Google said. TPU v5e also lets customers choose the right configuration to serve a wide range of LLM and gen AI model sizes.
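The pod-level figures can be broken down per chip with simple arithmetic. The sketch below assumes a full 256-chip pod and an even split across chips, which Google's statement does not spell out. Because the bandwidth is quoted as "more than 400 Tb/s", that result is a lower bound. The variable names are illustrative.

```python
# Pod-level figures quoted by Google for a full TPU v5e pod.
POD_CHIPS = 256
POD_INT8_PETAOPS = 100       # aggregate INT8 performance
POD_BANDWIDTH_TBPS = 400     # quoted as "more than 400 Tb/s"

# Assumption: performance and bandwidth divide evenly across chips.
per_chip_int8_teraops = POD_INT8_PETAOPS * 1000 / POD_CHIPS
per_chip_bandwidth_tbps = POD_BANDWIDTH_TBPS / POD_CHIPS

print(f"INT8 per chip:      ~{per_chip_int8_teraops:.0f} teraOps")
print(f"Bandwidth per chip: >{per_chip_bandwidth_tbps:.2f} Tb/s")
```

On these assumptions, each chip contributes roughly 390 teraOps of INT8 performance and a little over 1.5 Tb/s of the aggregate bandwidth.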
Google’s new supercomputer
Google has also announced a new version of its supercomputer for running more generative AI models. The A3 VMs are based on Nvidia H100 GPUs and are designed to power large-scale AI models. Each A3 VM features dual 4th Gen Intel Xeon Scalable processors, eight Nvidia H100 GPUs, and 2TB of host memory.