Over the past five years, we at Arm have been incredibly gratified and proud to see our partners delivering cloud innovation through Arm Neoverse technology.
In the last six months alone: Alibaba Cloud announced new server chips powered by 128 Arm cores for optimized cloud computing services; Google Cloud unveiled its collaboration with Intel on the Arm Neoverse-based Mount Evans IPU; Oracle Cloud Infrastructure (OCI) introduced Arm-based cloud instances with the 80-core Ampere A1 Compute; and Tencent Cloud launched trial availability of its Arm-based Cloud Instance-SR1, built on the Ampere Altra processor.
The latest step came last week when AWS announced Graviton3, a server application processor that provides up to a 25 percent performance uplift over its predecessor while consuming up to 60 percent less energy.
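As a back-of-envelope illustration, the two headline Graviton3 figures can be combined into a performance-per-watt bound. Note the simplifying assumption: AWS frames both numbers as "up to" claims, and they may not hold simultaneously on the same workload.

```python
# Back-of-envelope sketch combining AWS's headline "up to" claims for
# Graviton3 vs. Graviton2. These are upper bounds, not guarantees.
perf_uplift = 1.25    # up to 25 percent more performance
energy_ratio = 0.40   # up to 60 percent less energy => 40% of the energy

# Performance per unit of energy scales as performance / energy.
perf_per_watt_gain = perf_uplift / energy_ratio
print(f"Upper-bound perf-per-watt gain: ~{perf_per_watt_gain:.2f}x")
```

Under those assumptions the upper bound works out to roughly 3.1x performance per watt, which is why the energy figure matters as much as the raw uplift.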
AWS continues to set the bar for price-performance with success stories across a variety of workloads including media services (NextRoll: 50 percent TCO improvement), ERP (Globe3: 20 percent performance improvement and 20 percent price improvement), risk assessment (LexisNexis: 30 percent more traffic) and 3D modeling (S-Cube: 30-45 percent performance uplift with a 20 percent price reduction).
These milestones also highlight another trend with far-reaching consequences for digital infrastructure: the industry-wide shift toward scalable customization.
The road to customization
Cloud was initially built by deploying scale-out systems of co-equal nodes working together. Scale-out systems can be built with commodity hardware because the goal is aggregate capacity rather than per-node capability. This approach led to significant cost efficiencies and drove a new need-based consumption model for compute services.
Cloud services grew, but over time they began to look like a commoditized infrastructure service, until AWS reimagined the server and invested in the underlying compute silicon. This spurred a new wave of differentiation and created more value for its customers: AWS can now offer distinct, optimized experiences at scale.
Exhibit A: Data processing units (DPUs)
Consider AWS Nitro, an Arm-based system for managing network, storage, security, and other operational functions, which can absorb 30 percent or more of a host's CPU cycles. A dedicated system with its own software stack, Nitro processes these functions on a parallel, optimized track rather than as an offload from the CPU. The result? Reduced storage latency, increased networking speeds, accelerated application workloads, lower costs for customers, and a reduced attack surface, all at the same time.
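To see why that offload matters, here is an illustrative calculation. The 30 percent figure comes from the paragraph above; the linear throughput model is a simplifying assumption of this sketch, not an AWS-published result.

```python
# Illustrative sketch: if infrastructure tasks (network, storage, security)
# consume 30% of a host's CPU cycles, offloading them to a dedicated
# system like Nitro frees the full CPU for application work.
overhead = 0.30                   # share of cycles absorbed by infra functions
app_share_before = 1 - overhead   # only 70% of the CPU ran applications

# Simple model: application throughput scales with the cycles available to it.
throughput_gain = 1.0 / app_share_before
print(f"Effective application-capacity gain: ~{throughput_gain:.2f}x")
```

Under that model, reclaiming 30 percent of cycles yields roughly 1.4x more application capacity per server before any other optimization.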
Nitro also accelerates product development by allowing AWS to take a building-block approach to new services. Since 2017, when the full Nitro stack became available, the number of instance types available from AWS has grown by more than 300, a 4x increase.
Other cloud providers are achieving similar results with data processing units (DPUs). DPU resources will also inspire new types of services: imagine DPUs enabling AI-supported cloud security services for combatting the growing problem of deepfakes, or zero-delay defenses against zero-day attacks. Going forward, you likely won't see many data centers built without DPUs, a marked departure from the data centers of today.
New custom processors for purpose-built performance
The same customization-driven diversification is occurring across application processors. Although the Arm Neoverse-based chips from Ampere and AWS, and those coming in the future from Alibaba and others, share a common foundation, they differ in speed, core count, design, performance, and features.
Graviton2, for example, features automatic 256-bit system memory encryption to ameliorate a chronic security problem: the failure of customers to make encryption an ordinary business practice. Graviton3 will offer significant performance improvements, including up to a 3x speedup on ML inference workloads, and will be the first server CPU to support DDR5 memory. Graviton3 will power the new EC2 C7g instances on the AWS cloud. Tiers of instances varying by price, performance, and application will proliferate as more variations on the base Neoverse designs come to market.
In high-performance computing (HPC), researchers from the UK, India, the Republic of Korea, the U.S. and elsewhere are building next-generation Arm Neoverse chips tailored to a different set of goals. Some want to establish breakthrough levels of performance per watt; others are aiming at architectures that can trickle down into commercial products.
The benefits of scalable customization will also percolate across the network and to the edge. They have to. As my colleague Panch Chandrasekharan has noted, 5G networks need to manage more data and more complex workloads while keeping costs in line with 4G and energy consumption in check. A combination of open architectures and customized, specialized silicon makes that possible. One project worth watching is DISH Network's nationwide 5G network being built on an AWS backbone. The cloudified network will be a first for the U.S., and DISH will adopt Graviton servers for its computational needs.
Ten years ago, ten years ahead
If you had asked people in 2011 what the data center of the future would look like, they would probably have imagined a bigger version of what they already knew. It might have had more flash memory or sported some “software-defined” elements, but overall the core elements would have been the same: racks of dual-socket servers with CPUs juggling threads and churning out hot air.
Today, data centers are being built around a portfolio of processors and cloud-native software stacks. And while systems remain compatible, they are more differentiated than ever. With a growing number of customers using Arm Neoverse as a platform for pushing the envelope, that diversity will only grow.