Ubiquitous compute, pervasive connectivity, artificial intelligence and cloud-to-edge infrastructure drive a diverse and expanding range of computing workloads from the desktop to the data center.
At Architecture Day 2021, Intel detailed the company’s architectural innovations to meet this exploding demand, setting the stage for new generations of leadership products. Intel architects provided details on two new x86 central processing unit architectures; Intel’s first performance hybrid architecture and Intel® Thread Director; Intel’s next-generation data center processors; infrastructure processing unit architectures; and upcoming graphics architectures.
- Raja Koduri Editorial: Intel Advances Architecture for Data Center, HPC-AI and Client Computing
- Fact Sheet: Intel Unveils Biggest Architectural Shifts in a Generation for CPUs, GPUs and IPUs
- Stuart Pann Editorial: Expanding Intel’s Foundry Partnerships: A Critical Piece of IDM 2.0
x86 Architectures: With its Efficient-core and its Performance-core, Intel signals the biggest architectural shift in a generation for x86 central processing units. The Efficient-core microarchitecture is designed for throughput efficiency and efficient offloading of background tasks for multitasking. It runs at low voltage and creates headroom to increase frequency and ramp up performance for more demanding workloads. The Performance-core microarchitecture is designed for speed and is the highest performing CPU core Intel has built. It pushes the limits of low latency and single-threaded application performance and provides a significant performance boost at high power efficiency, which can better support large applications.
Client Computing – “Alder Lake,” Intel Thread Director: Intel’s next-generation client architecture, code-named “Alder Lake,” is Intel’s first performance hybrid architecture. Alder Lake integrates a Performance-core and an Efficient-core to provide significant performance across all workload types. For the cores to work seamlessly with the operating system, Intel developed Intel® Thread Director. Built directly into the core, Thread Director empowers the operating system to place the right thread on the right core at the right time. Alder Lake will deliver performance that scales to support all client segments from ultra-portable laptops to enthusiast and commercial desktops.
Client Computing – Graphics: Xe-HPG is a new discrete graphics microarchitecture for gamers and creators, designed for scalability and enthusiast-class performance with a software-first approach. Products based on this microarchitecture will come to market in the first quarter of 2022 under the Intel® Arc™ brand and the Alchemist family of systems-on-chip.
Data Center – Sapphire Rapids: Combining Intel’s Performance-cores with new accelerator cores, “Sapphire Rapids,” the next generation of Intel® Xeon® Scalable processors, represents the industry’s biggest data center platform advancement in over a decade. The processor delivers substantial compute performance across dynamic and increasingly demanding data center usages and is workload-optimized to deliver high performance on elastic compute models like cloud, microservices and artificial intelligence.
Data Center – Infrastructure Processing Unit: The IPU is designed to enable cloud and communication service providers to reduce overhead and free up performance for central processing units. Among the new IPU architectures: Mount Evans is Intel’s first ASIC IPU, designed to address the complexity of diverse and dispersed data centers. Oak Springs Canyon is an IPU reference platform built with the Intel® Xeon® D processor and the Intel® Agilex™ FPGA. And the Intel® N6000 Acceleration Development Platform is designed for use with Xeon-based servers.
Data Center – Ponte Vecchio: Based on the Xe-HPC microarchitecture, Ponte Vecchio delivers industry-leading FLOPs and compute density to accelerate artificial intelligence, high performance computing and advanced analytics workloads. Intel disclosed details of the Xe-HPC microarchitecture, including that A0 silicon performance is providing greater than 45 TFLOPS FP32 throughput, greater than 5 TBps memory fabric bandwidth and greater than 2 TBps connectivity bandwidth.
Xe-HPG – High-Quality Super Sampling: This XeSS demo in 4K shows high-quality super sampling in action on Xe-HPG. XeSS uses deep learning to synthesize images that are very close to the quality of native high-resolution rendering. This reconstruction is performed by a neural network trained to deliver high performance and great quality. The content and game levels shown in this demo were created by Rens, a 3D artist, environment artist and technical art director. He is known for his outstanding photogrammetry techniques and high-end rendering skills, and has worked with top game development studios like DICE, Epic Games and Sony.
oneAPI AI Rendering Toolkit: This demo shows an end-to-end film-quality creator workflow using a pre-production implementation of the Intel® Embree and artificial intelligence-based Intel® Open Image Denoise libraries from the open-source Intel® oneAPI Rendering Toolkit, running cross-architecture on CPUs and Xe architecture ray tracing-accelerated GPUs. This includes a look at a CPU-based creator workflow using the commercial SideFX Houdini application with Pixar’s open-source USD APIs calling the Embree and Open Image Denoise APIs. Then, the same USD, Embree and Open Image Denoise API calls used on the CPU deliver beautiful rendering in real time via an Xe GPU. Experience a real-time walk-through of an Intel history-inspired path-traced scene at the fictitious 4004 Moore Lane. Finally, see the final 4K film-quality version of a 45-second film containing 1,350 frames, rendered many times faster than with the CPU only. Now, the same feature-rich render kit capabilities artists and app developers crave on CPUs, including ray tracing and AI acceleration, are available on GPUs.
oneAPI AI Analytics Toolkit: This demo highlights Ponte Vecchio with the oneAPI AI Analytics Toolkit running ResNet50 benchmarks. Early results from ResNet50 inference and training throughput when using Ponte Vecchio and Sapphire Rapids have already established a new performance bar and show that Intel is on track to deliver on its goal of artificial intelligence and high performance computing performance leadership that surpasses the previous performance leader.
AMX: Advanced Matrix Extensions (AMX) is Intel’s next-generation, built-in AI acceleration advancement for machine learning inference and training, targeted at the data center. AMX is a hardware block in the Sapphire Rapids CPU with a new expandable two-dimensional register file and new matrix multiply instructions to enhance performance for a variety of deep learning workloads. This demo shows AMX in action in Intel’s validation labs.
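To make the idea of a two-dimensional register file paired with matrix multiply instructions more concrete, here is a minimal conceptual sketch in plain Python. It does not execute any AMX instructions and does not reflect the actual hardware layout: the tile edge `TILE`, the helper names `get_tile`, `tile_matmul_accumulate` and `tiled_matmul`, and the use of square integer matrices are all assumptions chosen for illustration. It only shows the general pattern of loading small two-dimensional blocks and accumulating their products into a block-sized accumulator.

```python
# Conceptual sketch only: plain Python, no AMX instructions are executed.
TILE = 4  # hypothetical tile edge, kept small for readability


def get_tile(matrix, row, col):
    """Copy a TILE x TILE block out of a matrix (a stand-in for a tile load)."""
    return [r[col:col + TILE] for r in matrix[row:row + TILE]]


def tile_matmul_accumulate(acc, a_tile, b_tile):
    """acc += a_tile @ b_tile, a stand-in for one matrix multiply instruction."""
    for i in range(TILE):
        for j in range(TILE):
            total = 0
            for k in range(TILE):
                total += a_tile[i][k] * b_tile[k][j]
            acc[i][j] += total


def tiled_matmul(a, b):
    """Multiply two square matrices whose size is a multiple of TILE."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(0, n, TILE):
        for j in range(0, n, TILE):
            # The accumulator plays the role of a two-dimensional register.
            acc = [[0] * TILE for _ in range(TILE)]
            for k in range(0, n, TILE):
                tile_matmul_accumulate(acc, get_tile(a, i, k), get_tile(b, k, j))
            # Store the finished accumulator back into the result matrix.
            for r in range(TILE):
                c[i + r][j:j + TILE] = acc[r]
    return c
```

Working on whole blocks lets one operation perform many multiply-accumulate steps at once, which is the general reason matrix-oriented hardware can help deep learning workloads.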
Alder Lake: This demo shows several examples of Intel Thread Director technology running on an Alder Lake platform with Microsoft Windows 11. Intel Thread Director provides hints to the operating system for optimally scheduling threads of various types to run on either a Performance-core or an Efficient-core. Thread Director technology allows Intel to provide smarter assistance to the operating system by monitoring the instruction mix, the current state of each core and other relevant microarchitecture telemetry at a granular level.
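As a rough illustration of hint-guided placement on a hybrid processor, the toy model below assigns threads to core types. It is a sketch under stated assumptions, not how Thread Director or the Windows 11 scheduler works: the `Thread` class, the `"performance"` and `"efficiency"` hint labels, and the `place_threads` function are hypothetical, and the real technology derives its hints from hardware telemetry while the operating system makes the final scheduling decision.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    hint: str  # hypothetical label: "performance" or "efficiency"


def place_threads(threads, p_cores, e_cores):
    """Toy placement: honor each hint if a core of that type is free.

    Returns a dict mapping thread name to "P", "E", or None (no core free).
    """
    free = {"P": p_cores, "E": e_cores}
    placement = {}
    for thread in threads:
        preferred = "P" if thread.hint == "performance" else "E"
        fallback = "E" if preferred == "P" else "P"
        if free[preferred] > 0:
            core = preferred
        elif free[fallback] > 0:
            core = fallback  # hints are advisory, so fall back to the other type
        else:
            core = None  # thread waits until a core becomes available
        if core is not None:
            free[core] -= 1
        placement[thread.name] = core
    return placement


if __name__ == "__main__":
    demo = [
        Thread("game", "performance"),
        Thread("indexer", "efficiency"),
        Thread("encode", "performance"),
    ]
    print(place_threads(demo, p_cores=1, e_cores=2))
```

The point of the sketch is that hints are advisory: when the preferred core type is busy, the placement logic still makes progress by using whatever core is free.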