Intel ARC graphics cards based on the Alchemist Xe-HPG GPUs are set to launch next year, and based on the specifications, they could deliver very competitive performance against AMD and NVIDIA GPUs.
Intel's Flagship ARC Graphics Cards With Xe-HPG Alchemist GPU To Be Highly Competitive Against NVIDIA GA104 & AMD Navi 22
The first Intel ARC graphics cards will be powered by the Alchemist GPUs based on the Xe-HPG architecture. Intel has so far confirmed that the first discrete graphics cards will hit retail by Q1 2022 and will be based on the TSMC 6nm process node. Intel also detailed the specifications of Alchemist GPUs and the core building blocks which include the Xe-Core.
Intel ARC Xe-HPG Alchemist GPU - The Building Blocks
To round up what we have learned, the Intel Xe-HPG Alchemist GPU is built around the Xe-Core, the fundamental building block of the 1st Gen ARC lineup. The Xe-Core is a compute block composed of 16 Vector Engines (256-bit per engine) and 16 Matrix Engines (1024-bit per engine). Each Vector Engine is composed of 8 ALUs, so in total we are looking at 128 ALUs per Xe-Core. Each Matrix Engine is also referred to as an XMX block and handles tensor operations in both FP16 and INT8 modes. The Xe-Core also features its own dedicated L1 cache.
Intel fuses four Xe-Cores together to form a Render Slice, which also contains four Ray Tracing Units, four Sampler Units, Geometry/Rasterize/HiZ engines, and two Pixel Backend blocks with 8 units each. These Render Slices are combined to form the full GPU. The flagship uses an 8 Render Slice configuration with 32 Xe-Cores, 512 Vector Engines, and 4096 ALUs. There will also be configurations with 2, 4, and 6 Render Slices, but this report focuses on the flagship part.
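The unit counts above follow directly from multiplying out the hierarchy. Below is a minimal Python sketch of that arithmetic. The per-block figures are the ones quoted in this article; the class and property names are purely illustrative and not Intel terminology, and treating the Pixel Backend units as the ROP count in the comparison table is an assumption.

```python
from dataclasses import dataclass

# Per-block figures as quoted in the article.
VECTOR_ENGINES_PER_XE_CORE = 16
ALUS_PER_VECTOR_ENGINE = 8
MATRIX_ENGINES_PER_XE_CORE = 16
XE_CORES_PER_RENDER_SLICE = 4
RT_UNITS_PER_RENDER_SLICE = 4
PIXEL_BACKENDS_PER_RENDER_SLICE = 2
UNITS_PER_PIXEL_BACKEND = 8


@dataclass(frozen=True)
class AlchemistConfig:
    """Hypothetical helper: derives unit counts from the Render Slice count."""

    render_slices: int

    @property
    def xe_cores(self) -> int:
        return self.render_slices * XE_CORES_PER_RENDER_SLICE

    @property
    def vector_engines(self) -> int:
        return self.xe_cores * VECTOR_ENGINES_PER_XE_CORE

    @property
    def alus(self) -> int:
        return self.vector_engines * ALUS_PER_VECTOR_ENGINE

    @property
    def matrix_engines(self) -> int:
        return self.xe_cores * MATRIX_ENGINES_PER_XE_CORE

    @property
    def rt_units(self) -> int:
        return self.render_slices * RT_UNITS_PER_RENDER_SLICE

    @property
    def pixel_backend_units(self) -> int:
        # Assumption: these correspond to the ROPs listed in the comparison table.
        return (self.render_slices
                * PIXEL_BACKENDS_PER_RENDER_SLICE
                * UNITS_PER_PIXEL_BACKEND)


if __name__ == "__main__":
    # The article mentions configurations with 2, 4, 6 and 8 Render Slices.
    for slices in (2, 4, 6, 8):
        cfg = AlchemistConfig(slices)
        print(slices, cfg.xe_cores, cfg.vector_engines, cfg.alus,
              cfg.matrix_engines, cfg.rt_units, cfg.pixel_backend_units)
```

For the 8-slice flagship this gives 32 Xe-Cores, 512 Vector Engines, 4096 ALUs, 512 Matrix Engines and 32 Ray Tracing Units, in line with the figures quoted in this report.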
Intel ARC Alchemist vs NVIDIA GA104 & AMD Navi 22 GPUs
GPU Name | Alchemist DG-512 | NVIDIA GA104 | AMD Navi 22 |
---|---|---|---|
Architecture | Xe-HPG | Ampere | RDNA 2 |
Process Node | TSMC 6nm | Samsung 8nm | TSMC 7nm |
Flagship Product | ARC (TBA) | GeForce RTX 3070 Ti | Radeon RX 6700 XT |
Raster Engine | 8 | 6 | 2 |
FP32 Cores | 32 Xe Cores | 48 SM Units | 40 Compute Units |
FP32 Units | 4096 | 6144 | 2560 |
FP32 Compute | ~16 TFLOPs | 21.7 TFLOPs | 12.4 TFLOPs |
TMUs | 256 | 192 | 160 |
ROPs | 128 | 96 | 64 |
RT Cores | 32 RT Units | 48 RT Cores (V2) | 40 RA Units |
Tensor Cores | 512 XMX Cores | 192 Tensor Cores (V3) | N/A |
Tensor Compute | ~131 TFLOPs FP16, ~262 TOPs INT8 | 87 TFLOPs FP16, 174 TOPs INT8 | 25 TFLOPs FP16, 50 TOPs INT8 |
L2 Cache | TBA | 4 MB | 3 MB |
Additional Cache | 16 MB Smart Cache? | N/A | 96 MB Infinity Cache |
Memory Bus | 256-bit | 256-bit | 192-bit |
Memory Capacity | 16 GB GDDR6 | 8 GB GDDR6X | 12 GB GDDR6 |
Launch | Q1 2022 | Q2 2021 | Q1 2021 |
Intel ARC Xe-HPG Alchemist GPU - Comparing It To NVIDIA's GA104 & AMD's Navi 22
3DCenter has put together a rundown of the specifications and a comparison, which gives us an idea of the theoretical performance that Intel's new GPU will offer. Right off the bat, Intel's ARC Xe-HPG Alchemist flagship will offer more TMUs and ROPs than the NVIDIA and AMD competition. The core count of 4096 is higher than that of AMD's Navi 22, and even of the Navi 21-based RX 6800, but lower than the 6144 listed for NVIDIA's GA104. Note that NVIDIA uses a dual-FP32 counting methodology; counted without that doubling, GA104 would have 3072 cores.
Intel's ARC Alchemist GPU has fewer ray tracing units than the competition, but we don't know exactly how its ray tracing implementation works. For example, while Navi 22 offers more ray tracing units than the GA106 Ampere GPU, the hardware-level integration of NVIDIA's RT cores is superior to AMD's implementation. The final performance will therefore depend on Intel's hardware-level integration and software-level optimization for ray tracing applications.
One major lead Intel could have over the competition is AI-assisted supersampling. This applies mainly against NVIDIA, since AMD lacks dedicated hardware in this department. Intel has already showcased an impressive demo of its XeSS technology, and based on the expected numbers, Intel's XMX architecture could outperform the Tensor Core implementation that NVIDIA uses for DLSS. Intel is also expected to feature a small but useful game cache on its GPUs and to equip them with higher VRAM capacities of up to 16 GB (GDDR6) across a 256-bit bus interface. That would be twice as much memory as NVIDIA's RTX 3070 and RTX 3070 Ti, so NVIDIA may have to prepare a refresh to counter it.
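The "expected numbers" for XMX throughput in the table can be roughly reproduced from the engine width. The sketch below is only one plausible derivation, not a confirmed Intel specification. It assumes each 1024-bit Matrix Engine processes 1024 / element-width values per clock, that a fused multiply-add counts as two operations, and that the GPU runs at the 2 GHz peak clock used for the FP32 estimate in this comparison. The function name is hypothetical.

```python
def xmx_peak_tera_ops(matrix_engines: int,
                      engine_width_bits: int,
                      element_bits: int,
                      clock_ghz: float,
                      ops_per_lane_per_clock: int = 2) -> float:
    """Estimate peak matrix throughput in tera-operations per second.

    Assumptions (not confirmed by Intel): one element per lane per clock,
    and each lane performs a fused multiply-add counted as 2 operations.
    """
    lanes_per_engine = engine_width_bits // element_bits
    ops_per_clock = matrix_engines * lanes_per_engine * ops_per_lane_per_clock
    # ops/clock * GHz gives giga-ops per second; divide by 1000 for tera-ops.
    return ops_per_clock * clock_ghz / 1000.0


if __name__ == "__main__":
    # Flagship figures from the article: 512 XMX engines, 1024-bit each.
    fp16 = xmx_peak_tera_ops(512, 1024, 16, clock_ghz=2.0)
    int8 = xmx_peak_tera_ops(512, 1024, 8, clock_ghz=2.0)
    print(f"FP16: {fp16:.1f} TFLOPs, INT8: {int8:.1f} TOPs")
```

Under these assumptions the result is about 131 TFLOPs for FP16 and about 262 TOPs for INT8, which matches the table. A different final clock speed would scale both numbers linearly.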
Lastly, the theoretical FP32 compute performance is calculated with an expected peak clock rate of 2 GHz. That is the most likely scenario for TSMC's 6nm process node given how well clocks scale on TSMC's 7nm node. Based on that, the Intel Xe-HPG Alchemist GPU could offer around 16-17 TFLOPs of compute power. This is lower than what NVIDIA's GA104 produces (21.7 TFLOPs), but it should be noted that not all FLOPs are equal, as gaming architectures behave very differently from datacenter chips.
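The FP32 estimate is simple arithmetic: ALU count times two operations per clock (one fused multiply-add) times clock speed. Here is a minimal sketch, with hypothetical function names, that also back-calculates the clock speeds implied by the NVIDIA and AMD figures in the table. The two-operations-per-clock convention is the usual way peak FLOPs are quoted and is assumed here rather than stated in the article.

```python
def peak_fp32_tflops(fp32_units: int, clock_ghz: float,
                     ops_per_clock: int = 2) -> float:
    """Peak FP32 throughput, counting one FMA as 2 operations (assumed)."""
    return fp32_units * ops_per_clock * clock_ghz / 1000.0


def implied_clock_ghz(fp32_units: int, tflops: float,
                      ops_per_clock: int = 2) -> float:
    """Clock speed implied by a quoted TFLOPs figure under the same convention."""
    return tflops * 1000.0 / (fp32_units * ops_per_clock)


if __name__ == "__main__":
    # Alchemist flagship: 4096 ALUs at the expected 2 GHz peak clock.
    print(f"Alchemist @ 2.0 GHz: {peak_fp32_tflops(4096, 2.0):.2f} TFLOPs")

    # Clocks implied by the table's figures for the competing GPUs.
    print(f"GA104 implied clock:   {implied_clock_ghz(6144, 21.7):.2f} GHz")
    print(f"Navi 22 implied clock: {implied_clock_ghz(2560, 12.4):.2f} GHz")
```

At 2 GHz the flagship lands at roughly 16.4 TFLOPs, and each additional 100 MHz adds about 0.8 TFLOPs, which is where the 16-17 TFLOPs range comes from.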
Based on these early specifications, we are looking at an Intel graphics card that could end up being faster than AMD's Radeon RX 6700 XT and NVIDIA's RTX 3070 with ease. To push its 1st Gen graphics cards further into the consumer segment, Intel may offer very competitive prices against established giants like AMD and NVIDIA. Combined with a strong suite of software-level optimizations, Intel might have a winner on its hands, one that will only be pushed forward with future generations of ARC GPUs.
The post Intel’s Flagship ARC Graphics Card With Xe-HPG Alchemist GPU To Tackle AMD RX 6700 XT & NVIDIA RTX 3070 by Hassan Mujtaba appeared first on Wccftech.
source https://wccftech.com/intels-flagship-arc-graphics-card-with-xe-hpg-alchemist-gpu-to-tackle-amd-rx-6700-xt-nvidia-rtx-3070/