Touring the Intel AI Playground – Inside the Intel Developer Cloud

The Hardware Behind the Intel Developer Cloud

There was a ton of hardware installed, and frankly, a lot more in the process of being installed. Our visit was in August 2023, and pallets of servers, storage nodes, switches, and more had been unpacked and staged on the floor awaiting installation. In the next photos, you will see where some of these systems will go.

Intel Developer Cloud Pallets Of Servers To Install 1

In the data center, there are a few main components. First, there are the systems to test, with both hardware that has already been released and pre-release hardware. Next, there is all of the infrastructure around that, such as storage and networking. Finally, there is an entire management plane that handles server provisioning as well as things like access to the environment.

Intel Developer Cloud Intel Xeon And MAX Servers Stacks Expanding Networking 3

In every respect, this is an installation that one could tell was growing rapidly. Here is a networking project that was underway on the day we visited.

Intel Developer Cloud Intel Xeon And MAX Servers Stacks Expanding Networking 1

Until Intel sold its server systems business to MiTAC, it had a division that made servers, including reference platforms. There was a huge number of racks filled with these systems in the cage. They can operate as standard dual-socket Xeon servers, but they have another purpose: they are designed to house up to four double-width accelerators.

Intel Developer Cloud GPU And Xeon Racks 3

These systems had various cards installed, including Ponte Vecchio and, as we can see here, “ATS-M.” If you recall, Intel brands Arctic Sound-M as the Intel Data Center GPU Flex Series, so these systems also house Intel’s visualization and transcoding GPUs.

Intel Developer Cloud Arctic Sound M 1

If these systems look familiar, it is because we used one as part of our Intel Xeon Max review; this is also the development platform for Xeon Max.

Intel Xeon Max Dev Platform SSDs

If you want more views of this server, you can check out the video in that review.

Intel had not just Xeon Max servers, but also standard Xeon servers with different SKUs.

Intel Developer Cloud Intel Xeon And MAX Servers 2

There were also racks of Supermicro SYS-521GE-TNRT systems. This is Supermicro’s class-leading PCIe GPU server; it is slightly taller at 5U, but it uses that extra space for additional cooling. We just looked at this system with two other GPUs, but here we can see a unique Intel configuration.

Intel Developer Cloud 8x Intel Data Center GPU Max 1100 PCIe In Supermicro SYS 521GE TNRT 6

Here we can see eight Intel Data Center GPU Max 1100 series PCIe cards installed with a 4-way bridge on top of each set.

Intel Developer Cloud 8x Intel Data Center GPU Max 1100 PCIe In Supermicro SYS 521GE TNRT 4

The system itself uses PCIe Gen5 switches in a dual-root configuration, so in addition to the high-speed bridge on top of the cards, there is a direct PCIe switch connection between these sets of cards.

Intel Developer Cloud 8x Intel Data Center GPU Max 1100 PCIe In Supermicro SYS 521GE TNRT 1

That gives Intel the flexibility to provision one to eight GPUs for a user, and it is straightforward to assign a user one CPU and four interconnected GPUs. A developer might be extra excited about this type of configuration, as NVIDIA’s current PCIe enterprise training and inference GPU, the L40S, does not have NVLink. The NVIDIA H100 NVL is the dual PCIe GPU solution, but with only two GPUs connected. Intel has four connected in a standard PCIe server. Without the Intel Developer Cloud, this is the type of configuration that would be harder to get access to, and that is why I was very excited to find this one.
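
If you land on one of these nodes, you can see how the eight cards and the dual-root split enumerate straight from Linux. The snippet below is a minimal sketch rather than Intel's own provisioning tooling; it only assumes the standard Linux PCI sysfs layout and Intel's 0x8086 vendor ID, and it groups display-class devices by the root complex they sit under.

```python
# Minimal sketch: group Intel display-class PCI devices (the GPU Max cards)
# by their PCI root complex to visualize the dual-root layout.
# Assumes a standard Linux sysfs tree and is run on the host.
import glob
import os
from collections import defaultdict

groups = defaultdict(list)

for dev in glob.glob("/sys/bus/pci/devices/*"):
    with open(os.path.join(dev, "vendor")) as f:
        vendor = f.read().strip()
    with open(os.path.join(dev, "class")) as f:
        pci_class = f.read().strip()

    # 0x8086 is Intel; PCI class 0x03xxxx covers display controllers (GPUs).
    if vendor == "0x8086" and pci_class.startswith("0x03"):
        # The canonical sysfs path encodes the bridge/switch hierarchy, e.g.
        # /sys/devices/pci0000:16/0000:16:01.0/0000:17:00.0
        real_path = os.path.realpath(dev)
        root = next(p for p in real_path.split("/") if p.startswith("pci"))
        groups[root].append(os.path.basename(dev))

for root, gpus in sorted(groups.items()):
    print(f"{root}: {len(gpus)} GPU function(s) -> {gpus}")
```

On a dual-root box like this one, you would expect the cards to split across two root complexes, four under each. Stacks built on Level Zero, such as oneAPI, can then typically be restricted to a subset of those devices with the ZE_AFFINITY_MASK environment variable when a user is only allocated some of the cards.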

Patrick At Intel Developer Cloud With 8x Intel Data Center GPU Max 1100 PCIe In Supermicro SYS 521GE TNRT 1

There is, however, more. As an example, here we can see both Supermicro and Wiwynn OAM AI servers. OAM (OCP Accelerator Module) is the open accelerator form factor that Intel uses for both the GPU Max 1500 series and its Habana-lineage AI accelerators, Gaudi and Gaudi 2.

Intel Developer Cloud Supermicro Storage OAM Wiwynn OAM Intel Xeon Servers 1

The hottest systems today are the Gaudi and Gaudi 2 platforms. These have 100GbE links coming directly from the accelerators and use Ethernet instead of more proprietary interconnects like NVLink or off-accelerator InfiniBand.
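
Because the scale-out fabric is just Ethernet coming off the accelerators, the software side looks like ordinary distributed PyTorch with Habana's collective backend in place of NCCL. The snippet below is a minimal sketch assuming the Habana PyTorch bridge (habana_frameworks) is installed and that the launcher sets the usual rank and world-size environment variables; it is illustrative, not the Intel Developer Cloud's own tooling.

```python
# Minimal sketch of distributed initialization on Gaudi/Gaudi 2, where the
# collectives run over the accelerators' Ethernet ports via the HCCL backend.
# Assumes the Habana PyTorch bridge (habana_frameworks) is installed and that
# RANK/WORLD_SIZE are set by the job launcher (e.g. mpirun or torchrun).
import os

import torch
import torch.distributed as dist
import habana_frameworks.torch.core as htcore  # registers the "hpu" device
import habana_frameworks.torch.distributed.hccl  # registers the "hccl" backend

rank = int(os.environ["RANK"])
world_size = int(os.environ["WORLD_SIZE"])

# HCCL plays the role NCCL does on NVIDIA GPUs, but the traffic rides the
# 100GbE links coming straight off the Gaudi OAM modules.
dist.init_process_group(backend="hccl", rank=rank, world_size=world_size)

device = torch.device("hpu")
x = torch.ones(8, device=device)
dist.all_reduce(x)  # sums across all accelerators over the Ethernet fabric
print(f"rank {rank}: {x[0].item()} (should equal world size {world_size})")
```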

Intel Developer Cloud Wiwynn OAM AI Platform Front 1

Intel let us pull out a system that was on-site but not yet installed.

Intel Developer Cloud Gaudi AI 8x OAM Wywinn On Server Lift 1

The Gaudi 2 line is in such high demand right now that the team let it slip that they need to bring each system up as soon as it arrives, or they get calls from executives asking when it will come online.

Intel Developer Cloud Gaudi AI 8x OAM Wywinn Internal 1

The OAM platforms were split between Supermicro and Wiwynn. Supermicro has the top 19″ AI servers in the industry at the moment, and Wiwynn probably has the top OCP rack acceleration platforms. QCT is also in the conversation as a top-tier AI hardware vendor, but those are generally the big three in AI and have been for some time. It is great to see Intel using class-leading hardware here.

Intel Developer Cloud Supermicro OAM AI Front 1

Overall, there were just racks and racks of systems with Intel Xeon, Xeon Max, Data Center GPU Flex Series, GPU Max Series, and Gaudi/Gaudi 2 accelerators. At this location in August, there were 20 Gaudi 2 systems installed, and we were told many more were incoming.

Next, let us show some of the other parts of the infrastructure.
