Dell EMC PowerEdge XE7100 Review An Intelligent 100x HDD System

16

Dell EMC PowerEdge XE7100 Comptue Options Overview

If one recalls the rear of the system we have three large fan modules and two power supplies which are at the top of the chassis. That is all typical with 4U JBODs and systems. What the Dell engineers have done is to create a flexible compute solution to go along with all of this storage. That compute solution effectively occupies the bottom of the chassis behind the expander modules.

Dell EMC PowerEdge XE7100 Rear 3
Dell EMC PowerEdge XE7100 Rear 3

The unit we are using for photos is the 2x half-width compute node solution. These modules may look familiar as the motherboard is very similar to what is found in the C6425 chassis. Indeed, these are two Intel Xeon Scalable CPUs, but we reviewed an EPYC version in our Dell EMC PowerEdge C6525 2U4N server review. Still, Dell is utilizing a high-volume part which increases quality.

Dell EMC PowerEdge XE7100 Half Width Compute Node Overview
Dell EMC PowerEdge XE7100 Half-Width Compute Node Overview

The rear of the nodes has our I/O. We have dual USB 3.0 ports, a DisplayPort output, and a management port for iDRAC duties. Along the bottom we can also see a dual SFP+ Intel X710 10GbE NIC in the mezzanine card slot.

Dell EMC PowerEdge XE7100 Half Width Compute Node IO
Dell EMC PowerEdge XE7100 Half Width Compute Node IO

The top part of the rear I/O has one slot for storage and one for networking. One can see the battery pack on the left side which slots into the airflow guide between the CPUs which we have removed for photos. The right side is a low-profile slot for additional expansion.

Dell EMC PowerEdge XE7100 Half Width Comptue Node PCIe
Dell EMC PowerEdge XE7100 Half Width Comptue Node PCIe

Each CPU socket is limited to a 150W TDP which may be why we do not see the EPYC solution here. AMD CPUs tend to have more I/O and core count which also means more power consumption. We will, however, note that this platform utilizes 8-channel memory and as we have shown with our Installing a 3rd Generation Intel Xeon Scalable LGA4189 CPU and Cooler piece Intel’s new Ice Lake Xeon SKUs will utilize a package that is a similar width but about 1.5mm longer than the CPUs in here. It seems like Dell has thought ahead for its common platform here. Still, for a storage server like this, we may never see an update to the Ice Lake Xeon/ Milan EPYC generation or it may be delayed. These types of systems tend to be less CPU intensive.

Dell EMC PowerEdge XE7100 Half Width Compute Node CPU And Memory
Dell EMC PowerEdge XE7100 Half Width Compute Node CPU And Memory

The half-width node has its own fans. That is important because otherwise, the fans would be in the middle of the chassis and hard to service. We can also see more high-density connectors.

Dell EMC PowerEdge XE7100 Half Width Compute Node Connectors 2
Dell EMC PowerEdge XE7100 Half Width Compute Node Connectors 2

Here is a look inside the chassis to where these connectors mate. One quick note here is that when the two controller nodes are installed we saw two 50x bay nodes. We were told this is used to increase node count by twice the chassis count which can be advantageous in scale-out storage deployments.

Dell EMC PowerEdge XE7100 Internal Node Connectors
Dell EMC PowerEdge XE7100 Internal Node Connectors

The half-width compute node is fairly standard, but there is a lot more going on here. The Dell EMC engineers made a configurable platform. Here, we have a GPU compute node with a NVIDIA Tesla V100 dual-width GPU installed.

Dell EMC PowerEdge XE7100 NVIDIA V100
Dell EMC PowerEdge XE7100 NVIDIA V100

While the dual Xeon compute node is still present, on the dual-sled design the low-profile PCIe riser’s connectivity is exposed to the other side. Here we get a riser that supports a double-width GPU.

Dell EMC PowerEdge XE7100 NVIDIA V100 Node 2
Dell EMC PowerEdge XE7100 NVIDIA V100 Node 2

That side of the chassis also provides GPU power and its own cooling. We are not showing ducting since this is already a large piece, but Dell is ducting airflow from the fans to the GPU.

Dell EMC PowerEdge XE7100 NVIDIA V100 Node 3
Dell EMC PowerEdge XE7100 NVIDIA V100 Node 3

This side can also house four NVIDIA T4‘s. For those that prefer smaller and lower-power accelerators, we get the ability to run up to four low-profile cards. In this shot, we left the airflow guides on both the top and bottom. A quick note is that one can see the battery module in its proper place lodged in the airflow guide between the two CPU heatsinks.

Dell EMC PowerEdge XE7100 NVIDIA T4
Dell EMC PowerEdge XE7100 NVIDIA T4

What is not pictured is that Dell has a DSS FE1 card. This card has a PCIe switch chip and small risers that each has up to four M.2 drives. That means the FE1 can have up to 20x M.2 SSDs on this card.

In summary, the Dell EMC engineers did not create the highest-density storage platform on the market. Dell can qualify it by saying 5U and by citing a “standard” rack dimension of sub-36 inches. In terms of density per U, the ruler SSD units are at 1PB/U and there are 4U systems with more than 100 drives out there that are deeper, but denser per U.

What the engineers created instead was a 5U 100x bay storage platform with a surprising amount of customizability. One can not only get a single, or dual compute nodes. Instead, one also has options to add GPUs and storage in a number of configurations. That practically allows architects to design solutions where AI training/ inferencing happens on the storage node itself instead of being the source of data that must be pushed over the network, processed on another GPU server, and then potentially data returned. Keeping everything local improves latency and also lowers the networking and data center power requirements. That is the true innovation and benefit to what the Dell EMC engineers accomplished.

Next, we are going to look at the management and performance before getting to our final words.

16 COMMENTS

  1. Just out of curiosity, when a drive fails, what sort of information are you given regarding the location of said drive? Does it have LEDs on the drive trays, or are you told “row x, column y” or ???

  2. On the PCB backplane and on the drive trays you can see the infrastructure for two status LEDs per drive. The system is also laid out to have all of the drive slots numbered and documented.

  3. Unless, you are into Ceph or the likes and you are okay with huge failure domains (the only way to make Ceph economical for capacity), then I just don’t understand this product. No dual path (HA) makes it a no go. Perhaps there are places where a PB+ outage is no biggie.

  4. Not sure what software you will run on this, most clustered file system or object storage system that I know recommend going for smaller nodes, primarily for rebuild times, one vendor that I know recommends 10 – 15 drives, and drives with no more than 8TB.

  5. I enjoy your review of this system, definitely cool features, love to the flexibility with the gpu/ssd options and as you mentioned this would not be an individual setup but rather integrated in a clustered setup. I’d also imagine it has the typical IDRAC functionality, and all the dell basics for monitoring.

    After all the fun putting it together, do you get to keep it for a while and play with or once you reviewed it….just pack it up and ship it out?

  6. Hans Henrik Hape even with CEPH this is to big, loosing 1,2 PB becouse one servers fails, rebalancing, network and cpu load on nodes. It’s just bad idea not from data loose point of view but from maintaining operations durring failure.

  7. Great Job STH, Patrick, you never stopped surprising us with these kind of reviews. sine you configured system as a raw 1.2PB can you check the internal bandwidth available using something like
    iozone -t 1 -i 0 -i 1 -r 1M -s 1024G -+n
    I would love to know since no parity HDD is it possible the bandwidth will 250MBpsx100HDD i.e. 25GBps. I’m sure their will be bottlenecks maybe the PCIe bus speed for the HBA controller or CPU limitation. But since you have it will nice to known

    Again thanks for the great reviews

  8. “250MBpsx100HDD i.e. 25GBps” ANy 7200 RPM drive is capable of ~140 MB/sec sustained, so, 100 of them would be ~14 GigaBYTES per second… Ergo, 100 Gbps connection will then propel this server to hypothetical ~10 GB/sec transfers… As it is, 25 Gbps would limit it quite quickly.

  9. A great review of this awesome server! I wish I had a few hundred K$ to spare, and, needed a couple of them!

  10. There is a typo in the conclusion section compeition -> competition.

    But else a really cool review of a nice “toy” I sadly will never get too play with.

  11. James – if you look at Milan as an example, AMD does not even offer a sub 155W TDP CPU in the EPYC 7003 series. We discussed in that launch review why and it comes down to the I/O die using so much power.

  12. While nothing was mentioned about prices, Dell typically charge at least 100% premium for the hard drives with their logo (all made NOT by Dell) despite having much shorter than normal 5yr warranty – so combination of 1U 2 CPU (AMD EPYC) server and 102-106 4U SAS JBOD from WD or Seagate will be MUCH cheaper, faster (in a right configuration) and much easier to upgrade if needed (the server part which improves faster than SAS HDD).
    Color me unimpressed by this product despite impressive engineering.

LEAVE A REPLY

Please enter your comment!
Please enter your name here

This site uses Akismet to reduce spam. Learn how your comment data is processed.