As part of our STH Deep Learning and AI Q3 2019 Interview Series, we are publishing Q&A responses and viewpoints from a number of different industry executives. Supermicro has been extremely active in the deep learning and AI arena for years. This success is due to strong partnerships with companies like Intel and NVIDIA as well as Supermicro’s ability to quickly bring to market innovative platforms based on customer and market demand. We have reviewed many systems including DeepLearning10 and DeepLearning11 from Supermicro and those are systems we see often in Silicon Valley data centers.
Vik Malyala, SVP, FAE & Business Development at Supermicro, is a friend of STH and took time out of his schedule to tackle the Q&A on behalf of the company. Vik is one of those folks that I absolutely love getting to chat with when I stop by the Supermicro offices. He has a strong sense of the technical side as well as how new technologies can be applied to customer use cases.
Supermicro on Deep Learning and AI Q3 2019
In this series, we sent major server vendors several questions and gave them carte blanche to interpret and answer them as they saw fit. Our goal is simple: provide our readers with unique perspectives from the industry. Each person in this series is shaped by their background, company, customer interactions, and unique experiences. The value of the series lies both in the individual answers and in what they collectively say about how the industry views its future.
Who are the hot companies for training applications that you are seeing in the market? What challenges are they facing to take over NVIDIA’s dominance in the space?
AI training and inferencing is a technology that will touch pretty much every facet of human life in a significant way in the coming years. At Supermicro, being a next-generation infrastructure provider, we are blessed with the opportunity to participate at ground level and partner with various companies, both established and upstart, in this quest. Neural networks are nothing new, but their study has been relegated largely to academia because, until now, the technology did not exist to train the algorithms and crunch the data in an effective and efficient way. Today, infrastructure platforms designed by the likes of Supermicro on standard x86 architecture, combined with accelerators and open software standards, are paving the path for these advanced technologies to find real-world applications in mainstream deployments. Intel with its Nervana NNP and Habana with its Gaudi AI training processor are a couple of the companies getting traction, and software frameworks and standardization are an obvious challenge.
How are form factors going to change for training clusters? Are you looking at OCP’s OAM form factor for future designs or something different?
At Supermicro, we have always believed that the form factor is dictated by the use case and deployment scenarios. That is the reason why you see so much flexibility and scalability in our portfolio. As AI becomes more mainstream, we are likely to see demand for many variations in form factor to suit the requirements of training and inferencing data sets. Many of these decisions are likely to be guided by the quantity and quality of data sets dictated by the use cases, and by the HVAC available in data centers and at the edge. All of this high-performance computing power throws off a lot of heat, and it is vitally important that the systems are designed for reliability and thermal stability to ensure longevity. Supermicro, with its beginnings as a boutique motherboard design company, is uniquely geared for this challenge, as we design, develop, and assemble application-optimized systems in-house, without third-party contracts.
At Supermicro, we do not have a religion and our product portfolio is largely shaped by the customer’s current requirements and their future scalability. As AI deployments increase, we will see Supermicro embracing and enabling many designs that will continue to provide optimized solutions for best performance and TCO.
What kind of storage back-ends are you seeing as popular for your deep learning training customers? What are some of the lessons learned from your customers that STH readers can take advantage of?
Until a couple of years ago, storage was the bottleneck for many of these high-performance compute applications. With the launch of next-generation flash technologies like NVMe and persistent memory, data is made available to the CPU with much lower latencies and in higher capacities. Supermicro’s 32 NVMe flash solutions in 1U, built on the next-generation storage form factors EDSFF (long and short) and NF1, are revolutionizing the way data sets are stored and retrieved for faster computation. The availability of dense persistent memory and next-generation flash storage in Supermicro’s portfolio of multi-node systems is also helping our customers lower latency and increase the density of stored data sets without increasing their physical footprint.
What storage and networking solutions are you seeing as the predominant trends for your AI customers? What will the next generation AI storage and networking infrastructure look like?
With all the innovation in storage technologies over the past couple of years, storage is no longer the bottleneck. The congestion is moving more towards networking. Traditionally, InfiniBand and RDMA have been the go-to technologies for high-compute, low-latency applications. However, with advances in traditional Ethernet and speeds of 100G and 400G becoming available, we see a lot of next-generation deployments going the route of traditional Ethernet. Added to this, enhancements in the implementation of dedicated queuing systems and other prioritization techniques in network interfaces will accelerate the adoption of traditional Ethernet. Technologies like Intel Optane DC persistent memory also offer very low-latency, high-capacity memory close to the CPU.
Over the next 2-3 years, what are trends in power delivery and cooling that your customers demand?
Power and cooling will be a challenge in the coming years, as newer CPUs, GPUs, FPGAs, and other accelerators are power-hungry and throw off a lot of heat. Supermicro, because of its origins in motherboard and system design, is able to provide innovative solutions when it comes to designing high-performance compute and dense storage systems. As power consumption and thermal output increase with these next-generation technologies, Supermicro provides solutions with liquid cooling along with free-air cooling in its portfolio. As we see it, at some point in the near future, a balance needs to be struck between density and thermals, dictated by the use cases and the location of the equipment: data centers with elaborate HVAC or edge deployments with limited resources.
What should STH readers keep in mind as they plan their 2019 AI clusters?
AI is definitely here to stay this time, thanks to all the innovations in the infrastructure. Workflows in every industry will likely get a second look, and AI will change many of those workflows towards more automated, efficient, and scalable operations. This will likely result in increased productivity. It is important that customers plan for flexibility and scalability and adopt open standards when they plan their deployments, and it behooves them to partner with the right provider that shares the same mindset.
Are you seeing a lot of demand for new inferencing solutions based on chips like the NVIDIA Tesla T4?
There is huge demand for inferencing solutions, both large and small scale, in various industries. This demand is likely to increase manyfold in the coming years as the ecosystem evolves to accommodate various verticals.
Are your customers demanding more FPGAs in their infrastructures?
The landscape is still evolving and, depending on the use case, there will likely be solutions that address the needs of inferencing through custom FPGAs to provide the right optimization. New standards for high-speed, low-latency interconnects will make FPGAs easier to adopt.
Who are the big accelerator companies that you are working with in the AI inferencing space?
The obvious companies we are working with are Intel, NVIDIA, and AMD. We are also validating FPGA solutions from Intel and Xilinx in our platforms. Because of its first-to-market record with the latest technologies, Supermicro is often the preferred platform for many of the revolutionary solutions. Continuing in the same tradition, many companies and startups in this space consider Supermicro a partner and provider when it comes to their AI infrastructure needs.
Are there certain form factors that you are focusing on to enable in your server portfolio? For example, Facebook is leaning heavily on M.2 for inferencing designs.
At Supermicro, we have always let our product portfolio mirror the customer’s requirements and demands. We provide dense solutions based on different form factors like U.2, M.2, and EDSFF. These are all based on the PCIe interconnect. We let customers and their workloads determine the appropriate form factor.
What percentage of your customers today are looking to deploy inferencing in their server clusters? Are they doing so with dedicated hardware or are they looking at technologies like 2nd Generation Intel Xeon Scalable VNNI as “good enough” solutions?
AI inferencing is still an evolving story, and many customers are looking at achieving some of it with mainstream server technologies. We have enabled our portfolio of multi-node products, including the SuperBlade family, with the flexibility to use general-purpose CPUs as well as accelerators to achieve the desired outcomes.
What should STH readers keep in mind as they plan their 2019 server purchases when it comes to AI inferencing?
Again, flexibility and a clear scalability path should dictate the purchases. Towards this goal, at Supermicro, we provide a choice of form factors with flexible technology paths to make sure the customer’s investment is protected and the required scale is achieved.
How are you using AI and ML to make your servers and storage solutions better?
We enable several industry leaders in semiconductor manufacturing, the EDA industry, and manufacturing (sheet metal to PCB development) to develop better technology more quickly, and we are strategizing the use of AI/ML in our rack integration and solution facilities as well as in failure prediction and analytics. This is in its infancy, as you may imagine, but we are constantly evaluating the options to improve quality and customer experience.
Where and when do you expect an IT admin will see AI-based automation take over a task that is now so big that they will have a “wow” moment?
AI-based automation is going to penetrate and ultimately control EVERY layer of the software stack, replacing human engineering with auto-tuning, self-improving, better-performing code, and so on. For example, an IT admin can use machine intelligence to replace user-tunable performance options in all software systems, eliminating the need to tweak them with command-line parameters. Machine intelligence outperforms hand-tuning. Several startup companies leverage machine intelligence to optimize both hardware and software configurations on commodity servers to deliver the performance of tailor-made systems.
One thing is for certain: as I see racks of GPU servers in Silicon Valley, Supermicro machines are easy to spot and they are everywhere. Vik and his team have done a great job enabling the GPU-compute markets.
I wanted to say thank you again to Vik for participating in our series. I learn something every time we chat or I review his responses for articles like this.