STH TensorFlow GAN Training
We are going to have a follow-up piece *hopefully* using MLPerf. Our original intent was to use the entire MLPerf suite for benchmarking. We did not realize that many of the benchmarks released to date do not take advantage of multiple GPUs out of the box. While the beta MLPerf v0.5 works well for single GPUs, many of the models do not scale to multiple GPUs using NVIDIA NCCL. That matters because training on a single GPU or a single system is acceptable for learning, but crunching larger models requires faster interconnect speeds.
We took our sample TensorFlow Generative Adversarial Network (GAN) image training test case and ran it on single cards, then stepped up to the 8x GPU system. We expressed our results in terms of training cycles per day.
Here is a view of what we miss when we look only at single-GPU NVIDIA GeForce GTX 1080 Ti vs. Tesla P100 figures. This is an example of where the NVIDIA Tesla P100 16GB SXM2’s NVLink architecture plus stronger performance yields a tangible benefit.
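For readers who want to put their own runs on the same footing, here is a minimal sketch of how a "training cycles per day" figure and a multi-GPU scaling efficiency could be derived. The function names and the idea of timing one full training cycle in seconds are our assumptions for illustration, not the exact STH harness.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def cycles_per_day(seconds_per_cycle: float) -> float:
    """Convert the wall-clock time of one training cycle into cycles per day."""
    if seconds_per_cycle <= 0:
        raise ValueError("seconds_per_cycle must be positive")
    return SECONDS_PER_DAY / seconds_per_cycle


def scaling_efficiency(single_gpu_cpd: float, multi_gpu_cpd: float, gpu_count: int) -> float:
    """Fraction of ideal linear scaling achieved by a multi-GPU run.

    1.0 means the multi-GPU system is exactly gpu_count times faster
    than a single card; lower values indicate interconnect or
    synchronization overhead.
    """
    if single_gpu_cpd <= 0 or gpu_count <= 0:
        raise ValueError("single_gpu_cpd and gpu_count must be positive")
    return multi_gpu_cpd / (single_gpu_cpd * gpu_count)
```

Comparing the efficiency figure across systems, rather than only the raw cycles per day, is one way to see how much of the gain comes from the interconnect rather than from the individual GPUs.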
We are going to have more training benchmarks in our subsequent DeepLearning12 piece, where we will start using MLPerf and other tools. We wanted to provide some continuity between this review and our previous DeepLearning10 and DeepLearning11 reviews.
OTOY OctaneBench 3.06.2
We added rendering to the benchmark suite for this Gigabyte G481-S80 review to show another aspect that we often get asked about.
Here the performance is good, but it is not going to sway someone to get an 8x Tesla SXM2 machine over a GeForce GTX 1080 Ti machine. At the same time, if you are running mixed workloads, the Tesla P100s show excellent performance. In a Kubernetes cluster or cloud, rendering is another task that the Gigabyte G481-S80 will do well on.
The Dark Horse: VDI
This ended up as a really interesting reader request. We are not going to do VDI performance testing of the 8x Tesla P100 16GB solution. We will note that if you have the appropriate Quadro vDWS or GRID licensing, you can support anywhere from two four-GPU power users to sixty-four 2GB GPU-accelerated VMs.
Again, if you are running infrastructure where you have VDI by day and want to train on nights and weekends, the Gigabyte G481-S80 is a really interesting option. Consumer-level GPUs do not have this capability. Many of the VDI-specific GPUs do not have the larger NVLink implementations that are useful for training. If you are considering a VDI deployment, this may be worth considering.
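The two ends of that range follow from simple arithmetic on eight 16GB GPUs. Below is a rough sketch of the calculation. The function names are ours, and it assumes every GPU is carved into equal-sized profiles with nothing reserved for overhead, so treat the results as an upper bound rather than a sizing guide.

```python
def vms_per_host(gpu_count: int, gpu_memory_gb: int, profile_gb: int) -> int:
    """Upper bound on GPU-accelerated VMs when each GPU is split into equal profiles."""
    if gpu_count <= 0 or gpu_memory_gb <= 0 or profile_gb <= 0:
        raise ValueError("all arguments must be positive")
    vms_per_gpu = gpu_memory_gb // profile_gb  # whole profiles that fit on one GPU
    return gpu_count * vms_per_gpu


def multi_gpu_users(gpu_count: int, gpus_per_user: int) -> int:
    """Number of users that can each be given a fixed number of whole GPUs."""
    if gpu_count <= 0 or gpus_per_user <= 0:
        raise ValueError("all arguments must be positive")
    return gpu_count // gpus_per_user


# The 8x Tesla P100 16GB configuration discussed in this review
print(vms_per_host(gpu_count=8, gpu_memory_gb=16, profile_gb=2))  # 64
print(multi_gpu_users(gpu_count=8, gpus_per_user=4))              # 2
```

Intermediate mixes, such as 4GB or 8GB profiles, fall between those two figures using the same calculation.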
Next, we are going to look at networking and storage performance. We will then show some of the management features and follow that discussion with power consumption before wrapping up the review.