Vector processing has been around for some time, and NEC is currently the technology's key champion. Vector processing is aimed at sustained performance rather than the peak performance figures we see quoted for much of the Top500 supercomputer list. For SC17 in Denver, Colorado, NEC had something different: the NEC Vector Engine in a PCIe co-processor architecture.
The NEC Vector Engine 1.0
NEC showed off no fewer than five Vector Engine 1.0 variants, including active and passive air-cooled options as well as water-cooled variants.
The NEC Vector Engine is marketed as an 8-core processor that can deliver 2.45 TFLOPS. The Vector Engine uses HBM2 at up to 1.2 TB/second, making it competitive with offerings from Intel, NVIDIA and AMD. NEC also markets the chip as a complete CPU capable of running applications, unlike GPUs, and says this is a major advantage even in the PCIe offloading model.
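To put those marketed numbers in context, here is a minimal Python sketch that derives a few ratios from them. The only inputs taken from the figures above are the 8 cores, 2.45 TFLOPS and 1.2 TB/second. The roofline-style bound (attainable rate is the lower of peak compute and bandwidth times arithmetic intensity) is our own simplifying assumption for illustration, not something NEC presented, and the function names are hypothetical.

```python
# Figures quoted above for the NEC Vector Engine 1.0 (marketing numbers).
PEAK_TFLOPS = 2.45      # peak throughput of the whole chip
CORES = 8               # marketed core count
HBM2_TB_PER_S = 1.2     # "up to" HBM2 memory bandwidth


def per_core_tflops(peak_tflops: float = PEAK_TFLOPS, cores: int = CORES) -> float:
    """Peak throughput per core, assuming the peak is split evenly."""
    return peak_tflops / cores


def bytes_per_flop(bandwidth_tb_s: float = HBM2_TB_PER_S,
                   peak_tflops: float = PEAK_TFLOPS) -> float:
    """Memory bytes available per peak floating-point operation."""
    return bandwidth_tb_s / peak_tflops


def attainable_tflops(flops_per_byte: float,
                      peak_tflops: float = PEAK_TFLOPS,
                      bandwidth_tb_s: float = HBM2_TB_PER_S) -> float:
    """Roofline-style upper bound (an assumption, not a measurement).

    flops_per_byte is the kernel's arithmetic intensity: floating-point
    operations performed per byte moved to or from memory.
    """
    return min(peak_tflops, bandwidth_tb_s * flops_per_byte)


if __name__ == "__main__":
    print(f"Per-core peak:  {per_core_tflops():.3f} TFLOPS")
    print(f"Bytes per FLOP: {bytes_per_flop():.3f}")
    # Hypothetical streaming kernel: 2 FLOPs per 24 bytes of traffic.
    print(f"Streaming bound: {attainable_tflops(2 / 24):.3f} TFLOPS")
```

Under these assumptions the chip works out to about 0.31 TFLOPS per core and roughly 0.49 bytes of memory bandwidth per peak FLOP. A bandwidth-bound kernel such as the hypothetical streaming example would be capped well below the 2.45 TFLOPS headline, which is why the bandwidth figure matters for sustained performance.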
NEC had these running at the show in a plexiglass case. Inside the case was a common server variant based on the Supermicro SYS-4028GR that we used in our DeepLearning10: The 8x NVIDIA GTX 1080 Ti GPU Monster (Part 1) and DeepLearning11: 10x NVIDIA GTX 1080 Ti Single Root Deep Learning Server (Part 1) pieces. In fact, that was one of the central points of the demo. NEC was showing that with this new generation, the vector processors could fit into a standard 4U chassis that is common in the industry.
We were told that the NEC Vector processing units were a key building block of the Japanese exascale computing program. Japan, like China and the USA, is racing to build systems in the 1 exaflop range. This race has had casualties, such as the Intel Knights Hill part, which was canceled as part of the US-Intel exascale program.
You can read more in the pre-show press release.