If you are doing big data analytics in the AWS cloud, the usual way to get more performance is to add more instances. Dense storage (AWS EC2 D2) instances were never known for their compute performance, so scaling a job often meant spinning up more of them. At AWS re:Invent 2017 the company introduced a solution. The AWS EC2 H1 instance is built on 2TB hard drives (up to 8 x 2TB, or 16TB, per instance) and offers more RAM and compute than the D2 instance types for jobs like MapReduce clusters.
Amazon AWS EC2 H1 Instance Details
Here are the details on the new Amazon AWS EC2 H1 instances, including spot pricing from the N. Virginia region:
| Name | vCPUs | Memory (GiB) | Networking Performance | Storage | $/hr |
|---|---|---|---|---|---|
| h1.2xlarge | 8 | 32 | Up to 10 Gigabit | 1 x 2TB HDD | 0.55 |
| h1.4xlarge | 16 | 64 | Up to 10 Gigabit | 2 x 2TB HDD | 1.10 |
| h1.8xlarge | 32 | 128 | 10 Gigabit | 4 x 2TB HDD | 2.20 |
| h1.16xlarge | 64 | 256 | 25 Gigabit | 8 x 2TB HDD | 4.40 |
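To compare the sizes, it can help to reduce the table to unit costs. The following is a minimal Python sketch that only restates the figures from the table above; the function names are ours for illustration and nothing here queries AWS.

```python
# Figures copied from the table above: (name, vCPUs, memory GiB, HDD count, $/hr).
INSTANCES = [
    ("h1.2xlarge", 8, 32, 1, 0.55),
    ("h1.4xlarge", 16, 64, 2, 1.10),
    ("h1.8xlarge", 32, 128, 4, 2.20),
    ("h1.16xlarge", 64, 256, 8, 4.40),
]
TB_PER_DRIVE = 2  # each local HDD is 2TB per the table


def unit_costs(vcpus, mem_gib, drives, price):
    """Return the hourly price per vCPU, per GiB of RAM, and per TB of local disk."""
    storage_tb = drives * TB_PER_DRIVE
    return {
        "per_vcpu": price / vcpus,
        "per_gib": price / mem_gib,
        "per_tb": price / storage_tb,
    }


def summarize():
    """Map each instance name to its unit costs."""
    return {name: unit_costs(v, m, d, p) for name, v, m, d, p in INSTANCES}


if __name__ == "__main__":
    for name, costs in summarize().items():
        print(
            f"{name:12s} ${costs['per_vcpu']:.4f}/vCPU-hr  "
            f"${costs['per_gib']:.4f}/GiB-hr  ${costs['per_tb']:.3f}/TB-hr"
        )
```

With the listed prices, every size works out to the same unit cost (roughly $0.069 per vCPU-hour and $0.275 per TB-hour), so at these rates the choice of size comes down to networking and how much you want on one node, not price efficiency.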
These instances were not highlighted as featuring the newer Intel Xeon Platinum 8175M “Skylake-SP” CPUs that the AWS EC2 M5 and C5 instances are using. Instead, on the performance side, the launch announcement says:
The two largest sizes support Intel Turbo and CPU power management, with all-core Turbo at 2.7 GHz and single-core Turbo at 3.0 GHz.
Local storage is optimized to deliver high throughput for sequential I/O; you can expect to transfer up to 1.15 gigabytes per second if you use a 2 megabyte block size. The storage is encrypted at rest using 256-bit XTS-AES and one-time keys.
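If you want to see how your own workload compares with that sequential figure, a rough check is to read a large file front to back in 2 MiB blocks and time it. The sketch below is our own illustration in Python, not an AWS tool; `make_test_file` is a hypothetical helper for the demo. A small test file will be served from the page cache, so point it at a file much larger than RAM on the local HDD volume to measure the disks themselves.

```python
import os
import tempfile
import time

BLOCK_SIZE = 2 * 1024 * 1024  # 2 MiB, approximating the 2 megabyte block size quoted above


def sequential_read_throughput(path, block_size=BLOCK_SIZE):
    """Read `path` front to back in fixed-size blocks; return (bytes_read, bytes_per_second)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 bypasses Python's own buffer so each read() asks the OS for one block
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, (total / elapsed if elapsed > 0 else float("inf"))


def make_test_file(size_bytes):
    """Create a throwaway file of the requested size (hypothetical helper for the demo)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


if __name__ == "__main__":
    path = make_test_file(16 * 1024 * 1024)
    try:
        n, bps = sequential_read_throughput(path)
        print(f"read {n} bytes at {bps / 1e9:.2f} GB/s")
    finally:
        os.remove(path)
```

For serious benchmarking a dedicated tool with direct I/O is a better choice; this is only meant to show what "sequential I/O with a 2 megabyte block size" looks like in practice.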
We get a lot of questions about running big data analytics clusters. This is an area where it is still relatively easy to get more performance from local bare metal servers. Cloud egress charges can also be prohibitive, so organizations are often setting up colocated “hubs” and then sending data to AWS or other spokes for burst capacity.