Windows Storage Spaces is a great software-defined storage solution that has been part of the Microsoft stack since Windows Server 2012. It provides enterprise storage features such as storage tiering (SSDs combined with hard drives) and works with resilient file systems such as ReFS. Combined with SMB 3 and RDMA, it delivers great performance, even when compared to other software-defined storage solutions. One of the major complaints against the Windows Server 2012 iteration was its complicated hardware replacement procedures, which required PowerShell. These procedures can be an issue with complex arrays and for admins who are not familiar with PowerShell. Windows Server 2016 Storage Spaces addressed some of these issues and simplified the procedures. Today we are going to cover how to replace a failed hard drive in Windows Server 2016 Storage Spaces.
How to Replace a Failed Hard Drive in Windows Server 2016 Storage Spaces
The following procedure is based on Windows Server 2016 Datacenter Edition, version 1607. It is not as straightforward as in some other software-defined storage solutions, so we wanted to walk through it.
In Windows Event Viewer, the System log contains entries indicating bad blocks on a hard drive.
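For those who prefer the console to Event Viewer, the same check can be sketched in PowerShell. This is a minimal example that assumes the bad-block entries come from the "disk" event source with event ID 7, which is the usual ID for "has a bad block" messages; your system may log the failure under a different source or ID.

```powershell
# Assumption: bad-block events are logged by the "disk" provider as event ID 7.
# -ErrorAction SilentlyContinue suppresses the error Get-WinEvent raises
# when no matching events exist.
Get-WinEvent -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'disk'
    Id           = 7
} -MaxEvents 20 -ErrorAction SilentlyContinue |
    Select-Object TimeCreated, Id, Message |
    Format-List
```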
In order to find out the actual hard drive info, we need to run the following command to identify the hard drive from the server:
Get-PhysicalDisk | Select-Object DeviceId, FriendlyName, SerialNumber, Size, PhysicalLocation | Sort-Object -Property DeviceId
From this output, we can identify that the failed drive, device 9, is in slot 12. Note that this is a SATA hard drive sitting on a SAS infrastructure.
Next, we need to locate the physical drive. The easiest way is to go to Server Manager, Storage Pools, Physical Disks, then select the hard drive in slot 12 and choose Toggle Drive Light (if your system supports this feature):
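If the list is long, it can help to narrow it down to drives that Storage Spaces itself reports as unhealthy. The following is a small sketch of that idea; note that a drive throwing bad-block events may still show as Healthy, so treat this as a complement to the Event Viewer check rather than a replacement.

```powershell
# List only the physical disks whose health status is not "Healthy",
# together with the details needed to find them in the chassis.
Get-PhysicalDisk |
    Where-Object HealthStatus -ne 'Healthy' |
    Select-Object DeviceId, FriendlyName, SerialNumber,
                  HealthStatus, OperationalStatus, PhysicalLocation |
    Format-Table -AutoSize
```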
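The drive light can also be toggled from PowerShell. This is a rough sketch, assuming your enclosure supports identification LEDs; the serial number shown is a placeholder, not a value from this system.

```powershell
# Placeholder serial number: replace with the one reported by Get-PhysicalDisk.
$serial = 'REPLACE-WITH-SERIAL'
$disk   = Get-PhysicalDisk | Where-Object SerialNumber -eq $serial

# Turn the identification LED on so the drive can be found in the chassis.
$disk | Enable-PhysicalDiskIdentification

# Turn the LED off again once the drive has been located.
$disk | Disable-PhysicalDiskIdentification
```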
Before we remove the failed drive, we need to install a hard drive of the same size in the system, add it to the pool, and set its allocation to Automatic. If you skip this, as many do during their first replacement, you will get the following warning.
Instead, you need to add the new hard drive to the system. Then you will need to rescan storage:
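The rescan can be done from PowerShell as well. A minimal sketch: refresh the storage cache, then list the drives that are eligible to join a pool, which should include the newly inserted one.

```powershell
# Rescan storage so Windows picks up the newly inserted drive.
Update-HostStorageCache

# Drives that are not yet in a pool and can be added to one.
Get-PhysicalDisk -CanPool $true |
    Select-Object DeviceId, FriendlyName, SerialNumber, Size, PhysicalLocation
```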
Once this is done, you can add the new hard drive to the Storage Pool.
You will want to add the drive as Automatic for allocation.
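Adding the drive to the pool can be sketched in PowerShell as follows. The pool name "Pool1" is a placeholder, and the example assumes the new drive is the only poolable disk in the system. The Automatic allocation in Server Manager corresponds to the AutoSelect usage in PowerShell.

```powershell
# Placeholder pool name: replace with your pool's friendly name
# (see Get-StoragePool for the names on your system).
$poolName = 'Pool1'

# Assumption: the replacement drive is the only disk that can be pooled.
$newDisk = Get-PhysicalDisk -CanPool $true

# Add the drive with automatic allocation.
Add-PhysicalDisk -StoragePoolFriendlyName $poolName `
                 -PhysicalDisks $newDisk `
                 -Usage AutoSelect
```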
After the new drive is in the Storage Pool, remove the failed hard drive from the pool.
Do not make any changes to the Storage Pool while the repair operation is running.
After some time, the old hard drive will be removed from the pool and the new drive will be assimilated. For the PowerShell fans, you can use Get-StorageJob to follow the progress of the repair.
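For reference, a rough PowerShell counterpart to removing the disk in Server Manager is sketched below. It is an approximation of what the GUI does, not a confirmed equivalent, and the pool name and serial number are placeholders.

```powershell
# Placeholders: replace with your pool name and the failed drive's serial number.
$poolName = 'Pool1'
$serial   = 'REPLACE-WITH-SERIAL'

$failed = Get-PhysicalDisk | Where-Object SerialNumber -eq $serial

# 1. Mark the failed drive as retired so no new data is placed on it.
$failed | Set-PhysicalDisk -Usage Retired

# 2. Rebuild the affected virtual disks onto the remaining drives.
Get-StoragePool -FriendlyName $poolName |
    Get-VirtualDisk |
    Repair-VirtualDisk -AsJob

# 3. Only after the repair jobs have finished (check with Get-StorageJob),
#    remove the retired drive from the pool.
Remove-PhysicalDisk -PhysicalDisks $failed -StoragePoolFriendlyName $poolName
```

Step 3 prompts for confirmation by default, which is a useful safety net here.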
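A simple way to watch the repair is to poll the job list until nothing is left running. The sketch below checks every 30 seconds and gives up after an hour; both values are arbitrary choices, and large drives can take much longer to rebuild.

```powershell
# Poll storage jobs every 30 seconds, for at most 120 rounds (one hour).
for ($i = 0; $i -lt 120; $i++) {
    $jobs = Get-StorageJob | Where-Object JobState -ne 'Completed'
    if (-not $jobs) {
        Write-Output 'No active storage jobs.'
        break
    }
    $jobs | Format-Table Name, JobState, PercentComplete, ElapsedTime -AutoSize
    Start-Sleep -Seconds 30
}
```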
After the operation has completed, you can safely pull the failed hard drive from the system.
From the procedure above, you can see that Microsoft has made replacing a failed hard drive very easy in Windows Server 2016. People with minimal PowerShell knowledge should be able to perform the task without any issues.
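Before pulling the drive, it is worth confirming that the pool no longer lists it and that the virtual disks are healthy again. A short sketch, with "Pool1" as a placeholder pool name:

```powershell
# Placeholder pool name: replace with your pool's friendly name.
$pool = Get-StoragePool -FriendlyName 'Pool1'

# The failed drive's serial number should no longer appear in this list.
$pool | Get-PhysicalDisk |
    Select-Object FriendlyName, SerialNumber, Usage, HealthStatus, OperationalStatus

# The virtual disks should report Healthy once the repair has finished.
$pool | Get-VirtualDisk |
    Select-Object FriendlyName, HealthStatus, OperationalStatus
```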
Software-defined storage has come a long way in the past couple of years. Hyper-converged storage has matured with solutions like VMware vSAN, Microsoft Storage Spaces Direct, and Nutanix. Together with "traditional" software-defined storage such as ZFS and Microsoft Storage Spaces, these solutions can give performance, flexibility, and reliability to customers who prefer flexible alternatives to the traditional storage vendors such as Dell EMC, HPE, and NetApp.
We hope this guide helps a new Microsoft Windows Server 2016 Storage Spaces admin replace a failed hard drive. If you try removing the drive first and get the error shown above, it can be an understandably frustrating experience.