One of the biggest storage management advancements in recent years has been the (re-)introduction of thin provisioning. Practically speaking, thin provisioning is one of the many technologies that have transformed the age-old problem of storage utilization. Storage utilization is a rough metric for the amount of data a storage system is holding at any given time, compared to its actual capacity. Users and organizations with high storage utilization therefore have less waste and make better use of a capital asset.
Thin provisioning allots storage to a user (physical user, virtual machine, or otherwise) only when that user actually requires it. Prior to thin provisioning, users were reserved a specific amount of storage in fixed-size volumes or LUNs. Fixed partitioning of a finite resource pool, except in very rare cases, leads to waste and low storage utilization.
Thin vs. Fixed Provisioning – A Practical Example
The best way to illustrate this is to look at portable device storage (I am flying from New York to San Francisco while writing this piece, which may be the reason for the example). Each user has either a hard disk or a solid state disk of a certain capacity that is fixed into the machine they are using. Because this Boeing 757 is not Internet-enabled, the users have no access to the Internet or cloud storage, meaning their storage is finite.
Take my MacBook Air, for example: it has a 128GB SSD with approximately 88GB free as I type this. My iPad, however, is running at over 97% of its 32GB capacity. With the fixed, physical partitioning of the solid state drives in each device, I am almost out of space on my iPad while I have plenty of unused space on my MacBook Air. Of the 160GB of solid state storage I am carrying with me, I cannot allocate more than 32GB to the iPad or more than 128GB to the MacBook Air. My only option is to purchase a new iPad with 64GB of onboard storage, a time-consuming and expensive affair.
Likewise, traditional fixed provisioning of network storage resources faces the same three problems: disparate use, difficulty of expansion, and overall low utilization of raw storage capacity. In my example, I, like many of my fellow travellers, have some devices with too little fixed storage and others with too much. Almost invariably, when storage is partitioned into fixed container sizes, some containers are highly utilized while others are barely used at all, which is the disparity in usage. Expansion is difficult because a parameter (container size) needs to be redefined, and some process is needed to change it. In many organizations this requires management approval, followed by a request to IT that then has to be fulfilled. Finally, despite having an iPad that is near capacity, my overall storage use is (40GB + 31GB) / 160GB, or 44%. While this is better than many data centers, it is not spectacular, since I still paid for the 56% of the storage I am not using.
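The utilization arithmetic above can be checked with a few lines of Python. This is only a sketch using the figures quoted in the example; the function and variable names are my own.

```python
def utilization(used_gb, capacity_gb):
    """Return used capacity as a fraction of total capacity."""
    return sum(used_gb) / sum(capacity_gb)

# Figures from the example: a 128GB MacBook Air with 88GB free,
# and a 32GB iPad holding roughly 31GB.
devices = {
    "MacBook Air": {"capacity_gb": 128, "used_gb": 128 - 88},
    "iPad": {"capacity_gb": 32, "used_gb": 31},
}

# Utilization of each fixed "container" on its own.
per_device = {
    name: d["used_gb"] / d["capacity_gb"] for name, d in devices.items()
}

# Utilization across everything I paid for.
overall = utilization(
    [d["used_gb"] for d in devices.values()],
    [d["capacity_gb"] for d in devices.values()],
)

for name, fraction in per_device.items():
    print(f"{name}: {fraction:.0%} used")
print(f"overall: {overall:.0%} used")
```

The per-device numbers show the disparity (one container nearly full, the other mostly empty), while the overall number shows the waste.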
Using thin provisioning techniques, I would have a hypothetical 160GB drive, and both the iPad and MacBook Air could use as much space as they needed so long as their combined usage stayed below 160GB. My iPad could grow to use 64GB or more of capacity while my MacBook Air kept using its 40GB of the 160GB available, dramatically increasing my storage utilization without requiring me to purchase a new iPad. For a more realistic example, take 1TB of usable storage and partition it off to ten users, each with a 100GB allotment. In a fixed system, the entire 1TB is allocated even though the users do not have 1TB worth of data on the system.
As one can see, the system is running at 46% capacity, with one user (User 2) almost out of space and no capacity available to allocate to User 2. An administrator would need to either purchase more storage or take storage from User 6 and re-allocate it to User 2. With thin provisioning, this is not an issue, because User 2 simply draws on the unused capacity in the shared pool.
In the same way, storage vendors have re-thought network storage. Instead of administrators pre-defining containers for users and wasting space, the containers grow with users to fit their ever-changing capacity needs. Thin provisioning allows administrators to greatly reduce the waste. Aside from eliminating waste, thin provisioning greatly streamlines the business processes around storage. Instead of making an initial request for a finite amount of storage and then, on reaching that limit, going through management approval and an IT ticket process, the storage just grows.
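To make the "containers grow with users" idea concrete, here is a minimal sketch of a shared pool applied to my two-device example. The `ThinPool` class is a hypothetical illustration of the concept, not how any particular storage product implements it.

```python
class ThinPool:
    """A shared pool where consumers draw capacity only as they write data."""

    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.used_gb = {}  # consumer name -> GB actually consumed

    def free_gb(self):
        return self.capacity_gb - sum(self.used_gb.values())

    def write(self, consumer, gb):
        """Grow a consumer's usage; fail only when the whole pool is full."""
        if gb > self.free_gb():
            raise RuntimeError("pool exhausted")
        self.used_gb[consumer] = self.used_gb.get(consumer, 0) + gb


# One hypothetical 160GB pool instead of fixed 128GB and 32GB drives.
pool = ThinPool(160)
pool.write("MacBook Air", 40)
pool.write("iPad", 31)
pool.write("iPad", 33)  # the iPad grows to 64GB, past its old 32GB limit

print(pool.used_gb)
print(f"free: {pool.free_gb()}GB")
```

Note that there is no per-consumer limit at all in this sketch; the only hard boundary is the size of the pool itself, which is why someone still has to watch total consumption.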
IT organizations commonly charge other organizations back for the resources they use. With thin provisioning, an IT organization can right-size the initial purchase (with less wasted space). Then, having lowered the initial capital outlay, it can charge other organizations for the storage they actually consume rather than for what was allocated to them, irrespective of how much data they store. The net effect of the lower capital outlay, paying for actual storage used, and the streamlined business process is lower operating costs.
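The difference between the two chargeback models is easy to see in a short sketch. The rate, department names, and usage figures below are all hypothetical values chosen purely for illustration.

```python
RATE_CENTS_PER_GB = 10  # hypothetical monthly rate, in cents per GB

# Hypothetical departments: (GB allocated, GB actually consumed)
departments = {
    "engineering": (100, 95),
    "finance": (100, 20),
}


def bill_by_allocation(depts, rate=RATE_CENTS_PER_GB):
    """Charge for reserved capacity, whether or not it holds data."""
    return {name: allocated * rate for name, (allocated, _used) in depts.items()}


def bill_by_consumption(depts, rate=RATE_CENTS_PER_GB):
    """Charge only for the capacity each department actually consumes."""
    return {name: used * rate for name, (_allocated, used) in depts.items()}


allocation_bill = bill_by_allocation(departments)
consumption_bill = bill_by_consumption(departments)

for name in departments:
    print(
        f"{name}: allocated ${allocation_bill[name] / 100:.2f}, "
        f"consumed ${consumption_bill[name] / 100:.2f}"
    )
```

Under the allocation model both departments pay the same amount; under the consumption model the lightly used department pays far less, which is the incentive the paragraph above describes.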
The Cloud of Thin Provisioning
As Apple readies its impending cloud service, Google and Microsoft expand their online storage for mail and office applications, and Amazon.com launches its cloud player application, the cloud becomes perhaps one of the most important places for thin provisioning of storage. All of the cloud providers previously mentioned use one form of thin provisioning or another. In fact, thin provisioning is not just for physical storage: VMware ESX and Microsoft Hyper-V both allow for thin provisioning of memory (see Microsoft Hyper-V Dynamic Memory article). Cloud technology is built upon the thin provisioning of resources, so it is not just storage, RAM, and CPU power that are being provisioned, but also things like bandwidth.
The cloud’s high efficiency using thin provisioning is well documented. One of the catches in leveraging the cloud as a theoretically limitless data store is that the storage traffic must traverse more expensive WAN connections. With cloud providers charging by the amount of data transferred (this is like the thin provisioning of bandwidth) and ISPs implementing new usage tiers, the costs to store and retrieve data are not insignificant. Frequently accessed data is still best served over local networks.
What’s the catch?
Probably the most cited drawback of thin provisioning is that, without fixed boundaries, a system may be constantly recalibrating the containers for each individual. In most applications, thin provisioning does have a negative impact on the performance of a LUN or volume. A general rule of thumb for VMware ESXi and Microsoft Hyper-V is that thin provisioning performs approximately ten percent worse than fixed provisioning. In an industry obsessed with performance, why then has thin provisioning taken hold in a major way? Simply put, as systems scale, it is in many cases less expensive to get the additional ten percent performance (if required) elsewhere than it is to manage fixed storage provisions.
For the time being, thin provisioning is here to stay. Many users will look at the performance penalty of thin provisioning and decide that the technology is not worth the performance decrease. Other administrators will look at thin provisioning as an opportunity to lower capital equipment and operating costs. At the end of the day, the entire industry is looking to the cloud as the thin provisioning resource model that everyone tries to emulate. The struggle is how to achieve cloud-like resource utilization in local environments, and that is something enterprises are still figuring out.