Keeping that virtual server lean and mean
As I mentioned in a previous post, I recently did a webcast along with Jeff Byrne, another Taneja Group analyst, on the topic of storage challenges in server virtualization infrastructures.
Secondary to the backup challenges I described in that other post, we are also frequently asked about thin provisioning. Some of these questions express disbelief: "Is this a real technology? Why don't more vendors have it?" Others ask how to manage capacity in a thin environment, how effective thin remains over time, and what impact thin has on array performance.
Most of these questions come up because thin isn't a simple technology. That's why you see vendors like 3PAR talking about it in very differentiated ways, with 3PAR messaging about how they've built thin into their latest-generation ASIC. On the surface, it sure seems simple: basically, chop the empty space off any volume. But in reality, thin is pretty complex, and you had better be ready to put your vendor on the spot before you bite.
Thin isn't straightforward because it takes a whole new level of block management inside the array. We've long been accustomed to just dumping blocks on disk and letting the file system do most of the work. Thing is, we developed file systems with the expectation that they would be able to dump their stuff on a pretty big set of unintelligent blocks. Now, for the first time, we're trying to make both sides work together, and we need to be able to relate what is happening at the file system level to what is happening at the block level. This is not unique to thin; it is at center stage with de-dupe too (more to come on that soon).
When you start employing thin inside an array, you have a couple of issues.
- You need to be able to identify potential candidates for thinning, in real time, with no performance impact. This is not a simple task: it takes the ability to look inside the storage block and figure out what the file system is doing.
- You need to be able to provide performance from fewer spindles per host, because thin may substantially increase the number of hosts you attach to a given set of spindles. Look at it this way: you might use the same 100 spindles, but instead of chewing up all of your available space with overallocations to 20 hosts, you've now optimized your utilization and can cram 150 hosts onto those same 100 spindles. If those are high-I/O virtual hosts, you might be in trouble if your controller can't crank up the heat. And back to the first issue: your controller needs to be able to manage thin not for 20 hosts, but for the optimized 150.
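To put rough numbers on that second issue, here is a minimal Python sketch. Only the 100 spindles and the 20 versus 150 hosts come from the example above. The per-spindle IOPS figure is a placeholder I picked for illustration, not a measurement, and the function name is made up.

```python
def iops_per_host(spindles: int, iops_per_spindle: float, hosts: int) -> float:
    """Average back-end IOPS available to each host if load is spread evenly."""
    if hosts <= 0:
        raise ValueError("hosts must be positive")
    return spindles * iops_per_spindle / hosts


SPINDLES = 100
ASSUMED_IOPS_PER_SPINDLE = 180  # assumption: placeholder value, not a vendor spec

# Fat provisioning: capacity runs out at 20 hosts.
before = iops_per_host(SPINDLES, ASSUMED_IOPS_PER_SPINDLE, hosts=20)

# Thin provisioning: the same spindles now carry 150 hosts.
after = iops_per_host(SPINDLES, ASSUMED_IOPS_PER_SPINDLE, hosts=150)

print(f"per-host IOPS before thin: {before:.0f}")
print(f"per-host IOPS after thin:  {after:.0f}")
print(f"each host gets {before / after:.1f}x less back-end I/O")
```

Whatever per-spindle figure you plug in, the ratio is the same: going from 20 to 150 hosts leaves each host with 7.5 times less back-end I/O on average, which is why the controller has to make up the difference.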
In my opinion, the guys who are best able to do thin and scale it are the folks we've previously recognized as Next Generation Block Storage. These are vendors like 3PAR, Pillar, Xiotech, and XIV; the long-experienced storage virtualization vendors like FalconStor, SVC, and DataCore; and a handful of other vendors that optimize the placement and performance of every storage block in their system.
Further supporting my argument, 3PAR has recently taken thin to the next level by serving up an API and integrating it with VxFS so that the file system can tell the array to reclaim space when it is freed up. That shows how complex some file system operations can be, and how much they can still interfere with thin technologies over the long run. So look for deep-reaching, deeply integrated thin in your VMware infrastructure storage, and ask your vendors what their roadmap looks like for thin innovation. It takes deep integration in an array to deliver on the promises of thin.
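For readers who want to see why reclaim matters, here is a toy Python model of the idea. It is a sketch under my own assumptions, not 3PAR's API or anything VxFS actually calls: the class names, the `reclaim` method, and the block counts are all hypothetical. The volume maps a virtual block to a physical one only on first write, and the array gets space back only when something tells it a block is no longer in use.

```python
class ThinPool:
    """Shared pool of physical blocks, handed out on first write."""

    def __init__(self, physical_blocks: int) -> None:
        self.free = list(range(physical_blocks))

    def allocate(self) -> int:
        if not self.free:
            raise RuntimeError("pool exhausted: thin volumes are overcommitted")
        return self.free.pop()

    def release(self, block: int) -> None:
        self.free.append(block)


class ThinVolume:
    """Volume that advertises virtual_blocks but consumes space only when written."""

    def __init__(self, pool: ThinPool, virtual_blocks: int) -> None:
        self.pool = pool
        self.virtual_blocks = virtual_blocks
        self.mapping = {}  # virtual block number -> physical block number

    def _check(self, vblock: int) -> None:
        if not 0 <= vblock < self.virtual_blocks:
            raise IndexError("virtual block out of range")

    def write(self, vblock: int) -> None:
        """Allocate a physical block the first time a virtual block is written."""
        self._check(vblock)
        if vblock not in self.mapping:
            self.mapping[vblock] = self.pool.allocate()

    def reclaim(self, vblock: int) -> None:
        """Hypothetical hint from the file system: this block holds no live data."""
        self._check(vblock)
        pblock = self.mapping.pop(vblock, None)
        if pblock is not None:
            self.pool.release(pblock)


pool = ThinPool(physical_blocks=4)
vol = ThinVolume(pool, virtual_blocks=1000)  # advertises far more than the pool holds

vol.write(0)
vol.write(1)
print("free after two writes:", len(pool.free))

# The file system deletes the file on block 1. Without a hint, the array
# cannot know, and the block stays allocated. With the hint, it comes back.
vol.reclaim(1)
print("free after reclaim:", len(pool.free))
```

Without the `reclaim` call, a volume only ever grows toward its advertised size as the file system churns through blocks, which is the long-run erosion of thin savings described above.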

