More on the Storage Controller of the Future

Hu Yoshida, the CTO of Hitachi Data Systems, recently responded on his blog to my entry on the Storage Controller of the Future. I suggested that the enterprise-class storage controller of the future would look more like one of the emerging commodity-based clustered storage controllers (e.g. EqualLogic, Exanet, Isilon, LeftHand Networks) than the currently shipping enterprise-class solutions from EMC, HDS and IBM (e.g. the HDS TagmaStore USP). Hu suggested that while emerging clustered controller architectures seem appropriate for tier 2/3 storage needs, these platforms are ill-suited to mission-critical enterprise-class applications. I believe that, depending on the time frame we are talking about and the way the architecture is characterized, we may both be right.

For the next three to five years I agree with Hu: commodity-based clustered storage controllers will continue to make headway in tier 2/3 storage applications, but will make little progress in enterprise-class deployments. Beyond the five-year time frame, however, I believe that enterprise-class controllers will start to look less like the custom hardware platforms of today and more like a commodity-based clustered architecture.

My career developing storage controllers dates back to the 1980s, when it would have seemed preposterous that mission-critical applications running on mainframes would soon rely on arrays of affordable 5 1/4" disk drives protected with RAID algorithms. At that time no one could fathom how the performance and reliability of mainframe-class drives and storage systems could be matched using a bunch of relatively cheap SCSI drives. Yet by the early 1990s EMC had led the way and was eating IBM's lunch in enterprise accounts. Intelligent RAID and caching algorithms were being used not only to provide excellent reliability, but also to deliver the storage performance needed by the most demanding enterprise-class applications. So what was the formula that changed the way enterprise-class storage systems were architected in the 1990s? Commodity components (in this case SCSI hard drives) plus intelligent algorithms running on a massively parallel array of purpose-built hardware connected at high speed.

Hu got it right in his response when he stated that "a global cache is key to a scalable, enterprise storage controller." I agree wholeheartedly and believe that cache coherency is the biggest challenge when developing an enterprise-class solution using a cluster of commodity components. I've spoken with storage architects and have tested many of the emerging clustered storage controllers. Cache coherency, scalability and enterprise-class reliability are the design goals for each of these emerging solutions. Most are well on their way and have shown that intelligent software running on commodity servers glued together with FC, Ethernet, and even InfiniBand can be used to implement a cache-coherent single system image with near-linear scalability.

Hu states that the "biggest problem with clustered solutions is that each node in a cluster has its own cache, and write data has to be replicated across the cluster of caches in order to maintain write consistency." While I agree with the second statement (cache must be mirrored), I disagree with the first. In most of the emerging clustered storage architectures that I have been exposed to, the concept of a logically isolated cache per node has been scrapped. Instead, a distributed service-oriented approach has been implemented, with a universally agreed-upon, but truly distributed, addressing and lock management scheme.
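To make the idea of an agreed-upon but distributed addressing and lock scheme a little more concrete, here is a minimal Python sketch. It is purely illustrative and is not how any of the vendors mentioned above implement their products. The names `Node`, `Cluster` and `placement`, the hash-based mapping, and the choice of the "next" node as the mirror are all my own assumptions. The point it shows is that every node can compute, from the block address alone, which node owns the cache entry and its lock and which node holds the mirrored copy, so no central directory is needed.

```python
import hashlib
import threading


class Node:
    """One commodity controller node holding a slice of the logical cache (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.cache = {}   # block address -> data
        self.locks = {}   # block address -> lock, for blocks this node owns
        self._guard = threading.Lock()

    def lock_for(self, block):
        # Create the per-block lock on first use; the guard makes this thread-safe.
        with self._guard:
            return self.locks.setdefault(block, threading.Lock())


class Cluster:
    """In-process stand-in for a cluster; real nodes would talk over an interconnect."""

    def __init__(self, names):
        if len(names) < 2:
            raise ValueError("mirroring needs at least two nodes")
        self.nodes = [Node(n) for n in names]

    def placement(self, block):
        # Assumed scheme: hash the block address to pick the owner, mirror on the next node.
        # Every node runs the same function, so all agree without a central directory.
        digest = hashlib.sha256(str(block).encode()).digest()
        i = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[i], self.nodes[(i + 1) % len(self.nodes)]

    def write(self, block, data):
        owner, mirror = self.placement(block)
        # The owner's lock serializes writers; the write lands in both caches
        # before the lock is released, so the mirror never lags the owner.
        with owner.lock_for(block):
            owner.cache[block] = data
            mirror.cache[block] = data
        return owner.name, mirror.name

    def read(self, block):
        owner, mirror = self.placement(block)
        with owner.lock_for(block):
            if block in owner.cache:
                return owner.cache[block]
            # Fall back to the mirrored copy, e.g. after the owner lost its entry.
            return mirror.cache.get(block)
```

A caller would build `Cluster(["a", "b", "c", "d"])`, then call `write(42, b"payload")` and `read(42)`. The sketch deliberately ignores node failure, membership changes and network latency, which are where the real engineering difficulty lies.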

If you opened up an enterprise-class system today, I believe you'd find a mix of roughly 80% custom hardware and 20% commodity components (excluding drives). I believe that advancements led by the clustered storage controller community over the next five to ten years will tip the balance towards 80% commodity and 20% custom. Consider, for example, the hardware used inside enterprise-class controllers for high-speed communication between processors and shared memory. Today that's custom hardware. Five years from now, the commoditization of 10GigE or InfiniBand, which has just begun in the clustered server space, could be leveraged for use in enterprise-class storage architectures. So what's the formula that will change the way enterprise-class storage systems are architected in the future? In my opinion, the formula for cost-optimized success is commodity server components glued together with emerging high-speed interconnects, plus a healthy dose of extremely intelligent software.