25x backup compression at HDS
Much coolness over at HDS this week as they announced their OEM deal with Diligent Technologies. As I've noted at StorageMojo.com, I am a big fan of Diligent's 25x backup data compression technology that is part of their Virtual Tape Library product.
Fuzzy Logic, Meet Fuzzy Marketing
Oddly, these shy-and-retiring ex-EMC'ers don't call their product data compression. Yet doesn't something that reduces your backup data volume by 25x over time sound like a compression product?
The Feature That Dare Not Speak Its Name
So instead of compression, the jargon word is de-duplication. Besides being unnecessary, the term sounds like a file-based scheme, which it isn't. Diligent's, Data Domain's and Avamar's backup compression products all work on byte streams, so they are independent of file systems, operating systems and backup software. There are file-based schemes, but they offer neither the compression nor the system independence of the byte-stream versions.
What You Need To Know
Fundamentally, all byte-stream products (including those from Data Domain and Avamar) look at a stream of bytes, compare each chunk with what is already stored, and, when they match, store a pointer to the earlier chunk instead of a second copy. The more matches, the higher the compression ratio. As you back up more data, the compression ratio rises, assuming the data doesn't change that much between backups. A Diligent exec also told me that when two blocks are similar, they store only the differences, but that is not how their white paper describes it.
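To make the pointer idea concrete, here is a minimal sketch in Python. It is not how Diligent, Data Domain or Avamar actually do it: the fixed 4 KB chunk size, the SHA-256 fingerprint, and the names `DedupStore` and `make_chunk` are all assumptions made for illustration. It only shows the general principle that a chunk already seen is stored once and referenced afterwards, and why the ratio climbs as mostly-unchanged backups pile up.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen for illustration


class DedupStore:
    """Toy byte-stream store: each distinct chunk is kept once."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = {}        # fingerprint -> chunk bytes (stored once)
        self.logical_bytes = 0  # bytes the backups asked us to keep
        self.stored_bytes = 0   # bytes actually written to the store

    def backup(self, stream):
        """Ingest a byte stream; return a list of chunk fingerprints."""
        recipe = []
        for i in range(0, len(stream), self.chunk_size):
            chunk = stream[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).digest()
            if digest not in self.chunks:
                # New data: store the chunk itself.
                self.chunks[digest] = chunk
                self.stored_bytes += len(chunk)
            # Either way, the backup only records a pointer (the digest).
            recipe.append(digest)
            self.logical_bytes += len(chunk)
        return recipe

    def restore(self, recipe):
        """Rebuild the original stream by following the pointers."""
        return b"".join(self.chunks[d] for d in recipe)

    def ratio(self):
        """Logical bytes divided by stored bytes."""
        return self.logical_bytes / self.stored_bytes if self.stored_bytes else 0.0


def make_chunk(seed):
    """Deterministic chunk of CHUNK_SIZE bytes, distinct for each seed."""
    return hashlib.sha256(str(seed).encode()).digest() * (CHUNK_SIZE // 32)


if __name__ == "__main__":
    store = DedupStore()
    data = b"".join(make_chunk(i) for i in range(100))
    for night in range(5):
        # Each night one chunk changes; the other 99 are already stored.
        tonight = make_chunk(1000 + night) + data[CHUNK_SIZE:]
        store.backup(tonight)
        print(f"backup {night + 1}: ratio {store.ratio():.2f}")
```

Fixed-size chunks are the simplest choice, but a single inserted byte shifts every later chunk boundary and defeats the matching, which is presumably one place where each vendor's secret sauce comes in.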
What these guys do is, in principle, not unlike MPEG-4. Given similar frames, MPEG-4 stores the differences to achieve great compression. The big difference from video: the backup guys compare a new data segment to ALL stored segments, not just the prior one. Each of the companies maintains they have the absolute best secret sauce, but who cares? They all work.
That said, I think Diligent has the coolest version since they don't tie you to hardware the way Data Domain does, but maybe you prefer the "one throat to choke" model. Avamar has a slightly different take that emphasizes network traffic reduction and that too may float your boat.
Enabling D2D Backup
This is interesting only because it confers a real benefit: 25x compression makes disk-to-disk backup just as cheap as tape. It also moves tape one step closer to that big archive in the sky.
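The arithmetic behind that claim is simple enough to sketch. The function names below are made up for illustration, and the prices in the usage note are placeholders rather than real quotes: plug in your own numbers. The idea is just that the effective cost of disk is its raw cost divided by the compression ratio, so disk matches tape once the ratio reaches the disk-to-tape price gap.

```python
def effective_cost_per_gb(raw_cost_per_gb, compression_ratio):
    """Cost per GB of backup data once compression is accounted for."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return raw_cost_per_gb / compression_ratio


def break_even_ratio(disk_cost_per_gb, tape_cost_per_gb):
    """Compression ratio at which compressed disk costs the same as
    uncompressed tape, per GB of backup data."""
    if tape_cost_per_gb <= 0:
        raise ValueError("tape_cost_per_gb must be positive")
    return disk_cost_per_gb / tape_cost_per_gb
```

For example, with a placeholder disk price of 5.00 per GB and 25x compression, `effective_cost_per_gb(5.0, 25)` comes to 0.20 per GB of backup data. Disk is then as cheap as tape if tape costs you 0.20 per GB or more.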
One More Thing
Despite the coolness, I believe any backup compression is a feature, not a product. Enterprise backup vendors should license a version, or build their own, and add it to their backup tools. There should be an open source project as well, just to keep everyone honest.



