As the cloud moves from concept to working reality, it brings with it a deep dependency on an area of IT that I predict will become central to how effectively the capability of the cloud is utilized. This same dependency underpins datacenter growth and change as well.
That area is information management.
Total capacity of shipped disk storage systems has grown by 14.8% in the last year(1). While profitability for solutions vendors has fallen as the storage market has finally started to feel the impact of the economic downturn, it is clear that demand for storage capability is not slowing down. The continual growth in the amount of data being created and needing to be stored is driving that demand, and it shows in the rising costs of managing that data.
Even at a personal level, how many readers have, in the last two years, gone from managing their personal data on DVDs to running terabytes of storage for movies, music and other media on home networks? Are you already using cloud-based services such as Hotmail and its associated free storage, not to mention Facebook, Flickr and so on? How many of you expect that use to slow down or diminish? How do you protect yourself from disaster? How do you deal with replication? What’s your personal strategy on de-duplication (do you really need multiple copies of that R&B album spread across multiple machines)? And have you considered and understood the privacy and ownership issues that relate to web-based storage?
Any business faces a similar reality: it needs to fully understand how it will manage the information that it owns. That management load includes not only storing the data, but also the challenge of managing capture, duplication, retention, access, and data protection. It’s clear that products for management of information are constantly improving, but at the same time the amount of data being stored and used keeps growing. And with the widespread adoption of virtualization, storage and its associated data protection form a complex environment that can become a significant source of frustration and challenge for an organization. Not to mention cost.
My recent research makes it clear that there is a significant omission in the guidance available on information management and the associated storage solutions. While there are multiple claims of "cost savings", the majority of offerings do not define how organizations can measure any benefits that are actually realized and delivered. The cost models that are publicly available are at best limited and simplified: they focus on generalizations and simple benefits related to technology. Further research into the state of the storage world has made this gap very clear, and given the lack of any clear data I'm now also left wondering what is missing in terms of impact on some of the other costs in the datacenter.
As our work to understand datacenter costs continues, it has become clear that information management involves both benefits that can be delivered (stability via baseline measurement, agility through standardization) and investments that need to be made (quality of service by building and enabling standardization, risk avoidance through enabling backup facilities). Together these help define where the information management costs related to your datacenter will fall. However, those cost models give little or no coverage to the need to have storage and its associated capabilities in place and in use, or to the impact of doing so. Given the trends above, that needs to change.
So a clear goal in this area, and a focus of further work, is to create a cost model that reflects the major issues and challenges. If I can build that structure, an overall assessment of how information management affects datacenter costs, and ultimately cloud costs, should emerge. It should prove an interesting journey.
I'm interested in your feedback regarding information management, related storage implications, storage management and any associated cost models - I'd be more than happy to debate my position on the lack of well-defined cost models.