By Henrik Pedersen, Senior Technical Account Manager, CommVault Systems
According to a survey conducted by Gartner, 47% of large enterprises identify data growth as the biggest data centre hardware infrastructure challenge. On average, data capacity in enterprises is growing at 40% to 60% per year. Research further shows that more than 52% of an organisation's digital content is unstructured data such as files, documents, images and video, while just 31% is structured.
Over 70% of this content is generated by end-users within the organisation. Employees often store personal data on company resources as they know that it will be securely maintained and regularly backed up. Thus, the data pool contains a mixture of data which is business-critical as well as data which has lesser or no business value.
Even business-related data can get stale over time: it becomes inactive and loses its business relevance. When data is not analysed, all of it is treated in the same manner, which leads to ineffective utilisation of company resources.
The real challenge organisations face lies not in having to deal with data growth, which is inevitable, but in the effective and strategic management of data. After all, while data growth is projected at 40-60% per year, growth in IT budgets is estimated to be just 2.6%, which is significantly less.
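To make the gap between those two growth rates concrete, the short sketch below compounds each rate over several years. It assumes both figures are annual rates applied year on year; the function names and the five-year horizon are illustrative choices, not figures from the survey.

```python
def project(start, annual_rate, years):
    """Compound a starting value by annual_rate for the given number of years."""
    return start * (1 + annual_rate) ** years


def growth_gap(data_rate, budget_rate, years):
    """How many times faster data has grown than budget after `years` years."""
    return project(1.0, data_rate, years) / project(1.0, budget_rate, years)


if __name__ == "__main__":
    # Illustrative horizon only; the rates are the ones quoted in the text.
    for rate in (0.40, 0.60):
        gap = growth_gap(rate, 0.026, years=5)
        print(f"data at {rate:.0%}/yr vs budget at 2.6%/yr: gap of {gap:.1f}x after 5 years")
```

Under these assumptions, even the lower 40% rate multiplies data volume roughly 5.4 times in five years, while a 2.6% rate multiplies the budget by only about 1.14 times.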
Factors contributing to unnecessary data growth
Long-term retention is a factor which complicates the overall data management process. Retention may be driven by business reasons, historical reasons, end-user requirements, and policies and regulations prescribed by the government or by the organisation itself. As the number of retention policies, both government and home-grown, grows, the organisation and storage of data become more complex.
Maintaining multiple copies of the same data is both inefficient and expensive. Apart from causing inconsistency and adding significant overhead, redundancy can affect long-running processes such as backup. Although the cost of storage devices is falling, having redundant data on these devices increases the time taken for backup.
This causes a significant increase in network overhead and bandwidth requirements. Furthermore, most large organisations with multiple locations globally generate large volumes of data on a constant basis.
Due to this global dispersion, backup windows are constantly shrinking, so only critical and business-relevant data should be identified and selected for regular backup. So what techniques do organisations employ to reduce their data storage requirements and utilise resources effectively?
Resource acquisition - the quick fix
The most common tactical reaction to the data growth problem is simply to buy more storage. Given the falling cost of storage, this knee-jerk reaction serves as a quick fix, but it often reflects an inability to carry out predictive capacity planning. The hoarding of data is further complicated by infinite retention policies, as the data is stored without consideration for its actual content.
Data archiving, along with data tiering, is considered to be an effective data reduction technique. But blind archiving, without first gaining insight into the data landscape or applying any governing policy, simply moves data between tiers and does not reduce the total volume of data being managed.
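As an illustration of what "gaining insight into the data landscape" might look like at its simplest, the sketch below totals the bytes under a directory tree by how recently each file was modified. The one-year threshold, the use of modification time as a proxy for activity, and the name `age_profile` are all assumptions made for this example; a real archiving policy would weigh business value and retention rules as well.

```python
import os
import time


def age_profile(root, stale_after_days=365, now=None):
    """Total file sizes under `root`, split into 'active' and 'stale' by mtime.

    A file counts as stale when it was last modified more than
    `stale_after_days` days before `now` (an assumed, simplistic policy).
    """
    now = time.time() if now is None else now
    cutoff = now - stale_after_days * 86400
    totals = {"active": 0, "stale": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            bucket = "stale" if info.st_mtime < cutoff else "active"
            totals[bucket] += info.st_size
    return totals
```

Calling `age_profile("/some/share")` on a hypothetical file share returns a dictionary of byte counts, which gives a first indication of how much data is a candidate for archiving or deletion rather than for regular backup.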
DeDuplication - beating the bloat
Finally, DeDuplication, or dedupe for short, is probably the most talked-about data management strategy.
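The general idea can be sketched in a few lines. The example below is a minimal illustration of one common approach, fixed-size chunking with content hashing, and is not a description of any particular product: the `ChunkStore` class, its methods and the chunk size are hypothetical names chosen for this sketch. Each chunk is stored once under its SHA-256 digest, and every stored object is kept as a list of digests from which it can be rebuilt.

```python
import hashlib


class ChunkStore:
    """Toy deduplicating store: identical chunks are kept only once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes

    def put(self, data):
        """Store `data` and return its recipe, a list of chunk digests."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Only the first occurrence of a chunk consumes space.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original bytes from a recipe."""
        return b"".join(self.chunks[digest] for digest in recipe)

    def stored_bytes(self):
        """Bytes actually held after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

Storing two backups that share most of their content through `put` grows `stored_bytes()` only by the chunks that differ, which is the saving dedupe aims for. Fixed-size chunking is the simplest choice here; it loses effectiveness when data shifts by a few bytes, which is one reason other chunking schemes exist.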