TDWI Articles

How to Control Your Enterprise's Data Deluge with File Analysis

These five steps can help you clear out redundant, obsolete, or trivial data, stop data hoarding, and optimize storage.

As the volume of corporate data grows ever faster, organizations urgently need new approaches to keep that data from becoming an overwhelming problem.

Although day-to-day operations generate an increasing amount of data, the challenge is compounded by what's been called "data hoarding." This occurs when staff members retain data no longer required or keep multiple copies of the same file in different locations. It can also occur when employees store personal data on corporate machines.

Dealing with data hoarding can bog down your IT teams, diverting time and resources from more important projects.

Storing large amounts of data also forces a trade-off: either you compromise on performance, or your costs spiral and consume an ever-greater proportion of your overall IT budget. There are security concerns as well; when large volumes of data are stored and copied across multiple locations, sensitive data can end up in insecure locations.

To manage the influx of corporate data, the following steps can prove helpful:

1. Identify all redundant, obsolete, or trivial (ROT) data

Most of the data that causes these issues can be categorized as ROT. ROT data adds to the expense of providing ever-growing storage capacity across an organization. You can reclaim that capacity by using file analysis tools.

Such tools can be configured to examine all data stores and identify any files that fall into the ROT category. Duplicate files should be automatically deleted while the master copy is retained. Any files that require human evaluation should be shifted to a different location for further analysis.
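As a rough illustration of how duplicate detection can work, the sketch below groups files by size, hashes only the candidates that share a size, and reports each set of byte-identical files. It is not how any particular file analysis product works. The function names and the rule for choosing the "master" copy (the oldest modification time) are assumptions made for this example. The code only reports duplicates; deleting them is left as a separate, reviewed step.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> dict[Path, list[Path]]:
    """Map each retained 'master' file to its byte-identical copies."""
    # Files with a unique size cannot have duplicates, so hash only the rest.
    by_size = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file() and not p.is_symlink():
            by_size[p.stat().st_size].append(p)

    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for p in paths:
            by_digest[file_digest(p)].append(p)

    duplicates = {}
    for paths in by_digest.values():
        if len(paths) > 1:
            # Assumption for this sketch: the oldest file is the master copy.
            paths.sort(key=lambda p: (p.stat().st_mtime, str(p)))
            duplicates[paths[0]] = paths[1:]
    return duplicates
```

Calling `find_duplicates(Path("/some/share"))` (a hypothetical path) returns a dictionary you can review before removing anything.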

2. Optimize your disk space

Monitor disk space use across your organization and alert your IT team if free space falls below a predefined limit. This allows management to take preemptive steps before there is an impact on performance.
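A minimal sketch of such a check, using Python's standard `shutil.disk_usage`, might look like the following. The 15 percent threshold and the function names are illustrative assumptions, and a real deployment would send the alerts to whatever notification system your IT team already uses.

```python
import shutil


def free_space_fraction(path: str) -> float:
    """Return the fraction of the volume containing `path` that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def volumes_below_threshold(paths, min_free_fraction=0.15):
    """Return (path, free_fraction) pairs for volumes under the limit.

    The default limit of 15% free is an assumed value for illustration.
    """
    alerts = []
    for path in paths:
        fraction = free_space_fraction(path)
        if fraction < min_free_fraction:
            alerts.append((path, fraction))
    return alerts
```

Running `volumes_below_threshold` on a schedule, for example from a cron job, gives the team time to act before free space runs out.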

3. Perform an analysis of the metadata

To lower operational costs and reduce infrastructure managers' workloads, you can use file analysis tools to automatically analyze the metadata associated with stored files. This allows you to determine whether such files are still relevant. If not, they can be archived or deleted.

4. Identify and analyze your employees' data storage patterns

Monitor and analyze your users' data storage patterns. After establishing a baseline of normal use, you can flag anomalies and take the requisite remedial actions. Some file analysis tools apply AI and ML techniques to help with this kind of analysis.
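To make the baseline-and-anomaly idea concrete, here is a deliberately simple statistical sketch, not the AI or ML approach a commercial tool might use. It assumes you already have, for each user, a history of daily storage growth in megabytes. It flags anyone whose figure for today lies more than a chosen number of standard deviations above their own mean. The function name, input shape, and threshold of 3 are all assumptions.

```python
from statistics import mean, stdev


def flag_anomalies(
    baseline: dict[str, list[float]],
    today: dict[str, float],
    z_threshold: float = 3.0,
) -> dict[str, float]:
    """Return {user: z_score} for users far above their own baseline."""
    flagged = {}
    for user, history in baseline.items():
        # A standard deviation needs at least two historical observations.
        if user not in today or len(history) < 2:
            continue
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Simplification: a perfectly flat history is skipped here.
            continue
        z = (today[user] - mu) / sigma
        if z > z_threshold:
            flagged[user] = z
    return flagged
```

A user who normally adds about 200 MB a day and suddenly adds 900 MB would be flagged for follow-up, while ordinary day-to-day variation would not.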

5. Assess who is accessing which files and when they are doing so

It's important to assess how often certain files are accessed and by whom. By doing so, you can determine whether to reduce permissions for certain data sets. It's likely that access should be limited to a smaller number of staff members, at least for some data.
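As a sketch of this assessment, the function below compares who is permitted to access each file with who has actually accessed it recently, and lists the permissions that look unused. It assumes you can export permissions as a mapping of path to users and access events as `(user, path, timestamp)` tuples. Those input shapes, the 90-day window, and the name `unused_permissions` are hypothetical.

```python
from datetime import datetime, timedelta


def unused_permissions(
    permissions: dict[str, set[str]],
    access_events: list[tuple[str, str, datetime]],
    as_of: datetime,
    window_days: int = 90,
) -> dict[str, list[str]]:
    """Return {path: [users]} who hold access but did not use it in the window."""
    cutoff = as_of - timedelta(days=window_days)

    # Collect the users who accessed each path within the window.
    recent: dict[str, set[str]] = {}
    for user, path, when in access_events:
        if when >= cutoff:
            recent.setdefault(path, set()).add(user)

    report = {}
    for path, users in permissions.items():
        idle = users - recent.get(path, set())
        if idle:
            report[path] = sorted(idle)
    return report
```

The result is a list of candidates for review, not an instruction to revoke access automatically; some staff legitimately need access to files they rarely open.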

A Final Word

The effective deployment of file analysis tools can have a significant positive effect on how data is managed across your organization. By taking the time to carefully evaluate your existing data infrastructure, you will be much better prepared to manage your ongoing influx of corporate data. Some tools incorporate AI capabilities, allowing you to automate many tasks, which can further lessen the management burden and keep data hoarding under control.

About the Author

Ranjith Raj Gnanaprakasam is a product manager at ManageEngine, the IT management division of Zoho Corp. After nearly a decade at the company, he has developed significant expertise in enterprise security and data-centric protection techniques, as well as deep knowledge of how to discover, monitor, and protect the most sensitive data stored by organizations.

