Avoid Falling Into the Data Hoarder Trap

As the amount of data available to businesses continues to grow, it may be tempting to gather more of it indiscriminately in the hope that feeding more data into analytics will uncover additional insight.

Furthermore, the cost of data storage continues to drop, making it easier for a company to become a “data hoarder,” storing everything without deciding what data is relevant.

However, data storage still requires resources for maintenance tasks such as provisioning, backup, and recovery, according to an MIT Sloan Management Review blog post.

“…hoarding data interferes with existing data since it diverts scarce analyst and managerial resources that may be better applied elsewhere,” the post notes. “If actionable insights are the proverbial needle in the haystack, adding more data may just make the haystack bigger, and the needle that much harder to find.”

Avoiding the data hoarder trap is even more important now that companies report that Big Data analytics is changing the competitive landscape of their industries.

According to a recent survey from Accenture and GE, 76 percent of companies believe Big Data analytics will redefine the competitive landscape of their industries within the next three years. Moreover, 89 percent believe that companies that do not adopt a Big Data analytics strategy in the next year risk losing market share and momentum to competitors, the survey found.

The MIT Sloan post recommends companies take several precautions to avoid becoming data hoarders in their analytics initiatives. First, adding data should not obscure information that is valuable or create distractions that would interfere with ongoing data analysis.

Companies should ensure that added data has a defined purpose in the analysis by tackling questions such as how the new data will reduce uncertainty, allow more precise measurement, or provide information from a critical new source.

“Data collectors use experience or sampling to add data rich with potential to further the purpose of the analysis; data hoarders add data fearfully and speculatively,” according to the post. “The fear of making a wrong decision leads to keeping everything—and this can sometimes be counterproductive.”

In addition to ensuring that new data does not obscure valuable information, companies should add data only if the data they already have is not sufficient for successful analysis.

“All data is not created equal; there is considerable variation in quality and usefulness,” the post notes. “Data collectors consider multiple available alternatives to meet the purpose of the analysis; data hoarders add additional measures to the stack.”

The post goes on to recommend that companies add more data only if doing so does not compound bias already present in the data.

“All data is biased in some way,” according to the post. “Adding more data can add more bias, undermining overall data quality. Data collectors seek novel perspectives; data hoarders pile on convenient data and reinforce bias.”

Finally, the article suggests that adding data should not harm the overall analytical process. To avoid this, companies need to tackle these questions:

  • As new data is added, what is the process for updating the data and ensuring quality?
  • Is that process sustainable?
  • Does the value added by the new data exceed the long-term cost of maintaining it, so that the ROI is positive?
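
The ROI question above comes down to simple arithmetic, which can be sketched as follows. This is only a rough illustration, not a method taken from the MIT Sloan post: the function name, the split into a one-time onboarding cost and a recurring maintenance cost, and all of the figures are hypothetical assumptions.

```python
def data_source_roi(annual_value, annual_maintenance_cost, onboarding_cost, years):
    """Return ROI as a fraction: (total value - total cost) / total cost.

    All inputs are estimates supplied by the analyst; this hypothetical
    helper only does the arithmetic and does not validate the estimates.
    """
    total_value = annual_value * years
    total_cost = onboarding_cost + annual_maintenance_cost * years
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    return (total_value - total_cost) / total_cost


# Hypothetical figures, for illustration only.
roi_three_years = data_source_roi(
    annual_value=50_000,
    annual_maintenance_cost=30_000,
    onboarding_cost=40_000,
    years=3,
)
roi_one_year = data_source_roi(
    annual_value=50_000,
    annual_maintenance_cost=30_000,
    onboarding_cost=40_000,
    years=1,
)

print(f"ROI over 3 years: {roi_three_years:.1%}")
print(f"ROI over 1 year: {roi_one_year:.1%}")
```

With these made-up numbers the new data loses money in its first year and only pays off over a longer horizon, which is why the question stresses long-term maintenance costs rather than the initial cost of acquiring the data.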