Top Three Challenges for Data Miners


An article in the June issue of Data Analytics magazine discusses the results of the 2010 Rexer Analytics annual data mining survey. Unlike surveys that target analytics executives, the Rexer survey goes directly to the data miners themselves. In the 2010 survey, data miners identified their top challenges, and many discussed how they’ve tried to work around or overcome those obstacles. In both the challenges and the workarounds, a clear theme emerges that has very little to do with statistics and a lot to do with engaging and communicating with business users.

1. Dirty Data
It’s no surprise to Rexer that dirty data tops the list, because it has held that spot for the past several years. It’s probably no big surprise to anyone who reads our blog either, because we’ve discussed in several posts how dirty data can derail data analytics and business intelligence projects. In the Rexer survey, many data miners described how they’ve tried to overcome the problem, and a clear theme emerges: involve business users. Data miners use descriptive statistics and visualization to help business users understand their data and identify problem areas. Working with the data “hands on” gives everyone a shared understanding of its quality. This can help manage expectations about what a data modeling exercise can deliver given the quality of the data, and it can convince data owners to create action plans to improve quality.
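As a rough illustration of the kind of descriptive statistics that can start this conversation, here is a minimal Python sketch that profiles each column of a small table for missing values, distinct values and the most common value, and counts exact duplicate rows. It is not taken from the Rexer survey; the sample data, the column names and the function names (`profile_column`, `profile_table`, `count_duplicate_rows`) are all invented for this example.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data: column names and values are invented for illustration.
SAMPLE_CSV = """customer_id,region,annual_spend
1001,East,2500
1002,,1800
1003,east,
1004,West,-50
1004,West,-50
"""


def profile_column(values):
    """Return simple quality statistics for one column of raw string values."""
    present = [v.strip() for v in values if v.strip()]
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "most_common": Counter(present).most_common(1)[0] if present else None,
    }


def profile_table(csv_text):
    """Profile every column of a CSV document supplied as text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    return {col: profile_column([row[col] or "" for row in rows]) for col in columns}


def count_duplicate_rows(csv_text):
    """Count rows that exactly repeat an earlier row."""
    rows = [tuple(r.values()) for r in csv.DictReader(io.StringIO(csv_text))]
    return len(rows) - len(set(rows))


if __name__ == "__main__":
    for column, stats in profile_table(SAMPLE_CSV).items():
        print(column, stats)
    print("duplicate rows:", count_duplicate_rows(SAMPLE_CSV))
```

Even a summary this simple surfaces questions a business user can answer quickly: whether “East” and “east” are the same region, whether a negative spend is a refund or an error, and whether the repeated customer row is a true duplicate.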

2. Explaining Data Mining To Others
The second challenge for data miners was explaining data mining to others. Some data miners expressed frustration with executives who don’t support solutions because they lack the background to understand data mining, yet refuse to sit through more than a brief presentation on the topic. Data miners recommended finding support one level down from the executive: someone who is willing to invest the time to understand the solutions and to champion them with the senior executive. Other data miners went even lower in the organization and convinced key users to identify a problem and work interactively with the data miner on the solution. This lets business users see the power and capability of data analytics first-hand and get answers to their questions “on the fly.”

3. Difficulty Accessing Data
The number three challenge was difficulty accessing data, either because data that exists is hard to reach (for example, because it is scattered throughout an organization) or, more commonly, because the data does not exist at all. Data miners generally agreed that the difficulty stems from the lack of a plan or strategy for data: what data is needed, how it can be obtained, how its quality can be assured or improved, and how it can be maintained. Again, data miners suggest working directly with business users to match business problems with data requirements, and using this as a way to begin developing a broader plan for data collection and data accessibility.

Six Things To Do Next

1. For more tips on how to effectively communicate with business users, check out my post on “Six Essential Soft Skills for Data Analytics and BI Professionals”.

2. To review a list of ideas and suggestions from data miners on how to overcome the top challenges, go to the Rexer website.

3. To learn how to develop your own enhanced visualizations designed to influence business users, download our recent complimentary webcast titled “What’s New with Spotfire version 3.3”.

4. For a free copy of the 37-page summary report of the 2010 Rexer survey, email DataMinerSurvey@RexerAnalytics.com.

5. To participate in the 2011 Rexer survey, go to surveymonkey.com and enter “INF28” for the access code.

6. To stay informed on upcoming data mining and data analytics topics and trends, subscribe to our blog.

Steve McDonnell
Spotfire Blogging Team