What is a Citizen Data Scientist?

A citizen data scientist is a knowledge worker without formal training in advanced mathematics and statistics who uses applications to extract high-value insights from data. A citizen data scientist uses data and analytics on a daily basis to solve specific business problems through a point-and-click interface. They rely on tools that hide much of the difficulty of tasks like data preparation and automate much of the work of modeling and detecting patterns in data.

[Figure: Citizen data scientist diagram]

Digital transformation initiatives have affected every aspect of how organizations do business today. These data-driven changes have led more and more business leaders to turn to citizen data scientists to fill the gap between the demand for data and analytics and the limited supply of skilled data scientists in the market. Citizen data scientists help address this skills shortage: they are able to create data science models using advanced and predictive analytics without a background in statistical analysis.

Why is There a Growing Demand for Citizen Data Scientists?

The citizen data scientist role is at the heart of getting more out of advanced analytic technology without spending large sums of money to hire well-trained data scientists. The citizen data scientist is the organization’s best chance to groom scarce modeling and analytical skills that will allow it to meet urgent business demands and turn data into action. Smart organizations today employ data science teams that combine data scientists and citizen data scientists. The goal of citizen data scientists is not, however, to replace data scientists, but to complement them and fill in skill gaps in understanding both the data and the business.

The Rise of the Citizen Data Scientist

The rise of the citizen data scientist can be attributed to:

  1. How strong an asset citizen data scientists are proving to be. They are a cost-effective alternative to expert data scientists, easier to find and less expensive to hire, and they are able to complement those data scientists’ work.
  2. How much more accessible data science has become to non-experts. Modern analytics and business intelligence (BI) tools are enabling users across the business to engage with and better understand data. Solutions related to augmented analytics and machine learning (ML) are helping citizen data scientists more easily complete data discovery and analytics tasks that were once accomplished only by expert data scientists.

How to Empower Citizen Data Scientists

Advanced analytics and machine learning are becoming increasingly important in today’s connected world.

Driving value from these technologies relies on organizations empowering citizen data scientists to develop models around advanced data analytics, machine learning and algorithmic business, and then delivering those models to the line-of-business (LOB) managers and business users who need them to make better decisions.

Citizen data scientists are the key to getting the most value out of your advanced analytics investment without spending too much on expert data scientists. When empowered by the organization, citizen data scientists without formal training are still able to extract valuable insights from data. They employ a variety of tools to make data science tasks less difficult, such as automation tools for data preparation, modeling, and pattern recognition.

Organizations can Empower Citizen Data Scientists with a Combination of People, Processes, and Technology

People

Most definitions of citizen data scientists are broad enough to encompass LOB staff, business analysts, employees in business intelligence (BI), and even IT. With such a broad reach, the citizen data scientist plays a valuable role in what analyst Howard Dresner calls “information democracy,” ensuring that data and insights are shared across the business. Companies can no longer get by without BI and analytics applications. It is critical to get valuable information into the hands of the business and other stakeholders, instead of just the data scientists and other data experts.

Process

The process by which data scientists and citizen data scientists make better use of data and analytics is underpinned by a deeper question about the organization as a whole: Does it have processes for sharing anything? This is not always a given in companies that have grown quickly, have grown through mergers and acquisitions, or have begun to shrink. If the culture has never embraced or fostered the notion of transparency and sharing, then whatever process the company may put in place to use software to publish analytical models and the data they harvest is unlikely to succeed.

Once the citizen data scientists have stepped forward and the data scientists have qualified them, the process of carving up work begins.

The goal of engaging citizen data scientists is not to replace data scientists but to complement them with a set of power users who can use your applications to pick up where scientists leave off and fill in any skill gaps. Given that the optimal use of big data requires knowledge of coding, statistics, machine learning, database management, visualization techniques and industry-specific knowledge, the best way to pull it off is by combining multiple skill sets. At the very least, citizen data scientists offer the greatest value in the area of LOB knowledge, something it would be inefficient for a data scientist to stop and learn to any useful degree.

Once a process is in place, the traditional barriers that data scientists face in gaining buy-in, both upstream to management and downstream to staff, begin to fall as information democracy puts more data into more hands. Beyond arriving at insights that boost revenue or lower costs in the short run, the promise of data science lies in applying those insights in ways that beneficially shape the company’s direction in the long run. The smoothest way there is by linking the efforts of trained data scientists and citizen data scientists.

In practice, it makes sense for data scientists to stick to the work of advanced analytics and statistics for which they are trained, creating workflows for data preparation and modeling. When those workflows are ready to test or take into production, the data scientists use your analytics software to push them to the citizen data scientists, who run them and ensure they work as designed. In time, the citizen data scientists can assume greater responsibility, using your application to modify workflows and create their own.

Technology

Most analysts reach reflexively for a spreadsheet program to crunch numbers and arrive at insights. The intuitive, trusted, row-and-column format makes immediate sense and is highly flexible. However, spreadsheet software eventually runs out of gas, whether in collaborating, sharing, combining disparate data sets, performing advanced analytics, or executing repeatable workflows.

Data scientists know that it is futile to impose raw math and statistics on people who are not adept at them. The goal is to get an analytics platform into the hands of people who can build the models for use all around the organization. Every analytics platform claims ease of use, but that is not enough. It must be sufficiently powerful to meet the needs of data scientists yet easy enough for non-technical staff to use automated, sharable workflows across the business.