The development of big data, artificial intelligence, and predictive analytics has created extravagant expectations for enterprise productivity growth—and aroused popular anxiety about intelligent information systems taking jobs from human workers. It is ironic, against that backdrop, that what is holding back widespread adoption of these technologies is, of all things, a manpower shortage.
Big data and advanced analytics are the products of data science. What keeps companies from putting them to effective use is an acute shortage of data scientists.
The US alone is facing a projected shortage of 140,000 to 190,000 data scientists by 2018. Nor are there enough managers or analysts with the professional training in mathematics or analytics to make sense of big data or put these tools to practical business use.
This isn’t actually a new problem. I started my career in semiconductor manufacturing, one of the most complex and data-intensive manufacturing environments imaginable. Processes, systems, and tools are highly standardized. Semiconductor companies spend inordinate amounts of money collecting and storing data used to standardize and automate manufacturing processes. But the analytics used to understand and optimize these processes were rarely standardized. You might find 30 different people in their separate offices and labs analyzing essentially the same problem 30 different ways.
You see this across multiple industries. Visionary analyst reports depict revolutionary changes in business strategy driven by cutting-edge analytics. But in the real world, many complex analysis tasks are being done in Excel or, in some industries, with pencil and paper.
But a forward-thinking approach to the shortage of data science talent is emerging. Companies are identifying line-of-business people: smart employees who are not specifically trained in math or statistics but do have insightful perspectives on the business problems to which they hope to apply big data solutions. They are grooming these individuals to develop and administer models based on predictive or prescriptive analytics and to interpret the results for the benefit of other line-of-business users, and they are giving them wizards and templates developed for specific kinds of business analyses.
These people are being developed into specialists whose expertise sits between that of the data scientists and the business users. We like to call them Citizen Data Scientists.
The emergence of citizen data scientists is part of a general democratization of data in large organizations. In order for the promise of the new analytics to be realized, these tools must be as broadly available as possible, both for individuals to draw from big data resources and to contribute to the pool. It is impractical for the analytics to be administered only by a small priesthood of experts. There has to be a role for the kinds of people we used to call “power users” to pose, and solve, critical analytics problems.
Organizations will still need data scientists and will have to deal with their scarcity. One of the responsibilities of the data scientists, however, will be to mentor citizen data scientists who lack the deeper quantitative skills but whose business expertise actually gives them a deeper perspective on the marketing or operational problems to be resolved.
Organizations rarely have a formal way of recognizing power users, but executives informally know who to go to for smart hacks. The development of a role for citizen data scientists is a level up from this tradition. It is unlikely to work in organizations that have not already developed a relatively open information sharing culture. Rigidly hierarchical, top-down organizations will have a harder time creating a role for citizen data scientists and seeing to it that they are effectively recognized and trusted.
However, those organizations that succeed at this can achieve a successful blending of skill sets, so that those individuals who have invested years in acquiring first-hand knowledge of statistics, machine learning, database management, visualization, and coding do not also have to be deep in the practical business issues specific to how the company operates. Nor do the people with deep knowledge of the product, the market, the regulatory environment, and the peculiar whims of customers also need to be artificial intelligence wonks.
The citizen data scientist role is not for everyone. It is best suited for individuals with an affinity for information systems, obviously, but also patience, good communication and consultative skills, and an ability to translate between the business problem and the technology tools that can be used to solve it—a grasp of both the competitive requirements of the business and the practical constraints of the IT infrastructure. As industries have digitalized, and information systems have become more widely recognized as strategic assets, many organizations have gotten better at identifying the people who have these qualities.
There is an important caveat for organizations interested in creating a role for citizen data scientists. Typically, the individuals identified as candidates already have jobs. They have defined responsibilities, and specific performance metrics for those responsibilities. These individuals live by those metrics; their performance against them can be critical to their individual career advancement. The additional responsibilities that come with their citizen data scientist roles could easily conflict with the responsibilities they already have, unless it is made clear that they will be taking on a new role, with new metrics, as opposed to taking on additional responsibilities on top of those they already have.
In a period of scarcity of data science expertise, the evolution of the citizen data scientist role could be your organization’s best route to operationalizing predictive and prescriptive analytics. Statistica offers a practical white paper, entitled “Embedded Analytics Empower the Citizen Data Scientist,” that helps to make the business case for this emerging role.