You can argue that data modeling and predictive analytics are as much art as science. There are many ways to approach a problem when searching for the best solution. That’s why some organizations have begun to “crowdsource” their modeling projects, making the data available to the public to see what solutions others can devise. We’ve posted here about the contest to win Red Sox Gear as well as the Heritage Health Prize, which is worth $3 million to the winner.
If you have a data modeling problem that you’d like to crowdsource, there’s a platform for doing so. It’s called Kaggle: “a platform for data prediction competitions that allows organizations to post their data and have it scrutinized by the world’s best data scientists.”
Kaggle has hosted 17 competitions covering a wide range of topics, such as World Cup outcomes, HIV progression, chess ratings and travel time predictions. Prizes have ranged from $100 to $10,000, apart from the $3 million Heritage Health Prize, which is one of three currently active competitions on Kaggle.
About 10,000 members are registered to compete on the site. They come from all over the world and have quantitative backgrounds such as computer science, statistics, math, engineering, finance, operations research and actuarial science. A chess rating competition sponsored by Deloitte, with a prize of $10,000, has attracted some high-profile competitors: three members of the team that won the Netflix movie rating prize, the Microsoft team that devised the Xbox TrueSkill rating system, and Mark Glickman, the developer of the “Glicko” rating system.
The contest sponsor takes part in an ongoing dialogue with contestants through the site’s forums. That dialogue can reveal different approaches to solving the problem, raise questions that hadn’t been considered and offer insight into the data. Some companies are also interested in using the platform to find talented recruits for their internal data modeling efforts.
You can read more about the individual contests, including some of the winning algorithms and how they were developed, on the Kaggle blog.
To learn more about how Spotfire can help you use predictive analytics to gain insight from data, view the March 25, 2011 recorded webcast “Predictive Analytics with Spotfire”.
Steve McDonnell
Spotfire Blogging Team