Predictive Analytics and Social Media: Predicting the Unpredictable

University researchers have discovered a new way to predict what topics on Twitter will be popular hours before they are identified as trending topics, offering a novel method to analyze information that changes over time.

MIT professor Devavrat Shah and his student Stanislav Nikolov have developed a new algorithm that they say can predict, with 95% accuracy, which Twitter topics will trend, meaning they suddenly explode in volume as they gain popularity.

Twitter determines trending topics with its own algorithm, which analyzes both the number of Tweets about a topic and whether that number has recently grown, according to an MIT report on the research.

Shah notes that his research differs from the standard approach to machine learning, in which researchers posit a general hypothesis about a pattern and then infer the specifics of that pattern from the data.

“You’d say, ‘Series of trending things . . . remain small for some time and then there is a step,’” Shah says in the MIT article. “This is a very simplistic model. Now, based on the data, you try to train for when the jump happens, and how much of a jump happens. The problem with this is, I don’t know that things that trend have a step function. There are a thousand things that could happen.”

With the method that he’s developed, the data decides, he adds.

Shah and Nikolov compare changes over time in the number of Tweets about a new topic to a sample set of data. Samples whose statistics resemble those of the new topic are given more weight in predicting whether the topic will become a trend or fade away.

In essence, the comparison allows each item in the sample data set to “vote” on the likelihood that the topic will trend on Twitter. The method can be applied to any sequence of measurements taken at regular intervals, such as ticket sales for movies or stock prices, according to MIT.
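The voting idea can be illustrated with a short Python sketch. This is not Shah and Nikolov's actual algorithm: the squared-distance measure, the exponential weighting, the `gamma` parameter, and the tiny made-up example series are all assumptions chosen only to show how similar samples can be given more weight.

```python
import math

def distance(a, b):
    """Squared Euclidean distance between two equal-length series (an assumed similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def vote_trending(observed, trending_examples, quiet_examples, gamma=1.0):
    """Return True if the weighted votes favor 'trending'.

    Each labeled example votes for its own class. Its vote is weighted by
    exp(-gamma * distance), so examples that look like the observed series
    count for more. The weighting scheme and gamma are illustrative assumptions.
    """
    trend_votes = sum(math.exp(-gamma * distance(observed, ex)) for ex in trending_examples)
    quiet_votes = sum(math.exp(-gamma * distance(observed, ex)) for ex in quiet_examples)
    return trend_votes > quiet_votes

# Made-up Tweet counts per time interval, for illustration only.
trending_examples = [[1, 2, 4, 8], [1, 3, 5, 9]]
quiet_examples = [[1, 1, 2, 1], [2, 1, 1, 2]]

print(vote_trending([1, 2, 5, 8], trending_examples, quiet_examples))  # a rising series
print(vote_trending([1, 1, 1, 2], trending_examples, quiet_examples))  # a flat series
```

Because the vote depends only on how closely the new series matches the labeled samples, no step function or other fixed shape has to be assumed in advance.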

This is not the first time researchers have applied predictive analytics to social media data to forecast seemingly unpredictable trends.

A professor at the University of California, Riverside (UCR), and other researchers have created a model that uses Twitter data collected on a given day to help predict how often a stock will be traded, and at what price, the following day.

A trading strategy based on the researchers’ model “outperformed other baseline strategies by between 1.4 percent and nearly 11 percent and also did better than the Dow Jones Industrial Average during a four-month simulation,” according to UCR Today.

“These findings have the potential to have a big impact on market investors,” says Vagelis Hristidis, an associate professor at the Bourns College of Engineering, who helped develop the new model. “With so much data available from social media, many investors are looking to sort it out and profit from it.”

The researchers found that a stock’s price correlates with the number of connected Tweets about the company, that is, Tweets about distinct topics that all relate to that one company.

Data scientists have also turned to Facebook in attempts to forecast the fluctuating stock market. Arthur J. O’Connor, who has worked in risk management on Wall Street for a couple of decades, has developed a method that analyzes whether likes on Facebook affect the stock prices of consumer brands.

“My theory was, you know, it’s like in high school,” he says in an NPR report. “Does being really popular help you win friends [or] help you enhance your performance? And it turns out that, yeah, popularity does seem to help brands.”

O’Connor spent a year tracking the likes of the 30 brands with the most followers on Facebook, along with their daily share prices.

“So, 99.95 percent of the change could be explained by the change in fan counts,” he adds.

The admiration a company gets on social media seems to be a good predictor of its stock market performance, according to O’Connor.
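A figure such as “percent of the change explained” reads like a coefficient of determination (R²), although the NPR quote does not say how O’Connor computed it. The sketch below assumes a simple linear relationship and uses made-up numbers, not O’Connor’s data. The function name `r_squared` and both sample lists are hypothetical.

```python
def r_squared(xs, ys):
    """Share of the variance in ys explained by a simple linear fit on xs.

    For a one-variable linear regression this equals the squared
    Pearson correlation between xs and ys.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)

# Made-up values for illustration only: changes in fan counts and share prices.
fan_changes = [1.0, 2.0, 3.0, 4.0, 5.0]
price_changes = [0.9, 2.1, 2.9, 4.2, 4.9]

print(round(r_squared(fan_changes, price_changes), 4))
```

A value near 1 means the two series move together almost perfectly. It does not show that one causes the other.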


Heather Harreld
Spotfire Blogging Team