The term “big data” is a bit of a misnomer. The volume of data streaming into companies from social networks, online shopping, mobile devices and sensors attached to machinery is massive, but the most important aspect of big data is how it can be applied to solve some of the world’s most vexing problems.
That’s the assertion of a new article from Harvard Magazine, which notes that the most compelling aspects of big data are the ability to create new knowledge by linking datasets and the creative approaches it allows for visualizing data.
The article cites several applications of big data that improve people’s lives, including predicting where and when crimes will occur to allocate police resources, linking air quality with health, and using analysis to develop drought-resistant crops.
Weatherhead University Professor Gary King developed and implemented “what has been called the largest single experimental design to evaluate a social program in the world, ever,” notes Julio Frenk, dean of Harvard School of Public Health (HSPH) and former minister of health for Mexico.
When Frenk took office in 2000, more than half of Mexico’s health expenditures were being paid out of pocket, with four million families decimated by catastrophic expenses for health care.
Frenk led a healthcare reform that created a new public insurance scheme, and King invented methods for analyzing it.
After 10 months, King’s study showed that the public insurance successfully protected families from catastrophic expenditures, and his work guided additional needed improvements such as promoting the use of preventive care.
“People are literally dying every day” simply because researchers are not sharing data, King notes in the article.
Nathan Eagle, an adjunct assistant professor at HSPH, has used data analysis to help predict disease outbreaks in Rwanda. He built models of commuting patterns from mobile phone records, summarized in a measure called “radius of gyration,” and linked them with data on cholera outbreaks.
“We could even predict the magnitude of the outbreak based on the amount of decrease in the radius of gyration,” Eagle notes. “I had built something that was performing in this unbelievable way.”
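In the mobility literature, the measure Eagle describes is usually defined as the radius of gyration: the root-mean-square distance between the places a person is observed and the center of those places. The Python sketch below illustrates that general definition only. It is not Eagle's model, and the function name, the flat kilometer grid and the tower coordinates are hypothetical values chosen for illustration.

```python
import math


def radius_of_gyration(points):
    """Root-mean-square distance of (x, y) points from their centroid."""
    if not points:
        raise ValueError("need at least one location")
    n = len(points)
    # Centroid of all observed locations
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Mean squared distance from the centroid
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq)


# Hypothetical cell tower positions (km on a flat local grid) where one
# phone was seen during a normal week and during a week of reduced movement.
normal_week = [(0.0, 0.0), (6.0, 0.0), (0.0, 8.0), (6.0, 8.0)]
restricted_week = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]

normal = radius_of_gyration(normal_week)
restricted = radius_of_gyration(restricted_week)
relative_drop = (normal - restricted) / normal

print(f"normal: {normal:.1f} km, restricted: {restricted:.1f} km")
print(f"relative decrease: {relative_drop:.0%}")
```

A falling value means people are moving within a smaller area than usual, which is the kind of decrease Eagle refers to in the quote above.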
Scientists also are using visualization to understand very large datasets.
Hanspeter Pfister, Wang professor of computer science at Harvard, for example, has worked to create a visualization for oncologists that connects a patient’s gene sequence with that patient’s type of cancer so that new treatments can be designed.
“The data themselves, unless they are actionable, aren’t relevant or interesting,” Eagle concludes. “What is interesting is what we can now do with them to make people’s lives better.”