Is 'big data' the next big thing? Big Blue thinks so.

Mining vast amounts of unstructured data will produce innovative solutions

Can Twitter or other social media sites help prevent the outbreak of an epidemic or pandemic? It’s not possible now, but it probably will be within the next decade or two as the concept of big data takes hold, says Bernard Meyerson, vice president of innovation at IBM Corp.

“Big data is an opportunity of a lifetime for the scientific and technical business communities,” Meyerson said, discussing the 100-year history of the company widely known as Big Blue.

He said the concept of big data is the latest example of innovative thinking that has been the company’s hallmark since its founding in 1911, and now includes “the Watson,” an artificial intelligence computer capable of answering questions posed in natural language.

“The idea is to be able to extract knowledge from vast amounts of data from a multitude of sources,” he said. “But the problem historically has been getting that data together, particularly if it’s what I call ‘unstructured data.’”

Meyerson said an example of unstructured data would be a group of doctors taking notes on a patient’s condition but not sharing their diagnoses, prognoses, prescriptions and expectations for recovery.

Of mining vast amounts of unstructured data to arrive at discoveries that benefit society in healthcare, environmental protection and other areas, he said, “That’s an incredible breakthrough that today is almost impossible.”

Facebook, Twitter and other social media sites represent an untapped fertile field for data mining, Meyerson added.

“You could basically see a pandemic coming because you’ll see an uptick in the number of calls on social media for treatment of a particular symptom,” he said. “Things like that is breakthrough [technology] – and it’s just starting.”

Sensors all over the planet are already providing data that can be used to alert motorists to traffic congestion on major highways. But with big data mining, the sensors soon will be used to predict tie-ups before they occur and to proactively reroute traffic to prevent them.

“The ability to collect that data and more effectively manage both private and public transportation will be fantastic,” Meyerson said.

“All of these things revolve around the use of what I call ‘big data,’” he added. “And if you’re asking me what’s going to be the huge thing coming down the road in the next five [or] 10 years, it is the ability to collect it, analyze it and proactively use it for the betterment of society.”

“It’s things like that that really change the way we think of information technology,” Meyerson said.

Big data is beginning to take on the attributes of a tool rather than an exotic concept, he added. “We’re finally getting to think of it as something as fundamental as a shovel, a basic part of life. If you’re going to dig holes, you need a shovel.”

IBM, of Armonk, N.Y., ranks No. 17 on Washington Technology’s 2010 Top 100 list of the largest federal government contractors.

About the Author

David Hubler is the former print managing editor for GCN and senior editor for Washington Technology. He is a freelance writer living in Annandale, Va.

Reader Comments

Tue, May 17, 2011 Garcol

One would be hard-pressed to find a computer-based technology which has not been turned to service for social control -- Adm Poindexter's Total Information Awareness project (renamed but obviously not abandoned) still thrives: Case in point - 'big data'

Tue, May 17, 2011

To summarize this article, IBM has done nothing and built nothing related to big data. They don't plan to build anything related to big data. The writer throws in buzz words like Big Data, Twitter and Facebook and offers no useful information about anything. And somehow this is a newsworthy article. I got your next article: Facebook Twitters Cloud Computing. That's good writing.

Tue, May 17, 2011 john Ellingson VA

There is a huge misconception that data needs to be assembled to be analyzed. There are large commercial data operations today that consist of hundreds of billions of records that are widely dispersed and unindexed, yet are used for sophisticated data analysis and mining operations that are performed with subsecond response times. We deploy such a system.
