Inside the engine that mines Twitter

Datasift evolved its RSS analytics offering into a social media big data analytics tool that can mine Twitter and Facebook and deliver intelligence to its customers.

In the aftermath of Superstorm Sandy, Splunk4Good was actively involved in the relief efforts, and tracked social media activity to get help to storm victims.

The company married social media with big data to figure out where relief was needed and what kind was required, but it couldn’t have done this without the underlying technology from a company named Datasift.

Datasift is a big data company that provides developers and third parties with social media data, including feeds from Twitter, Facebook and other sources.

The company has been around for a little over five years and was founded to do data curation, said Nick Halstead, Datasift founder and chief technical officer.

The company started out in the world of RSS feeds, which allowed users to keep track of frequently published works like blogs and news items.

“We analyzed streams so you could read them by topic, rather than by content source. That quickly changed when Twitter came along,” said Halstead. “We basically took all the technology that we developed, and we brought it into the world of Twitter.”

The company was able to sort out the top news that was being shared on Twitter through a service called TweetMeme, which grew to over 10 million users during its first month.

While working on TweetMeme, the company invented the original “retweet” button, which is used around 1.5 billion times a day.

That put Halstead and his crew on the social media world’s radar, and from that, Datasift started coming together.

“Although we were a consumer site at the time, we were making a lot of money by selling the analysis of the data that we were actually collecting, and that sort of pushed us towards what then became Datasift,” Halstead said.

Datasift collects just over 1.5 billion items a day from Twitter, Facebook, other social media websites, forums and blogs, then processes and analyzes that data.

Services that the company provides include “natural language processing, language detection, trend analysis, we do deep demographic analysis of public data to try to understand who people are and what they’re sharing,” Halstead said.

“What we’ve seen in the market in the last year is the expansion of where social data is being used,” he said. One of the opportunities that the company connected with this past year was its partnership with Splunk.

“We started to see a lot of other areas of businesses, and started to see that social was important to them within their business, so we started getting involved with a range of business intelligence companies where their customers were saying, ‘We see these analytics going on within the social. Can you bring that data into the platform of what we’ve already bought from you?’” Halstead said.

“That’s really how we got involved with Splunk,” he added.

Datasift has a data connector for Splunk Storm, Splunk’s cloud service, and for Splunk Enterprise, and uses it to push data into both of Splunk’s core platforms.

Going forward, Datasift sees the social technology market continuing to grow at a healthy pace and will keep supporting it, Halstead said.

“To us, the syndication of the data is the smallest part of the business,” he said. What really matters, what customers really care about, is “how you can get this data ready so that I can use it in my business.”