Stop waiting – Start analyzing big data yourself

Over the past several months, the industry has been abuzz with the importance of big data analytics and how it will shape market research over the next several years. The big question on the minds of most market research executives is where to turn for help with this aspect of their business: which company is going to provide the technology answers that market research so sorely needs right now?  A few companies aim to help with just that.

A couple of start-ups are competing neck-and-neck in the business intelligence sphere of big data analytics at the moment, both with several big-name investors and partners who are very interested in how they will continue to develop.  Datameer, according to its own marketing, does on its own what previously would have taken three separate vendors, processes, and teams to do.  It combines data integration (bringing together disjointed and disconnected data sets), dynamic data management (creating relationships among that data), and data analysis in one single package.  The result is an easy-to-use tool that allows a data analyst to fairly quickly integrate and visualize data from multiple sources and find correlations between datasets that would otherwise be very difficult and time-consuming to uncover.

In one of their easiest hands-on examples, they show you how to upload an example file containing some data about several individuals (name, age, location, etc.), how to easily strip away the data you don’t need, and then how to create a bar graph showing the ages of the individuals in Chicago by name.  Now, this example in and of itself is not impressive.  Anybody with intermediate Excel skills could accomplish the same result.  What’s impressive is where you can go with this, and how each of the simple features along the way can be modified to make the product live up to its name of “big data” analytics.  The results that you pull from the previous example need not be static: if you had a database with a constantly updating set of metrics, like an online database of users, you could connect it to Datameer to keep polling for new names, locations, and ages, and update the graphs accordingly, so you always know how your data is changing.

For a panel company, this could be a truly fantastic tool for keeping your panel numbers up to date, so you know exactly what kind of demographic split you have: where your members are located, how old they are, which industry they work in.  Go one step further, and you have dynamic graphs illustrating the precise breakdown of the ages of all your panel members who work in the mining industry, for example, all without having to re-run your queries.  As soon as a new member signs up and fills out the registration, your graphs update automatically with the results, and all you do is send your clients a link to the breakdowns, so they can see in real time how your panel looks and drill down into specific metrics if you allow them to do so.  No more updating and sending PDFs with millions of different numbers on them.
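
To make the hands-on example a little more concrete, here is a rough sketch of the same steps in plain Python: load a small table, strip away the columns and rows you don’t need, and chart ages by name for one city.  This is not Datameer’s interface or API; the sample data, the column names, and the function names are all made up for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for the uploaded example file.
SAMPLE_CSV = """name,age,location,industry
Ann,34,Chicago,Mining
Bob,51,Boston,Retail
Cara,27,Chicago,Finance
Dev,45,Chicago,Mining
"""


def ages_by_name(csv_text, city):
    """Keep only name and age for the rows located in the given city."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["name"]: int(row["age"]) for row in reader if row["location"] == city}


def text_bar_chart(values):
    """Render one line per name, with one '#' per year of age."""
    width = max((len(name) for name in values), default=0)
    return "\n".join(
        f"{name.ljust(width)} {'#' * age} {age}" for name, age in values.items()
    )


chicago = ages_by_name(SAMPLE_CSV, "Chicago")
print(text_bar_chart(chicago))
```

Re-running the last two lines against fresh data is the manual version of the "dynamic" graph: the appeal of a connected tool is that this refresh happens for you whenever the underlying source changes.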

Additionally, there are features that allow you to combine the data sources within your control with social data.  Using the Twitter integration feature, you’re able to pull all the tweets for a particular user and use that data together with your other sources.  As in Datameer’s example, you can take filmmaker Michael Moore’s Twitter handle, pull everything he has ever posted, and check it against the FBI “Monitored Words” list to see how frequently he mentions hot-button words.  You can even graph that over time, to see which words are suddenly “on the rise”, or which topics were more or less popular during a particular period.

Think about this on a bigger scale: being able to deliver “small data” survey results to your clients along with “big data” general trends to form an “overall picture”.  Your yearly tracker survey shows that 5% of the population is no longer drinking sugary carbonated drinks.  You want to know why.  Well, your survey respondents said “I don’t like the taste anymore”, “it’s too unhealthy”, and “I prefer coconut water now”.  You have a lot of little responses, and you can put them into a nice pie chart saying that 31% of respondents said it’s too unhealthy, 24% said they prefer other drinks, and so on.  Does that really give your client actionable data?
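
For the curious, the watch-list comparison from the Twitter example boils down to counting how often each listed word appears in a user’s posts, grouped by time period.  The sketch below shows that idea in plain Python.  The posts, the dates, and the two watch words are invented for illustration, and this is not how Datameer implements it.

```python
from collections import Counter, defaultdict
import re

# Hypothetical watch list and posts; a real run would load both from elsewhere.
WATCH_WORDS = {"flood", "virus"}
POSTS = [
    ("2012-01-05", "Flood warnings again. Another flood on the way!"),
    ("2012-01-20", "Nothing to see here."),
    ("2012-02-02", "New virus report out today."),
    ("2012-02-14", "Virus coverage and flood coverage, all day long."),
]


def watch_word_counts_by_month(posts, watch_words):
    """Count watch-list words per month, using the YYYY-MM prefix of each date."""
    counts = defaultdict(Counter)
    for date, text in posts:
        month = date[:7]
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in watch_words:
                counts[month][word] += 1
    return counts


monthly = watch_word_counts_by_month(POSTS, WATCH_WORDS)
for month in sorted(monthly):
    print(month, dict(monthly[month]))
```

Plotting those monthly counts as a line per word is what tells you which terms are “on the rise”.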

What if, when you looked at your “big data” analysis from the Twitter universe, you saw a timeline of the tweets that were hashtagged “#healthyliving”, and saw that, compared to last year, most of the comments this year regarding carbonated sugary drinks were more negative than positive?  And what if you could pinpoint the top 10 “influencers” (people with the largest number of followers) who were making these comments?  Wouldn’t you then have a more constructive game plan to give your client, with some exact goals and suggestions on who should be engaged to potentially reverse these damaging trends?
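
Here is a small sketch of what those two questions look like as calculations: the share of negative tweets under a hashtag per year, and the most-followed accounts making the negative comments.  It assumes each tweet already carries a sentiment label, which in practice is the hard part and would come from a separate tool.  The tweets, field names, and follower counts below are all hypothetical.

```python
from collections import Counter

# Hypothetical, pre-labelled tweets; the sentiment field is assumed to exist already.
TWEETS = [
    {"year": 2012, "author": "alice", "followers": 5000, "tags": {"healthyliving"}, "sentiment": "positive"},
    {"year": 2012, "author": "bob", "followers": 120, "tags": {"healthyliving"}, "sentiment": "positive"},
    {"year": 2012, "author": "cara", "followers": 900, "tags": {"healthyliving"}, "sentiment": "negative"},
    {"year": 2012, "author": "dev", "followers": 40, "tags": {"healthyliving"}, "sentiment": "positive"},
    {"year": 2013, "author": "alice", "followers": 5200, "tags": {"healthyliving"}, "sentiment": "negative"},
    {"year": 2013, "author": "bob", "followers": 150, "tags": {"healthyliving"}, "sentiment": "negative"},
    {"year": 2013, "author": "erin", "followers": 30000, "tags": {"healthyliving"}, "sentiment": "negative"},
    {"year": 2013, "author": "dev", "followers": 45, "tags": {"healthyliving"}, "sentiment": "positive"},
    {"year": 2013, "author": "finn", "followers": 800, "tags": {"music"}, "sentiment": "negative"},
]


def negative_share_by_year(tweets, tag):
    """Fraction of tweets carrying the tag that are labelled negative, per year."""
    totals, negatives = Counter(), Counter()
    for tweet in tweets:
        if tag in tweet["tags"]:
            totals[tweet["year"]] += 1
            if tweet["sentiment"] == "negative":
                negatives[tweet["year"]] += 1
    return {year: negatives[year] / totals[year] for year in totals}


def top_influencers(tweets, tag, n=10):
    """Authors of negative tagged tweets, ranked by their highest follower count."""
    followers = {}
    for tweet in tweets:
        if tag in tweet["tags"] and tweet["sentiment"] == "negative":
            author = tweet["author"]
            followers[author] = max(followers.get(author, 0), tweet["followers"])
    return sorted(followers, key=followers.get, reverse=True)[:n]


shares = negative_share_by_year(TWEETS, "healthyliving")
leaders = top_influencers(TWEETS, "healthyliving", n=2)
print(shares)
print(leaders)
```

With real data, the year-over-year shift in the first number is the trend you report, and the second list is your shortlist of people for the client to engage.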

There is a whole realm of possibility here, and this is just scratching the surface from an initial analysis of the platform.  The barrier to entry is not that high: a personal account starts at a very affordable $299/year, limiting the buyer mainly in the number of users allowed (just 1) and the volume of data used in the system (100 GB/year).  Datameer uses the spreadsheet model of data presentation, so anybody familiar with Excel should feel quite at home.

Just about all of these big data analytics solutions are built on the open-source framework Hadoop, which is a white-hot topic right now in many industries.  Hadoop provides an easily scalable storage and work distribution system that runs on commodity hardware and is very well suited to working with large, distributed datasets that do not fit neatly into the traditional relational database model.  This makes it especially suited to analyzing things like social media traffic, where data is “all over the place” and may not be logically joined by any keys or identifiers.  Hadoop is used as a backbone for large-scale data processing by many internet giants, like Twitter, Facebook, and Yahoo, and its design grew out of papers published by Google.