Wednesday, August 8, 2012

9 Questions to Help Uncover Your Big Data Requirements

The whole concept of Big Data projects can be overwhelming, though the promise is compelling. Whether you are analyzing social media data or digging through corporate data, it's not just about processing huge amounts of data. Just like any other new technology project, it is easy to get caught up in the vortex of the hype and lose the bigger picture of what’s involved. You don't want to find out after the swirling starts that you may be swimming in unwelcome, growing tech debt. If you understand the types of functions your solution will need to handle, you will be better equipped to select the most appropriate tools and to solve the problem in ways that incur the least tech debt.

Here is a quick methodology that will help you develop your own perspective on the Big Data opportunity at hand. Of course, you need to understand what you really want to accomplish with your Big Data project, but let's assume you already know the objectives. This exercise will help reveal how complex it will be to handle the data capture, manipulation, and analysis, and will put the Big Data part into the perspective of the overall project.

Print this out. Cut out the nine Big Data game cards. Now put them all on a flat surface, turn off your iPod, close the door, and consider each one carefully. Pick "blue" or "red" for each, whichever best describes the data you will be dealing with. Set aside any that you really want to answer "both" or "purple."

 The Big Data Game

As you handle and shuffle the cards, you will see some interdependencies across them, and perhaps you will start lining them up in the order of processing. If you are inclined to throw one out completely, set it aside to think about again.

Now, when you're done, if your answers are a loud and clear "Blue!" on every front, you have the most straightforward Big Data situation - one that is just about Big Data, without the noise of the realities that usually magnify the project dramatically. Does your table look like this, with all the blues marked?


Most likely not.  Hopefully you have identified lots of ancillary tasks that will be necessary and that make this look like a data integration project as much as a Big Data project.  You will have to deal with other issues like:
· Data security
· Data transformation
· Data federation
· Data cleansing
· Data capture
· Data migration
· Data updates
· Data latency

These are all known problems, with solutions of sorts. All of these requirements incur additional steps and are often solved by staging the data. More than likely, this exercise has you contemplating multiple stagings of the Big Data (3 times Big Data is Big Big Big Data). This is a huge driver for your company to adopt agile integration software (AIS), which is imperative for such projects. Complementing Hadoop, AIS handles federation, inline cleansing and analytics, transformation, and other processing without multiple steps along the way. Its transformation engine works directly across multiple sources, orchestrating and merging them in their native modes rather than requiring intermediate conversion to XML, as XSLT engines do. Secure write-back to sources offers more degrees of freedom in the way you can think about Big Data problems.
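To make the "Big Big Big Data" point concrete, here is a minimal Python sketch contrasting a staged pipeline, where every step keeps a full intermediate copy, with an inline one, where records flow through cleansing and transformation in a single pass. It is a generic illustration only, not how Enterprise Enabler or Hadoop is implemented. The record layout and the function names (`cleanse`, `transform`, `staged`, `inline`) are hypothetical and chosen just for this example.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, object]  # hypothetical record shape: {"id": ..., "name": ...}


def cleanse(records: Iterable[Record]) -> Iterator[Record]:
    """Drop records without an id and trim whitespace from names."""
    for r in records:
        if r.get("id") is None:
            continue
        yield {**r, "name": str(r["name"]).strip()}


def transform(records: Iterable[Record]) -> Iterator[Record]:
    """Reshape each record into the target format (upper-case names)."""
    for r in records:
        yield {"id": r["id"], "name": str(r["name"]).upper()}


def staged(source: Iterable[Record]) -> Tuple[List[Record], int]:
    """Each step materializes a full copy, like a separate staging area."""
    stage1 = list(source)              # captured copy
    stage2 = list(cleanse(stage1))     # cleansed copy
    stage3 = list(transform(stage2))   # transformed copy
    records_held = len(stage1) + len(stage2) + len(stage3)
    return stage3, records_held


def inline(source: Iterable[Record]) -> Iterator[Record]:
    """Steps are chained; one record at a time flows through, no staging."""
    return transform(cleanse(source))


if __name__ == "__main__":
    sample = [
        {"id": 1, "name": " ann "},
        {"id": None, "name": "orphan"},
        {"id": 2, "name": "bob"},
    ]
    result, held = staged(sample)
    print("staged result:", result, "records held in staging:", held)
    print("inline result:", list(inline(sample)))
```

Both paths produce the same output. The difference is that the staged version holds roughly one extra copy of the data per step, which is exactly what becomes painful at Big Data volumes.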

Enterprise Enabler® represents a new paradigm of integration, tremendously streamlining the creation of a Big Data processing environment by eliminating separate steps along the way. Enterprise Enabler is a leading-edge federation and virtualization technology that combines EAI, ETL, ESB, and data orchestration to keep up with a constantly changing Big Data environment.