Big data enables granular thinking: for Pfizer it is about analyzing the clicks in a sales representative's presentation to decipher customer interest and sales competency; for insurance companies it is about knowing your next travel destination and serving you dynamic insurance quotes to cover the 'now' in your life; and for retailers it is about serving product recommendations by knowing your preferences, perhaps even better than your family does!
Big Data toolsets help crunch voluminous data and identify trends that are transforming varied industries. A clear upfront business case for the benefits of Big Data analytics is the first step before embarking on a Big Data program; it prevents a futile research endeavor and helps define the right architecture. The technologies chosen for processing, analyzing and visualizing data will shape architectural and investment decisions. A few considerations:
- Type of queries: real-time vs. batch
- Understanding data source systems and the velocity of data arrival
- Determining whether data will be owned in perpetuity by the enterprise or monetized
- Determining techniques for statistical analysis, e.g. genetic algorithms, data mining, association rule learning, pattern recognition
- Type of data visualization techniques needed for analysis
- Setting up an enterprise-owned environment versus getting quick results from cloud-based big data service providers such as Google BigQuery or Amazon EMR
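Association rule learning, one of the techniques listed above, can be sketched with nothing but the standard library: count how often items co-occur across transactions, then keep the pairs whose support clears a threshold. This is a minimal illustration over made-up shopping baskets, not a production miner such as a full Apriori implementation:

```python
from itertools import combinations
from collections import Counter

def pair_rules(transactions, min_support=0.4):
    """Find item pairs whose co-occurrence meets min_support and
    report the confidence of the rule in each direction."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = set(basket)
        item_counts.update(items)
        pair_counts.update(combinations(sorted(items), 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n                       # share of baskets with both items
        if support >= min_support:
            rules.append((a, b, support, count / item_counts[a]))  # rule a -> b
            rules.append((b, a, support, count / item_counts[b]))  # rule b -> a
    return rules

baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]
for a, b, sup, conf in sorted(pair_rules(baskets)):
    print(f"{a} -> {b}: support={sup:.2f}, confidence={conf:.2f}")
```

A retailer reading "butter -> bread with confidence 1.0" out of such counts is exactly the kind of granular preference signal the paragraph above describes.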
Hadoop, an open source platform, first gained popularity among startups analyzing petabytes of unstructured data, where its flexibility suited the shifting challenges of an early product. Not having to dish out dollars for software licenses is an important consideration, especially for startups. A whole ecosystem is now mushrooming around Hadoop as it finds applicability in transforming industries such as healthcare, retail, upstream oil exploration and public services. Silicon Valley-based startups such as Cloudera and Hortonworks are creating support and service models for Hadoop and enriching it with performance monitoring and configuration management tools to ease enterprise adoption.
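Hadoop's core abstraction is MapReduce, and a word count over unstructured text is its canonical example. The sketch below shows the pattern; in a real cluster the mapper and reducer would run as separate scripts under Hadoop Streaming, with the framework handling the sort between them, so the shuffle is simulated locally here:

```python
from itertools import groupby

def mapper(lines):
    """Map step: emit a 'word\t1' record for every word seen."""
    for line in lines:
        for word in line.strip().split():
            yield f"{word.lower()}\t1"

def reducer(sorted_lines):
    """Reduce step: Hadoop delivers mapper output sorted by key,
    so identical words arrive consecutively and can be summed."""
    for word, group in groupby(sorted_lines, key=lambda rec: rec.split("\t")[0]):
        total = sum(int(rec.split("\t")[1]) for rec in group)
        yield f"{word}\t{total}"

# Local simulation of the shuffle: map, sort, reduce.
sample = ["Big data big insights", "data drives insights"]
mapped = sorted(mapper(sample))
for out in reducer(mapped):
    print(out)
```

The same two functions scale from this toy input to petabytes because the framework, not the code, handles partitioning the data across nodes.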
Enterprises are rallying around SAP HANA for its ability to drive in-memory computing in a preconfigured software environment orchestrated to run as an engineered data warehouse appliance. The operative word for HANA is 'quick': it analyzes data held in Random Access Memory (RAM) and delivers lightning-fast results on humongous volumes of structured data. HANA is an easy evolution for SAP's existing customers; SAP's last quarterly results saw tailwinds from HANA, and we hear the pipeline remains robust too.
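SAP's in-memory engine is proprietary, but the core idea, keeping the whole dataset in RAM so queries avoid disk I/O, can be illustrated with stdlib SQLite's `:memory:` mode. This is an analogy only, not SAP's columnar engine:

```python
import sqlite3

# The entire database lives in RAM: every scan and aggregation
# below runs without touching disk, which is the essence of the
# in-memory speed argument.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EMEA", 120.0), ("EMEA", 80.0), ("APAC", 200.0)],
)
for region, total in conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"):
    print(region, total)
```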
Data visualization is the next big bet: rendering data graphically using techniques such as clustergrams and tag clouds will be vital to making sense of processed data. This space still needs considerable work from data scientists and cultural anthropologists. Data analysis and visualization will require equipping the next generation with a set of skills significantly different from software engineering, skills for decoding the human mind and its preferences. Privacy advocates may well be outraged: there is no stopping corporate Big Brother now as enterprises get heady with the power of granular marketing!
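A tag cloud, for instance, is at heart just a mapping from word frequency to font size. A minimal stdlib sketch of that mapping, where the short-word filter and the size scale are arbitrary choices for illustration:

```python
from collections import Counter

def tag_weights(text, max_size=48, min_size=10):
    """Map word frequency to a font size, the core of a tag cloud."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    counts = Counter(w for w in words if len(w) > 3)  # crude stop-word filter
    if not counts:
        return {}
    top = counts.most_common(1)[0][1]
    # Scale linearly: the most frequent word gets max_size.
    return {w: min_size + (max_size - min_size) * c // top
            for w, c in counts.items()}

text = "Big data needs visualization. Visualization turns data into insight."
print(tag_weights(text))
```

A real rendering layer would lay these sized words out on a canvas; the weighting step above is where the analysis lives.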