Whenever I have a casual chat or a formal conversation with a BI or analytics person, the question I hear most often is ‘what is the point of Hadoop?’.
It is a more fundamental question than ‘what analytic workloads is Hadoop used for?’ and gets to the heart of why businesses are deploying, or considering deploying, Apache Hadoop. There are three core roles:
- Big data storage: Hadoop as a system for storing large, unstructured data sets
- Big data integration: Hadoop as a data ingestion/ETL layer (see the sketch after this list)
- Big data analytics: Hadoop as a platform for new exploratory analytic applications
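To make the first two roles concrete, here is a minimal sketch of a map-only Hadoop MapReduce job acting as an ingestion/ETL step: it reads raw log lines from HDFS, drops malformed records, and writes out clean tab-separated rows. The class name, input layout, and field positions are illustrative assumptions, not from any particular deployment.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Hypothetical map-only ETL job: reads raw, unstructured log lines from
 * HDFS, keeps only well-formed records, and writes them back out as
 * tab-separated rows ready to load into a downstream warehouse.
 */
public class LogCleanJob {

    public static class CleanMapper
            extends Mapper<LongWritable, Text, Text, NullWritable> {

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Assumed input layout: "timestamp userId action ..."
            String[] fields = value.toString().trim().split("\\s+");
            if (fields.length < 3) {
                return; // drop malformed lines instead of failing the job
            }
            String row = fields[0] + "\t" + fields[1] + "\t" + fields[2];
            context.write(new Text(row), NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "log-clean");
        job.setJarByClass(LogCleanJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only: pure filter/reshape, no aggregation
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // raw data in HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // cleaned output
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The storage role falls out almost for free here: the raw data simply sits in HDFS, schema-free, until a job like this reshapes it.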
While much of the attention around Apache Hadoop use cases focuses on the innovative analytic applications it has enabled and on high-profile adoption at Web properties, initial adoption at traditional enterprises and later adopters is more likely triggered by the first two roles. Indeed, there are good examples of these three roles forming an adoption continuum.
We also see these multiple roles playing out at the vendor level, in strategies for Hadoop-related products. Oracle’s Big Data Appliance, for example, is focused very specifically on Apache Hadoop as a pre-processing layer for data to be analyzed in Oracle Database.
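As a rough sketch of that hand-off, the snippet below streams the cleaned rows produced by a job like the one above out of HDFS and batch-inserts them into an Oracle table over plain JDBC. The connection string, credentials, and table/column names are hypothetical placeholders; the appliance itself ships its own loading tooling rather than hand-written JDBC like this.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Hypothetical hand-off step: read cleaned TSV rows from one HDFS output
 * file (args[0], e.g. a part-m-00000 file) and batch-insert them into an
 * Oracle table for downstream analysis. Requires the Oracle JDBC driver
 * on the classpath; all names below are illustrative.
 */
public class OracleLoader {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//dbhost:1521/ORCL", "etl_user", "secret");
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO clean_events (ts, user_id, action) VALUES (?, ?, ?)");
             BufferedReader in = new BufferedReader(new InputStreamReader(
                 fs.open(new Path(args[0])), StandardCharsets.UTF_8))) {
            String line;
            int pending = 0;
            while ((line = in.readLine()) != null) {
                String[] f = line.split("\t");
                ps.setString(1, f[0]);
                ps.setString(2, f[1]);
                ps.setString(3, f[2]);
                ps.addBatch();
                // flush in batches to keep round-trips to the database low
                if (++pending == 1000) { ps.executeBatch(); pending = 0; }
            }
            if (pending > 0) ps.executeBatch();
        }
    }
}
```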
Oracle focuses on Hadoop’s ETL role, and it is no surprise that the other major incumbent vendors showing interest in Hadoop fall into three main groups:
- Storage vendors
- Existing database/integration vendors
- Business intelligence/analytics vendors
This is just one small example of how the major data players are gradually adopting this new technology, harnessing its capabilities to hold on to their positions among the market leaders.