You have been hearing about Big Data for a long time, and about how companies that use Big Data in their business decision-making achieve significantly higher profitability than their competitors.

Now that your company is ready to embark on its first Apache Hadoop® journey, there are important lessons to be learned. Read on and learn how to avoid the pitfalls and missteps so many companies fall into.

Pitfall 1: We Are Going To Start Small

It is natural for companies in general, and IT organizations in particular, to start their Big Data journeys small, managing risk while they determine the viability of the technology. However, we have learned that the more data you have, the higher the likelihood of finding new and exciting insights.

In case after case, the size of the initial cluster is a good predictor of the success of the first Hadoop project. In other words, businesses that start out with cluster sizes of ten nodes or less generally do not have sufficient data in their Hadoop environment to uncover significant insights.

Best Practice: Start out with a cluster of at least 30 nodes. Outline business objectives and then bring in as much data as your infrastructure can comfortably store and process to meet them.
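As a rough illustration of why data volume drives cluster size, here is a hypothetical back-of-the-envelope sizing sketch. It assumes HDFS's default replication factor of 3; the 25% overhead figure and the per-node usable storage are illustrative assumptions, not numbers from this article.

```python
import math

def estimate_node_count(raw_data_tb: float,
                        usable_tb_per_node: float,
                        replication_factor: int = 3,
                        overhead_fraction: float = 0.25) -> int:
    """Rough estimate of data nodes needed to store raw_data_tb of data."""
    # HDFS stores each block replication_factor times (default is 3).
    replicated_tb = raw_data_tb * replication_factor
    # Reserve illustrative headroom for temporary files, shuffle spill, and growth.
    total_tb = replicated_tb * (1 + overhead_fraction)
    return math.ceil(total_tb / usable_tb_per_node)

# Example: 100 TB of raw data on nodes with 12 TB usable each
# -> 100 * 3 * 1.25 / 12 = 31.25, rounded up to 32 nodes
print(estimate_node_count(100, 12))
```

Even this crude arithmetic shows how quickly replication and operational headroom push a meaningful data set past a handful of nodes.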

Pitfall 2: Build It And They Shall Come

Another common mistake that companies make is to build their Hadoop cluster without a clear objective connected to deriving real business value. It is true that a number of companies start out with the objective of reducing the operational cost of their existing data infrastructure by moving that data into Hadoop. However, the cost benefits of such projects are largely limited to IT organizations.

To make a positive impact on your company’s revenues, profitability, or competitive leverage through Big Data, you must partner with the business to come up with concrete use cases that will drive such results. These use cases must outline the key business metrics and identify the data sources and processing steps required to achieve the desired business results.

Best Practice: Start out with a use case built around achieving concrete business results. Even if you are building a prototype, keep an eye on rolling it out to production. Succeed or fail quickly, and communicate success to the broader organization.

Pitfall 3: We Need To Hire A Team Of People With Hadoop Background

Many companies at the start of their Hadoop journeys hire an architect simply to install and configure their Hadoop cluster. A Hadoop architect is an expensive resource whose expertise is better utilized down the road, when security architecture, governance procedures, and IT processes need to be operationalized.

Hadoop is a unique technology that cuts across infrastructure, applications, and business transformation. It is ideal to have a Hadoop-centric practice that is part of the broader analytics organization; however, finding personnel with a background in Hadoop infrastructure and its various components is a tall order. Hadoop requires a unique set of skills that few companies have in place at the onset of their journey.
