BMW Group (BMW) is a German luxury vehicle, motorcycle, and engine manufacturing company founded in 1916. It is one of the best-selling luxury automakers in the world and is leveraging deep learning with HDP to streamline manufacturing and cut costs.

Three weeks ago, at the DataWorks Summit in Munich, we announced the Data Hero winners for the EMEA region. The winner in the Data Architect category was Tobias Bürger, Lead Big Data Platform & Architecture at BMW Group. You can read the announcement here.

BMW manages structured, sensor, and server log data. From that data, BMW produces batch, interactive SQL, streaming, and AI/Deep Learning analysis. Hortonworks Data Platform (HDP®) is one of the enabling technologies for BMW.

The team at BMW, under Chief Architect Tobias, has implemented over 100 HDP use cases, including generating autonomous driving insights from sensor data, saving costs in research and development, streamlining the manufacturing process, and improving after-sales customer care. Once HDP was brought into BMW, it spread quickly and far beyond the original central users.

Additional BMW HDP use cases have brought architectural improvements around its technology stack, bringing in analytical capabilities never before possible.



You have heard about Big Data for a long time, and how companies that use Big Data as part of their business decision-making process experience significantly higher profitability than their competition.

Now that your company is ready to embark on its first Apache Hadoop® journey, there are important lessons to be learned. Read on and learn how to avoid the pitfalls and missteps so many companies fall into.

Pitfall 1: We Are Going To Start Small

It is natural for companies in general, and IT organizations in particular, to start their Big Data journeys under conditions where they can manage the risk by determining the viability of the technology. However, we have learned that the more data you have, the higher the likelihood of finding new and exciting insights.

In case after case, the size of the initial cluster is a good predictor of the success of the first Hadoop project. In other words, businesses that start out with clusters of ten nodes or fewer generally do not have sufficient data in their Hadoop environment to uncover significant insights.

Best Practice: Start out with a cluster of at least 30 nodes. Outline business objectives and then bring in as much data as your infrastructure can comfortably store and process to meet them.

Pitfall 2: Build It And They Shall Come

Another common mistake that companies make is to build their Hadoop cluster without having a clear objective that is connected to deriving real business value. It is true that a number of companies start out with the objective to reduce the operational cost of their existing data infrastructure by moving that data into Hadoop. However, the cost benefits of such projects are largely limited to IT organizations.

To make a positive impact on your company’s revenues, profitability, or competitive leverage through Big Data, you must partner with the business to come up with concrete use cases that will drive such results. These use cases must outline the key business metrics and identify the data sources and processing steps required to achieve the desired business results.

Best Practice: Start out with a use case built around achieving concrete business results. Even if you are building a prototype, keep an eye on rolling it out to production. Succeed or fail quickly and communicate success to the broader organization.

Pitfall 3: We Need To Hire A Team Of People With Hadoop Background

Many companies at the start of their Hadoop journeys hire an architect simply to install and configure their Hadoop cluster. A Hadoop architect is an expensive resource whose expertise is better utilized down the road, when security architecture, governance procedures, and IT processes need to be operationalized.

Hadoop is a unique technology that cuts across infrastructure, applications, and business transformation. It is ideal to have a Hadoop-centric practice that is part of the broader analytics organization; however, finding personnel with a background in Hadoop infrastructure and its various components is a tall order. Hadoop requires a unique set of skills that few companies have in place at the outset of their journey.


With the San Jose DataWorks Summit (June 13-15) just two months away, we’re busy finalizing the lineup of an impressive array of speakers and business use cases. This year our Enterprise Adoption Track will include Nick Evans and Kevin Brown from ExxonMobil with Wade Salazar from Hortonworks.

Big Data is driving major advances in the oil and gas industry, resulting in increased productivity and cost savings throughout the extraction and production cycles. Advances in instrumentation, process automation, and collaboration are generating data from myriad new sources, including sensors, geolocation, weather, and seismic data. Combined with human-generated data, such as market feeds, social media, email, text, and images, these sources yield a wealth of new analytical insights that are transforming the industry as a whole.

Join Nick, Kevin, and Wade as they present:

The Evolution of Streaming and Data Lake Shared Services at ExxonMobil: Lessons from a Fortune 10 Adoption

Abstract: Analytics applications grow more powerful as they leverage new types of data from sensors, machines, server logs, clickstreams, and social media. The Hadoop-based Data Lake enables that analytic potential, but the shared service supporting it must scale efficiently and enable deep insight across a large, broad, diverse data set to a variety of consumers. Come learn how ExxonMobil created its first Big Data shared service across an enormous enterprise – from data ingestion at the edge using Hortonworks DataFlow to long-term storage in Hortonworks Data Platform, culminating in data exploration and analysis with business intelligence tools.

About the Speakers

Nick Evans, ExxonMobil
Nick Evans is the Big Data Service manager for Data & Analytics at ExxonMobil with a team of developers and engineers focused on embedding world class analytics to solve big data opportunities across the corporation.  Mr. Evans has 15 years of experience at ExxonMobil in a variety of roles in information technology.  He is very excited to work in Data & Analytics and Big Data due to its broad application and impact across all business lines.  Mr. Evans holds a BBA in Management Information Systems from Texas Tech University. 

Kevin Brown, ExxonMobil
Kevin Brown is the Big Data Service Platform Engineer for Data & Analytics at ExxonMobil with a team of data architects and engineers focused on embedding world class analytics to solve big data opportunities across the corporation.  Kevin’s previous experience in software development and Linux administration played a critical role in helping pioneer a Big Data platform at ExxonMobil.  Kevin holds an Information Technology degree from Brigham Young University.

Wade Salazar, Hortonworks
Wade Salazar serves Hortonworks as a Solutions Engineer in Houston, TX. Educated as an electrical engineer, fluent in many programming languages, and having worked in the control systems trade for over ten years before joining Hortonworks, Wade enables those looking to apply big data tools to industrial processes and equipment. Outside of work, Wade is passionate about technology, the outdoors, cooking, dogs, horses, and Texas lore.