Turning Big Data into business intelligence to help organizations

By Rajeev Nayar, Associate Vice President and Head of Big Data and Architecture, Infosys

At this stage in the industry, Big Data has many interpretations. Many view it as the technology that enables the processing of large volumes of data, which in turn helps discover hitherto unknown patterns that can create significant outcomes.

Others view Big Data as the overarching umbrella under which all data relevant to an enterprise, both internal and external, is brought together to generate business insights. It is this larger definition that I believe captures the essence of what all of us are trying to achieve, because it puts business insights front and center in Big Data initiatives.

Broken into its constituents, this definition covers two things: the process of aggregating all data relevant to the enterprise, and the data disciplines needed to develop analytics and insights from that data. Most companies have done a lot of work on aggregation. The most common implementation is the data lake, where the enterprise's data is pulled together in one place. This, however, only solves the data availability part of the problem. I find that many of these companies are struggling to develop insights from this data, which is really the oil that drives the business.

There are many reasons why this happens:

1. The Loch Ness Monster problem: you have created a deep data lake but have no idea what is in it. How, then, do you find the Loch Ness Monster in your data lake?

2. Because of availability, timing, or regulatory issues, it may not be possible to bring all the required data physically into a single data lake.

3. Data quality is a problem

4. Inability to interpret the contents of the lake in a consistent manner (semantics)

5. Lack of governance to improve data quality and keep the lake clean

Solving this problem requires not just aggregating data but also getting it ready for analytics, and then the analytical process of developing insights from it.

At the enterprise level this can be done through:
1. A data platform that takes the various types of data in the enterprise (data at rest, data in motion), solves the problems mentioned above, and makes the data available for analytics.
2. An analytical workbench that can leverage the data in the platform to develop business insights.

We are used to building data platforms that follow the data supply chain approach, whose fundamental premise is that consumption patterns are known in advance. In the data platforms of the future we will have to drop this assumption, which means we need to store data in its true, raw form. The interpretation of the data, its semantics, moves to the edges, closer to the point of consumption, while governance is applied at the layer of truth: the raw data layer.
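As a rough sketch of keeping raw data untouched while moving semantics to the point of consumption, consider the following, where the event format, field names, and both consumer views are hypothetical: the same raw records are stored once, and each consumer applies its own interpretation at read time.

```python
import json
from datetime import datetime, timezone
from decimal import Decimal

# Raw layer: events stored exactly as they arrived (hypothetical records).
RAW_EVENTS = [
    '{"cust": "C-101", "amt": "249.99", "ts": "1706000000"}',
    '{"cust": "C-102", "amt": "19.50", "ts": "1706000360"}',
]

def finance_view(raw_lines):
    """Finance semantics: amounts as exact decimals, timestamps as UTC datetimes."""
    for line in raw_lines:
        e = json.loads(line)
        yield {
            "customer_id": e["cust"],
            "amount": Decimal(e["amt"]),
            "booked_at": datetime.fromtimestamp(int(e["ts"]), tz=timezone.utc),
        }

def marketing_view(raw_lines):
    """Marketing semantics: the same raw events, read as coarse spend buckets."""
    for line in raw_lines:
        e = json.loads(line)
        yield {
            "customer_id": e["cust"],
            "spend_bucket": "high" if float(e["amt"]) >= 100 else "low",
        }

print(list(finance_view(RAW_EVENTS)))
print(list(marketing_view(RAW_EVENTS)))
```

Neither view changes the raw layer; if finance and marketing later disagree on an interpretation, the layer of truth is still intact for both.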

In addition to this fundamental shift in approach, such a data platform needs to support the following capabilities:
1. Collect all data in its raw form
2. Clean the data for missing values and out-of-range values (a sketch of this step follows the list)
3. Manage the metadata for the data in the platform
4. Provide the semantics for the data at the end points, at the time of consumption
5. Provide the capability to access data in a distributed manner
6. Provide the security and governance needed to maintain the sanctity and quality of the data
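For item 2, here is a minimal sketch of what cleaning for missing values and ranges might look like, assuming pandas and a hypothetical transactions extract; the column names, validity ranges, and rules are illustrative only.

```python
import pandas as pd

# Hypothetical raw extract with typical quality problems.
raw = pd.DataFrame({
    "customer_id": ["C-101", "C-102", None, "C-104"],
    "age":         [34, -5, 41, 130],          # -5 and 130 are out of range
    "amount":      [249.99, 19.50, None, 75.0],
})

# Rule 1: rows without a customer key cannot be attributed, so drop them.
clean = raw.dropna(subset=["customer_id"])

# Rule 2: flag out-of-range ages rather than silently fixing them,
# so downstream consumers can decide how to interpret the record.
clean = clean.assign(age_valid=clean["age"].between(0, 120))

# Rule 3: impute missing amounts with an explicit, documented default.
clean = clean.assign(amount=clean["amount"].fillna(0.0))

print(clean)
```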

The physical realization of such a platform can take many forms; a very common pattern is the data lake. The key point to understand is that a platform like this only enables analytics, it does not guarantee that you will derive business value from your data. This is the unfortunate situation in which many companies find themselves today.

Developing business intelligence from this data requires an analytics platform, or workbench, that lets you work with the data iteratively: discover patterns in it, create insights, share those insights, and embed them into processes or applications, all through a self-service paradigm.
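A minimal sketch of that discover, create, embed loop, using a fabricated customer extract (the dataset, columns, and threshold are all hypothetical):

```python
import pandas as pd

# Discover: explore a hypothetical customer extract pulled from the platform.
customers = pd.DataFrame({
    "customer_id":   ["C-101", "C-102", "C-103", "C-104"],
    "support_calls": [0, 7, 1, 9],
    "churned":       [False, True, False, True],
})

# Create an insight: churn rate split by support-call volume.
insight = customers.groupby(customers["support_calls"] > 5)["churned"].mean()
print(insight)  # in this toy data, heavy callers churn far more often

# Embed: turn the insight into a reusable rule for a process or application.
def at_risk(support_calls: int, threshold: int = 5) -> bool:
    """Flag customers whose call volume matches the discovered churn pattern."""
    return support_calls > threshold

print(at_risk(7))  # True
```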

The data platforms and analytics workbenches are only enablers. Deriving intelligence from your data that drives significant business value goes back to business fundamentals:

1. Understanding the information that can improve a business process, e.g. customer intimacy requires interaction information across channels.
2. Understanding your business capabilities and the data they need.
3. Knowing which data assets are critical and which are supporting.

The largest business value is usually derived by understanding your critical data assets, e.g. customer, product, and their interplay with the rest of the data ecosystem.

Building your data platform and analytical workbench so that you can easily perform data experiments on your critical data assets allows you to get to business-changing insights very quickly. This is also the fundamental premise of enabling an organization for analytics.

Finally, the steps outlined here also provide a path for customers who have already built a data platform such as a data lake and are looking to derive value from the accumulated data. The first step in this case is to inventory the assets in your data lake to understand what is there; this is done through metadata and governance tools, e.g. Waterline. The next step is to identify the critical data assets and develop the semantic context that allows the data to be interpreted for a particular area, e.g. finance or marketing.
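Absent a dedicated catalog tool, a first-pass inventory can be approximated with a simple crawl; the lake path and file layout below are hypothetical, and a tool like Waterline goes far beyond this.

```python
import os
import pandas as pd

LAKE_ROOT = "/data/lake"  # hypothetical lake location

def inventory(root: str):
    """Walk the lake and record basic metadata for each data file."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            entry = {"path": path, "bytes": os.path.getsize(path)}
            if name.endswith(".csv"):
                # Peek at the header to capture column names without a full scan.
                entry["columns"] = list(pd.read_csv(path, nrows=0).columns)
            yield entry

for asset in inventory(LAKE_ROOT):
    print(asset)
```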

Then develop discovery exercises to understand the relationship of the critical data to the rest of the data ecosystem. Finally, define business experiments tied to improving business processes and start demonstrating business value from these improvements, whether that means reducing risk or predicting business outcomes. This will start the snowball rolling towards developing business intelligence from your Big Data that drives significant business value.
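As one illustration of such a business experiment, here is a minimal predictive sketch using scikit-learn and entirely fabricated data; a real experiment would start from a hypothesis, a baseline, and an agreed success metric.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Fabricated example: predict a business outcome (e.g. late payment) from two
# hypothetical features, such as invoice amount and days since last order.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200)) > 0

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# The experiment's question: can we predict the outcome better than chance?
print(f"holdout accuracy: {model.score(X_test, y_test):.2f}")
```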
