Data is the new oil, and it is up to the smart ones to mine it!

This just about sums up the relevance of data today. In this article, we discuss what data is, which brings us to the next relevant topic: how do we mine it?

Analytics holds the key to transforming, or mining, data into information. We discuss the broad domains where analytics may be applied and try to understand what it can achieve. Before jumping into advanced analytics, we discuss the importance of Exploratory Data Analysis, a topic that is often not given its due by geeky nerds eager to jump the gun and go straight to advanced algorithms.

If Analytics is the so-called 'Light Sabre', then Data Scientists are certainly the new Jedi!

So what is Analytics? Broadly, the term “Analytics” covers a wide umbrella of domains such as Prediction, Forecasting, Credit Risk & Fraud Detection, Market Segmentation, and Sentiment Analysis, to name a few. Analytics may be considered a layer of operations on data that produces meaningful insights and predictions. For example, given the complete employee data for a company, the HR team could use historical data to identify the profile of employees who are likely to resign, or to estimate the attrition rate for the next quarter.

This information could prove extremely valuable to senior management when planning, say, the upcoming recruitment cycle.

This brings us to the more fundamental question: What is data?

The answer is very simple! Anything and everything around us is a potential data source! The very fact that we are meeting students and faculty to discuss analytics in a B-School could be a data point for “number of Industry Expert Lectures hosted by this B-School this year”.

Analytics can be applied to data at various levels. At the very base lies “reporting”, which summarizes and presents information about “that which has happened”.

As we go up one level and apply forecasting and predictive machine learning to that which has happened, we can predict “that which can happen”, given similar occurrences in the past. As we keep raising the technology level, we approach the domain of Artificial Intelligence (AI), where we aim to create systems that can actually “think” and learn by themselves: enter the world of robots and automation!

Should we be scared? Does this mean that white-collar jobs will become redundant?

Actually, both yes and no! Some repetitive manual jobs at the lower end of the technology scale will become redundant. But at the very same time, newer opportunities in advanced analytics will open up.

Data Science is a multidisciplinary subject, so it requires skilled people from various domains such as Statistics, Engineering, and Computer Science, to name a few. This is the decade of data, and the world is becoming smarter and leaner in terms of technology. Both trends mean tremendous opportunity for aspiring Data Scientists, Engineers, and Analysts in today’s businesses.

As an example of how advanced analytics can help businesses make smarter decisions, we look at some transactional data from a typical retail store (see Table 1).
Table 1: Transactional Customer Data (Part File Shown)

Can we mine this data to understand and segment the customers such that we can target a select group of customers based on their profile?

The answer, as you guessed, is: yes, we can! The transactional data can be used to “create” more informative computed attributes (see Table 2), which tell us how frequently a customer visits the store, the interval between visits, the total amount purchased, and so on.
Table 2: Computed Information aggregated at a customer level (Part File shown)
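As a rough illustration of how such computed attributes might be derived, here is a minimal Python sketch. The sample rows, their layout, and the function name `customer_features` are assumptions made up for this example; they are not the retail data from Table 1.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, visit_date, amount).
# These rows are invented for illustration only.
transactions = [
    ("C1", date(2017, 1, 5), 120.0),
    ("C1", date(2017, 1, 15), 80.0),
    ("C1", date(2017, 2, 4), 200.0),
    ("C2", date(2017, 1, 10), 40.0),
    ("C2", date(2017, 3, 11), 60.0),
    ("C3", date(2017, 2, 1), 500.0),
]


def customer_features(rows):
    """Aggregate transaction rows into one record per customer."""
    by_customer = defaultdict(list)
    for customer_id, visit_date, amount in rows:
        by_customer[customer_id].append((visit_date, amount))

    features = {}
    for customer_id, visits in by_customer.items():
        # Distinct visit days, in chronological order
        days = sorted({visit_date for visit_date, _ in visits})
        # Gap in days between each pair of consecutive visits
        gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
        features[customer_id] = {
            "visits": len(days),
            # A customer with a single visit has no interval to average
            "avg_gap_days": sum(gaps) / len(gaps) if gaps else None,
            "total_amount": sum(amount for _, amount in visits),
        }
    return features


features = customer_features(transactions)
```

Each customer ends up with one record holding the visit count, the average interval between visits, and the total amount purchased, which is the shape of data a segmentation step would typically consume.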

This information, once aggregated at the customer level, can be fed to a typical “machine learning” algorithm for segmentation, resulting in a few categories of customers. The categorized data can then be summarized into attributes (see Table 3) on which the strategy team can base decisions about further sales promotion:
Table 3: Final Summarized Segmentation of Customers
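The article does not say which segmentation algorithm was used. Purely as an assumption, the sketch below uses k-means, a common choice for this kind of task. The helper names `min_max_scale` and `kmeans`, the deterministic starting centroids, and the customer numbers are all hypothetical.

```python
import math

# Hypothetical customer-level rows: (number_of_visits, total_amount)
customers = [
    (2, 100.0), (3, 150.0), (2, 120.0),      # occasional, low spend
    (10, 900.0), (12, 1000.0), (11, 950.0),  # regular, medium spend
    (30, 5000.0), (28, 4800.0),              # frequent, high spend
]


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single attribute dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        tuple((value - low) / (high - low) if high > low else 0.0
              for value, low, high in zip(row, lows, highs))
        for row in rows
    ]


def kmeans(points, k, iterations=20):
    """Plain k-means; returns one cluster label per point."""
    # Assumed deterministic start: k points spread evenly through the sorted data
    ordered = sorted(points)
    step = max(k - 1, 1)
    centroids = [ordered[i * (len(ordered) - 1) // step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        labels = [
            min(range(k), key=lambda c: math.dist(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


labels = kmeans(min_max_scale(customers), k=3)
```

In practice one would more likely reach for a library implementation and experiment with the number of clusters; the hand-written version here is only meant to show the assign-then-update loop behind the idea.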

From Table 3, we can easily see that the customers belonging to Category 3 account for 58% of the revenue but form only 10% of the customer base!

We have instantly identified our potential high-value customers.
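The kind of summary that surfaces such a group can be sketched in a few lines of Python. The rows below are invented for illustration and do not reproduce the figures in Table 3; `summarize` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical (customer_id, category, total_amount) rows after segmentation
segmented = [
    ("C01", 1, 40.0), ("C02", 1, 40.0), ("C03", 1, 40.0),
    ("C04", 1, 40.0), ("C05", 1, 40.0),
    ("C06", 2, 100.0), ("C07", 2, 100.0), ("C08", 2, 100.0),
    ("C09", 3, 250.0), ("C10", 3, 250.0),
]


def summarize(rows):
    """Return each category's share of customers and of revenue, in percent."""
    count = defaultdict(int)
    revenue = defaultdict(float)
    for _, category, amount in rows:
        count[category] += 1
        revenue[category] += amount

    total_customers = sum(count.values())
    total_revenue = sum(revenue.values())
    return {
        category: {
            "customer_share_pct": round(100 * count[category] / total_customers, 1),
            "revenue_share_pct": round(100 * revenue[category] / total_revenue, 1),
        }
        for category in sorted(count)
    }


summary = summarize(segmented)
```

In this made-up data, category 3 holds a fifth of the customers but half of the revenue, which is the same kind of imbalance the strategy team would look for.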

So, we started from a lengthy transactional file of almost a quarter million rows of data and ended up with just 4 lines that carry all the values we require. This shows how advanced analytics is simplifying strategic decision-making in businesses! So now would be a good time to grab that machine learning book and start off on the most “sexy” career path of this decade!

Next, we briefly look into how Artificial Intelligence (AI) technologies can enhance insights for business which can lead to higher productivity and revenue.

Recent successes in Deep Learning have accelerated progress in traditional AI tasks. Deep Learning is a technique that takes inspiration from how the human brain works, and models “learning” through neural nets.

In simple words, a deep neural network is an excellent function approximator or pattern recognizer. Hence any problem which can be seen as mapping an object X to object Y can be solved quite efficiently by this method.

For example, X can be a set of “features” such as the height, weight, and age of a person, and Y could be life expectancy. From a large number of examples, a neural-network-based architecture can “learn” complex patterns in the data and would be able to predict, with fair accuracy, the life expectancy of a person with a completely different (and unseen) set of physical features.

This could be very useful in the insurance industry to propose a highly personalized premium for their customers.
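To make the “function approximator” idea a little more concrete, here is a toy sketch in plain Python: a network with a single hidden layer (so not really “deep”) that learns the mapping from x to x squared from a handful of examples. The target curve, network size, learning rate, and the name `train_tiny_network` are all assumptions chosen for illustration; a real life-expectancy model would need real data and a proper deep learning library.

```python
import math
import random


def train_tiny_network(xs, ys, hidden=4, epochs=2000, lr=0.05, seed=0):
    """Fit a one-hidden-layer tanh network with stochastic gradient descent."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden                               # hidden biases
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                          # output bias

    def forward(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return sum(w2[j] * h[j] for j in range(hidden)) + b2, h

    def mse():
        return sum((forward(x)[0] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

    initial_loss = mse()
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            out, h = forward(x)
            err = out - y  # gradient of 0.5 * (out - y)^2 with respect to out
            for j in range(hidden):
                # Backpropagate through tanh: d tanh(z)/dz = 1 - tanh(z)^2
                grad_h = err * w2[j] * (1 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err

    def predict(x):
        return forward(x)[0]

    return predict, initial_loss, mse()


# Synthetic examples of the mapping X -> Y (here simply y = x^2)
xs = [i / 4 for i in range(-4, 5)]
ys = [x * x for x in xs]
predict, initial_loss, final_loss = train_tiny_network(xs, ys)
```

Comparing `initial_loss` with `final_loss` shows the network moving from random guesses towards the underlying pattern, and `predict` can then be called on inputs that were not in the training examples.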

The real power of AI for businesses, however, comes from its ability to do remarkably well at perception tasks.

These tasks can be broadly categorized as vision, speech and language tasks. The ability to meld AI with regular business analytics can be loosely termed as cognitive analytics.

If you look at the example dataset presented earlier, you can see that one of the columns is the customer's age. A natural question is: how can the personal attributes of customers be collected in the most unobtrusive manner?

While big retail outlets can afford to do a comprehensive data collection, how about the smaller shops? Do they have to lose out in the data revolution?

The answer is that AI can be used to predict the gender, age, and sentiment of a person with a fair degree of accuracy (refer to Figure 1). This data can be used to uncover hidden insights that directly impact the business's top line and bottom line.
Figure 1: Automated detection of people and their gender, age, and sentiment from an image

The ultimate dream of AI for retail is a system that provides a fully personalized interface to the customer: one that understands the customer's habits and preferences, and can adeptly negotiate and interact with the customer in their own language, leading to high retention and repeat engagement.

This is already possible to some degree through Natural Language Understanding (NLU) and its prevalent use in conversational agents, a.k.a. chatbots. Natural Language Processing (NLP) methods can also be used to automatically extract sentiment from reviews and feedback, which can help improve service delivery. Similarly, Automatic Speech Recognition (ASR) technologies can be used to understand spoken conversations and to trigger specific actions.

The time is ripe for businesses to invest in data analytics and AI to generate value for their organizations. The risk of being averse to this disruptive technology is losing competitive edge in the market.

This also presents a unique opportunity for the young workforce to start new ventures in this area, which can pay rich dividends. It is also a wake-up call for working and new professionals to upskill themselves so that they remain relevant in the job market.

Editor's Note:
We are building TowardsAI with an intent to educate and spread awareness about AI in business. Please share this article with your colleagues and friends in Retail and Insurance companies. Kindly support our mission by using the 'Comments' section below, to interact with Dr. Anish and Dr. Jacob. Your comments will help us publish more relevant content suited to your interests. Thank you for your support. :-)

Jacob and Anish

Dr. Anish Roy Chowdhury is a Data Scientist at HP Inc. working in Supply Chain Analytics. Dr. Jacob Minz is a Staff R&D Engineer at Synopsys, India. He is an avid Deep Learning & AI practitioner.