The Basics of Business Analytics - Part 1
By Chris Hedge
Techopedia defines data analytics as the “qualitative and quantitative techniques and processes used to enhance productivity and business gain”. Typically, data is collected and used to identify and analyze patterns via means ranging from simple visualization to sophisticated analytical models. Unfortunately, “analytics” has become a rather generic term that vendors apply to their products and services to differentiate them from those of their competitors.
Business analytics can be broken down into four categories: descriptive, diagnostic, predictive and prescriptive. In this two-part blog series, we'll first focus on descriptive and diagnostic analytics.
Descriptive analytics is the examination of collected data, be it financial (e.g., cost or revenue), operational (e.g., miles of pipe, number of employees or failure rates), or any other quantitative (or qualitative) data, to understand “what happened” over the period in which the data was collected.
In finance, this entails using spreadsheets, databases and visualization tools to compare numbers, drill down into cost and revenue categories, or apply basic statistics. Often, we see this in company reports that state, “Revenue grew 3% compared to last quarter” or, “Costs were down 6% from the same quarter last year”, and even, “The average cost per unit increased/decreased”. We have all been in that meeting when the accounting person stands up and starts reciting the numbers “…Compared to last year, last quarter, plan, etc…”
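To make the arithmetic behind a statement like “Revenue grew 3% compared to last quarter” concrete, here is a minimal sketch in Python. The function name `pct_change` and the revenue figures are made up for illustration; they are not taken from any real report.

```python
def pct_change(current: float, previous: float) -> float:
    """Return the percentage change from `previous` to `current`."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous * 100.0


# Hypothetical quarterly revenue figures, for illustration only.
revenue = {"Q1": 200.0, "Q2": 206.0}

growth = pct_change(revenue["Q2"], revenue["Q1"])
print(f"Revenue grew {growth:.1f}% compared to last quarter")
```

The same function covers the year-over-year comparison: pass this quarter's cost as `current` and the same quarter last year as `previous`, and a negative result means costs were down.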
Source: River Logic Inc.
Don’t get me wrong: understanding what has happened and doing some basic comparisons is valuable.
Take, for instance, a railcar utilization project we did for one of our customers. Anyone who moves a lot of product by rail knows that tracking railcars is always a challenge. The first thing we did was implement railcar tracking (utilizing the railroads' passive tracking system) to follow the behavior of each railcar and determine whether it is in use or sitting idle at either the origin (our plant) or the destination (the customer’s site). While this system doesn’t pinpoint the exact location at all times, it gives the owner enough of an indication of where a car is at any given time and what it was doing at that time. Knowing these events and the details around them helps ascertain when a customer has received a railcar and how long they are holding it.
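As a rough sketch of how tracking events can be turned into a “how long is the customer holding the car” number, consider the Python below. The event names, car IDs and timestamps are all hypothetical; the actual feed from a railroad tracking system will look different, and this is not the implementation used in the project.

```python
from datetime import datetime

# Hypothetical tracking events: (car id, event name, timestamp).
events = [
    ("CAR001", "arrived_at_customer", "2024-03-01 08:00"),
    ("CAR001", "departed_customer", "2024-03-06 08:00"),
    ("CAR002", "arrived_at_customer", "2024-03-02 12:00"),
    ("CAR002", "departed_customer", "2024-03-04 00:00"),
]


def customer_hold_days(events):
    """Return the total days each car spent at the customer site."""
    arrivals = {}   # car id -> time of the arrival not yet matched
    hold_days = {}  # car id -> accumulated days held by the customer
    # Timestamps in this format sort chronologically as plain strings.
    for car_id, event, stamp in sorted(events, key=lambda e: e[2]):
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        if event == "arrived_at_customer":
            arrivals[car_id] = when
        elif event == "departed_customer" and car_id in arrivals:
            held = when - arrivals.pop(car_id)
            days = held.total_seconds() / 86400
            hold_days[car_id] = hold_days.get(car_id, 0.0) + days
    return hold_days


hold = customer_hold_days(events)
for car_id, days in sorted(hold.items()):
    print(f"{car_id}: held {days:.1f} days at the customer site")
```

This is still purely descriptive: it tells us what happened to each car, not why one customer holds cars longer than another.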
Diagnostic analytics takes business analytics to the next level and examines data to answer the question “why”: “Why did revenue grow 3%?” or “Why were costs down 6% from the same quarter last year?” How many times have you sat in a financial review meeting while some talking head stands up and drones on about “revenue increasing”, but never really gets into why revenue increased?
In my mind, diagnostic analytics is all about causality. Where descriptive analytics gives us the effect, diagnostic analytics helps us understand the cause. Establishing causality is probably the most difficult task in the field of analytics and a topic better left for a discussion of its own. A simple example would be establishing a link (correlation) between running a marketing campaign and an increase in sales. Say the historical data shows that every time you run a marketing campaign you see a 2% uptick in sales. Then, in the financial meeting, a better explanation would be that sales are up 2% compared to last quarter because we ran a marketing campaign.
I would throw in a cautionary side note regarding spurious correlations. A spurious correlation is when two variables appear to be closely correlated but are in reality not causally related. While attending Harvard Law School, Tyler Vigen illustrated the pitfalls in his book, "Spurious Correlations", and has a website dedicated to highlighting some of the funnier ones. Did you know that per capita cheese consumption has a 94.7% correlation with the number of people who died by becoming entangled in their bedsheets?
While this one is easy to spot, many are not. Researchers must be careful not to infer causality where none exists.
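To see how easily a high correlation can appear, here is a small Python sketch that computes the Pearson correlation coefficient by hand. The two series are invented for illustration (they are not the cheese or bedsheet figures); the only thing they share is that both happen to trend upward.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Made-up yearly values for two unrelated quantities that both trend upward.
series_a = [10, 12, 13, 15, 18, 20]
series_b = [101, 103, 108, 109, 115, 118]

r = pearson(series_a, series_b)
print(f"Correlation: {r:.3f}")
```

A coefficient close to 1 here says only that the two series moved together over this period. It says nothing about whether one drives the other, which is exactly the trap described above.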