Tuesday, 12 August 2014

Why Bother With Data Analytics?

“I know my market, how it works, the seasonality; my clients and I can do a better job than any of your analytics nonsense!” If you hear someone say that today, especially in light of what analytics can really do for an organization’s growth, you know that someone is a bit of a living monument!

For though no one disputes a business expert’s instinct, data analytics, with its roots firmly in statistics, is free of the human bias and assumptions that can creep into seemingly smart decisions. But that is understating the advantages. The real value of analytics lies in constructing the big picture from seemingly meaningless little ones. Analytics synthesizes information from historical data, which lends credibility to a management’s decision-making process. Optimization is at the core of most of what analytics seeks to deliver; and what organization, in an age of information overload from the conventional customer cell, social media, POS, emails and the internet, would not want that? ‘None’ should be the obvious answer, yet the surprising fact is that there remain a host of sleeping and lumbering giants.

Though every organization worth its salt today understands that information is the holy grail, most continue to grapple with the problem of how to leverage it. Many scratch the surface by employing overly simplistic methods to glean, screen and interpret data in ad hoc bursts of well-intentioned yet misguided enthusiasm, all the while disregarding the fact that their data (and analytics practitioners) can do far, far better. Instead of being easily satisfied with the obvious and apparent, these organizations need to unlock the real secrets, those that will make the real difference.
Wouldn’t it be nice if there were a steady flow of meaningful and relevant information right into the lap(top)s of key decision makers, who could then use these insights to their advantage? Of course it would be!

The first step towards this is ensuring that the data being captured is useful, clean and workable (and taking corrective measures if it is not). This forms the foundation of all the data analysis a company wishes to do. So if companies can’t do this piece themselves consistently and properly, they shouldn’t wait to call in the analytics experts, who will then work the data in a turnkey fashion, right from preparing it the way it should be to tinkering with it in ways only they know. The information they finally provide will, as stated before, be free of bias and based purely on cold, hard facts. This information can improve operational efficiencies, identify patterns and the opportunities that follow from them, and help companies manage and plan resources, processes and risks in ways they don’t even realize they can.

But to reach this stage they need to first accept that there is only so much their own instinct or the ‘nose’ for business can do; that there is a reason why there are analytics experts today; that there is a reason why they are busy.

Monday, 21 July 2014

Data Cleaning: How Important Is It?

Data cleaning in itself is never the end goal. Instead, it is the key step towards one. Organizations spend a lot of time, money and resources on gathering data from a variety of sources. The primary reason for this effort, besides legal, compliance and customer service requirements, is to help make quality management decisions. If the database is, for whatever reason, not error free, the resulting analysis and interpretation are likely to be of poor quality. And when the recorded data is unable to support this crucial requirement, the huge investment of time, effort and manpower is, sadly, largely wasted.

So what really constitutes ‘dirty data’? While not necessarily incorrect, any data that suffers from formatting errors, or is outdated, partially captured, captured more than once, or no longer relevant, is labeled ‘dirty’. The reasons can generally be traced to how the data is entered (is the staff trained, is the system user friendly?) and how it is stored (are there regular audits, is it formally recorded and maintained?).
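The categories above can be turned into simple automated checks. The following is a minimal sketch in Python, not a definitive implementation: the records, the field names (`email`, `phone`, `updated`), the three-year cut-off for "outdated" and the `flag_dirty` helper are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical customer records; the field names are illustrative only.
records = [
    {"id": 1, "email": "asha@example.com", "phone": "9820012345", "updated": date(2014, 6, 1)},
    {"id": 2, "email": "ravi(at)example.com", "phone": "9820054321", "updated": date(2014, 5, 20)},
    {"id": 3, "email": "meera@example.com", "phone": "", "updated": date(2009, 1, 15)},
    {"id": 4, "email": "asha@example.com", "phone": "9820012345", "updated": date(2014, 6, 1)},
]

def flag_dirty(rows, today, max_age_days=3 * 365):
    """Return {record id: [problems]} for every record that looks dirty."""
    problems = {}
    seen = set()
    for row in rows:
        issues = []
        if "@" not in row["email"]:          # formatting error
            issues.append("formatting")
        if not row["phone"]:                 # partially captured
            issues.append("partial")
        if (today - row["updated"]).days > max_age_days:  # assumed cut-off
            issues.append("outdated")
        key = (row["email"].lower(), row["phone"])
        if key in seen:                      # captured more than once
            issues.append("duplicate")
        seen.add(key)
        if issues:
            problems[row["id"]] = issues
    return problems

report = flag_dirty(records, today=date(2014, 7, 21))
for record_id, issues in report.items():
    print(record_id, issues)
```

Real rules would of course depend on the business: what counts as outdated or irrelevant for one database may be perfectly acceptable in another.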

Most organizations maintain several databases that may or may not talk to each other. In order to yield insights, however, these need to be merged at some point. This becomes a huge challenge, as each source may represent the same data differently and may carry many duplicates as well as redundant information. The job of data cleaning includes ensuring consistency across these different sources, so that they can be merged easily when needed and analyzed as a whole.
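As a rough illustration of what consistency across sources means in practice, the sketch below brings two hypothetical sources into one representation before merging them. The source names (`crm`, `billing`), their fields and the helper functions are assumptions for the example, not a description of any particular system.

```python
# Two hypothetical sources that describe the same customers differently.
crm = [
    {"customer_id": "C001", "name": "Asha Rao ", "city": "MUMBAI"},
    {"customer_id": "C002", "name": "ravi kumar", "city": "Pune"},
]
billing = [
    {"cust": "c001", "amount_due": "1,250.00"},
    {"cust": "C002", "amount_due": "300"},
    {"cust": "C003", "amount_due": "75.50"},
]

def normalize_crm(row):
    """Trim whitespace and use one casing convention for every text field."""
    return {
        "customer_id": row["customer_id"].strip().upper(),
        "name": " ".join(row["name"].split()).title(),
        "city": row["city"].strip().title(),
    }

def normalize_billing(row):
    """Rename the key column and turn the amount into a number."""
    return {
        "customer_id": row["cust"].strip().upper(),
        "amount_due": float(row["amount_due"].replace(",", "")),
    }

def merge_sources(crm_rows, billing_rows):
    """Merge on the normalized customer id; report ids found in billing only."""
    merged = {r["customer_id"]: r for r in map(normalize_crm, crm_rows)}
    unmatched = []
    for bill in map(normalize_billing, billing_rows):
        if bill["customer_id"] in merged:
            merged[bill["customer_id"]]["amount_due"] = bill["amount_due"]
        else:
            unmatched.append(bill["customer_id"])
    return merged, unmatched

merged, unmatched = merge_sources(crm, billing)
print(merged["C001"])
print(unmatched)
```

Without the normalization step, "c001" and "C001" would be treated as two different customers and the merge would silently lose the first billing row.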

An interesting but lesser known (or appreciated) fact about analyzing data is that a major chunk of the overall allotted time goes simply into cleaning the data and beating it into a shape and state that is conducive to interpretation. It is possible, for example, that the data available in a company is factually correct but difficult to process through the analytics systems. Data cleaning then has to ensure that such data is captured in a form that is easy to use for analytics.

Deduplication, column segmentation and matching of records are the basic methods of data cleaning. However, the methodology is likely to vary with the type of data. At a basic process level, we begin by inspecting data samples to get a handle on the kinds of errors that need attention. This leads to creating detailed workflows for improving data quality, laying down well-thought-out processes for error correction, listing rules for future data capture, and so on. Once this is done, the corrected data is retested for missed errors and for its usability for analytics.
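The three basic methods named above can be sketched in a few lines of Python. This is only an illustration under assumptions: the sample address lines, the comma-separated layout, the helper names and the 0.85 similarity threshold are all hypothetical, and `difflib.SequenceMatcher` from the standard library stands in for whatever matching technique a real project would choose.

```python
from difflib import SequenceMatcher

# Hypothetical raw input: one free-text field per customer.
raw = [
    "Asha Rao, 12 Hill Road, Mumbai",
    "asha rao , 12 Hill Road, Mumbai",
    "Ravi Kumar, 4 Lake View, Pune",
]

def segment(line):
    """Column segmentation: split one free-text field into named columns."""
    name, street, city = [part.strip() for part in line.split(",")]
    return {"name": name, "street": street, "city": city}

def dedup_key(row):
    """Case- and whitespace-insensitive key used to spot exact duplicates."""
    return tuple(" ".join(row[col].lower().split()) for col in ("name", "street", "city"))

def deduplicate(rows):
    """Deduplication: keep the first record seen for each key."""
    seen, unique = set(), []
    for row in rows:
        key = dedup_key(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def is_match(a, b, threshold=0.85):
    """Record matching: treat two strings as the same if they are similar enough."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

rows = deduplicate([segment(line) for line in raw])
print(rows)
print(is_match("Ravi Kumar", "Ravi Kumaar"))
```

The threshold is a judgment call: set it too low and distinct customers get merged, too high and obvious misspellings slip through, which is one reason the corrected data needs to be retested.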

Laborious and at times time-consuming though it may seem, data cleaning is the most crucial step towards in-depth insights that give management the relevant information to improve the quality and speed of its decision making.