Step 1 – Problem Definition and Expectation Setting
The first and foremost step is planning, which includes defining the problem, obtaining buy-in from key stakeholders, and securing executive sponsorship.
A quote often attributed to Albert Einstein makes the point: “If I had only one hour to save the world, I would spend fifty-five minutes defining the problem, and only five minutes finding the solution.”
A good start is to identify the problem and approach it from different perspectives. This uncovers aspects that might otherwise be overlooked and leave us with only a partial understanding of the situation at hand. It is therefore imperative that we break down the problem and define it in granular terms.
Let us now explore why this seemingly intuitive step has proven to be critical in our deployments and in delivering results.
For one leading financial services group, the problem was loosely stated as detecting data quality issues in its financial transaction data. Here is how we approached it. First, we tightened the scope and involved subject-matter experts (SMEs), with the objective of describing the business problem in data terms. Second, we identified the areas of data where quality is vital and where issues could have the largest financial or regulatory impact. Finally, we identified the critical attributes within those areas.
The crux of problem identification was also to understand the root causes of poor data quality, which typically include data entry errors, duplicate records, incorrect mappings, ETL jobs that run incorrectly, and ineffective migration from legacy systems.
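To make "describing the business problem in data terms" a little more concrete, here is a minimal sketch of how a loosely stated problem could be turned into measurable checks on critical attributes. It is an illustration only, not the approach used in the engagement described above: the attribute names (`transaction_id`, `amount`, `currency`), the sample records, and the `profile_quality` helper are all hypothetical, and only two of the root causes mentioned (missing values from data entry and duplicate records) are covered.

```python
from collections import Counter

# Hypothetical critical attributes; in practice these would be agreed with SMEs.
CRITICAL_ATTRIBUTES = ["transaction_id", "amount", "currency"]


def is_missing(value):
    """Treat None and empty strings as missing values."""
    return value is None or value == ""


def profile_quality(records, critical_attributes=CRITICAL_ATTRIBUTES):
    """Return simple quality indicators for a list of dict records."""
    # Count missing values for each critical attribute.
    missing = {
        attr: sum(1 for r in records if is_missing(r.get(attr)))
        for attr in critical_attributes
    }
    # Find transaction IDs that appear more than once (possible duplicates).
    id_counts = Counter(
        r["transaction_id"]
        for r in records
        if not is_missing(r.get("transaction_id"))
    )
    duplicate_ids = sorted(k for k, n in id_counts.items() if n > 1)
    return {
        "total": len(records),
        "missing": missing,
        "duplicate_ids": duplicate_ids,
    }


# Made-up sample data, purely for illustration.
sample = [
    {"transaction_id": "T1", "amount": 100.0, "currency": "USD"},
    {"transaction_id": "T2", "amount": None, "currency": "USD"},
    {"transaction_id": "T2", "amount": 55.5, "currency": ""},
]

report = profile_quality(sample)
print(report)
```

Even a rough profile like this gives stakeholders something specific to react to, such as which attributes are most often missing and which identifiers repeat, before any detection model is discussed.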
Given today’s business context, a single business problem can manifest differently across organizations. Hence, it is key to clearly understand ‘the Why?’ of the problem and to contextualize it in order to avoid gaps in outcomes. It is equally important to note that stakeholder buy-in, along with stakeholders’ willingness to accept new technologies for solving age-old problems, is essential for the program to succeed and deliver meaningful results.
Once the problem has been contextualized, the subsequent steps of data handling and tweaking detection models follow a well-thought-out path, which we will explore in the next post.