October | 2015
We started working on our anomaly detection suite of solutions two years ago, and the platform has now been live at Wipro and at our customers for well over a year. Let us look back at how the product has evolved since we began. The ride started with Wipro datasets across process domains, where we set out to uncover anomalies and fraud in areas such as procure to pay, travel and expense claims, payroll processing, intellectual property theft and credentials misuse, to name a few. The results were encouraging: they led to significant effort savings and plugged leakages that had previously gone unnoticed. Below are some lessons learnt along the way.
Challenges in data gathering
The first and biggest challenge in developing a robust Machine Learning offering is the availability of real data to experiment with. As we expected, even with enterprise data available within the organization, we still had to clear jurisdictional regulations, identify the right datasets, and get logical access to them enabled. Along the way we learnt to deal with messy data, to handle data with quality issues and to work with proxy data.
Moving from data to insight
With the business problem defined and the datasets identified, the next step was to get into the nuances of the data and build the business logic and detection models that find the anomalies. Data handling consumes substantial effort, from building mechanisms to ingest the data to cleansing, transforming and tokenizing sensitive information. The core of the engine is the detection layer, which combines business rules with machine learning algorithms. For each fraud type, our data scientists worked closely with the domain specialists, iterating over Machine Learning models, features and parameters to arrive at the best performing combination and get meaningful insights. Multiple models, such as Logistic Regression, Modified Multi-Variate Gaussian, Boosting, and Bagging using Random Forest, were built and tested for their performance on various datasets and scenarios.
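To make the idea of a detection layer concrete, here is a minimal sketch in Python of how tokenization, a business rule and a statistical model might fit together. It is an illustration under stated assumptions, not our production engine: the key, the expense-claim fields, the receipt limit, the threshold and the sample data are all hypothetical, and the model is a simplified Gaussian that treats each feature independently rather than the Modified Multi-Variate Gaussian mentioned above.

```python
import hashlib
import hmac
import math
from statistics import mean, pstdev

# Hypothetical key; a real deployment would load this from a secrets store.
SECRET_KEY = b"demo-key-not-for-production"


def tokenize(value):
    """Replace a sensitive value with a keyed, non-reversible token."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def fit_gaussian(rows):
    """Estimate a (mean, standard deviation) pair for each feature column."""
    return [(mean(col), pstdev(col)) for col in zip(*rows)]


def log_density(row, params):
    """Log-density of a row under independent per-feature Gaussians."""
    total = 0.0
    for x, (mu, sigma) in zip(row, params):
        total += -0.5 * math.log(2 * math.pi * sigma ** 2)
        total -= (x - mu) ** 2 / (2 * sigma ** 2)
    return total


# Hypothetical history of normal expense claims: (amount, days_to_submit).
history = [(120, 3), (95, 2), (130, 4), (110, 3), (105, 2), (125, 5)]
params = fit_gaussian(history)

# Assumed threshold: well below the least likely historical claim.
threshold = min(log_density(row, params) for row in history) - 5.0

RECEIPT_LIMIT = 500  # hypothetical business rule limit


def evaluate(claim):
    """Return the reasons a claim is flagged; an empty list means no flag."""
    reasons = []
    if claim["amount"] > RECEIPT_LIMIT and not claim["has_receipt"]:
        reasons.append("rule: high amount without receipt")
    if log_density((claim["amount"], claim["days"]), params) < threshold:
        reasons.append("model: low likelihood")
    return reasons


claims = [
    {"employee": "E1001", "amount": 118, "days": 3, "has_receipt": True},
    {"employee": "E1002", "amount": 900, "days": 30, "has_receipt": False},
]

# Employee IDs are tokenized before the results leave the detection layer.
results = {tokenize(c["employee"]): evaluate(c) for c in claims}
for token, reasons in results.items():
    print(token, reasons or "no flag")
```

Keeping the rule and the model as separate checks means a reviewer can see which of the two raised a red flag, which helps when domain specialists and data scientists iterate on the combination.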
Flavors of machine learning
The different models mentioned above delivered varying levels of precision on different datasets. Because so much depends on the innate attributes of a dataset and on its quality, it is unrealistic to state in advance how suitable a specific algorithm is, or how well it will perform, for a given scenario. However, after back-testing the performance of the models and their ability to identify new fraud schemes, we can conclude with a high degree of confidence that the suite of models applies well across multiple process and industry domains. Over the last few months, we have expanded our set of use cases to include line of business scenarios such as identifying fraudulent insurance claims, detecting anomalies in master data using pattern recognition, and assessing marketing and advertising risk.
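As a rough illustration of what comparing models in a back-test involves, the sketch below computes precision and recall for two models against cases that investigators later confirmed as fraud. The model names, case IDs and labels are invented for the example and do not come from our datasets.

```python
def precision_recall(flagged, confirmed):
    """Return (precision, recall) for a set of flagged case IDs."""
    true_positives = len(flagged & confirmed)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(confirmed) if confirmed else 0.0
    return precision, recall


# Hypothetical back-test: case IDs later confirmed as fraud by investigators.
confirmed_fraud = {2, 3, 5}

# Hypothetical red flags raised by two models on the same historical period.
flags_by_model = {
    "model_a": {1, 2, 3, 4},
    "model_b": {2, 5},
}

scores = {
    name: precision_recall(flags, confirmed_fraud)
    for name, flags in flags_by_model.items()
}

for name, (precision, recall) in scores.items():
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case both models catch the same share of confirmed fraud, but one raises fewer false alarms, which is the kind of difference that only shows up when the models are tested on the dataset at hand.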
Applying the models to different business settings and priorities requires customization and close interaction with the various stakeholders, after which the red flags have a high hit-rate. Given the promising response from businesses, we see these developments changing the way risk, audit and operations teams work, with significant gains in process velocity.
© 2021 Wipro Limited |