New Ways of Working - How to manage and elevate test data for emerging technologies
October | 2018
Organizations with cutting-edge DevOps tools and practices still find it difficult to provision the right data at the right time for a continuous testing framework. Test Data Management (TDM) is a prerequisite for the success of the DevOps and Continuous Integration/Continuous Delivery (CI/CD) methodology.
TDM is no longer a traditional support and back-office function, but a core, critical function that acts as a business enabler for security, agility and cost efficiency. There is a need for new TDM methodologies and strategies, supported by an efficient tooling strategy, that can solve data provisioning challenges and help organizations adopt DevOps and CI/CD.
How do organizations manage test data today?
Organizations manage test data in multiple ways, from maintaining it in simple Excel sheets or CSV files to temporary databases on shared environments.
Test data management has evolved over the past few years. Figure 1 illustrates the different phases of progression in test data management.
Figure 1: Stages of development in test data management
The early TDM approach was project-centric: applications repurposed DWH/dimension tables to serve as test data repositories, with little focus on the quality and governance of test data. The next phase, TDM 1.0, was more aligned to a domain-centric approach, i.e., department/BU-focused projects and local applications, with awareness of data quality and reactive data governance. It was mostly confined to a single domain and was largely an IT-centric initiative.
TDM 2.0 increased the focus on data governance, with attention to stewardship and business workflows beyond the core TDM functionalities. Interest in multi-domain TDM started emerging, and business and IT became better aligned on TDM. With TDM 3.0, overall TDM ownership and interest shifted towards business stakeholders, data governance became proactive, and the overall architecture maturity moved from consolidation to centralization.
TDM 4.0 and beyond
The current trend, TDM 4.0, is the expansion and optimization phase, which includes enabling DevOps, IoT and more. In this phase, Machine Learning and Big Data will drive TDM strategy, and the rise of the Chief Data Officer in enterprises will mean that data management gains greater visibility and transparency in enterprise data initiatives, supported by new technologies such as database virtualization, service virtualization and containers.
A TDM strategy for DevOps and other emerging technologies should be built on an approach that we call the 4 Dollar, 3 Penny (4D, 3P) principle (see Figure 2).
Figure 2: 4 Dollar, 3 Penny model for TDM
Along with the 4 Dollar, 3 Penny principle, the following tenets should also be part of a good TDM strategy:
Secure at source
Data copied from production should be masked before it lands in non-production environments. On-the-fly masking should be adopted, in compliance with regulatory and local laws. Multiple TDM tools support this process; a sketch of the idea follows.
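A minimal sketch of on-the-fly masking, assuming a CSV production extract; the column names, salt and hashing scheme are illustrative, and a real pipeline would use the masking functions of the chosen TDM tool:

```python
import csv
import hashlib

# Columns assumed to hold sensitive values; adjust per data classification rules.
SENSITIVE_COLUMNS = {"email", "ssn", "phone"}

def mask_value(value: str, salt: str = "tdm-demo-salt") -> str:
    """Deterministically pseudonymize a value so the same input always maps to the same token."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]

def mask_on_the_fly(source_path: str, target_path: str) -> None:
    """Stream rows from a production extract, masking sensitive columns
    before they ever land in the non-production environment."""
    with open(source_path, newline="") as src, open(target_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            for column in SENSITIVE_COLUMNS & set(row):
                row[column] = mask_value(row[column])
            writer.writerow(row)

if __name__ == "__main__":
    mask_on_the_fly("prod_extract.csv", "masked_for_test.csv")
```

Deterministic hashing is used here so that a masked value stays consistent across tables, preserving referential integrity in the test data set.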
DevOps, Agile adaptability
Test data creation and provisioning should support DevOps and Agile methodologies by leveraging test-driven data creation frameworks. This requires faster turnaround on data refreshes, masking, and ad hoc data requests to support ongoing testing activities.
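As a sketch of test-driven data creation, the pytest fixture below builds exactly the data one test needs, in an in-memory SQLite database, rather than waiting for a shared environment refresh. The table and values are hypothetical; in a CI/CD pipeline the fixture would instead call the TDM provisioning service:

```python
import sqlite3
import pytest

@pytest.fixture
def order_test_data():
    """Create the precise data this test needs, on demand, then tear it down."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders (status, amount) VALUES (?, ?)",
        [("OPEN", 120.0), ("SHIPPED", 75.5), ("CANCELLED", 0.0)],
    )
    conn.commit()
    yield conn
    conn.close()

def test_open_orders_are_counted(order_test_data):
    cursor = order_test_data.execute("SELECT COUNT(*) FROM orders WHERE status = 'OPEN'")
    assert cursor.fetchone()[0] == 1
```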
Self service
Data requests from all stakeholders should be served through on-demand delivery. The stakeholders can be developers, users, testers, database administrators or data SMEs. Self-service should reduce the turnaround time for refreshing the entire data set or a selected subset, provision only masked data, run periodic refreshes of the golden copy, and archive and back up the data. All of this should be offered through a catalogue-based, as-a-service delivery model, as sketched below.
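The following sketch illustrates the catalogue-based model: a self-service entry point resolves a named data set to a provisioned endpoint. The catalogue entries, names and connection strings are invented for illustration; a real implementation would sit behind the TDM tool's API and a request portal:

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class CatalogueItem:
    description: str
    provision: Callable[[], str]  # returns a connection string or path to the provisioned set

# Hypothetical catalogue entries; real ones would invoke the TDM tool's provisioning APIs.
CATALOGUE: Dict[str, CatalogueItem] = {
    "customers-masked": CatalogueItem(
        description="Masked customer master, refreshed weekly from the golden copy",
        provision=lambda: "jdbc:postgresql://tdm-host/customers_masked",
    ),
    "orders-synthetic": CatalogueItem(
        description="Synthetic order history for negative-path testing",
        provision=lambda: "jdbc:postgresql://tdm-host/orders_synthetic",
    ),
}

def request_data_set(name: str) -> str:
    """Self-service entry point: any stakeholder requests a set by name, with no ticket queue."""
    item = CATALOGUE.get(name)
    if item is None:
        raise KeyError(f"Unknown data set: {name}. Available: {sorted(CATALOGUE)}")
    return item.provision()

print(request_data_set("customers-masked"))
```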
Synthetic data adoption
Programs should adopt ways to create dynamic or synthetic data to cover outlier or negative scenarios that cannot be tested using existing production data. Such synthetic data can be created using licensed tools, open-source tools, or simple scripts on the .NET/Java platform; see the sketch below.
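Below is a minimal synthetic data generator. The article suggests .NET/Java scripts; Python is used here only for brevity, and the schema, value ranges and deliberately injected negative cases (negative balances, blank statuses) are illustrative:

```python
import csv
import random
import string

random.seed(42)  # reproducible runs, so a failing test can be replayed on identical data

def random_email() -> str:
    user = "".join(random.choices(string.ascii_lowercase, k=8))
    return f"{user}@example.com"

def synthetic_customers(count: int):
    """Generate rows that include outlier and negative cases production data rarely contains."""
    for i in range(count):
        yield {
            "id": i + 1,
            "email": random_email(),
            # Deliberately span negative and boundary balances for negative-path tests.
            "balance": round(random.uniform(-500.0, 10_000.0), 2),
            "status": random.choice(["ACTIVE", "SUSPENDED", "CLOSED", ""]),  # "" = missing-value case
        }

with open("synthetic_customers.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "email", "balance", "status"])
    writer.writeheader()
    writer.writerows(synthetic_customers(1000))
```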
Support to all testing phases
Test data should cater to all types of testing, including functional, regression, automation, performance and user acceptance testing. The framework should also support the development team in populating the required data into unit test environments.
The future of test data management
Test data has traditionally depended on production data, with exceptions made for sanitized data in test environments. With improvements in technology, we see many open-source companies and third-party tool vendors adopting synthetic data, as it is more convenient and useful for creating data for Agile/DevOps programs. Service virtualization supports the creation of test stubs and complements TDM tools. Dependency on production data should be reduced, as it involves long turnaround times, including the masking of sensitive data. The overall focus should be on leveraging the power of synthetic data generation, supplemented with service virtualization methodologies. Production data can still be used to replicate critical scenarios that need to be recreated in lower environments.
Sathiya Narayanan G
Senior Consulting Manager - Data Assurance, Wipro
Sathiya has over 18 years of experience in the IT industry, with technical expertise in Data Warehousing, Test Data Management, and Big Data systems. He is PMP and OCP certified, and is currently working on providing solutions on Data, Environment and Cloud assurance platforms to global enterprises.
© 2021 Wipro Limited