A cloud-native infrastructure is defined in large part by its modularity. This decoupling of the entire architecture has led to innovation and agility, but it has also created many moving parts that must allow programmatic access and greater observability to achieve the highest quality assurance. This adds a new dimension to testing cloud-native architecture, one in which traditional quality engineering (QE) roles can no longer be compartmentalized to the development cycle. Modern quality engineers (QEs) must now work more closely with site reliability engineers (SREs) or even evolve into SREs themselves.
Wipro’s Quality Engineering in a Cloud-Centric World report found that 70% of organizations have a cloud strategy defined, and quality engineering and assurance form a critical part of that strategy. But cloud-native applications require a higher degree of service discovery, automation, observability, traffic management and security, and production-like test strategies must be used earlier in the software development lifecycle (SDLC) to reduce defects. This has caused a paradigm shift for quality engineering. To succeed, today’s cloud-enabled QE must become proficient in test automation and be able to understand monitoring, observability, microservice patterns and architectural principles.
Three Principles for Cloud-Native Assurance
The new quality engineering paradigm must address the decoupled nature of the cloud. Testing should include verifying scalability and the capacity of the infrastructure for horizontal pod autoscaling. Organizations should also validate the system’s ability to bounce back from failure and to persist both data and state. It is also critical to test the ability to undertake risky deployments without hamstringing production, and to assure that requests and limits are configured properly to handle production or production-like traffic.
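As a rough illustration of what verifying horizontal pod autoscaling can involve, the sketch below reproduces the replica calculation described in the Kubernetes Horizontal Pod Autoscaler documentation (desired replicas = ceil(current replicas × current metric / target metric), with a default tolerance of 0.1). The function name, the replica bounds and the sample numbers are assumptions made for this example, not part of any report or tool cited here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Estimate the replica count an HPA would aim for.

    min_replicas and max_replicas are illustrative bounds; a real test
    would read them from the autoscaler configuration under test.
    """
    ratio = current_metric / target_metric
    # Inside the tolerance band the autoscaler leaves the count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))
```

A scalability test could compare this expected value with the replica count actually observed after applying load, and flag the configuration when the two diverge.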
The principles organizations need to adopt for cloud-native assurance include testing uncertainty through autonomy, shifting right with QAOps and proselytizing testers to quality engineers. These tenets need to be assured alongside the functional and regression tests traditionally performed by API testers.
- Testing uncertainty through autonomy
One of the key bottlenecks of testing in a monolithic architecture is the QE’s dependence on developers and ops engineers. A cloud-native QE will need to transcend the role of an application tester and begin to test the resiliency of a cloud-native architecture, decoupling themselves from the trappings of a typical SDLC. This can be done by including autonomy in squad-building and skill set alignment (“feature toggling”). Follow the adage: To build a resilient architecture, one must first learn to break it!
Approaches such as role-based access control (RBAC) can give testers the autonomy to test microservices within a designated namespace in specific cloud environments without fear of affecting production. This enables testers to validate deployment strategies (canary, blue-green, dark releases, etc.) and gives them visibility into service mesh, CI and telemetry pods. Tools such as Prometheus, Grafana, ELK and Kiali (on Istio) can be used for root cause analysis of system failures and alerts.
- Shift right with QAOps
While DevOps in isolation ensures agility, it does not ensure high quality. This is because most defects in cloud-native architectures are caught in production and are ops-related. This is where QAOps can be used in tandem with DevOps: use local environments to pull rolling deployments automatically. For example, using Minikube to conduct deployment tests in local environments brings “right” tests to the “left.”
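A local deployment test of this kind could be sketched as follows. The example wraps the `kubectl rollout status` command and assumes a cluster reachable through a kubectl context named `minikube` (Minikube's default). The deployment name, the namespace and the helper function names are hypothetical, and the `run` parameter exists only so the check can be exercised without a live cluster.

```python
import subprocess


def rollout_status_cmd(deployment, namespace="default",
                       context="minikube", timeout_s=120):
    """Build the kubectl command that waits for a rollout to finish."""
    return [
        "kubectl", "--context", context, "--namespace", namespace,
        "rollout", "status", f"deployment/{deployment}",
        f"--timeout={timeout_s}s",
    ]


def rollout_succeeded(deployment, run=subprocess.run, **kwargs):
    """Return True when kubectl reports a completed rollout.

    kubectl exits with a non-zero status if the rollout fails or the
    timeout expires, so the return code is enough for a pass/fail check.
    """
    result = run(rollout_status_cmd(deployment, **kwargs),
                 capture_output=True, text=True)
    return result.returncode == 0
```

Run against a local cluster, a call such as `rollout_succeeded("checkout", namespace="qe-sandbox")` would let a pipeline fail early when a new image cannot roll out cleanly, well before the change reaches a shared environment.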
- Proselytizing testers to quality engineers
Current test environments have failed to build test specialists because they assign specialized testing skill sets to generalized roles such as functional testing or API testing. Cloud-native testers need to specialize in testing distributed systems on the cloud with two main objectives: validating the microservices applications and testing end-to-end through integration testing. This is where QAOps works in conjunction with DevOps rather than just testing at the endpoints.
This requires a significant change in organizations’ skilling strategies. Currently, according to Wipro’s State of Quality report, 72% of an organization’s testing workforce consists of T-shaped engineers (one functional and one technical skill). The new structure of cloud systems requires an upgrade to more cross-functional engineers. When it comes to meeting the demand for niche cloud QE skills, our research in the same report found that 83% of organizations prefer upskilling to hiring cross-skilled QEs from the market. This benefits the organization, because existing engineers are familiar with its systems and culture, and better learning opportunities lead to higher retention.
Resilient Cloud Architecture Requires a Paradigm Shift in Quality
Creating a truly resilient architecture on the cloud will require a shift in the way the IT testing landscape is viewed. The current delivery squads (BA-Dev-QA) and their corresponding swim-lanes have streamlined Agile/DevOps responsibilities in the service-oriented architecture era, but they must be reconfigured in the cloud-native world to deliver high-quality solutions.
The first step is enabling QEs to try and break the architecture throughout the SDLC for continuous improvement, rather than perform application-level tests only during the system integration testing phase. Furthermore, re-contextualizing what testing on the left or right means will help redefine testing to create truly auto-scalable, efficient and resilient deployments in the cloud.
The new cloud paradigm is evolving rapidly. So is the traditional role of quality engineering. Organizations must evolve their QE accordingly to thrive in the cloud-everywhere environment.