The lifecycle of IT services has mostly been plan, build and operate, in that sequential order. In a site reliability engineering-enabled agile enterprise with programmable infrastructure across private and public Clouds, it becomes plan to operate, build to operate, and create an operations architecture for next-generation operations. Operations is no longer a separate phase that follows the others, and enterprises cannot afford to treat it as an afterthought. The reason is the sheer volume of change, elasticity and complexity involved in running distributed services with many short-lived moving parts. The goal is no longer incident management, but incident avoidance.
Design2operate is a philosophy embedded into the plan, design and build phases so that services are operable not just from day 1, but from day –n, 0, 1 and +n. Enterprises on large-scale digital and cloud transformation programs are treating ‘runnability’ (also known as operability) as a first consideration, even before implementing a ‘Change’. Operability planning needs to be embedded throughout the Cloud adoption journey, from planning the portfolio and treating applications for Cloud to ensuring reliable and fearless change management.
Mutations of the operations spectrum
Programmability of infrastructure has changed the way services are introduced, operated and destroyed. The following transitions (see Figure 1) emphasize the need for Design2operate thinking in the transformation program of every cloud adoption journey:
Mutable to immutable infrastructure: Patching and upkeep of the configuration of a CI (configuration item) is history. Once a service is deployed, that instance of the service is never updated. Every deployment remains immutable for its life. Any change in the configuration triggers a new deployment. This demands infrastructure that can support the increased speed and frequency of deployments.
Incident response to incident avoidance: Response times and SLAs are history. Avoidance, auto-heal and Business Level Agreements (BLAs) are the measures of the new order. The instrumentation for operability needed to meet the BLAs is baked into the code defining the infrastructure environment. This demands that infrastructure architects and developers work closely with the service and application owners right from the planning phase.
Figure 1: Changes in the operations spectrum
DR to AlwaysOn: Disaster Recovery (DR), Recovery Point Objective (RPO) and Recovery Time Objective (RTO) are history. AlwaysOn is the new norm. The infrastructure environment is coded with auto-scale and auto-recover mechanisms. This enables the service to scale out on demand or relocate itself to a new region in disaster scenarios.
Human labor to digital labor: Tower-based headcount for operations is history. Bots enabled by artificial intelligence would completely invalidate the tower-based model for operations. The digital labor in an estate would be measured and enhanced more than human labor. Automation moves from the task/process level to the cognitive level.
RFP/PO-free sourcing and fulfillment: Self-service sourcing and automated fulfillment across multiple providers will slowly eliminate long request for proposal (RFP) and purchase order (PO) processes.
Waterfall to agile: Most infrastructure automation and operations work will move from a Waterfall to an Agile model. Teams therefore need to be small, with 8-10 cross-skilled members, and aligned to services rather than towers.
In-source to crowd-source: The idea of ‘jobs’ is changing from full-time employment to task-based employment. As services become more and more commoditized, there will be more opportunities to atomize jobs into smaller chunks that are delivered by people who are not employed full time.
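The first two transitions above lend themselves to a small illustration. The following is a minimal sketch in Python, not a real provisioning tool: the names Deployment, Service, apply and heal are hypothetical and chosen only for this example. It assumes that a deployment is identified by a hash of its configuration, that any configuration change produces a new deployment rather than an in-place update, and that auto-heal is done by replacing an instance with a fresh one built from the same definition.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Deployment:
    """One deployment record. frozen=True means it cannot be modified after creation."""
    version: int
    config_hash: str
    config: Tuple[Tuple[str, str], ...]  # sorted (key, value) pairs, read-only


def hash_config(config: Dict[str, str]) -> str:
    """Return a short, stable fingerprint of a configuration."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


class Service:
    """Hypothetical service whose deployments are never updated in place."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.history: List[Deployment] = []

    @property
    def current(self) -> Optional[Deployment]:
        return self.history[-1] if self.history else None

    def _deploy(self, config: Dict[str, str]) -> Deployment:
        deployment = Deployment(
            version=len(self.history) + 1,
            config_hash=hash_config(config),
            config=tuple(sorted(config.items())),
        )
        self.history.append(deployment)
        return deployment

    def apply(self, config: Dict[str, str]) -> Deployment:
        """Deploy only if the configuration differs from the current one."""
        current = self.current
        if current is not None and current.config_hash == hash_config(config):
            return current  # nothing changed, so nothing is redeployed
        return self._deploy(config)  # any change triggers a new deployment

    def heal(self) -> Deployment:
        """Assumed auto-heal strategy: replace the instance, never repair it."""
        if self.current is None:
            raise RuntimeError("nothing deployed yet")
        return self._deploy(dict(self.current.config))


if __name__ == "__main__":
    svc = Service("orders")
    first = svc.apply({"image": "orders:1.0", "replicas": "2"})
    same = svc.apply({"image": "orders:1.0", "replicas": "2"})
    changed = svc.apply({"image": "orders:1.1", "replicas": "2"})
    print(first is same, first.version, changed.version)
```

In this sketch the deployment history doubles as an audit trail: the earlier record is left untouched when the configuration changes, which is the property the immutable model relies on. A real environment would express the same idea in its infrastructure-as-code tooling rather than in application code.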
Key components of the Design2operate paradigm
Like any other major transformation, implementing the Design2operate paradigm requires sufficient consideration of all three aspects of people, process and technology. The specifics of these three aspects may vary with the maturity level of the organization, but irrespective of the quantum of change, there is a need for a shift in thinking about the operational model.
Technology
Some of the key technologies that enable Design2operate are:
People
Organization structure has to change significantly to adopt a Design2operate paradigm (see Figure 2). Some of the new skills required are:
Figure 2: Sample organization structure to enable Design2operate
Process and governance
Some of the new processes that need to be developed and implemented are:
Conclusion
Enterprises undertaking large transformation programs, especially amid the recent surge in Cloud adoption and acceleration, often overlook the operability of dynamically changing services, which increases their hidden costs over time. Applying Design2operate thinking right from the planning stage of a transformation can significantly reduce the costs and anxieties that crop up midway through the program.
Murthy Malapaka
Head of Transformation Services - North America, Wipro Ltd
Murthy has 28 years of experience as a technology innovator and change agent. Over these years, he has held various technology leadership roles across application and infrastructure architecture domains, specializing in availability and reliability. He has been providing consulting services to CIOs and CTOs in their journey from client-server to on-demand infrastructure services. Murthy can be reached at murthy.malapaka@wipro.com.
Govindaraj Rangan
Practice Director and Head of Cloud Transformation Services, Wipro Ltd
Govind has 22 years of industry experience across the breadth of the technology spectrum: application development to IT operations, UX design to IT security controls, presales to implementation, converged systems to Internet of Things, and IT strategy to hands-on work. He has an MBA from ICFAI University specializing in Finance, an MS in Software Systems from BITS Pilani, and a BE (EEE) from Madras University. Govind can be reached at govindaraj.rangan@wipro.com.