In Part 2, we built out the dual-core brain of autonomous operations - the Network Knowledge Graph (NKG) as the spatial map of how everything connects, and the Temporal Knowledge Graph (TKG) as the timestamped logbook of everything that changes. We showed how linked identifiers bridge the identity gap across siloed systems, turning abstract ticket IDs into traceable network elements.
Now we put it to work. The most revealing test of any autonomous system is not how it handles a clean, obvious failure - it is how it handles the messy, ambiguous situations that currently take human experts hours to unravel. Change-related incidents are exactly that test.
The Risk Window After a Change
Every network operations team knows this feeling: a maintenance window closes successfully, the on-call engineer stands down, and then - 45 minutes later - something starts going wrong. Not on the element that was changed, but somewhere else. Something adjacent.
This is the risk window after a change, and it is one of the most expensive problems in network operations. The change looks fine in isolation. The degradation looks unrelated at first glance. And by the time a team of experts across domains has manually correlated timestamps across four different systems, significant customer impact has already occurred.
This is the scenario we will walk through in this part.
Midnight: The Change Is Executed
At 12:00:00 AM, during a scheduled maintenance window, a Method of Procedure (MOP) initiates a software upgrade on a core aggregation router - a critical node carrying traffic for multiple services and customer segments. The upgrade itself takes time - configuration changes, service restarts, validation checks. By 12:07:00 AM, the MOP completes. Health checks pass. The on-call team sees no alarms. For the next ten minutes, everything looks stable. From every indication, the maintenance window is a success.
The TKG silently logs the full execution window with precision: (MOP-ID-X, Alters, Aggregation-Router-A, Start Time 12:00:00 AM, End Time 12:07:00 AM).
Background monitoring agents continue their watch.
12:17:00 AM: Something Shifts
Ten minutes after the MOP completes, the KPI Drift Monitor detects subtle but statistically significant dips across several performance indicators - throughput dropping on specific service paths, latency creeping up on a subset of sessions, and a slight uptick in packet retransmissions. None of these individually cross an alarm threshold. Together, they tell a different story.
To a human observer reviewing separate dashboards, these look like minor, unrelated fluctuations across different services. The ten-minute gap between the MOP completion and the KPI dips makes the connection even less obvious. The temptation is to log them as noise and investigate in the morning.
The TKG records each drift point with sub-second precision: (Aggregation-Router-A, Experiences, KPI Degradation, Time 12:17:00 AM).
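The two TKG records above can be sketched as timestamped facts. The class and field names below are illustrative, not a fixed schema - the point is that a change is stored as an interval and an anomaly as a point in time, which makes their temporal relationship directly computable:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass(frozen=True)
class TKGEvent:
    """A timestamped fact in the Temporal Knowledge Graph."""
    subject: str              # e.g. a MOP ticket or network element ID
    predicate: str            # relationship type ("Alters", "Experiences", ...)
    obj: str                  # target element or observed condition
    start: datetime
    end: Optional[datetime] = None  # None for point-in-time events

# The two records from the scenario (date is arbitrary for illustration)
mop_event = TKGEvent("MOP-ID-X", "Alters", "Aggregation-Router-A",
                     datetime(2024, 1, 1, 0, 0, 0),
                     datetime(2024, 1, 1, 0, 7, 0))
drift_event = TKGEvent("Aggregation-Router-A", "Experiences", "KPI Degradation",
                       datetime(2024, 1, 1, 0, 17, 0))

gap = drift_event.start - mop_event.end  # ten minutes between MOP end and drift
```

Storing both as structured facts means the ten-minute gap that obscured the connection for a human becomes a trivial subtraction for the system.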
The Three-Step Reasoning Chain
With the NKG and TKG working in tandem, the autonomous system performs a three-step reasoning process that would take a human team hours to complete manually.
Step 1: Pinpointing the Moment of Drift
The KPI Drift Monitor runs Change Point Detection (CPD) - a statistical technique that identifies when a metric deviates from its historical baseline. It flags 12:17:00 AM as the change point across multiple KPIs and immediately cross-references this against the TKG: a MOP was executed on Aggregation-Router-A between 12:00:00 AM and 12:07:00 AM, completing ten minutes earlier. The temporal proximity alone is not proof - but the co-occurrence of multiple KPI dips, all mapping to the same network segment in the NKG, is enough to trigger deeper investigation.
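A minimal sketch of the idea behind CPD, using a rolling-baseline z-score as a simplified stand-in for production techniques such as CUSUM or PELT (the sample values and window size are invented for illustration):

```python
from statistics import mean, stdev

def detect_change_point(series, window=10, z_threshold=3.0):
    """Return the first index where a sample deviates sharply from the
    rolling baseline of the preceding `window` samples.

    A deliberately simplified stand-in for production CPD algorithms.
    """
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and abs(series[i] - mu) / sigma > z_threshold:
            return i  # index of the detected change point
    return None

# Throughput samples (Mbps), one per minute: stable, then a dip at minute 17
throughput = [940, 942, 939, 941, 940, 943, 938, 941, 940, 942,
              939, 941, 940, 942, 941, 939, 940, 870]
cp = detect_change_point(throughput)  # flags index 17, the dip
```

The key property is that none of the earlier fluctuations trip the detector - only the statistically significant deviation does, which is exactly why sub-threshold dips across several KPIs can still be flagged when each is compared against its own baseline.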
Step 2: Resolving the Identity Gap
The Change Management Agent takes the MOP ticket identifier - CI-ID-xxxx - and traverses the NKG to resolve it to the actual network element:
CI-ID-xxxx → Linked_To → Aggregation-Router-A → Identified_As → NE-ID-yyyy
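The traversal itself is simple once the linked identifiers exist. A toy sketch, representing the relevant NKG fragment as labeled edges (the structure is illustrative - a production NKG would live in a graph database):

```python
# Minimal NKG fragment as labeled edges: (source, relation) -> target.
# Node and relation names follow the example in the text.
nkg_edges = {
    ("CI-ID-xxxx", "Linked_To"): "Aggregation-Router-A",
    ("Aggregation-Router-A", "Identified_As"): "NE-ID-yyyy",
}

def resolve(start, relations, edges):
    """Follow a chain of relations from a starting node."""
    node = start
    for rel in relations:
        node = edges[(node, rel)]
    return node

ne_id = resolve("CI-ID-xxxx", ["Linked_To", "Identified_As"], nkg_edges)
```

Two hops turn a ticket ID into a network element ID - the entire "identity gap" problem reduces to whether those edges were captured in the first place.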
What was an abstract ticket ID is now a specific, known aggregation router with a precise position in the service topology - one that sits on the path of multiple services and customer segments.
Step 3: Validating the Functional Path
The RCA Agent asks the critical question: "Are the services experiencing KPI degradation at 12:17:00 AM dependent on the aggregation router that was upgraded at 12:00:00 AM?"
It traverses the NKG and confirms that Aggregation-Router-A sits on the service path of the affected traffic flows. The software upgrade introduced subtle changes in traffic handling behaviour - queue scheduling, buffer allocation, or forwarding table updates - that only manifested under live traffic conditions after the maintenance window closed. No other changes occurred in the same window. No other anomalies are present to confuse the picture.
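The dependency check can be sketched as a path-membership query over the NKG. The service names and hop lists below are hypothetical, invented to illustrate the shape of the question the RCA Agent asks:

```python
# Hypothetical service paths from the NKG: ordered hops per service flow.
service_paths = {
    "Service-1": ["Edge-Router-1", "Aggregation-Router-A", "Core-Router-1"],
    "Service-2": ["Edge-Router-2", "Aggregation-Router-A", "Core-Router-2"],
    "Service-3": ["Edge-Router-3", "Aggregation-Router-B", "Core-Router-2"],
}

def services_dependent_on(element, paths):
    """Return the services whose path traverses the given element."""
    return sorted(s for s, hops in paths.items() if element in hops)

degraded = {"Service-1", "Service-2"}   # services showing KPI dips at 12:17 AM
dependent = set(services_dependent_on("Aggregation-Router-A", service_paths))
causal_link = degraded <= dependent     # every degraded flow traverses the router
```

If every degraded service traverses the changed element, and no other change or anomaly is present in the window, the hypothesis survives - which is what lets the system elevate correlation to causality.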
The system elevates this from correlation to causality - with traceable, verifiable evidence.
From Diagnosis to Action
Once the causal chain is established with high confidence, the system does not stop at diagnosis. The Remediation Agent presents a structured evidence artifact to the on-call engineer:
- What happened: Aggregation-Router-A upgrade executed between 12:00:00 AM and 12:07:00 AM introduced changes in traffic handling behaviour
- What it caused: KPI dips across throughput, latency, and packet retransmissions detected at 12:17:00 AM across dependent service paths
- Who is affected: The specific customer segments and network slices traversing Aggregation-Router-A
- Recommended action: Rollback Aggregation-Router-A to the pre-change configuration
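The evidence artifact above could be carried as a structured record rather than free text, so that each field stays traceable back to TKG and NKG entries. A sketch with illustrative field names (not a fixed schema):

```python
from dataclasses import dataclass, field

@dataclass
class EvidenceArtifact:
    """Structured causal narrative handed to the on-call engineer.

    Field names are illustrative; the point is that each claim is a
    discrete, reviewable item rather than a paragraph in an alert.
    """
    what_happened: str
    what_it_caused: str
    affected: list
    recommended_action: str
    evidence_refs: list = field(default_factory=list)  # TKG/NKG record IDs

artifact = EvidenceArtifact(
    what_happened="Upgrade on Aggregation-Router-A, 12:00:00-12:07:00 AM",
    what_it_caused="KPI dips (throughput, latency, retransmissions) at 12:17:00 AM",
    affected=["customer segments and slices traversing Aggregation-Router-A"],
    recommended_action="Rollback Aggregation-Router-A to pre-change configuration",
)
```

Keeping the artifact structured is also what makes the human-approval step fast: the engineer reviews discrete claims with evidence references, not a wall of log lines.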
The engineer reviews the evidence - not a vague alert, but a traceable causal narrative anchored in the NKG and TKG - and approves the remediation. The rollback executes. All of this within minutes of the initial drift, before customers experience meaningful impact.
This is the shift from reactive triage to proactive assurance in practice.
Conclusion: Why This Matters Beyond the Scenario
The midnight maintenance scenario illustrates something important: the hardest problems in network operations are not the obvious failures. They are the ones where the cause and effect are separated by time, by domain boundaries, and by the fragmented identity of data across systems.
Change-related incidents are disproportionately represented in major outages precisely because they exploit these gaps. The NKG and TKG, working together through a coordinated agent workflow, close those gaps - not by making humans work faster, but by giving them the right evidence at the right time to make confident decisions.
In this part, we have covered:
- The "risk window after a change" as a real and costly operational challenge
- How the TKG captures change and anomaly events with sub-second precision
- The three-step agent reasoning chain: drift detection, identity resolution, and topological validation
- How the system moves from diagnosis to human-approved remediation in minutes
Looking Ahead: Part 4 – The Anatomy of an Autonomous Agent
We have now seen what agents do. In Part 4, we will look under the hood at how they actually work - the dual-core design that combines generative AI for understanding unstructured data with deterministic reasoning for verifiable conclusions. We will also look at the trust layer that ensures agents operate within defined boundaries, and why that governance is not a constraint on autonomy but the very thing that makes autonomy possible.