Is Incident Management a Value Stream?

In this, the first in a series of blog posts from the Value Stream Management Consortium, our experts weigh in on the question, “Is THIS a value stream?” First up, we’re looking at Incident Management.

Helen Beal, Chair of the Value Stream Management Consortium and Co-Author of the Value Stream Management Foundation Course & Certification:

I say no, Incident Management is not a value stream. It does not deliver value to a customer, nor is it a step or process in the end-to-end lifecycle of creating value for a customer. I’m not saying it’s not important, but the role of Incident Management in VSM is to reduce or eliminate the unplanned work that results in friction, interruptions, and delays to flow in the value stream when something goes wrong. Incident Management is a supporting process to the core value stream.

VSM cares about Incident Management because incidents interrupt flow. It seeks to increase the Mean Time Between Incidents (MTBI) and reduce the Mean Time to Discover and Recover (MTTD/MTTR). If DevOps is a toolkit of interventions that support VSM, here are some experiments to support VSM’s goal of optimizing flow in the face of problems and incidents:

Pay down technical debt to increase stability and measure this in the work types in the value stream
Practice Chaos Engineering to improve antifragility
Use automation, CI/CD pipelines, DevOps toolchains, and the ‘shift left’ concept to reduce incidents in production
Implement observability and AIOps to reduce MTTD and MTTR and identify improvement candidates
Use intelligent swarming and other techniques to promote team autonomy and “we build it, we own it”
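To make the three metrics above concrete, here is a minimal sketch of how MTTD, MTTR, and MTBI could be computed from incident timestamps. The `Incident` record and its field names are hypothetical, and the measurement conventions are assumptions: teams differ on whether recovery time starts at impact or at detection, and on whether the gap between incidents runs from resolution or from the previous start.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass(frozen=True)
class Incident:
    """Hypothetical incident record; field names are illustrative only."""
    started: datetime    # when the fault began affecting the service
    detected: datetime   # when someone (or something) noticed it
    resolved: datetime   # when service was restored


def mean_hours(deltas: list[timedelta]) -> float:
    """Average a list of durations and express the result in hours."""
    return mean(d.total_seconds() for d in deltas) / 3600


def mttd(incidents: list[Incident]) -> float:
    """Mean time to discover: start of impact until detection (assumed convention)."""
    return mean_hours([i.detected - i.started for i in incidents])


def mttr(incidents: list[Incident]) -> float:
    """Mean time to recover: detection until resolution (assumed convention)."""
    return mean_hours([i.resolved - i.detected for i in incidents])


def mtbi(incidents: list[Incident]) -> float:
    """Mean time between incidents: one resolution until the next start.

    Needs at least two incidents; otherwise statistics.mean raises an error.
    """
    ordered = sorted(incidents, key=lambda i: i.started)
    gaps = [nxt.started - prev.resolved for prev, nxt in zip(ordered, ordered[1:])]
    return mean_hours(gaps)


# Made-up sample data, purely for illustration
incidents = [
    Incident(datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 30), datetime(2024, 1, 1, 11, 30)),
    Incident(datetime(2024, 1, 3, 11, 30), datetime(2024, 1, 3, 12, 0), datetime(2024, 1, 3, 13, 0)),
    Incident(datetime(2024, 1, 6, 13, 0), datetime(2024, 1, 6, 15, 0), datetime(2024, 1, 6, 18, 0)),
]

print(f"MTTD: {mttd(incidents):.1f} h")
print(f"MTTR: {mttr(incidents):.1f} h")
print(f"MTBI: {mtbi(incidents):.1f} h")
```

With this sample data, MTTD works out to 1 hour, MTTR to 2 hours, and MTBI to 60 hours. The goal is to push the first two down and the third up, whichever conventions you settle on, as long as you apply them consistently.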

Patrice Corbard, Founder of SD ReFocus and Influencer Member of the Value Stream Management Consortium:

As defined by ITIL, Incident Management is a process or a service management practice that aims to manage the lifecycle of all incidents (unplanned interruptions or reductions in quality of IT services).

On the other hand, Incident Resolution can be viewed as a value stream (or a value streamlet) with:

A clear trigger event: incidents or system outages
A clear value delivered: incidents resolved
Its own flows of work, material, and information
Performance metrics: MTTD and MTTR (the latter being one of the four key metrics of Accelerate)

A “Software Delivery Value Stream” can be mapped as a sequence of steps from Customer Requests to Delivery of a software product or service. We can also visualize the extended value stream with pre-request and post-delivery work.

Usually, we associate the “software delivery value stream” with its main objective of delivering new features.

But this is only one of several types of customer requests and types of work that the value stream has to manage. Feature requests are only the tip of the iceberg!

Fixing bugs, resolving incidents, anticipating and fixing vulnerabilities, responding to support requests, reducing technical debt, and so on are all types of work that bring value to users when accomplished, and whose flows differ from those of features.

To address this complexity and visualize this reality, a “software delivery value stream” can be decomposed into ‘value streamlets’.

The “Incident Resolution value streamlet” can be mapped using the same visual representation as the “Feature Development value streamlet”.

We can use the power of the value stream mapping technique to make visible the specifics of these types of work: their own input rate, information and workflows, timeline, and performance metrics.

Finally, the different value streamlets form a network that contributes to the whole software delivery value stream and to the delivery of value to its customers:

“New feature development” and “incident resolution” are two external value streamlets directly delivering value to customers, while “vulnerability remediation” is an internal supporting value streamlet serving other value streamlets and processes like “App Security Testing” and “Incident Management”.

Because the value delivered by our software is not only measured by the features deployed to users but also by the quality, security, and reliability of the systems running in production, we need to seriously consider “incident resolution” as a value stream(let).

Steve Pereira, Board Advisor and Value Stream Lead at the Value Stream Management Consortium and Co-Author of the Value Stream Management Foundation Course & Certification:

Generally, I’m a fan of applying models until they fail to provide value, or a superior alternative exists. Let’s borrow from a recent post where Helen went back to a canonical source of value stream wisdom, James Martin’s The Great Transition:

“An end-to-end collection of activities that creates a result for a ‘customer,’ who may be the ultimate customer or an internal ‘end-user’ of the value stream.”

Definition, check. Let’s just look at the collection itself to see ‘how’ it creates a customer result. As Patrice laid out:

Incidents occur (these are indistinguishable from stories, ideas, tasks)
The incident is registered in a tracking system
It’s classified, sized, assessed
It’s diagnosed
It’s resolved (customer result created)
It’s closed
It’s reflected on (I hope!)

If it has customers, can incident management be treated as a productized service? When work comes into a product team, it is analyzed, sized, prioritized, and structured so that it can be estimated and acted on effectively. Incident responders likewise benefit from the clarity provided by distinct classification (work profile), impact (value), complexity (rough sizing), and context, which allows the response to be estimated and acted on effectively.

One aspect of incident management that does seem to differ from many value streams is that it heavily favors single-piece flow. We don’t want a batch of incidents to amass before we start to act on them, at least before they’ve been prioritized.

Let’s look at a sample of common flow metrics to see whether IM can be measured the same way we measure a value stream:

Lead time: Mean time to resolution; we have a clear start and end point.
Cycle time: We can measure how long each stage takes to complete.
% Complete and Accurate: How often are we completing each stage successfully?
Value added time: How much of the time in each stage are we able to work without delay, and how much of it contributes to resolution?
Work profile: How many incidents of each type are flowing through the system over time?
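As a rough illustration of the list above, here is a minimal sketch that derives each of those flow metrics from per-stage incident records. The `StageVisit` record, its field names, the stage names, and the sample data are all hypothetical. It also assumes stages run back to back, so an incident's lead time is simply the sum of its stage times, waiting included.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class StageVisit:
    """One incident passing through one stage; all fields are hypothetical."""
    incident_id: str
    incident_type: str   # e.g. "outage", "degradation"
    stage: str           # e.g. "triage", "diagnose", "resolve"
    elapsed_min: float   # wall-clock minutes spent in the stage, waiting included
    active_min: float    # minutes of hands-on work within the stage
    accurate: bool       # True if the next stage could use the output without rework


def lead_times(visits):
    """Lead time per incident: total elapsed minutes across all of its stages."""
    totals = defaultdict(float)
    for v in visits:
        totals[v.incident_id] += v.elapsed_min
    return dict(totals)


def cycle_times(visits):
    """Cycle time per stage: mean elapsed minutes spent in that stage."""
    by_stage = defaultdict(list)
    for v in visits:
        by_stage[v.stage].append(v.elapsed_min)
    return {stage: mean(times) for stage, times in by_stage.items()}


def percent_complete_and_accurate(visits):
    """%C&A per stage: share of visits whose output needed no rework."""
    by_stage = defaultdict(list)
    for v in visits:
        by_stage[v.stage].append(v.accurate)
    return {stage: 100 * sum(flags) / len(flags) for stage, flags in by_stage.items()}


def value_added_ratio(visits):
    """Fraction of total elapsed time that was hands-on work."""
    return sum(v.active_min for v in visits) / sum(v.elapsed_min for v in visits)


def work_profile(visits):
    """Count of incidents by type, counting each incident once."""
    types = {v.incident_id: v.incident_type for v in visits}
    return Counter(types.values())


# Made-up sample data, purely for illustration
visits = [
    StageVisit("INC-1", "outage", "triage", 10, 5, True),
    StageVisit("INC-1", "outage", "diagnose", 60, 30, True),
    StageVisit("INC-1", "outage", "resolve", 30, 25, True),
    StageVisit("INC-2", "degradation", "triage", 20, 5, False),
    StageVisit("INC-2", "degradation", "diagnose", 120, 30, True),
    StageVisit("INC-2", "degradation", "resolve", 60, 25, True),
]

print(lead_times(visits))
print(cycle_times(visits))
print(percent_complete_and_accurate(visits))
print(value_added_ratio(visits))
print(work_profile(visits))
```

In this made-up data, only 40% of elapsed time is hands-on work and triage is complete and accurate just half the time, which is exactly the kind of signal a value stream map is meant to surface.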

I believe it could qualify as either a core (direct paying customer) or supporting (internal indirect paying customer, supporting core) value stream. Let’s look at two examples to test the idea:

Core: Incident management as a service, managed infrastructure, professional services

A SaaS vendor supplying an incident management product that automates the vast majority of an incident response capability and facilitates the rest.
As a CISO, I pay a provider to supply incident management capabilities to my product teams.

Supporting: Internal support for core infrastructure, supporting product or other dependent teams

As a product owner, I rely on our internal incident management team to facilitate our incident response by supplying tooling, observability, guidance, and helping us run effective postmortems.
An internal platform team can rely on an incident management team for help resolving incidents and for strengthening its own internal response capability.

Incident management capability support and methodology can be provided as a product that can be consumed by any core or supporting team as its own supporting stream.

We could debate whether incident response is a value stream or merely a subprocess, but surely incident management would produce superior results by following an ‘as-a-service’ or even productized model.

It fits the definition of a value stream, it looks like a value stream, behaves like a value stream, can be measured like a value stream, can be treated as a value stream, and can be improved like a value stream. So, we’re settled, yeah?

Not so fast. Let’s see how it fits some of the largest frameworks:

ITIL4: A series of steps an organization undertakes to create and deliver products and services to consumers.

ITIL focuses on ‘consumers’ here, not merely external customers. I believe it treats everything that doesn’t reach consumers as a process, which it defines as “A set of interrelated or interacting activities that transform inputs into outputs. Processes define the sequence of activities and their dependencies.”

iSixSigma: The value stream can be defined as the set of activities that occur to add value to your customer from the initiating step to the final realization of value by your customer. The value stream can begin as early as the development of the concept, run through various stages of development, and end with delivery and ongoing support. A value stream always begins and ends with your customer.

iSixSigma again only mentions external customers.

SAFe: Value Streams represent the series of steps that an organization uses to implement Solutions that provide a continuous flow of value to a customer. A SAFe portfolio contains one or more value streams, each of which is dedicated to building and supporting a set of solutions, which are the products, services, or systems delivered to the Customer, whether internal or external to the enterprise.

SAFe is the latest entrant to the game here, and I think their definition captures a more complete representation of the potential of value streams by mentioning internal or external customers. “... steps … to implement Solutions that provide a continuous flow of value …” seems to support incident management as a value stream, yet a better fit would be ‘support or provide’.

I think that applying a value stream model only to your customer-facing business misses a big opportunity to leverage thinking, practices, measurements, and tools that drive superior customer outcomes - from the inside, out.

Depending on what lenses you’re using to look at the world, definitions are debatable, but my advice to you is that if a model provides value, don’t hesitate to use it. Just make sure you test it often to ensure it’s still the most useful one you could be using.


Steve Pereira

Steve is obsessed with making tech human, and leveraging it to deliver continuous value. For the past 20 years, his focus has been guiding ambitious and struggling teams towards their true north. He's a former startup CTO, agency consultant, systems and release engineer, finance IT manager, tech support phone jockey, and pizza maker. All focused on the flow of value, all the time.
