SRE Config-Induced Failures: The Incident That Starts With "Nothing Changed"

Jun 24, 2026


“We didn't deploy anything.” It's one of the most common things an on-call SRE says at the start of an incident, and one of the most misleading. No code may have shipped, but something almost certainly changed.

Config-induced failure (FM-10) accounted for 9% of classified unplanned incidents in 2025. It is structurally similar to deploy-induced regression (FM-09) but harder to detect: the change management trail is weaker, and modern IaC tooling has a dramatically larger blast radius when it goes wrong.

Full data: stackgen.com/state-of-reliability-2026.

What Is Config-Induced Failure in SRE?

A config-induced failure is any incident triggered by a non-code change: a configuration value, feature flag, environment variable, IAM policy, ACL, network rule, DNS record, quota adjustment, or capacity policy update. It differs from FM-09 in one key way: the change is often not recorded in the same audit trail as code deploys.

The Two High-Profile Cases

Cloudflare — November 18, 2025

A ClickHouse permissions update caused a Bot Management feature file to double in size. When the main Cloudflare proxy received the updated config, it crashed, affecting 56 downstream companies in the SSOR dataset. The change looked innocuous: a database permissions update had an unexpected effect on file size, which in turn had an unexpected effect on proxy behavior. Three hops, each fine in isolation.
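One way to catch a chain like this is to validate a generated config artifact against its previous version before it is propagated to consumers. The sketch below is a minimal illustration under stated assumptions, not a description of Cloudflare's actual pipeline: the function name, the 1.5x growth ratio, and the 200-entry consumer limit are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ArtifactCheck:
    ok: bool
    reason: str


def check_generated_artifact(
    previous_entries: list[str],
    new_entries: list[str],
    max_growth_ratio: float = 1.5,    # hypothetical threshold
    consumer_entry_limit: int = 200,  # hypothetical hard limit in the consumer
) -> ArtifactCheck:
    """Reject a generated config file that the consumer may not load safely."""
    new_count = len(new_entries)
    # The consumer's own hard limit comes first: exceeding it is never safe.
    if new_count > consumer_entry_limit:
        return ArtifactCheck(
            False, f"{new_count} entries exceeds consumer limit {consumer_entry_limit}"
        )
    # Duplicate rows are a common symptom of an upstream query returning more than expected.
    if len(set(new_entries)) != new_count:
        return ArtifactCheck(False, "duplicate entries in generated artifact")
    # A sudden jump in size is suspicious even if every entry is unique.
    if previous_entries and new_count > max_growth_ratio * len(previous_entries):
        return ArtifactCheck(
            False, f"artifact grew from {len(previous_entries)} to {new_count} entries"
        )
    return ArtifactCheck(True, "ok")


previous = [f"feature_{i}" for i in range(60)]
doubled = previous + previous  # the same rows emitted twice
print(check_generated_artifact(previous, doubled))
```

Here the doubled file is rejected by the duplicate check before it ever reaches the consumer. The point of the design is that the check runs at the hop where the artifact is produced, so a surprising upstream change stops there instead of surfacing two hops later.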

AWS us-east-1 — October 20, 2025

An infrastructure automation process writing a config value hit a race condition that produced an empty DNS record. The trigger was an automated config write, not a human clicking in the console. This is the IaC-era shape of FM-10: lower frequency than manual config changes, but dramatically higher blast radius.
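Two generic guards are relevant to this failure shape: rejecting writes from a stale plan, and refusing to publish an empty record set. The following is a minimal sketch of both, assuming a hypothetical in-memory `RecordStore` with a monotonically increasing plan generation. It does not describe how AWS's internal DNS automation works.

```python
import threading
from dataclasses import dataclass, field


class StalePlanError(Exception):
    """Raised when a writer tries to apply a plan older than what is live."""


class EmptyRecordError(Exception):
    """Raised when a plan would leave the record with no addresses."""


@dataclass
class RecordStore:
    """Hypothetical stand-in for a DNS record backend."""

    addresses: tuple[str, ...] = ()
    generation: int = 0
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def apply_plan(self, plan_generation: int, addresses: tuple[str, ...]) -> None:
        # The check and the write happen under one lock, so two writers
        # cannot interleave between "is my plan newest?" and "write it".
        with self._lock:
            if plan_generation <= self.generation:
                raise StalePlanError(
                    f"plan {plan_generation} is not newer than live {self.generation}"
                )
            if not addresses:
                raise EmptyRecordError("refusing to publish an empty record set")
            self.addresses = addresses
            self.generation = plan_generation


store = RecordStore()
store.apply_plan(2, ("10.0.0.2",))
try:
    store.apply_plan(1, ("10.0.0.1",))  # a delayed writer with an older plan
except StalePlanError as err:
    print("rejected:", err)
```

The delayed writer is rejected and the live record keeps the addresses from generation 2. In a real system the same compare-and-set would have to be enforced by the backing store itself, not by each writer.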

The IaC Paradox

Before IaC: Config changes were frequent, manual, often undocumented. High frequency, low blast radius, weak audit trail.

With IaC: Config changes are less frequent, version-controlled, and reviewable. Lower frequency, much higher blast radius: a single Terraform apply can modify dozens of security groups, IAM policies, DNS records, and routing rules in one run.

The answer isn't less IaC. It's more rigorous change review and blast-radius-aware deployment strategy for IaC changes.
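A blast-radius-aware gate can start as a simple count of what a plan would touch. The sketch below reads the JSON produced by `terraform show -json <planfile>` and uses its `resource_changes` list. The thresholds, the `SENSITIVE_TYPES` set, and both function names are assumptions chosen for illustration, not recommended values.

```python
import json
from collections import Counter

# Hypothetical list of resource types that deserve extra scrutiny.
SENSITIVE_TYPES = {
    "aws_security_group",
    "aws_iam_policy",
    "aws_route53_record",
    "aws_route",
}


def summarize_plan(plan_json: str) -> Counter:
    """Count resources per type that the plan would create, update, or delete."""
    plan = json.loads(plan_json)
    counts: Counter = Counter()
    for rc in plan.get("resource_changes", []):
        actions = set(rc["change"]["actions"])
        # "no-op" and "read" entries do not modify infrastructure.
        if actions <= {"no-op", "read"}:
            continue
        counts[rc["type"]] += 1
    return counts


def needs_staged_rollout(
    counts: Counter,
    max_total: int = 20,     # hypothetical threshold
    max_sensitive: int = 3,  # hypothetical threshold
) -> bool:
    """Flag plans that touch too many resources, or too many sensitive ones."""
    total = sum(counts.values())
    sensitive = sum(n for t, n in counts.items() if t in SENSITIVE_TYPES)
    return total > max_total or sensitive > max_sensitive


example_plan = json.dumps({
    "resource_changes": [
        {"address": f"aws_iam_policy.p{i}", "type": "aws_iam_policy",
         "change": {"actions": ["update"]}}
        for i in range(5)
    ]
})
print(needs_staged_rollout(summarize_plan(example_plan)))
```

A plan flagged this way could be routed to extra review or split into smaller applies. Counting is a crude proxy for blast radius, but it is cheap to run in CI on every plan.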

Why “Nothing Changed” Is Almost Never True

Common sources of change that never appear in a deploy log:

Automated config writes: infrastructure automation and self-healing systems write config values constantly
Third-party vendor changes: a vendor updated their API behavior, changed a default, or deprecated an endpoint
Certificate expirations: a certificate is config with a built-in validity window, so it effectively changes when it expires even though nobody touched it
Quota / limit adjustments: cloud provider changes that don't show up in your deployment tooling

Key Takeaways

Config-induced failures were 9% of classified unplanned incidents in 2025, and they are systematically harder to detect than deploy regressions because the change trail is fragmented
Cloudflare Nov 2025 and AWS Oct 2025 are the clearest high-impact FM-10 examples: both trace to config changes with unexpected cascading consequences
IaC expands blast radius: the same rigorous rollout discipline you apply to code deploys should apply to IaC changes
The highest-leverage investment: change-data integration, meaning all config change signals surface in the same telemetry stream as your alerts and metrics
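As a rough illustration of change-data integration, the sketch below normalizes change events from several sources into one shape and answers the on-call question "what changed shortly before this alert?". The `ChangeEvent` fields, the source labels, and the two-hour lookback are hypothetical choices, not a description of any specific product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ChangeEvent:
    at: datetime
    source: str   # e.g. "deploy", "terraform", "feature-flag", "cert", "vendor"
    summary: str


def changes_before(
    alert_at: datetime,
    events: list[ChangeEvent],
    lookback: timedelta = timedelta(hours=2),  # hypothetical window
) -> list[ChangeEvent]:
    """Return changes inside the lookback window, most recent first."""
    window_start = alert_at - lookback
    hits = [e for e in events if window_start <= e.at <= alert_at]
    return sorted(hits, key=lambda e: e.at, reverse=True)


alert = datetime(2025, 11, 18, 12, 0, tzinfo=timezone.utc)
events = [
    ChangeEvent(alert - timedelta(days=3), "deploy", "api v142 released"),
    ChangeEvent(alert - timedelta(minutes=40), "terraform", "security group rules updated"),
    ChangeEvent(alert - timedelta(minutes=5), "feature-flag", "new-checkout enabled at 100%"),
]
for event in changes_before(alert, events):
    print(event.source, "-", event.summary)
```

With every source feeding the same list, "we didn't deploy anything" stops being the end of the search: the last deploy falls outside the window, while the flag flip and the Terraform change are listed first.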



About StackGen:

StackGen is the pioneer in Autonomous Infrastructure Platform (AIP) technology, helping enterprises transition from manual Infrastructure-as-Code (IaC) management to fully autonomous operations. Founded by infrastructure automation experts and headquartered in the San Francisco Bay Area, StackGen serves leading companies across technology, financial services, manufacturing, and entertainment industries.
