If you want to find the most common cause of service incidents, look at what deployed 30 minutes ago.
Deploy-induced regression (FM-09) is the second most frequent failure mode in the SSOR 2026 dataset at 19% of classified unplanned incidents. It's also the most fixable at scale: the detection pattern is consistent, the remediation is well-understood, and the autonomy potential for AI SRE tooling is the highest of any failure mode in the taxonomy.
We analyzed 178,000+ status page incidents and 1,037 engineering post-mortems. Full data: stackgen.com/state-of-reliability.
FM-09 has a three-step fingerprint that is entirely automatable from telemetry:
No business-logic judgment required.
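One telemetry-only check in the spirit of the opening heuristic (look at what deployed 30 minutes ago) can be sketched as follows. This is a minimal illustration, not the report's method: the `Deploy` record, the `deploys_before` helper, and the 30-minute default window are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Deploy:
    # Hypothetical deploy record; field names are assumptions for this sketch.
    service: str
    version: str
    finished_at: datetime


def deploys_before(incident_start, deploys, window=timedelta(minutes=30)):
    """Return deploys that finished within `window` before the incident, newest first."""
    candidates = [
        d for d in deploys
        if timedelta(0) <= incident_start - d.finished_at <= window
    ]
    return sorted(candidates, key=lambda d: d.finished_at, reverse=True)


incident = datetime(2026, 1, 15, 12, 0)
history = [
    Deploy("api", "v41", datetime(2026, 1, 15, 9, 0)),    # 3 hours earlier: outside window
    Deploy("api", "v42", datetime(2026, 1, 15, 11, 40)),  # 20 minutes earlier
    Deploy("web", "v7", datetime(2026, 1, 15, 11, 55)),   # 5 minutes earlier
]
suspects = deploys_before(incident, history)
print([f"{d.service}@{d.version}" for d in suspects])  # ['web@v7', 'api@v42']
```

The function needs only timestamps, which is why a check like this can run without any knowledge of what the service does.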
Anthropic: 73% of classified incidents map to application errors on specific model versions, such as “Elevated errors on Claude Opus 4.6” and “Increased errors on Sonnet 4.6.” Each one is a deploy-correlated regression on a named model, resolved by routing back to the prior version.
OpenAI: 49% of classified incidents trace to application errors with a growing MTTR trend (2023–2026 median +93%), a sign of workload complexity outpacing rollback discipline.
Deploy regression MTTR ranges from under 15 minutes to over 8 hours for structurally similar incidents. The variance comes down to three things:
Model deploy regressions (FM-17) follow the same operational mechanics as code deploy regressions (FM-09). Canary discipline applies to model deployments too: evaluate on a traffic sample before full rollout. Teams that treat model version upgrades with the same discipline as application deploys show lower FM-17 rates.
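A canary gate of the kind described above might look like the sketch below. It compares the error rate on a sampled slice of traffic with the current baseline before allowing full rollout. The function name `canary_passes` and every threshold (`max_ratio`, `abs_floor`, `min_requests`) are hypothetical values chosen for illustration, not figures from the dataset.

```python
def canary_passes(baseline_errors, baseline_requests,
                  canary_errors, canary_requests,
                  max_ratio=1.5, abs_floor=0.001, min_requests=500):
    """Decide whether a canary (code or model version) may proceed to full rollout.

    All thresholds are illustrative assumptions; tune them per service.
    """
    # Too little canary traffic to judge: hold the rollout rather than guess.
    if canary_requests < min_requests:
        return False

    baseline_rate = baseline_errors / baseline_requests if baseline_requests else 0.0
    canary_rate = canary_errors / canary_requests

    # Allow a small absolute floor so a near-zero baseline does not block every rollout.
    allowed_rate = max(baseline_rate * max_ratio, abs_floor)
    return canary_rate <= allowed_rate


# Baseline at 0.5% errors; canary at 0.6% is within 1.5x, canary at 2% is not.
print(canary_passes(50, 10_000, 6, 1_000))   # True
print(canary_passes(50, 10_000, 20, 1_000))  # False
```

The same gate applies whether the canary is a new application build or a new model version, which is the point of treating FM-17 with FM-09 discipline.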
The single biggest MTTR improvement: moving from “debug then fix forward” to “rollback first, debug later.” In the post-mortem corpus, debugging first adds an average of 90+ minutes to FM-09 MTTR in cases where rollback was ultimately the resolution anyway.
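The rollback-first policy can be written down as a small decision rule. This is a sketch under stated assumptions: the `first_action` helper is hypothetical, and the 30-minute window is borrowed from the opening heuristic rather than taken from the post-mortem data.

```python
from enum import Enum


class Action(Enum):
    ROLLBACK = "rollback"
    DEBUG = "debug"


def first_action(minutes_since_last_deploy, rollback_available, window_minutes=30):
    """Pick the first response step for a new incident.

    Roll back before debugging when a deploy landed inside the window and a
    rollback path exists; otherwise start with investigation.
    `window_minutes` is an assumed default, not a measured threshold.
    """
    recent_deploy = (
        minutes_since_last_deploy is not None
        and minutes_since_last_deploy <= window_minutes
    )
    if recent_deploy and rollback_available:
        return Action.ROLLBACK
    return Action.DEBUG


print(first_action(12, rollback_available=True))   # Action.ROLLBACK
print(first_action(240, rollback_available=True))  # Action.DEBUG
```

Encoding the rule ahead of time removes the on-call judgment call that otherwise delays the rollback while debugging proceeds.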