Your monitoring is green. Your code hasn't changed. Your infrastructure looks fine. And yet your status page is lighting up.
Welcome to the cross-org cascade (FM-01), the most common single failure mode in the StackGen State of Reliability 2026 dataset, at 22% of all classified unplanned incidents across 360+ online services in 2025. It's not your bug. It's your dependency.
We analyzed 178,000+ status page incidents and 1,037 engineering post-mortems to understand how cascades work, how fast they spread, and what separates teams that resolve them in 30 minutes from those that spend four hours investigating the wrong service. Full data: stackgen.com/state-of-reliability.
In a cross-org cascade, a failure inside a shared infrastructure provider propagates to downstream operators. The downstream operator's own systems are not the cause. The primary lever is recognition speed: the faster your team identifies that this is a cascade, the faster you shift to the right posture: communicate, degrade gracefully, and wait.
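The recognition step can be sketched as a small triage heuristic. This is a minimal illustration, not a method from the report: the `IncidentSignals` fields, the `triage` function, and the posture labels are all hypothetical names, and the rule itself (upstream degraded, no recent deploy, internal checks passing) is an assumption drawn from the symptoms described above.

```python
from dataclasses import dataclass, field


@dataclass
class IncidentSignals:
    """Hypothetical inputs an on-call engineer could gather in the first minutes."""
    recent_deploy: bool          # did we ship code or config shortly before the alert?
    internal_checks_green: bool  # do our own health checks still pass?
    degraded_upstreams: list[str] = field(default_factory=list)  # providers reporting incidents


def triage(signals: IncidentSignals) -> str:
    """Suggest an initial posture; the labels and rules are illustrative assumptions."""
    if not signals.degraded_upstreams:
        return "investigate-internal"
    if signals.internal_checks_green and not signals.recent_deploy:
        # Matches the pattern above: green monitoring, unchanged code, upstream trouble.
        return "likely-cascade"
    # An upstream is degraded, but an internal cause cannot be ruled out yet.
    return "possible-cascade"


if __name__ == "__main__":
    signals = IncidentSignals(
        recent_deploy=False,
        internal_checks_green=True,
        degraded_upstreams=["AWS us-east-1"],
    )
    print(triage(signals))  # likely-cascade
```

A `likely-cascade` result would map to the posture described above: communicate, degrade gracefully, and wait. A `possible-cascade` result suggests checking internal causes in parallel rather than ruling them out one by one first.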
In 2025, FM-01 accounted for 4,516 incidents. Median recovery time was 309 minutes, roughly 3.2x that of an internally caused config failure (~97 minutes). The gap isn't because cascades are technically harder to fix. It's because recognition is slow.
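The arithmetic behind those figures is easy to check. The snippet below uses only the numbers quoted in this article. The implied total is an approximation at best, since the 22% share is a rounded figure.

```python
cascade_median_min = 309  # median recovery for FM-01 cascades, from the text
config_median_min = 97    # approximate median for internally caused config failures

ratio = cascade_median_min / config_median_min
extra_minutes = cascade_median_min - config_median_min

print(f"ratio: {ratio:.1f}x")               # ratio: 3.2x
print(f"extra time: {extra_minutes} min")   # extra time: 212 min

fm01_incidents = 4516
fm01_share = 0.22  # rounded share of classified unplanned incidents
implied_total = fm01_incidents / fm01_share
print(f"implied classified incidents: ~{implied_total:,.0f}")
```

The 212-minute difference is the part of the median incident that faster recognition could plausibly shorten.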
A DynamoDB DNS automation race condition created an empty DNS record. Within 24 hours, 137 downstream companies posted status page incidents, making it the largest single-event cascade in the dataset. Most incident titles named “AWS us-east-1” verbatim. Recovery ranged from under an hour (multi-region failover) to 8+ hours (teams that first spent time ruling out internal causes).
A faulty content configuration update to the CrowdStrike Falcon sensor caused Windows BSODs. Airlines, banks, hospitals, and government agencies all posted separate status page incidents tracing to the same third-party update. It is the clearest illustration of how a security tooling dependency creates cascade exposure at cloud-infrastructure scale.
A networking misconfiguration in Azure's European infrastructure cascaded across dozens of fintech, SaaS, and communications operators using Azure Front Door, most within the same 2-hour window.
| Upstream Type | Examples | Why It Cascades |
| --- | --- | --- |
| Cloud | AWS, GCP, Azure | Broad dependency across every service tier |
| CDN | Cloudflare, Fastly | Traffic routing; outage hits end users directly |
| Identity | Okta, CrowdStrike | Auth-wall; outage blocks user access entirely |
| AI Provider | OpenAI, Anthropic | Growing as AI APIs embed in product features |
| Dev-Tooling | GitHub, Docker Hub | Deployment pipeline; blocks releases |
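One way to put this table to work is a dependency inventory that maps each product feature to the upstream providers it relies on, so on-call can see the likely blast radius when a provider posts an incident. The sketch below is illustrative only: the provider categories come from the table, while the feature names and the `FEATURE_DEPENDENCIES` mapping are hypothetical.

```python
# Provider categories, taken from the table above.
UPSTREAM_TYPES = {
    "Cloud": ["AWS", "GCP", "Azure"],
    "CDN": ["Cloudflare", "Fastly"],
    "Identity": ["Okta", "CrowdStrike"],
    "AI Provider": ["OpenAI", "Anthropic"],
    "Dev-Tooling": ["GitHub", "Docker Hub"],
}

# Hypothetical example: which providers each of our features depends on.
FEATURE_DEPENDENCIES = {
    "login": ["Okta", "AWS"],
    "checkout": ["AWS", "Cloudflare"],
    "ai-summary": ["OpenAI"],
    "deploy-pipeline": ["GitHub", "Docker Hub"],
}


def upstream_type(provider):
    """Return the table category for a provider, or None if it is not listed."""
    for kind, providers in UPSTREAM_TYPES.items():
        if provider in providers:
            return kind
    return None


def affected_features(provider):
    """Return the features (sorted by name) that depend on the given provider."""
    return sorted(
        feature
        for feature, deps in FEATURE_DEPENDENCIES.items()
        if provider in deps
    )


if __name__ == "__main__":
    print(upstream_type("Cloudflare"), affected_features("Cloudflare"))  # CDN ['checkout']
    print(upstream_type("AWS"), affected_features("AWS"))  # Cloud ['checkout', 'login']
```

Keeping an inventory like this current means the question "which of our features does this provider touch?" has an answer before the incident starts.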