State of Reliability 2026
What 174,000 incidents reveal about how online services break
The largest public analysis of online-service reliability ever assembled. We studied 174,000 real incidents across 364 companies — drawn directly from public status pages, not surveys or self-reported data — to answer one question: is online reliability getting better or worse? Download the full report for the complete findings, methodology, and industry benchmarks.
- The most common failure modes, ranked — and why cross-company cascades are now the single biggest category of incident
- The root causes behind the outages — and why third-party dependencies, not your own code, top the list
- How AI-related incidents grew 6x in three years, now spanning failed AI services, model-quality issues, and autonomous agents taking destructive actions
- How recovery time varies by service type: apps ~1.7 hrs, infrastructure 3–4 hrs, AI providers under 1 hr
- The six incident archetypes — find out which pattern your team matches, and why it predicts your reliability better than your industry does
21%
of incidents are now caused by a provider you don't control
Cross-company cascade is the single largest failure pattern, at ~21% of all disclosed incidents — and it takes 309 minutes to resolve, 3.2x longer than a failure you caused yourself and can simply roll back.
5%
of all incidents are now AI-related, up 6x in three years
AI-related incidents climbed from 1% to 5% of disclosed incidents between 2023 and 2026, spanning failed AI services, model-quality issues, and autonomous agents taking destructive action.
Findings by Role
52%
of classified incidents are cross-company cascades, the most common pattern SREs face
74%
of the time, dependency-heavy teams' only fix is to wait for an upstream provider to recover
21%
of all incidents trace to an upstream provider you don't control
14%
of all incident fixes are simply "wait for upstream" — the single most common remediation
5%
of all incidents are now AI-related, up 6x in three years
61%
of teams stay in the same incident archetype year over year: reliability patterns are predictable and improvable
Next Steps
Get the complete State of Reliability 2026 report — 174,000 incidents across 360+ companies, the six failure archetypes, and the architectural playbook for cutting your slowest incidents.