Fundamentals of Reliability Engineering


Diagram showing SLOs, error budgets, and reliability targets working together

Understand reliability engineering fundamentals: how to define service level objectives (SLOs) and error budgets, design reliable systems, balance reliability with innovation, and make data-driven reliability decisions.
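
As a rough illustration of how an SLO turns into an error budget, here is a minimal sketch. It assumes an availability-style SLO expressed as a fraction (for example 0.999) and a 30-day window. The function names and the request-based variant are hypothetical, chosen for illustration rather than taken from the article.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window.

    slo_target is a fraction such as 0.999 (assumed convention, not a percent).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of a request-based error budget still unspent.

    A negative result means the budget is overspent.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    allowed_failures = (1.0 - slo_target) * total_requests
    return 1.0 - failed_requests / allowed_failures


if __name__ == "__main__":
    # A 99.9% target over 30 days leaves roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # 250 failures out of 1,000,000 requests spends about a quarter of the budget.
    print(round(budget_remaining(0.999, 1_000_000, 250), 2))
```

A team might compare the remaining fraction against a threshold to decide whether to keep shipping features or pause for reliability work. Where that threshold sits is a policy choice, not something this sketch prescribes.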

Fundamentals of Incident Management


Diagram showing incident lifecycle from detection through resolution with runbooks, alerts, and automation

Understand incident management fundamentals: how to respond effectively when systems fail, build runbooks that work, create actionable alerts, and prevent incidents before they happen.