Building reliable systems requires a shift in mindset and a focus on scalability, automation, and resilience. This course introduces participants to the core practices of Site Reliability Engineering (SRE), a discipline that blends software engineering with IT operations. Participants will explore service-level objectives, error budgets, and modern operational strategies.

Learning Outcomes:

  • Understand the principles and origins of Site Reliability Engineering

  • Describe the relationship between reliability and system performance

  • Apply SRE tools and techniques such as SLIs, SLOs, and error budgets

  • Analyse incidents and implement learning through blameless postmortems

Key Topics:

  • History and principles of SRE

  • Monitoring, observability, and incident response

  • Automation and toil reduction strategies

  • Service-level indicators and error budget policies

 

Exam Details

This course is designed to build participants’ understanding of key concepts and practices covered in the DevOps Institute (DOI) SRE Foundation certification.

The official SRE Foundation certification exam is included in the course fee. Participants will explore real-world case studies and engage with topics such as error budgets, service-level objectives (SLOs), service-level indicators (SLIs), monitoring strategies, toil reduction, and observability, all aligned with the SRE Foundation exam content.

The course has been developed by referencing key SRE sources and contributions from industry thought leaders and organisations actively adopting SRE practices.

To maximise success, participants are strongly encouraged to complement the course with additional self-study, revision of course materials, and dedicated practice before attempting the exam.

Module 1:  SRE Principles and Practices

  • What is Site Reliability Engineering?
  • SRE and DevOps: What is the Difference?
  • SRE Principles and Practices

Module 2:  Service Level Objectives and Error Budgets

  • Service Level Objectives
  • Error Budgets
  • Error Budget Policies

Module 3:  Reducing Toil

  • What is Toil?
  • Why Toil is Bad
  • Doing Something About Toil

Module 4:  Monitoring and Service Level Indicators

  • SLIs: Service Level Indicators
  • Monitoring
  • Observability

Module 5:  SRE Tools and Automation

  • Automation Defined
  • Automation Focus
  • Hierarchy of Automation Types
  • Secure Automation
  • Automation Tools

Module 6:  Antifragility and Learning from Failure

  • Why Learn from Failure
  • Benefits of Antifragility
  • Shifting the Organisational Balance

Module 7:  Organisational Impact of SRE

  • Why Organisations Embrace SRE
  • Patterns for SRE Adoption
  • SRE Job Description
  • Sustainable Incident Response
  • Blameless Postmortems
  • SRE and Scale

Module 8:  SRE, Other Frameworks, Trends

  • SRE and Other Frameworks
  • SRE Evolution

*Important Note: Fees are subject to Singapore's prevailing Goods and Services Tax (GST).