Introduction to SRE and Essential Tools

Skillsoft-issued completion badges are earned by viewing the required percentage of the course or by receiving a passing score when an assessment is required.

Site reliability engineering (SRE) is based on a set of principles and practices used to monitor and observe software reliability in a production environment. In this course, you will dive into the fundamentals of SRE and its evolution over the years. Next, you will examine the site reliability engineering role and learn how to find, place, bootstrap, and distribute site reliability engineers. You will discover the SRE principles that organizations should strive for, key SRE metrics, the importance of error budgeting, and the essential tools used in SRE. Then you will compare and contrast SRE with traditional IT operations, explore the SRE lifecycle from planning to operation, and investigate the process of incident response and postmortem analysis. Finally, you will focus on the cultural impacts of SRE within an organization, set up and configure a basic monitoring tool, and create a simple dashboard using Grafana.

Issued on

July 4, 2024

Expires on

Does not expire