
SRE resources

28 April 2020


I’ve started to assemble some resources on the topic of Site Reliability Engineering to pass on to colleagues and friends who want to dive into the topic and need good starting points.

Beyond SRE-specific resources, it makes sense to have a common understanding of how infrastructure is run in the modern age. And to be honest: if you think it is simply about running infrastructure, nay, think again. We’ve come a long way since the early 2000s. While not about SRE specifically, I’d recommend giving The Unicorn Project by Gene Kim a whirl. Packaged as an engaging novel, it passes lots of modern paradigms on to the reader.

Disclaimer: the list of resources is not exhaustive and should simply offer a head start into the topic.

Reading material

Books

One could refer to this book as the bible of site reliability engineering. It is the foundational work that is referred to in a lot of places:

Site Reliability Engineering

The subtitle “How Google Runs Production Systems” clearly states what it is about. It is fairly dense. I found it best read alongside the book Seeking SRE.

Seeking SRE is made up of chapters written by different authors, each of them from the industry and with substantial SRE experience. It ranges from the story of SRE at Spotify and SoundCloud to a look at SRE anti-patterns.

Blog Posts

The Google Cloud blog published an article explaining the SRE fundamentals: SLIs, SLAs and SLOs. On the subject of defining SLOs and the pitfalls associated with doing so, Femi Agbabiaka wrote a nice post: SLO pitfalls

Audio material - Podcast episodes

There are many podcast episodes on this subject out there.

The New Stack Makers has a piece on The evolution of the Site Reliability Engineer.

The Cloud Cast

The Cloud Cast actually has quite a few good episodes that touch on the topic.

Software Engineering Radio

Björn Rabenstein on Site Reliability Engineering

Google Cloud Platform Podcast

Last, but certainly not least, the Kubernetes Podcast from Google has an episode titled: SRE, with Tina Zhang and Fred van den Driessche