Site Reliability Engineering Principles and How-To

Before You Start

SRE

  • Embrace and control risk. If you are so afraid of risk that you reject every change, you end up in the worst case anyway: in the future, your service won't be able to handle new business requirements.
  • Simplicity. Simplicity is really important. A complex design causes more trouble, so keep the design as simple as you can.
  • Security. An SRE should keep security in mind in every design. Otherwise, a security leak will cause huge problems.
  • Automation. If your service is changed through manual steps, that is a high-risk problem. SREs should keep implementing automation.
  • Visibility. Before your service is released, the SRE should have all of the telemetry in place. If you only build telemetry once the product is already online, you cannot measure your product's metrics from the start.
  • Reduction. Sometimes you implement something new for your service and skip some things because of the release schedule. I suggest regularly reviewing what you built so that you can reduce legacy.
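
One common way to make "embrace and control risk" concrete is an error budget: you agree on a reliability target, and the gap between that target and 100% is the amount of failure you are allowed to spend on changes. The sketch below is only an illustration of that idea. The function names, the 99.9% target, and the request counts are assumptions made up for the example, not something taken from a specific tool.

```python
def error_budget(slo_target: float, total_requests: int) -> int:
    """Failed requests allowed in the window for a given SLO target.

    slo_target is a fraction such as 0.999 (an assumed example value).
    """
    return round((1.0 - slo_target) * total_requests)


def budget_remaining(slo_target: float, total_requests: int,
                     failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget(slo_target, total_requests)
    if budget == 0:
        # A 100% target leaves no room for any failure.
        return 0.0
    return (budget - failed_requests) / budget


def can_release(slo_target: float, total_requests: int,
                failed_requests: int) -> bool:
    """Hypothetical release gate: allow risky changes only while budget is left."""
    return budget_remaining(slo_target, total_requests, failed_requests) > 0


if __name__ == "__main__":
    # Assumed numbers: 1,000,000 requests in the window, 250 of them failed.
    print(error_budget(0.999, 1_000_000))
    print(budget_remaining(0.999, 1_000_000, 250))
    print(can_release(0.999, 1_000_000, 250))
```

With a gate like this, the team keeps shipping changes while budget is left and slows down once it is spent, instead of rejecting every change out of fear.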

How to

Plan

Build

Continuous Integration

Deployment

Operation

Continuous Feedback
