Site Reliability (SRE) keeps the PollEverywhere.com production environment stable for customers. The team also makes sure Poll Everywhere engineers have the tools they need to provision arbitrary environments for load, staging, and usability testing.

Responsibilities

Metrics

Metric                                     Requirement
Uptime                                     99.99%
Commit → CI → Staging → Production         < 60 min
Provision new environment                  < 30 min
Disaster recovery from only DB-snapshot    < 6 hours
Incident response time                     < 15 min
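
To make the uptime requirement concrete, it helps to convert the percentage into a downtime budget. The sketch below is only an illustration: the function name `downtime_budget` and the choice of a 30-day and a 365-day window are assumptions, since the table does not say over which period uptime is measured.

```python
from datetime import timedelta


def downtime_budget(uptime_percent: float, window: timedelta) -> timedelta:
    """Return the maximum downtime allowed within `window` at the given uptime target.

    Hypothetical helper for illustration; the measurement window is an assumption.
    """
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    # Fraction of the window that may be spent down, e.g. 0.0001 for 99.99%.
    allowed_fraction = (100 - uptime_percent) / 100
    return timedelta(seconds=window.total_seconds() * allowed_fraction)


if __name__ == "__main__":
    monthly = downtime_budget(99.99, timedelta(days=30))
    yearly = downtime_budget(99.99, timedelta(days=365))
    # 99.99% leaves roughly 4.3 minutes per 30 days, or about 52.6 minutes per year.
    print(f"30-day budget:  {monthly.total_seconds() / 60:.2f} min")
    print(f"365-day budget: {yearly.total_seconds() / 60:.2f} min")
```

Under these assumptions, a single incident that takes the full 15 minutes just to get a response would already exceed a 30-day budget, which is why the uptime and incident response targets are worth reading together.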