Site Reliability Engineer
You are passionate about uptime. You can't sleep when you don't understand the technical underpinnings of a site anomaly. You are comfortable in the codebase and in the infrastructure. Logs tell you a story.
If these sentiments resonate with you, we would love to talk.
We are looking to hire someone who will:
- be responsible for the uptime of our sites.
- review our overall architecture and make changes in the code and in the infrastructure to improve the reliability and performance of our sites.
- tune our alerting to ensure that we know when there are anomalies while keeping a high signal-to-noise ratio.
- curate our monitoring so that we have actionable forensics when there are anomalies.
This is a senior role.
Requirements
- Expertise building scalable, reliable distributed systems, both from the coding side and the infrastructure side
- Expert knowledge of server-side languages and data stores (our system is built with Go and PostgreSQL)
- Strong knowledge of cloud infrastructure
Nice to have
- Experience with message-based, loosely coupled architectures (we use RabbitMQ and Redis)
- Knowledge of e-commerce platforms, like Magento, Shopify, Demandware, or others
If this is you, submit a cover letter and resume to apply!