Senior Site Reliability Engineer at Braze
WHO WE ARE
Braze delivers customer experiences across email, mobile, SMS, and web. Customers, including Seamless, HBO, Disney, Urban Outfitters, and Venmo, use the Braze platform to facilitate real-time experiences between brands and consumers in a more authentic and human way. And we do it at scale – each month, tens of billions of messages are sent to a network of over 2 billion active users through Braze.
Need more proof? Braze was named a Leader in the Gartner Magic Quadrant for Mobile Marketing Platforms in 2019. The company has also been named on the Forbes Cloud 100, Inc. Magazine’s 2019 Best Places to Work, and Crain's 2019 Best Places to Work in NYC lists.
WHAT YOU'LL DO
Braze is at an inflection point in our maturity, where a key focus of our engineering work is on Scalability, Observability, and Reusability. The mission of the SRE team is to increase confidence in changes to the Braze production environment with a focus on performance and uptime for each service at Braze.
The Site Reliability Engineering (SRE) team at Braze provides guidance, expertise, mentorship, and education to the entire Engineering organization on how to build, test, monitor, and deploy massively scalable applications. The team is the center of excellence for modern operational best practices such as incident management, postmortems, and technical debt management, and it champions a culture of clean, reusable, and scalable code. SREs are aligned with product engineering teams, understand how their teams' services function, and can work directly in the codebase.
- Lead and mentor junior engineers in SRE best practices, software engineering, and agile project leadership.
- Solve live performance and reliability issues and prevent their recurrence.
- Write and review code, educating engineers and building a culture of reliability.
- Practice sustainable incident response and blameless postmortems.
- Define and enable standards for monitoring, reliability, and performance.
- Bridge the gap between our infrastructure and platform engineering teams.
- Support and improve services by planning for scale and reliability.
WHO YOU ARE
- Experience leading projects end to end and mentoring junior engineers.
- Experience working in an SRE or DevOps culture, with great communication and organizational skills and a proven ability to partner with other engineering teams.
- Experience implementing and overseeing observability:
  - We use Jira, Git, Jenkins, ELK, Papertrail, Datadog, and Wavefront.
  - However, it’s the mindset that matters, not the specific tools.
- Experience developing, debugging, and optimizing code at enterprise scale:
  - We are a Ruby/Java/Golang shop but are language-agnostic!
  - We use Redis/Sidekiq for queueing, and Mongo/Postgres for data storage.
  - We use Git/Jenkins/Buildkite to build and deploy.
- Conviction and curiosity that fuel a knack for troubleshooting hard problems
- Interest in designing, optimizing and troubleshooting large-scale services
WHAT WE OFFER
- Competitive compensation that includes equity
- Flexible time off policy to balance your work and life, including paid parental leave
- Free daily lunch in the office, including snacks and beverages
- Competitive medical, dental, and vision coverage for you and your dependents
- Collaborative, transparent, and fun-loving office culture