Senior Reliability (SRE) Engineer
SeatGeek Enterprise is our innovative open-ecosystem enterprise ticketing software that allows teams, venues, and promoters to efficiently run their businesses and captivate fans.
We're proud to partner with some of the most recognized names across the globe including the Dallas Cowboys, Brooklyn Nets, and Liverpool F.C., as well as the NFL, MLS, half of the English Premier League, and theaters across NYC's Broadway and London's West End.
As a Senior Reliability Engineer, you'll improve our observability systems, alerting, and other infrastructure that supports SeatGeek's production systems. You will also bring your experience running production services at scale to other teams at SeatGeek. Reliability is essential as we continue to grow our ticketing platform, and as such, you will play a major role in shaping the experience that delights our users.
You will be part of our Resilience Engineering team, which has members working from all over Europe and the United States. The team's mission is to ensure that our services are monitored and that we're setting and meeting the right Service Level Objectives. We achieve this by building tools and processes and by providing hands-on consulting services to other engineering teams at SeatGeek.
What you'll do
- Own the incident response process at SeatGeek, modeled after industry best practices
- Run blameless incident reviews to help us learn from our mistakes and improve SeatGeek's systems and processes
- Build and maintain tools that our developers use to observe their applications. We provide best practices in logging, monitoring, distributed tracing, and alerting
- Work cross-functionally with the various teams in the organization. You will help establish SLOs and then help teams consistently achieve those SLOs. This includes identifying and recommending areas in which there is insufficient observability. Our engineers are on call for the services they build, and you will help them be successful in doing so
- Create systems that will automate the ingestion of alerts, deployment events, and other relevant information about our services
- Participate in an on-call rotation
Who you are
- You have on-call experience in a Software Engineering or Operations role
- You have experience with observability systems
- You are a curious person, eager to learn the intricacies of the systems we support and how they work at scale, and you are not afraid to get your hands dirty
- You work with service owners, helping them to apply best practices in running systems in production
- You can simplify complex topics to better suit a non-technical audience
- You have solid experience in at least two programming languages like Python, Go, C#, or Java. Other languages are fine too
- You have a good understanding of storage systems such as Postgres, MySQL, DynamoDB, and Redis, including complex queries and optimization
- You are familiar with technologies that power distributed systems, such as Kafka and RabbitMQ
Perks
- Equity stake
- Flexible work environment, allowing you to work as many days a week in the office as you'd like or 100% remotely
- A WFH stipend to support your home office setup
- Flexible PTO
- Up to 16 weeks of paid family leave
- 401(k) matching program
- Health, vision, dental, and life insurance
- Annual subscriptions to Headspace, Ginger.io, and One Medical
- $120 a month to spend on tickets to live events
- Annual subscription to Spotify, Apple Music, or Amazon Music
SeatGeek is committed to providing equal employment opportunities to all employees and applicants for employment regardless of race, color, religion, creed, age, national origin or ancestry, ethnicity, sex, sexual orientation, gender identity or expression, disability, military or veteran status, or any other category protected by federal, state, or local law. As an equal opportunities employer, we recognize that diversity is a positive attribute and we welcome the differences and benefits that a diverse culture brings. Come join us!