Site Reliability Engineer at Latch
Latch's Site Reliability Engineering team is being built from the ground up. This team will cover all aspects of Latch’s SRE strategy and development, including reliability, scalability, operational tooling, workflow automation, architecture and design reviews, investigation of system failures, and more. In this DevOps-focused SRE role, you will help build and run the future of Latch’s large-scale distributed system. We are firm believers in bringing in smart people who can define their roles for themselves, so come join us and start creating greatness.
Smart access isn’t about locking doors; it’s about opening up new possibilities. Latch is the world’s first fully integrated hardware and software system dedicated to bringing seamless access to every door in a modern building. We’re looking for the curious and the creative to join our team and help us continue to change the way we access our most valued spaces.
- Design and implement infrastructure & systems that balance scalability, reliability, and performance with security and operational quality
- Lead migration efforts off legacy infrastructure
- Evangelize a DevOps culture of automation, self-service, and engineering best practices to enable development teams to go from development to production with minimum effort
- Serve as subject matter expert in building, designing, and managing cloud-based infrastructure, with an emphasis on AWS
- Participate in on-call rotations and lead or conduct post-mortems of production failures, as needed
- Work closely with development teams, product, and customer service groups to identify, triage, and mitigate issues
- Stay current with technology trends and developments in the industry
- 5+ years of relevant experience with DevOps or SRE practices
- 4+ years of hands-on experience with cloud infrastructure such as AWS and GCP
- Experience managing cloud infrastructure as code with provisioning and configuration management tools such as Terraform and Ansible
- Experience coding in any of these languages: Python, Go, or Java
- Experience with container orchestration technologies, preferably ECS and/or EKS
- Comfortable using common CI/CD tools such as Travis CI or Jenkins for application and infrastructure automation tasks
- Experience with monitoring/analytics tools such as Datadog, Sumo Logic, and PagerDuty
- Experience working closely with security teams to drive compliance with SOC 2, PCI, or similar certifications
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
- Strong work ethic, with the ability to work independently and in a remote environment
Founded in 2014, Latch is a venture-backed startup building the world’s first complete smart access system. We now boast over 200 employees, all of whom are passionate self-starters with unique backgrounds and unexpected stories. We are located just a quick walk from Penn Station in New York City, and a ten-minute walk from Caltrain in SoMa, San Francisco.
We offer unlimited Paid Time Off, a competitive health package, and an office environment where employees are surrounded by creative, empowered, and dynamic peers. There is no better time to join us. As we grow as a company, we are excited to see our employees grow with us.