Senior Site Reliability Engineer

Fetch | Remote

Sorry, this job was removed at 11:20 a.m. (EST) on Saturday, May 14, 2022

What we’re building and why we’re building it.

Fetch is a build-first technology company creating a rewards program to power the world. Over the last 5 years, we’ve grown from 0 to 14M active users and taken over the rewards game in the US with our free app. The foundation has been laid. In the next 5 years we will become a global platform that completely transforms how people connect with brands.

It all comes down to two core beliefs. First, that people deserve to be rewarded when they create value. If a third party directly benefits from an action you take or data you provide, you should be rewarded for it. And not just the “you get to use our product!” cop-out. We’re talkin’ real, explicit value. Fetch points, perhaps.

Second, we also believe brands need a better and more direct connection with what matters most to them: their customers. Brands need to understand what people are doing, and have a direct line to be able to do something about it. Not just advertise, but ACT. Sounds nice, right?

That’s why we’re building the world’s rewards platform. A closed-loop, standardized rewards layer across all consumer behavior that will lead to happier shoppers and stronger brands.

Fetch Rewards is an equal employment opportunity employer. This position can be based remotely in the United States or hybrid in Birmingham, Boston, Chicago, Madison, New York, or San Francisco.

The Role:

Fetch’s next step in evolving the shopping experience will require a Senior Site Reliability Engineer.

The Site Reliability Engineering (SRE) team combines software and systems engineering to build and run distributed, fault-tolerant systems at scale. SREs ensure that Fetch’s services — both our externally visible and internally critical systems — have reliability and uptime appropriate to our users’ needs. In addition, we keep an ever watchful eye on system capacity and performance. We’re proud to be our engineers’ engineers, and much of our software development focuses on optimizing existing systems, building infrastructure, and eliminating work through automation.

Fetch’s culture of diversity, intellectual curiosity, problem solving, and openness is key to our success. Our organization brings together people with a wide variety of backgrounds, experiences, and perspectives. We encourage them to collaborate, think big, and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.

Minimum Qualifications:

1+ year(s) of experience in a software development-oriented role (e.g. Software Engineer, DevOps Engineer, Site Reliability Engineer).
Experience with one or more high-level programming languages (e.g. Java, Python, Go, C/C++).
Experience with cloud infrastructure (AWS strongly preferred).
Experience with containerization technologies (Docker, Kubernetes preferred).
Experience building CI/CD pipelines.
Experience with Unix/Linux operating system internals and networking.
Experience with analyzing and troubleshooting systems.
Experience monitoring and supporting microservice architectures.
Bachelor's or higher degree in Computer Science, related technical field, or equivalent practical experience.

Preferred Qualifications:

Experience designing, analyzing, and troubleshooting distributed systems.
Experience designing and developing software oriented towards systems or infrastructure automation.
Experience implementing an observability stack in an AWS multi-account environment.
Experience with the technical interview process for evaluating and hiring candidates.
Experience with growing and mentoring other engineers in a team setting.
Ability to debug/optimize code and automate routine tasks.
Systematic problem-solving approach, coupled with effective communication skills and a sense of ownership and drive.

Responsibilities:

Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and readiness reviews.
Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity.
Practice sustainable incident response and blameless postmortems by participating in the on-call rotation.
Build and support AWS multi-account and multi-region infrastructure using a mix of managed services (e.g. S3, Lambda, RDS, etc.) and containerized infrastructure (e.g. EKS, ECS).
Grow the SRE team by mentoring engineers and participating in the hiring process.

At Fetch Rewards, we'll give you the tools to feel healthy, happy and secure.

Stock Options
401K Match - Up to 3%
PPO and HDHP plans | Dental | Vision | Life Insurance
Pet Insurance
Education Reimbursement
Unlimited PTO
10 Paid Holidays + End of Year Break
Flexible 9 weeks of Parental Leave
Full-time wellness coach to help meet your exercise & wellness needs
WFH Setup. Whatever hardware and software you need to get the job done
Regularly scheduled virtual events, happy hours, cooking classes, etc.
