Medal

Site Reliability / Infrastructure Engineer

Posted 2 Days Ago

In-Office

New York City, NY, USA

150K-275K Annually

Senior level

As a Site Reliability Engineer, you'll own reliability for GCP infrastructure, lead incident response, architect database scaling, and improve CI/CD pipelines.

The Company

Medal is the world’s largest and fastest-growing platform for gaming clips, where millions of gamers capture, share, and relive their best moments. Every year, our players record billions of clips, each representing a unique, action-packed highlight. We’re building the next generation of gaming communities: social, monetized, and creator-powered. Our mission is to design products that make sharing, discovering, and connecting around gaming moments seamless and fun.

We raised a seed round of $133M from General Catalyst and Khosla to discover the next generation of intelligence.

The Role

Medal's infrastructure supports billions of clips, video ingestion pipelines, and social features at a scale most engineers never get to touch. We're looking for an SRE who cares deeply about reliability and scalability.

The work centers on reliability, incident response, scaling, and making sure our infrastructure keeps up with our growth. You'll own the on-call rotation, drive postmortems, and work directly with engineering teams to meet their infra needs.

The right person probably came through startups and scale-ups. You've been in the room when things broke at 2am, you've scaled databases under pressure, and you know the difference between a durable fix and a patch that buys you a week.

Key Responsibilities

Own reliability across our GCP infrastructure: Kubernetes clusters, managed services, and data pipelines, driving measurable improvements to availability and latency
Lead incident response end-to-end: on-call rotations, runbooks, postmortems, and the follow-through that makes sure the same thing doesn't happen twice
Architect and execute database scaling strategies (sharding, replication, query optimization, and capacity planning) across MySQL and Postgres at meaningful scale
Partner with product engineering to translate feature requirements into infrastructure designs that hold up as we grow
Manage and evolve our Terraform-managed GCP environment and Kubernetes cluster configurations
Build and maintain observability across the stack: metrics, dashboards, alerting, and tracing
Continuously improve the reliability of our CI/CD and delivery pipelines built on GitHub Actions
Harden IAM, secrets management, and network segmentation as part of normal infra hygiene

About You

You’ve worked at startups and are comfortable in an environment of rapid growth where scaling up is a priority
You have great judgment: you know the difference between a durable, sustainable fix and a patch that buys you a week
You have deep, hands-on experience scaling and sharding relational databases in production environments
You know GCP maybe a little too well: Kubernetes, VPC, IAM, Cloud Logging, and the managed services ecosystem
You are fluent in Terraform and have owned real infrastructure-as-code at scale
You have strong incident response instincts: you can work a P0 calmly, communicate clearly under pressure, and run a postmortem that prevents recurrence
You’ve worked with GitHub Actions in a production CI/CD environment
You have excellent communication skills (this is crucial!): you can flag issues clearly and quickly during incidents, and you can lead and write actionable postmortems

Our Stack

Google Cloud Platform

Terraform, Salt, GitHub Actions

Java, Redis, RabbitMQ, Elasticsearch, BigQuery, Kubernetes for the backend

Electron+React

C# and C++ for native Windows recording & more

Swift for iOS, Kotlin for Android

Benefits

Competitive salary and meaningful equity
Comprehensive medical, dental, and vision coverage
401(k)
Wellness and fitness perks including a Wellhub membership and mental health resources
Paid parental leave, fertility and maternal health benefits
Generous PTO policy
Daily meals and commuter benefits at our NYC HQ in Flatiron
Learning and development stipend

Benefits vary by country and employment type.

Top Skills

BigQuery

C++

Elasticsearch

Electron

GitHub Actions

Google Cloud Platform

Java

Kotlin

Kubernetes

RabbitMQ

React

Redis

Salt

Swift

Terraform

Upper West Side, New York, New York, United States, 10024
