Heidi Health

Senior Site Reliability Engineer (Upmarket)

Posted 14 Days Ago

In-Office

San Francisco, CA

140K-185K Annually

Senior level

The Senior Site Reliability Engineer will respond to incidents, improve operational reliability, manage cloud infrastructure, and collaborate with engineering teams to enhance system performance.

Who We Are

Healthcare needs a better rhythm: one that keeps care continuous and deeply human. Heidi is building an AI Care Partner that works alongside clinicians to make that possible.

We’re a team of doctors, engineers, designers, researchers, and creatives building tools that help clinicians stay focused on what matters most: their patients.

In just 18 months, Heidi has given back more than 18 million hours to healthcare professionals — supporting 73 million patient visits in 116 countries. Today, more than two million patient visits each week are powered by Heidi worldwide.

Backed by nearly $100 million in funding, we’re growing in the US, UK, Canada, and Europe, partnering with leading health systems including the NHS, Beth Israel Lahey Health, and Monash Health.

What you’ll do

Participate in on-call and incident response: Respond to production incidents, contribute to service restoration, and support clear communication during incidents. Over time, take increasing responsibility for leading incidents end-to-end.
Improve operational reliability: Identify recurring issues and reliability risks, and drive fixes through better alerting, automation, system changes, or process improvements.
Own parts of the production environment: Operate and improve Kubernetes clusters, cloud infrastructure, and core platform services, with growing ownership as familiarity increases.
Strengthen observability: Improve dashboards, alerts, logs, and traces so issues are detected earlier and diagnosed faster, with a strong focus on actionable signals.
Reduce operational toil: Automate repetitive tasks, simplify runbooks, and improve tooling to make on-call and day-to-day operations easier and safer.
Support safe change: Improve deployments, rollback mechanisms, and operational readiness to reduce the risk of incidents caused by change.
Contribute to operational practices: Write and maintain runbooks, participate in blameless post-mortems, and help improve incident response processes over time.
Collaborate closely with engineers: Work with product and feature teams to improve production readiness, service ownership, and reliability expectations.

What we’re looking for

3–6+ years in SRE, DevOps, Platform, or operations-heavy engineering roles.
Experience supporting production systems and participating in on-call rotations.
Comfortable debugging live systems under pressure.
Experience operating cloud infrastructure (AWS preferred).
Working knowledge of Kubernetes and containerised workloads.
Infrastructure as Code experience (Terraform or similar).
Familiarity with monitoring and alerting tools (Datadog, Prometheus, etc.).
Scripting or automation experience (Python, Bash, or similar).

The way we work

1. Build to Last

We design for safety and reliability so clinicians, patients, and our teams can trust what we build every day.

2. Own Your Practice

Ideas rise on merit, not title, and everyone shares responsibility for the standards we set together.

3. Move Fast, Stay Steady

We move quickly but never at the cost of trust. Progress only matters if people can depend on what we make.

4. Make Others Better

Honest feedback, steady support, and shared growth keep our teams improving together.

Why you will flourish with us 🚀

In office to collaborate with like-minded professionals
Healthcare, Dental, Vision benefit options
401(k) with 3% match
Personal development budget of $500 per annum
Become an owner with shares (equity) in the company: if Heidi wins, we all win
The rare chance to create a global impact as you immerse yourself in one of the leading healthtech startups
The opportunity to fast track your startup career!

Heidi is dedicated to creating an equitable, inclusive, and supportive work environment that brings people together from diverse backgrounds, experiences, and perspectives. Our strength is in our differences. We're proud to be an equal opportunity employer and welcome all applicants as we're committed to promoting a culture of opportunity for all.

Top Skills

AWS

Bash

Datadog

Kubernetes

Prometheus

Python

Terraform
