Optum

Site Reliability Engineer - Remote

Posted An Hour Ago
In-Office or Remote
Hiring Remotely in Eden Prairie, MN
73K-130K Annually
Mid level
The Site Reliability Engineer will design, develop, and support a secure cloud infrastructure while collaborating with development and DevOps teams, ensuring high performance and reliability of systems.
The summary above was generated by AI
Requisition Number: 2358259
For those who want to invent the future of health care, here's your opportunity. We're going beyond basic care to health programs integrated across the entire continuum of care. Join us to start Caring. Connecting. Growing together.
Our Optum Serve IT (OSIT) team develops cutting-edge solutions that help people live healthier lives and help make the health system work better for everyone. From advanced data analytics and AI to cybersecurity, we use innovative approaches to solve some of healthcare's most complex challenges. To support this mission, OSIT has initiated a multi-year modernization program aimed at updating and enhancing enterprise technology systems in accordance with modern design standards.
The Site Reliability Engineer will architect, develop, and maintain Optum Serve's cloud environment in both the commercial and government clouds. The role will work closely with software engineers, architects, and DevOps engineers to architect and maintain a secure, resilient and high-performance cloud infrastructure.
You'll enjoy the flexibility to work remotely* from anywhere within the U.S. as you take on some tough challenges. For all hires in the Minneapolis or Washington, D.C. area, you will be required to work in the office a minimum of four days per week.
Primary Responsibilities:
  • Build, operate, and support IaaS and PaaS infrastructure in Azure and AWS commercial and government cloud environments under established architecture and standards
  • Partner with development teams to help define, track, and report on SLIs, SLOs, and SLAs
  • Contribute to the development and support of platform services, including provisioning, configuration, deployment, and day-to-day operations
  • Integrate applications and platforms with centralized logging, monitoring, metrics, and incident management systems
  • Configure and maintain observability tools (dashboards, APMs, alerts) to help engineering teams safely operate applications in production
  • Participate in an on-call rotation to support software and cloud infrastructure, following documented runbooks and escalation paths
  • Support root cause analysis efforts and assist with remediation by implementing automation, monitoring improvements, and reliability fixes
  • Maintain and enhance operational tooling, scripts, and frameworks used for platform and service support
  • Execute performance and resiliency testing for platform services using existing frameworks and tools
  • Configure and tune alerts related to performance, availability, cost, security, and compliance signals
  • Follow and help improve operational processes, contributing automation to reduce manual and repetitive support activities

You'll be rewarded and recognized for your performance in an environment that will challenge you and give you clear direction on what it takes to succeed in your role as well as provide development for other roles you may be interested in.
Required Qualifications:
  • 4+ years of experience working in a Site Reliability Engineering, Cloud Engineering, or DevOps role
  • Hands-on experience supporting Kubernetes (managed or bare metal) clusters in production environments
  • Some hands-on experience with monitoring and observability tools (e.g., Azure Monitor, Splunk, Dynatrace, Grafana, Prometheus)
  • Experience using Infrastructure as Code (IaC) tools such as Terraform or Pulumi
  • Experience supporting infrastructure and applications in production cloud environments
  • Experience interacting with or supporting systems that expose RESTful APIs
  • Solid working knowledge of at least one major cloud service provider (Azure preferred, AWS acceptable)
  • Working knowledge of networking fundamentals and common internet protocols
  • Understanding of identity and access management (IAM) concepts and best practices
  • Basic understanding of security concepts including encryption, PKI, and common application security risks (e.g., OWASP)
  • Familiarity with Kubernetes deployment and GitOps tools such as Helm, ArgoCD, or Flux
  • Familiarity with IDEs and source control tools such as Visual Studio Code, GitHub, or GitLab
  • Ability to participate in a 24/7 on-call rotation following documented procedures and escalation paths
  • United States Citizenship
  • If you are offered this position, you will be required to provide extensive personal information to obtain and maintain a suitability or determination of eligibility for a Confidential/Secret or Top Secret security clearance as a condition of your employment

*All employees working remotely will be required to adhere to UnitedHealth Group's Telecommuter Policy
Pay is based on several factors including but not limited to local labor markets, education, work experience, certifications, etc. In addition to your salary, we offer benefits such as a comprehensive benefits package, incentive and recognition programs, equity stock purchase, and 401k contribution (all benefits are subject to eligibility requirements). No matter where or when you begin a career with us, you'll find a far-reaching choice of benefits and incentives. The salary for this role will range from $72,800 to $130,000 annually based on full-time employment. We comply with all minimum wage laws as applicable.
Application Deadline: This will be posted for a minimum of 2 business days or until a sufficient candidate pool has been collected. Job posting may come down early due to volume of applicants.
At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone - of every race, gender, sexuality, age, location and income - deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes - an enterprise priority reflected in our mission.
OptumCare is an Equal Employment Opportunity employer under applicable law and qualified applicants will receive consideration for employment without regard to race, national origin, religion, age, color, sex, sexual orientation, gender identity, disability, or protected veteran status, or any other characteristic protected by local, state, or federal laws, rules, or regulations.
OptumCare is a drug-free workplace. Candidates are required to pass a drug test before beginning employment.
