Top Reliability Engineer Jobs in NYC, NY
Artificial Intelligence • Software
As a Senior/Staff Network Reliability Engineer, you'll optimize and maintain Fluidstack's network platform, ensuring performance and reliability for AI and HPC workloads. Responsibilities include tuning networking protocols, deploying and validating switches, automating telemetry, conducting root-cause analyses, and collaborating with vendors.
Top Skills:
BGP, DPDK, eBPF, EVPN, Geneve, Go, Python, RDMA, Rust, TCP/IP, VXLAN, XDP
Artificial Intelligence • Fintech • Machine Learning • Social Impact • Software
As a Senior Software Engineer focused on Site Reliability Tooling, you'll enhance system reliability, implement SRE practices, and build automation tools to support site reliability across Upstart's infrastructure.
Top Skills:
CDK, CloudFormation, Datadog, Go, JavaScript, Kubernetes, Prometheus, Python, Terraform, TypeScript
Sales • Software • Automation
Join the Infrastructure Team to build and maintain critical systems, automating database lifecycles and enhancing disaster recovery with a focus on resilience and simplicity.
Top Skills:
Ansible, Argo CD, AWS, ClickHouse, Docker, Elasticsearch, Flask, GitHub Actions, Grafana, Kubernetes, MongoDB, Postgres, Python, Redis, Terraform
Marketing Tech
The Cloud Reliability Engineer develops, configures, and deploys cloud tools, enhances applications, ensures observability, and participates in on-call rotations.
Top Skills:
AWS, CI/CD, Docker, GitHub Actions, Go, Google BigQuery, GCP, Kubernetes, Linux, Python, SQL, Terraform
Artificial Intelligence • Healthtech • Software
As a Site Reliability Engineer, you will manage cloud infrastructure, implement observability, and ensure system reliability by collaborating with engineering teams and maintaining databases.
Top Skills:
Azure, Bash, Git, Kubernetes, Postgres, Python, Redis, SQL, TypeScript, VS Code
Mobile • Software
Site Reliability Engineers will work on production infrastructure, focusing on AWS and Kubernetes while ensuring high availability and customer satisfaction.
Top Skills:
Airflow, AWS, CircleCI, CloudWatch, EKS, Grafana, MongoDB, PagerDuty, Pingdom, Rust, Scala Spark, Terraform, TypeScript
Big Data • Cloud • Software • Database
The Site Reliability Engineer will design and build cloud infrastructure for MongoDB Atlas, optimize performance, and automate services worldwide.
Top Skills:
AWS, DNS, GCP, HTTP, Kubernetes, Linux, Azure, Programming Languages, TLS
Reposted 18 days ago
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will support, maintain and grow the Atlas platform, focusing on automating processes and running multi-cloud environments.
Top Skills:
AWS, Azure, DNS, GCP, Go, HTTP, Linux, Python, Ruby, TLS
Cloud
The Senior Database Reliability Engineer will design, optimize, and operate PostgreSQL and MySQL clusters, ensure database performance, and lead incident responses.
Top Skills:
Ansible, AWS, Datadog, GCP, Go, Grafana, Kubernetes, MySQL, Postgres, Prometheus, Python, Terraform
Fintech • Financial Services
As an EQ Electronic Reliability Engineer, you will ensure the reliability of electronic trading systems, resolve outages, automate processes, and collaborate across teams for improved system performance and resilience.
Top Skills:
C++, Elastic, Grafana, ITRS, Java, kdb, Linux, Python
Reposted 20 days ago
Big Data • Cloud • Software • Database
The role involves maintaining and improving CI/CD infrastructure using Argo Workflows and Kubernetes, ensuring effective deployment for engineering teams.
Top Skills:
AWS, Azure, Go, GCP, Kubernetes, Python
AdTech • Marketing Tech
As a Data Reliability Engineer II, you will analyze and enhance data pipelines, write scripts in Python and SQL, and work with data visualization tools. You will also engage in on-call support and improve operational quality.
Top Skills:
Bash, BigQuery, Databricks, GitLab, Grafana, MongoDB, OLTP, Python, Snowflake, SQL
Artificial Intelligence • Cybersecurity
The Database Reliability Engineer will ensure database availability, performance, scalability, and security across AWS, collaborating with application and security teams.
Top Skills:
AWS, Crossplane, Datadog, GitLab CI/CD, Kubernetes, NoSQL, OpenSearch, Postgres, Terraform
Artificial Intelligence • Machine Learning • Software
As a Staff Site Reliability Engineer, you will enhance the reliability, scalability, and performance of production services by applying SRE principles, implementing observability practices, automating processes, and collaborating with engineering teams.
Top Skills:
AWS, Azure, CloudFormation, Datadog, Docker, ELK Stack, GCP, Go, Grafana, Jaeger, Kubernetes, OpenTelemetry, OpenTofu, Prometheus, Python, Terraform
eCommerce • Legal Tech • Professional Services • Software • Data Privacy
The Site Reliability Engineer will ensure systems run smoothly, work with automation tools, resolve issues, and drive operational improvements.
Top Skills:
AWS, Azure, CloudFormation, Docker, GCP, Grafana, Kubernetes, Memcached, New Relic, OpenTelemetry, Postgres, Prometheus, Pulumi, Redis, Sentry, Terraform
Blockchain • Fintech • Payments • Financial Services • Cryptocurrency • Web3
The Site Reliability Engineer will build and maintain infrastructure, improve software systems, develop scalable microservices, and ensure quality software delivery.
Top Skills:
AWS, Go, Google Cloud Platform, Java, Kubernetes, Azure, SQL
Consumer Web • Mobile
As a Site Reliability Engineer at Patreon, you'll improve AWS infrastructure, implement SRE practices, enhance Kubernetes capabilities, and develop automation tools.
Top Skills:
Ansible, AWS, Chef, Kubernetes, Puppet, Python, Terraform
Artificial Intelligence • Software
The Network Operations Engineer will lead site operations, ensuring network reliability, handling incidents, coordinating hardware repairs, and supporting datacenter deployments. Responsibilities include executing maintenance runbooks and mentoring junior engineers while collaborating with cross-functional teams.
Top Skills:
Ansible, BGP, Clos Topologies, EVPN/VXLAN, High-Radix Switching, Python
Information Technology • Mobile • News + Entertainment • Social Media
As a Senior Site Reliability Engineer, you'll enhance reliability and performance of Reddit's systems using knowledge of distributed systems and automation, collaborating with teams to improve infrastructure and service delivery.
Top Skills:
ClickHouse, Cloud, Go, Grafana, Kubernetes, Linux, Loki, OTel, Prometheus, Python, TCP/IP, Thanos, Vector
Fintech
The Principal Site Reliability Engineer enhances application performance and reliability, manages incidents, designs systems, and mentors teams.
Top Skills:
Aurora, AWS, Chef, Docker, DynamoDB, Git, Go, Java, Jenkins, JMS, Kafka, Kubernetes, Maven, Memcached, Oracle, Python, Redis, SQS, Swarm
Insurance
The Staff Engineer will innovate and enhance systems, solve critical problems, mentor engineers, and drive platform reliability and performance.
Top Skills:
.NET, Ansible, AWS, Azure, CI/CD, Docker, Elasticsearch, GCP, Go, Grafana, Java, Kubernetes, MongoDB, MySQL, NoSQL, OpenTelemetry, Postgres, Prometheus, Python, SQL, Terraform
Software
As a Site Reliability Engineer, you'll ensure platform reliability through scalable systems, incident response, observability, and collaboration with engineering teams.
Top Skills:
AWS, Datadog, Grafana, Kubernetes, OpenTelemetry, Prometheus, TypeScript
Reposted 22 days ago
Big Data • Fintech • Mobile • Payments • Financial Services
This role involves setting technical strategies, collaborating across teams, managing operations and availability, and fostering a culture of quality and ownership within the Site Reliability Engineering team.
Top Skills:
AWS, Kotlin, Kubernetes, MySQL, Python, Spark
eCommerce • Healthtech • Kids + Family • Retail • Social Media
Seeking a Senior Software Engineer, Site Reliability to ensure system stability, scalability, and reliability, while optimizing AWS infrastructure using modern DevOps practices and tools like Terraform, Docker, and Kubernetes.
Top Skills:
AWS, CircleCI, Cronitor, Datadog, Docker, GitHub Actions, Jenkins, Kubernetes, MySQL, PagerDuty, React, Redis, Ruby on Rails, Sentry, Sidekiq, Terraform
Security • Cybersecurity
Lead DevOps initiatives to enhance microservice infrastructure, mentor engineering teams, and manage production issues, with a focus on security and automation.
Top Skills:
AWS, CircleCI, GitHub Actions, Go, Jenkins, Kubernetes, Python, Terraform
Popular Job Searches
All Software Engineer Jobs in NYC
.NET Developer Jobs in NYC
Android Developer Jobs in NYC
C# Jobs in NYC
C++ Jobs in NYC
DevOps Jobs in NYC
Engineering Manager Jobs in NYC
Front End Developer Jobs in NYC
Golang Jobs in NYC
Hardware Engineer Jobs in NYC
iOS Developer Jobs in NYC
Java Developer Jobs in NYC
Javascript Jobs in NYC
Linux Jobs in NYC
Perl Jobs in NYC
PHP Developer Jobs in NYC
Python Jobs in NYC
QA Jobs in NYC
Ruby Jobs in NYC
Sales Engineer Jobs in NYC
Salesforce Developer Jobs in NYC
Scala Jobs in NYC
Artificial Intelligence Jobs in NYC
Artificial Intelligence Engineer Jobs in NYC
AWS Engineer Jobs in NYC
Backend Engineer Jobs in NYC
DevOps Engineer Jobs in NYC
Director of Engineering Jobs in NYC
Engineering Jobs in NYC
Full Stack Engineer Jobs in NYC
Infrastructure Engineer Jobs in NYC
Lead Software Engineer Jobs in NYC
Network Engineer Jobs in NYC
Platform Engineer Jobs in NYC
Principal Architect Jobs in NYC
Principal Engineer Jobs in NYC
Principal Software Engineer Jobs in NYC
Quality Assurance Automation Engineer Jobs in NYC
Reliability Engineer Jobs in NYC
Senior Backend Engineer Jobs in NYC
Senior Cloud Engineer Jobs in NYC
Senior Full-Stack Engineer Jobs in NYC
Senior Platform Engineer Jobs in NYC
Senior Python Engineer Jobs in NYC
Senior Site Reliability Engineer Jobs in NYC
Solutions Architect Jobs in NYC
Solutions Engineer Jobs in NYC
Staff Engineer Jobs in NYC
Staff Software Engineer Jobs in NYC
Systems Engineer Jobs in NYC
Vice President of Engineering Jobs in NYC