Top Reliability Engineer Jobs in NYC
The Reliability Engineer at Two Sigma will improve software reliability, automate technical processes, create data pipelines, and collaborate with various teams. Required qualifications include a degree in computer science; experience with Bash, Python, and the UNIX command line; and familiarity with Site Reliability Engineering (SRE) or related functions. Benefits include fully paid medical and dental insurance, a 401k match, tuition reimbursement, and a flexible work policy.
Site Reliability Engineers (SREs) combine software and systems engineering expertise to build and operate the systems that power the firm's investment strategies at the speed and scale required in fast-evolving markets.
We are looking for a reliability expert who is passionate about scaling Cloud services to join our growing SRE team. You'll be the driving force for change!
The Sr. Database Reliability Engineer will be responsible for advancing data layer resilience and performance, making architectural decisions, improving load testing and capacity planning frameworks, supporting and monitoring the use of data layer technologies, handling operational challenges, and enhancing visibility into data layer workflows and workloads.
We are looking for a reliability expert who is passionate about scaling Cloud services to join our growing SRE teams. The ideal candidate is aware of current industry trends (particularly those related to reliability), thrives on working with a diverse set of partners, and can both articulate the business impact of a problem and dive deep into the technical solution.
As a Site Reliability Engineer III at JPMorgan Chase within the Digital Private Markets/Aumni team, you will configure, maintain, and optimize applications and infrastructure, collaborate with teams on design and deployment, and implement site reliability engineering best practices.
Site Reliability Engineer responsible for supporting real-time, distributed environments, capacity planning, migrations, and infrastructure upgrades. Strong collaboration, troubleshooting, and communication skills required. Experience in trading desk support and Python development preferred.
Zocdoc is looking for a Principal Site Reliability Engineer to drive the reliability, resiliency, observability, availability, and scalability of systems and services. Responsibilities include analyzing complex distributed system challenges, supporting product engineering with scaling and performance needs, monitoring cloud-based infrastructure, automating tooling, and mentoring colleagues.
Featured Jobs
As a Site Reliability Engineer III at JPMorgan Chase within the Infrastructure Platforms Engineering team, you will configure, maintain, monitor, and optimize applications and their associated infrastructure to drive innovation and modernize complex systems. You will collaborate with software engineers to design and implement deployment approaches using automated continuous integration and delivery pipelines, as well as resolve complex problems with technical experts.
As a Federal Site Reliability Engineer (SRE) at ServiceNow, you will provide 24x7 support for the Government Cloud infrastructure on the 3rd Shift. Responsibilities include driving technical resolutions across the technology stack, optimizing platform operability, reducing incidents, and improving services for customers. Skills required include DevOps, Automation, Scripting, Linux systems, software development, Observability, Monitoring, and Cloud technologies (Azure, AWS). If you have a background in IT Operations, Software Development, or Systems Engineering, this could be the opportunity for you.
The Site Reliability Engineer at ServiceNow is responsible for maintaining and developing the reliability, scalability, and performance of the ServiceNow infrastructure. The role involves combining software development, networking, and systems engineering expertise to enhance platform operability and customer experience.
The Site Reliability Engineer at ServiceNow is responsible for maintaining and developing the reliability, scalability, and performance of the ServiceNow infrastructure. They work to drive technical resolutions across the technology stack, improve operability, and enhance services for customers by combining software development, networking, and systems engineering expertise.
The Site Reliability Engineer will work on the Federal SRE Team providing 24x7 production support for the Government Community Cloud infrastructure. Responsibilities include driving technical resolutions across the technology stack, automating tasks, and improving operability and platform stability for customers.
Seeking a Site Reliability Engineer to improve human health and quality of life through advanced computational methods and cloud-based systems. Responsibilities include building IT infrastructure, influencing project implementations, working closely with teams, and Linux systems administration. Bachelor's degree in Computer Science or related field required. Pay and perks include a competitive salary, equity-based compensation, healthcare, 401k, flexible work schedule, and more.
Develop core blockchain node software and ecosystem infrastructure. Implement blockchain node reference deployment, build automation tools, and support node operator and developer ecosystem. Focus on performance and reliability. Responsible for architecture, design, implementation, testing, and optimization. 5+ years of experience in backend software or infrastructure, proficiency in Rust, Python, C++, Java. Experience with Terraform, Helm, Kubernetes, AWS, GCP, Azure, and blockchain. Strong communication and problem-solving skills.
Lead Software Engineer at JPMorgan Chase responsible for developing and troubleshooting software solutions, ensuring operational stability, evaluating architectural designs, and driving the use of new technologies. Required qualifications include 5+ years of experience, SRE experience in a cloud environment, proficiency in automation, and advanced knowledge of programming languages.
The Platform Infrastructure Engineer at iCapital will be responsible for building highly available solutions, maintaining site reliability, developing software for automation, and designing and operating Kubernetes environments. They will have extensive experience in DevOps, TechOps, and AWS with strong technical skills in various technologies and programming languages.
Develop automated systems, optimize performance and reliability, build self-service functionality, design monitoring and alerting tools, set and measure service standards, diagnose and solve problems.
As a Senior Site Reliability Engineer at Atlassian, you will work on improving performance and reliability of services, address incidents, automate tasks, and collaborate with team members. Responsibilities also include writing code in Bash and Python, capacity planning, managing enterprise monitoring solutions, and maintaining infrastructure in AWS. Desired skills include exposure to configuration management tools, experience with Docker and Kubernetes, familiarity with ITIL terminology, and knowledge of compliance requirements. You will have the opportunity to contribute directly to the reliability of services and work on enhancing products for a reliable customer experience.
Collaborate with Infrastructure and Product Delivery teams to enhance Kubernetes clusters' readiness, scale the system, optimize performance, and promote operational excellence practices. Lead strategic initiatives to streamline platform operations, optimize performance, and improve system health. Automate application deployments and monitoring changes using ArgoCD. Improve monitoring and system performance, and develop automation tools. Ensure platform security and compliance with best practices.
The Senior Site Reliability Engineer will be responsible for designing and building high levels of availability, scalability, and reliability into the company's systems. They will work closely with feature teams, introduce best practices, and drive technical change to improve reliability. Additionally, they will contribute to incident management and root cause analysis.
The Site Reliability Engineer will be responsible for architecting, building, securing, and maintaining systems powering Direct-to-Consumer platforms with a focus on automation, databases, testing, observability, and resiliency.
The Sr. Site Reliability Engineer is responsible for monitoring the production environment, addressing gaps in engineering services, automating processes, implementing new technologies, and providing tier-three support. The role requires 10+ years of experience in Information Technology with a focus on desktop and end-user systems engineering.
Seeking a Senior Site Reliability Engineer to manage the Cloud Infrastructure and Observability solutions for the Beeswax platform, ensuring high scalability, reliability, and performance levels. Responsibilities involve infrastructure design, code development, configuration management, incident resolution, and documentation. Preferred qualifications include a postgraduate degree and certifications in AWS and Kubernetes.
The Junior Site Reliability Engineer will help improve the reliability, security, and cost-effectiveness of our cloud environments. They will be responsible for building and maintaining development tools, providing operational support for web-based and data infrastructure software applications, and collaborating with team members and stakeholders to define requirements and technical roadmaps. The ideal candidate has experience in Linux system administration, scripting, AWS, Git, CI/CD, and web technologies.