Top Reliability Engineer Jobs in NYC, NY
Lead database reliability engineer role at Warner Bros. Discovery with expertise in cloud-native database technologies, AWS services, and database dashboarding. Responsible for building high-performance, stable, scalable systems and contributing to team strategy and planning. Must have experience in CI/CD pipelines, functional and declarative programming languages, and hands-on database administration.
Join Peloton as a Senior Database Reliability Engineer to work on data protection, automation, observability, and performance insights related to datastores like PostgreSQL, DynamoDB, and more. Lead initiatives, maintain the data persistence layer, mentor engineers, and design scalable backend systems.
Senior IT Site Reliability Engineer role at Hudson River Trading (HRT) responsible for managing on-premise containerized web services, automating technical infrastructure, designing reliable systems, implementing monitoring solutions, and assisting with DHCP and DNS administration.
Seeking a Senior Site Reliability Engineer with 5-8 years of experience to work on securing the Gov Cloud Environment and planning for High Availability, Disaster Recovery, and FedRAMP controls. Responsibilities include building out the cloud environment using Infrastructure as Code, improving reliability, optimizing system performance, and supporting multiple large distributed software applications. Must have US citizenship.
Seeking an experienced Senior Site Reliability Engineer to maintain the reliability of the system using microservices, Kubernetes operators, data centers, and resource efficiency. Responsibilities include building internal services, writing performant code, leading on-call rotations, and mentoring team members. Requires 5+ years of experience with cloud technologies and strong programming skills in various languages.
As a Senior Site Reliability Engineer at Alloy, you will architect and build infrastructure solutions to improve uptime and exceed service level objectives. You will work on provisioning and managing AWS resources, deploying applications to Kubernetes, and ensuring secure and reliable systems. Additionally, you will focus on automating processes, supporting application developers, and continuously improving infrastructure.
The Senior Site Reliability Engineer is responsible for planning, scoping, architecting, and implementing solutions based on functional and performance capability requirements. They collaborate with various teams to influence and enhance solutions and platforms across the organization. Key responsibilities include building solutions for complex problems, championing Infrastructure as Code, enhancing CI/CD pipelines, reviewing existing systems, mentoring engineers, and participating in project proposals and designs.
Site Reliability Engineers (SREs) combine software and systems engineering expertise to build and operate the systems that power the firm's investment strategies at the speed and scale required in fast-evolving markets.
Featured Jobs
Zocdoc is seeking a Principal Site Reliability Engineer to drive the reliability, resiliency, observability, availability, and scalability of systems and services. Responsibilities include analyzing complex distributed system challenges, supporting product engineering with scaling needs, monitoring cloud-based infrastructure, automating tooling and processes, and mentoring colleagues. Ideal candidates are passionate about system reliability, motivated to learn new technologies, and enjoy collaborating in a diverse team environment.
Seeking a highly skilled Senior Site Reliability Engineer to ensure the reliability, security, and scalability of a data platform in a healthcare setting. Responsibilities include infrastructure management, security compliance, DevOps, monitoring, production support, and collaboration with cross-functional teams. Preferred qualifications include a bachelor's degree, 5+ years of experience, proficiency in Golang, Python, or Java, and strong knowledge of cloud environments and security best practices.
Design and implement platform architecture for Mainframe systems, monitor and troubleshoot systems, perform root cause analysis, optimize systems, collaborate with stakeholders, and stay up to date with Mainframe technologies.
Site Reliability Engineer responsible for supporting real-time, distributed environments, capacity planning, migrations, and infrastructure upgrades. Strong collaboration, troubleshooting, and communication skills required. Experience in trading desk support and Python development preferred.
Zocdoc is looking for a Principal Site Reliability Engineer to drive the reliability, resiliency, observability, availability, and scalability of systems and services. Responsibilities include analyzing complex distributed system challenges, supporting product engineering with scaling and performance needs, monitoring cloud-based infrastructure, automating tooling, and mentoring colleagues.
NBCUniversal is looking for a Site Reliability Engineer to support live channel origination and playout operations within the Distribution Engineering team. Responsibilities include investigating broadcast playout systems issues, creating documentation, deploying patches, and providing on-air systems support. The role involves 24x7 support and collaboration with team members and vendors.
Leverage system, network, and database skills to provide technical leadership for a team of on-site engineers responsible for the availability and performance of ServiceNow's cloud platform. Lead recovery efforts during major outages and crisis management. Support US Federal customers and ensure compliance with Federal screening requirements.
Site Reliability Engineers (SREs) at Braze are responsible for ensuring internal-facing services and platforms run smoothly, focusing on site uptime and infrastructure automation. They collaborate with engineering teams, develop internal platform infrastructure, and manage incidents to prevent downtime.
Plan, manage, and scale the data layer for resilience and performance. Collaborate with engineering teams on architectural decisions. Provide expertise in database optimization. Improve load testing frameworks. Monitor and support data layer usage. Handle operational challenges and incidents. Work on database infrastructure automation and tools.
Seeking a Site Reliability Engineer to improve human health and quality of life through advanced computational methods and cloud-based systems. Responsibilities include building IT infrastructure, influencing project implementations, working closely with teams, and Linux systems administration. Bachelor's degree in Computer Science or related field required. Pay and perks include a competitive salary, equity-based compensation, healthcare, 401k, flexible work schedule, and more.
Design and implement production-grade systems, establish standards and build automation, plan complex migrations, improve the on-call experience, and lead the creation and execution of technical roadmaps.
Seeking a senior engineer with specialization in the observability stack to improve visibility and performance of observability platforms in a cloud environment. Responsibilities include designing and implementing platforms, improving performance and scalability, developing insights, and collaborating with teams to establish observability standards.
Design, build, and maintain PostgreSQL and MySQL infrastructure for a global trading platform. Implement infrastructure-as-code tools, optimize database performance, and collaborate with cross-functional teams at HRT.
As a Senior Site Reliability Engineer at Formation Bio, you will collaborate with engineers and product managers to design and implement software infrastructure solutions for reliable drug development. You will drive SRE culture and contribute to the mission of accelerating the drug development process.
Guide technical design, implementation, and optimization of global infrastructure services with a focus on Hybrid Cloud. Research new technology, manage projects, automate operational activities, ensure compliance, and integrate systems technologies with the ServiceNow platform.
Build robust, fault-tolerant, scalable systems, monitor system health, optimize performance, create tools for team efficiency, maintain security practices, and collaborate with teams to solve production issues.
Join Grammarly as a Software Engineer in Reliability Engineering, responsible for building secure and reliable cloud-native infrastructure solutions. Collaborate with the engineering team to enhance reliability and observability, improve incident management, and implement proactive reliability improvements. Manage cloud-native infrastructure solutions using modern tools and services like AWS.