Railway

Senior Infra Engineer: Observability

Reposted 12 Days Ago
In-Office or Remote
Hiring Remotely in United States
Senior level
Build ingestion pipelines for logs and metrics, scalable alerting engines, and observability APIs. Interface with product teams and develop microservices using Golang and Rust.
The summary above was generated by AI

Job description

Our core mission at Railway is to make software engineers higher leverage. We believe that people should be given powerful tools so that they can spend less time setting up to do, and more time doing.

Many infrastructure platforms focus only on how you deploy a single application, and not on how these applications function in concert. Questions like “How do you build systems for zero downtime deployment”, “How do you do service-to-service communications”, etc. are usually left up to the engineers to define.

At Railway, our goal is to be an all-encompassing solution to all these problems. As such, we take special care as we define our networking infrastructure.

Note: Networking falls under the platform engineering umbrella. If you’re specialized, we’d love to chat! That said, we’d also like it noted you’re probably going to do a lot of non-networking + platform things

“But the world would be a better place if more engineers, like me, hated technology. The stuff I design, if I'm successful, nobody will ever notice. Things will just work, and will be self-managing”

- Radia Perlman

 
About The Role

For this role, you will:

  1. Build ingestion pipelines to consume 1M+ RPS streams of logs, metrics, and other telemetry

  2. Build scalable, fault-tolerant alerting engines for notifying users, in real time, of threshold breaches

  3. Craft rich backend observability APIs, working with product to build amazing experiences that let users instantly grok their applications

  4. Provide APIs to access realtime log/metrics streams to be consumed by the Dashboard and Product Teams

  5. Build Golang/Rust gRPC services from scratch capable of supporting tens of thousands of users, and the million+ to come

  6. Define infrastructure that can be torn down, failed over, and reconstituted from scratch, following the principle of immutable infrastructure and using Terraform and Ansible

  7. Write Engineering Requirement Documents to take something from idea, to defined tasks, to implementation, to monitoring its success

  8. Interface with our TypeScript and GraphQL edge to expose your microservice APIs for both internal and potentially external consumption

This is a high impact, high agency role with direct effect on company culture, trajectory, and outcome.

About You
  • A strong understanding of distributed systems. You enjoy building fault tolerant, resilient, and scalable services

  • An interest in VictoriaMetrics, ClickHouse, and other systems for building observability stacks from the ground up

  • A solid intuition about how long your solutions will last. All systems age. In startups, we can hope for 2-3 orders of magnitude, or 12-18 months.

  • The tact to implement your solution, create monitors for its error boundaries, and document any requirements for when you’re not around

  • A great sense of direction and prioritization when it comes to dealing with the ambiguity of an early stage startup

  • A sense of grit to dive into a problem, implement a solution, scale that solution, and replace it when needed

  • A great set of communication skills for getting your point across, solution implemented, and beyond

We value and love to work with diverse people from all backgrounds.

Things to know

For better or worse, we're a startup; our team dynamics are different from companies of different sizes and stages.

  • We're distributed ALL across the globe, and we're only going to become more distributed. As a result, stuff is ALWAYS happening.

  • We do NOT expect you to work all the time, but you'll have to be diligent about your boundaries because the end of your day may overlap with the start of someone else's.

  • We're a small team, with high ownership, who are not only passionate about what we do, but seek to be exceptional as well. At the time of writing we're 21, serving hundreds of thousands of users. There's a lot of stuff going on, and a lot of ambiguity.

  • We want you to own it. We believe that ownership is a key to growth, and part of that growth is not only being able to make the choices, but owning the success, or failure, that comes with those choices.

Benefits and perks

At Railway, we provide best-in-class benefits: great salary, full health benefits including dependents, strong equity grants, equipment stipend, and much more. For more details, check back on the main careers page.

Beyond compensation, there are a few things that we believe make working at Railway truly unique:

  • Autonomy: We have very few meetings. Just a Monday and a Friday to go over the Company Board. We think your time is sacred, whether it's at work, or outside of work.

  • Ownership: We're a company with a high ownership, high autonomy culture. We hope that you'll come in, help us, and over the course of many years do the best work of your life. When we bring you onboard, we expect you to change the company.

  • Novel problems/solutions: We're a startup that's well funded, with cool problems, which lets us implement novel solutions! We abhor “busywork” and think that, whether it's community, engineering, operations, etc., there's always opportunity for creative and high leverage solutions.

  • Growth: We want you to grow with us, but we know that talent is loaned, so when you figure out what area you want to grow in next, whether it's at Railway or outside, we'll make sure you land there.

How we hire

No tricks. No surprises. Here's the entire process:

  1. Talk with us about the role

    • This is completely open ended and we're just trying to see who you are, what you want to do, and where you wanna go.

  2. Work on a small project to discuss in the interview

    • Asynchronously implement the following:

      1. Imagine a theoretical or actual system like Railway which can manage stateless and stateful compute workloads. Design the engine for managing observability.

      2. Interview Structure (60 Minutes):

        1. Pre-work (before your interview): Complete your solution (advised)

        2. 0-5m: Introduction

        3. 5-50m: Building (or expanding) your solution

        4. 50-60m: Questions on Railway/Tech/etc.

        You can, and SHOULD! ask us questions ahead of time. Ask away!

  3. Review your solution with the Team

    You'll sit down with someone on the team and go over the above. We'll poke into your solution, as well as get you acquainted with two more members of the team.

    Looking for: Your problem-solving skills, how you break down a problem, and how you present a solution.

    1. Interview Structure (60 Minutes):

      1. Prework (submitted before your interview): Complete your solution

      2. 0-5m: introduction

      3. 5-50m: Building (or expanding) your solution

      4. 50-60m: Questions on Railway/Tech/etc

  4. Meet the Team

    1. You'll meet the Team, which will be comprised of 4 people from vastly different sections of the company.

      1. Looking for: How you work with the rest of the team and communicate.

  5. Offer and Details Chat with CEO

    1. Finally, we will go over the process, the role, and hammer out the details about your position, onboarding, and all the deets.

    Final Note: The interview goes both ways. Once again, please ask us things. Many things! Hard things. That's what we're here for.
