Hex

Infra Engineer, Datadog Whisperer

Reposted 20 Hours Ago

Remote or Hybrid

3 Locations

Mid level

As an Infra Engineer, you will optimize Datadog costs by managing metrics, logs, and performance monitoring while promoting cost-awareness across the engineering team.

The summary above was generated by AI

About the role

We build products people genuinely love. Our features are impactful, our business is growing, and … it’s pretty great! We pride ourselves on many things, but let’s be honest: we have been operating on “ship it first, check the Datadog bill later”. And later has arrived. Our Datadog bill is an ever-growing line item that threatens to consume what remains of our cloud spend budget.

This is a critical juncture, where our monitoring costs are starting to overshadow the systems they're meant to monitor. We need a hero. A detective. A mercenary with a deep-seated love for logs, metrics, and most importantly, savings.

You are not just an Infra Engineer; you are an economic covert ops specialist. Your sole, glorious mission is to make our Datadog spend dramatically and sustainably go down. We're talking down down. The bill should look like it's been body-slammed by a professional wrestler.

You will be embedded within the Infrastructure team, and will have the autonomy to look across every team and service to streamline and purge that which needs streamlining and purging. As you rack up wins, you'll increasingly become the person we introduce at company meetings as, “The reason we could spend $$ on that nice company offsite.”

What you will do

Mitigation of myriad metrics: Hunt down and decommission all high-cardinality custom metrics that no one actually uses, replacing them with sane, aggregated alternatives, or build a system that insulates us from this risk area entirely.
Liberation from legions of logs: Audit the log ingestion for every service. You'll work with engineering teams to tune logging levels, apply intelligent sampling and exclusion filters at the source (i.e., the agent), and implement better categorization and archiving strategies.
Analysis of Performance Monitoring (APM): Analyze our APM and trace ingestion and ensure it’s smartly used. You'll champion distributed tracing strategies that are both informative and economical.
Standardization: Use automation to enforce cost-saving policies across our entire fleet, ensuring developers can't accidentally check in a new, expensive monitoring configuration.
Evangelization: Be the champion for cost-aware engineering. Create internal documentation, run "Datadog Dojo" workshops, and embed the mindset of "monitor what matters" across the entire engineering organization.

About you

3+ years as an Infrastructure, DevOps, or Site Reliability Engineer.
Expert-level, obsessive knowledge of Datadog's pricing model and platform architecture. You know how to read the usage report better than you know your own credit card statement.
Deep proficiency with AWS and Kubernetes.
Strong programming skills for infrastructure automation.
The courage to tell a founder or principal engineer that their favorite metric is financially irresponsible.

Bonus:

Experience with other monitoring/observability tools (Prometheus, Grafana, Honeycomb, Splunk) and a view on whether we should be using any of them to displace some Datadog functionality.
Experience implementing OpenTelemetry standards and agents for cost-effective vendor neutrality.
A proven track record of actually reducing cloud costs, not just talking about it.

Our stack

Our product is a web-based notebook and app authoring platform. Our frontend is built with TypeScript and React, using a combination of Apollo GraphQL and Redux for managing application state and data. On the backend, we also use TypeScript to power an Express/Apollo GraphQL server that interacts with Postgres, Redis, and Kubernetes to manage our database and Python kernels. Our backend is tightly integrated with our infrastructure and CI/CD, where we use a combination of Terraform, Helm, and AWS to deploy and maintain our stack.

In addition to our unique culture, Hex proudly offers a competitive total rewards package, including but not limited to, market-benched salary & equity, comprehensive health benefits, and flexible paid time off.

The salary range for this role is: Variable, depends on how much $$ you save

The salary range shown may be a reflection of additional factors such as geographical location and skill ranges/levels we’re open to. Placement in the salary range will be decided upon completion of the interview process, taking into account factors like leaving room for growth, internal fairness & parity, your demonstrated skills, and the depth of your experience. Our Recruiting team will be able to provide more details during the interview process.

By submitting an application the candidate consents to the use of their personal information in accordance with the Hex Privacy policy: https://learn.hex.tech/docs/trust/privacy-policy.

Top Skills

Apollo GraphQL

AWS

Datadog

Helm

Kubernetes

OpenTelemetry

Postgres

React

Redis

Terraform

TypeScript

44 W 18th St, New York, NY, United States, 10011
