Montauk Capital Jobs

Head of Inference, Stealth Edge AI Co

Montauk Capital

Reposted 5 Hours Ago

Hybrid

New York City, NY, USA

Expert/Leader

The Head of Inference will define the architecture for Edge AI, build a proof of concept, optimize GPU utilization, and drive technical decisions. This role requires deep knowledge of production inference systems and leadership in building distributed inference pipelines.

The summary above was generated by AI

Head of Inference

Full Time, Remote, NYC Preferred (US Based)

About Montauk Capital

Montauk Capital builds and backs companies at the forefront of the Electron Economy, the generational shift towards electrified, intelligent technologies reshaping industries and driving unprecedented demand for energy. Our team combines deep investing acumen with decades of operating experience to give founders the strategic clarity and hands-on support that accelerates the building of enduring companies of consequence.

About Stealth Edge AI Co

Co-founded by Montauk Capital, Stealth Edge AI Co is a pre-seed venture specialized in modular, metro-edge AI capabilities. By leveraging existing infrastructure for inference deployment, Edge AI provides low-latency, SLA-guaranteed performance across diverse GPU SKUs and colocation environments. Our technology intelligently routes traffic based on demand proximity and real-world network limitations, bypassing the heavy power and infrastructure requirements of traditional hyperscalers. Currently initiating operations with pilot nodes in NYC, we are executing a city-by-city expansion strategy with plans for a broader multi-metro rollout.

About the Role

We are seeking a visionary, execution-oriented Head of Inference. You'll define the inference architecture, make the foundational technical decisions, build the first POC, and own this domain end to end alongside the CEO. As a senior, hands-on technical leader, you will be the company's internal and external authority on inference. You will own the core inference capability that drives the platform and the customer experience, and you will have a strong voice in the technical foundation of the company. You'll turn the vision into a viable proof of concept, then design and implement the distributed inference systems that follow from it. Together with the CEO, you'll represent the company with top-tier partners, early customers, and investors. Beyond the CEO, you will have the support of a team of strong advisors and the initial founding team.

What You’ll Do

Create the inference strategy and define the inference architecture for Edge AI
Own the inference serving layer end-to-end: vLLM, TensorRT-LLM, Triton, or equivalent
Build a credible POC fast, one that proves the platform works to NVIDIA, cloud providers, and customers
Drive cost-per-token optimization
Optimize GPU utilization, KV-cache management, and batching for production workloads
Own observability and reliability SLAs
Build distributed inference pipelines across multi-GPU, multi-node edge deployments
Set performance baselines and SLAs for inference latency and throughput
Define quantization strategy
Translate complex inference requirements into infrastructure designs
Define the software access layer architecture and oversee integration efforts
Engage credibly with investors, partners, and technical stakeholders, and represent the company externally

What You’ll Bring

You have a passion for inference and a background as a hands-on technical builder who has directly implemented inference systems, ideally in production or near-production environments. You have deep knowledge of, and are excited about, model serving and the practical engineering required to make an inference system work on real hardware. You can translate a vision and initial concept into a viable POC quickly, and you are comfortable making foundational technical decisions fast, under ambiguity, while building first-of-a-kind systems.

If inference is your craft and you've built systems in production, we want to talk.

Production inference serving with vLLM, TensorRT-LLM, Triton Inference Server, or equivalent, distributed at scale
Quantization, SGLang, containerization, cost-per-token
Observability tooling: distributed tracing, latency profiling, alerting. Able to instrument and debug complex distributed systems, with a focus on building world-class observability and debuggability tools
C++/CUDA/Rust
GPU utilization and CUDA kernel optimization; has pushed hardware to its limits
Batching, KV-cache, speculative decoding expertise
Scale systems using Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference
Has built a serving system that NVIDIA and cloud providers respect
Model deployment and serving
Systems engineering
Technical leadership experience, either over teams or outcomes
Startup / 0→1 DNA: You ship fast and communicate clearly

Why Join Us

Category-Defining Opportunity: Solve the AI inference bottleneck without the burden of power and infrastructure constraints. Own metro-edge inference across heterogeneous, disparate compute nodes
Massive Market Opportunity: AI spending is projected to exceed hundreds of billions annually, with 54 GW of AI inference demand expected by 2030
Studio Support: Leverage Montauk Capital's resources, network, and operational expertise during critical early stages
Competitive compensation + equity: True ownership over what you build

New York, NYC, United States, 10001
