Adaption

AI Systems & Inference Frameworks Engineer

Posted 4 Days Ago

Remote or Hybrid

2 Locations

Mid level

Design and build the LLM inference stack, optimize its performance and cost efficiency, and collaborate on model execution and infrastructure.

About Us

Most AI is frozen in place: it doesn't adapt to the world. We think that's backwards. Our mandate is to build efficient intelligence that evolves in real time. Our vision is AI systems that are flexible, personalized, and accessible to everyone. We believe efficiency is what makes this possible: it's how we expand access and ensure innovation benefits the many, not the few. We believe in talent density: bringing together the best and most driven individuals to push the boundaries of continual adaptation. We're looking for builders and creative thinkers ready to shape the next era of intelligence.

The Role

You’ll work directly with our founders to design and build the inference and optimization systems that power our core product. This role bridges research and production, combining deep exploration of inference techniques with hands-on ownership of scalable, high-performance serving infrastructure. You’ll own the full lifecycle of LLM inference, from experimentation and performance analysis to deployment and iteration in production. You’ll thrive in a zero-to-one environment and help define the technical foundations of our inference stack.

Responsibilities

Inference Research & Systems: design and build our LLM inference stack from zero to one, exploring and implementing advanced techniques for low-latency, high-throughput serving of language and multimodal models.
Frameworks & Optimization: develop and optimize inference using modern frameworks (e.g., vLLM, SGLang, TensorRT-LLM), experimenting with batching strategies, KV-cache management, parallelism, and GPU utilization to push performance and cost efficiency.
Software–Hardware Co-Design: collaborate closely with founders and model developers to analyze bottlenecks across the stack, co-optimizing model execution, infrastructure, and deployment pipelines.

Qualifications

Strong experience building and optimizing LLM inference systems in production or research environments
Hands-on expertise with inference frameworks such as vLLM, SGLang, TensorRT-LLM, or similar
Deep performance mindset with experience in GPU-backed systems, latency/throughput optimization, and resource efficiency
Solid understanding of transformer inference, serving architectures, and KV-cache–based execution
Strong programming skills in Python; experience with CUDA, Triton, or C++ a plus
Comfort working in ambiguous, zero-to-one environments and driving research ideas into production systems
Nice to have: experience with model quantization or pruning, speculative decoding, multimodal inference, open-source contributions, or prior work in systems or ML research labs

Above all, we're looking for great teammates who make work feel lighter and aren't afraid to go out on a limb with bold ideas. You don't need to be perfect, but you do need to be adaptable. We encourage you to apply, even if you don't check every box.

Benefits

Flexible work: In-person collaboration in the Bay Area, a distributed global-first team, and quarterly offsites.
Adaption Passport: Annual travel stipend to explore a country you've never visited. We're building intelligence that evolves alongside you, so we encourage you to keep expanding your horizons.
Lunch Stipend: Weekly meal allowance for take-out or grocery delivery.
Well-Being: Comprehensive medical benefits and generous paid time off.

Top Skills

C++

CUDA

Python

Triton
