Sonatus

Staff AI Engineer, Inference & Optimization

Sorry, this job was removed at 10:14 p.m. (EST) on Tuesday, Dec 02, 2025
In-Office
Sunnyvale, CA
198K-260K Annually

Join a high-performing team at Sonatus that’s redefining what cars can do in the era of Software-Defined Vehicles (SDV).

At Sonatus, we’re driving the transformation to AI-enabled software-defined vehicles. Traditional automotive software methods can’t keep pace with consumer expectations shaped by the mobile industry, where features evolve rapidly, update seamlessly, and improve continuously. That’s why leading OEMs trust Sonatus to accelerate this shift. Our technology is already in production in more than 5 million vehicles on the road today, and that number is rapidly expanding.

Headquartered in Sunnyvale, CA, with 250+ employees worldwide, Sonatus combines the agility of a fast-growing company with the scale and impact of an established partner. Backed by strong funding and proven by global deployment, we’re solving some of the most interesting and complex challenges in the industry. Join us and help redefine what’s possible as we shape the future of mobility.

The Opportunity:

We're looking for a highly skilled and experienced Staff AI Engineer with domain expertise in optimizing AI models for production edge environments. You’ll own the full lifecycle of model inference and hardware acceleration, from initial optimization to large-scale deployment. In this role, you will be a key contributor to our team, ensuring our AI solutions are not just functional but also fast, efficient, and reliable across a range of inference hardware platforms.

Role and Responsibilities:
  • Design, build, and maintain robust pipelines and runtime environments for deploying and serving machine learning models at the edge. Ensure high availability, low latency, and efficient resource utilization for inference at scale.
  • Collaborate with researchers and hardware engineers to optimize models for performance, latency, and power consumption on specific hardware, including GPUs, TPUs, NPUs, and FPGAs. This includes a strong focus on inference optimization techniques like quantization, pruning, and knowledge distillation.
  • Use AI compilers and specialized software stacks (e.g., TensorRT, OpenVINO, TVM) to accelerate model execution, ensuring models are compiled and optimized for peak performance on target hardware.
  • Design, build, and maintain MLOps pipelines for deploying models to various edge devices (e.g., highly integrated vehicle compute), with a specific focus on performance and efficiency constraints.
  • Implement and maintain monitoring and alerting systems to track model performance, data drift, and overall model health in production.
  • Work with cloud platforms and on-device environments to provision and manage the necessary infrastructure for scalable and reliable model serving.
  • Proactively identify and resolve issues related to model performance, deployment failures, and data discrepancies, with a specific focus on inference bottlenecks.
  • Work closely with Machine Learning Engineers, Software Engineers, and Product Managers to bring models from design to high-performance production systems.
Qualifications:
  • Minimum 7 years of work experience in MLOps or a similar role with a strong focus on high-performance machine learning systems.
  • Proven experience with inference optimization techniques such as quantization (INT8, FP16), pruning, and model distillation.
  • Deep hands-on experience with hardware acceleration for machine learning, including familiarity with GPUs, TPUs, NPUs and related software ecosystems.
  • Strong experience with AI compilers and runtime environments like TensorRT, OpenVINO, and TVM.
  • Proven experience deploying and managing ML models on edge devices (e.g., NVIDIA Jetson, Raspberry Pi, NXP, Renesas).
  • Strong experience in designing and building distributed systems. Proficiency with inter-process communication protocols like gRPC, message queuing systems like MQTT, and efficient data handling techniques such as buffering and callbacks.
  • Hands-on experience with popular ML frameworks such as PyTorch, TensorFlow, TFLite, and ONNX.
  • Proficiency in programming languages, including Python and C++.
  • Solid understanding of machine learning concepts, the ML development lifecycle, and the challenges of deploying models at scale.
  • Proficiency with containerization and orchestration technologies (Docker, Kubernetes) and cloud platforms (AWS, Azure).
  • Expertise in CI/CD principles and tools applied to machine learning workflows.
  • Bachelor's or Master's degree in Computer Science, Electrical Engineering, or a related quantitative field.
Benefits:

Sonatus is a tight-knit team aligned around a unified vision. You can expect a strong engineering-oriented culture that focuses on building the best products and solutions for our customers. We embrace equality and diversity in all regards because respect is ingrained in our every fiber. Other benefits Sonatus offers include:

  • Stock option plan
  • Health care plan (Medical, Dental & Vision)
  • Retirement plan (401k, IRA)
  • Life Insurance (Basic, Voluntary & AD&D)
  • Unlimited paid time off (Vacation, Sick & Public Holidays)
  • Family leave (Maternity, Paternity)
  • Flexible work arrangements
  • Free food & snacks in office

The posted salary range is a general guideline and represents a good faith estimate of what Sonatus ("Company") could reasonably expect to pay for a base salary for this position. The pay offered to a selected candidate will be determined based on factors such as (but not limited to) the scope and responsibilities of the position, the qualifications of the selected candidate, departmental budget availability, geographic location and external market pay for comparable jobs. The Company reserves the right to modify this range in the future, as needed, as market conditions change.

Pay range for this role
$197,500 - $260,000 USD
Sonatus is a fast-paced, innovative company, and we are seeking team members who are passionate about making a difference. If you are ready to take your career to the next level, we highly encourage you to apply.
 
To all recruitment agencies: Sonatus, Inc. ("Sonatus") does not accept unsolicited agency resumes. Please do not forward resumes to our careers alias or to other Sonatus employees. Sonatus is not responsible for any fees associated with unsolicited activities.
