Mecka AI

Forward Deployed Data Engineer

Posted Yesterday
In-Office
New York, NY, USA
180K-250K Annually
Senior level
Own end-to-end delivery of customer datasets: gather requirements, build and harden ingestion/transformation/QA/export pipelines, define dataset contracts and quality metrics, query and slice large corpora for model fit, and drive tooling improvements. Serve as technical customer contact and translate ambiguous needs into reproducible, versioned datasets suitable for robotics/embodied-AI training.
The summary above was generated by AI
About Mecka AI

Mecka AI is building the data infrastructure layer for robotics and embodied AI.

We partner with leading AI labs and robotics companies to deliver high-quality, real-world datasets used to train, evaluate, and deploy robotic systems - where model performance is dictated by data quality.

The Role

We are hiring a Forward Deployed Data Engineer to operate on the frontier with customers: take messy, real-world capture data - much of it raw video - and turn it into beautiful, reliable, model-ready datasets, while owning the technical relationship end-to-end.

This is a senior, high-trust role with significant autonomy. You'll combine data engineering, hands-on analysis, and product judgment to deliver datasets customers can train and ship on - and to make our delivery systems more reliable every time you do.

What You'll Work On

Customer Delivery & Technical Ownership
  • Own the end-to-end delivery of customer datasets: requirements, validation, iteration, and final handoff.

  • Be the technical point of contact: communicate clearly, set expectations, and close loops.

  • Turn one-off customer needs into durable internal improvements - tooling, pipelines, and standards that make every future delivery faster and safer.

Data Systems & Pipelines
  • Build, debug, and harden data pipelines across ingestion, transformation, QA, and export.

  • Work fluently across storage and database paradigms (SQL + NoSQL + object storage) and pick the right tool for the job.

  • Establish reliable dataset "contracts": schemas, versioning, provenance, and reproducible builds - so every dataset has a clear source of truth.

Dataset Quality & Signal
  • Define and measure what makes a dataset good for a given task: coverage, diversity, balance, label fidelity, and fitness for the customer's model.

  • Build quality scorecards and coverage/diversity reports that make dataset health legible to customers and internal teams.

  • Query and slice large corpora to maximize customer fit - surface exactly the data that matches a target distribution, not just bulk volume.

  • When the signal a customer needs is missing or weak in the raw video, diagnose it and partner with the perception/ML pipeline teams to extract or improve it upstream.

Who You Are

Required Background
  • 5+ years in data engineering and/or backend engineering (or equivalent impact).

  • Strong experience with large data systems, pipelines, and analytical workflows.

  • Strong SQL proficiency and comfort across multiple database/storage paradigms.

  • Excellent engineering judgment and debugging ability in production systems.

  • Genuine data taste - you can look at a dataset and reason about whether it's complete, balanced, and trustworthy, not just whether the job ran.

Strong Signals
  • You've owned high-stakes customer deliveries with autonomy and trust.

  • You can translate ambiguous requirements into crisp dataset specs and execution plans.

  • You have strong product instincts and care about polish: "would I trust this dataset?"

  • You're comfortable working with unstructured, real-world data - especially video.

Nice to Have
  • Working literacy in video understanding, embeddings, and encoders - enough to reason about what a dataset teaches a model and where signal is missing.

  • Experience building data-quality, coverage, or diversity tooling.

  • Background adjacent to ML, computer vision, or robotics data.

Why This Role
  • Own the customer-facing delivery loop for world-class robotics datasets.

  • High autonomy, high trust, and direct impact on customer success and revenue.

  • Work across the full stack of the problem: data, pipelines, analysis, and delivery quality.

  • Sit at the exact point where raw, messy, real-world data becomes the thing that makes embodied-AI models work.
