Applico Capital is the leading venture capital firm focused on the $8 trillion B2B distribution industry. Drawing on what we have learned about the industry, we are building a tech startup, currently in stealth, to solve its biggest problems in unlocking AI-enabled synergies.
Our mandate is to leverage AI and modern technologies to reimagine the role of the traditional distributor and transform how the entire industry operates.
We are looking for highly technical builders who thrive in entrepreneurial, scrappy, and collaborative environments.
About the Role
We’re looking for a Full-Stack Data Engineer to help build our data foundation from the ground up. You’ll work across ingestion, transformation, enrichment, and delivery layers, connecting ERP, CRM, PIM, CMS, and external data sources into a unified, intelligent data environment.
This is a hands-on builder role ideal for someone who thrives in fast-moving, entrepreneurial environments. You’ll prototype, automate, and iterate quickly while helping establish engineering patterns that will scale across multiple operating companies.
Key Responsibilities
- Design and implement end-to-end data pipelines for ingestion, transformation, and enrichment using modern open-source tools
- Integrate data from core enterprise systems (ERP, CRM, PIM, CMS) and third-party APIs
- Build automated ELT/ETL workflows with observability, testing, and monitoring baked in
- Partner with Data Architecture and AI teams to prepare data for analytics, machine learning, and agent-driven workflows
- Develop lightweight APIs and internal tools (e.g., FastAPI, Streamlit, or Retool) to expose clean data products to internal teams
- Implement data quality, lineage, and governance frameworks to ensure reliability and transparency
- Contribute to the definition of open data models and schemas, following best practices for standardization and interoperability
- Use LLMs and AI-augmented tools to accelerate integration, cleaning, and mapping tasks
- Collaborate with product and business stakeholders to understand workflows and translate them into scalable data solutions
Requirements
- 4–7 years of experience in data engineering or full-stack data development, preferably in a modern cloud environment
- Strong skills in Python and SQL, with experience building production-grade data pipelines
- Hands-on experience with open-source data tools (e.g., dbt, Airbyte/Meltano, Dagster, Prefect, DuckDB, Postgres, Delta Lake, or Iceberg)
- Familiarity with data modeling and schema design (star/snowflake, normalized, or semantic/graph models)
- Experience working with cloud platforms (AWS, GCP, or Azure) and infrastructure as code (Terraform, GitHub Actions, etc.)
- Exposure to semantic or graph databases (Neo4j, Weaviate, ArangoDB) or an eagerness to learn
- Experience developing and consuming REST or GraphQL APIs
- Bonus: familiarity with LLM frameworks (LangChain, LangGraph, DSPy) or integrating AI enrichment into data pipelines
- Strong bias toward automation, testing, and documentation — you treat pipelines as products
- Comfortable operating in ambiguous, high-velocity environments where experimentation and impact matter most