This is a remote position.
We are looking for a Data Engineer with strong experience in Databricks, PySpark, and modern data warehouse systems. The ideal candidate can design, build, and optimize scalable data pipelines and work closely with analytics, product, and engineering teams.
Requirements
- Strong hands-on skills in Databricks, PySpark, and SQL
- Experience with data warehouse concepts, ETL frameworks, and batch/streaming pipelines
- Solid understanding of Delta Lake and Lakehouse architecture
- Experience with at least one cloud platform (Azure preferred)
- Experience with workflow orchestration tools (Airflow, ADF, Prefect, etc.)