Design and implement scalable data pipelines using Databricks, develop ETL processes, optimize workflows, ensure data security, and collaborate with stakeholders.
Data Integration Lead
Responsibilities
- Design and implement scalable data pipelines for ingesting and transforming data, primarily using Databricks with PySpark notebooks, Spark SQL, and Python.
- Develop ETL processes to extract, transform, and load data from diverse sources into a data lakehouse architecture on Databricks.
- Analyze the existing Teradata integration landscape (TPT, BTEQ, Talend, IBM Sterling).
- Define the ingestion and integration strategy for Databricks.
- Ensure seamless data flow from source systems to the Lakehouse.
- Lead integration design and oversee pipeline migration.
- Optimize data processing workflows to enhance performance and efficiency using Databricks capabilities.
- Ensure data security and compliance with data privacy regulations throughout the data engineering process.
- Deliver high-quality data products based on business requirements.
- Collaborate with stakeholders to gather and understand requirements, and create technical solutions using the Microsoft Azure stack.
- Create comprehensive documentation of workflows, pipelines, and architecture.
