Lead design and implementation of scalable Databricks data platforms and end-to-end data pipelines using Spark and Delta Lake. Drive migrations from legacy systems, enforce governance/security, optimize performance and costs, and collaborate with stakeholders and data science teams to enable analytics and AI/ML use cases.
We are seeking an experienced Databricks Architect to lead the design and implementation of scalable data platforms on Databricks. The role will drive end-to-end architecture, including data ingestion, transformation, optimization, and governance, while enabling advanced analytics and AI/ML use cases. The ideal candidate will have strong expertise in Spark, Delta Lake, cloud platforms (Azure/AWS), and modern data engineering practices, along with the ability to collaborate with business and technology stakeholders to deliver high-impact solutions.
Responsibilities
- Lead the design and implementation of scalable, secure, and high-performance data architecture on Databricks
- Define end-to-end data pipelines (ingestion, transformation, serving) using Spark and Delta Lake
- Drive migration and modernization initiatives from legacy platforms to Databricks
- Establish best practices for data engineering, performance optimization, and cost management
- Design and implement data governance, security, and compliance frameworks
- Collaborate with business stakeholders, data scientists, and engineering teams to translate requirements into technical solutions
- Provide technical leadership, mentorship, and guidance to development teams
- Ensure data quality, reconciliation, and reliability across data workflows
- Integrate Databricks with enterprise tools (e.g., MuleSoft, Alteryx, BI/reporting platforms)
- Stay current with Databricks innovations and recommend adoption of new capabilities (e.g., ML, AI, Databricks SQL, Unity Catalog)
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
- 9-12+ years of experience in data engineering, data architecture, or analytics platforms
- Strong hands-on expertise with Databricks, Apache Spark, and Delta Lake
- Experience with cloud platforms such as Azure (preferred), AWS, or GCP
- Proven experience designing and implementing scalable data pipelines and architectures
- Strong knowledge of SQL, Python, and/or Scala
- Experience with data integration tools (e.g., MuleSoft, Alteryx) and modern data ecosystems
- Familiarity with data governance, security frameworks, and compliance best practices
- Experience with performance tuning, optimization, and cost management in Databricks
- Strong problem-solving skills and ability to work in a cross-functional, collaborative environment
- Excellent communication and stakeholder management skills
- Exposure to AI/ML use cases, Databricks SQL, and Unity Catalog is a plus
EXL New York, New York, USA Office
320 Park Avenue, 29th Floor, New York, NY, United States, 10022
EXL Jersey City, New Jersey, USA Office
Jersey City, United States
EXL Newark, New Jersey, USA Office
Newark, United States