Design, build, and scale enterprise Databricks Lakehouse platforms. Lead architecture, platform standardization, ingestion frameworks (batch and streaming), governance, performance and cost optimization, and cloud integration, and mentor engineering teams.
Role Overview
We are looking for a highly skilled Databricks Architect to design, build, and scale enterprise-grade Lakehouse data platforms. This role will drive architecture strategy, platform standardization, and enterprise data modernization initiatives, leveraging Databricks and cloud ecosystems.
The ideal candidate brings deep expertise in Spark, Delta Lake, and cloud-native architecture, along with strong leadership in driving large-scale data transformations.
Data Platform Architecture
- Define and implement end-to-end Databricks Lakehouse architecture.
- Design scalable systems for:
  - Batch & real-time data processing
  - Structured & unstructured workloads
- Establish medallion architecture (Bronze, Silver, Gold layers) as a standard.
- Lead deployment and optimization of Databricks on Azure, AWS, and Google Cloud.
- Define standards for:
  - Workspace design & cluster strategy
  - Job orchestration
  - Data storage (Delta Lake)
- Drive adoption of:
  - Unity Catalog
  - MLflow
  - Databricks SQL & Photon
- Architect robust data ingestion frameworks:
  - Batch (Azure Data Factory, Airflow)
  - Streaming (Kafka, Azure Event Hubs)
- Define reusable patterns for:
  - ETL/ELT pipelines
  - Data modeling (star schema, data vault, dimensional models)
- Guide engineering teams on best practices in Spark/PySpark optimization.
- Optimize workloads for:
  - Query performance
  - Cluster utilization
  - Storage efficiency
- Implement cost governance strategies (auto-scaling, job clusters, spot instances).
- Architect enterprise-grade governance frameworks:
  - Data lineage, cataloging, metadata management
  - Fine-grained access control (RBAC/ABAC)
- Ensure compliance with data privacy and regulatory standards.
- Integrate Databricks with:
  - Data sources (ERP, CRM, APIs, IoT)
  - BI tools (Power BI, Tableau)
  - ML pipelines and AI platforms
- Collaborate with cloud architects on networking, security, and storage strategies.
- Provide architectural guidance to data engineers, scientists, and TPMs.
- Conduct design reviews and enforce architecture governance.
- Mentor teams on emerging patterns:
  - Data Mesh
  - DataOps / MLOps
  - GenAI workloads on Databricks
Mandatory Skills
- 12+ years of experience in data engineering, architecture, or platform design.
- 5+ years of hands-on experience with:
  - Databricks (must-have)
  - Apache Spark / PySpark / SQL
- Strong expertise in:
  - Delta Lake
  - Distributed data processing
- Experience with at least one cloud platform: Azure (preferred), AWS, or GCP.

