As a Data Engineer, you will design and maintain data pipelines, manage data lakes on AWS, and transform data into reliable datasets for analytics.
About Partao
Partao is building the global marketplace for heavy-machinery parts: a single platform that connects buyers and sellers across the industry. With 2,000+ brands and more than 5 million SKUs, we make sourcing faster, smarter, and more reliable by helping people find the exact part that fits their machine.
We’re focused on doing the right things well first. That means building something useful, something real, with impact that lasts. If you want to be part of a fast-moving team shaping a marketplace at global scale, Partao is the place to do it.
About the Role
We are looking for a talented and analytically minded Data Engineer to join our growing team. In this role, you will be at the heart of our data ecosystem: designing and maintaining robust data pipelines, managing our cloud-based data lake on AWS, and transforming raw data into clean, reliable datasets that power business-critical reporting and analytics.
Key Responsibilities
- Design, build, and maintain scalable ETL/ELT pipelines using orchestration tools such as Apache Airflow, dbt, or equivalent frameworks.
- Extract data from diverse sources (APIs, databases, streaming systems) into our AWS-based data lake.
- Transform raw, unstructured data into clean, well-modelled datasets ready for analytics and reporting.
- Own and evolve our data lake architecture, including multi-zone S3 storage and AWS Glue cataloguing.
- Manage relational and non-relational databases, ensuring optimal schema design, indexing, and query performance.
- Leverage AWS services (S3, Redshift, Glue, Lambda, EMR) to build scalable, cost-efficient data solutions.
- Process and analyse large-scale datasets using big data technologies such as Apache Spark.
- Collaborate with analysts and business stakeholders to translate reporting requirements into reliable data models.
- Contribute to the design and evolution of our overall data architecture, data governance, and quality standards.
- Document systems, data flows, and architectural decisions to a high standard.
Requirements
- Proven experience as a Data Engineer or in a similar data-focused engineering role.
- Strong proficiency in Python for data engineering tasks; experience with Rust is a significant advantage.
- Hands-on experience building and maintaining ETL/ELT pipelines, ideally using Apache Airflow.
- Deep knowledge of database management, both relational and non-relational.
- Solid experience with AWS cloud services relevant to data engineering (S3, Glue, Redshift, EMR, Lambda, IAM).
- Experience working with big data platforms and distributed computing frameworks (e.g. Apache Spark).
- Strong understanding of data lake architecture, including storage layers, partitioning, and data cataloguing.
- Excellent analytical thinking and problem-solving ability; you enjoy digging into complex data challenges.
- Ability to communicate technical concepts clearly to non-technical stakeholders.
Bonus Points
- Experience with Rust for performance-critical data processing.
Why Join Us?
- Shape the data backbone of a fast-scaling platform.
- Work with real-world data that matters to customers every day.
- Join a small, sharp, and globally distributed team.
- Own critical decisions and make an impact from Day 1.
What We Offer
- Opportunity to shape the data architecture of a growing organisation from an early stage.
- Opportunity to be a thought leader with a wide span of control in a fast-growing startup with experienced mentors.
- A challenging and rewarding environment where you can directly impact the future of the company and the industry.