Data Engineer at CLEAR
CLEAR helps create safer, easier experiences everywhere you go. We believe you are you and by using your biometrics – your eyes, face, and fingerprints – we keep you moving. Imagine a world where you can do virtually everything you need to – breeze through the airport, buy a beer at the game, check in at the doctor’s office, access your office building, and more – without ever pulling out your wallet. CLEAR is currently available in 50+ airports, venues and more. Now with Health Pass, CLEAR securely connects a person’s digital identity to multiple layers of COVID-related insights to help reduce public health risk and restore peace of mind.
We’re defining and leading an entirely new industry, obsessing over our customers, and investing in great people to lead the way. Recently named on CNBC’s Disruptor 50 List for the second year in a row and winner of the SXSW Interactive Innovation Award, CLEAR is providing innovative technology options for businesses and our 5+ million members to help create a safer environment no matter where you go.
We are looking for a talented and motivated Data Engineer who is experienced in building data pipelines from various internal and external sources to support the reporting and analytics needs across the entire business. This is an amazing opportunity to help build out our next-generation data platform.
The Data Engineer will collaborate with our software engineers, database developers, and data scientists to ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products. The right candidate will be excited by the prospect of optimizing and/or redesigning our company’s data architecture to support our current and future products and data initiatives.
What You Will Do:
- Build out our data pipeline architecture and optimize data flow and collection for cross-functional teams.
- Lead the development of a Data Lake solution that can be used for reporting and analytics across the entire organization.
- Work closely with our engineering teams to integrate data sources across a multitude of microservices.
- Work with our Data Warehouse, Data Science, and Product teams to ensure that we have high-quality data that meets the needs of the business.
- Drive data acquisition and technology improvements to help our systems evolve with our needs.
Who You Are:
- You have 3+ years working in an AWS environment, with experience using one or more of the following: Kinesis, EMR, RDS, S3, Glue, Athena, DynamoDB
- You have experience developing against internal and external APIs to consume data from disparate structured and unstructured sources
- You have 5+ years of experience with languages such as Python, Java, or Scala
- You have experience with big data tools such as Hadoop, Spark, Hive, Hudi, Presto, Sqoop
- Experience with stream-processing systems: Kafka, Storm, Spark Streaming
- You have experience with SQL databases such as Redshift, SQL Server, Snowflake, BigQuery, Oracle, Postgres, MySQL
- Experience with NoSQL databases such as Redis, Cassandra, CouchDB, MongoDB, Elasticsearch
- Experience with data pipeline and workflow management tools: Airflow, Luigi, Oozie, Azkaban, etc.
- Experience with queuing systems: SQS, RabbitMQ