Staff Data Engineer at Codecademy
Codecademy was started in 2011 by two college students in a dorm room at Columbia who were frustrated by the huge gap between education and employment. A few years later, we are a rapidly growing, diverse team of 150+ in SoHo, NYC. We’ve raised over $40m in venture capital funding from top investors including Union Square Ventures, Kleiner Perkins, Naspers, Y Combinator, and more.
If you want to help build a business that impacts tens of millions of people each year and helps them lead better lives, join us!
Codecademy is looking for a Staff (Senior) Data Engineer who will partner with the Director of Data Engineering and Data Science to devise a strategic roadmap to meet our data needs. Our application has accumulated billions of data points and generates millions more each day, which we want to use to build more intelligent recommendations and learning solutions for our millions of learners. We believe that everyone’s learning journey is unique and that we can leverage data to personalize learning for each user.
Our data team will work on three key areas: building a data platform, building and supporting ML models, and building and supporting analytics (internal and external). In this role you will design and implement various solutions that can support all of our data use cases from the ground up.
WHAT YOU'LL DO
- Assess our current data infrastructure and needs to devise a strategic roadmap
- Partner closely with the Data Science team and other key stakeholders to determine organizational and specifically product needs.
- Build scalable data infrastructure solutions for both internal and external use cases.
- Design and optimize new and existing data pipelines and streams.
- Integrate new data sources into our existing data architecture.
- Collaborate with a cross-functional team of software engineers and data scientists.
- Design data models to meet critical product and business requirements.
- SQL and data warehousing skills, including the ability to write clean and efficient queries.
- Hands-on experience building and maintaining large scale ETL systems.
- Deep understanding of database design and data structures.
- Experience with MPP columnar databases such as Redshift, Greenplum, or Vertica, as well as SQL and NoSQL data stores.
- Fluency in one of the following languages: Python, Scala, Ruby, Go.
- Experience working with cloud-based data platforms (we use AWS).
- Ability to make pragmatic engineering decisions, write extensive tests and create documentation
- Strong project management skills; a proven ability to gather and translate requirements from stakeholders across functions and teams into tangible results.
- Building anomaly detection systems that will help detect real-time data issues and improving internal tools.
- Experience with tools in our current warehousing stack: Apache Airflow, Redshift, Segment, Kinesis, S3, Looker.
- Familiarity with the database technologies we use in production: MongoDB, PostgreSQL or similar NoSQL / SQL stores
- Comfort with containerization technologies: Docker, Kubernetes, etc.
- Experience with big data processing technologies such as Apache Spark.
- Experience productionizing machine learning models, or interest in learning to do so.
At Codecademy, we are committed to teaching people the skills they need to upgrade their careers. Codecademy aims to educate a richly diverse demographic of users with our product, and to accomplish this, we believe our team should reflect that rich diversity. Our company celebrates diversity in all of its forms (race, gender, color, national origin, marital status, sexuality, religion, veteran status, age, ability, disability status) and works to create an inclusive workplace where people of all backgrounds and beliefs are empowered to better their futures.