At Arthur, we are building the first platform for Responsible AI. We’re looking for an experienced Data Engineer responsible for the design & implementation of pipelines that will crunch petabyte-scale data in order to power our ML monitoring platform. The ideal candidate is a hands-on-keyboard engineer who has built streaming & distributed data architectures, is always on the lookout for the next evolution in data processing technology, and is excited to mentor others to ensure the entire team is able to build & maintain data pipelines.
As a Senior Data Engineer, you will:
Design & build high-throughput capabilities, with particular emphasis on data engineering components, in close coordination with frontend & API engineers.
Work closely with the SRE team to ensure that data pipelines are resilient, performant, well-monitored, and scalable for both our SaaS and on-prem product offerings.
Exhibit continuous curiosity in understanding emerging technology that could improve our platform.
Mentor teammates on best data engineering practices.
Qualifications
4+ years of software engineering experience on a SaaS platform with emphasis on large-scale data systems
Experience building large-scale data systems using distributed file storage technologies such as HDFS and S3, and distributed processing frameworks such as Spark and EMR.
Experience with event processing and streaming data technologies, including message queues such as Kafka and stream processors such as Spark Streaming, Storm, and Kinesis.
Experience with multiple RDBMS & NoSQL technologies
Proficiency with Python (preferred) or other commonly used data processing languages such as Java or Scala
Understanding of multi-tenant platforms, providing secure & controlled access to data
Experience working with cloud environments such as AWS and GCP
CS (preferred) or other technical degree, or equivalent practical experience
Preferences
1+ years of experience as a technical lead or principal in data engineering
Experience with machine learning & AI and related tools such as TensorFlow and scikit-learn
Experience with analytics or data visualization architectures
Experience with on-prem deployment architectures
Experience running a 24x7 SaaS platform with an SLA
We offer
A small, fast-growing team with lots of opportunity to take ownership and run with projects
The opportunity to get in on the ground floor of a rapidly growing startup, working with a cutting-edge technology stack
Generous equity
A culture that empowers great people to accomplish great things
Full benefits package
Flexibility to work out of our NYC or DC offices, or remotely (we are fully remote for the foreseeable future).