Data Reliability Engineer at Foursquare
Since our inception in 2009, Foursquare has been a leading force in changing how location information enriches our real-world and digital lives. As a location intelligence company, Foursquare comprises two well-known consumer apps, Foursquare and Swarm, as well as thriving media and enterprise products. Our B2B offerings include Places (for developers), Pinpoint and Attribution (for marketers), and Place Insights (for analysts, based on the world's largest foot traffic panel). With more than 200 people across our offices in New York and San Francisco and in sales offices around the globe, we’re dedicated to our trailblazing mission: enriching consumer experiences and informing business
About the Data Platform Team:
The Data Platform team at Foursquare is responsible for data pipeline reliability and the data infrastructure that powers all offline jobs and data analytics. As a member of the Data Platform team, you will be responsible for platform modernization efforts, owning the data ingestion pipelines, supporting and enhancing the current processing frameworks (viz. Scalding and Spark), and working on cloud migration and disaster recovery efforts. The Data Platform team works very closely with all engineering teams, data scientists, and analysts, building out data tools and infrastructure to support their needs.
About the Data Reliability Engineering Role:
Data Reliability Engineering provides resilient, scalable, and performant data storage, data quality, and retrieval. As a Data Reliability Engineer, in addition to data quality frameworks, you will be responsible for data privacy and security, data cataloging, and the reliability of data pipelines.
Responsibilities:
- Implement and manage a data catalog
- Own data privacy initiatives across the company, primarily those covering data storage, encryption, and access
- Implement security access controls
- Develop implementation plans for disaster recovery, migration, rollback, and expansion
- Write specifications for onboarding new data, including troubleshooting, patch processes, cross-organizational data systems management processes, data security breach response plans, and risk management
- Build out data quality assessment and control frameworks to understand the operational health of the data platform
- Participate in Root Cause Analysis (RCA) processes
- Troubleshoot issues and participate in 24x7 on-call support, ensuring the stability of the data platform environment
Qualifications:
- 5+ years of experience in the big data space
- Operational mindset and experience with cloud providers like AWS or GCP
- Background in security or data privacy
- Knowledge of job orchestration systems like Luigi or Airflow
- Strong critical thinking ability to assess complex problems, analyze options, navigate diverse perspectives and develop optimal/acceptable solutions
- Knowledge of Scala and Python
- Experience with data serialization formats like Thrift, Parquet, and Avro, and with big data query systems like Hive and Presto
- Experience with distributed processing frameworks like MapReduce, Scalding, and Spark
- Knowledge of data governance practices, business and technology issues related to management of enterprise information assets and approaches related to data protection
Foursquare is proud to foster an inclusive environment that is free from discrimination. We strongly believe that, in order to build the best products, we need a diversity of perspectives and backgrounds. This leads to a more delightful experience for our users and team members. We value listening to every voice, and we encourage everyone to come be a part of building a company and products we love.
Foursquare is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected Veteran status, or any other characteristic protected by law.