Senior Site Reliability Engineer (SRE) - Big Data Team/Machine Learning
About The Opportunity
Got a taste for something new?
We’re Grubhub, the nation’s leading online and mobile food ordering company. Since 2004 we’ve been connecting hungry diners to the local restaurants they love. We’re moving eating forward with no signs of slowing down.
With more than 90,000 restaurants and over 15.6 million diners across 1,700 U.S. cities and London, we’re delivering like never before. Incredible tech is our bread and butter, but amazing people are our secret ingredient. Rigorously analytical and customer-obsessed, our employees develop the fresh ideas and brilliant programs that keep our brands going and growing.
Long story short, keeping our people happy, challenged and well-fed is priority one. Interested? Let’s talk. We’re eager to show you what we bring to the table.
About the Opportunity:
Senior Site Reliability Engineers are embedded in Big Data-specific dev teams to focus on the operational aspects of our services, and our SREs run their respective products and services from conception to continuous operation. We're looking for engineers who want to be a part of developing, maintaining, and scaling infrastructure software. If you enjoy focusing on reliability, performance, capacity planning, and automating everything, you’d probably like this position.
Some Challenges You’ll Tackle
TOOLS OUR SRE TEAM WORKS WITH:
- Python – our primary infrastructure language
- Cassandra
- Docker (in production!)
- Splunk, Spark, Hadoop, and PrestoDB
- AWS
- Python and Fabric for automation and our CD pipeline
- Jenkins for builds and task execution
- Linux (CentOS and Ubuntu)
- DataDog for metrics and alerting
- Puppet
You Should Have
WHAT’S ACTUALLY REQUIRED:
- Python or Java / Scala development experience
- Experience working on projects in the Hadoop ecosystem
- Experience with streaming data platforms (Spark Streaming, Kafka, Apache Flink)
- Familiarity with Cassandra, MySQL, and Elasticsearch
- Experience with AWS services such as Kinesis, IAM, EMR, Redshift, and S3
- Familiarity with using and supporting analytics systems like Hive, Redshift, Presto, Tableau and similar tools.
- Familiarity with performance debugging and tuning at the OS, JVM, and cluster (MapReduce, Hive, Spark jobs) levels.
- Experience developing solutions leveraging Docker
- Experience managing Linux (CentOS, Ubuntu) systems
- Configuration management experience with Puppet, Chef, or Ansible
- Continuous integration, testing, and deployment using Git and Jenkins
- Exceptional communication and troubleshooting skills.
NICE TO HAVE:
- Experience with relational databases (MySQL)
- Bonus points for deploying/operating large-ish Hadoop clusters in AWS/GCP and use of EMR, DC/OS, Dataproc.
And Of Course, Perks!
- Unlimited paid vacation days. Choose how your time is spent.
- Never go hungry! We provide weekly Grubhub/Seamless credit.
- Regular in-office social events, including happy hours, wine tastings, karaoke, bingo with prizes and more.
- Company-wide initiatives encouraging innovation, continuous learning and cross-department connections.
Grubhub is an equal opportunity employer. We evaluate qualified applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, veteran status, and other legally protected characteristics. The EEO is the Law poster is available here: DOL Poster. Grubhub is committed to working with and providing reasonable accommodations to individuals with disabilities. If you need a reasonable accommodation because of a disability for any part of the employment process, please send an e-mail to [email protected] and let us know the nature of your request and your contact information.