We are looking for a Cloud Systems Engineer to join our growing production operations team in a night-shift support role (midnight to 8 am ET). The candidate will be responsible for maintaining the strict production SLAs of a custom distributed application running on AWS cloud infrastructure. This covers monitoring, alerting, incident management, security, and overall platform stability and improvement, as well as resolving business requests that come through our internal ticketing system.
The ideal candidate will have experience with various Linux distributions (CentOS, Ubuntu, Amazon Linux), excellent troubleshooting and analytical skills, intermediate scripting knowledge, experience with monitoring solutions and metric analysis, and the ability to understand and troubleshoot network communication issues.
Finally, we are seeking someone who wants to be a major contributor in a small, dynamic work environment, loves a challenge, and has a strong balance of technical and people skills.
The Cloud Systems Engineer will be responsible for:
- Continuously improving system and application visibility through advanced monitoring, metrics, and log analytics
- Identifying root causes of critical problems throughout the platform, and handling incident reports and communication
- Maintaining, monitoring, and helping improve the performance and availability of the 24x7 production environment, including networks, servers, databases, etc.
- Troubleshooting and supporting our production systems
- Participating in creating long-term and short-term strategies for scaling the production environment
- Adhering to a comprehensive incident management program, including problem management
- Generating KPIs for service availability, uptime, and adherence to SOPs and SLAs
The ideal qualifications of the Cloud Systems Engineer:
- Must have hands-on Linux system-level administration experience
- Highly familiar with standard Linux commands (ps, top, netstat, chmod, etc.)
- Linux service troubleshooting experience
- Experience with Amazon Web Services (EC2, ECS, S3, etc.)
- Technical knowledge of log and metric monitoring and analytics tools (such as AWS CloudWatch, Sumo Logic, Datadog)
- Experience with Docker, Ansible, or Terraform is a plus
- Experience maintaining a secure production environment
- 2+ years’ experience with 24x7 production operations
No sponsorships at this time.
About mParticle
Founded in 2013, mParticle is the leading customer data platform that unlocks the full power of data for businesses. The company empowers brands to accelerate their growth strategy and keep pace with their customers by providing the most advanced data platform for web and apps across all devices in the marketplace. Trusted by renowned brands such as Airbnb, Foursquare, Hulu, King, and Spotify, the mParticle platform has grown to manage over 1 billion mobile users each month, capturing over $5 billion in ecommerce transactions and processing over 250 billion API calls. Recognized as one of Crain’s 100 Best Places to Work in New York City and named to Gartner’s “Cool Vendors in Mobile App Development” list, mParticle has 45 employees and is headquartered in New York City with offices in San Francisco, Florida, Seattle, and London.
Here at mParticle we embrace the differences that make us unique. We are dedicated to building an inclusive environment that fosters respect and celebrates an array of backgrounds and perspectives.
Employment opportunities are available to all applicants without regard to race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.