Given our continued growth, we always have room for more intellect, energy, and enthusiasm. Join our global team and see why it's so special to be a part of Mitratech!
Job Overview
Responsibilities
Design, deploy, and maintain scalable and secure infrastructure supporting AI and ML workloads.
Build and maintain AWS cloud environments for compute (EC2, ECS/EKS, Lambda), storage (S3, EFS, FSx), and networking (VPC, Transit Gateway, PrivateLink, Route 53, load balancers).
Implement security best practices using IAM, KMS, Secrets Manager, GuardDuty, and Security Hub.
Support and optimize AI/ML workloads across AWS services (SageMaker, Bedrock, Batch, Step Functions).
Develop and maintain Infrastructure as Code (IaC) using Terraform, AWS CDK, and CloudFormation.
Manage containerized workloads and orchestration platforms (Docker, EKS, Fargate), including GPU scheduling and scaling.
Set up and maintain monitoring and observability frameworks using CloudWatch and OpenTelemetry.
Build and manage CI/CD pipelines (CircleCI, GitHub Actions, GitLab CI) for infrastructure automation and ML/Gen AI deployments.
Collaborate with ML and Generative AI teams to scale models, optimize performance, and design efficient prompt or inference pipelines.
Develop runbooks and SOPs for AI service deployment, troubleshooting, and performance optimization.
Ensure security, compliance, and data protection across AI datasets and environments.
Required Skills & Experience
Strong proficiency in Linux administration and systems-level troubleshooting.
Deep expertise in AWS cloud services, with experience in compute, storage, networking, and security domains.
Proficiency in container orchestration (Kubernetes/EKS) and infrastructure automation tools.
Hands-on experience with IaC tools such as Terraform, AWS CDK, or CloudFormation.
Familiarity with monitoring, logging, and observability stacks (Prometheus, Grafana, OpenTelemetry).
Experience implementing CI/CD pipelines for automated deployment and testing.
Understanding of AI/ML concepts, including model deployment, inference scaling, and LLM performance tuning.
Working knowledge of security best practices, IAM roles, encryption, and compliance controls.
Excellent collaboration and communication skills to partner with ML engineers, data scientists, and product teams.
Education:
A Master’s degree in Machine Learning or Computer Science, with a preference for specialization in the NLP domain.
We are an equal-opportunity employer that values diversity at all levels. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, national origin, age, sexual orientation, gender identity, disability, or veteran status.