Morgan Stanley

Senior Cloud Engineer (AWS / Azure / GCP) - VP

Reposted 3 Days Ago
In-Office
New York, NY, USA
150K-210K Annually
Senior level
Design, build, and manage secure, scalable cloud platforms across AWS, Azure, and GCP. Focuses on incident response, automation with Terraform, and SRE principles to improve operational excellence in a multi-cloud environment.
The summary above was generated by AI

Role Summary

We are seeking a Senior Cloud Engineer / Site Reliability Engineer (SRE) to design, build, and operate secure, scalable cloud platforms across AWS, Azure, and GCP. This role is responsible for configuring, deploying, and maintaining virtual machines and containerized applications, using Terraform to automate infrastructure provisioning and lifecycle management. You will provide specialized support for high-stakes production deployments, lead incident response for technical escalations, and apply SRE principles (SLIs/SLOs, error budgets, automation, and reliability engineering) to improve availability, performance, and operational excellence in a multi-cloud environment.

Key Responsibilities

Cloud Platform Engineering (AWS / Azure / GCP)

  • Architect, implement, and maintain cloud infrastructure across AWS, Azure, and GCP using Terraform (IaC).
  • Design and implement cloud landing zones aligned with best practices:
    • Account/subscription/project structure, environment separation, identity boundaries
    • Baseline guardrails and policy enforcement (Azure Policy, AWS Organizations/SCPs, GCP Org Policies)
    • Centralized audit logging, monitoring, and cost allocation standards
  • Build and operate cloud-native virtual network constructs (cloud-focused only):
    • Azure: VNets, subnets, NSGs, route tables, Private Endpoints, hub/spoke patterns
    • AWS: VPCs, subnets, security groups, NACLs, route tables, VPC endpoints/PrivateLink, multi-account connectivity patterns
    • GCP: VPC networks, subnets, firewall rules, routes, Private Service Connect, Shared VPC patterns
  • Implement private-by-default service access patterns (private endpoints, controlled egress, service-to-service access controls).

Compute, Virtual Machines, and Containers

  • Configure, deploy, and maintain virtual machines and scalable compute patterns:
    • AWS EC2 (Launch Templates, Auto Scaling Groups)
    • Azure Virtual Machines / VM Scale Sets
    • GCP Compute Engine / Managed Instance Groups
  • Own OS hardening, baseline configuration, patching strategies, and instance bootstrapping (cloud-init, image pipelines).
  • Deploy and operate containerized workloads using Kubernetes:
    • EKS / AKS / GKE (cluster design, upgrades, node pools, RBAC, scaling)
    • Container registries (ECR / ACR / Artifact Registry) and artifact promotion strategies
  • Implement workload delivery patterns (Helm/Kustomize), rollout strategies (blue/green, canary), and safe rollbacks.

Infrastructure as Code, Automation & CI/CD (Terraform)

  • Build reusable, versioned Terraform modules with standards for naming, tagging/labels, and secure defaults.
  • Implement Terraform best practices: remote state, locking, environment isolation, secrets handling, and drift detection.
  • Integrate IaC into CI/CD pipelines (e.g., GitHub Actions, Azure DevOps, GitLab CI):
    • Automated validation, linting, security scanning, plan/apply workflows, approvals, and promotions
  • Implement policy-as-code guardrails (OPA/Conftest, Sentinel where applicable) to prevent unsafe changes.

SRE: Reliability Engineering, Observability & Operational Excellence

  • Define, implement, and improve SLIs/SLOs (availability, latency, error rates, saturation) for critical services and platforms.
  • Manage and enforce error budgets to balance reliability with delivery velocity.
  • Establish and continuously improve observability standards:
    • Metrics, logs, traces, dashboards, and alerting across cloud services and Kubernetes
    • Tooling such as CloudWatch, Azure Monitor/Log Analytics, GCP Cloud Monitoring/Logging, OpenTelemetry, Prometheus/Grafana (where used)
  • Improve incident detection quality by reducing alert noise, implementing actionable alerts, and creating clear escalation paths.
  • Drive reliability improvements through:
    • Capacity planning, performance tuning, load testing support
    • Resilience engineering (multi-zone design, graceful degradation, retries/timeouts, backpressure)
    • Continuous automation to eliminate toil (self-healing, auto-remediation runbooks, ChatOps where applicable)

Production Support, Incident Response & Escalations

  • Provide specialized support for high-stakes production deployments (major releases, platform cutovers, migrations).
  • Lead incident response: triage, mitigation, recovery, communication, and post-incident review (PIR/RCA).
  • Troubleshoot escalations across cloud services, Kubernetes, IAM, storage, and CI/CD pipelines using evidence-driven debugging.
  • Build and maintain runbooks, operational playbooks, and postmortem action tracking to prevent repeat incidents.
  • Participate in on-call rotation and continuously improve on-call health through automation and better observability.

Security, Identity, and Governance

  • Implement least-privilege access controls across AWS/Azure/GCP (IAM/RBAC), including role design and permission boundaries.
  • Enforce secure configurations: encryption at rest/in transit, secrets management, key management (KMS/Key Vault/Cloud KMS).
  • Implement compliance-oriented logging and auditing, and partner with security teams to remediate findings and harden platforms.

Required Skills & Experience

  • 10+ years in cloud engineering, platform engineering, DevOps, or SRE roles with significant production ownership.
  • Strong hands-on experience across AWS and Azure, plus practical experience in GCP (production exposure preferred).
  • Expert-level Terraform (modules, state, CI integration, scalable environment patterns).
  • Strong Kubernetes operations experience (EKS/AKS/GKE), including upgrades, scaling, and workload reliability.
  • Experience implementing SRE practices: SLIs/SLOs, alerting strategies, incident response, postmortems, and automation/toil reduction.
  • Strong Linux and scripting (Bash/Python) and ability to debug systems from symptoms to root cause.
  • Strong security fundamentals: IAM/RBAC, encryption, secrets, and auditability in cloud environments.
  • Proven ability to lead technical escalations and coordinate resolution across teams.

WHAT YOU CAN EXPECT FROM MORGAN STANLEY:

At Morgan Stanley, we raise, manage and allocate capital for our clients – helping them reach their goals. We do it in a way that’s differentiated – and we’ve done that for 90 years. Our values – putting clients first, doing the right thing, leading with exceptional ideas, committing to diversity and inclusion, and giving back – aren’t just beliefs, they guide the decisions we make every day to do what’s best for our clients, communities and more than 80,000 employees in 1,200 offices across 42 countries. At Morgan Stanley, you’ll find an opportunity to work alongside the best and the brightest, in an environment where you are supported and empowered. Our teams are relentless collaborators and creative thinkers, fueled by their diverse backgrounds and experiences. We are proud to support our employees and their families at every point along their work-life journey, offering some of the most attractive and comprehensive employee benefits and perks in the industry. There’s also ample opportunity to move about the business for those who show passion and grit in their work.

To learn more about our offices across the globe, please copy and paste https://www.morganstanley.com/about-us/global-offices into your browser.

Expected base pay rates for the role will be between $150,000 and $210,000 per year at the commencement of employment. However, base pay if hired will be determined on an individualized basis and is only part of the total compensation package, which, depending on the position, may also include commission earnings, incentive compensation, discretionary bonuses, other short- and long-term incentive packages, and other Morgan Stanley sponsored benefit programs.

Morgan Stanley is an equal opportunity employer committed to building and maintaining a workforce that is diverse in experience and background. Our recruiting efforts reflect our strong commitment to a culture of inclusion, where individuals are hired, developed, and advanced based on their skills and talents.

Our workforce reflects a broad cross-section of the global communities in which we operate, bringing a variety of backgrounds, talents, perspectives, and experiences.

For more information, please visit: https://www.morganstanley.com/people-opportunities/eeo.

HQ

Morgan Stanley New York, New York, USA Office

1585 Broadway, New York, NY, United States, 10036

Morgan Stanley New York, New York, USA Office

522 5th Ave, New York, NY, United States
