The Public Sector Site Reliability Engineer will manage cloud infrastructure, ensure compliance with regulations, implement observability tools, lead incident response, and enhance automation in federal environments.
At Unstructured, we’re building the backbone of generative AI, helping federal agencies transform PDFs, HTML, Word docs, images, and more into secure, high-performance data pipelines that scale. Our tools are trusted by nearly half of the Fortune 500 and have been downloaded more than 38 million times in the open-source community.
We’re expanding our federal/public sector practice, and we’re hiring a Public Sector Site Reliability Engineer (SRE) to help design, scale, and secure the systems that power the next generation of AI-driven workloads for government.
What You’ll Own & Drive
🔐 Mission-Grade Reliability & Security
Design, build, and manage secure, highly available, and scalable cloud infrastructure for federal environments.
Ensure compliance with FedRAMP, FISMA, and other relevant security and regulatory frameworks.
Develop infrastructure as code (IaC) with Terraform, Pulumi, or similar tools for repeatable, compliant deployments.
Build and maintain automated CI/CD pipelines that move fast without sacrificing security or stability.
📊 Full Observability in Sensitive Environments
Implement and maintain monitoring, logging, and alerting (Prometheus, Grafana, Datadog, Elastic).
Enable real-time visibility and rapid response for mission-critical workloads.
Partner with engineering and program teams for high-assurance rollouts.
Lead capacity planning, deployment strategies, and resilient architecture design for federal networks.
🔥 Incident Response & Continuous Improvement
Lead incident response and root-cause analysis with a blameless, systems-thinking approach.
Drive postmortems and reliability improvements.
Enhance developer experience with secure automation and streamlined workflows.
Help teams iterate quickly while maintaining compliance and operational excellence.
What You Bring
5–9 years managing software deployed to US government or Department of Defense (DoD) networks.
Active SECRET clearance required; TS/SCI strongly preferred.
Expertise with AWS GovCloud and/or Azure Government.
Deep experience with Kubernetes, Docker, and container orchestration at scale.
Strong Linux systems and networking fundamentals.
Scripting/automation: Python, Bash, or Go. IaC: Terraform, Pulumi, Ansible (or similar).
Strong grasp of monitoring, logging, and observability best practices.
Up to 20% travel required.
Bonus Points
ML infrastructure or real-time data pipelines experience.
Serverless or event-driven architectures.
Contributions to open-source DevOps/SRE projects.
Hands-on work with US government security/compliance in cloud-native settings.
Unstructured values service and encourages veterans of the US military and civilian agencies to apply for this role.
Why You’ll Love It Here
Mission Impact: Power critical AI workloads in the public sector.
Big Technical Challenges: High-assurance problems at the edge of AI, data, and cloud.
Elite Team: Sharp, low-ego engineers who value execution and learning.
Innovation + Security: Build cutting-edge systems with rigorous reliability for federal use cases.
Top Skills
Ansible
AWS GovCloud
Azure Government
Bash
Datadog
Docker
Elastic
Go
Grafana
Kubernetes
Prometheus
Pulumi
Python
Terraform