Hearst

Principal Data Engineer

Posted Yesterday
In-Office
New York, NY, USA
230K-250K Annually
Expert/Leader
Senior technical leader responsible for architecting scalable data platforms and pipelines to support semantic and hybrid search, personalization, and GenAI use cases. Leads implementation of OLAP/OLTP, streaming, vector/graph-based search, entity enrichment, and production ML integration while managing offshore engineering teams and ensuring data governance, observability, and cross-functional alignment.
The summary above was generated by AI

The Enterprise Corporate Data Team is looking for a Principal Data Engineer, a senior technical leader responsible for architecting the core data infrastructure and platforms that power enterprise-scale AI applications. Reporting to the VP of Engineering, this role will focus on building systems that surface content, audiences, products, and more through semantic search to support personalization, audience discovery, and intelligent content discovery. The Principal Data Engineer will lead the end-to-end design and implementation of scalable pipelines, platforms, and systems that support semantic search across massive volumes of structured and semi-structured data using Gen AI technology. This individual will also coordinate with a team of offshore engineers, ensuring consistent delivery, code quality, and alignment with business and technical goals. The ideal candidate will possess an entrepreneurial ethos, an ability to operate in a dynamic environment, and a working knowledge of the current digital media landscape. The candidate should be an expert in, or knowledgeable about, search systems, including but not limited to similarity, hybrid, and semantic search. This role is based in New York City.

Key Responsibilities:

● Lead the design and implementation of high-performance OLAP and OLTP systems to support similarity and semantic search.
● Architect scalable data platforms that integrate structured and unstructured data, including behavioral signals, content metadata, and user engagement data, for Gen AI use cases.
● Build systems that enable semantic enrichment of content through entity recognition, disambiguation, normalization and deduplication techniques.
● Lead the design and build of high-throughput, low-latency, and highly relevant enterprise search systems using vector, graph, and other search strategies.
● Be familiar with relevance measurement techniques such as DCG and NDCG.
● Partner closely with other data engineers, ML engineers, and data scientists to deploy and operationalize models for content and audience intelligence.
● Oversee and coordinate with an offshore engineering team, providing technical guidance, code reviews, and project oversight to ensure timely, high-quality deliverables.
● Ensure best practices in data governance, quality, observability, and documentation across all engineering workflows.
● Collaborate with stakeholders across product, marketing, and data science to translate business needs into scalable AI data systems.
● Be well versed in architecting, designing, and developing large-scale OLTP and OLAP systems.
● Experience building and operating streaming systems using messaging systems such as Kafka, Pub/Sub, and SQS.
● Experience building a RAG or Graph RAG system with Google, OpenAI, or another Gen AI platform.
● Experience building a knowledge graph using Neo4j, Spanner, Neptune, or another tool is a plus.

Qualifications:

● 10+ years of experience in data engineering, with significant experience building large-scale, distributed data systems to support data analysis, AI/ML, and key business use cases.
● Proven expertise in search and search-related subsystems such as query understanding, search suggestions, ranking, and relevance, using modern strategies such as similarity search and hybrid search.
● Strong coding and data architecture skills using TypeScript, Python, SQL, and tools like Apache Spark, Kafka, Airflow, Node.js, and cloud-native platforms (e.g., AWS, GCP, or Azure).
● Hands-on experience integrating ML models into production environments for tasks such as entity extraction, text classification, or semantic search.
● Familiarity with AI grounding strategies, including but not limited to entity graphs.
● Experience managing and mentoring distributed/offshore engineering teams, with a track record of driving execution across time zones.
● Excellent communication and collaboration skills, with the ability to bridge technical execution and business strategy.

Preferred Qualifications:

● Experience in digital media, publishing, ad tech, or content platforms.
● Bachelor’s, Master’s, or Ph.D. in Computer Science, Data Engineering, or a related field.
● Knowledge of LLMs and generative AI in applied settings (e.g., content summarization, auto-tagging, retrieval augmentation).
● Working experience with OLAP and OLTP systems is a plus.

In accordance with applicable law, Hearst is required to include a reasonable estimate of the compensation for this role if hired in New York City. The reasonable estimate, if hired in New York City, is $230,000 to $250,000. Please note this information is specific to those hired in New York City. For candidates outside New York City, the salary range will be aligned with the specific location. A final decision on the successful candidate’s starting salary will be based on a number of permissible, non-discriminatory factors, including but not limited to skills, experience, training, certifications, and education. Hearst provides a competitive benefits package, including medical, dental, vision, disability, and life insurance, 401(k), paid holidays and paid time off, employee assistance programs, and more.

About Us
Hearst is a leading global, diversified information, services, and media company dedicated to innovating, informing audiences and leading with purpose, integrity and a culture of care.

Our portfolio includes more than 360 businesses worldwide. On the consumer side, we operate 35 television stations, 28 daily newspapers and publish more than 200 magazine editions featuring many of the most iconic brands in media. We also hold ownership stakes in leading cable networks such as A&E, HISTORY, Lifetime and ESPN. On the business-to-business side, our companies include Fitch Group, a global leader in financial information and analytics; Hearst Health, which provides intelligence and software that improve care outcomes; and Hearst Transportation, which delivers data and software for aviation, automotive and trucking.

Our strength lies in our people. We value the diverse perspectives that move us forward. We are an Equal Opportunity Employer and make employment decisions without regard to race, color, religion, national origin, sex or gender, sexual orientation, gender identity, gender expression, age, disability, military or veteran status or any other status protected by federal, state, or local law. We also provide reasonable accommodations to applicants and employees consistent with applicable law.

About the Team
Our Corporate Teams deliver essential programs and services that support the entire Hearst enterprise. Spanning communications, employee benefits, finance, learning and development, legal, technology, and more, these teams lead initiatives that advance Hearst’s mission to inform and inspire. Here, you’ll find opportunities to grow, collaborate and make a lasting impact.
HQ

Hearst New York, New York, USA Office

300 West 57th Street, New York, NY, United States, 10019
