‘I’ve Learned How Dynamic I Am’: Merck’s Data Science and Informatics Team Nurtures Multidisciplinary Innovation

Merck’s Data Science and Scientific Informatics team of Research and Development Sciences IT is tackling age-old questions and emerging challenges alike with a dynamic multidisciplinary approach — all while serving its mission of saving lives.

Written by Jenny Lyons-Cunha
Published on Aug. 17, 2023

Lillie Shelton’s career in science starts with a glinting childhood memory.

“Growing up in Southeast Texas, I always had a clear image of the Texas Medical Center in my mind’s eye,” said Shelton, an associate specialist at Merck. “Since I was a little girl, I’ve always had an interest in medicine and healthcare — I’ve kept that image at the forefront,” Shelton continued, adding that her family history bolstered a passion for healthcare.  

Shelton majored in materials science and engineering, with a concentration in biomaterials, at Johns Hopkins University, holding fast to the dream of applying her skills to biology and healthcare spaces. While she was finishing her degree in the throes of the pandemic, a serendipitous chain of events led her to Merck.

Her original internship was canceled because of Covid-19, Shelton said, leading her to apply for a role in Data Science and Scientific Informatics at Merck.

“DSSI gave me a path to nurture the skills in applied math and computing that I had never watered in the past,” she said.

That summer internship was transformative.

“It allowed me to see the impact that data could have in the healthcare space,” Shelton said, beaming. “It energized me — I realized how badly I want to be a part of this mission.”

Antong Chen, director of DSSI and a 12-year veteran of Merck, remembers Shelton’s start with the company well, having himself started as a Merck intern after his education in China and at Vanderbilt University.

“So many internships got canceled during the pandemic,” he said, noting that Merck’s initial posting garnered over 600 submissions in three days. “At the time, we wanted to pay back the community — but the effort has been paid back in droves.

“Of course, Lillie Shelton was among the applicants who really stood out.”

 

What Merck Does

For 130 years, Merck, known as MSD outside of the United States and Canada, has been inventing for life — bringing forward medicines and vaccines for many of the world’s most challenging diseases in pursuit of its mission to save and improve lives.

 

Associate Director of Data Science Kate Brown Williams’ journey to Merck was less traditional but just as intentional.

Williams’ career started on a pre-med track, which evolved into a PhD in metabolic biology. From her beginnings as a bench biologist in a wet lab to her current role in data science, Williams has chased ways to tackle long-standing questions with new and powerful tools.

Over her decade-long career, six years of which have been with Merck, she’s never lost sight of where her career started — with her mother.

“My mother suffered from cancer,” Williams said. “A lot of us here at Merck join because we have personal stories. It might have been too late to save the people we wanted to help so badly, but there are many people just like them. There’s a very important mission at work here.”

 

Merck team member stands in front of orange wall with branding reading "SPIE Medical Imaging"
Merck

 

 

Translating Scientific Problems Into Data Science Solutions

Put simply, the DSSI team at Merck is responsible for supporting Merck Research Lab (MRL), which coexists with Merck’s Manufacturing Division, Global Human Health, and Animal Health arms, all in service of developing life-saving treatments. The 38-person DSSI team covers several areas, including image data analytics, natural language processing, data science realization, bioinformatics, machine learning operations and decision science.

For DSSI, the ‘customers’ are internal: predominantly scientists who produce data in the MRL division, which is the research and development arm of the organization.

Williams is the sub-team leader for data science realization. Here, she serves a team exploring therapeutic composition of matter, which expands upon Merck’s bread and butter of chemistry-based work into antibody and large molecule modalities.

“When I dove into data science, there was a need for someone who could act as a translator,” Williams said. “I’ve lived on both sides of the conversation.” On one side of the discourse are computational scientists, more closely aligned with a data scientist persona, said Williams. On the other side are more traditional scientists with little to no coding experience but an eye for patterns. In the early stages of a data science project, both sides are trying to understand the problem.

“The question you’re trying to answer really needs to be a conversation,” said Williams.


 

“From the inception of the drug formula to prototypes to meeting market requirements, there are hundreds of interactions between scientists, engineers and beyond,” said Chen, who skews toward the data science side of the equation.

“Understanding biology and chemistry problems calls for working with people who have a wealth of scientific knowledge, but probably aren’t familiar with the intricacies of mathematics,” he added.

For lab-based scientists within Merck Research Lab, there are a number of pain points where DSSI data scientists have made an impact. Free text, both in public scientific literature and in company scientific notebooks, has become increasingly difficult for any individual scientist to keep abreast of, let alone structure and mine for analytics and reporting. Using NLP techniques such as BERT models and large language models, data scientists can surface insights that would have taken MRL scientists hours or days to compile manually, if they were able to at all.

For specific experimental data, especially data that is collected as images or signals, data scientists have applied state-of-the-art deep learning techniques. These include segmenting tumors from a CT image using a multidimensional unified Swin Transformer, or detecting cells in a microscopy image using the YOLO model. Both approaches reduce manual time for MRL scientists and improve the reproducibility of results.

The work that DSSI does is focused on increasing the efficiency of gleaning scientific insights from data in a robust and reproducible manner. The team’s MRL colleagues and collaborators use its data science tools to streamline their work and make decisions on which target to pursue, or which molecule to develop into a drug. 

DSSI has a further advantage due to where it sits within the organization. The team interfaces significantly with its IT colleagues, and has access to platform and infrastructure support critical for its work, such as AWS Cloud, Databricks and Dataiku. When a project evolves into a product, DSSI team members matrix with Agile teams that include scrum masters, software engineers and UI/UX developers.

Williams noted, “Some of our colleagues in MRL will code in R, especially if they have a life science background, but most of our applications use Python, which is helpful for scaling.”

 

DEI at Merck

  • Skills-first hiring and recruiting
  • Pipeline with the Society of Women Engineers and National Society of Black Engineers
  • A wide range of employee resource groups, including the Women’s Network

 

Merck employee presenting with blue curtain background
Merck

 

Gaining Multifaceted Experience

Last week, one of Chen’s colleagues celebrated his 30th year with the company. But the feat isn’t rare at Merck, he quickly added. There is a marked loyalty among the massive workforce.

“Merck doesn’t just take care of patients — it takes care of its people,” Chen said, outlining ample opportunity for professional development and perpetual advancement. 

As Chen noted, there is a unique mixture of experience and new, dynamic employees, like Shelton, who is earning her master’s degree in data science with Merck’s support. While earning her advanced degree, Shelton has been cycling through multiple facets of Merck as a part of the company’s talent rotation program. 

“The program immediately stood out to me as an opportunity to learn within such a large company,” said Shelton, who has passed through image data analytics, cloud computing, manufacturing, bioinformatics, research lab and beyond. “The experience has taught me so much about how dynamic I am.

“It’s always felt like Merck is the perfect environment for me to continue to grow and change and learn,” Shelton continued. “That’s not something that’s easy to come by.”

Simply put, Merck offers a comprehensive experience, Williams added.

“Merck offers an incredible diversity of problem types, an end-to-end pipeline of drug development,” she continued.

Employees gain access to a wealth of growth opportunities, which is bolstered by internal allyship, said Williams.

“There is a sense of pride and loyalty of being a part of Merck,” Williams continued. “People have a lot of interest in mentoring the next generation — paying forward all they’ve learned over the years.”

 

 

Responses have been edited for length and clarity. Images provided by Merck.
