Demand for data scientists at tech companies is growing with the rise of big data, but the supply of talent hasn’t quite caught up in this relatively new field within the tech sector. That's partly because of a lack of awareness of just what data scientists do.
Generally, data scientists work with large datasets, develop predictive models, and grasp architecture, infrastructure and distributed programming, all while comprehending the theory behind the science. Here's what the day-to-day work looks like at a few local tech companies.
JW Player was a pioneer of video on the web in 2008. Today, the company provides a flexible video platform that helps clients distribute, manage and monetize videos on the web and mobile apps. We spoke with lead data scientist Nir Yungster to get a look into how the company’s data science team works.
Day to day, what are your responsibilities and priorities?
It varies quite widely, but many days are spent working on some component of an algorithm to recommend videos to viewers across the vast number of sites that use our player.
What is the data science team culture like at JW Player? Any traditions?
We have a love affair with Meal Pal. (It's great.)
How does the data science team collaborate with other teams within the company?
While the majority of the data science team's work revolves around customer- or user-facing data products like video recommendations, we regularly interact with teams around the company to better understand potential areas where we can bring value to the company.
What projects is your team currently working on?
In addition to video recommendations, a couple things we're also working on are automated thumbnail selection and scene break detection.
What’s an important lesson you’ve learned at JW Player?
When you're dealing with the scale that JW Player deals with, having an understanding of machine learning architecture (how algorithms are implemented) is as important as the algorithms themselves.
CB Insights is focused on predicting developments in the tech industry for clients. It does so by analyzing millions of data points from across the web on venture capital, patents, companies, partnerships and news mentions. We spoke with tech lead Amit Abbi to learn more about his role.
Day to day, what are your responsibilities and priorities?
We generally have a product or goal that we are working toward and have outlined experiments that we might want to try out in order to achieve that goal. My responsibilities basically include doing work required to help with this process. This involves trying to remove any obstacles in the team's path, helping out with modeling and data exploration, making sure the team has the right infrastructure/resources needed and also doing engineering work. I might spend some time researching a new approach or a new technology; some time might be spent writing code, some time on error analysis and so on.
What is the data science team culture like at CB Insights? Any traditions?
Our team is a part of the development team, and our culture is very similar. We are organized into four teams and each team has chosen a name for itself. We call ourselves Team Delphi, named after the seat of the oracle in ancient Greece. Using our data in creative ways is quite entrenched in our culture at CBI — many people, as a result, get themselves involved in data science at our periodic hack days. Some of the things that people worked on this last hack day included models that predict churn and improve recommendations.
How does the data science team collaborate with other teams within the company?
Our team is currently focused on building data products, so collaboration involves working closely with our research and content team, product team, data team and, of course, the engineering team, of which we are a part. When initially building our products, we collaborate with our research and content team to spec out as well as give us feedback on the insights our models extract. We might also consult with them and the data team when adding and exploring any new data sources. We may collaborate with the product team to figure out what form the data product takes when it shows up on our platform.
What projects is your team currently working on?
We are currently working on a number of projects that will ultimately help clients understand competitors and emerging technology trends. We are mining a variety of unstructured information sources like patents, news articles and regulatory docs to enable this. In addition, we are working on NLG (natural language generation) to develop narrative-driven insights akin to what an analyst might do. We think there are some very interesting opportunities ahead in expert automation, i.e. having our technology replicate aspects of human cognition.
What’s an important lesson you’ve learned at CBI?
Perfect is the enemy of good — this especially applies when working in data science teams or on data products.
What is the last thing you do before you leave the office every day?
Make my list of things I need to do the next day, catch up on interesting HipChat discussions and read the CBI newsletter.
Crossix is a healthcare analytics firm whose platform helps companies better understand and more effectively reach their audiences, analyze health outcomes, measure sales and maximize return on investment. Adam Dubrow, senior manager of advanced analytics, told us about the company’s data science efforts.
What is the first thing you do when you come into the office every day?
I’m a strong believer in tackling the most demanding, important tasks first thing in the morning. So I start the day with an iced coffee and, after addressing any urgent emails, dive into the most challenging and high-priority items on my to-do list.
Day to day, what are your responsibilities and priorities?
I lead the design, execution and delivery of advanced analytics engagements for our clients and am responsible for developing team members and new capabilities. On a given day, priorities will include meeting with clients to review projects, internal status meetings, analysis results and code reviews, research and planning to support capabilities development or problem solving, training new team members, and team meetings.
What is the data science team culture like at Crossix? Any traditions?
The pace of innovation makes it very exciting, and our ability to innovate is fueled by collaboration. Most of the walls in our office can actually be used as whiteboards, and it’s common for a project team to gather for a whiteboarding session — outlining a new methodology, storyboarding a deliverable or thinking through the implications of something we learned from the data. Some traditions include a team outing every quarter (a recent favorite was the Royal Palms Shuffleboard Club in Brooklyn) and impromptu happy hours — usually at a place with Mexican food.
How does the data science team collaborate with other teams within the company?
We have data and analytics experts working on teams throughout the organization, across product development and strategy, business development and client services, where I sit. Collaboration is in our DNA, and it happens naturally as individuals reach out to those on other teams with relevant expertise.
What projects is your team currently working on?
We’re focused on the next wave of innovations for our clients, including continuing to expand our deployment of machine learning and the use of real-world data to improve patient outcomes and solve business problems. We’re also always working on execution and delivery of projects for clients, with some current projects including patient segmentation, developing models to predict health outcomes and pinpoint relevant audiences for programmatic media, designing A/B tests to assess campaign creatives, and measurement and optimization models for national and addressable TV campaigns.
What’s an important lesson you’ve learned at Crossix?
I’ve learned from this environment that having a diversity of perspectives on every project team is one of the things that helps us deliver the best results to clients; a group of individuals applying their unique experiences and perspectives together tends to arrive at a better solution than any one team member would individually. It also creates a virtuous circle because diverse teams provide the best learning opportunities for our team members, which accelerates their career development, and this, in turn, helps to create an energizing, in-demand work environment.
What is the last thing you do before you leave the office every day?
Ernest Hemingway advised writers to stop when you are going good and when you know what will happen next. This is great advice that can be applied to ending a work day. I put it in practice by wrapping up everything I need to get out for the day, and then taking stock of where I’ll start next the following day, updating my to-do list accordingly.
Knewton is an adaptive learning platform that allows students, schools and universities to customize lessons to the individual. The company was founded in 2008 and has since helped more than 10 million students create courseware to meet their educational goals. We caught up with Illya Bomash, a managing data scientist at Knewton, to learn more about what it’s like on the company’s data science team.
What is the first thing you do when you come into the office every day?
Make an espresso, order lunch for the day and then try to get something big crossed off my to-do list. As a new parent, I get into the office earlier than I used to, and that morning time is a good chance to spend time on something that needs focus.
Day to day, what are your responsibilities and priorities?
Our team has to make sure our existing models are working well in our product and are integrated correctly into new product features. But we always have a "next thing" that we’re working on to push forward how much our product contributes to learners working to master a new domain of knowledge and to instructors who are tracking their students' progress.
What is the data science team culture like at Knewton? Any traditions?
Like the rest of Knewton, our team has a strong culture around caring about education. We run an education journal club, give talks called "Learning about Learning," and debate the strategies our models should try and metrics we should use to evaluate good learning. We do a lot of learning ourselves too. We attend conferences, discuss new results in machine learning and present to each other on the projects we’re working on.
How does the data science team collaborate with other teams within the company?
At Knewton, we’re building a product that powers adaptive educational experiences for learners and instructors. However, "adaptive education" has not really been defined yet — both in the market and especially in the minds of our users. There are many aspects of adaptivity that we have to define from scratch. This requires a partnership between the Data Science team and the Product team. We work closely with the Product team to learn more about underlying user needs, and to turn our models’ underlying capabilities into compelling user features.
What projects is your team currently working on?
We’re always looking to understand what sorts of educational approaches work best for different domains or kinds of content. We can work backward from looking at typical educational content for a new domain to how we should design the adaptive experience, given the models and content representations we’ve developed already. We then test these approaches (internally or with volunteers) to continue expanding the portfolio of content we can power.
We’re also constantly working on better understanding the effectiveness of our product. Although "learning outcomes" can mean a lot of different things, it’s our team’s responsibility to represent the relevant aspects of learning to the rest of the company and to the market.
What’s an important lesson you’ve learned while working at Knewton?
Any time you think you’ve explained the results of a complex model to a lay audience, take two more steps back. There are so many things we take for granted as a result of working on adaptivity every day. Even when we think what we’re saying makes sense, we’re making lots of assumptions. You have to guide the user through those assumptions in a way that makes them feel natural.
What is the last thing you do before you leave the office every day?
Make sure that espresso cup ends up in the dishwasher.
Enigma helps organizations and individuals break down internal data silos, analyzing enterprise data in conjunction with public data. The company’s platform provides an aggregated collection of public data from international government agencies, organizations and businesses. We spoke with Thomas Moran, senior data scientist and analytics team leader, senior data scientist Alex Chohlas-Wood and data scientist Nick Becker to learn more about how the company’s data science team operates.
Day to day, what are your responsibilities and priorities?
Moran: It all basically comes down to ensuring that everyone I’m working with has what they need to make progress and that I prep or follow through for my individual contributions.
Becker: I work with software and data engineers to improve and validate models, explore new ideas and work with project managers on how to best present results to the client.
What is the data science team culture like at Enigma? Any traditions?
Moran: Our data science team is individually accountable but mutually supportive. Most of our interactions are with other teams, not with each other, so everyone needs to be able to move a project forward with minimal oversight. But we carve out time to commiserate about technical and non-technical aspects of our individual work, and to collaborate on cross-functional projects. We also host a bi-weekly Data Science Forum that has the mission of nurturing an inclusive environment for data-driven development, including pop-up-style presentations from technical and non-technical participants alike.
How does the data science team collaborate with other teams within the company?
Chohlas-Wood: The data science team has one of the broadest reaches in the company. Our work, in the most elemental sense, is about thoroughly understanding needs — of clients, of internal teams, of the public — and pairing those needs with the appropriate solution. Because so much of this work involves working with others, we’re frequently embedded with engineering, commercial or our data teams as they work to make progress on projects and engagements.
What projects is your team currently working on?
Moran: Our Analytics data science team has three main functions: supporting the commercial team on customer engagements, supporting the product teams with product development and working closely with engineering teams to support customer deployments. This involvement with every aspect of the customer life cycle is one of the best parts of the role.
I like to say that when Enigma first starts working with a customer we’re often the most technical person in the room, and on the long tail of a customer relationship, we’re sometimes the most business-oriented person in the room. Combined with the broad range of business domains that Enigma handles, from financial crimes to public agencies to insurance to global logistics, I believe we offer the best data science roles in NYC.