Data Strategy – Building a Successful Data Science Team

This is the second part of the Data Strategy series. The topic of building a successful Data Science Team is so important that we decided to expand it into a whole post. In the first part of the series, we explored the key points that define a successful data strategy. Now we are going to focus on the elements that make a data team succeed.

Why build a Data Science Team? Companies are realizing that the dream of finding the elusive Data Science Unicorn, the candidate who has it all, is either impossible or not feasible: such candidates are simply too expensive. As the field grows more complex, it is becoming impossible for one person to be good at everything needed to take full advantage of it. It is much more productive to build Data Science Teams from people with different profiles.

This guide has several audiences in mind:

First, it is meant for companies looking to hire new team members or start their Data Team from scratch.
It is also written for Data Managers who are looking for ways to optimize their Data Team's workflow and dynamics.
This guide should also be helpful to candidates looking for a new job. Job descriptions tend to differ from reality. If you are looking for a job, it really helps to understand the basics of data strategy, and in particular the elements of that strategy that contribute to successful Data Science teams. That way, you can judge whether the company has a meaningful vision and objectives for its Data Science efforts, and whether the dynamics encouraged by the team's structure and its place within the company are the right fit for your skills and preferences. It also allows you to identify unique opportunities to contribute to building a more successful team.

What Does a Data Science Team Do?

A Data Science Team is the group of people within the company that is in charge of the Data Science Process, from beginning to end. This includes:

Designing a strategy and the most efficient implementation for the acquisition of quality data. The quality and variety of data are so important that they can be the decisive factor that gives a company a competitive advantage.
Creating data frameworks (data warehouses and pipelines) for efficient delivery of data
Application of data analytics methods to look for patterns, group segmentation and a general quantitative understanding of the status quo of a particular business problem.
Creation of models to forecast and predict. In particular, Machine Learning models are proving to give the best performance.
Implementation of the results from models to real-world applications
Testing the performance of the model
Creation of attractive and easy-to-understand ways to communicate the team's findings (for example, through data visualization)
Getting insights from the results to use as input for starting a new cycle

Depending on the structure, size and capacities of the company, these functions will be performed by one or more individuals. There is no formula for what is best; each situation determines the right mix of people that will maximize results while making the most of the scarce resources that every company has.

Data Analytics vs Data Science Team: what are the roles and responsibilities?

One of the first questions to answer as part of a successful data strategy is whether the focus is going to be on Data Analytics or Data Science. Data Science is a bigger field that encompasses Data Analytics but does not have it as its focus. Hiring Data Scientists exclusively to do the job of a Data Analyst might lead to frustration and an inefficient use of resources since, in general, Data Scientists’ salaries tend to be higher. So here is a quick test that will point you in the right direction.

Does my company have a deep understanding of the quantitative characteristics that describe the current situation of the company?
Do we have a good reporting system and a stable definition of KPIs?
Do we have a new problem that is focused on predicting outcomes in the future?
Is this problem relevant to my business strategy?
Is it possible to gather data to measure this problem? In particular, the most useful methods in Machine Learning require data that is labeled and that can serve to train the models.

If the answer to these questions is no, your company likely needs to focus first on building a Data Analytics Team. A deep understanding of a company's status quo is a necessary base for guiding new, more complex models that should add value. This also shows in the type of data that is taken as input for the analysis. If your company has a bunch of unmined data, usually in the form of databases, then it is a good idea to start there.
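The quick test above can be written down as a small decision rule. The sketch below is only one possible reading of it: the names ReadinessAnswers and recommend_focus are hypothetical, and the rule that both foundation questions and all three predictive questions must be answered with yes before recommending a Data Science focus is our assumption, not a fixed formula.

```python
from dataclasses import dataclass


@dataclass
class ReadinessAnswers:
    """Yes/no answers to the five quick-test questions (hypothetical field names)."""
    understands_current_numbers: bool    # deep quantitative understanding of the status quo
    has_reporting_and_stable_kpis: bool  # good reporting system and stable KPI definitions
    has_predictive_problem: bool         # a new problem focused on predicting future outcomes
    problem_is_strategic: bool           # the problem is relevant to the business strategy
    can_gather_labeled_data: bool        # data (ideally labeled) can be gathered for it


def recommend_focus(answers: ReadinessAnswers) -> str:
    """Return "data science" or "data analytics" under the assumed rule."""
    # Assumption: the first two questions describe the analytics foundations.
    foundations = (
        answers.understands_current_numbers
        and answers.has_reporting_and_stable_kpis
    )
    # Assumption: the last three questions describe a workable predictive use case.
    predictive_case = (
        answers.has_predictive_problem
        and answers.problem_is_strategic
        and answers.can_gather_labeled_data
    )
    if foundations and predictive_case:
        return "data science"
    # Any "no" points back to building the analytics base first.
    return "data analytics"


if __name__ == "__main__":
    # A company with a predictive problem but no stable reporting yet.
    answers = ReadinessAnswers(True, False, True, True, True)
    print(recommend_focus(answers))  # prints "data analytics"
```

In practice the answers are rarely a clean yes or no, so a result like this is better treated as a conversation starter than as a verdict.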

Like anything in life, it does not need to be black or white. Your company might be at a middle stage of its data path, seeking to expand from a purely descriptive type of analysis to a predictive focus, or seeking to develop new strategies to gather data. For these situations, it is useful to understand how the different data job profiles differ. Creating a mix of these will ensure that you address your company's data needs from different fronts. Here (link to part 1 of Data Ecosystem post) and here (link to part 2 of Data Ecosystem) is a two-part series that we prepared to help you make sense of the Data Science Job Ecosystem.

Setting up a Data Science Team – Where should it be nested within the company?

One of the key determinants of the success of Data Science teams is the right design of their structure within the company. This implies much more than determining where the team of Data Scientists physically sits. What is at stake is the design of dynamics and interactions, both within the Data Science team and with the rest of the company. Accenture developed a very useful scheme of the different ways that companies organize their Data Science activities within their ecosystems. They identify three main factors that determine which model is more appropriate: the maturity of the Data Analytics teams, the priorities of the company, and the way supply and demand of data analytics capabilities are balanced. Each model fosters different characteristics for these three key factors.

Overview of the main organizational models for Data Science Teams:

Decentralized: each department or business unit runs its own analytics projects without coordination with others.
Centralized: there is one special department within the company that holds the data analytics resources and coordinates them to serve the various departments and business units. Activities are prioritized by treating all the data resources as one pool. This is the preferred organizational model for a Data Science Team when data analytics resources are very scarce and there is a strong need to set priorities at the company level.
Consulting: the data analytics resources are also concentrated in one department, with the particularity that they act like consultants that “charge” other departments for their services. Analytics resources tend to be allocated on a first-come, first-served basis.
Functional: it is similar to the decentralized model in that the data science activities are also spread across the departments, but the structure allows for coordination and cooperation to provide services to units with no analytics capacity of their own. The allocation of resources is decided on a functional, department-based basis rather than at the company level. This is a useful model when a company is just starting out on its data path. Each department covers its own needs without having to coordinate these efforts with the rest of the company.
Center of Excellence: there is a centralized body that coordinates smaller pockets of data analysts within departments or business units. This model balances the centralized and decentralized approaches.
Federated: very similar to the center of excellence model, with the difference that the centralized body deploys analyst resources to particular projects within departments and business units when this is necessary.

It is evident that some of these organizational models for Data Science teams only work once the company reaches a certain size. The strategy chosen needs to be aligned with the dynamics and structure that work for the company's other activities. The last two models are a new trend that allows for a lot of flexibility and innovation. There is a centralized team that can help with the implementation of complex cross-functional projects, while each unit keeps the flexibility of having its own strategy without always consulting or depending on other units. Another advantage is that Data Scientists have a centralized body that keeps them from feeling isolated.

Managing and retaining the dream team

Once you have managed to build a data science team that can deliver data projects from beginning to end, the challenge is to retain it. Having a solid data strategy, as described in the first part of this series, definitely helps. Data Scientists recognize how difficult it is to set up a whole strategy that allows for constant innovation. It is therefore important to show that you have a plan that is truly motivated by the business strategy, backed by the technological structure that allows for the agility that Data Science requires, and clear about roles and responsibilities. Moreover, Data Scientists are also motivated by curiosity and engagement. Do everything you can to foster a general culture of data within your company, and organize events like hackathons or Data Science conferences that give them the opportunity to present their work to the Data Science scene.

No matter how far you are in your data strategy path, we have the experience to help you. Contact us to find the right candidate(s) that will be the perfect fit for your needs.