Candidate Profile for a Great Data Scientist
Planning to hire a data scientist? Your best candidate possesses a combination of analytical, storytelling, and technical skills.
- By George Firican
- June 2, 2017
Are you looking for a data scientist? The role can include varying tasks -- from analyzing and interpreting data to data mining and designing data models. Although the domain is broad, I have collected the following recommendations for finding your perfect candidate.
The technical skills your data scientist needs will depend on your technical environment. Many companies make the mistake of trying to cover all their bases by casting a wide net and listing as many must-haves as they can think of, but it's not reasonable to look for someone who is proficient in all languages and tools.
For programming language requirements, you should look for someone who is well versed in procedural languages such as C++ and MATLAB and in data-oriented query languages such as SQL (for example, as used with MySQL) and SPARQL. Proficiency in R or Python as a statistical language is a must-have.
After building analytics models to discover insights, a data scientist also produces and presents these findings in dashboards and visualizations. Look for someone with good written and graphic communication skills, including data storytelling skills. The candidate should also have the technical skills for using a visualization tool such as Power BI, Tableau, Qlik, Domo, or D3.
Business acumen in your organization's domain is nice to have, but it's also something that can be learned on the job. Just ensure your candidate tends to use business analysis skills not just to learn about their environment, but also to draw out and understand the proper requirements from clients, who often do not know what they want at first.
Interests and Abilities
Some of the technical knowledge of statistics, programming languages, and tools can be secondary to knowing how to approach the data, use it, and present it.
Speaking from experience, candidates with the following characteristics make good data scientists:
- Curious and Persistent: Especially when working with new data sets or developing new models, data scientists don't always know what they will find. They need to test their assumptions and not be afraid to try different tactics if the initial ones don't work as planned. You need a data scientist who thrives on exploring this unknown and considers a wrong assumption not a failure but an opportunity to try a different route.
- Creative: Part of the value of a data scientist is in storytelling. Creativity helps the data scientist make a story informative and captivating at the same time. It also allows one to think "outside the box," for lack of a better term, when developing analytics models.
- Skeptical: Assumptions may be necessary when faced with missing information about data sets and gaps in the data, but a data scientist needs to question the data and the conclusions as much as possible. A good data scientist should want to know the provenance and credibility of the data sets, the data quality parameters and measures, the definitions of the business terms, and also the derived correlations, assumptions, and overall conclusions. A good level of skepticism allows the right questions to be asked, the proper analysis done, and arguments to be evaluated based on the right data.
- Accountable and Detail-Oriented: A good data scientist dives into the details of the data, tests all assumptions, and ensures the accuracy of the final results -- while staying accountable to agreed-upon deadlines.
A data scientist needs at least a basic understanding of statistics, including distributions, likelihood estimators, and statistical tests. The candidate needs to understand when each technique should be applied. Multivariate calculus and linear and matrix algebra are also good to know, plus the programming and database querying skills I mentioned above. I recommend that you look for a candidate with a degree in statistics, mathematics, computer science, and/or data mining. A postgraduate degree is preferable, but specific data analytics certificates or work experience could be substituted.
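As a rough illustration of what a "basic understanding" of likelihood estimators can look like in practice, here is a minimal Python sketch using only the standard library. It assumes a normal model; the sample values are made up and the function name `normal_mle` is hypothetical. A candidate should be able to explain why the two variance figures differ and when each one is appropriate.

```python
import statistics


def normal_mle(sample):
    """Return maximum likelihood estimates (mean, variance) under a normal model."""
    mu = statistics.fmean(sample)
    # The MLE of the variance divides by n, i.e. it is the population variance.
    var = statistics.pvariance(sample, mu)
    return mu, var


# Made-up values, for illustration only.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mu_hat, var_mle = normal_mle(sample)

# The unbiased sample variance divides by n - 1 instead of n.
var_unbiased = statistics.variance(sample, mu_hat)

print(f"MLE mean: {mu_hat}")
print(f"MLE variance (divides by n): {var_mle}")
print(f"Unbiased variance (divides by n - 1): {var_unbiased:.3f}")
```

The gap between the two variance estimates shrinks as the sample grows, which is the kind of judgment about technique that matters more than familiarity with any particular tool.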
In short, the best candidate possesses a combination of analytical, storytelling, and technical skills along with the interests and abilities your organization needs. A data scientist's role can vary from organization to organization, so use the above advice as guidelines and match your perfect candidate profile to your own specific requirements.
George Firican is the director of data governance and business intelligence at the University of British Columbia. His innovative approach to data management received international recognition through award-winning program implementations in the data governance, data quality, and business intelligence fields. As a passionate advocate for the importance of data, he founded www.lightsondata.com, is a frequent conference speaker, advises organizations about how to treat data as an asset, and shares practical takeaways on social media, on industry sites, and in publications. He can be followed on Twitter (@georgefirican) or reached via email or on LinkedIn.