For Data Scientists, Data and Analytics Skills Are Not Enough
Data scientist is a wildly popular job title, but a good data scientist needs more than strong analytics skills. Knowing the business is crucial.
By Mike Schiff
March 10, 2016
As data warehouse practitioners, almost all of us work directly with data. Our jobs are to help our business, educational, scientific, and governmental organizations gather, integrate, store, and/or analyze data so that they can make better decisions to increase sales, prevent fraud, improve security, enhance healthcare, or even improve a candidate's chances of winning an election.
The ability to produce actionable results from analyzing our organization's data has made the position of "data scientist" one of the hottest (and best-paying) job titles in our industry. In fact, in the October 2012 issue of the Harvard Business Review, Thomas H. Davenport and D.J. Patil published an often-cited article about this very topic, "Data Scientist: The Sexiest Job of the 21st Century."
Consequently, it is not uncommon for job-seeking analysts to now re-title their current positions as data scientists and for employment advisors to suggest that the term be included in their clients' resumes in order to positively flag the candidate for further consideration. However, there seems to be a continuing debate and discussion as to what skills would qualify someone to truly merit the title of data scientist.
Several universities (including the University of California, Illinois Institute of Technology, New York University, and Stanford) offer data science degrees, but there is no universally accepted data science certification, and the job description and responsibilities vary greatly from organization to organization.
Two general schools of thought have emerged. Both include strong technical skills such as statistical analysis and the ability to integrate heterogeneous structured and unstructured data sources as a base requirement, with a mastery of visualization techniques and programming skills being highly desirable. However, one school of thought also emphasizes the need to understand the organization's business and its processes in order to prioritize and focus analytical efforts and be able to explain the results to the business people who will ultimately act on them.
As an aside, I consider data scientists to be data explorers capable of programming their own tools when necessary. However, as the profession matures, I expect data scientist workbenches to evolve to the point that much of the custom programming that a data scientist might now need to do will become unnecessary.
In making this distinction, I am reminded of the (perhaps apocryphal) data mining story in which an insurance company discovered that the level of customer retention and sales was strongly correlated with the age of the building where its offices were located. The older the building, the better the sales! This "discovery" was about to be thrown out as a spurious outlier until a non-technical, long-term operational employee mentioned that the company had an informal policy of keeping each office in the location where it first opened rather than switching office locations.
What the data mining exercise actually discovered was the obvious fact that the top performing offices were the ones that had an established and loyal customer base while the offices in the newer buildings were, in many cases, recently opened and still beginning to establish themselves.
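For readers who like to see the arithmetic, here is a minimal sketch of how such a correlation can arise. The office figures are entirely hypothetical and were invented for illustration (they are not from the story), and the names `years_open`, `age_at_opening`, and `retention` are my own. Retention is made to depend only on how long an office has been open, yet building age still appears to "predict" it, because an office that never moves sits in a building that ages along with it.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical offices: every combination of "years open" and
# "age of the building on opening day" (all numbers are made up).
offices = [(years, age) for years in (1, 5, 10, 20, 30) for age in (0, 5, 10)]

years_open = [years for years, _ in offices]
age_at_opening = [age for _, age in offices]

# Offices never move, so the building has aged as long as the office has been open.
building_age = [years + age for years, age in offices]

# Assumption: retention (in percent) depends only on how long the office has been open.
retention = [60 + years for years in years_open]

corr_building_age = pearson(building_age, retention)      # strongly positive
corr_years_open = pearson(years_open, retention)          # the real driver
corr_age_at_opening = pearson(age_at_opening, retention)  # no relationship

print(f"building age vs. retention:    {corr_building_age:.2f}")
print(f"years open vs. retention:      {corr_years_open:.2f}")
print(f"age at opening vs. retention:  {corr_age_at_opening:.2f}")
```

In this toy data, the part of a building's age that has nothing to do with how long the office has been open (its age on opening day) shows no relationship with retention at all. Knowing about the stay-put policy is what tells the analyst which variable to examine.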
In my opinion, neither the data miner nor the operational employee would be considered a true data scientist, but someone having both sets of skills (a data miner who understood the business) would. I expect the debate to rage on as companies realize they may never find the ideal job candidate and thus become willing to settle for data technicians they believe will be able to learn the business.
My advice to any would-be data scientist is to make extra efforts to learn your organization's business in order to determine what additional information its executives need to address current and anticipated issues. That said, my advice is not limited to data scientists in particular or even data warehouse practitioners in general; rather, it is something all employees should practice.
About the Author
Michael A. Schiff is founder and principal analyst of MAS Strategies, which specializes in formulating effective data warehousing strategies. With more than four decades of industry experience as a developer, user, consultant, vendor, and industry analyst, Mike is an expert in developing, marketing, and implementing solutions that transform operational data into useful decision-enabling information.
His prior experience as an IT director and systems and programming manager provides him with a thorough understanding of the technical, business, and political issues that must be addressed for any successful implementation. With Bachelor and Master of Science degrees from MIT's Sloan School of Management and as a certified financial planner, Mike can address both the technical and financial aspects of data warehousing and business intelligence.