Project Jupyter offers a wide range of tools (including JupyterLab, Jupyter Notebook, and JupyterHub, among others) that make interactive data analysis a wonderful experience. However, the capabilities that give power to individual data scientists can prove challenging to integrate in a team setting. The challenges range from peer review of code, to quality assurance of the analysis itself, to sharing the results with management or a client expecting a formal document.
This talk will consist of three parts:
1. An overview of the technology stack, with examples of why Jupyter has become such a popular tool in the professional data science toolkit.
2. The principles behind successful collaboration between data scientists and their managers in a team setting, illustrated with examples of several alternative workflows.
3. An exploration of how Jupyter can be tied into a larger data science ecosystem through its native support for popular data science languages (Python, R, Julia), Jupyter tools like nbdime that allow tighter integration with git (version control), and the use of tools like Apache Spark.
The intended audience for this talk is practicing data scientists and people who work closely with them (data science managers, for example).