Document & Publish Your Workflow: Jupyter Notebooks
In this tutorial we learn how to effectively and efficiently document and publish our workflows online.
At the end of this activity, you will be able to:
- Explain why documenting and publishing one's code is important.
- Describe a tool that makes it easy to publish code & output: Jupyter Notebooks with the Python kernel.
Documentation Is Important
As we read in the Reproducible Science overview, the four facets of reproducible science are:
- Documentation,
- Organization,
- Automation and
- Dissemination.
This week we will learn about the Jupyter Notebook as a tool to document and publish (disseminate) your code and code output.
"The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more." -- Jupyter Notebook documentation.
We use Markdown syntax in Notebook documents to document workflows and to share data processing, analysis and visualization outputs. We can also use Notebooks to create documents that combine code in our language of choice with its output and explanatory text.
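To make the idea of combining text and code concrete, here is a minimal sketch of what a notebook file looks like on disk. A notebook (.ipynb) file is a JSON document that holds a list of cells. The example below assumes the version 4 notebook format; the cell contents and variable names are made up for illustration.

```python
import json

# A minimal notebook in the version 4 JSON layout.
# The cell contents below are invented for illustration only.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            # A Markdown cell holds the documentation.
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Mean temperature\n", "Average the three readings below."],
        },
        {
            # A code cell holds the code and, once it has been run, its output.
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["readings = [12.1, 13.4, 11.8]\n", "sum(readings) / len(readings)"],
        },
    ],
}

# Serialize the notebook; saving this text to a file ending in .ipynb
# should let Jupyter open it.
notebook_json = json.dumps(notebook, indent=1)

for cell in notebook["cells"]:
    print(cell["cell_type"], "cell with", len(cell["source"]), "lines")
```

In practice we rarely write this JSON by hand, because the Jupyter application creates and edits it for us. Seeing the layout simply shows that text, code and output all live together in one shareable file.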
Jupyter Notebooks grew out of IPython. The name Jupyter is a loose acronym for Julia, Python, and R, the first languages that the Jupyter application was designed for. Jupyter Notebooks now support over 40 programming languages. You may still find some references to IPython in materials related to Jupyter Notebooks. This series will focus on using Jupyter Notebooks with Python, but the information presented can apply to other languages as well.
The Jupyter Notebook application is browser-based, so you need an up-to-date browser (the Jupyter project recommends Mozilla Firefox or Google Chrome, but not Microsoft Internet Explorer). Once Jupyter is installed on your computer, you can use the app even without internet access. You can also use Jupyter installed on a remote server. For example, the Jupyter project runs a temporary, server-based version for training.
Why Jupyter Notebooks?
There are many advantages to using Jupyter Notebooks in your work:
- Human-readable syntax.
- Simple syntax - it can be learned quickly.
- All components of your work are clearly documented. You don't have to remember which steps, assumptions, or tests were used.
- You can easily extend or refine analyses by modifying existing or adding new code blocks.
- Analysis results can be disseminated in various formats including HTML, PDF, slide shows and more.
- Code and data can be shared with a colleague to replicate the workflow.
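The output formats mentioned in the list above are typically produced with the nbconvert tool that is installed alongside Jupyter. As a rough sketch, the commands for exporting a notebook could be assembled as shown below. It assumes a notebook file named analysis.ipynb (a hypothetical name) and that the `jupyter` command is available on your system. The helper `build_export_command` is our own illustration, not part of Jupyter.

```python
import shlex

# Output formats for `jupyter nbconvert --to ...` that match the list above.
EXPORT_FORMATS = ("html", "pdf", "slides")


def build_export_command(notebook_path, fmt):
    """Return the nbconvert command, as a list of arguments, for one format."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    return ["jupyter", "nbconvert", "--to", fmt, notebook_path]


# Print the commands instead of running them, so this sketch works
# even where Jupyter is not installed.
for fmt in EXPORT_FORMATS:
    print(shlex.join(build_export_command("analysis.ipynb", fmt)))
```

Each printed line can be pasted into a terminal in the folder that contains the notebook. Note that the PDF export usually also requires a LaTeX installation.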
Explore Examples of Notebooks
Before we jump into how to work with notebooks, check out a few shared notebooks. As you look at these different notebooks, which aspects of the layout do you like, and which don't you like? Is there a place in your current workflow where these notebooks would be useful?
- Jupyter's GitHub Wiki: A gallery of interesting Jupyter Notebooks. This is a great collection of example notebooks and also a valuable resource for learning other skills associated with using Python and Jupyter Notebooks.
- Fabian Pedregosa's Notebook Gallery
In the next tutorial we will learn more about working with Jupyter Notebooks.