This post is inspired by a video from the 2017 PyData conference in Berlin. Here I focus on several of its main points.

Notebook structure

<aside> ☝ How big should a notebook file be?

Hypothesis — Data — Interpretation

</aside>

<aside> ☝ Keep your notebooks small!

(4-10 cells each)

How?

</aside>

I found this part particularly surprising, because the notebooks accompanying my previous research papers have been huge. But after watching the talk, I came around to this viewpoint.

Example: a fat notebook is split into several files in one directory.

Cache and images are kept in separate folders.
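Here is a minimal sketch of how separate cache and image folders could be wired up from a notebook. The folder names `cache` and `images` and the helpers `make_dirs` and `cached` are my own assumptions for illustration, not something taken from the talk.

```python
import pickle
from pathlib import Path


def make_dirs(root):
    """Create cache/ and images/ under root and return both paths.

    The folder names are hypothetical; use whatever your project prefers.
    """
    root = Path(root)
    cache_dir = root / "cache"
    image_dir = root / "images"
    cache_dir.mkdir(parents=True, exist_ok=True)
    image_dir.mkdir(parents=True, exist_ok=True)
    return cache_dir, image_dir


def cached(cache_dir, name, compute):
    """Return the pickled result stored under name, computing it on a miss."""
    path = Path(cache_dir) / f"{name}.pkl"
    if path.exists():
        # Cache hit: skip the expensive step entirely.
        with path.open("rb") as fh:
            return pickle.load(fh)
    # Cache miss: compute once and store the result for later runs.
    result = compute()
    with path.open("wb") as fh:
        pickle.dump(result, fh)
    return result
```

With this in place, a small notebook can call `cached(cache_dir, "features", build_features)` (where `build_features` is whatever slow step you have) and re-running it stays cheap, while figures go to `image_dir` instead of cluttering the notebook folder.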

<aside> ☝ Use shared libraries.

</aside>
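To make the shared-library idea concrete, here is a small sketch of a helper that lives in a plain Python module instead of being copy-pasted between notebooks. The module path `lib/cleaning.py` and the function `normalise_columns` are hypothetical names I picked for the example.

```python
# Contents of a hypothetical shared module, e.g. lib/cleaning.py


def normalise_columns(names):
    """Strip whitespace, lower-case, and replace spaces with underscores."""
    return [name.strip().lower().replace(" ", "_") for name in names]


# In each notebook, the imports cell would then contain something like:
# from lib.cleaning import normalise_columns
```

Every notebook in the directory then uses the same implementation, and a fix made in the module reaches all of them at once.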

Typical structure of an ipynb file:

  1. Imports
  2. Get Data
  3. Transform Data
  4. Modelling
  5. Visualisation
  6. Making sense of the data
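The six steps above can be sketched as a tiny notebook, with each comment standing in for one cell. This is only an illustration under my own assumptions: the inline CSV holds made-up toy values, a group mean plays the role of the "model", and text bars replace a real plot so the sketch needs nothing beyond the standard library.

```python
# 1. Imports
import csv
import io
import statistics

# 2. Get data (inline CSV stands in for a real file or query; values are made up)
RAW = "group,value\na,1.0\na,3.0\nb,2.0\nb,6.0\n"
rows = list(csv.DictReader(io.StringIO(RAW)))

# 3. Transform data: collect the numeric values per group
by_group = {}
for row in rows:
    by_group.setdefault(row["group"], []).append(float(row["value"]))

# 4. Modelling (a group mean is about the simplest possible "model")
means = {group: statistics.mean(values) for group, values in by_group.items()}

# 5. Visualisation (text bars keep the sketch dependency-free)
for group, mean in means.items():
    print(f"{group} {'#' * round(mean)} {mean:.1f}")

# 6. Making sense of the data
best = max(means, key=means.get)
print(f"Group with the highest mean: {best}")
```

At this size the notebook fits on one screen, which is roughly what the 4-10 cells guideline is aiming for.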