18 November 2018

Jupyter notebooks, virtualbox and other friends.


I just completed another module of an online course I am taking. But to get there I spent a couple of days breaking and fixing things in my set of tools.

As a reflection, I'd like to write down some guidelines for myself, to follow until I find better ones.

If you are the admin of the coding machine (Linux), install the Python libraries with Synaptic or apt-get instead of pip, unless pip is the only option.

If you are working with a specific version of Python, install the libraries for that version and forget about the rest. Consequently, if you need pip, use the pip that belongs to the desired Python version (note that pip may point to python2.7 or python3.x, while pip3 is unambiguous).

In my case (-> means "points to"):
pip -> python -> python2.7
pip2 -> python -> python2.7
pip3 -> python3
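To check this mapping on another machine, something like the sketch below could help. It relies on `pip --version` printing the Python it is bound to at the end of its output line; the helper name `pip_targets` is made up for this note.

```shell
#!/bin/sh
# Print, for each pip command, the version line it reports.
# `pip --version` ends with the Python it installs for, e.g. "(python 2.7)".
# pip_targets is a hypothetical helper name, not part of pip.
pip_targets() {
    for cmd in pip pip2 pip3; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s -> %s\n' "$cmd" "$("$cmd" --version 2>/dev/null)"
        else
            printf '%s -> (not installed)\n' "$cmd"
        fi
    done
}

pip_targets
```

Another way to avoid the guessing is to call pip through the interpreter itself, for example `python3 -m pip install --upgrade tornado`, which always installs for that interpreter.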

If you run Jupyter notebooks remotely, start them with nohup, so that if the network fails they stay alive and happy. That's how I was running them, but during some troubleshooting I removed the nohup and then faced the consequences.

Example:
IPLAB="192.168.xx.xx"; ssh USER@"$IPLAB" "nohup jupyter-notebook --no-browser --port=8889"
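A possible variant also sends the server to the background and keeps its output in a log file, so the ssh session can return right away. This is only a sketch: `start_remote_notebook`, the `jupyter.log` file name and the `DRY_RUN` switch are names I made up for the example, and the user and address stay placeholders.

```shell
#!/bin/sh
# Start a notebook server on a remote host so it survives a dropped connection.
# start_remote_notebook, jupyter.log and DRY_RUN are hypothetical names.
start_remote_notebook() {
    target="$1"          # e.g. USER@$IPLAB
    port="${2:-8889}"    # same port as in the example above
    # nohup makes the server ignore the hangup signal; the redirections and
    # the trailing & detach it from the ssh session so ssh can return.
    remote_cmd="nohup jupyter-notebook --no-browser --port=$port > jupyter.log 2>&1 < /dev/null &"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'ssh %s "%s"\n' "$target" "$remote_cmd"
    else
        ssh "$target" "$remote_cmd"
    fi
}

# Preview the command without connecting anywhere.
DRY_RUN=1 start_remote_notebook "USER@192.168.xx.xx" 8889
```

With `DRY_RUN=1` the function only prints the ssh command it would run, which makes it easy to review before using it for real.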

If the virtual machine running the notebooks (I am using VirtualBox) has multiple network interfaces, and you have problems accessing the notebooks after the host resumes from standby, disconnect the non-essential network interfaces before going into standby. That way the VM doesn't get annoyed on resume and extend its discomfort to other areas (the notebooks).

In particular, the VM had these network connections:
bridged to wifi
bridged to ethernet
host-only network (this is the important one)

I was able to run for days without issue in a place where the bridged networks had no connectivity (the host was getting its internet via Bluetooth, so the VM didn't know anything about it). When I moved to a place with wifi or ethernet (not very fast), the VM started giving problems after the host resumed from standby. I could run some commands, especially the ones related to the filesystem, but got no response from others related to the network or RAM. Very weird. In the end I either had to wait a long time for it to 'get better' or do a hard reboot.

In other words, if you have problems after resuming, deactivate (before going into standby) the virtual network interfaces that are alive now but won't be alive after resuming.
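The unplugging can also be done from the host's command line with `VBoxManage controlvm ... setlinkstateN`, which connects or disconnects the virtual cable of adapter N on a running VM. A minimal sketch, assuming a VM called "lab-vm" whose adapters 1 and 2 are the bridged ones and adapter 3 is the host-only one; the VM name, the numbering, `set_vm_links` and `DRY_RUN` are all made up here.

```shell
#!/bin/sh
# Unplug (or replug) the virtual cable of selected VM adapters from the host.
# set_vm_links, the VM name and the adapter numbers are hypothetical;
# check the real numbering with: VBoxManage showvminfo "<vm name>"
set_vm_links() {
    vm="$1"; state="$2"; shift 2     # state is "on" or "off"
    for nic in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "VBoxManage controlvm $vm setlinkstate$nic $state"
        else
            VBoxManage controlvm "$vm" "setlinkstate$nic" "$state"
        fi
    done
}

# Assumed layout: adapters 1 and 2 are bridged, adapter 3 is host-only.
DRY_RUN=1 set_vm_links "lab-vm" off 1 2    # before the host goes into standby
DRY_RUN=1 set_vm_links "lab-vm" on 1 2     # after resume, if they are needed again
```

Dropping `DRY_RUN=1` runs the commands for real; the host-only adapter is left alone because it is the one that carries the notebooks.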



Install proper debugging tools for Jupyter notebooks, such as:

lgpage/nbtutor: Visualize Python code execution (line-by-line) in Jupyter Notebook cells.
https://github.com/lgpage/nbtutor

Variable Inspector — jupyter_contrib_nbextensions 0.5.0 documentation
https://jupyter-contrib-nbextensions.readthedocs.io/en/latest/nbextensions/varInspector/README.html


If you find yourself unable to fix the fiasco of Python libraries installed both system-wide and in user space (blame pip for the latter), then go into your ~/.local/lib and rename it (careful here, only do this if you can afford to) so Python cannot find all your naughty pip installations.
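The rename can be kept reversible by moving the directory to a timestamped name instead of deleting anything. A minimal sketch; `park_user_lib` and the `.disabled-` suffix are my own hypothetical choices.

```shell
#!/bin/sh
# Move the per-user library directory aside so Python stops importing from it.
# park_user_lib and the .disabled-<timestamp> suffix are hypothetical choices.
park_user_lib() {
    lib_dir="$HOME/.local/lib"
    parked="$lib_dir.disabled-$(date +%Y%m%d%H%M%S)"
    if [ -d "$lib_dir" ]; then
        mv "$lib_dir" "$parked" && echo "moved $lib_dir to $parked"
    else
        echo "nothing to do: $lib_dir does not exist"
    fi
}

# To see where an interpreter looks for user-installed packages, run:
#   python3 -m site --user-site
# Call the function once you are sure nothing else depends on the directory:
#   park_user_lib
```

Renaming the parked directory back to ~/.local/lib undoes the change.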

Surprisingly, I ended up fixing an issue by using pip.
I had the latest version of tornado installed with Synaptic, but I was having many issues.
Those issues went away when I ran a pip upgrade of tornado as root. I couldn't believe it when pip reported a tornado version older than the one shown by Synaptic and then went ahead to download and upgrade it. Problem fixed.

In particular, these 3 lines fixed my many problems:

pip install --upgrade  tornado
pip install --upgrade  jupyter_client
pip install --upgrade  ipykernel
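When the package manager and pip disagree about a version, it can help to ask the interpreter which copy it actually imports. The sketch below prints the version and file location of each of the three libraries; `which_copy` is a made-up helper name, and python3 as the default interpreter is an assumption (pass python2.7 as the second argument if that is your target).

```shell
#!/bin/sh
# Report the version and on-disk location of the copy of a module that
# the given interpreter actually imports. which_copy is a hypothetical name.
which_copy() {
    module="$1"
    py="${2:-python3}"   # assumption: python3 unless told otherwise
    "$py" -c '
import importlib, sys
name = sys.argv[1]
try:
    mod = importlib.import_module(name)
except ImportError:
    print(name + ": not importable")
else:
    # tornado exposes "version" rather than "__version__", so try both.
    ver = getattr(mod, "__version__", getattr(mod, "version", "unknown"))
    print("%s %s %s" % (name, ver, getattr(mod, "__file__", "built-in")))
' "$module"
}

for m in tornado jupyter_client ipykernel; do
    which_copy "$m"
done
```

On Debian-like systems, a path under /usr/lib/python3/dist-packages usually means the apt package, /usr/local/lib usually means a root pip install, and ~/.local/lib means a user pip install.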

I know many people say that root should not use pip, but I don't want libraries in my user space.
