Python and Jupyter

This page will address some common pitfalls when working with python and related tools on a shared system like a cluster.

The following topics will be discussed in detail on this page:

Available python versions

All unix systems come with a system wide python installation, however for the cluster it is highly recommended to use one of the anaconda installations provided as a modules.

# reminder
module avail python
module load python/XY

These modules come with a wide range of preinstalled packages.

Installing packages with pip

Pip is a package manager for python it can be used to easily install packages and manage their versions.
By default pip will try to install packages system wide, which will not be possible due to missing permissions.
The behaviour can be changed by adding --user to the call.
pip install --user package-name

By defining the variable PYTHONUSERBASE (best done in your bashrc/bash_profile) we change the installation location from ~/.local to a different path. Doing so will prevent your home folder from cluttering with stuff that does not need a backup and hitting the quota.
export PYTHONUSERBASE=${WORK}/software/privat

Conda environment

Some scientific software comes in form of a Conda environment (e.g. https://docs.gammapy.org/0.17/install/index.html).
By default such an environment will be installed to ~/.conda. However the size can be several GB therefore you should configure Conda to a different path. This will prevent your home folder from hitting the quota. It can be done by following these steps:

conda config # create ~/.condarc
Add the following lines to the file (replace the path if you prefer a different location)

pkgs_dirs:
- ${WORK}/software/privat/conda/pkgs
envs_dirs:
- ${WORK}/software/privat/conda/envs

You can check that this configuration file is properly read by inspecting the output of conda info
For more options see https://conda.io/projects/conda/en/latest/user-guide/configuration/use-condarc.html

Jupyter notebook security

When using Jupyter notebooks with their default configuration, they are protected by a random hashed password, which in some circumstances can cause security issues on a multiuser-system like cshpc or the cluster frontends.
We can change this with a few configuration steps by adding a password protection.

First generate a configuration file by executing
jupyter notebook --generate-config

Open a python terminal and generate a password
from notebook.auth import passwd; passwd()

Add the password hash to your notebook config file

# The string should be of the form type:salt:hashed-password.

c.NotebookApp.password = u''
c.NotebookApp.password_required = True

From now on your notebook will be password protected this comes also with the benefit that you can use bash functions for a more convenient use.

Quick reminder how to use the remote notebook

#start notebook on a frontend (e.g. woody)
jupyter notebook --no-browser --port=XXXX

on your client:
ssh -f user_name@woody.rrze.uni-erlangen.de -L YYYY:localhost:XXXX -N
Open the notebook in your local browser at https://localhost:YYYY
With XXXX and YYYY being 4 digit numbers.
Don’t forget to stop the notebook once you are done. Otherwise you will block resources that could be used by others!

Some useful functions/alias for lazy people 😉

alias remote_notebook_stop='ssh username@remote_server_ip "pkill -u username jupyter"'
Be aware this will kill ALL jupyter processes that you own!

start_jp_woody(){

nohup ssh -J username@cshpc.rrze.uni-erlangen.de -L $1:localhost:$1 username@woody.rrze.uni-erlangen.de " 
. /etc/bash.bashrc.local; module load python/3.7-anaconda ; jupyter notebook --port=$1 --no-browser" ;
echo ""; echo " the notebook can be started in your browser at: https://localhost:$1/ " ; echo ""
}

 

start_jp_emmy(){

nohup ssh -J username@cshpc.rrze.uni-erlangen.de -L $1:localhost:$1 username@emmy.rrze.uni-erlangen.de " 
. /etc/profile; module load python/3.7-anaconda ; jupyter notebook --port=$1 --no-browser" ;
echo ""; echo " the notebook can be started in your browser at: https://localhost:$1/ " ; echo ""
}

 

If you are using a cshell remove . /etc/bash.bashrc.local and . /etc/profile from the functions.