Main Web>Computing>IPython (2024-03-22, WilliamSeligman)

Jupyter notebook server at Nevis

Jupyter (formerly IPython) has become a popular tool for interactive physics analysis. Here are some examples of what you can do with notebooks. There is a dedicated Jupyter server, notebook, available for the users of the Nevis Linux cluster. To use it, visit https://notebook.nevis.columbia.edu and enter your Nevis cluster account name and password.

This page focuses on the capabilities of the Nevis Jupyter notebook server. If you'd like to have your own Jupyter/ROOT installation (e.g., for a laptop), see the Jupyter/ROOT Containers page.

The basics

When you visit notebook for the first time, you'll see your home directory on the left-hand panel. You can perform some elementary file operations from here: right-click on a file's name to see a menu.

You'll see a bunch of kernel icons to the right. To start a notebook, click on one of the icons in the "Notebook" section that corresponds to the language or function you want to use; there's a list of kernels below. In the "Other" category, you'll see a Terminal icon if you'd like to type UNIX commands from within the web browser. The "Text File" icon will open a simple text editor. (The options under "Console" can be ignored for first-time users.)

Why notebooks?

After you've started a notebook, type some elementary commands in one of the notebook cells. For example, if you picked "Python 3":

from ROOT import TH1D, TCanvas
my_canvas = TCanvas("mycanvas","canvas title",800,600)
example = TH1D("example","example histogram",100,-3,3)
example.FillRandom("gaus",10000)
example.Draw("E")
my_canvas.Draw()

Type some commands in the language of the kernel you chose in the first cell (there are code examples below). Hit SHIFT-ENTER to execute the lines in that cell. If there's an error, make an appropriate fix and hit SHIFT-ENTER again.

Continue editing lines in that first cell until you have finished some small task (e.g., creating a histogram). Execute the cell (SHIFT-ENTER) to demonstrate to yourself that you've got it right.

Go to the next cell and continue your work. The variables and functions you defined in the first cell are still available to you. Again, you can iteratively execute and debug that new set of code until it does what you want.

In the File menu, select "Rename Notebook..." (otherwise your notebook will have the name "Untitled"). On the left-hand display you'll see your new notebook in your home directory, with the suffix .ipynb. Click on it to start up the notebook again.

Explore the menus. Note how you can save, rename, checkpoint, switch kernels, execute some or all of the cells.

Click in an empty cell. Go to the pop-up menu near the top of the page that reads "Code". Select "Markdown" from that menu. Now you can type plain text in that cell. You can also include Markdown, LaTeX, and HTML commands to format the text. When you're done, hit SHIFT-ENTER to see the formatted result.

So notebooks:

let you quickly prototype, save, and update code;
let you make plots, then fiddle with the code that created the plots and quickly refresh them;
let you easily document your work, and update the documentation as quickly as you update your code and plots;
let you do all of this from within your web browser.

That's the answer to "why notebooks".

Magic commands

In addition to the kernel languages listed below, in any notebook cell you can type "magic" commands that have effects on your system beyond what the kernel's language normally provides. The magic commands I use most often are:

!ls
%cd <directory>
%cp <file> <new-file-loc>
%lsmagic
%jsroot on

That last magic command, %jsroot on, is only available in ROOT notebooks (the first two kernels listed below). JSROOT adds some interactivity to ROOT plots.
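
For comparison, the file-handling commands in that list have rough equivalents in plain Python. The following is a minimal sketch, not a description of how the magics work internally; the function name demo_file_operations and the file names are made up for illustration, and everything happens in a temporary directory so nothing in your home directory is touched.

```python
import os
import shutil
import tempfile


def demo_file_operations():
    """Mimic !ls, %cd, and %cp using only the standard library."""
    start_dir = os.getcwd()
    with tempfile.TemporaryDirectory() as work_dir:
        try:
            # Rough equivalent of: %cd <directory>
            os.chdir(work_dir)

            # Create a small file to copy (hypothetical name).
            with open("notes.txt", "w") as handle:
                handle.write("example\n")

            # Rough equivalent of: %cp <file> <new-file-loc>
            shutil.copy("notes.txt", "notes_copy.txt")

            # Rough equivalent of: !ls
            return sorted(os.listdir("."))
        finally:
            # Go back before the temporary directory is removed.
            os.chdir(start_dir)


print(demo_file_operations())
```

In day-to-day notebook use the magics are shorter to type; the Python calls are handy when the same steps need to live inside a script.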

Handy Jupyter links

Kernels

These are the "kernels" (active interpreters/compilers) available on the notebook server. The first two listed are the ones most likely to be used; the rest are listed in alphabetical order.

The kernels inherit the environment variables that you set in your shell initialization scripts. This can be convenient, but be sure to read the Limitations section below.

Python 3

Python is an interpreted scripting language. It's becoming more widely used in physics for both scripting and analysis. Here's a Python tutorial. You'll probably also be interested in the commonly-used scientific packages NumPy (which implements arrays), SciPy, and Matplotlib. If there's some standard Python package that's not included on the notebook server, let WilliamSeligman know.

The Python 3 setup on notebook includes PyROOT, a Python-based interface to ROOT. Here's an example of how to use it.

You can copy-n-paste the following example directly from this web page into a Python notebook cell:

import ROOT

# You may want this if you'd like your ROOT plots to be interactive in the notebook.
%jsroot on

# Define a canvas
my_canvas = ROOT.TCanvas("my_canvas","my_canvas",800,600)

hist=ROOT.TH1F("hist","example histogram",100,-3,3)
hist.FillRandom("gaus",100000)
hist.Draw()

# You have to draw the canvas to see it in the web page.
my_canvas.Draw()

ROOT C++

This kernel is the ROOT C++ interpreter, cling. In addition to working with ROOT, it also provides the C++ language within a notebook. Here's an example of using ROOT within a notebook (the C++ example is near the bottom).

A simple test, which you can copy-n-paste directly from this web page into a ROOT C++ notebook cell:

%jsroot on
TCanvas mycanvas("name","title",800,600);
TH1D test("test","example title",200,-3,3);
test.FillRandom("gaus",10000);
test.Draw();
// Unlike interactive ROOT, once you've drawn on a canvas,
// you must draw the canvas explicitly to see it in the notebook. 
mycanvas.Draw();

Bash

Bash (from "Bourne-Again SHell") is a shell language for UNIX systems. There's a good chance it's the shell you use when you log in to the Nevis Linux cluster. With this kernel, you can develop shell scripts.

Fortran

Fortran (from "FORmula TRANslation") is a mathematical computer language. For decades it was the backbone of computer programming in physics, and many say that it's still the most efficient language for implementing mathematical tasks. This kernel provides an interface to the GNU gfortran compiler, which is fully compliant with the Fortran 95 Standard and includes some Fortran 2003 and Fortran 2008 features.

Note:

The Fortran compiler provided within the notebook server does not include CERNLIB.
You can also create Fortran functions that can be called by Python routines using Fortran magic.

Gnuplot

Gnuplot is a graphing utility for visualizing mathematical functions and data interactively. There are Gnuplot cell magics that let you use Gnuplot graphics to create plots within some of the other kernels on this page, such as Julia and Octave.

Haskell

Haskell is a functional programming language (as opposed to a procedural language like Python, or an object-oriented language like C++). It's a language for which lazy evaluation is fundamental. As a consequence, Haskell can potentially handle expressions with an "infinite" number of terms, or data structures that are "infinite" in size.
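
Python is not lazy by default, but its generators give a rough feel for the idea. The sketch below is only an analogy written in Python, not Haskell, and the function names are made up for illustration. It describes the "infinite" sequence of square numbers, yet computes only the terms that are actually requested.

```python
from itertools import count, islice


def squares():
    """Lazily yield 0, 1, 4, 9, ... without ever building the whole list."""
    for n in count():
        yield n * n


def first_squares(how_many):
    """Take only the first few terms of the unbounded sequence."""
    return list(islice(squares(), how_many))


print(first_squares(5))
```

In Haskell this behavior is the default for ordinary expressions and data structures; in Python you have to ask for it explicitly with a generator.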

IForth

Forth is an interpreted, compact, extensible language. It was originally developed for embedded systems. The Forth kernel within Jupyter is not perfect (e.g., its text output includes the text input, in addition to the output of a command), but it can serve as good practice in the language. If you're interested in having gforth installed on your server, let WilliamSeligman know.

Julia

Julia is a high-level, high-performance dynamic programming language developed at MIT for technical computing. It combines the ease-of-use of Python with the speed of Fortran. Here's a Julia tutorial (though the plotting examples may not work in Jupyter unless you use PyPlot); e.g.:

using Pkg
Pkg.add("PyPlot")
# Pkg.add only installs the package; load it so that PyPlot.plot is defined.
import PyPlot
x = range(0; stop=2*pi, length=1000); 
# The "." after the Base function tells Julia it's operating on a Vector
y = sin.(3 * x + 4 * cos.(2 * x));
PyPlot.plot(x, y, color="red", linewidth=2.0, linestyle="--")
PyPlot.title("A sinusoidally modulated sinusoid")

Here's another little example:

# You probably won't need to add these packages, since WilliamSeligman has
# already done so on the notebook server.
using Pkg
Pkg.add("Plots")
Pkg.add("DifferentialEquations")

using DifferentialEquations
# Load Plots as well, so that Plots.plot is defined.
using Plots
f(u,p,t) = 1.01*u
u0 = 1/2
tspan = (0.0,1.0)
prob = ODEProblem(f,u0,tspan)
sol = solve(prob, Tsit5(), reltol=1e-8, abstol=1e-8)

Plots.plot(sol,linewidth=5,title="Solution to the linear ODE with a thick line",
     xaxis="Time (t)",yaxis="u(t) (in μm)",label="My Thick Line!") # legend=false
Plots.plot!(sol.t, t->0.5*exp(1.01t),lw=3,ls=:dash,label="True Solution!")

Octave

Octave is a scientific programming language, with nice features for handling vectors and matrices, and good visualization tools. It's an open-source equivalent of Matlab. Here's an Octave tutorial.

R

R is a language for statistical computing and graphics; it's the open-source version of S+. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques. Here is an R tutorial.

Ruby

Ruby is a dynamic, open source programming language with a focus on simplicity. Many prefer it to Python as a first programming language. The SciRuby packages are part of this installation, so that Ruby can be used for scientific computation. Unfortunately, there is no current working link between ROOT and Ruby (though one used to exist and might exist again someday). Here's a link to tutorials.

Plotting in Ruby notebooks is limited. The GnuplotRB package is available for plots; you'll have to search through the examples for x-y plots and histograms. Use the svg format for your plots; for some reason GnuplotRB won't display in png or jpeg within a Ruby kernel.

No, Ruby on Rails is not installed. We do not want you building web applications on the notebook server.

SageMath

SageMath is an open-source mathematics software system. It's a wrapper around different symbolic math and statistical packages. It's intended as an open-source replacement for Maple, Mathematica, and Matlab. Here's a SageMath tutorial.

If you ask me if we have Mathematica at Nevis, this is where I'll send you.

...and more

You don't necessarily need an explicit kernel to develop scripts. Jupyter has "cell magics" that let you redefine the language being used within a given cell. If you execute

%lsmagic

you'll see a list of available cell magic commands. Among the commands are those that switch between different languages within a single notebook.

Perl

For example, if you want to work on a Perl script, you can put lines like this in a cell in any kernel:

%%perl
# An uninteresting example: display all the regular files in my home directory
use strict;
use warnings;

use Path::Class;

my $dir = dir($ENV{'HOME'});

# Iterate over the content of my home directory
while (my $file = $dir->next) {
    
    # Skip if it is a directory
    next if $file->is_dir();
    
    # Skip if the filename ends with a ~ (emacs work file)
    my $filename = $file->stringify;
    next if $filename =~ /~$/;
   
    # Print out the file name and path
    print $filename . "\n";
}

Cython

Cython is a superset of Python in which the commands are compiled into C. For example:

%load_ext Cython
# Next cell:
%%cython -a
def geo_prog_cython(double alpha, int n):
    cdef double current = 1.0
    cdef double sum = current
    cdef int i
    for i in range(n):
        current = current * alpha
        sum = sum + current
    return sum
# Test in next cell:
geo_prog_cython(4.0,5)
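
As a cross-check on the Cython result, here is the same loop in plain Python, compared against the closed form for a geometric series, (alpha^(n+1) - 1)/(alpha - 1). This is just a sketch for verifying the answer in an ordinary Python cell; the names geo_prog_python and geo_prog_closed_form are not part of the example above.

```python
def geo_prog_python(alpha, n):
    """Sum 1 + alpha + alpha**2 + ... + alpha**n with an explicit loop."""
    current = 1.0
    total = current
    for _ in range(n):
        current = current * alpha
        total = total + current
    return total


def geo_prog_closed_form(alpha, n):
    """Closed form of the same sum; valid only for alpha != 1."""
    return (alpha ** (n + 1) - 1.0) / (alpha - 1.0)


print(geo_prog_python(4.0, 5), geo_prog_closed_form(4.0, 5))
```

Both functions should agree with what geo_prog_cython(4.0,5) returns; the point of the Cython version is speed for large n, not a different answer.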

Want even more?

It is my intention to include every available Jupyter kernel that might have an application in physics, as long as there's a clear installation method for adding it to Jupyter. If you have a request for an additional kernel, or for a library or extension to be added to an existing language, please let WilliamSeligman know.

Before you ask: I've already tried to install Jupyter kernels for Tcl and Perl. Each presented a technical issue, ranging from "simply doesn't work" to "not compatible with being invoked from Jupyter".

Limitations and workarounds

The notebook server is a shared resource for use by anyone in the Nevis particle-physics groups and/or the REU students to do light development tasks. If you need to run long, CPU-intensive, or multi-threaded parallel processes via Jupyter, notebook is not a good choice. You'll potentially interfere with everyone else trying to use it at the same time. For these high-resource tasks, you can run Jupyter on your workgroup's server instead. (You may interfere with everyone else in your workgroup, but that's between you and them, not you and everyone else with a Nevis Linux cluster account.)

The Jupyter notebooks inherit your user environment, that is, the variables that you define in your shell startup scripts. However, if you modify certain variables such as $LD_LIBRARY_PATH or run customization programs in your initialization, it can affect the execution of the notebook server. The typical symptoms are a notebook kernel that refuses to start, or library load errors.
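
If you suspect your shell setup is the culprit, one quick check is to look at what a Python kernel (or a Python session started from the same shell) has actually inherited. This is a minimal sketch; the helper name inherited_settings and the choice of variables to inspect are assumptions, so adjust the list to whatever your startup scripts modify.

```python
import os


def inherited_settings(names=("LD_LIBRARY_PATH", "PYTHONPATH", "PATH")):
    """Return each variable split into its individual path entries."""
    settings = {}
    for name in names:
        value = os.environ.get(name, "")
        # An unset or empty variable becomes an empty list.
        settings[name] = [entry for entry in value.split(os.pathsep) if entry]
    return settings


for name, entries in inherited_settings().items():
    print(name)
    for entry in entries:
        print("   ", entry)
```

Entries that point into your own directories or your workgroup's software area are the first ones to look at when a library fails to load.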

There's no guarantee that software compiled on your workgroup server will run on notebook, especially if it relies on software libraries only found on that server. If you need libraries that were compiled for your workgroup server, you'll probably have to use them on your workgroup server. You'll know if this is the case if you get library errors when trying to use your own compiled libraries via notebook.

The solution to most of these issues is to run Jupyter on your workgroup server, or consider a container-based distribution.

Jupyter on your workgroup server

Jupyter is installed on all the interactive systems on the Linux cluster. However, there are sometimes issues with Jupyter's extensions such as JupyterLab. If you encounter such issues, see the conda page; there's a conda distribution available on the cluster:

conda activate /usr/nevis/conda/root

Once you set up ROOT, in theory you'll be able to run Jupyter:

jupyter notebook

This will start up a web browser on the system on which you execute the command (not your laptop!), with the web page open to localhost:8888. This is not what you'll want to do normally.

Remote access

You probably want to see Jupyter via a web browser on your laptop. To do this, you must port-forward a connection via ssh. The complete instructions are here. What follows is a brief summary.

On the workgroup server, this command will start up Jupyter for you:

jupyter notebook --no-browser --port=XXXX

... where XXXX is an unused port on the server; e.g., 7000. If multiple users want to run Jupyter on your server, you'll have to coordinate with them so that you don't use the same port. You will see a message on your terminal that includes something like this:

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:XXXX/?token=<string-of-hex-digits>

...where XXXX is your argument to the --port option. Copy that entire URL string.
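
To choose the port XXXX in the first place, you can check whether anything is already bound to a candidate port on the server. The following is a rough sketch using Python's standard socket module; the function name port_is_free is made up for this example, and the check is only a snapshot, since another user could still claim the port between the check and the moment you start Jupyter.

```python
import socket


def port_is_free(port, host="localhost"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            # Binding fails with OSError if the port is already in use.
            probe.bind((host, port))
        except OSError:
            return False
    return True


# 7000 is the example port used on this page.
print(port_is_free(7000))
```

Run this on the workgroup server, not on your laptop; the same function can be used on the laptop to pick YYYY.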

On your laptop, forward a local port YYYY to port XXXX on the server:

ssh -N -f -L localhost:YYYY:localhost:XXXX <username@server.nevis.columbia.edu>

...where <username@server.nevis.columbia.edu> is your Nevis account and server on which you ran the jupyter command. YYYY can be any unused port on your laptop; often users pick YYYY=XXXX, but you don't have to.

Then to access the notebook, go to the web browser on your laptop and visit http://localhost:YYYY/?token=<string-of-hex-digits>. Note that this is almost the same as the URL you copied above, except that you'll have to substitute the laptop port YYYY for the XXXX in the message.
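
If you'd rather not edit the URL by hand, the port substitution can be sketched in a few lines of Python. The function name laptop_url, the ports 7000 and 8080, and the token are all hypothetical values chosen for illustration; paste in the URL that Jupyter actually printed for you.

```python
from urllib.parse import urlsplit, urlunsplit


def laptop_url(server_url, laptop_port):
    """Swap the port in the URL printed by Jupyter for the forwarded port."""
    parts = urlsplit(server_url)
    new_netloc = f"{parts.hostname}:{laptop_port}"
    # Keep the scheme, path, and token query string exactly as they were.
    return urlunsplit(parts._replace(netloc=new_netloc))


# Hypothetical values: server port 7000, laptop port 8080, made-up token.
print(laptop_url("http://localhost:7000/?token=0123abcd", 8080))
```

Only the port changes; the token must be copied unmodified or the login will fail.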

There's a potential problem with just using any value for YYYY: Most current web browsers won't let you visit random ports anymore. In Firefox, there is a workaround; in Safari there isn't. The simplest solution is to let YYYY be a generally-recognized port for internet access; e.g., 8080.

After all that...

You'll hopefully see a Jupyter home window similar to the one you see when using the notebook server. The chief difference is that you won't have the full range of exotic kernels available on that server, just Python and ROOT C++. As long as your jupyter command keeps running, you can log in again without the token by making sure the ssh port forwarding is running on your laptop, then visiting http://localhost:YYYY in your browser.

If you want to keep your Jupyter process running even after you've closed the terminal window on your workgroup server, you may want to use the UNIX tmux command. The commands would look something like this:

tmux
conda activate /usr/nevis/conda/root
jupyter notebook --no-browser --port=XXXX
# Copy the URL
# Switch to a different screen to work
<Ctrl-b c>

You can close the terminal window whenever you wish; your processes (including jupyter) will continue to run. When you log in to your workgroup server again, the command

tmux attach

will reconnect you with the screen(s) you created before, including the jupyter screen.

Jupyter on your laptop

See the Jupyter/ROOT containers page.

Topic revision: r56 - 2024-03-22 - WilliamSeligman
