Jupyter notebook server at Nevis
Jupyter
(formerly IPython) has become a popular tool for interactive physics analysis. Here are some
examples
of what you can do with notebooks. There is a dedicated Jupyter server,
notebook
, available for the users of the Nevis
Linux cluster. To use it, visit
https://notebook.nevis.columbia.edu
and enter your Nevis cluster account name and password.
This page focuses on the capabilities of the Nevis Jupyter notebook server. If you'd like to have your own Jupyter/ROOT installation (e.g., for a laptop), see the
Jupyter/ROOT Containers page.
The basics
When you visit
notebook
for the first time, you'll see your home directory on the left-hand panel. You can perform some elementary file operations from here: right-click on a file's name to see a menu.
You'll see a bunch of kernel icons to the right. To start a notebook, click on one of the icons in the "Notebook" section that corresponds to the language or function you want to use; there's a list of kernels below. In the "Other" category, you'll see a Terminal icon if you'd like to type UNIX commands from within the web browser. The "Text File" icon will open a simple text editor. (The options under "Console" can be ignored for first-time users.)
Why notebooks?
After you've started a notebook, type some elementary commands in one of the notebook cells. For example, if you picked "Python 3":
from ROOT import TH1D, TCanvas
my_canvas = TCanvas("mycanvas","canvas title",800,600)
example = TH1D("example","example histogram",100,-3,3)
example.FillRandom("gaus",10000)
example.Draw("E")
my_canvas.Draw()
Type some commands in the language of the kernel you chose in the first cell (there are code examples below). Hit SHIFT-ENTER to execute the lines in that cell. If there's an error, make an appropriate fix and hit SHIFT-ENTER again.
Continue editing lines in that first cell until you have finished some small task (e.g., creating a histogram). Execute the cell (SHIFT-ENTER) to demonstrate to yourself that you've got it right.
Go to the next cell and continue your work. The variables and functions you defined in the first cell are still available to you. Again, you can iteratively execute and debug that new set of code until it does what you want.
In the File menu, select "Rename Notebook..." (otherwise your notebook will have the name "Untitled"). On the left-hand display you'll see your new notebook in your home directory, with the suffix
.ipynb
. Click on it to start up the notebook again.
Explore the menus. Note how you can save, rename, checkpoint, switch kernels, and execute some or all of the cells.
Click in an empty cell. Go to the pop-up menu near the top of the page that reads "Code". Select "Markdown" from that menu. Now you can type plain text in that cell. You can also include
Markdown
,
LaTeX
, and
HTML
commands to format the text. When you're done, hit SHIFT-ENTER to see the formatted result.
So notebooks let you:
- quickly prototype, save, and update code;
- make plots, then fiddle with the code that created the plots and quickly refresh them;
- easily document your work, and update the documentation as quickly as you update your code and plots;
- do all of this from within your web browser.
That's the answer to "why notebooks".
Magic commands
In addition to the kernel languages listed below, in any notebook cell you can type
"magic"
commands that have effects on your system beyond what the kernel's language normally provides. The magic commands I use most often are:
!ls
%cd <directory>
%cp <file> <new-file-loc>
%lsmagic
%jsroot on
That last magic command,
%jsroot on
, is only available in
ROOT
notebooks (the first two kernels listed below).
JSROOT
adds some interactivity to ROOT plots.
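Magic commands only work inside a notebook (or IPython); they are not part of the Python language itself. If you later move the contents of a cell into a stand-alone Python script, the shell-style magics have to be replaced with ordinary library calls. Here is a minimal sketch of rough standard-library equivalents for the first three commands above. The file names notes.txt and notes_copy.txt are made up for illustration, and the example works in a temporary scratch directory so that it doesn't touch your own files.

```python
import os
import shutil
import tempfile

# Work in a throwaway directory (an assumption made for this sketch).
scratch = tempfile.mkdtemp()
original_dir = os.getcwd()

os.chdir(scratch)                            # roughly: %cd <directory>

with open("notes.txt", "w") as f:            # create a small file to copy
    f.write("hello\n")

shutil.copy("notes.txt", "notes_copy.txt")   # roughly: %cp <file> <new-file-loc>

listing = sorted(os.listdir("."))            # roughly: !ls
print(listing)

# Clean up: go back to where we started and remove the scratch directory.
os.chdir(original_dir)
shutil.rmtree(scratch)
```

There is no library equivalent for %lsmagic or %jsroot on; those only make sense inside a notebook.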
Handy Jupyter links
Kernels
These are the "kernels" (active interpreters/compilers) available on the notebook server. The first two listed are the ones most likely to be used; the rest are listed in alphabetical order.
The kernels inherit the environment variables that you set in your
shell initialization scripts. This can be convenient, but be sure to read the
Limitations
section below.
Python 3
Python
is an interpreted scripting language. It's becoming more widely used in physics for both scripting and analysis. Here's a Python
tutorial
. You'll probably also be interested in the commonly-used scientific packages
NumPy
(which implements arrays),
SciPy
, and
matplotlib
. If there's some standard Python package that's not included on the
notebook
server, let
WilliamSeligman know.
The Python 3 setup on
notebook
includes
PyROOT
, a Python-based interface to
ROOT
. Here's an
example
of how to use it.
You can copy-n-paste the following example directly from this web page into a Python notebook cell:
import ROOT
# You may want this if you'd like your ROOT plots to be interactive in the notebook.
%jsroot on
# Define a canvas
my_canvas = ROOT.TCanvas("my_canvas","my_canvas",800,600)
hist=ROOT.TH1F("hist","example histogram",100,-3,3)
hist.FillRandom("gaus",100000)
hist.Draw()
# You have to draw the canvas to see it in the web page.
my_canvas.Draw()
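If you want to see what "fill a histogram with Gaussian random numbers" amounts to without ROOT (for example, when testing on a machine where ROOT isn't installed), here is a plain-Python sketch using only the standard library. This is not ROOT's algorithm: it draws from a unit Gaussian and simply drops values that fall outside the histogram range, and the function name fill_gaussian_histogram and the seed value are assumptions made for this illustration.

```python
import random

def fill_gaussian_histogram(n_entries, n_bins=100, low=-3.0, high=3.0, seed=12345):
    """Return a list of bin counts for n_entries draws from a unit Gaussian.

    Values outside [low, high) are discarded, so the counts may add up
    to slightly less than n_entries.
    """
    rng = random.Random(seed)          # fixed seed so the result is reproducible
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for _ in range(n_entries):
        x = rng.gauss(0.0, 1.0)
        if low <= x < high:
            # min() guards against floating-point rounding at the upper edge.
            index = min(int((x - low) / width), n_bins - 1)
            counts[index] += 1
    return counts

counts = fill_gaussian_histogram(100000)
print("entries inside the range:", sum(counts))
print("count in a central bin:  ", counts[50])
```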
ROOT C++
This kernel is the
ROOT
C++ interpreter,
cling
. In addition to working with ROOT, it also provides the
C++
language within a notebook. Here's an
example
of using ROOT within a notebook (the C++ example is near the bottom).
A simple test, which you can copy-n-paste directly from this web page into a ROOT C++ notebook cell:
%jsroot on
TCanvas mycanvas("name","title",800,600);
TH1D test("test","example title",200,-3,3);
test.FillRandom("gaus",10000);
test.Draw();
// Unlike interactive ROOT, once you've drawn on a canvas,
// you must draw the canvas explicitly to see it in the notebook.
mycanvas.Draw();
Bash
Bash
(from "Bourne-Again SHell") is a shell language for UNIX systems. There's a good chance it's the
shell you use when you log in to the Nevis
Linux cluster. With this kernel, you can develop shell scripts.
Fortran
Fortran
(from "FORmula TRANslation") is a mathematical computer language. For decades it was the backbone of computer programming in physics, and many say that it's still the most efficient language for implementing mathematical tasks. This kernel provides an interface to the
GNU gfortran
compiler, which is fully compliant with the Fortran 95 Standard and includes some Fortran 2003 and Fortran 2008 features.
Note:
- The Fortran compiler provided within the
notebook
server does not include CERNLIB.
- You can also create Fortran functions that can be called by Python routines using Fortran magic
.
Gnuplot
Gnuplot
is a graphing utility for visualizing mathematical functions and data interactively. There are Gnuplot
cell magics
that let you use Gnuplot graphics to create plots within some of the other kernels on this page, such as Julia and Octave.
Julia
Julia
is a high-level, high-performance dynamic programming language developed at MIT for technical computing. It combines the ease-of-use of Python with the speed of Fortran. Here's a Julia
tutorial
, though the plotting examples may not work in Jupyter unless you use
PyPlot
; e.g.:
using Pkg
Pkg.add("PyPlot")
# Load the package so that the PyPlot.* calls below are defined
using PyPlot
x = range(0; stop=2*pi, length=1000);
# The "." after a function name broadcasts it: the function is applied to each element
y = sin.(3 * x + 4 * cos.(2 * x));
PyPlot.plot(x, y, color="red", linewidth=2.0, linestyle="--")
PyPlot.title("A sinusoidally modulated sinusoid")
Here's another little example:
# You probably won't need to add these packages, since WilliamSeligman has
# already done so on the notebook server.
using Pkg
Pkg.add("Plots")
Pkg.add("DifferentialEquations")
using DifferentialEquations
# Load Plots so that the Plots.plot calls below are defined
using Plots
f(u,p,t) = 1.01*u
u0 = 1/2
tspan = (0.0,1.0)
prob = ODEProblem(f,u0,tspan)
sol = solve(prob, Tsit5(), reltol=1e-8, abstol=1e-8)
Plots.plot(sol,linewidth=5,title="Solution to the linear ODE with a thick line",
xaxis="Time (t)",yaxis="u(t) (in μm)",label="My Thick Line!") # legend=false
Plots.plot!(sol.t, t->0.5*exp(1.01t),lw=3,ls=:dash,label="True Solution!")
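If you'd like to cross-check the same linear ODE from the Python 3 kernel, here is a minimal sketch using a hand-written classical fourth-order Runge-Kutta integrator. Note that this is a fixed-step RK4, not the adaptive Tsit5 method used in the Julia example above; the function names rk4_solve and growth, and the choice of 100 steps, are assumptions made for this illustration.

```python
import math

def rk4_solve(f, u0, t0, t1, n_steps):
    """Integrate du/dt = f(u, t) from t0 to t1 with fixed-step RK4; return u(t1)."""
    h = (t1 - t0) / n_steps
    t, u = t0, u0
    for _ in range(n_steps):
        k1 = f(u, t)
        k2 = f(u + 0.5 * h * k1, t + 0.5 * h)
        k3 = f(u + 0.5 * h * k2, t + 0.5 * h)
        k4 = f(u + h * k3, t + h)
        u += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += h
    return u

def growth(u, t):
    """Right-hand side of the ODE from the Julia example: du/dt = 1.01*u."""
    return 1.01 * u

numeric = rk4_solve(growth, 0.5, 0.0, 1.0, 100)
exact = 0.5 * math.exp(1.01 * 1.0)   # the "True Solution" evaluated at t = 1
print("RK4:  ", numeric)
print("exact:", exact)
print("difference:", abs(numeric - exact))
```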
Octave
Octave
is a scientific programming language, with nice features for handling vectors and matrices, and good visualization tools. It's an open-source equivalent of
Matlab
. Here's an Octave
tutorial
.
Python 2
Python 2 is also available as a kernel. Python 3 is the
future
of the Python language, and Python 2 is no longer officially supported. I haven't deleted the Python 2 kernel yet, but I no longer keep it up-to-date or compatible with ROOT. I strongly advise you to move away from it and into Python 3.
R
R
is a language for statistical computing and graphics; it's the open-source version of
S+
. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques. Here is an R
tutorial
.
Ruby
Ruby
is a dynamic, open source programming language with a focus on simplicity. Many prefer it to Python as a first programming language. The
SciRuby
packages are part of this installation, so that Ruby can be used for scientific computation. Unfortunately, there is no current working link between ROOT and Ruby (though one
used to exist
and might exist again
someday
). Here's a
link to tutorials
.
Plotting in Ruby notebooks is limited. The
GnuplotRB
package is available for plots; you'll have to search through the examples for x-y plots and histograms. Use the
svg
format for your plots; for some reason GnuplotRB won't display in
png
or
jpeg
within a Ruby kernel.
No,
Ruby on Rails
is not installed. We do
not want you building web applications on the
notebook
server.
SageMath
SageMath
is an open-source mathematics software system. It's a wrapper around different symbolic math and statistical packages. It's intended as an open-source replacement for Maple, Mathematica, and Matlab. Here's a SageMath
tutorial
.
If you ask me if we have
Mathematica
at Nevis, this is where I'll send you.
Tcl
Tcl
is another scripting language, frequently used with the cross-platform graphical user interface package
Tk
. The latter is included with Python, but it will not function properly in the web-browser environment of Jupyter (there's no X-windows environment inside a web browser). Here's a Tcl
tutorial
.
...and more
You don't necessarily need an explicit kernel to develop scripts. Jupyter has "cell magics" that let you redefine the language being used within a given cell. If you execute
%lsmagic
you'll see a list of available cell magic commands. Among the commands are those that switch between different languages within a single notebook.
Perl
For example, if you want to work on a
Perl
script, you can put lines like this in a cell in any kernel:
%%perl
# An uninteresting example: display all the regular files in my home directory
use strict;
use warnings;
use Path::Class;
my $dir = dir($ENV{'HOME'});
# Iterate over the content of my home directory
while (my $file = $dir->next) {
    # Skip if it is a directory
    next if $file->is_dir();
    # Skip if the filename ends with a ~ (emacs work file)
    my $filename = $file->stringify;
    next if $filename =~ /~$/;
    # Print out the file name and path
    print $filename . "\n";
}
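For comparison, here is a sketch of the same task written for the Python 3 kernel, using pathlib from the standard library. The function name regular_files is invented for this illustration. It takes the directory as an argument so that you can try it on any directory, not just your home directory.

```python
from pathlib import Path

def regular_files(directory):
    """Return the sorted paths of entries in `directory` that are not
    directories and whose names do not end in '~' (emacs work files)."""
    return sorted(
        str(entry)
        for entry in Path(directory).iterdir()
        if not entry.is_dir() and not entry.name.endswith("~")
    )
```

In a notebook cell, calling regular_files(Path.home()) returns the list for your home directory; loop over the result and print each item to get the same output as the Perl example.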
Cython
Cython
is a superset of Python in which the commands are compiled into C. For example:
%load_ext Cython
# Next cell:
%%cython -a
def geo_prog_cython(double alpha, int n):
    cdef double current = 1.0
    cdef double sum = current
    cdef int i
    for i in range(n):
        current = current * alpha
        sum = sum + current
    return sum
# Test in next cell:
geo_prog_cython(4.0,5)
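To check that the compiled function gives the right answer, you can compare it against a plain-Python version of the same loop and against the closed-form sum of a geometric progression, (alpha^(n+1) - 1)/(alpha - 1). This is a minimal sketch; the names geo_prog and geo_closed_form are invented for this illustration, and the closed form assumes alpha is not equal to 1.

```python
def geo_prog(alpha, n):
    """Plain-Python version of the Cython loop: 1 + alpha + ... + alpha**n."""
    current = 1.0
    total = current
    for _ in range(n):
        current = current * alpha
        total = total + current
    return total

def geo_closed_form(alpha, n):
    """Closed-form sum of the same series; only valid for alpha != 1."""
    return (alpha ** (n + 1) - 1.0) / (alpha - 1.0)

# Both calls should agree with geo_prog_cython(4.0, 5).
print(geo_prog(4.0, 5), geo_closed_form(4.0, 5))
```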
Want even more?
It is my intention to include every available Jupyter kernel that might have an application in physics, as long as there's a clear installation method for adding it to Jupyter. If you have a request for an additional kernel, or for a library or extension to be added to an existing language, please let
WilliamSeligman know.
Before you ask: I've already tried to install Jupyter kernels for Forth, Haskell, and Perl. Each presented a technical issue, ranging from "simply doesn't work" to "not compatible with being invoked from Jupyter".
Limitations and workarounds
- The
notebook
server is a shared resource for use by anyone in the Nevis particle-physics groups and/or the REU students to do light development tasks. If you need to run long, CPU-intensive, or multi-threaded parallel processes via Jupyter, notebook
is not a good choice. You'll potentially interfere with everyone else trying to use it at the same time. For these high-resource tasks, you can run Jupyter on your workgroup's server instead. (You may interfere with everyone else in your workgroup, but that's between you and them, not you and everyone else with a Nevis Linux cluster account.)
- The Jupyter notebooks inherit your user environment, that is, the variables that you define in your shell startup scripts. However, if you modify certain variables such as $LD_LIBRARY_PATH or run customization programs (such as
module load root
) in your initialization, it can affect the execution of the notebook server. The typical symptoms are a notebook kernel that refuses to start, or library load errors.
- There's no guarantee that software compiled on your workgroup server will run on
notebook
, especially if it relies on software libraries only found on that server. If you need libraries that were compiled for your workgroup server, you'll probably have to use them on your workgroup server. You'll know this is the case if you get library errors when trying to use your own compiled libraries via notebook
.
The solution to most of these issues is to run Jupyter on your workgroup server, or consider a
container-based distribution.
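If you suspect that your shell startup scripts are behind a library load error, it can help to look at what the kernel actually inherited. The following is a minimal diagnostic sketch for a Python 3 notebook cell: it splits $LD_LIBRARY_PATH into its entries and reports whether each directory exists on the machine running the kernel. The function name library_path_entries is invented for this illustration, and a missing directory is only a hint, not proof, of what went wrong.

```python
import os

def library_path_entries(environ=None):
    """Return (directory, exists) pairs for each entry in LD_LIBRARY_PATH.

    `environ` defaults to the kernel's own environment; passing a dict
    makes the function easy to try out with made-up values.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("LD_LIBRARY_PATH", "")
    entries = [entry for entry in raw.split(os.pathsep) if entry]
    return [(entry, os.path.isdir(entry)) for entry in entries]

# Report on the environment the kernel inherited.
for entry, exists in library_path_entries():
    print(("ok       " if exists else "missing  ") + entry)
```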
Jupyter on your workgroup server
Jupyter has been made part of the
Python 3.6
distribution at Nevis, which is automatically set up when you type the
environment modules command at the terminal:
module load root
This will load ROOT 06.24 or later. See the
environment modules page for more information, including how to look up available ROOT versions.
Once you set up ROOT, in theory you'll be able to run Jupyter:
jupyter notebook
This will start up a web browser on the system on which you execute the command (not your laptop!), with the web page open to
localhost:8888. This is
not what you'll want to do normally.
Remote access
You probably want to see Jupyter via a web browser on your laptop. To do this, you
must port-forward a connection via ssh. The complete instructions are
here
. What follows is a brief summary.
On the workgroup server, this command will start up Jupyter for you:
jupyter notebook --no-browser --port=XXXX
... where XXXX is an unused port on the server; e.g., 7000. If multiple users want to run Jupyter on your server, you'll have to coordinate with them so that you don't use the same port. You will see a message on your terminal that includes something like this:
Copy/paste this URL into your browser when you connect for the first time,
to login with a token:
http://localhost:XXXX/?token=<string-of-hex-digits>
...where XXXX is your argument to the
--port
option. Copy that entire URL string.
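If you're not sure whether a port is already taken, one way to find out before starting Jupyter is to try to bind to it. Here is a minimal sketch using Python's standard socket module; the function name port_is_free is invented for this illustration, and 7000 is just the example port mentioned above. The check is only a snapshot: another user could still grab the port between the check and the moment you start Jupyter.

```python
import socket

def port_is_free(port, host="127.0.0.1"):
    """Return True if we can bind to (host, port) right now.

    Returns False if the port is in use, and also if binding is not
    permitted (e.g., ports below 1024 for ordinary users).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True

print("port 7000 free?", port_is_free(7000))
```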
On your laptop, forward port XXXX on the server to a port YYYY on your laptop:
ssh -N -f -L localhost:YYYY:localhost:XXXX <username@server.nevis.columbia.edu>
...where
<username@server.nevis.columbia.edu>
is your Nevis account name and the server on which you ran the
jupyter
command. YYYY can be any unused port on your laptop; often users pick YYYY=XXXX, but you don't have to.
Then to access the notebook, go to the web browser on your laptop and visit
http://localhost:YYYY/?token=<string-of-hex-digits>
. Note that this is almost the same as the URL you copied above, except that you'll have to substitute the laptop port YYYY for the XXXX in the message.
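If you'd rather not edit the URL by hand, the port substitution can be sketched in a few lines of Python with the standard urllib.parse module. The function name swap_port and the token value in the example are made up for illustration; use the URL that Jupyter actually printed for you.

```python
from urllib.parse import urlsplit, urlunsplit

def swap_port(url, new_port):
    """Return `url` with its port replaced by `new_port`; the host, path,
    and query string (including the token) are left unchanged."""
    parts = urlsplit(url)
    netloc = f"{parts.hostname}:{new_port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

# Hypothetical example: server port 7000, laptop port 8080, made-up token.
print(swap_port("http://localhost:7000/?token=0123abcd", 8080))
```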
There's a potential problem with just using any value for YYYY: Most current web browsers won't let you visit random ports anymore. In
Firefox
, there is a
workaround
; in
Safari
there isn't. The simplest solution is to let YYYY be a generally-recognized port for internet access; e.g., 8080.
After all that...
You'll hopefully see a Jupyter home window similar to the one you see when using the
notebook
server. The chief difference is that you won't have the full range of exotic kernels available on that server, just Python and ROOT C++. As long as your jupyter command keeps running, you can log in again without the token by making sure the ssh port forwarding is running on your laptop, then visiting
http://localhost:YYYY
in your browser.
If you want to keep your Jupyter process running even after you've closed the terminal window on your workgroup server, you may want to use the UNIX
tmux
command. The commands would look something like this:
tmux
module load root
jupyter notebook --no-browser --port=XXXX
# Copy the URL
# Switch to a different screen to work
<Ctrl-b c>
You can close the terminal window whenever you wish; your processes (including jupyter) will continue to run. When you log in to your workgroup server again, the command
tmux attach
will reconnect you with the screen(s) you created before, including the jupyter screen.
Jupyter on your laptop
See the
Jupyter/ROOT containers page.