AlmaLinux 9 on lee.nevis.columbia.edu
The machine lee.nevis.columbia.edu (after T.D. Lee, a Nobel-Prize-winning Columbia professor) was formerly used for the Neutrino group's initial exploration of deep-learning techniques. It's been abandoned for a few years, so I've repurposed the system to act as a testbed for the Neutrino group as we contemplate a transition to AlmaLinux 9 (a free variant of Red Hat Enterprise Linux 9).
Here's what will help you explore the system:
- Like hopper, the current principal machine-learning machine for the Neutrino group, lee is behind the Nevis firewall. To access lee, you'll either have to "double-hop" by ssh-ing to houston and then to lee, or use the Nevis VPN.
- The disk partitions lee:/share and lee:/stage0 are unchanged from what they were when lee was used for the initial deep-learning studies from 2016-2019. However, those files no longer have any relevance to any active papers or research. If you need the disk space for anything else, let me know and I'll clear out the old files.
- Since lee is part of the Nevis Linux cluster, your home directory on lee is the same as it is on any other machine; it's probably on houston. That means that if you use pip3 or conda or other package-management tools, the related files will be stored in your home directory, not on lee.
- hopper, which is still running CentOS 7 as of May-2023, is set up with the philosophy of "if it works, don't change it." lee is set up as a "latest and greatest" system. The purpose of lee is to help you understand what you might have to revise in your work when we upgrade the rest of the Neutrino Linux cluster. This means:
- As noted, the operating system is AlmaLinux 9. There are some substantial differences between AlmaLinux 9 and CentOS 7.
- The environment modules commands (like module load root) won't work on lee, because as of May-2023 I haven't re-compiled any packages for AlmaLinux 9 yet.
- However, AlmaLinux 9 is not the old-and-creaky operating system that CentOS 7 was. Its native packages may be good enough for your work:
- Linux kernel 5.14.0
- gcc 11.3
- python 3.9
- root 6.28/02
- I've also installed the following packages as part of the Python libraries on lee (note the inclusion of some Python DL libraries): jupyter jupyterlab iminuit numpy scipy matplotlib pandas sympy terminado urllib3 tables rootpy rootkernel uproot scikit-learn tensorflow keras torch torchvision scikit-hep h5py astropy gammapy fitsio healpy astropy-healpix cython numba numba_stats.
- This means that instead of typing module load root to run ROOT, try just running it.
- In keeping with the "latest-and-greatest" approach of lee, I've installed:
- CUDA 12.1.1
- cuDNN 8.9.1.23
- On hopper, the kernel version is fixed to make sure NVIDIA drivers will work unchanged. On lee, kernel updates are permitted, and the NVIDIA drivers will be updated as part of the process. It's possible that this will require lee to be rebooted to use the new drivers after an update.
- Bear in mind that, just considering the hardware involved, hopper is a superior machine to lee. This makes sense: lee is a custom-built desktop machine, constructed using the same techniques that a hobbyist might use to create a video-game computer; hopper is a dedicated GPU server designed from the ground up for machine-learning applications. Don't expect lee to be faster!
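To see which of the Python libraries listed above are actually visible to your own python3 on lee, something like the following may help. It is only a sketch: the function name installed_versions and the PACKAGES subset are illustrative, and it assumes the names in the list are pip distribution names, which may not hold for every entry (rootkernel, for example, might be provided some other way).

```python
from importlib import metadata

# A subset of the names from the list above; extend or trim as needed.
# Assumption: these are the pip distribution names.
PACKAGES = [
    "numpy", "scipy", "matplotlib", "pandas", "uproot", "scikit-learn",
    "tensorflow", "keras", "torch", "torchvision", "h5py", "numba",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not visible on this interpreter's search path.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name:15s} {version or 'not found'}")
```

Because your home directory is shared across the cluster, a package you previously installed there with pip3 may be picked up as well, so a reported version is not necessarily the one installed on lee itself.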
Feature | lee | hopper |
Processor | Intel Core i7-6850K | Intel Xeon CPU E5-2680 |
Number of processors | 1 | 2 |
Processor speed | 4 GHz | 1.2 GHz |
Processor cache | 15 MB | 35 MB |
CPU queues | 12 | 56 |
RAM | 64 GB | 128 GB |
Disk | 4.5 TB | 8 TB |
GPU cards | NVIDIA TITAN X | GeForce GTX 1080 |
Number of GPU cards | 2 | 4 |
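Since kernel and NVIDIA driver updates on lee may leave the GPUs unusable until a reboot, it can be worth checking what PyTorch sees before starting a long job. The following is a minimal sketch, not an official procedure; gpu_summary is a hypothetical helper name, and it assumes you are using the torch package mentioned above. With the hardware in the table, one would hope to see two devices listed on lee.

```python
def gpu_summary():
    """Report what PyTorch can see; safe to call when torch or a GPU is missing."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "cuda_available": False,
                "cuda_build": None, "devices": []}

    available = torch.cuda.is_available()
    count = torch.cuda.device_count() if available else 0
    return {
        "torch_installed": True,
        "cuda_available": available,
        # CUDA version torch was built against (None for CPU-only builds).
        "cuda_build": torch.version.cuda,
        "devices": [torch.cuda.get_device_name(i) for i in range(count)],
    }


if __name__ == "__main__":
    for key, value in gpu_summary().items():
        print(f"{key}: {value}")
```

If cuda_available comes back False right after a system update, a pending reboot is one possible explanation.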
If you have any questions or suggestions about the new OS or this setup, please let me know.
-- William Seligman - 2023-05-08