AlmaLinux 9 on lee.nevis.columbia.edu

The machine lee.nevis.columbia.edu (named after T.D. Lee, a Nobel-Prize-winning Columbia professor) was formerly used for the Neutrino group's initial exploration of deep-learning techniques. It's been abandoned for a few years, so I've repurposed the system to act as a testbed for the Neutrino group as we contemplate a transition to AlmaLinux 9 (a free variant of Red Hat Enterprise Linux 9).

Here's what will help you explore the system:

  • Like hopper, the current principal machine-learning machine for the Neutrino group, lee is behind the Nevis firewall. To access lee, you'll either have to "double-hop" by ssh-ing to houston and then to lee, or use the Nevis VPN.

  • The disk partitions lee:/share and lee:/stage0 are unchanged from what they were when lee was used for the initial deep-learning studies from 2016-2019. However, those files no longer have any relevance to any active papers or research. If you need the disk space for anything else, let me know and I'll clear out the old files.

  • Since lee is part of the Nevis Linux cluster, your home directory on lee is the same as it is on any other machine. Probably it's on houston. That means that if you use pip3 or conda or other package-management tools, the related files will be stored in your home directory, not on lee.

  • hopper, which is still running CentOS 7 as of May-2023, is set up with the philosophy of "if it works, don't change it." lee is set up as a "latest and greatest" system. The purpose of lee is for you to understand what you might have to revise for your work when we upgrade the rest of the Neutrino Linux cluster. This means:

  • As noted, the operating system is AlmaLinux 9. There are some substantial differences between AlmaLinux 9 and CentOS 7.

  • The environment modules commands (like module load root) won't work on lee, because as of May-2023 I haven't re-compiled any packages for AlmaLinux 9 yet.

  • However, AlmaLinux 9 is not the old-and-creaky operating system that CentOS 7 was. Its native packages may be good enough for your work:
    • Linux kernel 5.14.0
    • gcc 11.3
    • python 3.9
    • root 6.28/02

  • I've also installed the following packages as part of the Python libraries on lee (note the inclusion of some Python DL libraries): jupyter jupyterlab iminuit numpy scipy matplotlib pandas sympy terminado urllib3 tables rootpy rootkernel uproot scikit-learn tensorflow keras torch torchvision scikit-hep h5py astropy gammapy fitsio healpy astropy-healpix cython numba numba_stats.

  • This means that instead of typing module load root to run ROOT, try just running it.

  • In keeping with the "latest-and-greatest" approach of lee, I've installed:
    • CUDA 12.1.1
    • cuDNN 8.9.1.23

  • On hopper, the kernel version is fixed to make sure NVIDIA drivers will work unchanged. On lee, kernel updates are permitted, and the NVIDIA drivers will be updated as part of the process. It's possible that this will require lee to be rebooted to use the new drivers after an update.

  • Bear in mind that, just considering the hardware involved, hopper is a superior machine to lee. This makes sense: lee is a custom-built desktop machine, constructed using the same techniques that a hobbyist might use to create a video-game computer; hopper is a dedicated GPU server designed from the ground up for machine-learning applications. Don't expect lee to be faster!

| Feature | lee | hopper |
| Processor | Intel Core i7-6850K | Intel Xeon CPU E5-2680 |
| Number of processors | 1 | 2 |
| Processor speed | 4 GHz | 1.2 GHz |
| Processor cache | 15 MB | 35 MB |
| CPU queues | 12 | 56 |
| RAM | 64 GB | 128 GB |
| Disk | 4.5 TB | 8 TB |
| GPU cards | NVIDIA TITAN X | GeForce GTX 1080 |
| Number of GPU cards | 2 | 4 |
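
One quick way to see what lee offers is to ask Python itself. What follows is a minimal sketch, not an official Nevis tool: the PACKAGES list is only an illustrative subset of the libraries named above (given as import names, so scikit-learn appears as sklearn), and the function names are hypothetical. It reports the kernel and Python versions, the per-user directory where pip3 install --user places files (which is in your home directory, not on lee), which packages Python can locate, and how many GPUs PyTorch can see.

```python
import importlib.util
import platform
import site

# Illustrative subset of the libraries mentioned on this page (an assumption;
# edit to taste). These are import names, which can differ from pip names.
PACKAGES = ["numpy", "scipy", "matplotlib", "pandas", "uproot",
            "sklearn", "h5py", "tensorflow", "torch", "ROOT"]


def is_importable(name):
    """Return True if Python can locate the module, without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def environment_report(packages=PACKAGES):
    """Collect basic facts about the machine and its Python setup."""
    return {
        "host": platform.node(),
        "kernel": platform.release(),
        "python": platform.python_version(),
        # Where "pip3 install --user" puts packages: under your home directory.
        "user_site": site.getusersitepackages(),
        "packages": {name: is_importable(name) for name in packages},
    }


def visible_gpus():
    """Return how many CUDA devices PyTorch sees, or None if torch is unusable."""
    try:
        import torch
    except (ImportError, OSError):  # missing or broken installation
        return None
    return torch.cuda.device_count()


if __name__ == "__main__":
    report = environment_report()
    for key in ("host", "kernel", "python", "user_site"):
        print(f"{key:10s} {report[key]}")
    for name, found in report["packages"].items():
        print(f"{name:10s} {'found' if found else 'MISSING'}")
    print(f"{'gpus':10s} {visible_gpus()}")
```

Running the same script on both hopper and lee gives a side-by-side view of what differs between the two systems.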

If you have any questions or suggestions about the new OS or this setup, please let me know.

-- William Seligman - 2023-05-08

Topic revision: r2 - 2023-05-09 - WilliamSeligman
 