Linux Cluster Disk Guide
The most important things to learn from this page:
- The /home and /share partitions are backed up.
- The /data and /scratch partitions are not.
- The disk quota for temporary and guest accounts is 10GB.
This is a guide to disk storage issues on the Linux cluster.
- To find out how disks on one system can be accessed from another, see the automount page.
- To understand the different partition names (e.g., why there are both /share and /scratch directories), see the disk sharing page.
- To learn about the server for temporary and guest accounts, see the student file server page.
How much disk space do I have?
To find out how much disk space you have available, use the
df
command. You'll probably always want to use the
-h
option, so the sizes appear in human-readable form:
df -h
You'll almost certainly see disks in the list that are mounted via
automount. If you find the automounted disks to be distracting, add
-l
to the command:
df -hl
Bear in mind that -l limits the list to local filesystems, so you don't want to use it if your home directory is not on the machine to which you've logged in; your home partition would not appear in the output. (As of Jan-2017, this mainly applies to ATLAS users logged onto xenia.)
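If your home directory lives on another machine, one alternative is to give df a path: it then reports only the filesystem that holds that path, whether local or automounted. A minimal sketch follows; the function name show_fs is just an illustrative helper, not a command installed on the cluster.

```shell
#!/bin/sh
# show_fs is a hypothetical helper name, not a cluster command.
# Report the filesystem that holds a given path (default: your home directory).
# Passing a path to df restricts the output to the one filesystem containing it.
show_fs() {
    df -h -- "${1:-$HOME}"
}
```

Calling show_fs with no argument reports on the partition holding your home directory; show_fs /data reports on the local /data partition.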
Here's the result of executing
df -hl
on the machine
tanya on 28-Jan-2017:
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VG-root    20G  9.2G  9.5G  50% /
tmpfs                 494M  760K  493M   1% /dev/shm
/dev/md0              461M   98M  340M  23% /boot
/dev/mapper/VG-home    50G   16G   32G  33% /home
/dev/mapper/VG-data   222G  6.1G  205G   3% /data
If we ignore partitions that relate to the operating system, we're left with two key user-accessible filesystems: /home and /data. (Many systems have other key partitions, such as /share and /scratch.)
It's not much
Your first reaction may be: "There's not much disk space for my home directory, and I have to share that space with other people in the collaboration. Why, my watch has more storage than that!"
You're right. The /home partition is intended for "source" files (program code, scientific papers, plots, etc.);
/data
or
/scratch
should be used for large and re-creatable files (compiled binaries, data summaries, temporary work files, etc.). We have to ask you to use judgement and discipline, and to be aware that you're sharing space with your fellow scientists.
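To decide what to move out of /home, it can help to see which directories are taking up the space. The sketch below assumes GNU du and GNU sort (sort -h understands sizes such as 1.5G); the function name biggest is made up for this example.

```shell
#!/bin/sh
# biggest is a hypothetical helper name, not a cluster command.
# List the entries directly under a directory, smallest first and largest
# last, with human-readable sizes. Defaults to your home directory and to
# the 10 largest entries. Hidden (dot) files are not matched by the * glob.
# Assumes GNU du and GNU sort.
biggest() {
    dir="${1:-$HOME}"
    du -sh -- "$dir"/* 2>/dev/null | sort -h | tail -n "${2:-10}"
}
```

For example, biggest ~ 5 lists the five largest items in your home directory; large re-creatable items among them are candidates for /data or /scratch.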
If you're just skimming this page, stop and read this
The reason why
/home
is small and
/data
is big is that the
/home
partition is backed up;
/data
is not. In fact, it goes one step further: the
/data
partition is always considered expendable for any type of system maintenance activity. If a system is being repaired, upgraded, or restored, the
/data
partition may be erased.
There's more about this in the section on backups below.
What do I do if I need more disk space?
First, look to
/data
partitions on other systems in your working group. The
/data
partitions on all the systems that belong to a group are intended to be a shared resource; if you don't have enough space on /nevis/yourmachine/data, cd to /nevis/othermachine/data on another machine in your group and see how much free space it has.
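Rather than visiting each machine's /data by hand, you could survey several at once. This is only a sketch: the /nevis/machine/data layout comes from the paths above, while the function name data_free and the DATA_ROOT variable are inventions for this example, and the machine names are whatever your group actually uses.

```shell
#!/bin/sh
# data_free and DATA_ROOT are hypothetical names used only in this sketch.
# Print one df line for the data partition of each machine named as an
# argument. DATA_ROOT defaults to /nevis, following the
# /nevis/<machine>/data paths described in the text.
data_free() {
    root="${DATA_ROOT:-/nevis}"
    for machine in "$@"; do
        dir="$root/$machine/data"
        if [ -d "$dir" ]; then
            # -P keeps each filesystem on a single line; the last line
            # of the output is the data row for this partition.
            df -hP -- "$dir" | tail -n 1
        else
            echo "$machine: $dir is not accessible" >&2
        fi
    done
}
```

You would call it with your group's machine names as arguments, for example data_free machine1 machine2, and compare the "Avail" column in the rows it prints.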
I strongly advise you to exercise common courtesy as you're scrounging for disk space. If I found someone had used a big chunk of my server's
/data
partition without asking, I might be annoyed.
If you still don't have enough disk space on all your group's machines to satisfy your needs, you may have to request more disks be added to the existing systems (or buy a new box).
Backups
The Nevis Linux cluster is backed up nightly onto
shelley, the Nevis backup server.
For speed, we don't copy every file from every system; we use a program called
rsync
to copy over only those files that have changed since the day before.
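To illustrate the idea of copying only what has changed, here is a small rsync sketch. It is not the actual Nevis backup command, whose options are not given on this page; the function name mirror and the directory arguments are placeholders.

```shell
#!/bin/sh
# mirror is a hypothetical helper, NOT the actual Nevis backup procedure.
# Copy the contents of a source directory into a destination directory.
# On repeat runs rsync skips files whose size and modification time are
# unchanged, so only new or modified files are transferred.
#   -a  archive mode: recurse and preserve permissions, times, and symlinks
#   -i  itemize changes: print one line per file that is actually updated
mirror() {
    rsync -a -i -- "$1"/ "$2"/
}
```

Running mirror twice in a row on an unchanged directory transfers no files the second time, which is why a nightly run is much faster than a full copy.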
We don't back up every file on every system on the cluster. The policy is: the /home and /share partitions are backed up;
/data
is not. There is a web page that contains the
list
of which partitions are backed up.
We maintain previous versions of old files on
shelley. (Actually we do an incremental
tar
of the disk images after the
rsync
procedure has run for all the machines in the cluster.) This means we can recover old versions of files if necessary. However, there's a time limit: we keep old file versions for only 10 days. We cannot recover files that were deleted or overwritten before that.
Long-term data storage
For the purposes of this section, "long-term" means more than six months or so.
By the above definition, there is no long-term data storage at Nevis. As noted above:
- we back up /home directories, but keep old versions for only a short time (see the Backups section above);
- /data directories are not backed up at all;
- RAID arrays can and do fail. (This section is being written on 25-Apr-06; on that day, we lost the contents of a RAID5 array.)
If you need long-term storage for any of your files, I suggest you consider the facilities at BNL, FNAL, or CERN.