Condor Basics 
This page describes some of the basics for setting up jobs on the particle-physics computer cluster at Nevis.
For a quick introduction to condor at Nevis, see the batch processing and condor tutorial documents on WilliamSeligman's ROOT tutorial page. 
Warning: Using condor is not trivial. You'll have to learn quite a few details, and about disk sharing. What follows are a few basic concepts, but they are not enough on their own to get you started. 
 Documentation 
Condor was developed at the University of Wisconsin. Here is the User's Manual. 
 Steps 
There are usually three steps to developing a program to submit to condor.
 Program 
This is the code that you want condor to execute. I assume you've developed the program interactively, but you now want to automate its execution for condor. You probably don't have to change the program itself, but you may have to move the executable and any libraries to a disk that's visible to the condor batch system. 
 Script 
In theory, you can run a program directly without a script; most of the examples in /usr/share/doc/condor-*/examples do this. In practice, programs in physics typically need scripts to organize their execution environment. A shell script would invoke the module load commands or setup scripts needed to run the program; if you have to type source my-experiment-setup-sh before running your program, you'd put that command in a shell script. Don't forget to make the shell script executable; e.g., chmod +x myscript.sh. 
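Here is a minimal sketch of such a wrapper; the script and executable names are made up for illustration, and the setup command is the one from the example above: 

#!/bin/bash
# myscript.sh -- the wrapper that condor actually executes.
# Set up the experiment's environment first.
source my-experiment-setup-sh
# Then run the real program; "myanalysis" is a hypothetical executable
# that must sit on a disk visible to the batch nodes.
./myanalysis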
Scripts can become "mini-programs" themselves, as in this example. In one simulation script I wrote, the shell script determined the particle ID to be input into the Monte Carlo, the energy of that particle, and the random number seeds, and constructed a unique name for the output file. 
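A sketch of that idea, assuming the script receives a job number as its first argument (every name and parameter below is invented): 

#!/bin/bash
# simulate.sh <job-number> -- a hypothetical "mini-program" wrapper.
JOB=$1
PARTICLE_ID=13                      # made-up choice: always simulate muons
ENERGY=$(( 10 + JOB * 5 ))          # made-up energy scheme, in GeV
SEED=$(( 1000 + JOB ))              # a different random-number seed for each job
OUTPUT=sim_pid${PARTICLE_ID}_${ENERGY}GeV_job${JOB}.root
# "my-monte-carlo" stands in for whatever program you actually run.
./my-monte-carlo --pdg ${PARTICLE_ID} --energy ${ENERGY} --seed ${SEED} --output ${OUTPUT}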
 Command file 
Condor requires that jobs be submitted via a condor command file; e.g., condor_submit mycommands.cmd. This command file tells condor the script to execute, what files to copy, and where to put the program's output. The command file also tells condor how many copies of the program to run; that's how you submit 1000 jobs with a single command. 
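As a rough sketch (the file names are invented; check the User's Manual for the full set of commands), a command file might look like this: 

# mycommands.cmd -- a sketch of a condor command file.
# Use the vanilla universe (see the note about sh_loop below).
universe   = vanilla
# The shell script that condor will run.
executable = myscript.sh
# $(Process) is the job number (0, 1, 2, ...), so each job gets its own files.
output     = myjob-$(Process).out
error      = myjob-$(Process).err
log        = myjob.log
# Run 10 copies of the job; "queue 1000" would submit 1000 jobs.
queue 10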
 Batch Clusters 
There is more than one particle-physics batch cluster at Nevis, due to the different analysis requirements of some groups: 
-  The general cluster, available to all groups;
-  The Neutrino cluster, for use by that group;
-  The ATLAS Tier3 cluster, which has different procedures for dealing with files than the other two.
The cluster which executes a job is determined by the machine on which you issue the condor_submit command. For example, if you submit a job from a Neutrino system, it runs on the Neutrino cluster; if you submit a job from kolya or karthur, it runs on the general cluster; if you submit a job from xenia, it runs on the ATLAS cluster. 
 Where to learn 
Let's start with the obvious: If someone else in your group has a set of condor scripts that work at Nevis, copy them! If you have to write your own:
 The standard condor examples 
One way to start is to copy the Condor examples: 
cp -arv /usr/share/doc/condor-*/examples .
cd examples 
Read the README file; type make to compile the programs; type sh submit to submit a few test jobs. Note that these examples are several years old, and you may have to do some debugging to get the compilation process to work. 
You may notice that the sh_loop script does not execute; it sits in the "Idle" state indefinitely. It won't run unless you submit it in the vanilla universe; see batch details.
Other programs in the examples may not work either. Look at the output, error, and log files; search the web for any error messages. This will provide experience when your "real" jobs begin to fail. 
 Some practical examples 
Many of the details have been combined into a set of example scripts. The Athena-related scripts are in ~seligman/condor/; start with the README file, which will point you to the other relevant files in the directory. Note that these examples were prepared in 2005, before we figured out how to do disk sharing properly. 
 Submitting multiple jobs with one condor_submit command 
An ATLAS example: Running Multiple Jobs On Condor. 
As of Jun-2008, you can find several examples of multiple job submission in /a/home/houston/seligman/nusong/aria/work; these go beyond the tips in the above link, generating both numeric and text parameters that vary according to condor's process ID. Look in the *.cmd files, which will lead you in turn to some of the *.sh files in that directory. There are hopefully enough comments in those scripts to get you started. Again, these examples were written before we figured out how to do disk sharing. 
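A rough sketch of the idea (everything below is invented for illustration, not copied from those directories): the command file passes condor's process number to the script, and the script turns that number into both numeric and text parameters. In the command file: 

executable = simulate.sh
arguments  = $(Process)
queue 100

and in simulate.sh: 

#!/bin/bash
JOB=$1                                  # condor's process number, 0..99
ENERGY=$(( 5 + JOB ))                   # a numeric parameter derived from the job number
PARTICLES=(mu- mu+ e- e+)
PARTICLE=${PARTICLES[$(( JOB % 4 ))]}   # a text parameter chosen by the process ID
OUTPUT=run_${PARTICLE}_${ENERGY}GeV_${JOB}.root
# ...then run your program with these parameters and write to ${OUTPUT}.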
 What's next 
Now look at the pages on disk sharing and batch details. This will help you create scripts and command files that work in the current Nevis environment. Once you understand the concept of organizing the resources for your job, the rest of condor is relatively easy. 
Good luck!