Getting Started with Atlas

This page is organized into several sections. The primary goal is to get you logged into our computers and to set up the software so you can analyze data as quickly as possible. The first part covers the steps you'll all need to follow to log into your account, along with general setup and useful tips for working on our cluster. This is followed by analysis-specific instructions and links to papers, talks, etc. Finally, we've included some links to more general Atlas, CERN, or particle physics webpages that are useful, interesting, or both.

General Recipes:

First, you will probably need general CERN/Atlas accounts to be able to use our code repository (SVN) and to read our papers, talks, etc. John Parsons will help with this step.

You should all have an account on the Nevis computing cluster already, and this same username and password will allow you to log into the Atlas cluster, called xenia. There are ~150 processors in this cluster, but most of them are reserved for batch processing. We work interactively on two nodes: xenia.nevis.columbia.edu and xenia1.nevis.columbia.edu. You can log into them from a terminal window using the ssh command:

ssh -Y [USERNAME]@xenia1.nevis.columbia.edu

From campus, you can't log in to xenia1 directly, but you can still get there in one line with something like:

ssh -Y -t [USERNAME]@kolya.nevis.columbia.edu ssh -Y [USERNAME]@xenia1

(use your username, not "[USERNAME]"). You are now logged in and should see a prompt like:

 [summerstudent@xenia1]~%

To set up our environment you will need to add the following lines to your .zshrc file. You only have to do this once. Open the .zshrc file with an editor (my favorite is emacs):

 emacs ~/.zshrc

and add these lines at the end:

 export ATLAS_LOCAL_ROOT_BASE=/a/data/xenia/share/atlas/ATLASLocalRootBase/
 alias setupATLAS='source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh'

To run our Columbia framework, AnalysisUtilities, with minimal setup, it's helpful to have a compatible root version as your default. You can set this up by adding the following lines:

export ROOTSYS=/a/data/xenia/share/atlas/ATLASLocalRootBase/x86_64/root/5.27.02-slc4-gcc3.4

export PATH=$PATH:$ROOTSYS/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ROOTSYS/lib:.

I've made a script that you should run every time you log into xenia; it sets up the environment for using the Atlas software. Copy the file from my home area:

 cp /a/home/karthur/tandeen/forSummer2011.sh .

and then

 source forSummer2011.sh

We use SVN as a way to save, back up, and share our analysis code. Think of it as an online library where we check out (and eventually check in) our work. To check out packages of code you will need to do the following:

export SVNAP=svn+ssh://summerstudent@svn.cern.ch/reps/apenson
cd /a/home/karthur/summerstudent   # or, if this doesn't work: cd /a/home/kolya/summerstudent
kinit                              # get a Kerberos ticket so the ssh connection to svn.cern.ch works
svn --username summerstudent co $SVNAP/ArCondNevis
rm -rf Analysis
svn --username summerstudent co $SVNAP/Analysis

This will give you a clean version of our analysis code to start with. You might want to add that first line to your .zshrc file so you don't have to type it in every time you want to check out or update the code from SVN.

Specific Analysis Recipes

Excited Electrons

Now that we have the tools set up we'll need to grab some code. First make a working area; we'll do this in your home area for now. I like to organize things, so make a directory like "Summer2011" in your home area (by the way, these first steps are a bit repetitive):

mkdir Summer2011

then in that directory do

export SVNAP=svn+ssh://tandeen@svn.cern.ch/reps/apenson 
kinit tandeen@CERN.CH 
svn --username tandeen co $SVNAP/ArCondNevis

This creates several new directories under ArCondNevis, including arc_d3pd and Analysis. We'll look at arc_d3pd later: it is how we run our analysis in batch (i.e., many jobs at once) on our cluster, but for now we will just start with one thing at a time! The other directory is Analysis. This is a placeholder; to get the latest, greatest versions we will do:

cd ArCondNevis 
rm -rf Analysis

and then

svn --username tandeen co $SVNAP/Analysis   # <-- always use your username instead of mine

Now, in case your CERN account is not in order right away (which would prevent you from using SVN), you can just copy all of this from my home directory like this:

cp -rp /a/home/karthur/tandeen/forSummerStudent4 .

(note for me: this is revision 2173)

It should be identical. However, as soon as your account is ready you should check the code out of SVN and use that version. You will need SVN to check your changes in eventually, so it is best to start with it as soon as possible.

Within the Analysis directory there are many subdirectories, each with a specific analysis. You can poke around any of them you'd like; there are some good examples in there of how to code and do analysis on Atlas. But we are interested in ExcitedElectrons in particular. Go to this directory:

cd ExcitedElectrons

If you do "pwd" you should now be in something like:

/a/home/karthur/tandeen/forSummerStudent4/ArCondNevis/Analysis/ExcitedElectrons

You need to copy a couple of files over from my area that aren't in SVN. These are files that we update regularly as we add more and more data, so we just keep them locally. They are here:

cp /a/home/karthur/tandeen/mu_hists/May27/mu*.root . 
cp /a/home/karthur/tandeen/fileLists/localListdata_2011skim_ExcitedEl* . 
cp /a/home/karthur/tandeen/GRLs/* .

then do:

source compileEverything.sh

This will take ~1/2 hour because the first time you run this it has to, well, compile everything. This includes code that is not in the ExcitedElectrons directory, mainly AnalysisUtilities. Many of us use AnalysisUtilities, which contains common code that is useful in many analyses. It is a largish package, but the good news is we only need to compile it once (for now), and the code you will actually be editing in ExcitedElectrons will compile quickly after this.

While you wait, here is some reading material:

Interesting Links:

Atlas: http://atlas.web.cern.ch/Atlas/Collaboration/ (which you probably already know)

General excited electrons Twiki: https://twiki.cern.ch/twiki/bin/view/AtlasProtected/ExcitedLeptonAnalysis2011

Long note about Z' searches, which is closely related to the excited electron search (published this spring): http://cdsweb.cern.ch/record/1325590/files/ATL-COM-PHYS-2011-083.pdf

The published paper about the Z' search: http://arxiv.org/abs/1103.6218

At the top of this page https://twiki.cern.ch/twiki/bin/view/AtlasProtected/ExcitedElectrons we have a quick intro to excited electrons.

The longer, original paper is here: http://prd.aps.org/pdf/PRD/v42/i3/p815_1

(You should be able to access this at CERN or from within a university network, but maybe not otherwise.) It is a little heavy math-wise, but the intro and conclusions are more accessible.

The link to the D0 result is here: http://arxiv.org/abs/0801.0877

(D0 is an experiment much like Atlas, but at the Tevatron in Illinois.)

And the CDF result is here:

http://www-cdf.fnal.gov/physics/preprints/cdf7035_estar_prl_v5.ps

(like D0, at the Tevatron)

And I am including a longer, internal D0 note that has more specifics about the analysis here:

http://www.nevis.columbia.edu/twiki/pub/ATLAS/SummerStudents2011/D0_note_1.pdf

Running the Analysis

Once you have compiled the excited electrons code you should be in the ExcitedElectrons directory and you should see a file called "run". As you might guess, this is the main executable that you will run like this: (it takes a minute or two)

./run /a/data2/xenia/users/tandeen/mc/mc10b/signal/user.aabdelal.mc10_7TeV.119291.Comphep_Estaregam700.merge.AOD.e778_s933_s946_r2302_r2300.16.6.5.2.SMWZD3PD_01.110519185612/user.aabdelal.003598.StreamNTUP_SMWZ._00001.root physics EF_e20_medium results.root none -mc

There are a number of arguments here. The first tells the program where the input file is; for now we will just run over this one input file. You should open this file with root and see what is in there. The general idea is that the event-by-event data is stored in collections called trees. The next argument is the tree name, in this case "physics". Variables in trees can be plotted from within this root file, but once one starts to make more complicated cuts and corrections we need an actual program to help us do the analysis. The next argument is the trigger name. There are collisions every 50 ns at Atlas right now. This is far too much data arriving far too quickly to read out and save to disk, so we have "triggers" that run in real time on the detector and tell us at a very basic level whether an event is interesting. You can learn more about triggers in chapter 8 (and about the rest of Atlas too) here:

https://twiki.cern.ch/twiki/pub/Atlas/AtlasTechnicalPaper/Published_version_jinst8_08_s08003.pdf

For us, the trigger EF_e20_medium means we are interested in "e"lectron objects with transverse energy (Et) > "20" GeV that pass "medium" identification cuts (as opposed to "loose" or "tight"; we'll talk about the details later). Next we give it the output file where we want the data to go, "results.root". This file will have our output histograms. Then "none" is an option for additional output that we won't use now, and "-mc" indicates we are running over Monte Carlo.
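If you want to poke at the input file yourself, here is a minimal PyROOT sketch. It is just a sketch: run it after the python/root setup described in the Python section below (the root version in your .zshrc doesn't play well with python), and note that the branch name "el_n" is only my guess at a typical D3PD variable, so pick a real one from the printed branch list.

import ROOT

# open the same signal MC file used in the ./run command above
f = ROOT.TFile.Open("/a/data2/xenia/users/tandeen/mc/mc10b/signal/user.aabdelal.mc10_7TeV.119291.Comphep_Estaregam700.merge.AOD.e778_s933_s946_r2302_r2300.16.6.5.2.SMWZD3PD_01.110519185612/user.aabdelal.003598.StreamNTUP_SMWZ._00001.root")
f.ls()                           # list the trees and directories in the file

tree = f.Get("physics")          # the tree name passed to ./run
print(tree.GetEntries())         # number of events in this file

# print the first few variable (branch) names stored for each event
for branch in list(tree.GetListOfBranches())[:10]:
    print(branch.GetName())

# quick plot of one variable; "el_n" (number of electrons) is an assumed
# branch name -- substitute one from the list printed above
tree.Draw("el_n")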

Go ahead and run it. The output will be in results.root (or whatever name you have given it). We have just run over a Monte Carlo file. As you might know, we use a technique called Monte Carlo (MC) to simulate what we expect our data to look like, given a certain set of assumptions. With the MC we generate events according to a physics model, in this case a model in which the Standard Model of quarks and leptons is true, but there is also an extension that includes excited electrons. We take these events and put them through a detector simulation, taking the decay products that we simulate (in this case mostly electrons and photons) and testing how the Atlas detector would observe them, i.e., how these particles will interact with the detector. For example, for an event in which there was an excited electron which decayed into an electron and a photon we ask: was the photon actually produced in a location and with a direction that would allow the detector to see it, or perhaps was the photon produced in such a way that it went between the cracks of the detector and was not actually observed? This we call acceptance, and we can talk more about it later. We take into account all the effects we can think of when simulating these events, to make this look as much like the data we observe from the detector as possible.

Essentially, this MC prediction is our hypothesis, which we compare to the data to test our proposed model of new physics. Sometimes we can look at the MC and right away know that, given the detector and the energy we have with our LHC protons, we will not be able to observe the effect. Most of the time we are very optimistic and assume in our MC that this new effect is right around the corner. Then we compare the MC to the data and can say either that we see something, discovering something new, or that we don't see anything. Not discovering anything is a result too, though: in that way we can limit the effect of this type of new physics. We don't usually say that the new physics can never be; we say that, in the range of energies we can observe, this is not the case.

The MC does not have to represent some specific model of new physics either (though in our case it does). It could just be the most precise description of the Standard Model we have, and then we compare the events we see in data to this and just look for any difference at all. And of course, if we are really lucky as experimenters, we will have an experiment which is sensitive to nature in a way that no one has ever probed before (and this is what we have!) and we can just look at the data and see something completely new and unexpected which no model (Standard or otherwise) predicts. As you probably already know, what is exciting right now is that Atlas might really be able to see new physics: in every model we test, in every decay channel we search, there might be something new.

What are excited electrons?

Let me add a few short words about excited electrons. An excited electron is an electron that has been "excited" above its ground state. This excitation is massive, so it is short-lived and quickly decays into an electron (an unexcited, normal, stable electron) and a photon (the energy it emits as it returns to its ground state). I think people usually make a little joke/analogy here about excited electrons (or other excited particles) being like people at a club late into the night. They are filled with energy, but after dancing away enough energy they eventually (for some of us rather quickly) decay into their more stable, everyday selves. So we're looking for the exciting electrons living it up in nightclubs, but so far we haven't made it past the bouncer. (Meh, not so bad for physics humor, I think.)

Anyway, a more useful analogy is to think of these excited electrons as being like a hydrogen atom that is excited above its ground state. In the case of an atom we think of the electron occupying a higher energy level (think back to chemistry, I guess). But in the case of an electron... well, what could be occupying the higher energy level? We think of electrons as point-like, fundamental particles: how does an excited electron have more energy than a regular electron? What within it did we excite? This is the "trick": if we see an excited electron it means that the electron itself is not fundamental after all. There must be some sort of structure that is smaller and even more truly fundamental than electrons, the way the electron is to the atom, or the way quarks are to the protons and neutrons they make up. This is why excited electrons are interesting. Why haven't we seen them yet? Because the energy it takes to produce one has not been available to us in our colliders. But perhaps it is now.

Analysis output

But back to the results.root file you have just made. Open it up in root like:

root results.root

From the root command prompt open a root browser with

TBrowser b

From here you should see something like a directory structure that you can click through. Go to "ROOT Files", where you should see "results.root". Click on this and you'll have a list of all the directories within this file. Each of these contains a set of histograms, the content of which is (albeit vaguely) hinted at by the directory title. For example, in "kin_electrons" you will find certain kinematic variables plotted for the electrons that pass our preselection. You might try clicking on a few, like "et", which will show you the histogram of the transverse energy of the electrons. For an explanation of all the variables in here you can look here:

https://twiki.cern.ch/twiki/bin/view/AtlasProtected/D3PDContent or more specifically:

https://twiki.cern.ch/twiki/bin/view/AtlasProtected/D3PDContentElectron

and of course I'll tell you about some of them later on.
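If you prefer typing to clicking, here is a small PyROOT sketch of pulling one of these histograms out of its subdirectory and saving it to a file. The "kin_electrons/et" path follows the description above, but treat it as an assumption; verify the exact names in the TBrowser if Get returns nothing.

import ROOT

f = ROOT.TFile.Open("results.root")

# histograms live inside named directories; Get takes a "dir/name" path
h_et = f.Get("kin_electrons/et")

c = ROOT.TCanvas()
h_et.Draw()
c.Print("electron_et.eps")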

The most interesting root histograms are in "TwomedElLoosePh". This means we have selected events with 2 medium electrons (medium being a measure of how electron-like we think they are) and 1 loose photon. In here there are several more directories. For example, we have "e0p_TwomedElLoosePh", which is where we take the electron with the most transverse energy and the photon and combine them into a single object (i.e., we add their 4-vectors together; there is a short sketch of this below), and we get interesting things like their mass and momentum. You should click on "m_log", which is the invariant mass of the object, but plotted out to several TeV. If excited electrons exist they will form a resonance, and you will see a peak at a particular mass value. Here we used MC inputs with a mass of 700 GeV, and sure enough we have a significant peak there. You might try running over the other signal MC files we have made at different excited electron mass values and looking in the same directory:

/data2/users/tandeen/mc/mc10b/signal

You should see some significant differences.
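To make the "add their 4-vectors together" step concrete, here is a tiny PyROOT sketch. The kinematics are invented for illustration only; in the real analysis the numbers come from the selected electron and photon in each event.

import ROOT

# build the electron and photon 4-vectors from (pt, eta, phi, m); GeV units
el = ROOT.TLorentzVector()
el.SetPtEtaPhiM(350.0, 0.5, 1.2, 0.000511)   # leading electron

ph = ROOT.TLorentzVector()
ph.SetPtEtaPhiM(320.0, -0.3, -1.9, 0.0)      # photon, massless

# "combining into a single object" is literally a sum of 4-vectors;
# the invariant mass of the sum is what fills the m_log histogram
epair = el + ph
print("e-gamma invariant mass: %.1f GeV" % epair.M())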

(By the way: for root help and tips ask me, or the grad students around here; everyone needs root help at some point. There is also http://root.cern.ch, which has some decent tutorials, but with root you can often save hours of work by just asking.)

You should take a look around at the various histograms. Some I think you'll be able to figure out, but others I'll explain later.

In the ExcitedElectrons directory there is also a file "readme_myAnalysis.txt" which has some useful instructions that you should read. Eventually (soon) we will be combining several root files and comparing data and MC. You can take a look at some of the code that does this in ExcitedElectrons/results_post (which also has a readme file).

At this point I think it would be most interesting for you to run over the actual Atlas data, specifically 2 of our most recent runs (http://atlas-runquery.cern.ch/query.py?q=find+run+180664 and http://atlas-runquery.cern.ch/query.py?q=find+run+00180776 -- the latter is from May 3rd, but that's still pretty fresh). You can run like this:

./run localListdata_2011skim_ExcitedEl_short physics EF_e20_medium results.root none GRL.xml >& results.log &

(note that this overwrites the previous output file!). This will be short, and you should take a look at the histograms now. You might also try localListdata_2011skim_ExcitedEl instead of localListdata_2011skim_ExcitedEl_short, but the best way to run over data files (which are larger and take longer to run over) is using the batch system. That's the next section I think.

How to use the condor batch system on xenia

A useful twiki page is here:

http://www.nevis.columbia.edu/twiki/bin/view/ATLAS/ArCondNevis but I will add detailed instructions.

We submit files to the condor cluster from the directory

Summer2011/ArCondNevis/arc_d3pd/

Here we have arcond.conf. This is the main configuration file where we indicate the datasets (for example a dataset with part of the 2011 run, or a dataset with a particular type of MC) we want to run over. Instead of editing this file (for now) you will need to copy a few files from my area again. First:

cp -r /a/home/karthur/tandeen/ArcondFiles/arcond_2011 .

(you should be in the Summer2011/ArCondNevis/arc_d3pd/ directory). In this new directory there are config files set up for running over MC and data. We want to look at data, so:

cp arcond_2011/arcond_data2011.conf arcond.conf 

(this replaces the existing arcond.conf file).

Looking at this new arcond.conf file you'll see the line:

input_data = /data/xrootd/data/xrootd/data11/ExcitedEl/

This is the "directory" where the skims are. It is not exactly the directory location on xenia: these files are distributed across the cluster, and when you submit a job the condor system makes sure to send jobs to the nodes where your files are. (Instead of keeping the files in one place and copying the data to the CPU we run the job on, we copy the analysis code to where the data is and run the job there. It turns out to be faster not to move the data files, which can be large.) You can see what files are in here by doing:

arc_nevis_ls /data/xrootd/data/xrootd/data11/ExcitedEl/

There should be 152 files there.

Now we need to copy two other files over. They will go in the "user" directory (so something like Summer2011/ArCondNevis/arc_d3pd/user).

cp /a/home/karthur/tandeen/ArcondFiles/user/* user/. 

We have two files, one for running on MC and one for data (the difference is the way we run the code, with or without a trigger selection for the data). We want the data one first:

cd user
cp ShellScript_BASIC.sh_data ShellScript_BASIC.sh
cd ..

One more step, which you only have to do once (and only because I forgot to add these files last week). In the

Summer2011/ArCondNevis/arc_d3pd/patterns 

directory do

svn update

This should add 6 new files: schema.site.xenia03.nevis.columbia.edu.cmd through schema.site.xenia08.nevis.columbia.edu.cmd.

One last step; for historical reasons we have to do

mkdir ~/Summer2011/ArCondNevis/Analysis/cmt 

in the analysis area you will be using.

Now you should be able to submit jobs. You must be logged onto xenia (not xenia1) for this to work. From the Summer2011/ArCondNevis/arc_d3pd/ directory just do

arcond

This will ask you several questions; you should agree to everything (things like "do you really want to use these files" or "do you want to submit the jobs"). At the end it should tell you that you can check the submission with the command

condor_q

and you should see your jobs (and anyone else's that happen to be running) listed. Once you get to the top of the queue and your jobs start to run, it should take only a few minutes.

You can check the output with

arc_check

It should report that there are output root files for all of the jobs submitted. Finally, you can add all the output files together with

arc_add

and the file Analysis_all.root should appear in Summer2011/ArCondNevis/arc_d3pd/. This is the output root file, and with it you've run over all the data we have on xenia. This is the data through the beginning of May; I hope to add more soon.

Randall-Sundrum Gravitons in Dielectron and Diphoton Final States

Here are some papers to get started:

D0 Paper link: http://arxiv.org/abs/1004.1826

Draft of ATLAS Z' Conference note (2011 data): https://svnweb.cern.ch/trac/atlasgrp/browser/Physics/Exotic/Analysis/Dilepton/Resonance/Papers/Y2011/CONF_PLHC/zp.pdf

ATLAS Diphoton Resonance Conference note (2010 data): http://cdsweb.cern.ch/record/1331846?

Original Theory Paper: http://arxiv.org/pdf/hep-ph/9905221

Check out Analysis, as described above.

cd AnalysisTemplate

Where to find data ntuples, with the signal selection applied (from xenia1):

/scratch/earwulf/ntuples/DiEMPairs_data/2011/

Ntuples are organized by data-taking period.

To run over all the data, you can do something like:

./run /scratch/earwulf/ntuples/DiEMPairs_data/2011/diEMPairs* analysisTree plots.data.root

Monte Carlo ntuples, with the same selections applied, as well as EventWeights calculated, can be found here:

/scratch/earwulf/ntuples/DiEMPairs_mc/2011/

To run over MC, you can do something like:

./run /scratch/earwulf/ntuples/DiEMPairs_mc/2011/diEMPairs.mc.[sample].root analysisTree plots.[sample].root -mc

where [sample] is one of the available samples:

zee

w+jets

ttbar_binned

ttbar_unbinned

diboson

(You probably don't want to run over all the MC at once like with the data, as it is helpful to know which MC process an event came from. If you get tired of typing, see the loop sketch below.)
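Here is a small convenience sketch for running ./run over each MC sample in turn from Python. It assumes you are in AnalysisTemplate and that the sample names match the ntuple files described above.

import subprocess

samples = ["zee", "w+jets", "ttbar_binned", "ttbar_unbinned", "diboson"]
for sample in samples:
    ntuple = "/scratch/earwulf/ntuples/DiEMPairs_mc/2011/diEMPairs.mc.%s.root" % sample
    output = "plots.%s.root" % sample
    subprocess.call(["./run", ntuple, "analysisTree", output, "-mc"])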

Once you've made some plot files, you can plot them interactively in root, like:

root plots.data.root
my_plot->Draw()

etc., etc., but once you have a lot of plots to make, fits to do, and fussy style concerns, this can be impractical. It's better to use a script. Since you know something about root and C++, you can write a root macro and run it, but this isn't much faster than running interactively. Another option is to write a python script.

Python

Unfortunately, the version of root set up in your .zshrc file doesn't play well with python, and vice versa. You might want to start by opening up a new terminal to use for plotting.

To set up python, do:

setupATLAS
localSetupROOT

Now you can try writing a little script:

emacs draw_plots.py

You need to import ROOT and the root classes you need:

import ROOT
from ROOT import (TFile, TLegend, THStack, TCanvas, TH1F, gROOT, gStyle)

To prevent python from opening up X windows all the time, add:

gROOT.SetBatch()

To get nice-looking, ATLAS-style plots, import SetAtlasStyle:

import SetAtlasStyle

For this to work, you need to have SetAtlasStyle.py somewhere in your PYTHONPATH; a good place is the directory you are running from. If you are in AnalysisTemplate, you can try:

cp ../AnalysisUtilities/python/SetAtlasStyle.py .

Now, back to the script. We can try loading the root files and putting them into a python dictionary:

inputFiles = { "data": TFile("plots.data.root"),
               "zee": TFile("plots.zee.root") }

Make a TCanvas (you need this to draw your plots on):

c1 = TCanvas()

You can retrieve the plots like this (use the same histogram name for both files, since the same analysis code filled each):

your_data_plot = inputFiles["data"].Get("[your plot name]")
your_zee_mc_plot = inputFiles["zee"].Get("[your plot name]")

and draw them:

your_data_plot.Draw("e")
your_zee_mc_plot.Draw("e,same")

where the plot option "e" specifies that you want error bars, and "same" tells ROOT to overlay the second plot on the first.

To make a nice eps file to look at, you can do:

c1.Print("your_plot_name.eps")

Try saving this script so we can try it out. To run the script, do:

python draw_plots.py

Your .eps file should appear. You can try opening it with ghostview:

gv your_plot_name.eps
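For reference, here is what the whole draw_plots.py might look like with the pieces above put together. The histogram name is still a placeholder you need to fill in, and SetAtlasStyle.py must be findable as described above.

import ROOT
from ROOT import TFile, TCanvas, gROOT

gROOT.SetBatch()        # no X windows
import SetAtlasStyle    # needs SetAtlasStyle.py in your PYTHONPATH

inputFiles = { "data": TFile("plots.data.root"),
               "zee":  TFile("plots.zee.root") }

plot_name = "[your plot name]"   # placeholder: use a real histogram name
your_data_plot = inputFiles["data"].Get(plot_name)
your_zee_mc_plot = inputFiles["zee"].Get(plot_name)

c1 = TCanvas()
your_data_plot.Draw("e")         # data with error bars
your_zee_mc_plot.Draw("e,same")  # overlay the zee MC
c1.Print("your_plot_name.eps")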

Once you have this working, you can try sprucing up your plot by adding a legend (in an appropriate place in your script):

legend = TLegend(0.60,0.61,0.92,0.91)
legend.SetShadowColor(0)
legend.SetFillColor(0)
legend.SetLineColor(0)
legend.AddEntry(your_data_plot, "data")
legend.AddEntry(your_zee_mc_plot, "zee mc")
legend.Draw()

For more information on TLegend, TCanvas, TH1F, etc., the root documentation is invaluable. See, for instance: http://root.cern.ch/root/html/TLegend.html. I find that a quick search with google usually pulls up the appropriate page at root.cern.ch.

You might find the following script useful, both to include and as a python/root example:

/scratch/earwulf/projects/AnalysisUtilities/python/makeFigure.py

(or in your own AnalysisUtilities directory, if you can run svn update)

If you copy this script to your local directory, or otherwise include it in sys.path by, for instance, adding:

import sys
sys.path.append("../AnalysisUtilities/python")

(where you've checked that ../AnalysisUtilities/python contains makeFigure.py)

you can then do:

import makeFigure
from makeFigure import SetColor

Then you can use:

SetColor(your_histogram, "blue")

to set the line/marker/fill color of your histogram to blue in one line (look over the script to see how it's done).

I've also added a function that you might find handy for making TLegends:

from makeFigure import MakeLegend
legend = MakeLegend( plots = {"data" : your_data_plot, 
                              "zee mc" : your_zee_plot} )

Try running over some of the other MC ntuples, so you have a few MC plot files to play with. To represent the sum of the backgrounds, while still distinguishing the contributions from each component, THStack is very useful. In your script, you can do something like:

your_stack = THStack("your_stack", "invariant mass of MC backgrounds")
your_stack.Add(your_zee_plot)
your_stack.Add(your_ttbar_plot)
your_stack.Add(your_diboson_plot)
your_stack.Add(your_wjets_plot)

Then you can draw it (along with the data we are comparing it to) with, say:

your_data_plot.Draw("e")

your_stack.Draw("same,hist")
your_data_plot.Draw("e,same")

where we redraw the data plot to make sure the axes are visible (try not doing this; maybe it isn't always necessary).

If you have axis drawing issues, you might also find it helpful to add the following after drawing the histograms but before printing the canvas:

c1.RedrawAxis()
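Putting the stack pieces together, a complete data-versus-background figure might look something like this sketch. The histogram name is a placeholder, and the sample list assumes you have made all the MC plot files described above.

import ROOT
from ROOT import TFile, TCanvas, THStack, gROOT

gROOT.SetBatch()

plot_name = "[your plot name]"   # placeholder: use a real histogram name

samples = ["data", "zee", "ttbar_binned", "diboson", "w+jets"]
files = dict((s, TFile("plots.%s.root" % s)) for s in samples)
plots = dict((s, files[s].Get(plot_name)) for s in samples)

# stack the MC backgrounds
stack = THStack("stack", "sum of MC backgrounds")
for s in ["zee", "ttbar_binned", "diboson", "w+jets"]:
    stack.Add(plots[s])

c1 = TCanvas()
plots["data"].Draw("e")          # draw data first so the axis range is set
stack.Draw("same,hist")          # overlay the stacked backgrounds
plots["data"].Draw("e,same")     # redraw data on top
c1.RedrawAxis()                  # put the axis tick marks back on top
c1.Print("mass_data_vs_mc.eps")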

One nice thing about python's speedy interpreter and concise syntax is that you can play around with different ways of doing things without wasting too much time. Since ROOT can be finicky (and at times poorly documented), this is useful.

Boosted W's

-- TimothyAndeen - 24 May 2011
