# Run/File Tags

To characterize physics data, it is useful to assign metadata at different levels of data abstraction. For data produced and used in the ALICE experiment, a three-layered structure is implemented: run level metadata, file level metadata and event level metadata.

This page is dedicated to the first two categories: the run and file level metadata. These will be stored in the ALICE File Catalog, where a directory structure keeps duplication of metadata to a minimum. For detailed information about the metadata fields at the run and file level, an internal note by Markus Oldenburg is available.

### Structure of the file catalog

The file catalog, where ALICE files will be registered, has the following structure:

• Simulated data: /sim////

where:

• Year: Can take a value like 2007, 2008 etc.
• AcceleratorPeriod: Can take a value relevant to the LHC period (LHC07a, LHC08b, ...).
• RunNumber: Is the run identifier (0001,...).
• ProductionType: Gives information about the specific simulation which was run, which includes the conditions applied: Ideal, Full, Residual (PDC06_Residual).

Several subdirectories are also foreseen, in which the files will be grouped according to their type. The agreed subdirectories are the following:

• raw/: Storage of raw data.
• reco//cond/: This will link to the directory where all the calibration and condition files used for this Pass will be stored.
• reco//ESD/: The directory where the ESDs and the tag files will be stored for each Pass.
• reco//AOD/: The directory where the AODs of this Pass will be stored.

The file names of the stored ALICE data will be kept simple and are unique only within the current subdirectory. For example, ESD files are named .AliESDs.root, where the prefix is the identifier of the corresponding raw data file (.raw.root, also called a raw data chunk). Different subdirectories (for example the reco//ESD directories under different parent directories) may therefore contain files with the same name but different content. Nevertheless, the GUID scheme, together with the directory structure, ensures that these files can be distinguished.

### Run/File meta data tags

| Tag Name | Data Format | Data Source |
|---|---|---|
| run comment | Text | DAQ logbook |
| run type | Text (physics, pedestal, ...) | DAQ logbook |
| run start time | yyyymmddhhmmss | DAQ logbook |
| run stop time | yyyymmddhhmmss | DAQ logbook |
| run stop reason | Text (normal, beam loss, ...) | DAQ logbook |
| magnetic field setting | Text (FullField, HalfField, ZeroField, ...) | DCS |
| collision system | Text (PbPb, pp, ...) | DCS |
| collision energy | Text (5.5TeV) | DCS |
| trigger class | | DAQ logbook |
| detectors present in run | bitmap: 0=not included, 1=included | DAQ logbook |
| number of events in this run | Integer | DAQ logbook |
| run sanity flag | bit or bit mask, default 1=OK | Manually |

For the reconstructed data:

| Tag Name | Data Format | Data Source |
|---|---|---|
| production tag | Text | reconstruction |
| production software library version | Text (AliRoot::v4-04-Rev-05) | reconstruction |
| calibration/alignment setting | Text | reconstruction |

For simulated data:

| Tag Name | Data Format | Data Source |
|---|---|---|
| generator | Text (HIJING, PYTHIA, ...) | offline |
| generator version | Text | offline |
| generator comments | Text | offline |
| generator parameters | Text | offline |
| detector geometry | | offline |
| detector configuration | bitmap: 0=not included, 1=included | offline |
| simulation comments | Text | offline |
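Two of the data formats listed in the tables, the `yyyymmddhhmmss` time stamps and the 0/1 detector bitmaps, lend themselves to a short illustration. The following is a minimal Python sketch, not part of the ALICE software: the function names are hypothetical, and the detector-to-bit ordering is an assumption made only for the example, since the tables do not specify it.

```python
from datetime import datetime

# Layout of the "run start time" / "run stop time" tags: yyyymmddhhmmss.
RUN_TIME_FORMAT = "%Y%m%d%H%M%S"


def parse_run_time(value: str) -> datetime:
    """Convert a yyyymmddhhmmss tag value into a datetime object."""
    return datetime.strptime(value, RUN_TIME_FORMAT)


def decode_detector_bitmap(bitmap: int, detectors: list[str]) -> list[str]:
    """Return the detectors whose bit is 1 (included).

    Bit i is taken to correspond to detectors[i]; this ordering is an
    assumption for illustration, not the real ALICE convention.
    """
    return [name for i, name in enumerate(detectors) if (bitmap >> i) & 1]


# Hypothetical bit order, used only in this example.
EXAMPLE_DETECTORS = ["ITS", "TPC", "TRD", "TOF"]

start = parse_run_time("20080319102030")
present = decode_detector_bitmap(0b0101, EXAMPLE_DETECTORS)

print(start.isoformat())  # 2008-03-19T10:20:30
print(present)            # ['ITS', 'TRD']
```

Storing the time stamps as fixed-width digit strings means they also sort chronologically as plain text, which is convenient for range comparisons in queries.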

### How to query the file catalog

Assume that a user wants to query the file catalog and extract a collection of ESD files. The files should fulfill the following criteria:

• They correspond to the 2008 run.
• They were created during the first period of the machine at that year.
• They were created from the third pass of the reconstruction.

The first step is to have an active AliEn session.

The query is then issued with the find command:

    $ find -x collection /alice/data/2008/LHC08a/*/reco/Pass3/* AliESDs.root

To store this collection, all that is needed is to redirect the output to an XML file:

    $ find -x collection /alice/data/2008/LHC08a/*/reco/Pass3/* AliESDs.root > collection.xml
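The query follows a fixed pattern: year, accelerator period, a wildcard for the run number, and the reconstruction pass. As a minimal sketch, the command string can be assembled from these parts in Python. The helper name `esd_find_command` is hypothetical; it only formats the text of the command shown on this page and does not talk to the file catalog.

```python
def esd_find_command(year: int, period: str, reco_pass: str,
                     file_name: str = "AliESDs.root") -> str:
    """Build the text of a find query for ESD files.

    The path layout mirrors the example on this page:
    /alice/data/<year>/<period>/*/reco/<pass>/*
    where the first wildcard matches every run number.
    """
    pattern = f"/alice/data/{year}/{period}/*/reco/{reco_pass}/*"
    return f"find -x collection {pattern} {file_name}"


command = esd_find_command(2008, "LHC08a", "Pass3")
print(command)
```

The resulting string is meant to be pasted into an active AliEn session; redirecting to a file is done there, as described above.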

The output can be restricted further by imposing selection criteria at the run/file level. Suppose, for example, that the ESDs from the previous example must also satisfy the following:

• Correspond to pp collisions.
• Were created between the 19th and the 20th of March 2008 (10:20:30).

then the above command is modified accordingly:

    $ find -x collection /alice/data/2008/LHC08a/*/reco/Pass3/* AliESDs.root Run:collision_system="pp" and Run:stop"2008-03-19" > collection.xml
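The effect of such run-level conditions can be illustrated outside the catalog. The Python sketch below applies the same two criteria to a few made-up run records. The records, the field names and the reading of the time window (runs starting on or after 19 March 2008 and stopping no later than 20 March 2008, 10:20:30) are assumptions for illustration; the catalog evaluates the conditions itself.

```python
from datetime import datetime

# Hypothetical run-level records; real values come from the file catalog.
runs = [
    {"run": "0001", "collision_system": "pp",
     "start": datetime(2008, 3, 19, 8, 0, 0),
     "stop": datetime(2008, 3, 19, 12, 0, 0)},
    {"run": "0002", "collision_system": "PbPb",
     "start": datetime(2008, 3, 19, 13, 0, 0),
     "stop": datetime(2008, 3, 19, 15, 0, 0)},
    {"run": "0003", "collision_system": "pp",
     "start": datetime(2008, 3, 20, 9, 0, 0),
     "stop": datetime(2008, 3, 20, 11, 0, 0)},
]

# Assumed interpretation of the time window used in the example.
WINDOW_START = datetime(2008, 3, 19)
WINDOW_END = datetime(2008, 3, 20, 10, 20, 30)


def is_selected(run: dict) -> bool:
    """True for pp runs that lie entirely inside the time window."""
    return (run["collision_system"] == "pp"
            and run["start"] >= WINDOW_START
            and run["stop"] <= WINDOW_END)


selected_runs = [run["run"] for run in runs if is_selected(run)]
print(selected_runs)  # ['0001']
```

Run 0002 is rejected because of its collision system, and run 0003 because it stops after the end of the window.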