Run/File Tags
File/Run MetaData
To characterize physics data, it is useful to assign metadata at different levels of data abstraction. For data produced and used in the ALICE experiment, a three-layer structure is implemented: run-level metadata, file-level metadata and event-level metadata.
This page covers the first two categories: run-level and file-level metadata. Both will be stored in the ALICE File Catalog. A directory structure within this database will minimize duplication of metadata. Detailed information about the metadata fields at the run and file level is available in an internal note by Markus Oldenburg.
Structure of the file catalog
The file catalog, where ALICE files will be registered, has the following structure:
- Real data: /data/<Year>/<AcceleratorPeriod>/<RunNumber>/
- Simulated data: /sim/<Year>/<ProductionType>/<RunNumber>/
where:
- Year: Can take a value like 2007, 2008 etc.
- AcceleratorPeriod: Can take a value relevant to the LHC period (LHC07a, LHC08b, ...).
- RunNumber: Is the run identifier (0001,...).
- ProductionType: Gives information about the specific simulation which was run, which includes the conditions applied: Ideal, Full, Residual (PDC06_Residual).
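As an illustration of how these fields compose, the following Python sketch builds a run directory for real data. It assumes the components are nested as Year, AcceleratorPeriod, RunNumber, which is the ordering used by the query examples later on this page. The function name and the four-digit zero padding (taken from the example value 0001) are hypothetical.

```python
def real_data_run_dir(year, period, run_number, root="/data"):
    """Compose a run directory from Year, AcceleratorPeriod and RunNumber.

    Assumption: the components are nested in this order and the run
    number is zero-padded to four digits, as in the example value 0001.
    """
    return f"{root}/{year}/{period}/{run_number:04d}/"


print(real_data_run_dir(2008, "LHC08a", 1))  # /data/2008/LHC08a/0001/
```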
Several subdirectories are also foreseen, in which the files will be grouped according to their type. The agreed subdirectories are the following:
- raw/: Storage of raw data.
- reco/<Pass>/cond/: Links to the directory where all the calibration and condition files used for this Pass will be stored.
- reco/<Pass>/ESD/: The directory where the ESDs and the tag files will be stored for each Pass.
- reco/<Pass>/AOD/: The directory where the AODs of this Pass will be stored.
The file names of the stored ALICE data will be kept simple and need only be unique within their subdirectory, for example <id>.AliESDs.root for ESD files, where <id> is the identifier of the corresponding raw data file (<id>.raw.root, also called a raw data chunk). Therefore different subdirectories (for example the reco/
Run/File meta data tags
Tag Name | Data Format | Data Source |
run comment | Text | DAQ logbook |
run type | Text (physics,pedestal,...) | DAQ logbook |
run start time | yyyymmddhhmmss | DAQ logbook |
run stop time | yyyymmddhhmmss | DAQ logbook |
run stop reason | Text (normal,beam loss...) | DAQ logbook |
magnetic field setting | Text (FullField,HalfField, ZeroField...) | DCS |
collision system | Text (PbPb,pp,...) | DCS |
collision energy | Text (5.5TeV) | DCS |
trigger class | | DAQ logbook |
detectors present in run | bitmap: 0=not included, 1=included | DAQ logbook |
number of events in this run | Integer | DAQ logbook |
run sanity | flag bit or bit mask, default 1=OK | Manually |
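The run start and stop times use the compact yyyymmddhhmmss format. A minimal Python sketch of converting such values into datetime objects and computing a run duration is given here; the sample values and the helper name are made up for illustration.

```python
from datetime import datetime

TAG_TIME_FORMAT = "%Y%m%d%H%M%S"  # yyyymmddhhmmss


def parse_tag_time(value):
    """Convert a yyyymmddhhmmss string into a datetime object."""
    return datetime.strptime(value, TAG_TIME_FORMAT)


# Made-up sample values, for illustration only.
start = parse_tag_time("20080319102030")
stop = parse_tag_time("20080320102033")
print(stop - start)  # 1 day, 0:00:03
```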
For the reconstructed data:
Tag Name | Data Format | Data Source |
production tag | Text | reconstruction |
production software library version | Text (AliRoot::v4-04-Rev-05) | reconstruction |
calibration/alignment setting | Text | reconstruction |
For simulated data:
Tag Name | Data Format | Data Source |
generator | Text (HIJING,PYTHIA,...) | offline |
generator version | Text | offline |
generator comments | Text | offline |
generator parameters | Text | offline |
detector geometry | | offline |
detector configuration | bitmap: 0=not included, 1=included | offline |
simulation comments | Text | offline |
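The "detectors present in run" and "detector configuration" tags are bitmaps in which each bit marks one detector as included (1) or not included (0). The sketch below shows how such a value could be decoded in Python. The assignment of bits to detectors is not specified on this page, so the ordering in DETECTOR_BITS is purely hypothetical.

```python
# Hypothetical bit assignment: bit 0 -> first name, bit 1 -> second, ...
# The real mapping is not defined on this page.
DETECTOR_BITS = ["ITS", "TPC", "TRD", "TOF"]


def detectors_included(bitmap):
    """Return the names whose bit is set to 1 (included) in the bitmap."""
    return [name for bit, name in enumerate(DETECTOR_BITS)
            if (bitmap >> bit) & 1]


print(detectors_included(0b0101))  # ['ITS', 'TRD']
```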
How to query the file catalog
We assume that a user wants to query the file catalog and extract a collection of ESD files. The files should fulfill the following criteria:
- They correspond to the 2008 run.
- They were created during the first period of the machine at that year.
- They were created from the third pass of the reconstruction.
The first step is to have an active AliEn session. The collection is then obtained with the find command:
$ find -x collection /alice/data/2008/LHC08a/*/reco/Pass3/* AliESDs.root
To store this collection, redirect the output to an XML file:
$ find -x collection /alice/data/2008/LHC08a/*/reco/Pass3/* AliESDs.root > collection.xml
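The path pattern in this command follows directly from the three criteria (year, accelerator period, reconstruction pass), with the wildcard standing in for the run number. The Python sketch below assembles the same string from those values. It only formats text and does not contact the catalog, and the function name is hypothetical.

```python
def esd_find_command(year, period, pass_number):
    """Build the find command for all ESDs of one year, period and pass."""
    # The first '*' matches every run number of the period.
    pattern = f"/alice/data/{year}/{period}/*/reco/Pass{pass_number}/*"
    return f"find -x collection {pattern} AliESDs.root"


print(esd_find_command(2008, "LHC08a", 3))
# find -x collection /alice/data/2008/LHC08a/*/reco/Pass3/* AliESDs.root
```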
The output can be restricted further by imposing selection criteria at the run/file level. For example, if the ESDs from the previous example must also:
- Correspond to pp collisions.
- Were created between the 19th and the 20th of March 2008 (10:20:33).
then the above command is modified accordingly:
$ find -x collection /alice/data/2008/LHC08a/*/reco/Pass3/* AliESDs.root Run:collision_system="pp" and Run:stop<"2008-03-20 10:20:33" and Run:start>"2008-03-19" > collection.xml
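To make the meaning of the selection explicit, the following Python sketch applies the same three conditions to a small list of run records held in memory. It illustrates the logic only and is not how the catalog evaluates the query. The records are made up, and the stop-time condition is taken to be "earlier than", in line with the "between" criterion above.

```python
from datetime import datetime

# Made-up run records, for illustration only.
runs = [
    {"run": "0001", "collision_system": "pp",
     "start": datetime(2008, 3, 19, 8, 0, 0),
     "stop": datetime(2008, 3, 19, 12, 0, 0)},
    {"run": "0002", "collision_system": "PbPb",
     "start": datetime(2008, 3, 19, 13, 0, 0),
     "stop": datetime(2008, 3, 19, 18, 0, 0)},
    {"run": "0003", "collision_system": "pp",
     "start": datetime(2008, 3, 20, 9, 0, 0),
     "stop": datetime(2008, 3, 20, 11, 0, 0)},
]

window_start = datetime(2008, 3, 19)
window_stop = datetime(2008, 3, 20, 10, 20, 33)

# Keep pp runs that started after window_start and stopped before window_stop.
selected = [r["run"] for r in runs
            if r["collision_system"] == "pp"
            and r["start"] > window_start
            and r["stop"] < window_stop]
print(selected)  # ['0001']
```

Run 0002 is rejected because of its collision system, and run 0003 because it stopped after the end of the time window.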