Run/File Tags

File/Run MetaData

To characterize physics data, it is useful to assign metadata at different levels of data abstraction. For data produced and used in the ALICE experiment, a three-layer structure is implemented: run-level metadata, file-level metadata and event-level metadata.

This page is dedicated to the first two categories: run-level and file-level metadata. Both will be stored in the ALICE File Catalog, where a directory structure will minimize metadata duplication. Detailed information about the metadata fields at the run and file level is available in an internal note by Markus Oldenburg.

Structure of the file catalog

The file catalog, where ALICE files will be registered, has the following structure:

  • Real data: /data////
  • Simulated data: /sim////

where:

  • Year: Can take a value like 2007, 2008 etc.
  • AcceleratorPeriod: Can take a value relevant to the LHC period (LHC07a, LHC08b, ...).
  • RunNumber: Is the run identifier (0001,...).
  • ProductionType: Identifies the specific simulation that was run, including the conditions applied: Ideal, Full, Residual (for example PDC06_Residual).

Several subdirectories are also foreseen, in which the files will be grouped according to their type. The agreed subdirectories are the following:

  • raw/: Storage of raw data.
  • reco//cond/: This will link to the directory where all the calibration and condition files used for this Pass will be stored.
  • reco//ESD/: The directory where the ESDs and the tag files will be stored for each Pass.
  • reco//AOD/: The directory where the AODs of this Pass will be stored.
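
As a rough illustration of the layout above, the following Python sketch composes the catalog directory that would hold the ESDs of one run and one reconstruction pass. The order of the path components (year, accelerator period, run number, then reco, pass and ESD) and the /alice/data prefix are assumptions inferred from the query example later on this page (/alice/data/2008/LHC08a/*/reco/Pass3/*). The function name esd_directory is hypothetical and is not part of any ALICE tool.

```python
import posixpath


def esd_directory(year, period, run, reco_pass, prefix="/alice/data"):
    """Compose the catalog directory holding the ESDs of one run and pass.

    The ordering of the components is an assumption based on the query
    example on this page, not an authoritative description of the catalog.
    """
    return posixpath.join(
        prefix, str(year), period, run, "reco", reco_pass, "ESD"
    ) + "/"


# Example values taken from this page: year 2008, period LHC08a, run 0001.
print(esd_directory(2008, "LHC08a", "0001", "Pass3"))
```

Two runs can hold files with the same name (AliESDs.root) because the run number and the pass are part of the directory, not of the file name.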

The file names of the stored ALICE data will be kept simple, and only unique in the current subdirectory, for example: .AliESDs.root for ESD files, where ‹ is the identifier of the corresponding raw data file (.raw.root, also called raw data chunk). Therefore different subdirectories (for example the reco//ESD directories for different parent directories) may contain files with the same name but with different content. Nevertheless, the GUID scheme (and the directory structure) makes sure that these files can be distinguished.

Run/File meta data tags

Tag Name                     | Data Format                              | Data Source
run comment                  | Text                                     | DAQ logbook
run type                     | Text (physics,pedestal,...)              | DAQ logbook
run start time               | yyyymmddhhmmss                           | DAQ logbook
run stop time                | yyyymmddhhmmss                           | DAQ logbook
run stop reason              | Text (normal,beam loss...)               | DAQ logbook
magnetic field setting       | Text (FullField,HalfField, ZeroField...) | DCS
collision system             | Text (PbPb,pp,...)                       | DCS
collision energy             | Text (5.5TeV)                            | DCS
trigger class                |                                          | DAQ logbook
detectors present in run     | bitmap: 0=not included, 1=included       | DAQ logbook
number of events in this run | Integer                                  | DAQ logbook
run sanity flag              | bit or bit mask, default 1=OK            | Manually
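
To make the data formats in the table more concrete, here is a minimal Python sketch that parses a run start or stop time in the yyyymmddhhmmss format and decodes a "detectors present in run" bitmap. The mapping of bit positions to detectors is not specified on this page, so the DETECTORS list and its ordering are purely hypothetical, as are the function names.

```python
from datetime import datetime

# Hypothetical bit assignment: bit 0 is the first name, bit 1 the second, ...
# The real mapping is not given on this page.
DETECTORS = ["ITS", "TPC", "TRD", "TOF"]


def parse_run_time(stamp):
    """Parse a run start/stop time given as yyyymmddhhmmss."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")


def detectors_in_run(bitmap, names=DETECTORS):
    """Return the names whose bit is set (0 = not included, 1 = included)."""
    return [name for bit, name in enumerate(names) if (bitmap >> bit) & 1]


start = parse_run_time("20080319102030")
print(start.isoformat())         # 2008-03-19T10:20:30
print(detectors_in_run(0b0101))  # ['ITS', 'TRD'] under the assumed mapping
```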

For the reconstructed data:

Tag Name                            | Data Format                  | Data Source
production tag                      | Text                         | reconstruction
production software library version | Text (AliRoot::v4-04-Rev-05) | reconstruction
calibration/alignment setting       | Text                         | reconstruction

For simulated data:

Tag Name               | Data Format                        | Data Source
generator              | Text (HIJING,PYTHIA,...)           | offline
generator version      | Text                               | offline
generator comments     | Text                               | offline
generator parameters   | Text                               | offline
detector geometry      |                                    | offline
detector configuration | bitmap: 0=not included, 1=included | offline
simulation comments    | Text                               | offline

How to query the file catalog

We assume that a user wants to query the file catalog and extract a collection of ESD files. The files should fulfill the following criteria:

  • They correspond to the 2008 run.
  • They were created during the first period of the machine at that year.
  • They were created from the third pass of the reconstruction.

The first step is to have an active AliEn session. The query is then issued with the find command:

$find -x collection /alice/data/2008/LHC08a/*/reco/Pass3/* AliESDs.root

To store this collection, all that needs to be done is to redirect the output to an XML file:

$find -x collection /alice/data/2008/LHC08a/*/reco/Pass3/* AliESDs.root > collection.xml

Suppose the output needs to be restricted further by imposing selection criteria at the run/file level, for example by requiring that the ESDs from the previous example:

  • Correspond to pp collisions.
  • Were created between the 19th of March 2008 and the 20th of March 2008 at 10:20:33.

then the above command is modified accordingly:

$find -x collection /alice/data/2008/LHC08a/*/reco/Pass3/* AliESDs.root Run:collision_system="pp" and Run:stop<"2008-03-20 10:20:33" and Run:start>"2008-03-19" > collection.xml
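
When many such queries have to be written, it can help to assemble the command line programmatically. The Python sketch below only builds the text of a find command in the syntax shown above; it does not contact AliEn or run anything. The function name build_find_command and its parameters are hypothetical, and the quoting of values simply mirrors the example on this page.

```python
def build_find_command(path, filename, conditions=(), output=None):
    """Assemble a 'find -x collection' command line as a string.

    conditions is a sequence of (tag, operator, value) triples that are
    joined with 'and', following the query syntax shown on this page.
    """
    parts = ["find", "-x", "collection", path, filename]
    if conditions:
        clauses = [
            '{}{}"{}"'.format(tag, op, value) for tag, op, value in conditions
        ]
        parts.append(" and ".join(clauses))
    command = " ".join(parts)
    if output:
        # Redirect the resulting collection to an XML file.
        command += " > " + output
    return command


print(build_find_command(
    "/alice/data/2008/LHC08a/*/reco/Pass3/*",
    "AliESDs.root",
    conditions=[
        ("Run:collision_system", "=", "pp"),
        ("Run:stop", "<", "2008-03-20 10:20:33"),
        ("Run:start", ">", "2008-03-19"),
    ],
    output="collection.xml",
))
```

The printed string can then be pasted into the AliEn shell after the prompt.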