Introduction
The next sections of this document list all the data products that are generated and used by the SGS pipeline. They are clearly not meant to be read one after the other, but rather to serve as a reference for the reader/user in search of information on a particular product. However, we feel it is important to provide here some elements that will help in understanding the method used to produce and maintain this document, and its organization principles.
First, this document is intended to exist in synchronization with the Euclid Common Data Model, which is at the centre of the code development for the SGS system. Therefore the document itself is structured following the data model. This structure is obviously very close to the organizational structure of the SGS, and therefore the OUs feature prominently in the next section. To these the data model adds three important elements:
LE1 is the processing function in charge of transforming the raw telemetry coming from the satellite into data products that will enter the Euclid pipeline through the VIS and NIR OUs. This PF runs on the SOC computing infrastructure; it is a function integrated by the SOC, but made of code elements provided by the VIS and NIR OUs.
DQCT is the Data Quality Common Tools group, a group in the SGS that develops specific functions aimed at reporting on data product quality, functions that can be integrated into the main PFs of the pipeline.
OSS is the Operational Sky Survey. This is the data product describing the observations actually performed by Euclid. It is generated by the SOC from the Reference Survey Definition prepared by the ECSURV group.
Then each section follows a common (and thus repetitive) format: first, an introduction to the processing function and its products. The purpose of these introductions is to present the general structure of the processing function itself, so that the following data products can be understood in their context. These introductions are not meant to replace the detailed documentation produced by the OUs themselves about their processing functions. Interested readers are referred to the Requirements Specification Document, Software Design Document, Validation Plan and Software Test Specification of each OU for this information. If need be, RD6 (see Related documents) can be consulted for a deeper overview.
Links to these documents would be welcome in the DPDD, so that the reader has a single point of access to all information, as is done in the LSST DPDD.
Following these introductions, all the data products found in the corresponding section of the data model are presented, each in its own section, as a sort of identity card. The names of these sections are extracted from the data model schema files. A name is not meant to be simple or complex; it is meant to allow a 1:1 matching between the content of a directory in the data model and the content of a section in the DPDD. Only in this way can we build automated systems to verify the synchronization between the two data sets.
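As an illustration of what such an automated synchronization check could look like, here is a minimal Python sketch. It assumes (hypothetically) that the schema files end in .xsd and that each DPDD card is a reStructuredText file named after the schema stem; the directory names are placeholders, not the actual SGS layout.

```python
# Hypothetical sketch of the DPDD/data model synchronization check.
# Paths, file extensions and the schema-to-card naming rule are assumptions.
from pathlib import Path


def check_sync(datamodel_dir: str, dpdd_dir: str) -> list[str]:
    """Report schema files without a DPDD card and cards without a schema."""
    schemas = {p.stem for p in Path(datamodel_dir).rglob("*.xsd")}
    cards = {p.stem for p in Path(dpdd_dir).rglob("*.rst")}
    issues = []
    issues += [f"missing card for schema: {s}" for s in sorted(schemas - cards)]
    issues += [f"card without schema: {c}" for c in sorted(cards - schemas)]
    return issues


if __name__ == "__main__":
    for issue in check_sync("datamodel/schema", "dpdd/sections"):
        print(issue)
```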
We think we should provide here a kind of DPD filtering mechanism for the custodian, as some products are either infrastructure products or fake products used for simulation/validation. Does it make sense in the DPDD to refer to DpdExtConfigurationSet, DpdExtGaiaCutout, DpdExtSuperTile, DpdExtValidationxxx, …, DpdMerMachineLearningModel, DpdValidationRequirements, DpdVisFileContainer, DpdAuxdataFiles, etc., which have very few definitions and little meaning or added value for the readers? Should we refer to all the calibration data products?
We now detail the structure of these data product cards:
Data product name: this is extracted from the name of the schema file, so indeed there is a repetition with the card’s name. The content of this section is automatically generated and used to implement the synchronization mechanism.
To give some precision here for any custodian and to provide the cooking recipe: this name should be something like DpdXXXyyyy.
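Purely as an illustration of this naming convention, the following sketch applies a loose pattern check; the exact rule and the example names are assumptions, not the official validation performed by the data model tooling.

```python
# Illustrative only: a loose check that a product name follows the
# DpdXXXyyyy convention; the precise pattern is an assumption.
import re

DPD_NAME = re.compile(r"^Dpd[A-Z][A-Za-z0-9]+$")


def looks_like_dpd_name(name: str) -> bool:
    return bool(DPD_NAME.match(name))


print(looks_like_dpd_name("DpdMerFinalCatalog"))  # True
print(looks_like_dpd_name("FinalCatalog"))        # False
```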
Data product custodian: this is the name of the folder in the data model where this product’s schema file is found. This is also automatically generated.
This Data product custodian tag is not needed here, as it duplicates information from the layout (already placed in the top banner of the page); let us lighten each data product page.
Data model tag: each release of the data model (from the trunk of the configuration control) has a tag. We automatically report it here, again for synchronization purposes. It will also be useful for developers to make sure this description is compatible with the version of the data model they use.
This data model tag is not needed here, as it is duplicated for every data product; it should be placed at the top of the DPDD, since there is a 1:1 bijection between the DPDD and the data model. Let us lighten each data product page.
Name of the schema file: this is the actual name of the file, with all its prefixes. Again automatically placed in the card.
Schema documentation description: the schema files are XML files, and as such contain a number of tags that we can extract automatically when building the data product card. In principle, the documentation tag should already contain information on the data product itself, but experience shows that this is rarely the case, and that the content is far from homogeneous.
To be replaced with the scribe entry point, an HTML tree allowing the reader to browse the data model from the top-level element.
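To make this extraction step concrete, here is a minimal sketch assuming the schema files use the standard XML Schema namespace for their documentation tags; the file name in the usage comment is hypothetical.

```python
# Minimal sketch: pull the documentation tags out of a schema file.
# The namespace is the standard XML Schema one; the file name is hypothetical.
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def schema_documentation(xsd_path: str) -> list[str]:
    """Return the text of every xs:documentation element in the schema."""
    tree = ET.parse(xsd_path)
    docs = tree.getroot().iter(f"{XS}documentation")
    return [(d.text or "").strip() for d in docs]


# Example (hypothetical file name):
# print(schema_documentation("euc-test-dpd-example.xsd"))
```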
Data product elements: these are again tags that are extracted automatically when the data product cards are built. They are supposed to provide a quick description of the structure of the data product.
This Data product elements tag is not needed here, as it is a duplication for every data product (Header/Data/QualityFlags/GenericParameters) and should be placed at the top of the DPDD; let us lighten each data product page.
Processing Element(s) creating/using the data products: this is the first field provided by the custodian of the card. When applicable it will indicate where, in the structure of the processing function, the element is created or used. This can become important for future users of the data product.
Processing function using the data product: again an element provided by the custodian of the card that indicates the user of the product. It thus also serves as an indication of which interface this product belongs to.
Detailed description of the data product: as the name implies, this is information provided by the custodian of the card that should give potential users enough information on the product content and structure.
We should add two automatic sections in the shape of formatted tables: the list of FITS keywords defined in the FITS files of the products, and the list of columns for catalogs, with columnName/description/format/unit.
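As a possible starting point for these two tables, the following sketch uses astropy (an assumption about the available tooling) to dump header keywords and, for binary table extensions, the column name/format/unit triplets; the column descriptions would still have to come from the data model, and the file name is hypothetical.

```python
# Possible sketch of the two automatic tables suggested above, using astropy
# (an assumption): one table of header keywords, one of catalog columns.
from astropy.io import fits


def describe_fits(path: str) -> None:
    with fits.open(path) as hdul:
        for hdu in hdul:
            print(f"== HDU: {hdu.name} ==")
            # Header keyword table: keyword / value / comment
            for card in hdu.header.cards:
                print(f"{card.keyword:8s} | {card.value!r} | {card.comment}")
            # Catalog column table: name / format / unit
            # (descriptions are not stored here and would come from the DM)
            if isinstance(hdu, fits.BinTableHDU):
                for col in hdu.columns:
                    print(f"{col.name} | {col.format} | {col.unit}")


# describe_fits("example_product.fits")  # hypothetical file name
```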