Reference Sample Product

Data Product Name

DpdReferenceSample

Data Product Custodian

Name of the Schema File

euc-phz-ReferenceSample.xsd

Data Product Elements

Header:

object of type sys:genericHeader

Data:

object of type phz:phzReferenceSample

QualityFlags:

object of type dqc:sqfPlaceHolder

Parameters:

object of type ppr:genericKeyValueParameters

Schema documentation tag

Documentation for data product element dpdReferenceSample:

Represents the reference sample used by the NNPZ algorithm. It is produced outside the pipeline and follows an internal binary format.

Documentation for data product element Header:

The generic header of the product.

Detailed Description of the Data Product

Reference sample used by the NNPZ algorithm.

Processing Element(s) creating/using the data product

  • This data product is produced outside the Euclid pipeline using the Phosphoros tool and is manually ingested into the EAS.

  • It is used as input to the NNPZ component of the PHZ PF.

Processing function using the data product

  • This product is produced by the PHZ Calibration Pipeline and used in the PHZ Production Pipeline.

This product contains a field giving the type of reference sample, which is one of:

  • PHZ: Reference sample with PDZ information, used to compute the PDZ of a galaxy.

  • PP: Reference sample with physical parameter information.

  • GAL_SED: Reference sample for computing the galaxy SEDs.

  • STAR_SED: Reference sample for computing the star SEDs.

and up to five types of files representing the NNPZ reference sample:

Index

This file links each reference object to its associated records by specifying in which file, and at which position within that file, each record is located. For example, if the reference sample contains PDZs, the index file will contain pdz_file, an integer identifying the PDZ file, and pdz_offset, the position of the record inside that file. The index file is a numpy array persisted through the numpy.save() procedure as a .npy binary file.
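As a minimal sketch of how such an index might be read, the following builds a small structured array with the pdz_file and pdz_offset fields named above, round-trips it through numpy.save() and numpy.load(), and looks up where the PDZ record of one object is stored. The id field, the dtypes, and the sample values are assumptions made for illustration; the actual index layout may differ.

```python
import io
import numpy as np

# Hypothetical index layout: only pdz_file and pdz_offset are named in the
# text above; the "id" field and all dtypes are assumptions.
index_dtype = np.dtype(
    [("id", np.int64), ("pdz_file", np.int64), ("pdz_offset", np.int64)]
)
index = np.array([(101, 1, 0), (102, 1, 1), (103, 2, 0)], dtype=index_dtype)

# Round-trip through the .npy format (a BytesIO buffer stands in for a file).
buffer = io.BytesIO()
np.save(buffer, index)
buffer.seek(0)
loaded = np.load(buffer)

# Locate the PDZ record of reference object 102.
row = loaded[loaded["id"] == 102][0]
pdz_file = int(row["pdz_file"])
pdz_offset = int(row["pdz_offset"])
print(f"object 102 -> PDZ file {pdz_file}, record {pdz_offset}")
```

With a real index, the buffer would be replaced by the path of the index file.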

SED template data files

Because of the high volume of the SED template data, it is split across multiple files so that each file is limited to around 1 GB. The files are named sed_data_XX.bin, where XX is the number of the file, as referenced in the index file. Each file is a numpy array persisted through the numpy.save() procedure as a .npy binary file. It contains an array of SEDs, each SED being stored as a 2D array of shape (N,2), where N is the number of sampling points of the SED, (:,0) is the wavelength sampling and (:,1) the SED values. Note that each SED can have a different sampling.
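The (N,2) layout can be illustrated with a short sketch. The two SEDs below are made-up in-memory arrays, not the content of a real sed_data_XX.bin file, and the variable names are assumptions; the sketch only shows how the wavelength and value columns are separated when each SED has its own sampling.

```python
import numpy as np

# Two made-up SEDs with different numbers of sampling points
# (all values are illustrative only).
sed_a = np.array([[1000.0, 0.5], [2000.0, 1.0], [3000.0, 0.8]])  # N = 3
sed_b = np.array([[1500.0, 0.2], [2500.0, 0.4]])                 # N = 2

sampling_sizes = []
for sed in (sed_a, sed_b):
    wavelength = sed[:, 0]  # column 0: wavelength sampling
    values = sed[:, 1]      # column 1: SED values
    sampling_sizes.append(len(wavelength))
    print(f"N = {len(wavelength)}, first wavelength = {wavelength[0]}")
```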

PDZ data files

The PDZ data files store the redshift probability density functions produced using higher quality photometry. The files are named pdz_data_XX.bin and are split following the same rules as the SED template files. Each file is a .npy record of shape (N,M), where N is the number of records stored in the file and M the number of sampling points in z; the value at (n,m) is the probability for the reference object n to be at redshift sample m.
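The sketch below shows how a single record could be extracted from an (N,M) PDZ array once the index has provided its position. The array content is made up, and the redshift grid z_grid is an assumption: the text above does not state where the redshift sampling itself is stored.

```python
import numpy as np

# Hypothetical redshift grid with M = 7 samples (assumption, see lead-in).
z_grid = np.linspace(0.0, 6.0, 7)

# Made-up PDZ file content with N = 2 records.
pdz = np.array(
    [
        [0.0, 0.1, 0.6, 0.2, 0.1, 0.0, 0.0],
        [0.0, 0.0, 0.1, 0.2, 0.5, 0.2, 0.0],
    ]
)

pdz_offset = 1            # record position, as given by the index file
record = pdz[pdz_offset]  # PDZ of one reference object, shape (M,)

# Redshift sample at which this PDZ peaks.
z_mode = z_grid[np.argmax(record)]
print(f"PDZ peaks at z = {z_mode}")
```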

Sampling data files

The sampling data files store samples of the physical parameters associated with the reference objects. The files are named pp_data_XX.bin and are split following the same rules as the SED template files. Each file is a .npy record of shape (N,M), where N is the number of records stored in the file and M the number of samples. The value at (n,m) is the m-th sample for the n-th object and consists of a list of physical parameter values. The name and type of each physical parameter can be obtained from the dtype of the file record.
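A minimal sketch of the dtype lookup follows. The parameter names stellar_mass and sfr, their types, and the array content are hypothetical; a real file would be opened with numpy.load() and would carry its own dtype.

```python
import numpy as np

# Hypothetical physical parameters; real files define their own names and types.
pp_dtype = np.dtype([("stellar_mass", np.float32), ("sfr", np.float32)])

# Made-up content: N = 3 objects, M = 4 samples per object.
pp = np.zeros((3, 4), dtype=pp_dtype)
pp[1, 2] = (1.0e10, 2.5)

# The names and types of the parameters come from the dtype of the array.
for name in pp.dtype.names:
    print(name, pp.dtype[name])

# The m-th sample of the n-th object is one record of parameter values.
sample = pp[1, 2]
print(sample["stellar_mass"], sample["sfr"])
```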

Photometry

A FITS file containing the photometry values of the sample for a set of reference filters, and the filter transmissions.

The extension name (EXTNAME header keyword) of the first HDU is always set to the string NNPZ_PHOTOMETRY. NNPZ uses this name to detect photometry files, so it should never be changed.

The header also contains the keyword PHOTYPE which indicates the type of the photometry values stored in the file. The different photometry types are the following:

  • Photons: The photometry values are photon count rates, expressed in counts/s/cm^2

  • F_nu: The photometry values are energy flux densities expressed in erg/s/cm^2/Hz

  • F_nu_uJy: The photometry values are energy flux densities expressed in µJy

  • F_lambda: The photometry values are energy flux densities expressed in erg/s/cm^2/Å

  • MAG_AB: The photometry values are AB magnitudes
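Some of these photometry types are related by standard conversions. The sketch below uses the definition 1 µJy = 1e-29 erg/s/cm^2/Hz and the usual AB magnitude zero point of 48.6; the function names are hypothetical, and this is not necessarily how NNPZ converts between types internally.

```python
import math

# 1 Jy = 1e-23 erg/s/cm^2/Hz, hence 1 µJy = 1e-29 erg/s/cm^2/Hz.
UJY_IN_CGS = 1.0e-29


def f_nu_to_ujy(f_nu):
    """Convert a flux density from erg/s/cm^2/Hz (F_nu) to µJy (F_nu_uJy)."""
    return f_nu / UJY_IN_CGS


def f_nu_to_mag_ab(f_nu):
    """Convert a positive flux density in erg/s/cm^2/Hz to an AB magnitude."""
    return -2.5 * math.log10(f_nu) - 48.6


# A source of about 3631 Jy has an AB magnitude close to zero.
f_nu = 3.631e-20
flux_ujy = f_nu_to_ujy(f_nu)
mag_ab = f_nu_to_mag_ab(f_nu)
print(flux_ujy, mag_ab)
```

Converting to or from Photons or F_lambda additionally depends on wavelength and is not covered here.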

The first column of the table is always named ID and contains 64 bit signed integers, which match the identifiers of the reference sample objects. The remaining columns contain 32 bit floating point numbers and represent the photometry values. The names of the bands are extracted from the column names. A band can optionally have an error associated with it, in which case there must be a column with the same name and the suffix _ERR. For example, if the table contains a column named g, the error column must be named g_ERR.
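The naming convention can be sketched as a small helper that pairs each band with its optional error column. The function split_bands and the example column names are hypothetical; the helper only encodes the ID and _ERR rules described above and does not read a FITS file.

```python
def split_bands(column_names):
    """Map each band name to its error column name, or None if it has none."""
    # Every column except ID is either a band or an error column.
    candidates = [name for name in column_names if name != "ID"]
    # A column is an error column if it is named <band>_ERR for an existing band.
    error_columns = {
        name
        for name in candidates
        if name.endswith("_ERR") and name[: -len("_ERR")] in candidates
    }
    return {
        name: (name + "_ERR" if name + "_ERR" in error_columns else None)
        for name in candidates
        if name not in error_columns
    }


bands = split_bands(["ID", "g", "g_ERR", "r", "vis", "vis_ERR"])
print(bands)
```

Here the band r has no error column, so it maps to None.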

The rest of the HDUs in the FITS file contain the filter transmissions of the bands. They are binary tables with two columns, the first of which contains the wavelength values (expressed in Å) and the second the filter transmission (a number in the range [0,1]). Both columns contain 32 bit floating point numbers. The extension name (EXTNAME header keyword) of each HDU is the same as the band name and is used to identify the filter transmission (the order of the HDUs does not matter).