2dFGRS 100k Database

The 2dFGRS database consists of three main components:

  1. A set of FITS files, one per object in the source catalogue (the input photometric catalogue), that contains every piece of information about each object from the source catalogue, spectroscopic observations and subsequent analysis.
  2. An mSQL database, that contains all of the parameters for each object and allows complex searching and subsetting of the survey objects, and the retrieval of selected subsets of spectra and images.
  3. A WWW interface, that provides a number of different modes for querying the mSQL database and a variety of ways for returning the results of such queries.
This overview gives a basic introduction to each of these three database components.

1. Object FITS Files

There are 467214 target objects in the full FITS database, each with its own FITS file. This number is much larger than the number of objects surveyed, since it includes both objects outside the final area of the survey and objects fainter than the final magnitude limit of the survey. Each object in the survey source catalogue has been given a serial number (SEQNUM), and the name of the FITS file for that object is this serial number. The serial numbers for objects in the SGP strip are 1-193550 and 416694-467214, for objects in the NGP strip 193551-332694 and 389714-416693, and for objects in the random fields 332695-389713.

The 100k release database is a subset of the full database, and only includes the FITS files for the 102426 objects observed up to 31 January 2000.

Each FITS file has a primary part which contains all the source catalogue data about the object (as FITS keywords) and a DSS sky chart (postage stamp image) of the object. Spectra are appended, in chronological order, as additional FITS extensions. Each spectrum extension contains the spectrum of the object, the variance (error) array for the object spectrum, the spectrum of the mean sky that was subtracted from the object spectrum, and FITS keyword data giving information about the spectroscopic observation and derived parameters such as the redshift and spectral quality. Many targets contain multiple spectrum extensions corresponding to multiple observations.

Table 1 lists all the FITS keywords present in the primary part (extension 0) of the FITS files; Table 2 lists all the keywords present in spectrum extensions 1..spectra. The tables give the names of the keywords, example values, and the keyword definitions.


    Table 1: FITS keywords for the primary image (extension 0)
    
    Keyword          Example  Definition

    SIMPLE                T  file does conform to FITS standard
    BITPIX               16  number of bits per data pixel
    NAXIS                 2  number of data axes
    NAXIS1               49  length of data axis 1
    NAXIS2               49  length of data axis 2
    EXTEND                T  FITS dataset may contain extensions
    BSCALE           1.0000  REAL = (FITS * BSCALE) + BZERO
    BZERO            0.0000  zeropoint of conversion to REAL
    SEQNUM           100100  Serial number : database primary key
    NAME      'TGS469Z164'   2dFGRS assigned name
    IMAGE     'SKYCHART'     Existence of postage stamp image
    RA         0.7943429758  RA  (B1950) in radians :   3  2  3.00
    DEC       -0.5475286941  DEC (B1950) in radians : -31 22 15.9
    EQUINOX         1950.00  equinox of RA and DEC
    BJSEL             18.93  final b_J mag used in the object selection
    PROB             2335.4  psi star-galaxy classification parameter
    PARK              0.910  k   star-galaxy classification parameter
    PARMU             0.187  mu  star-galaxy classification parameter
    IGAL                  1  final classification flag (1 for a galaxy)
    JON                  -1  eyeball classification flag
    ORIENT             91.0  orientation in degrees clockwise from E-W
    ECCENT            0.270  eccentricity
    AREA              308.0  isophotal area in pixels
    X_BJ             2918.7  plate X(b_J) in 8 micron pixels
    Y_BJ             9123.1  plate Y(b_J) in 8 micron pixels
    DX                 43.0  corrected difference 100*(X(b_J)-x(R))
    DY                 49.0  corrected difference 100*(Y(b_J)-y(R))
    BJG               18.99  b_J without extinction correction
    RMAG              10.35  unmatched APM `total' mag
    PMAG              10.53  unmatched raw APM profile integrated mag
    FMAG               8.72  unmatched raw APM 2 arcsec `fibre' mag
    SMAG              10.74  unmatched raw stellar mag (from APMCAL)
    IFIELD              417  UKST field 
    IGFIELD            2007  galaxy number in UKST field
    REGION    'S417    '     GSSS region name
    OBJEQNX         2000.00  equinox of the plate reference frame
    OBJRA      0.8034094522  RA  (J2000) in radians :  03 04 07.673 
    OBJDEC    -0.5441390223  DEC (J2000) in radians : -31 10 36.73
    PLTSCALE        67.2000  Plate scale in arcsec per mm
    XPIXELSZ     25.2844500  X pixel size in microns
    YPIXELSZ     25.2844500  Y pixel size in microns
    OBJPLTX         7970.86  object X on plate (pixels)
    OBJPLTY         4148.11  object Y on plate (pixels)
    DATAMAX           14431  Maximum data value
    DATAMIN            4011  Minimum data value
    BJSELOLD          18.96  original b_J mag used in the object selection
    BJG_OLD           19.01  original b_J without extinction correction


    Table 2: FITS keywords for the spectra (extensions 1..spectra)

    Keyword    Example                  Definition

    XTENSION   'IMAGE   '               IMAGE extension
    BITPIX                         -32  number of bits per data pixel
    NAXIS                            2  number of data axes
    NAXIS1                        1024  length of data axis 1
    NAXIS2                           3  length of data axis 2
    PCOUNT                           0  required keyword; must = 0
    GCOUNT                           1  required keyword; must = 1
    CRVAL1                5802.8979492  coordinate value of axis 1
    CDELT1                4.3103027344  coordinate increment on axis 1
    CRPIX1              512.0000000000  reference pixel on axis 1
    CUNIT1     'Angstroms'              units for axis 1
    EXTNAME    'SPECTRUM'               2dFGRS spectrum
    OBSNAME    'TGS469Z164'             observed object name
    OBSRA                 0.7943429758  observed RA  (B1950) in radians
    OBSDEC               -0.5475286940  observed DEC (B1950) in radians
    MATCH_DR                    0.0000  position match error in arcsec
    Z                         0.178876  raw measured redshift
    Z_HELIO                   0.178860  heliocentric redshift
    QUALITY                          5  redshift measurement quality
    ABEMMA                           1  redshift type: abs=1,emi=2,man=3
    NMBEST                           0  # emission lines for emission z
    NGOOD                            0  number of good emission lines
    Z_EMI                      -9.9990  emission redshift
    Q_Z_EMI                          0  emission redshift quality
    KBESTR                           2  cross-correlation template
    R_CRCOR                    15.5600  cross-correlation peak
    Z_ABS                       0.1789  cross-correlation redshift
    Q_Z_ABS                          3  cross-correlation quality
    Q_FINAL                          3  suggested quality for redshift
    IALTER                           0  IALTER=1 if automatic z altered
    Z_COMM     '        '               observer's comment
    THPUT                      0.96613  fiber throughput
    SPFILE     'sgp469_991104_1z.fits'  2dF reduced data file
    PLATE                            1  2dF plate number
    PIVOT                          302  2dF pivot number
    FIBRE                           58  2dF fiber number
    OBSRUN     '99OCT   '               observation run
    GRS_DATE   '991104  '               2dF YYMMDD observed date
    UTDATE     '1999:11:04'             UT date of observation
    SPECTID    'A       '               2dF spectrograph ID
    GRATID     '300B    '               2dF grating ID
    GRATLPMM                       300  2dF grating line per mm
    GRATBLAZ   'COLLIMATOR'             2dF grating blaze direction
    GRATANGL                  25.30000  2dF grating angle
    LAMBDAC                   5782.700  central wavelength
    CCD        'TEKTRONIX_5'            CCD ID
    CCDGAIN                      2.790  CCD inverse gain (e per ADU)
    CCDNOISE                     5.200  CCD read noise (electrons)
    OBJX                        196833  2dF object X position
    OBJY                         10401  2dF object Y position
    OBJXERR                          6  2dF object X position error
    OBJYERR                         14  2dF object Y position error
    OBJMAG                       18.96  2dF object magnitude
    THETA                        4.526  2dF fibre angle
    PTRTYPE    'P       '               2dF ptrtype
    PID                              0  2dF pid
    OBSFLD     'sgp469  '               2dF observed field number
    NCOMB                            3  number of frames combined
    REFRUN                          31  AAT run number of reference
    UTSTART    '16:37:59.48'            UT start of reference exposure
    UTEND      '16:57:59'               UT end of reference exposure
    REFEXP                      1200.0  reference run exposure (secs)
    REFHASTA   '36.07264'               HA start of reference exposure
    REFHAEND   '41.08119'               HA end of reference exposure
    ETA_TYPE            -2.5934501E+00  eta spectral type parameter
    SNR                  2.0299999E+01  median S/N per pixel

2. mSQL Parameter Database

The mSQL database (see Jepson & Hughes 1998) can be thought of as a table. The rows of the table are labelled by the unique object serial number (serial, identical to the parameter SEQNUM in the primary extension of the FITS file) and the extension number (extnum). There are multiple rows for each target object corresponding to each of the extensions in the object's FITS file: the first row corresponds to the primary FITS extension (extension 0), while subsequent rows correspond to the spectrum extensions 1..spectra. The columns of the table correspond to the object parameters, and are labelled by the name of the corresponding keyword. N.B. case is significant in these keywords.

The object serial numbers (serial) provide the primary database key, but the objects are also indexed by their unique survey name (name, identical to the parameter NAME in the primary extension of the FITS file), which has the format TGhfffZnnn, where h is the hemisphere (N for the NGP strip and S for the SGP strip and random fields), fff is the number of the primary field to which the object is assigned and nnn is the number of the galaxy within that field.

Note that the observed name of the object (parameter OBSNAME in each spectrum extension) is the same as name (or NAME in the primary extension of the FITS file) except that: (i) if the field in which the object is observed (given by OBSFLD) is an overlapping field rather than its primary field (given by fff), then the first character of the name is changed from T to X; and (ii) if the object has been flagged as a possible merger, then the second character of the name is changed from G to M.

The first row for each object (extnum=0) contains the source catalogue data and the basic spectroscopic information from the best spectrum of that object. The keywords for that row are the FITS parameters for the primary image (Table 1) plus all the additional keywords listed in Table 3. The best spectrum is the one with the highest redshift quality parameter; if there is more than one spectrum of the same quality, then the most recent of these spectra is used. Subsequent rows for the same object (extnum=1..spectra, where spectra is the number of spectra obtained for that object) contain the FITS parameters pertaining to each spectroscopic observation (Table 2) plus the additional keywords in section (i) of Table 3. If there is no spectrum for the object then spectra=0 and only the row corresponding to extnum=0 will exist. Note that some information is duplicated between rows and that not all parameters are defined for all rows; undefined parameters return a NULL value.


    Table 3: Additional mSQL database keywords

    (i) Keywords in all extensions

    Keyword      Example  Definition

    serial        100100  2dFGRS serial number
    name      TGS469Z164  2dFGRS name
    UKST             417  UKST sky survey field number
    spectra            1  number of spectra for this object
    extnum             1  extension number
    obsrun         99OCT  observing run year and month
    TDFgg           -469  2dFGRS field number (+NGP,-SGP)
    pivot            302  2dF pivot
    plate              1  2dF plate
    fiber             58  2dF fiber
    z           0.178876  observed redshift
    z_helio     0.178860  heliocentric redshift
    abemma             1  redshift type (abs=1,emi=2,man=3)
    quality            5  redshift quality parameter
    
    (ii) Keywords in extension 0 only

    Keyword          Example  Definition

    alpha       0.7943429758  RA  (B1950) in radians
    delta      -0.5475286941  DEC (B1950) in radians
    ra            3  2  3.00  RA  (B1950) in HH MM SS.SS
    dec          -31 22 15.9  DEC (B1950) in DD MM SS.S 
    ra2000       03 04 07.68  RA  (J2000) in HH MM SS.SS
    dec2000      -31 10 36.8  DEC (J2000) in DD MM SS.S 
    l2        228.9258834424  Galactic longitude
    b2        -60.8572447739  Galactic latitude

Searches of the database use the mSQL query format (Jepson & Hughes, 1998), which has the general format
  SELECT list_of_parameters FROM database_name WHERE list_of_conditions
and list_of_conditions is a series of equalities and inequalities linked by Boolean relations.

An example is
  SELECT name, extnum, ra, dec, BJSEL, Z, QUALITY, z, quality FROM TDFgg
  WHERE name='TGS469Z164'

which selects the listed parameters for the object with 2dFGRS name TGS469Z164 (note the single quotes around the character string) from both the summary row (extnum=0) and for each spectrum (extnum=1..spectra). Note that the parameters with the same name in lower case and upper case are distinct: the former are generally from extnum=0, the latter from extnum>0 (parameters are returned as NULL in rows where they are not defined).

Another example with a more complex list of conditions is
  SELECT name FROM TDFgg
  WHERE extnum=0 AND ((BJSEL<15.5 AND quality>=3) OR quality>4)

which lists just the names of the objects which are either brighter than bJ=15.5 with redshift quality at least 3, or have quality greater than 4, or both; the search is restricted just to the summary row by requiring extnum=0

Simple searches on the two indexed parameters, serial and name), are quick - e.g. WHERE serial=69656 or WHERE name='TGS203Z081'; more complex searches take about 5 minutes. Further information about the mSQL database software and its structured query language is given in Yarger et al. (1999) and on the WWW at http://www.hughes.com.au.

3. WWW Interface

The 2dFGRS mSQL database can be searched via the WWW interface in a number of ways:

  1. search via a standard mSQL query as described above - this is the most general method;
  2. perform a standard mSQL query restricted to a list of named objects;
  3. perform a standard mSQL query restricted to objects within a specified radius of a given sky position;
  4. match objects to a supplied catalogue of positions.

The results of a query can be returned either directly as an HTML table (for relatively small datasets) or via an email giving the URL of the results file (for large datasets). The results file may be either a compressed text (gzipped ASCII) file containing the chosen parameters for the objects selected by the mSQL query, or a compressed archive (gzipped tarfile) of the FITS files for the objects selected by the query, depending upon the option selected. If results are returned as an HTML table, then objects can be selected interactively and their DSS images and spectra displayed. If the spectra have measured redshifts, then the plot of each spectrum shows the positions of prominent spectral features at the redshift associated with that spectrum.


Matthew Colless, colless@mso.anu.edu.au, Wed 8 Apr 2015