GRIB Feature Collection Tutorial

GRIB Feature Collection

For more information

The featureCollection element is a way to tell the TDS to serve collections of CDM Feature Datasets. Currently this is used mostly for gridded data whose time and spatial coordinates are recognized by the CDM software stack. In this tutorial, we will work with featureCollection for collections of GRIB files.

Creating a GRIB Feature Collection

Download catalogGribfc.xml, place it in ${tomcat_home}/content/thredds directory and add a catalogRef to it from your main catalog. Heres the first feature collection in it:

1)<featureCollection name="FNL" featureType="GRIB1" path="gribfc/LocalFNLCollection">
2) <metadata inherited="true">
3)   <serviceName>all</serviceName>
4)    <documentation type="summary">LOCAL FNL's TO TEST TIME PARTITION</documentation>
   </metadata>
5) <collection name="ds083.2"
        spec="/machine/tds/data/grib/fnl/**/fnl_.*_00_c$"
6)      timePartition="directory"
7)      dateFormatMark="#fnl_#yyyyMMdd_HH" />
8) <update startup="test"/>
</featureCollection>
  1. A THREDDS featureCollection is defined, of type GRIB1. All contained datasets will all have a path starting with gribfc/LocalFNLCollection.
  2. All the metadata contained here will be inherited by the contained datasets.
  3. The services to be used are defined in a compound service type called all.
  4. You can add any metadata that is appropriate.
  5. The collection of files is defined, using a collection specification string. Everything under /machine/tds/data/grib/fnl will be scanned for files with names that match the regular expression fnl_.*_00_c$
  6. The collection will be split into a time partition by directory.
  7. A date will be extracted from the filename by matching the characters after fnl_ with yyyyMMdd_HH. An example filename is fnl_20100104_12_00_c, so the date will be year 2010, month 01, day 04 and hour 12.
  8. Read in the collection when the TDS starts up, and test that the indices are up to date.

The resulting top level web page for the dataset looks like:

The TDS has created a number of datasets out of the GRIB collection, and made them available through the catalog interface. There is

For each seperate reference time, there is a logical dataset, each with a "Full" (two time dimensions) and "Best" dataset. Drilling down to the bottom of one of these:

You see that it has a "Best Timeseries" collection dataset as well as listing the individual files in the collection:

Here is listed all of the metadata for this dataset, as well as the possible access methods (OpenDAP, WMS, etc). This is the "HTML view" of the catalog, with URL:

http://localhost:8080/thredds/catalog/gribfc/LocalFNLCollection/ds083.2-2010/ds083.2-2010.01/ds083.2-2010.01-20100101-000000.ncx2/catalog.html?
	dataset=gribfc/LocalFNLCollection/ds083.2-2010/ds083.2-2010.01/ds083.2-2010.01-20100101-000000.ncx2/GC
Its instructive to look at the "XML view" of the catalog, by removing the query (after the "?") and changinf the "html" to "xml:, giving this URL:
http://localhost:8080/thredds/catalog/gribfc/LocalFNLCollection/ds083.2-2010/ds083.2-2010.01/ds083.2-2010.01-20100101-000000.ncx2/catalog.xml

and this is the result:

<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" xmlns:xlink="http://www.w3.org/1999/xlink" name="ds083.2-2010.01-20100101-000000" version="1.0.1">
<service name="VirtualServices" serviceType="Compound" base="">
<service name="ncdods" serviceType="OPENDAP" base="/thredds/dodsC/" />
<service name="wcs" serviceType="WCS" base="/thredds/wcs/" />
<service name="wms" serviceType="WMS" base="/thredds/wms/" />
<service name="ncss" serviceType="NetcdfSubset" base="/thredds/ncss/grid/" />
<service name="cdmremote" serviceType="CdmRemote" base="/thredds/cdmremote/" />
<service name="ncml" serviceType="NCML" base="/thredds/ncml/" />
<service name="uddc" serviceType="UDDC" base="/thredds/uddc/" />
<service name="iso" serviceType="ISO" base="/thredds/iso/" />
</service>
<dataset name="ds083.2-2010.01-20100101-000000" ID="gribfc/LocalFNLCollection/ds083.2-2010/ds083.2-2010.01/ds083.2-2010.01-20100101-000000.ncx2/GC" urlPath="gribfc/LocalFNLCollection/ds083.2-2010/ds083.2-2010.01/ds083.2-2010.01-20100101-000000.ncx2/GC">
<documentation type="summary">Single reference time Grib Collection</documentation>
<metadata inherited="true">
<serviceName>VirtualServices</serviceName>
<dataType>GRID</dataType>
<documentation type="summary">LOCAL FNL's TO TEST TIME PARTITION</documentation>
<documentation type="Reference Time">2010-01-01T00:00:00Z</documentation>
<geospatialCoverage>
<northsouth>
<start>-90.0</start>
<size>180.0</size>
<resolution>-1.0</resolution>
<units>degrees_north</units>
</northsouth>
<eastwest>
<start>0.0</start>
<size>360.0</size>
<resolution>1.0</resolution>
<units>degrees_east</units>
</eastwest>
<name>global</name>
</geospatialCoverage>
<timeCoverage>
<start>2010-01-01T00:00:00Z</start>
<end>2010-01-01T00:00:00Z</end>
</timeCoverage>
<variableMap xlink:href="/thredds/metadata/gribfc/LocalFNLCollection/ds083.2-2010/ds083.2-2010.01/ds083.2-2010.01-20100101-000000.ncx2/GC?metadata=variableMap" xlink:title="variables" />
</metadata>
</dataset>
</catalog>
You can click around in these pages to familiarize yourself with the various datasets.

GRIB Feature Collection with multiple GDS

The second feature collection in catalogGribfc.xml has:

1)<featureCollection name="ECMWF Data" featureType="GRIB1" path="gribfc/ecmwf" serviceName="all">
2) <collection name="ECMWF_GNERA" spec="C:/tmp/gnera/ECMWF_GNERA_d000..#yyyyMMdd#" />
<gribConfig datasetTypes="Best LatestFile Files">
3) <gdsName hash='1562665966' groupName='HighResolution'/>
<gdsName hash='-104750013' groupName='LowResolution'/>

</gribConfig>
</featureCollection>
  1. A THREDDS featureCollection is defined, of type GRIB1. All contained datasets will all have a path starting with gribfc/ecmwf.
  2. The collection of files is defined, using a collection specification string. Subdirectories of /machine/tds/tutorial/ecmwf will be scanned for files with names that match the regular expression ECMWF_GNERA_d000..20121001$. A date will be extracted from the filename by matching the characters after the "." with yyyyMMdd. An example filename is ECMWF_GNERA_d0001.20121001, so the date will be year 2012, month 10, day 01.
  3. A configuration element that is specific to GRIB collections. In this case we are changing the name of the group by matching the GDS hash code.

Open up the ToolsUI IOSP/GRIB1/Grib1Collection tab, and enter "/work/tds/tutorial/ecmwf/ECMWF_GNERA_d000..20121001$" into the collection spec, you will see something like:

The bottom table shows that there are two distinct GDS in this collection. The column marked "hash" shows the GDS hash code that you use in the TDS configuration table. Click the first Info button ("generate gds xml") to generate XML that you can modify. You can then cut and paste this into your TDS catalog file:

After making the modifications in the TDS config catalog, the resulting HTML view is:

So we have given human meaningful names to the groups. This renaming can be done at any time, one just restarts the TDS for it to have affect.

GRIB Feature Collection with spurious GDS

The third feature collection in catalogGribfc.xml has:

1)<featureCollection name="NDFD-CONUS_5km_conduit" featureType="GRIB" path="gribfc/ndfd">
<metadata inherited="true">
2) <dataFormat>GRIB-2</dataFormat>
</metadata>
3) <collection spec="/machine/tds/tutorial/ndfd/.*grib2$" dateFormatMark="#NDFD_CONUS_5km_conduit_#yyyyMMdd_HHmm" />
4) <gribConfig>
<gdsHash from="-2121584860" to="28944332"/>
</gribConfig>
</featureCollection>
</featureCollection>
  1. A THREDDS featureCollection is defined, of type GRIB. All contained datasets will all have a path starting with gribfc/ndfd.
  2. Make sure you specify GRIB-2 dataFormat, or else nothing will work.
  3. Subdirectories of /machine/tds/tutorial/ndfd will be scanned for files with names that end with grib2. A date will be extracted from the filename by matching the characters after the "NDFD_CONUS_5km_conduit_" with yyyyMMdd_HHmm. An example filename is NDFD_CONUS_5km_conduit_20120124_2000.grib2, so the date will be year 2012, month 01, day 24, hour 20, minute 00.
  4. A configuration element that is specific to GRIB collections. In this case we are combining records with GDS hashcode -2121584860 into GDS 28944332.

Open up the ToolsUI IOSP/GRIB2/Grib2Collection tab, and enter the "/work/tds/tutorial/ndfd/.*grib2$" into the collection spec, you will see something like:

The bottom table shows that there are two distinct GDS in this collection. The column marked "hash" shows the GDS hash code that you use in the TDS configuration table. However, both GDS have the same nx and ny, which is a bit suspicious. Select both GDS, then right click on them and select "compare GDS" to get this:

This compares the x and y coordinates of the two GDS. These are displaced by .367 and .300 km, respectively. If you open this dataset up in the coordinate system tab, you will see that the x,y grid spacing is 2.5 km. Its possible that some of these variables are displaced 3/10 km, and its possible that there is a error in generating these GRIB records, and that in fact all of the variables should be on the same grid. If the latter, then the gdsConfig element in the TDS config catalog above will fix the problem.

This effects the generation of the CDM index (ncx2) files. To have this take affect, delete any ncx2 files and regenerate.