Describing Datasets
So far, we’ve used the name, serviceName, and urlPath attributes to tell THREDDS how to treat our datasets.
However, there are a lot of optional properties, or metadata, that can be added to help other applications and digital libraries know how to “do the right thing” with our data.
Here is a sample of them:
- The collectionType attribute is used on collection datasets to describe the relationship of their nested datasets.
- The dataType is a simple classification that helps clients know how to display the data (e.g. Image, Grid, Point data, etc.).
- The dataFormatType describes what format the data is stored in (e.g. NetCDF, GRIB-2, NcML, etc.). This information is used by data access protocols like OPeNDAP and HTTP.
- The combination of the naming authority and the ID attributes should form a globally-unique identifier for a dataset. In the TDS, it is especially important to add the ID attribute to your datasets.
<service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/"/>
<dataset name="SAGE III Ozone Loss Experiment" ID="Sage III" collectionType="TimeSeries">
  <dataset name="January Averages" serviceName="odap" urlPath="sage/avg/jan.nc"
           ID="jan.nc" authority="unidata.ucar.edu">
    <dataType>Trajectory</dataType>
    <dataFormatType>NetCDF</dataFormatType>
  </dataset>
</dataset>
Exporting THREDDS Datasets To Digital Libraries
The harvest attribute indicates that the dataset is at the right level of granularity to be exported to digital libraries or other discovery services.
Elements such as summary, rights, and publisher are needed in order to create valid entries for these services.
<dataset name="SAGE III Ozone Loss Experiment" ID="Sage III" harvest="true">
  <contributor role="data manager">John Smith</contributor>
  <keyword>Atmospheric Chemistry</keyword>
  <publisher>
    <long_name vocabulary="DIF">Community Data Portal, National Center for Atmospheric Research, University Corporation for Atmospheric Research</long_name>
    <contact url="http://dataportal.ucar.edu" email="cdp@ucar.edu"/>
  </publisher>
</dataset>
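The example above does not include the summary and rights mentioned in the text. The following is a minimal sketch of how they might be added, assuming they are written as documentation elements with type="summary" and type="rights". The text inside those two elements is placeholder wording for illustration, not a description of the real dataset.

```xml
<dataset name="SAGE III Ozone Loss Experiment" ID="Sage III" harvest="true">
  <!-- Placeholder text (assumption): replace with a real description of the data -->
  <documentation type="summary">Placeholder summary of the experiment and its data.</documentation>
  <!-- Placeholder text (assumption): replace with the actual terms of use -->
  <documentation type="rights">Placeholder statement of usage rights.</documentation>
  <publisher>
    <long_name vocabulary="DIF">Community Data Portal, National Center for Atmospheric Research, University Corporation for Atmospheric Research</long_name>
    <contact url="http://dataportal.ucar.edu" email="cdp@ucar.edu"/>
  </publisher>
</dataset>
```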
Sharing Metadata
When a catalog includes multiple datasets, they often share properties. For example:
<service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/"/>
<dataset name="SAGE III Ozone Loss Experiment" ID="Sage III">
  <dataset name="January Averages" urlPath="sage/avg/jan.nc" ID="jan.nc" serviceName="odap" authority="unidata.ucar.edu" dataFormatType="NetCDF"/>
  <dataset name="February Averages" urlPath="sage/avg/feb.nc" ID="feb.nc" serviceName="odap" authority="unidata.ucar.edu" dataFormatType="NetCDF"/>
  <dataset name="March Averages" urlPath="sage/avg/mar.nc" ID="mar.nc" serviceName="odap" authority="unidata.ucar.edu" dataFormatType="NetCDF"/>
</dataset>
Rather than declare the same information on each dataset, you can use the metadata element to factor out common information:
<service name="odap" serviceType="OpenDAP" base="/thredds/dodsC/"/>
<dataset name="SAGE III Ozone Loss Experiment" ID="Sage III">
  <metadata inherited="true"> <!-- 1 -->
    <serviceName>odap</serviceName> <!-- 2 -->
    <authority>unidata.ucar.edu</authority> <!-- 2 -->
    <dataFormatType>NetCDF</dataFormatType> <!-- 2 -->
  </metadata>
  <dataset name="January Averages" urlPath="sage/avg/jan.nc" ID="jan.nc"/> <!-- 3 -->
  <dataset name="February Averages" urlPath="sage/avg/feb.nc" ID="feb.nc"/> <!-- 3 -->
  <dataset name="Global Averages" urlPath="sage/global.nc" ID="global.nc" authority="fluffycats.com"/> <!-- 4 -->
</dataset>
- (1) The metadata element with inherited="true" implies that all the information inside the metadata element applies to the current dataset and all nested datasets.
- (2) The serviceName, authority, and dataFormatType are declared as elements.
- (3) These datasets use all the metadata values declared in the parent dataset.
- (4) This dataset overrides authority, but uses the other two metadata values.
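To make the inheritance concrete, here is a sketch of the same nested datasets with the shared values written out on each one, in the attribute style of the first example in this section. It only illustrates the effective values each dataset ends up with; it is not output produced by THREDDS.

```xml
<dataset name="SAGE III Ozone Loss Experiment" ID="Sage III">
  <!-- Both datasets take all three values from the parent's inherited metadata -->
  <dataset name="January Averages" urlPath="sage/avg/jan.nc" ID="jan.nc"
           serviceName="odap" authority="unidata.ucar.edu" dataFormatType="NetCDF"/>
  <dataset name="February Averages" urlPath="sage/avg/feb.nc" ID="feb.nc"
           serviceName="odap" authority="unidata.ucar.edu" dataFormatType="NetCDF"/>
  <!-- authority is overridden locally; serviceName and dataFormatType are still inherited -->
  <dataset name="Global Averages" urlPath="sage/global.nc" ID="global.nc"
           serviceName="odap" authority="fluffycats.com" dataFormatType="NetCDF"/>
</dataset>
```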
When Should I Use A Metadata Element?
Both the dataset and metadata elements are containers for the same set of metadata elements, called the threddsMetadata group.
When the metadata is specific to the dataset, put it directly in the dataset element.
When you want to share it with all nested datasets, put it in a <metadata inherited="true"> element.
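A minimal sketch combining both placements, reusing only names and values from the examples in this section. Which elements are treated as dataset-specific here is an illustrative choice, not a recommendation from the original catalogs.

```xml
<dataset name="SAGE III Ozone Loss Experiment" ID="Sage III">
  <!-- Placed directly in the dataset: describes this collection only -->
  <keyword>Atmospheric Chemistry</keyword>

  <!-- Placed in inherited metadata: shared with every nested dataset -->
  <metadata inherited="true">
    <serviceName>odap</serviceName>
    <authority>unidata.ucar.edu</authority>
    <dataFormatType>NetCDF</dataFormatType>
  </metadata>

  <dataset name="January Averages" urlPath="sage/avg/jan.nc" ID="jan.nc">
    <!-- Placed directly in the nested dataset: specific to this file -->
    <dataType>Trajectory</dataType>
  </dataset>
  <dataset name="February Averages" urlPath="sage/avg/feb.nc" ID="feb.nc"/>
</dataset>
```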