Ethan Davis
last updated 28 September 2005
This page has been superseded. The current and development versions can be found here.
This document describes NetCDF attributes recommended for describing a NetCDF dataset to discovery systems such as digital libraries. THREDDS tools will use these attributes to extract metadata from datasets and export it to Dublin Core, DIF, ADN, FGDC, ISO 19115, and other metadata formats.
These attributes parallel the THREDDS catalog specification's digital library metadata. Attributes add information inside the NetCDF file, while THREDDS catalog metadata adds information external to the NetCDF file.
Where appropriate, we use attributes described in the NetCDF Users Guide as well as some attributes defined in the CF convention. Some we use directly (e.g., "title" and "history"); others we use unless more detailed attributes defined here are given (e.g., "institution" vs. "creator_*").
NetCDF files conforming to this specification must add the global attribute:
:Metadata_Conventions = "Unidata Dataset Discovery v1.0";
When following multiple metadata conventions, list them with a comma separator.
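As an illustration of the comma-separated rule, here is a minimal sketch of how a writing tool might add this convention to an existing attribute value. The helper name `add_convention` is hypothetical and not part of any NetCDF library, and joining with a comma followed by a space is an assumption; the text above only asks for a comma separator.

```python
DISCOVERY_CONVENTION = "Unidata Dataset Discovery v1.0"


def add_convention(existing, convention=DISCOVERY_CONVENTION):
    """Return a comma-separated convention list that includes `convention`.

    `existing` is the current Metadata_Conventions value, or None/"" if
    the attribute has not been set yet.
    """
    # Split on commas, dropping surrounding whitespace and empty entries.
    names = [n.strip() for n in (existing or "").split(",") if n.strip()]
    # Avoid listing the same convention twice.
    if convention not in names:
        names.append(convention)
    # Assumption: ", " as the separator (comma plus a space for readability).
    return ", ".join(names)
```

The function only builds the string; writing it to the file's global attribute is left to whatever NetCDF library is in use.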
Attribute | Description | THREDDS |
---|---|---|
title | A short description of the dataset. | dataset@name |
summary | A paragraph describing the dataset. | metadata/documentation[@type="summary"] |
keywords | A comma separated list of key words and phrases. | metadata/keyword |
Attribute | Description | THREDDS |
---|---|---|
id | The combination of the "naming authority" and the "id" should be a globally unique identifier for the dataset. | dataset@id |
naming_authority | | dataset@authority metadata/authority |
keywords_vocabulary | If you are following a guideline for the words/phrases in your "keywords" attribute, put the name of that guideline here. | metadata/keyword@vocabulary |
cdm_data_type | The THREDDS data type appropriate for this dataset. | metadata/dataType |
history | Provides an audit trail for modifications to the original data. | metadata/documentation[@type="history"] |
comment | Miscellaneous information about the data. | metadata/documentation |
date_created | The date on which the data was created. | metadata/date[@type="created"] |
creator_name | The data creator's name, URL, and email. The "institution" attribute will be used if the "creator_name" attribute does not exist. | metadata/creator/name |
creator_url | | metadata/creator/contact@url |
creator_email | | metadata/creator/contact@email |
institution | | metadata/creator/name |
project | The scientific project that produced the data. | metadata/project |
processing_level | A textual description of the processing (or quality control) level of the data. | metadata/documentation[@type="processing_level"] |
acknowledgment | A place to acknowledge various types of support for the project that produced this data. | metadata/documentation[@type="funding"] |
geospatial_lat_min | Describes a simple latitude, longitude, and vertical bounding box. For a more detailed geospatial coverage, see the suggested geospatial attributes. | metadata/geospatialCoverage/northsouth/start |
geospatial_lat_max | | metadata/geospatialCoverage/northsouth/size |
geospatial_lon_min | | metadata/geospatialCoverage/eastwest/start |
geospatial_lon_max | | metadata/geospatialCoverage/eastwest/size |
geospatial_vertical_min | | metadata/geospatialCoverage/updown/start |
geospatial_vertical_max | | metadata/geospatialCoverage/updown/size |
time_coverage_start | Describes the temporal coverage of the data as a time range. | metadata/timeCoverage/start |
time_coverage_end | | metadata/timeCoverage/end |
time_coverage_duration | | metadata/timeCoverage/duration |
time_coverage_resolution | | metadata/timeCoverage/resolution |
standard_name_vocabulary | The name of the controlled vocabulary from which variable standard names are taken. | metadata/variables@vocabulary |
license | Describe the restrictions to data access and distribution. | metadata/documentation[@type="rights"] |
Attribute | Description | THREDDS |
---|---|---|
contributor_name | The name and role of any individuals or institutions that contributed to the creation of this data. | metadata/contributor |
contributor_role | | metadata/contributor@role |
publisher_name | The data publisher's name, URL, and email. The publisher may be an individual or an institution. | metadata/publisher/name |
publisher_url | | metadata/publisher/contact@url |
publisher_email | | metadata/publisher/contact@email |
date_modified | The date on which this data was last modified. | metadata/date[@type="modified"] |
date_issued | The date on which this data was formally issued. | metadata/date[@type="issued"] |
geospatial_lat_units | Further refinement of the geospatial bounding box can be provided by using these units and resolution attributes. | metadata/geospatialCoverage/northsouth/units |
geospatial_lat_resolution | | metadata/geospatialCoverage/northsouth/resolution |
geospatial_lon_units | | metadata/geospatialCoverage/eastwest/units |
geospatial_lon_resolution | | metadata/geospatialCoverage/eastwest/resolution |
geospatial_vertical_units | | metadata/geospatialCoverage/updown/units |
geospatial_vertical_resolution | | metadata/geospatialCoverage/updown/resolution |
geospatial_vertical_positive | | metadata/geospatialCoverage@zpositive |
Attribute | Description | THREDDS |
---|---|---|
long_name | A long descriptive name for the variable (not necessarily from a controlled vocabulary). | metadata/variables/variable@vocabulary_name |
standard_name | A long descriptive name for the variable taken from a controlled vocabulary of variable names. | metadata/variables/variable@vocabulary_name |
units | The units of the variable's data values. This attribute's value should be a valid udunits string. | metadata/variables/variable@units |
The "acknowledgment" attribute provides a place to acknowledge various types of support for the project that produced the data. Use of this attribute is recommended.
The "contributor_name" and "contributor_role" attributes provide the name and role of any individuals or institutions that contributed to the creation of the data. The use of these attributes is suggested.
The "creator_name", "creator_url", and "creator_email" attributes provide the name, URL, and email contact information for the creator of the data. The data creator may be an individual or an institution. If the "creator_name" attribute does not exist, the "institution" attribute will be used. If creator information other than the name is to be given, we recommend use of the "creator_*" attributes.
Note: email address persistence
The "date_created" attribute gives the date on which the data was created. Its use is recommended.
The "date_issued" attribute provides the date on which this data was formally issued. Use of this attribute is suggested when relevant to the data and distinct from other dates used for this data.
The "date_modified" attribute provides the date on which the data was last modified. Use of this attribute is suggested if the data has been modified since the date of creation.
Use the "geospatial_lat_min"/"geospatial_lat_max", "geospatial_lon_min"/"geospatial_lon_max", and "geospatial_vertical_min"/"geospatial_vertical_max" attributes to describe a simple latitude, longitude, and vertical bounding box. If none of the other geospatial attributes are used, latitude is assumed to be in decimal degrees north, longitude is assumed to be in decimal degrees east, and vertical is assumed to be in meters above ground. The use of these min/max geospatial attributes is recommended.
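The bounding box values can be derived directly from a dataset's coordinate values. The following is a minimal sketch under the default assumptions stated above (decimal degrees north and east, meters for the vertical). The function name `bounding_box_attributes` is hypothetical, and the sketch does not handle longitude ranges that cross the date line.

```python
def bounding_box_attributes(lats, lons, heights=None):
    """Build the geospatial min/max attributes from coordinate values.

    Assumes latitudes in decimal degrees north, longitudes in decimal
    degrees east, and heights in meters. Longitude ranges that wrap
    around the date line are not handled.
    """
    attrs = {
        "geospatial_lat_min": min(lats),
        "geospatial_lat_max": max(lats),
        "geospatial_lon_min": min(lons),
        "geospatial_lon_max": max(lons),
    }
    # The vertical extent is only added when vertical coordinates exist.
    if heights:
        attrs["geospatial_vertical_min"] = min(heights)
        attrs["geospatial_vertical_max"] = max(heights)
    return attrs
```

The returned dictionary can then be written as global attributes with whatever NetCDF library is in use.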
Further refinement of the geospatial bounding box can be provided by using the units and resolution attributes ("geospatial_lat_units", "geospatial_lat_resolution", and their "lon" and "vertical" counterparts). The "geospatial_vertical_positive" attribute indicates which direction is positive: a value of "up" means that z increases upward, like units of height, while a value of "down" means that z increases downward, like units of pressure or depth. The use of these further geospatial attributes is suggested.
The "history" attribute provides an audit trail for modifications to the original data. It should contain a separate line for each modification, with each line including a timestamp, user name, modification name, and modification arguments. Its use is recommended and its value will be used by THREDDS as history-type documentation. The "history" attribute is recommended by the NetCDF Users Guide and the CF convention.
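A minimal sketch of appending one such line to an existing "history" value follows. The function name `append_history` and the space-separated layout of the line are assumptions; the text above lists the pieces each line should contain but does not fix their format.

```python
from datetime import datetime, timezone


def append_history(history, user, name, args, when=None):
    """Append one modification line to a "history" attribute value.

    Assumed layout: "<timestamp> <user> <modification name> <arguments>".
    `when` should be a UTC datetime; the current UTC time is used if omitted.
    """
    when = when or datetime.now(timezone.utc)
    stamp = when.strftime("%Y-%m-%dT%H:%M:%SZ")
    # rstrip() removes the trailing space left when there are no arguments.
    line = f"{stamp} {user} {name} {' '.join(args)}".rstrip()
    # One line per modification, oldest first.
    return f"{history}\n{line}" if history else line
```

Calling it repeatedly builds up the audit trail one line at a time, leaving earlier entries untouched.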
The "id" and "naming_authority" attributes are intended to provide a globally unique identification for each dataset. The "id" value should attempt to uniquely identify the dataset. The naming authority allows a further refinement of the "id". The combination of the two should be globally unique for all time. We recommend using reverse-DNS naming for the naming authority. For example, naming_authority="edu.ucar.unidata" and id="NCEP/NAM_211_2005-05-24_12Z".
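The reverse-DNS recommendation can be sketched as below. Both helper names are hypothetical, and the ":" used to join the naming authority and the id into a single string is an assumption; the text above only says that the combination of the two should be unique.

```python
def reverse_dns(domain):
    """Turn a domain such as "unidata.ucar.edu" into "edu.ucar.unidata"."""
    return ".".join(reversed(domain.strip(".").split(".")))


def global_identifier(naming_authority, dataset_id):
    """Join authority and id into one string (the ":" separator is an assumption)."""
    return f"{naming_authority}:{dataset_id}"


authority = reverse_dns("unidata.ucar.edu")
print(global_identifier(authority, "NCEP/NAM_211_2005-05-24_12Z"))
```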
The "keywords_vocabulary" attribute identifies the controlled list of keywords from which the values in the "keywords"
attribute are taken. If
you are following a guideline for the words/phrases in your "keywords"
attribute, put the name of that guideline here. The use of this
attribute is recommended and its value will be used by THREDDS to
identify the vocabulary from which the keywords come.
Common values for the "keywords_vocabulary" attribute include:
Vocabulary ID | Reference URL |
---|---|
"AGU Index Terms" | http://www.agu.org/pubs/indexterms/ |
"GCMD Science Keywords" | http://gcmd.gsfc.nasa.gov/Resources/valids/gcmd_parameters.html |
The "license" attribute describes the restrictions to data access
and distribution. Use of this attribute is recommended, especially if
there are constraints on the use of the data.
Notes: information may change over time.
The "long_name" variable attribute provides a long descriptive name for the variable (not necessarily from a controlled vocabulary). Its use is highly recommended. If a "standard_name" attribute is not given (and a "standard_name_vocabulary" is given), the "long_name" attribute value will be used by THREDDS as the variable's name in the variable mapping. The "long_name" attribute is recommended by the "NetCDF Users Guide", the COARDS convention, and the CF convention.
The "processing_level" attribute provides a textual description of
the processing (or quality control) level of the data. The use of this
attribute is recommended.
The "project" attribute provides the name of the scientific project for which the data was created. The use of this attribute is recommended.
The "publisher_name", "publisher_url", and "publisher_email" attributes provide the data publisher's name, URL, and email. The publisher may be an individual or an institution. The use of these attributes is suggested.
Notes: multiple publishers; override information; email address persistence
The "standard_name" variable attribute provides a name for the variable from a standard list of names; that is, the value is taken from a controlled vocabulary of variable names. We recommend using the CF convention and the variable names from the CF standard name table. Use of this attribute is highly recommended and its value will be used by THREDDS as the variable's name in the variable mapping. (For THREDDS use, this attribute takes precedence over the "long_name" attribute.) This attribute is recommended by the CF convention.
Note: For a file to be CF compliant, all "standard_name" values must come from the CF standard name table.
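The precedence between "standard_name" and "long_name" can be sketched as a small lookup. The function name `thredds_variable_name` is hypothetical and this is not THREDDS code. The sketch also ignores the condition that "long_name" is only used when a "standard_name_vocabulary" is given.

```python
def thredds_variable_name(var_attrs):
    """Pick the name used in the variable mapping.

    `var_attrs` is a plain dict of one variable's attributes.
    "standard_name" takes precedence; "long_name" is the fallback.
    Returns None when neither attribute is present (or both are empty).
    """
    return var_attrs.get("standard_name") or var_attrs.get("long_name")
```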
Vocabulary ID | Reference URL |
---|---|
"CF-1.0" | http://www.cgd.ucar.edu/cms/eaton/cf-metadata/standard_name.html |
"GCMD Science Keywords" | http://gcmd.gsfc.nasa.gov/Resources/valids/gcmd_parameters.html |
The "summary" attribute gives a longer description of the dataset. Its use is highly recommended. In many discovery systems, the title and the summary will be displayed in the results list from a search. It should therefore capture the essence of the dataset it describes. For instance, we recommend this field include information on the type of data contained in the dataset, how the data was created (e.g., instrument X; or model X, run Y), the creator of the dataset, the project for which the data was created, the geospatial coverage of the data, and the temporal coverage of the data. This should be only a summary of this information; more detail should be provided in the recommended creator attributes, the recommended geospatial attributes, and the recommended temporal attributes.
The "time_coverage_start", "time_coverage_end", "time_coverage_duration", and "time_coverage_resolution" attributes describe the temporal coverage of the data. The temporal coverage can be described with any of the following pairs of values: start/end, start/duration, or end/duration. The start and end values should be a date string such as an ISO8601 date (e.g., "1999-07-04T22:30"), a udunits date (e.g., "25 days since 1970-01-01"), or the string "present". The duration value should be an ISO8601 duration string (e.g., "P10D"). The resolution gives an idea of the density of the data inside the time range and should also be an ISO8601 duration string. The use of these attributes is recommended.
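As a sketch of the start/duration pairing, the following builds the two attribute values from a pair of Python datetimes. Both function names are hypothetical. The duration formatter assumes a non-negative interval and uses only days, hours, minutes, and whole seconds (no months or years).

```python
from datetime import datetime, timedelta


def iso_duration(delta):
    """Format a non-negative timedelta as an ISO 8601 duration, e.g. "P10D"."""
    seconds = int(delta.total_seconds())
    days, seconds = divmod(seconds, 86400)
    hours, seconds = divmod(seconds, 3600)
    minutes, seconds = divmod(seconds, 60)
    text = "P" + (f"{days}D" if days else "")
    # The time components follow a "T" designator and are omitted when zero.
    time_part = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    if time_part:
        text += "T" + time_part
    # A zero-length interval still needs at least one component.
    return text if text != "P" else "PT0S"


def time_coverage_attributes(start, end):
    """Describe a time range using the start/duration pair."""
    return {
        "time_coverage_start": start.strftime("%Y-%m-%dT%H:%M"),
        "time_coverage_duration": iso_duration(end - start),
    }


attrs = time_coverage_attributes(
    datetime(1999, 7, 4, 22, 30),
    datetime(1999, 7, 4, 22, 30) + timedelta(days=10),
)
print(attrs)
```

The same `iso_duration` helper could also produce a "time_coverage_resolution" value from the spacing between time steps.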
The "title" attribute gives a brief description of the dataset. Its use is highly recommended and its value will be used by THREDDS as the name of the dataset. It therefore should be human readable and reasonable to display in a list of such names. The "title" attribute is recommended by the "NetCDF Users Guide" and the CF convention.
The "units" variable attribute gives the units of the data contained by that variable. The value of the "units" attribute should be a valid udunits string. Its use is highly recommended and its value will be used by THREDDS as the variable's units in the variable mapping. The "units" attribute is recommended by the "NetCDF Users Guide", the COARDS convention, and the CF convention.
Attribute | Description | THREDDS |
---|---|---|
date_available | The date (often a range) on which this data was made available (or, if a range, during which the data was available). | metadata/date[@type="available"] |
date_valid | The date (often a range) for which the data is valid. | metadata/date[@type="valid"] |
comments to Ethan Davis