NetCDF Attribute Convention for Dataset Discovery

Ethan Davis

last updated 28 September 2005


This page has been superseded. The current and development versions can be found here.


This document describes NetCDF attributes recommended for describing a NetCDF dataset to discovery systems such as Digital Libraries. THREDDS tools will use these attributes for extracting metadata from datasets, and exporting to Dublin Core, DIF, ADN, FGDC, ISO 19115 etc. metadata formats.

These attributes parallel THREDDS catalog specification's digital library metadata. Attributes are used to add information inside the NetCDF file, while THREDDS catalog metadata adds information external to the NetCDF file.

Where appropriate, we use attributes described in the NetCDF Users Guide as well as some attributes defined in the CF convention. Some we use directly (e.g., "title" and "history") others we use unless more detailed attributes defined here are given (e.g., "institution" vs "creator_*").

Related Documents



Summary of Global Attributes

Conventions global attribute

NetCDF files conforming to this specification must add the global attribute:

  :Metadata_Conventions = "Unidata Dataset Discovery v1.0"; 

When following multiple metadata conventions, list them with a comma separator.

Highly Recommended


Attribute Description THREDDS
title
A short description of the dataset.
dataset@name
summary
A paragraph describing the dataset.
metadata/documentation[@type="summary"]
keywords
A comma separated list of key words and phrases.
metadata/keyword

Recommended

Attribute Description THREDDS
id
The combination of the "naming authority" and the "id" should be a globally unique identifier for the dataset.
dataset@id
naming_authority
dataset@authority
metadata/authority
keywords_vocabulary
If you are following a guideline for the words/phrases in your "keywords" attribute, put the name of that guideline here.
metadata/keyword@vocabulary
cdm_data_type
The THREDDS data type appropriate for this dataset. metadata/dataType
history
Provides an audit trail for modifications to the original data. metadata/documentation[@type="history"]
comment
Miscellaneous information about the data. metadata/documentation
date_created The date on which the data was created.
metadata/date[@type="created"]
creator_name
The data creator's name, URL, and email. The "institution" attribute will be used if the "creator_name" attribute does not exist.
metadata/creator/name
creator_url
metadata/creator/contact@url
creator_email
metadata/creator/contact@email
institution
metadata/creator/name
project
The scientific project that produced the data.
metadata/project
processing_level A textual description of the processing (or quality control) level of the data.
metadata/documentation[@type="processing_level"]
acknowledgment A place to acknowledge various type of support for the project that produced this data.
metadata/documentation[@type="funding"]
geospatial_lat_min
Describes a simple latitude, longitude, and vertical bounding box. For a more detailed geospatial coverage, see the suggested geospatial attributes.
metadata/geospatialCoverage/northsouth/start
geospatial_lat_max metadata/geospatialCoverage/northsouth/size
geospatial_lon_min metadata/geospatialCoverage/eastwest/start
geospatial_lon_max metadata/geospatialCoverage/eastwest/size
geospatial_vertical_min
metadata/geospatialCoverage/updown/start
geospatial_vertical_max metadata/geospatialCoverage/updown/size
time_coverage_start Describes the temporal coverage of the data as a time range. metadata/timeCoverage/start
time_coverage_end metadata/timeCoverage/end
time_coverage_duration metadata/timeCoverage/duration
time_coverage_resolution metadata/timeCoverage/resolution
standard_name_vocabulary
The name of the controlled vocabulary from which variable standard names are taken.
metadata/variables@vocabulary
license Describe the restrictions to data access and distribution. metadata/documentation[@type="rights"]

Suggested

Attribute Description THREDDS
contributor_name
The name and role of any individuals or institutions that contributed to the creation of this data.
metadata/contributor
contributor_role
metadata/contributor@role
publisher_name
The data publisher's name, URL, and email. The publisher may be an individual or an institution. metadata/publisher/name
publisher_url
metadata/publisher/contact@url
publisher_email
metadata/publisher/contact@email
date_modified
The date on which this data was last modified.
metadata/date[@type="modified"]
date_issued
The date on which this data was formally issued.
metadata/date[@type="issued"]
geospatial_lat_units
Further refinement of the geospatial bounding box can be provided by using these units and resolution attributes.
metadata/geospatialCoverage/northsouth/units
geospatial_lat_resolution metadata/geospatialCoverage/northsouth/resolution
geospatial_lon_units
metadata/geospatialCoverage/eastwest/units
geospatial_lon_resolution metadata/geospatialCoverage/eastwest/resolution
geospatial_vertical_units
metadata/geospatialCoverage/updown/units
geospatial_vertical_resolution
metadata/geospatialCoverage/updown/resolution
geospatial_vertical_positive
metadata/geospatialCoverage@zpositive

Summary of Variable Attributes

Highly Recommended

Attribute Description THREDDS
long_name
A long descriptive name for the variable (not necessarily from a controlled vocabulary).
metadata/variables/variable@vocabulary_name
standard_name
A long descriptive name for the variable taken from a controlled vocabulary of variable names.
metadata/variables/variable@vocabulary_name
units
The units of the variables data values. This attributes value should be a valid udunits string.
metadata/variables/variable@units


Attributes

acknowledgment Attribute

The "acknowledgment" attribute provides a place to acknowledge various types of support for the project that produced the data. Use of this attribute is recommended.

cdm_data_type Attribute

The "cdm_data_type" attribute gives the THREDDS data type appropriate for this dataset. E.g., "Grid", "Image", "Station", "Trajectory", "Radial". Its use is recommended.

comment Attribute

The "comment" attribute allows for miscellaneous information about the dataset. Use of this attribute is recommended as appropriate. This attribute originated in the CF convention.

contributor_name and contributor_role Attribute

These attributes provide the name and role of any individuals or institutions that contributed to the creation of the data. The use of these attributes is suggested.

creator_email, creator_name, creator_url, and institution Attributes

These attributes provide the name, URL, and email contact information for the creator of the data. The data creator may be an individual or an institution. If the "creator_name" attribute does not exist, the "institution" attribute will be used. If creator information other than name is to be given, we recommend use of the "creator_*" attributes.

Note: email address persistence

date_created Attribute

The "date_created" attribute gives the date on which the data was created. Its use is recommended.

date_issued Attribute

The "date_issued" attribute  provides the date on which this data was formally issued. Use of this attribute is suggested when relevant to the data and distinct from other dates used for this data.

date_modified Attribute

The "date_modified" attribute provides the date on which the data was last modified. Use of this attribute is suggested if the data has been modified since the date of creation.

geospatial_lat_max, geospatial_lat_min, geospatial_lat_resolution, geospatial_lat_units, geospatial_lon_max, geospatial_lon_min, geospatial_lon_resolution, geospatial_lon_units, geospatial_vertical_max, geospatial_vertical_min, geospatial_vertical_positive, geospatial_vertical_resolution, and geospatial_vertical_units Attributes

Use the min and max attributes to describe a simple latitude, longitude, vertical bounding box. If none of the other attributes are used, latitude is assumed to be in decimal degrees north, longitude is assumed to be in decimal degrees east, and vertical is assumed to be in meters above ground. The use of these min/max geospatial attributes is recommended.

Further refinement of the geospatial bounding box can be provided by using the units and resolution attributes. The geospatial_vertical_positive attribute indicates which direction is positive (a value of "up" means that z increases up, like units of height, while a value of "down" means that z increases downward, like units of pressure or depth). The use of these further geospatial attributes is suggested.

history Attribute

The "history" attribute provides an audit trail for modifications to the original data. It should contain a separate line for each modification with each line including a timestamp, user name,  modification name, and modification arguments. Its use is recommended and its value will be used by THREDDS as a history-type documentation. The "history" attribute is recommended by the NetCDF Users Guide and the CF convention.

id and naming_authority Attributes

The "id" and "naming_authority" attributes are intended to provide a globally unique identification for each dataset. The "id" value should attempt to uniquely identify the dataset. The naming authority allows a further refinement of the "id". The combination of the two should be globally unique for all time. We recommend using reverse-DNS naming for the naming authority. For example, naming_authority="edu.ucar.unidata" and id="NCEP/NAM_211_2005-05-24_12Z".

keywords Attribute

The "keywords" attribute lists key words and phrases that are relevant to the dataset. Its use is highly recommended. The values in the list may be taken from a controlled list of keywords (e.g., the AGU Index list or the GCMD Science Keywords). If a controlled list is used, the "keywords_vocabulary" attribute may be used to identify the list.

keywords_vocabulary Attribute

The "keywords_vocabulary" attribute identifies the controlled list of keywords from which the values in the "keywords" attribute are taken.  If you are following a guideline for the words/phrases in your "keywords" attribute, put the name of that guideline here. The use of this attribute is recommended and its value will be used by THREDDS to identify the vocabulary from which the keywords come.

Common values for the "keywords_vocabulary" attribute include:

Vocabulary ID
Reference URL
"AGU Index Terms" http://www.agu.org/pubs/indexterms/
"GCMD Science Keywords" http://gcmd.gsfc.nasa.gov/Resources/valids/gcmd_parameters.html
 

license Attribute

The "license" attribute describes the restrictions to data access and distribution. Use of this attribute is recommended, especially if there are constraints on the use of the data.

Notes: information may change over time.

long_name Attribute

The "long_name" variable attribute provides a long descriptive name for the variable (not necessarily from a controlled vocabulary). Its use is highly recommended. If a "standard_name" attribute is not given (and a "standard_name_vocabulary" is given), the "long_name" attribute value will be used by THREDDS as the variable's name in the variable mapping. The "long_name" attribute is recommended by the "NetCDF Users Guide", the COARDS convention, and the CF convention.

processing_level Attribute

The "processing_level" attribute provides a textual description of the processing (or quality control) level of the data. The use of this attribute is recommended.

project Attribute

The "project" attribute provides the name of the scientific project for which the data was created. The use of this attribute is recommended.

publisher_name, publisher_url, and publisher_email Attribute

These attributes provide the data publisher's name, URL, and email. The publisher may be an individual or an institution. The use of these attributes is suggested.

Notes: multiple publishers; override information; email address persistence

standard_name Attribute

The "standard_name" variable attribute provides a name for the variable from a standard list of names. I.e., the value is from a controlled vocabulary of variable names. We recommend using the CF convention and the variable names from the CF standard name table. Use of this attribute is highly recommended and its value will be used by THREDDS as the variable's name in the variable mapping. (For THREDDS use, this attribute takes precedence over the "long_name" attribute.) This attribute is recommended by the CF convention.

Note: Just remember, for a file to be CF compliant, all the standard_name values must be from the CF standard name table.

standard_name_vocabulary Attribute

The "standard_name_vocabulary" attribute indicates which controlled list of variable names has been used in the "standard_name" attribute. Use of this attribute is recommended and their value will be used by THREDDS in the variable mapping. If the file uses the CF convention (and the Convention attribute indicates this), THREDDS will assume the standard_name values are from the CF convention standard name table.

Common values for the "standard_name_vocabulary" attribute include:
Vocabulary ID
Reference URL
"CF-1.0"
http://www.cgd.ucar.edu/cms/eaton/cf-metadata/standard_name.html
"GCMD Science Keywords" http://gcmd.gsfc.nasa.gov/Resources/valids/gcmd_parameters.html
 

summary Attribute

The "summary" attribute gives a longer description of the dataset. Its use is highly recommended. In many discovery systems, the title and the summary will be displayed in the results list from a search. It should therefore capture the essence of the dataset it describes. For instance, we recommend this field include information on the type of data contained in the dataset, how the data was created (e.g., instrument X; or model X, run Y), the creator of the dataset, the project for which the data was created, the geospatial coverage of the data, and the temporal coverage of the data. This should just be a summary of this information, more detail should be provided in the recommended creator attributes, the recommended geospatial attributes, and the recommended temporal attributes.

time_coverage_start, time_coverage_end, time_coverage_duration, and time_coverage_resolution Attributes

These attributes are used to describe the temporal coverage of the data. The temporal coverage of the data can be described with any of the following pairs of values: start/end, start/duration, or end/duration. The start and end values should be a date string like an ISO8601 date (e.g., "1999-07-04T22:30"), a udunits date (e.g., "25 days since 1970-01-01"), or the string "present". The duration value should be an ISO8601 duration string (e.g., "P10D"). The resolution provides an idea of the density of the data inside the time range and should also be an ISO8601 duration string. The use of these attributes is recommended

title Attribute

The "title" attribute gives a brief description of the dataset. Its use is highly recommended and its value will be used by THREDDS as the name of the dataset. It therefore should be human readable and reasonable to display in a list of such names. The "title" attribute is recommended by the "NetCDF Users Guide" and the CF convention.

units Attribute

The "units" variable attribute gives the units of the data contained by that variable. The value of the "units" attribute should be a valid udunits string. Its use is highly recommended and its value will be used by THREDDS as the variable's units in the variable mapping. The "units" attribute is recommended by the "NetCDF Users Guide", the COARDS convention, and the CF convention.


Notes

  1. Since some datasets are made available from many sites, users may decide to not provide this information.
  2. Since this information may change over time, users may decide not to provide this information.
  3. Any information can be overridden at the THREDDS catalog level.
  4. Since data files are often archived, try using email address that will work for the long-term. Perhaps use an institutional email address like support@<institution> or data@<institution>

Changes



comments to Ethan Davis