NetCDF Users Guide
v1.1
While netCDF is intended for "self-documenting data", it is often necessary for data writers and readers to agree upon attribute conventions and representations for discipline-specific data structures. These agreements are written up as human-readable documents called netCDF conventions.
Use an existing Convention if possible. See the list of registered conventions.
The CF Conventions are recommended where applicable, especially for gridded (model) datasets.
Document the convention you are using by adding the global attribute "Conventions" to each netCDF file, for example:
This document refers to conventions for the netCDF classic data model. For recommendations about conventions for the netCDF-4 enhanced data model, see Developing Conventions for NetCDF-4.
A coordinate variable is a one-dimensional variable with the same name as a dimension, which names the coordinate values of the dimension. It must not have any missing data (for example, no _FillValue or missing_value attributes) and must be strictly monotonic (values increasing or decreasing). A two-dimensional variable of type char is a string-valued coordinate variable if it has the same name as its first dimension, e.g. char time( time, time_len); all of its strings must be unique. A variable's coordinate system is the set of coordinate variables used by the variable. Coordinates that refer to physical space are called spatial coordinates, ones that refer to physical time are called time coordinates, and ones that refer to either physical space or time are called spatio-temporal coordinates.
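The strict-monotonicity requirement can be checked before a coordinate variable is written. The following is a minimal sketch in plain Python; the function name is_strictly_monotonic is hypothetical and is not part of any netCDF library.

```python
from typing import Sequence


def is_strictly_monotonic(values: Sequence[float]) -> bool:
    """Return True if values are strictly increasing or strictly decreasing.

    Repeated values fail the check, because neighbours must differ.
    A sequence with fewer than two values passes trivially.
    """
    pairs = list(zip(values, values[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing
```

A writer could call this on the coordinate values and refuse to write the file if it returns False.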
Give each coordinate variable at least units and long_name attributes to document its meaning.
You may structure the data in a netCDF file in different ways, for example by putting related parameters into a single variable by adding an extra dimension. Standard visualization and analysis software may have trouble breaking that data out, however. At the other extreme, it is possible to create separate variables, e.g. for different vertical levels of the same parameter. However, standard visualization and analysis software may have trouble grouping that data back together. Here are some guidelines for deciding how to group your data into variables:
NetCDF-3 does not have a primitive String type, but does have arrays of type char, which are 8 bits in size. The main difference is that Strings are variable-length arrays of chars, while char arrays are fixed length. Software written in C usually depends on Strings being zero terminated, while software written in Fortran and Java does not. Both the C (nc_get_vara_text()) and Java (ArrayChar.getString()) libraries have convenience routines that read char arrays and convert them to Strings.
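To illustrate the difference between fixed-length char arrays and strings, the sketch below splits a flat buffer of chars into rows and converts each row to a string. It assumes rows are padded either with zero bytes (as C writers tend to do) or with trailing blanks (as Fortran writers tend to do); both function names are hypothetical, and no netCDF library is involved.

```python
from typing import List


def char_row_to_str(row: bytes) -> str:
    """Decode one fixed-length char row into a string.

    Assumption: the text ends at the first zero byte if there is one,
    and trailing blanks are padding rather than data.
    """
    text = row.split(b"\x00", 1)[0]
    return text.decode("ascii", errors="replace").rstrip(" ")


def char_array_to_strings(data: bytes, str_len: int) -> List[str]:
    """Split a flat buffer into rows of str_len chars and decode each row."""
    return [
        char_row_to_str(data[i:i + str_len])
        for i in range(0, len(data), str_len)
    ]
```

For a variable declared as char name(n, str_len), data would hold n * str_len bytes in row order.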
Time as a fundamental unit means a time interval, measured in seconds. A Calendar date/time is a specific instant in real, physical time. Dates are specified as an interval from some reference time, e.g. "days elapsed since Greenwich mean noon on 1 January 4713 BCE". The reference time implies a system of counting time called a calendar (e.g. Gregorian calendar) and a textual representation (e.g. ISO 8601).
There are two strategies for storing a date/time in a netCDF variable. One is to encode it as a numeric value and a unit that includes the reference time, e.g. "seconds since 2001-1-1 0:0:0" or "days since 2001-1-1 0:0:0". The other is to store it as a String using a standard encoding and Calendar. The former is more compact if you have more than one date, and makes it easier to compute intervals between two dates.
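The first strategy can be sketched with the Python standard library alone. This is an illustration, not the udunits implementation: the helper names are hypothetical, only a few unit names are handled, the reference time must be written as "Y-M-D h:m:s", and Python's datetime uses the proleptic Gregorian calendar throughout.

```python
from datetime import datetime, timedelta

# Seconds per supported unit; other unit names raise KeyError.
_SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def _parse_units(units: str):
    """Split '<unit> since <reference time>' into (seconds per unit, datetime)."""
    unit, _, reference = units.partition(" since ")
    ref = datetime.strptime(reference.strip(), "%Y-%m-%d %H:%M:%S")
    return _SECONDS_PER_UNIT[unit.strip()], ref


def decode_time(value: float, units: str) -> datetime:
    """Convert a numeric time value into a calendar date/time."""
    per_unit, ref = _parse_units(units)
    return ref + timedelta(seconds=value * per_unit)


def encode_time(when: datetime, units: str) -> float:
    """Convert a calendar date/time into a numeric value in the given units."""
    per_unit, ref = _parse_units(units)
    return (when - ref).total_seconds() / per_unit
```

Because both dates are stored as numbers relative to the same reference, the interval between them is a simple subtraction of the encoded values.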
Unidata's udunits package provides a convenient way to implement the first strategy. It uses the ISO 8601 encoding and a hybrid Gregorian/Julian calendar, but udunits does not support the use of other Calendars or encodings for the reference time. However, the ncdump "-T" option can display numeric times that use udunits (and optionally climate calendars) as ISO 8601 strings that are easy for humans to interpret.
NetCDF-3 does not have unsigned integer primitive types.
One convention is to add the variable attribute _Unsigned = "true" to indicate that integer data should be treated as unsigned.
Packed data is stored in a netCDF file by limiting precision and using a smaller data type than the original data, for example, packing double-precision (64-bit) values into short (16-bit) integers. The C-based netCDF libraries do not do the packing and unpacking. (The netCDF Java library will do automatic unpacking when the Variable Enhanced Interface is used. For details see EnhancedScaleMissing.)
Each variable with packed data has two attributes called scale_factor and add_offset, so that the packed data may be read and unpacked using the formula:
> unpacked_data_value = packed_data_value * scale_factor + add_offset
To avoid introducing a bias into the unpacked values due to truncation when packing, the data provider should round to the nearest integer rather than just truncating towards zero before writing the data:
> packed_data_value = nint((unpacked_data_value - add_offset) / scale_factor)
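The two formulas can be sketched in Python as follows. The helper names are hypothetical. Python's built-in round() rounds halves to the nearest even integer, so the sketch defines its own nint() that rounds halves away from zero, which is assumed here to be the intended meaning of nint.

```python
import math


def nint(x: float) -> int:
    """Round to the nearest integer, with halves rounded away from zero."""
    return int(math.floor(x + 0.5)) if x >= 0 else int(math.ceil(x - 0.5))


def pack(value: float, scale_factor: float, add_offset: float) -> int:
    """Packing formula: round rather than truncate to avoid a bias."""
    return nint((value - add_offset) / scale_factor)


def unpack(packed: int, scale_factor: float, add_offset: float) -> float:
    """Unpacking formula."""
    return packed * scale_factor + add_offset
```

With rounding, the value recovered by unpack() differs from the original by at most half of scale_factor; truncating instead would allow an error of almost a full scale_factor, always in the same direction.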
Depending on whether the packed data values are intended to be interpreted by the reader as signed or unsigned integers, there are alternative ways for the data provider to compute the scale_factor and add_offset attributes. In either case, the formulas above apply for unpacking and packing the data.
A conventional way to indicate whether a byte, short, or int variable is meant to be interpreted as unsigned, even for the netCDF-3 classic model that has no external unsigned integer type, is by providing the special variable attribute _Unsigned with value "true". However, most existing data for which packed values are intended to be interpreted as unsigned are stored without this attribute, so readers must be aware of packing assumptions in this case. In the enhanced netCDF-4 data model, packed integers may be declared to be of the appropriate unsigned type.
Let n be the number of bits in the packed type, and assume dataMin and dataMax are the minimum and maximum values that will be used for a variable to be packed.
If the packed values are intended to be interpreted as signed integers (the default assumption for classic model data), you may use:
> scale_factor = (dataMax - dataMin) / (2^n - 1)
> add_offset = dataMin + 2^(n-1) * scale_factor
If the packed values are intended to be interpreted as unsigned (for example, when read in the C interface using the nc_get_var_uchar() function), use:
> scale_factor = (dataMax - dataMin) / (2^n - 1)
> add_offset = dataMin
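As a worked illustration of the signed and unsigned formulas, the sketch below computes both attributes from the data range. The function name packing_params and its parameter names are hypothetical.

```python
from typing import Tuple


def packing_params(data_min: float, data_max: float,
                   n_bits: int, signed: bool = True) -> Tuple[float, float]:
    """Return (scale_factor, add_offset) using the full packed range.

    Signed:   packed values run from -2^(n-1) to 2^(n-1) - 1.
    Unsigned: packed values run from 0 to 2^n - 1.
    """
    scale_factor = (data_max - data_min) / (2 ** n_bits - 1)
    if signed:
        add_offset = data_min + 2 ** (n_bits - 1) * scale_factor
    else:
        add_offset = data_min
    return scale_factor, add_offset
```

For example, with dataMin = 0, dataMax = 255 and n = 8, the signed case gives scale_factor = 1 and add_offset = 128, so packed -128 unpacks to 0 and packed 127 unpacks to 255.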
In either the signed or unsigned case, an alternate formula may be used for the add_offset and scale_factor packing parameters that reserves a packed value for a special value, such as an indicator of missing data. For example, to reserve the minimum packed value (-2^(n-1)) for use as a special value in the case of signed packed values:
> scale_factor = (dataMax - dataMin) / (2^n - 2)
> add_offset = (dataMax + dataMin) / 2
If the packed values are unsigned, then the analogous formula that reserves 0 as the packed form of a special value would be:
> scale_factor = (dataMax - dataMin) / (2^n - 2)
> add_offset = dataMin - scale_factor
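The formulas that reserve one packed value can be sketched the same way. The function name packing_params_reserved is hypothetical; it reserves the minimum packed value in the signed case and 0 in the unsigned case, as in the text.

```python
from typing import Tuple


def packing_params_reserved(data_min: float, data_max: float,
                            n_bits: int, signed: bool = True) -> Tuple[float, float]:
    """Return (scale_factor, add_offset), leaving one packed value unused.

    Signed:   -2^(n-1) is reserved; data uses -(2^(n-1) - 1) to 2^(n-1) - 1.
    Unsigned: 0 is reserved; data uses 1 to 2^n - 1.
    """
    scale_factor = (data_max - data_min) / (2 ** n_bits - 2)
    if signed:
        add_offset = (data_max + data_min) / 2
    else:
        add_offset = data_min - scale_factor
    return scale_factor, add_offset
```

For example, with dataMin = 0, dataMax = 254 and n = 8, the unsigned case gives scale_factor = 1 and add_offset = -1, so packed 1 unpacks to 0 and packed 255 unpacks to 254, leaving 0 free for the special value.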
variables:
    short data( z, y, x);
        data:scale_factor = 34.02f;
        data:add_offset = 1.54f;
The units attribute applies to unpacked values.
Missing data is a general name for data values that are invalid, never written, or missing. The netCDF library itself does not handle these values in any special way, except that the value of a _FillValue attribute, if any, is used in pre-filling unwritten data. (The Java-netCDF library will assist in recognizing these values when reading; see class VariableStandardized.)
The _FillValue attribute should have the same data type as the variable it describes. If the variable is packed using scale_factor and add_offset attributes, the _FillValue attribute should have the data type of the packed data.
variables:
    float data( z, y, x);
        data:valid_range = -999.0f, 999.0f;
If the variable is packed using scale_factor and add_offset attributes, the valid_range attribute should have the data type of the packed data.
If the variable is unsigned, the valid_range values should be widened if needed and stored as unsigned integers.
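Since the netCDF library does not interpret these attributes when reading, a reader has to apply them itself. The sketch below shows one way this might look, assuming that a value equal to _FillValue, or outside valid_range, counts as missing; the function name is_missing is hypothetical. For packed variables the test would be applied to the packed value, before unpacking, because both attributes have the packed data type.

```python
from typing import Optional, Tuple


def is_missing(value: float,
               fill_value: Optional[float] = None,
               valid_range: Optional[Tuple[float, float]] = None) -> bool:
    """Return True if value should be treated as missing.

    Assumption: fill_value and valid_range hold the variable's
    _FillValue and valid_range attributes, or None when absent.
    """
    if fill_value is not None and value == fill_value:
        return True
    if valid_range is not None:
        low, high = valid_range
        return not (low <= value <= high)
    return False
```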
See nc__create and nc__enddef (nf__create and nf__enddef for Fortran) for more details on reserving extra space in the header.
There are 3 correct spellings of "netCDF":
#include <netcdf.h>