NetCDF
4.8.1
|
NetCDF-4 allows some interoperability with HDF5.
The HDF5 Files produced by netCDF-4 are perfectly respectable HDF5 files, and can be read by any HDF5 application.
NetCDF-4 relies on several new features of HDF5, including dimension scales. The HDF5 dimension scales feature adds a bunch of attributes to the HDF5 file to keep track of the dimension information.
It is not just wrong, but wrong-headed, to modify these attributes except with the HDF5 dimension scale API. If you do so, then you will deserve what you get, which will be a mess.
Additionally, netCDF stores some extra information for dimensions without dimension scale information. (That is, a dimension without an associated coordinate variable). So HDF5 users should not write data to a netCDF-4 file which extends any unlimited dimension, or change any of the extra attributes used by netCDF to track dimension information.
Also there are some types allowed in HDF5, but not allowed in netCDF-4 (for example the time type). Using any such type in a netCDF-4 file will cause the file to become unreadable to netCDF-4. So don't do it.
NetCDF-4 ignores all HDF5 references. Can't make head nor tail of them. Also netCDF-4 assumes a strictly hierarchical group structure. No looping, you weirdo!
Attributes can be added (they must be one of the netCDF-4 types), modified, or even deleted, in HDF5.
Assuming a HDF5 file is written in accordance with the netCDF-4 rules (i.e. no strange types, no looping groups), and assuming that every dataset has a dimension scale attached to each dimension, the netCDF-4 API can be used to read and edit the file, quite easily.
In HDF5 (version 1.8.0 and later), dimension scales are (generally) 1D datasets, that hold dimension data. A multi-dimensional dataset can then attach a dimension scale to any or all of its dimensions. For example, a user might have 1D dimension scales for lat and lon, and a 2D dataset which has lat attached to the first dimension, and lon to the second.
If dimension scales are not used, then netCDF-4 can still edit the file, and will invent anonymous dimensions for each variable shape. This is done by iterating through the space of each dataset. As each space size is encountered, a phony dimension of that size is checked for. It it does not exist, a new phony dimension is created for that size. In this way, a HDF5 file with datasets that are using shared dimensions can be seen properly in netCDF-4. (There is no shared dimension in HDF5, but data users will freqently write many datasets with the same shape, and intend these to be shared dimensions.)
Starting with version 4.7.3, if a dataset is encountered with uses the same size for two or more of its dataspace lengths, then a new phony dimension will be created for each. That is, a dataset with size [100][100] will result in two phony dimensions, each of size 100.