The netCDF-Java library can read datasets from a variety of sources. Each dataset is named using a Uniform Resource Locator (URL). This page summarizes how the netCDF-Java API uses URLs.

Special Note: When working with remote data services, note that not all servers handle encoded URLs. By default, netCDF-Java encodes illegal URI characters using percent encoding (e.g. [ becomes %5B). If you have trouble accessing a remote dataset because of the encoding, set the Java system property httpservices.urlencode to "false", for example: System.setProperty("httpservices.urlencode", "false");
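To make the effect of percent encoding concrete, here is a minimal sketch that encodes characters which are not legal in a URI and shows where the system property would be set. The class name PercentEncodingSketch and the method encodeIllegal are hypothetical, and the encoder is an illustration built on the Java standard library only. It is not the encoder netCDF-Java uses internally, and the set of characters it leaves untouched is an assumption.

```java
import java.nio.charset.StandardCharsets;

class PercentEncodingSketch {
  // Assumption: besides letters and digits, these characters are left as-is.
  private static final String KEEP = "-._~:/?#@!$&'()*+,;=%";

  /** Percent-encodes every byte that is not a letter, a digit, or listed in KEEP. */
  static String encodeIllegal(String url) {
    StringBuilder out = new StringBuilder();
    for (byte b : url.getBytes(StandardCharsets.UTF_8)) {
      char c = (char) (b & 0xFF);
      boolean alnum = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
      if (alnum || KEEP.indexOf(c) >= 0) {
        out.append(c);
      } else {
        out.append(String.format("%%%02X", b & 0xFF));
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // Turn off netCDF-Java's own encoding, as described above.
    System.setProperty("httpservices.urlencode", "false");

    // Hypothetical URL with a bracketed constraint expression.
    System.out.println(encodeIllegal("http://server/dodsC/data.nc?temp[0:1:10]"));
    // prints http://server/dodsC/data.nc?temp%5B0:1:10%5D
  }
}
```

A server that does not decode %5B and %5D would see a different query than the one intended, which is the situation the property is meant to work around.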

ucar.nc2.NetcdfFile.open(String location)

Local Files

NetcdfFile can work with local files, for example:

  • /usr/share/data/model.nc
  • file:/usr/share/data/model.nc
  • file:C:/share/data/model.nc (NOTE we advise using forward slashes everywhere, including Windows)
  • data/model.nc (relative to the current working directory)

When using a file location that has an embedded : character, e.g. C:/share/data/model.nc, it's a good idea to add the file: prefix, to prevent the C: from being misinterpreted as a URL scheme.
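The ambiguity can be seen with the standard java.net.URI parser, used here only as a stand-in for a generic URL parser (netCDF-Java's own location handling may differ). The class name SchemeSketch is hypothetical.

```java
import java.net.URI;

class SchemeSketch {
  /** Returns the scheme a generic URI parser sees, or "(none)" when there is no scheme. */
  static String schemeOf(String location) {
    String scheme = URI.create(location).getScheme();
    return scheme == null ? "(none)" : scheme;
  }

  public static void main(String[] args) {
    System.out.println(schemeOf("/usr/share/data/model.nc"));    // prints (none)
    System.out.println(schemeOf("C:/share/data/model.nc"));      // prints C
    System.out.println(schemeOf("file:C:/share/data/model.nc")); // prints file
  }
}
```

Without the file: prefix, the drive letter is parsed as a one-letter scheme, which is the misinterpretation described above.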

Remote Files

HTTP

NetcdfFile can open remote files served over HTTP, for example:

  • http://www.unidata.ucar.edu/software/netcdf-java/testdata/mydata1.nc

The HTTP server must support byte-range requests (the HTTP Range header). Performance will be strongly affected by file format and the data access pattern.

To disambiguate HTTP remote files from OPeNDAP or other URLs, you can use httpserver: instead of http:, e.g.:

  • httpserver://www.unidata.ucar.edu/software/netcdf-java/testdata/mydata1.nc
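As a rough illustration of what a byte-range read over HTTP involves, the sketch below builds (but does not send) a request for part of a remote file using the JDK's java.net.http API. The class and method names are hypothetical, and mapping httpserver: back to http: before making the request is an assumption about how such a location would be handled, not a description of netCDF-Java internals.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class RangeRequestSketch {
  /** Assumption: an httpserver: location is fetched over plain http:. */
  static String toHttpUrl(String location) {
    String prefix = "httpserver:";
    return location.startsWith(prefix) ? "http:" + location.substring(prefix.length()) : location;
  }

  /** Range header value for `length` bytes starting at `start` (the end offset is inclusive). */
  static String rangeHeader(long start, long length) {
    return "bytes=" + start + "-" + (start + length - 1);
  }

  /** Builds a GET request for one byte range; nothing is sent over the network here. */
  static HttpRequest buildRequest(String location, long start, long length) {
    return HttpRequest.newBuilder(URI.create(toHttpUrl(location)))
        .header("Range", rangeHeader(start, length))
        .GET()
        .build();
  }

  public static void main(String[] args) {
    HttpRequest req = buildRequest(
        "httpserver://www.unidata.ucar.edu/software/netcdf-java/testdata/mydata1.nc", 0, 1024);
    System.out.println(req.uri());
    // prints http://www.unidata.ucar.edu/software/netcdf-java/testdata/mydata1.nc
    System.out.println(req.headers().firstValue("Range").orElse("(none)"));
    // prints bytes=0-1023
  }
}
```

Each read of a variable can turn into one or more requests like this, which is why the file format and access pattern have such a strong effect on performance.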

AWS S3

NetcdfFiles and NetcdfDatasets can open files stored as a single object on S3 using the AWS RESTful API with byte-range requests, similar to HTTP. This new functionality is not available in the now deprecated NetcdfFile and NetcdfDataset open methods. You will also need to include the cdm-s3 artifact in your build, because it is currently not part of netcdfAll.jar. To disambiguate S3 files from other URLs, you must use the following URI pattern:

  • s3://bucket/key
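To show how the two parts of this pattern relate to a concrete location, here is a small sketch that splits an S3 URI into its bucket and key with java.net.URI. The class name S3UriSketch, the method bucketAndKey, and the object key in main are hypothetical; this is not how the library itself parses the URI.

```java
import java.net.URI;

class S3UriSketch {
  /** Splits s3://bucket/key into {bucket, key}. */
  static String[] bucketAndKey(String s3uri) {
    URI uri = URI.create(s3uri);
    if (!"s3".equals(uri.getScheme())) {
      throw new IllegalArgumentException("not an s3: URI: " + s3uri);
    }
    String bucket = uri.getAuthority();
    String path = uri.getPath(); // the key with a leading slash
    String key = path.startsWith("/") ? path.substring(1) : path;
    return new String[] {bucket, key};
  }

  public static void main(String[] args) {
    // Placeholder key, for illustration only.
    String[] parts = bucketAndKey("s3://noaa-goes16/ABI-L1b-RadC/2019/363/21/example.nc");
    System.out.println(parts[0]); // prints noaa-goes16
    System.out.println(parts[1]); // prints ABI-L1b-RadC/2019/363/21/example.nc
  }
}
```

Note that the key may itself contain slashes; only the first path segment boundary separates the bucket from the key.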

In addition to knowing the bucket and key, you will need to specify the region in which the bucket you are trying to access is located, and potentially credentials. However, these are not specified through the S3 URI. The netCDF-Java S3RandomAccessFile class uses the Amazon S3 SDK library, which provides a few ways to specify both. We use the Default Region Provider Chain and the Default Credential Provider Chain. As a last resort, we try the AnonymousCredentialsProvider, which requires no configuration on your part.

As an example, if we would like to open a GOES 16 data file from the NOAA Big Data project’s AWS S3 bucket in the US East 1 region (open access), we could do the following:

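import software.amazon.awssdk.regions.Region;
import ucar.nc2.NetcdfFile;
import ucar.nc2.NetcdfFiles;

// Region, bucket, and key of the object to open
String region = Region.US_EAST_1.toString();
String bucket = "noaa-goes16";
String key = "ABI-L1b-RadC/2019/363/21/OR_ABI-L1b-RadC-M6C16_G16_s20193632101189_e20193632103574_c20193632104070.nc";
String s3uri = "s3://" + bucket + "/" + key;

// The region is not part of the URI, so supply it through the aws.region system property
System.setProperty("aws.region", region);
try (NetcdfFile ncfile = NetcdfFiles.open(s3uri)) {
  ...
}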

File Types

The local or remote file must be in one of the formats that the netCDF-Java library can read. We call this set of files Common Data Model files, or CDM files for short, to make clear that the netCDF-Java library is not limited to netCDF files.

If the URL ends with .Z, .zip, .gzip, .gz, or .bz2, the file is assumed to be compressed. The netCDF-Java library will uncompress/unzip it and write a new file without the suffix, then read from the uncompressed file. Generally it prefers to place the uncompressed file in the same directory as the original file. If it does not have write permission on that directory, it will use the cache directory defined by ucar.nc2.util.DiskCache.
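The naming rule can be sketched as follows. This only illustrates how the uncompressed file name follows from the suffix; it does not uncompress anything and is not the library's implementation. The class name CompressedSuffixSketch is hypothetical, and treating the suffixes as case-sensitive is an assumption.

```java
import java.util.List;
import java.util.Optional;

class CompressedSuffixSketch {
  // Suffixes listed above; matched case-sensitively in this sketch.
  private static final List<String> SUFFIXES = List.of(".Z", ".zip", ".gzip", ".gz", ".bz2");

  /** Returns the location without its compression suffix, or empty if it has none. */
  static Optional<String> uncompressedName(String location) {
    for (String suffix : SUFFIXES) {
      if (location.endsWith(suffix)) {
        return Optional.of(location.substring(0, location.length() - suffix.length()));
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    System.out.println(uncompressedName("/usr/share/data/model.nc.gz").orElse("(not compressed)"));
    // prints /usr/share/data/model.nc
    System.out.println(uncompressedName("/usr/share/data/model.nc").orElse("(not compressed)"));
    // prints (not compressed)
  }
}
```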

ucar.nc2.dataset.NetcdfDataset.openFile(String location)

NetcdfDataset adds another layer of functionality to the CDM data model, handling other protocols and optionally enhancing the dataset with Coordinate System information, scale/offset processing, dataset caching, etc.

  • openFile() can open the same datasets as NetcdfFile, plus those listed below.
  • openDataset() calls NetcdfDataset.openFile(), then optionally enhances the dataset.
  • acquireDataset() allows dataset objects to be cached in memory for performance.

OPeNDAP datasets

NetcdfDataset can open OPeNDAP datasets, which use a dods: or http: prefix, for example:

  • http://thredds.ucar.edu/thredds/dodsC/fmrc/NCEP/GFS/CONUS_95km/files/GFS_CONUS_95km_20070319_0600.grib1
  • dods://thredds.ucar.edu/thredds/models/NCEP/GFS/Global_5x2p5deg/GFS_Global_5x2p5deg_20070313_1200.nc

To avoid confusion with remote HTTP files, OPeNDAP URLs may use the dods: prefix. Also note that when passing an OPeNDAP dataset URL to the netCDF-Java library, do not include any of the access suffixes, e.g. .dods, .ascii, .dds, etc.

For an http: URL, we make a HEAD request, and if it succeeds and returns a header with Content-Description="dods-dds" or "dods_dds", then we open as OPeNDAP. If it fails we try opening as an HTTP remote file. Using the dods: prefix makes it clear which protocol to use.
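The decision described above can be written down as a small pure function. In this sketch the result of the HEAD request is passed in as the value of the Content-Description header (null when the request failed or the header is absent), so no network access is needed. The class, enum, and method names are hypothetical, and the suffix list covers only the three suffixes named above.

```java
import java.util.List;

class OpendapSketch {
  enum Protocol { OPENDAP, HTTP_FILE }

  // Only the suffixes named in the text; real servers may offer more.
  private static final List<String> ACCESS_SUFFIXES = List.of(".dods", ".ascii", ".dds");

  /** Removes a trailing OPeNDAP access suffix, if present. */
  static String stripAccessSuffix(String url) {
    for (String suffix : ACCESS_SUFFIXES) {
      if (url.endsWith(suffix)) {
        return url.substring(0, url.length() - suffix.length());
      }
    }
    return url;
  }

  /** Chooses a protocol from the location prefix and the HEAD response header. */
  static Protocol choose(String location, String contentDescription) {
    if (location.startsWith("dods:")) {
      return Protocol.OPENDAP; // explicit prefix, no probing needed
    }
    boolean isDds = "dods-dds".equals(contentDescription) || "dods_dds".equals(contentDescription);
    return isDds ? Protocol.OPENDAP : Protocol.HTTP_FILE;
  }

  public static void main(String[] args) {
    System.out.println(choose("dods://server/thredds/dodsC/data.nc", null));         // prints OPENDAP
    System.out.println(choose("http://server/thredds/dodsC/data.nc", "dods-dds"));   // prints OPENDAP
    System.out.println(choose("http://server/files/data.nc", null));                 // prints HTTP_FILE
    System.out.println(stripAccessSuffix("http://server/thredds/dodsC/data.nc.dds"));
    // prints http://server/thredds/dodsC/data.nc
  }
}
```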

NcML datasets

NetcdfDataset can open NcML datasets, which may be local or remote, and must end with a .xml or .ncml suffix, for example:

  • /usr/share/data/model.ncml
  • file:/usr/share/data/model.ncml
  • https://www.unidata.ucar.edu/software/netcdf-java/testdata/mydata1.xml

Because XML is so widely used, we recommend using the .ncml suffix when possible.

THREDDS Datasets

NetcdfDataset can open THREDDS datasets, which are contained in THREDDS Catalogs. The general form is:

thredds:catalogURL#dataset_id

where catalogURL is the URL of a THREDDS catalog, and dataset_id is the ID of a dataset inside of that catalog. The thredds: prefix ensures that it is understood as a THREDDS dataset. Examples:

  • thredds:http://localhost:8080/test/addeStationDataset.xml#surfaceHourly
  • thredds:file:c:/dev/netcdf-java-2.2/test/data/catalog/addeStationDataset.xml#AddeSurfaceData

In the first case, http://localhost:8080/test/addeStationDataset.xml must be a catalog containing a dataset with ID surfaceHourly. The second case will open a catalog located at c:/dev/netcdf-java-2.2/test/data/catalog/addeStationDataset.xml and find the dataset with ID AddeSurfaceData.
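The structure of such a location can be illustrated by splitting it into its two parts. This is a minimal sketch with hypothetical class and method names; it handles only the thredds:catalogURL#dataset_id form and is not the parsing code used by NetcdfDataset.

```java
class ThreddsLocationSketch {
  /** Splits thredds:catalogURL#dataset_id into {catalogURL, dataset_id}. */
  static String[] split(String location) {
    String prefix = "thredds:";
    if (!location.startsWith(prefix)) {
      throw new IllegalArgumentException("missing thredds: prefix: " + location);
    }
    String rest = location.substring(prefix.length());
    int hash = rest.indexOf('#');
    if (hash < 0) {
      throw new IllegalArgumentException("missing #dataset_id: " + location);
    }
    return new String[] {rest.substring(0, hash), rest.substring(hash + 1)};
  }

  public static void main(String[] args) {
    String[] parts = split("thredds:http://localhost:8080/test/addeStationDataset.xml#surfaceHourly");
    System.out.println(parts[0]); // prints http://localhost:8080/test/addeStationDataset.xml
    System.out.println(parts[1]); // prints surfaceHourly
  }
}
```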

NetcdfDataset will examine the THREDDS dataset object, extract the dataset URL, open it, and return a NetcdfDataset. If there is more than one dataset access URL, it will choose a service that it understands. You can modify the preferred services by calling thredds.client.catalog.tools.DataFactory.setPreferAccess(). The dataset metadata in the THREDDS catalog may be used to augment the metadata of the NetcdfDataset.

THREDDS Resolver Datasets

NetcdfDataset can open THREDDS Resolver datasets, which have the form

thredds:resolve:resolverURL

The resolverURL must return a catalog with a single top level dataset, which is the target dataset. For example:

thredds:resolve:https://thredds.ucar.edu/thredds/catalog/grib/NCEP/GFS/Global_0p25deg/latest.xml

In this case, https://thredds.ucar.edu/thredds/catalog/grib/NCEP/GFS/Global_0p25deg/latest.xml returns a catalog containing the latest dataset in the grib/NCEP/GFS/Global_0p25deg collection. NetcdfDataset will read the catalog, extract the THREDDS dataset, and open it as described in the THREDDS Datasets section above.

CdmRemote Datasets

NetcdfDataset can open CDM Remote datasets, with the form

cdmremote:cdmRemoteURL

for example

  • cdmremote:http://server:8080/thredds/cdmremote/data.nc

The cdmRemoteURL must be an endpoint for a cdmremote web service, which provides index subsetting on remote CDM datasets.

DAP4 datasets

NetcdfDataset can open datasets through the DAP4 protocol. The URL should begin with either dap4: or dap4:http:. Examples:

  • dap4:http://thredds.ucar.edu:8080/thredds/fmrc/NCEP/GFS/CONUS_95km/files/GFS_CONUS_95km_20070319_0600.grib1
  • dap4://thredds.ucar.edu:8080/thredds/models/NCEP/GFS/Global_5x2p5deg/GFS_Global_5x2p5deg_20070313_1200.nc

To avoid confusion with other protocols using HTTP URLs, DAP4 URLs are often converted to use the dap4: prefix. Also note that when passing a DAP4 dataset URL to the netCDF-Java library, do not include any of the access suffixes, e.g. .dmr, .dap, .dst, etc.
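The prefixes and suffixes that NetcdfDataset recognizes, as described in the sections above, can be summarized in a small classifier. This is a sketch only: the class, enum, and method names are hypothetical, and the order in which the checks are applied is an assumption rather than the library's actual dispatch logic. Plain http: locations fall into OTHER here because, as explained above, they need a HEAD request to tell OPeNDAP from a remote file.

```java
class LocationKindSketch {
  enum Kind { THREDDS_RESOLVER, THREDDS, CDMREMOTE, DAP4, OPENDAP, NCML, OTHER }

  /** Classifies a dataset location by its prefix or suffix. */
  static Kind kindOf(String location) {
    // thredds:resolve: must be tested before the shorter thredds: prefix.
    if (location.startsWith("thredds:resolve:")) return Kind.THREDDS_RESOLVER;
    if (location.startsWith("thredds:")) return Kind.THREDDS;
    if (location.startsWith("cdmremote:")) return Kind.CDMREMOTE;
    if (location.startsWith("dap4:")) return Kind.DAP4;
    if (location.startsWith("dods:")) return Kind.OPENDAP;
    if (location.endsWith(".ncml") || location.endsWith(".xml")) return Kind.NCML;
    return Kind.OTHER;
  }

  public static void main(String[] args) {
    System.out.println(kindOf("cdmremote:http://server:8080/thredds/cdmremote/data.nc")); // prints CDMREMOTE
    System.out.println(kindOf("file:/usr/share/data/model.ncml"));                        // prints NCML
    System.out.println(kindOf("/usr/share/data/model.nc"));                               // prints OTHER
  }
}
```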

ucar.nc2.ft.FeatureDatasetFactoryManager.open()

FeatureDatasetFactory creates Feature Datasets for Coverages (Grids), Discrete Sampling Geometry (Point) Datasets, Radial Datasets, etc. These may be based on local files, or they may use remote access protocols.

FeatureDatasetFactoryManager can open the same URLs that NetcdfDataset and NetcdfFile can open, plus the following:

CdmrFeature Datasets

FeatureDatasetFactoryManager can open CdmRemote Feature Datasets, which have the form

cdmrFeature:cdmrFeatureURL

for example:

  • cdmrFeature:http://server:8080/thredds/cdmremote/data.nc

The cdmrFeatureURL must be an endpoint for a cdmrFeature web service, which provides coordinate subsetting on remote Feature Type datasets.

THREDDS Datasets

FeatureDatasetFactoryManager can also open CdmRemote Feature Datasets, by passing in a dataset ID in a catalog, exactly as in NetcdfDataset.open as explained above. The general form is

thredds:catalogURL#dataset_id

where catalogURL is the URL of a THREDDS catalog, and dataset_id is the ID of a dataset inside of that catalog. The thredds: prefix ensures that the URL is understood as a THREDDS catalog and dataset. Example:

  • thredds:http://localhost:8081/thredds/catalog/grib.v5/gfs_2p5deg/catalog.html#grib.v5/gfs_2p5deg/TwoD

If the dataset has a cdmrFeature service, the FeatureDataset will be opened through that service. This can be more efficient than opening the dataset through the index-based services like OPeNDAP and cdmremote.

Collection Datasets

FeatureDatasetFactoryManager can open collections of datasets specified with a collection specification string. This has the form

collection:spec

FeatureDatasetFactoryManager calls CompositeDatasetFactory.factory(wantFeatureType, spec) if found, which returns a FeatureDataset. Currently only a limited number of Point Feature types are supported. This is an experimental feature.

NcML referenced datasets

NcML datasets typically reference other CDM datasets, using the location attribute of the netcdf element, for example:

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
     location="file:/dev/netcdf-java-2.2/test/data/example1.nc">
...

The location is passed to ucar.nc2.dataset.NetcdfDataset.openFile(), and so can be any valid CDM dataset location. In addition, an NcML referenced dataset location can be relative to the NcML file or the working directory:

  • A relative URL resolved against the NcML location (e.g. subdir/mydata.nc). You must not use a file: prefix in this case.
  • An absolute file URL with a relative path (e.g. file:data/mine.nc). The file will be opened relative to the working directory.

There are a few subtle differences between using a location in NcML and passing a location to the NetcdfDataset.openFile() and related methods:

  • In NcML, you MUST always use forward slashes in your paths, even when on a Windows machine. For example: file:C:/data/mine.nc. NetcdfFile.open() will accept backslashes on a Windows machine.
  • In NcML, a relative URL is resolved against the NcML location. In NetcdfFile.open(), it is interpreted as relative to the working directory.
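The two kinds of resolution can be compared with standard library calls. This sketch uses java.net.URI.resolve to mimic resolution against the NcML location and java.nio.file.Paths for resolution against the working directory. The class and method names are hypothetical, and netCDF-Java's own resolution code may differ in detail.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

class NcmlLocationSketch {
  /** Resolves a relative reference against the location of the NcML document. */
  static String resolveAgainstNcml(String ncmlLocation, String referenced) {
    return URI.create(ncmlLocation).resolve(referenced).toString();
  }

  /** Resolves a relative path against the current working directory. */
  static Path resolveAgainstWorkingDir(String relativePath) {
    return Paths.get(relativePath).toAbsolutePath().normalize();
  }

  public static void main(String[] args) {
    System.out.println(resolveAgainstNcml("file:/usr/share/data/model.ncml", "subdir/mydata.nc"));
    // prints file:/usr/share/data/subdir/mydata.nc

    // The result depends on where the program is started, so no output is shown.
    System.out.println(resolveAgainstWorkingDir("data/mine.nc"));
  }
}
```

The same relative string therefore points at different files depending on whether it appears inside an NcML document or is passed directly to an open method.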

NcML scan location

NcML aggregation scan elements use the location attribute to specify which directory to find files in, for example:

<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="time" type="joinExisting">
    <scan location="/data/model/" suffix=".nc" />
  </aggregation>
</netcdf>

Allowable forms of the location for the scan directory are:

  • /usr/share/data/
  • file:/usr/share/data/
  • file:C:/share/data/ (NOTE we advise using forward slashes everywhere, including Windows)
  • data/model/ (relative to the NcML directory)
  • file:data/model/ (relative to the current working directory)

When using a directory location that has an embedded : character, e.g. C:/share/data/, it's a good idea to add the file: prefix, to prevent the C: from being misinterpreted as a URI scheme.

Note that this is a common mistake:

<scan location="D:\work\agg" suffix=".nc" />

On a Windows machine, this will try to scan D:/work/agg/D:/work/agg. Use

<scan location="D:/work/agg" suffix=".nc" />

or, better,

<scan location="file:D:/work/agg" suffix=".nc" />
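If scan locations are generated programmatically, a small helper can apply both recommendations before the value is written into the NcML. This is a minimal sketch with hypothetical class and method names; the drive-letter test is a simple heuristic, not something netCDF-Java provides.

```java
class ScanLocationSketch {
  /** Converts backslashes to forward slashes and prefixes drive-letter paths with file:. */
  static String normalize(String location) {
    String forward = location.replace('\\', '/');
    boolean hasDriveLetter = forward.length() >= 2
        && Character.isLetter(forward.charAt(0))
        && forward.charAt(1) == ':';
    return hasDriveLetter ? "file:" + forward : forward;
  }

  public static void main(String[] args) {
    System.out.println(normalize("D:\\work\\agg"));      // prints file:D:/work/agg
    System.out.println(normalize("/usr/share/data/"));   // prints /usr/share/data/
    System.out.println(normalize("file:D:/work/agg"));   // prints file:D:/work/agg
  }
}
```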