Dataset URLs

The netCDF-Java library can read datasets from a variety of sources. Each dataset is named using a Uniform Resource Locator (URL). This page summarizes how the netCDF-Java API uses URLs.

Note: Not all remote data servers handle encoded URLs. By default, netCDF-Java encodes illegal URI characters using percent encoding (e.g. [ becomes %5B). If you have trouble accessing a remote dataset because of the encoding, set the Java system property httpservices.urlencode to "false", for example with System.setProperty("httpservices.urlencode", "false");
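As a rough illustration of the note above, the following sketch shows what percent encoding does to characters such as [ and ], and where the system property could be set before any dataset is opened. The class UrlEncodingNote, the helper percentEncode, the set of characters it treats as illegal, and the sample URL are all assumptions made for this example; the helper only mimics the effect and is not the library's own encoder.

```java
final class UrlEncodingNote {

    // Characters treated as illegal in this sketch (ASCII only); an assumption,
    // not the exact set the library uses.
    private static final String ILLEGAL = "[]{}|\\^` \"<>";

    /** Replaces each character in ILLEGAL with '%' followed by its two-digit hex code. */
    static String percentEncode(String url) {
        StringBuilder sb = new StringBuilder();
        for (char c : url.toCharArray()) {
            if (ILLEGAL.indexOf(c) >= 0) {
                sb.append(String.format("%%%02X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Switch off the default encoding, using the property named in the note.
        System.setProperty("httpservices.urlencode", "false");

        // Hypothetical URL with a bracketed constraint, encoded by hand for comparison.
        System.out.println(percentEncode("http://server/data.nc?var[0:1]"));
    }
}
```

Setting the property early, before the first remote dataset is opened, is the cautious choice, since the point at which the library reads the property is not stated here.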

ucar.nc2.NetcdfFile.open(String location)

1. Local Files

NetcdfFile can work with local files, e.g:

When using a file location that has an embedded ':' character, e.g. C:/share/data/model.nc, it's a good idea to add the file: prefix, to prevent the 'C:' from being misinterpreted as a URL scheme.
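To see why the drive letter is a problem, the sketch below parses the location with the standard java.net.URI class, which reads everything before the first ':' as the scheme. The class LocalFileLocation and the helper withFilePrefix are hypothetical names for this illustration and are not part of netCDF-Java.

```java
import java.net.URI;

final class LocalFileLocation {

    /** Returns the scheme a generic URI parser sees in the location. */
    static String schemeOf(String location) {
        return URI.create(location).getScheme();
    }

    /** Adds "file:" when the location starts with a drive letter such as "C:/". */
    static String withFilePrefix(String location) {
        if (location.matches("^[A-Za-z]:[/\\\\].*")) {
            return "file:" + location;
        }
        return location;
    }

    public static void main(String[] args) {
        String raw = "C:/share/data/model.nc";
        // Without the prefix, the drive letter is taken for the scheme.
        System.out.println(schemeOf(raw));
        // With the prefix, the scheme is unambiguous.
        String prefixed = withFilePrefix(raw);
        System.out.println(prefixed + " has scheme " + schemeOf(prefixed));
    }
}
```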

2. HTTP Remote Files

NetcdfFile can open remote files served over HTTP, for example:

The HTTP server must support byte-range requests (the HTTP Range header). Performance will be strongly affected by file format and the data access pattern.

The local or remote file must be one of the formats that the netCDF-Java library can read. We call this set of files Common Data Model files (or CDM files for short) to make clear that the netCDF-Java library is not limited to netCDF files.

If the URL ends with ".Z", ".zip", ".gzip", ".gz", or ".bz2", the file is assumed to be compressed. The netCDF-Java library will uncompress/unzip and write a new file without the suffix, then read from the uncompressed file. Generally it prefers to place the uncompressed file in the same directory as the original file. If it does not have write permission on that directory, it will use the cache directory defined by ucar.nc2.util.DiskCache.
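The suffix rule can be sketched as a small helper that reports the name the uncompressed file would get. This is only an illustration of the naming convention described above, not the library's implementation; the class CompressedSuffix, the method uncompressedName, and the sample paths are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class CompressedSuffix {

    // The suffixes listed in the text above.
    private static final List<String> SUFFIXES =
            Arrays.asList(".Z", ".zip", ".gzip", ".gz", ".bz2");

    /** Returns the location without its compression suffix, or null if it has none. */
    static String uncompressedName(String location) {
        for (String suffix : SUFFIXES) {
            if (location.endsWith(suffix)) {
                return location.substring(0, location.length() - suffix.length());
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(uncompressedName("/data/model.nc.gz"));
        System.out.println(uncompressedName("/data/model.nc"));
    }
}
```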

 

ucar.nc2.dataset.NetcdfDataset.openFile(String location)

NetcdfDataset.openDataset() simply calls NetcdfDataset.openFile(), then optionally enhances the dataset.

NetcdfDataset.openFile can open the same URLs that NetcdfFile can open, plus the following:

1. OPeNDAP datasets

NetcdfDataset can open OPeNDAP datasets, which use a dods: or http: prefix, for example:

To avoid confusion with remote HTTP files, OPeNDAP URLs are often converted to use the dods: prefix. Also note that when passing an OPeNDAP dataset URL to the netCDF-Java library, do not include any of the access suffixes, e.g. .dods, .ascii, .dds, etc.
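A minimal sketch of that cleanup follows: it drops a trailing access suffix and swaps the http: prefix for dods:. The class OpendapLocation, the method toDodsLocation, and the server URL are assumptions for illustration. Only the three suffixes named above are handled, and a dataset whose real name happens to end in one of them would be shortened incorrectly.

```java
final class OpendapLocation {

    // Access suffixes named in the text; the list there ends with "etc.",
    // so this set is deliberately incomplete.
    private static final String[] ACCESS_SUFFIXES = {".dods", ".ascii", ".dds"};

    /** Strips one trailing access suffix and rewrites "http:" to "dods:". */
    static String toDodsLocation(String url) {
        String result = url;
        for (String suffix : ACCESS_SUFFIXES) {
            if (result.endsWith(suffix)) {
                result = result.substring(0, result.length() - suffix.length());
                break;
            }
        }
        if (result.startsWith("http:")) {
            result = "dods:" + result.substring("http:".length());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical server and path.
        System.out.println(toDodsLocation("http://server/thredds/dodsC/data.nc.dds"));
    }
}
```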

There's an ambiguity as to whether http://server/something is an OPeNDAP dataset or an HTTP remote file accessed with range requests. There will be more ambiguities in the future, as other HTTP-based protocols are added. Currently we do a HEAD request on http://server/something.dds; if it succeeds and returns a header Content-Description="dods-dds" or "dods_dds", we open the location as OPeNDAP. If it fails, we try opening it as an HTTP file.
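The decision rule can be written down without any networking, which keeps the logic easy to follow. In this sketch the caller is assumed to have already issued the HEAD request and passes in whether it succeeded and what the Content-Description header contained. The class OpendapProbe and both method names are hypothetical, and exact, case-sensitive matching of the header value is an assumption.

```java
final class OpendapProbe {

    /** Builds the URL that the HEAD request is sent to. */
    static String ddsProbeUrl(String location) {
        return location + ".dds";
    }

    /**
     * Returns true when the probe result indicates an OPeNDAP dataset:
     * the HEAD request succeeded and the header has one of the two known values.
     */
    static boolean isOpendapResponse(boolean headSucceeded, String contentDescription) {
        return headSucceeded
                && ("dods-dds".equals(contentDescription)
                    || "dods_dds".equals(contentDescription));
    }

    public static void main(String[] args) {
        System.out.println(ddsProbeUrl("http://server/something"));
        System.out.println(isOpendapResponse(true, "dods-dds"));  // treat as OPeNDAP
        System.out.println(isOpendapResponse(false, null));       // fall back to HTTP file
    }
}
```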

2. NcML datasets

NetcdfDataset can open NcML datasets, which may be local or remote, and must end with a .xml or .ncml suffix, for example:

Because the .xml suffix is so widely used, we recommend using the .ncml suffix when possible.

3. THREDDS Datasets

NetcdfDataset can open THREDDS datasets, which are contained in THREDDS Catalogs. The general form is thredds:catalogURL#dataset_id, where catalogURL is the URL of a THREDDS catalog, and dataset_id is the ID of a dataset inside that catalog. The thredds: prefix ensures that it is understood as a THREDDS dataset. Example:

In the first case, http://localhost:8080/test/addeStationDataset.xml must be a catalog containing a dataset with ID surfaceHourly. The second case will open a catalog stored at c:/dev/netcdf-java-2.2/test/data/catalog/addeStationDataset.xml and look for a dataset with ID AddeSurfaceData.
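The thredds:catalogURL#dataset_id form can be taken apart with a few string operations. The sketch below is an illustration only: the class ThreddsLocation and its parse method are hypothetical, and the sample location is assembled from the general form together with the catalog URL and dataset ID of the first case described above.

```java
final class ThreddsLocation {

    private static final String PREFIX = "thredds:";

    final String catalogUrl;
    final String datasetId;

    private ThreddsLocation(String catalogUrl, String datasetId) {
        this.catalogUrl = catalogUrl;
        this.datasetId = datasetId;
    }

    /** Splits "thredds:catalogURL#dataset_id" into its catalog URL and dataset ID. */
    static ThreddsLocation parse(String location) {
        if (!location.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Missing thredds: prefix: " + location);
        }
        String rest = location.substring(PREFIX.length());
        int hash = rest.lastIndexOf('#');
        if (hash < 0) {
            throw new IllegalArgumentException("Missing #dataset_id: " + location);
        }
        return new ThreddsLocation(rest.substring(0, hash), rest.substring(hash + 1));
    }

    public static void main(String[] args) {
        ThreddsLocation loc = parse(
                "thredds:http://localhost:8080/test/addeStationDataset.xml#surfaceHourly");
        System.out.println("catalog: " + loc.catalogUrl);
        System.out.println("dataset: " + loc.datasetId);
    }
}
```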

NetcdfDataset will examine the dataset, extract the dataset URL, open it, and return a NetcdfDataset. (If there is more than one dataset access URL, it will choose the service that it understands best, such as OPeNDAP.) The dataset metadata in the THREDDS catalog may be used to augment the metadata of the NetcdfDataset.

4. THREDDS Resolver Datasets

NetcdfDataset can open THREDDS Resolver datasets, which have the form thredds:resolve:resolverURL. In this case it expects that the resolverURL will return a catalog with a single top level dataset, which is the target dataset. Example:

In this case, http://motherlode.ucar.edu:8080/thredds/dodsC/model/NCEP/NAM/CONUS_12km/latest.xml returns a catalog containing the latest dataset in the NCEP/NAM/CONUS_12km collection. NetcdfDataset will read the catalog, extract the THREDDS dataset, and open it as in section 3 above.

5. CdmRemote Datasets

NetcdfDataset can open CDM Remote datasets, which have the form cdmremote:http://server:8080/thredds/cdmremote/data.nc. In this case it expects that the URL is an endpoint for a cdmremote web service, which provides index subsetting on remote CDM datasets. This is an experimental web service.

6. DAP4 datasets

NetcdfDataset can open datasets through the DAP4 protocol. The URL should begin with either dap4: or dap4:http(s):. Examples might include the following:

To avoid confusion with remote HTTP files, DAP4 URLs are often converted to use the dap4: prefix. Also note that when passing a DAP4 dataset URL to the netCDF-Java library, do not include any of the access suffixes, e.g. .dmr, .dap, .dst, etc.

ucar.nc2.ft.FeatureDatasetFactoryManager.open()

A FeatureDatasetFactory creates Scientific Feature Type Datasets such as GridDatasets, PointFeatureDatasets, RadialDatasetSweep, etc. These may be based on local files, or they may use remote access protocols.

ucar.nc2.ft.FeatureDatasetFactoryManager.open(FeatureType wantFeatureType, String location, CancelTask task, Formatter errlog)

FeatureDatasetFactoryManager.open() looks for a FeatureDatasetFactory that knows how to create a FeatureDataset from the named location. If the wantFeatureType parameter is not null, it will only look for factories that return that type.

FeatureDatasetFactoryManager can open the same URLs that NetcdfDataset and NetcdfFile can open, plus the following:

1. CdmRemote Feature Datasets

FeatureDatasetFactoryManager can open CdmRemote Feature Datasets, which have the form cdmremote:http://server:8080/thredds/cdmremote/data.nc. In this case it expects that the URL is an endpoint for a cdmremote feature dataset web service, which provides coordinate subsetting on remote Feature Type datasets. This is an experimental web service.

2. Collection Datasets

FeatureDatasetFactoryManager can open collections of datasets specified with a collection specification string, which has the form collection:spec. If this prefix is found, CompositeDatasetFactory.factory(wantFeatureType, spec) is called, which returns a FeatureDataset. Currently only a limited number of Point Feature types are supported. This is an experimental feature.
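The prefixes and suffixes described on this page can be summarized as a simple classifier. This is a reading aid, not the dispatch logic the library actually uses: the class LocationKind, the labels it returns, and the order of the checks are assumptions. Note that thredds:resolve: has to be tested before thredds:, and that a plain http: location stays ambiguous until the server is probed.

```java
final class LocationKind {

    /** Returns a rough label for the kind of dataset a location string names. */
    static String classify(String location) {
        if (location.startsWith("thredds:resolve:")) return "THREDDS resolver";
        if (location.startsWith("thredds:")) return "THREDDS dataset";
        if (location.startsWith("dods:")) return "OPeNDAP";
        if (location.startsWith("dap4:")) return "DAP4";
        if (location.startsWith("cdmremote:")) return "CdmRemote";
        if (location.startsWith("collection:")) return "collection";
        if (location.endsWith(".ncml") || location.endsWith(".xml")) return "NcML";
        if (location.startsWith("http:")) return "HTTP file or OPeNDAP";
        return "local file";
    }

    public static void main(String[] args) {
        String[] samples = {
            "cdmremote:http://server:8080/thredds/cdmremote/data.nc",
            "file:C:/share/data/model.nc",
            "http://server/something"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```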

 


NcML referenced datasets

NcML datasets typically reference other CDM datasets, using the location attribute of the netcdf element, for example:

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" location="file:/dev/netcdf-java-2.2/test/data/example1.nc"> ...

The location is passed to ucar.nc2.dataset.NetcdfDataset.openFile(), and so can be any valid CDM dataset location. In addition, an NcML referenced dataset location can be relative to the NcML file or the working directory:

There are a few subtle differences between using a location in NcML and passing a location to NetcdfDataset.openFile() and related methods:

  1. In NcML, you MUST always use forward slashes in your paths, even when on a Windows machine. For example: file:C:/data/mine.nc. NetcdfFile.open() will accept backslashes on a Windows machine.
  2. In NcML, a relative URL is resolved against the NcML location. In NetcdfFile.open(), it is interpreted as relative to the working directory.
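The second difference can be illustrated with the standard java.net.URI class: resolving a relative location against the NcML file's own location gives a different result from interpreting it against the working directory. The class NcmlLocationResolver, its method, and the paths used are hypothetical; the sketch shows the general idea of relative resolution and is not the library's code.

```java
import java.net.URI;
import java.nio.file.Paths;

final class NcmlLocationResolver {

    /** Resolves a referenced location against the location of the NcML file. */
    static String resolveAgainstNcml(String ncmlLocation, String referenced) {
        return URI.create(ncmlLocation).resolve(referenced).toString();
    }

    public static void main(String[] args) {
        String ncml = "file:/data/ncml/agg.ncml";   // hypothetical NcML file
        String referenced = "sub/example1.nc";       // relative location inside it

        // NcML behavior: relative to the NcML file.
        System.out.println(resolveAgainstNcml(ncml, referenced));

        // NetcdfFile.open() behavior: relative to the working directory,
        // so the result depends on where the program is run.
        System.out.println(Paths.get(referenced).toAbsolutePath());
    }
}
```

An absolute location such as file:/other/data.nc passes through resolveAgainstNcml unchanged, since only relative references are resolved against the base.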

NcML scan location

NcML aggregation scan elements use the location attribute to specify which directory to find files in, for example:

<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="time" type="joinExisting">
    <scan location="/data/model/" suffix=".nc" />
  </aggregation>
</netcdf>

Allowable forms of the location for the scan directory are:

When using a directory location that has an embedded ':' character, e.g. C:/share/data/model.nc, it's a really good idea to add the file: prefix, to prevent the 'C:' from being misinterpreted as a URI scheme. Future versions of NcML may use URIs for the location.

Common mistakes:

 


This document is maintained by Unidata. Send comments to THREDDS support. Last updated: October 2009