NetCDF 4.9.3-rc2
The NetCDF-C library supports limited access to cloud storage. Currently, that access is restricted to Amazon S3, so this document is S3-centric. Access to additional cloud stores is expected to be added over time, and this document will be expanded to cover those cases.
At the moment, the NetCDF-C library provides access to S3 for the following purposes:
Three S3 storage drivers are available for accessing Amazon S3.
All three S3 drivers use the AWS profile mechanism to provide configuration information, and especially to provide authorization information. Specifically, the "~/.aws/credentials" file should contain something like this.
The NetCDF-C library contains a mechanism for accessing traditional netcdf-4 files stored on remote computers. The idea is to treat the remote data as if it were one big file, and to use the HTTP protocol to read a contiguous sequence of bytes from that remote "file". This is performed using the byte-range ("Range") header in an HTTP request.
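As a rough illustration of the byte-range idea (not the library's own code, which is written in C), the following Python sketch builds the HTTP request a client could send to read a contiguous run of bytes. The function name byte_range_request and the example.com URL are hypothetical; nothing is sent over the network.

```python
import urllib.request


def byte_range_request(url, offset, count):
    """Build (but do not send) an HTTP GET request for `count` bytes at `offset`."""
    if offset < 0 or count <= 0:
        raise ValueError("offset must be >= 0 and count must be > 0")
    # HTTP byte ranges are inclusive at both ends, hence the "- 1".
    last = offset + count - 1
    request = urllib.request.Request(url)
    request.add_header("Range", f"bytes={offset}-{last}")
    return request


# Hypothetical URL, used only to show the header that would be sent.
req = byte_range_request("https://example.com/data/file.nc", 0, 8)
print(req.get_header("Range"))  # bytes=0-7
```

A server that supports byte ranges would normally answer such a request with status 206 (Partial Content) and only the requested bytes.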
In the Amazon S3 context, a copy of a dataset, a netcdf-3 or netcdf-4 file, is uploaded into a single object in some bucket. Then, using the key to this object, it is possible to tell the netcdf-c library to treat the object as a remote file and to use the HTTP Byte-Range protocol to access the contents of the object. The dataset object is referenced using a URL with the trailing fragment containing the string #mode=bytes.
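The fragment convention can be sketched in a few lines of Python. The helper with_bytes_mode is a hypothetical name, and the sketch deliberately refuses to merge with a fragment that is already present, because the rules for combining fragment entries are not described here.

```python
from urllib.parse import urlsplit, urlunsplit


def with_bytes_mode(url):
    """Return `url` with the fragment `mode=bytes` attached."""
    parts = urlsplit(url)
    if not parts.fragment:
        return urlunsplit(parts._replace(fragment="mode=bytes"))
    if "mode=bytes" in parts.fragment.split("&"):
        return url  # already marked for byte-range access
    # Assumption: merging with other fragment entries is out of scope here.
    raise ValueError("URL already has a fragment; not merged in this sketch")


# Hypothetical object URL.
print(with_bytes_mode("https://example.com/bucket/key.nc"))
# https://example.com/bucket/key.nc#mode=bytes
```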
An examination of the test program nc_test/test_byterange.sh shows simple examples using the ncdump program. One such test is specified as follows:
Note that for S3 access, it is expected that the URL is in what is called "path" format where the bucket, noaa-goes16 in this case, is part of the URL path instead of the host.
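To make the difference between the two URL styles concrete, here is a small Python sketch that rewrites a virtual-hosted S3 URL (bucket in the host name) into path style (bucket in the path). It assumes a host of the form bucket.s3.region.amazonaws.com and a bucket name without dots. The function name, the region, and the object key below are illustrative assumptions; only the bucket name noaa-goes16 comes from the text.

```python
from urllib.parse import urlsplit, urlunsplit


def to_path_style(url):
    """Rewrite <bucket>.s3.<region>.amazonaws.com/<key> into path style."""
    parts = urlsplit(url)
    labels = parts.netloc.split(".")
    # Expected host labels: [bucket, "s3", region, "amazonaws", "com"]
    if len(labels) != 5 or labels[1] != "s3" or labels[3:] != ["amazonaws", "com"]:
        raise ValueError("not a virtual-hosted S3 URL of the assumed form")
    bucket = labels[0]
    host = ".".join(labels[1:])        # s3.<region>.amazonaws.com
    path = "/" + bucket + parts.path   # the bucket becomes the first path segment
    return urlunsplit(parts._replace(netloc=host, path=path))


# Region and key are placeholders chosen for illustration.
print(to_path_style("https://noaa-goes16.s3.us-east-1.amazonaws.com/some/key.nc#mode=bytes"))
# https://s3.us-east-1.amazonaws.com/noaa-goes16/some/key.nc#mode=bytes
```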
The #mode=bytes mechanism generalizes to work with most servers that support byte-range access.
Specifically, THREDDS servers support such access using the HttpServer access method, as can be seen from this URL taken from the above test program.
Currently the following build cases are known to work.
| Operating System | Build System | SDK | S3 Support | 
| Linux | Automake | aws-s3-sdk | yes | 
| Linux | Automake | nch5s3comms | yes | 
| Linux | CMake | aws-s3-sdk | yes | 
| Linux | CMake | nch5s3comms | yes | 
| OSX | Automake | aws-s3-sdk | unknown | 
| OSX | Automake | nch5s3comms | yes | 
| OSX | CMake | aws-s3-sdk | unknown | 
| OSX | CMake | nch5s3comms | yes | 
| Visual Studio | CMake | aws-s3-sdk | no (tests fail) | 
| Visual Studio | CMake | nch5s3comms | yes | 
| Cygwin | Automake | aws-s3-sdk | no (tests fail) | 
| Cygwin | Automake | nch5s3comms | yes | 
| Cygwin | CMake | aws-s3-sdk | no | 
| Cygwin | CMake | nch5s3comms | yes | 
| Mingw | Automake | aws-s3-sdk | unknown | 
| Mingw | Automake | nch5s3comms | yes | 
| Mingw | CMake | aws-s3-sdk | unknown | 
| Mingw | CMake | nch5s3comms | yes | 
There are several options relevant to Amazon S3 support. These are as follows.
A note about using S3 with Automake: if S3 support is desired and the build uses the Amazon "aws-sdk-cpp" SDK together with Automake, then LDFLAGS must be set properly, namely to this.
The above assumes that these libraries were installed in '/usr/local/lib', so the above requires modification if they were installed elsewhere.
Note also that if S3 support is enabled, then you need to have a C++ compiler installed because the "aws-sdk-cpp" S3 support code is written in C++.
The necessary CMake flags are as follows (with defaults):
Note that, unlike Automake, CMake can properly locate C++ libraries, so it should not be necessary to specify -laws-cpp-sdk-s3, assuming that the AWS S3 libraries are installed in the default location. For CMake with Visual Studio, the default location is here:
It is possible to install the sdk library in another location. In this case, one must add the following flag to the cmake command.
where "awssdkdir" is the path to the sdk installation. For example, this might be as follows.
This can be useful if blanks in path names cause problems in your build environment.
As mentioned, three S3 storage drivers are available for accessing Amazon S3.
This driver is part of the HDF5 library codebase. It must be enabled at the time that the HDF5 library is built by using the --enable-ros3-vfd option. If built, then the NetCDF-C build process should detect it and make use of it.
Amazon provides (through AWS-labs) an SDK for accessing the Amazon S3 cloud. This library, aws-sdk-cpp, has a number of properties of interest:
For Linux, the following context works. Of course your mileage may vary.
It is possible to build and install aws-sdk-cpp on Windows using CMake. Unfortunately, testing currently fails.
For Windows, the following context works. Of course your mileage may vary.
This command-line build assumes one is using Cygwin or Mingw to provide tools such as bash.
Notice that the sdk is being installed in the directory "c:\tools\aws-sdk-cpp" rather than the default location "c:\Program Files (x86)/aws-sdk-cpp-all". This is because, when using a command line, an install path that contains blanks may not work.
In order for CMake to find the aws sdk libraries, the following environment variables must be set:
Then the following options must be specified for cmake.
This is an experimental SDK provided internally in the netcdf-c library.
In order to enable this SDK, the Automake option --enable-s3-internal or the CMake option -DNETCDF_ENABLE_S3_INTERNAL=ON must be specified.
The pure S3 test(s) are in the unit_tests directory. Currently, by default, testing of S3 is supported only for Unidata members of the NetCDF Development Group. This is because it uses a Unidata-specific bucket that is inaccessible to the general user.
If byterange support is enabled, the netcdf-c library will parse the files
to extract profile names plus a list of key=value pairs. In case of duplicates, credentials takes precedence over config.
This example is typical of the contents of these files.
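The precedence rule (credentials wins over config on duplicates) can be sketched with Python's standard configparser. This is only an illustration of the merge order, not the library's parser. The function name load_profiles, the profile name "unidata", and all values are placeholders, and stripping the "profile " prefix from config section names is an assumption based on how the AWS CLI writes that file.

```python
import configparser


def load_profiles(config_text, credentials_text):
    """Merge profiles from two INI-style texts; credentials wins on duplicates."""
    profiles = {}
    for text in (config_text, credentials_text):  # credentials is applied last
        parser = configparser.ConfigParser()
        parser.read_string(text)
        for section in parser.sections():
            # Assumption: "[profile name]" in config refers to profile "name".
            if section.startswith("profile "):
                name = section.split(None, 1)[1]
            else:
                name = section
            profiles.setdefault(name, {}).update(parser[section])
    return profiles


# Placeholder file contents; none of these values are real.
config_text = "[default]\nregion = us-east-1\n\n[profile unidata]\nregion = us-west-2\n"
credentials_text = (
    "[default]\n"
    "aws_access_key_id = PLACEHOLDER_ID\n"
    "aws_secret_access_key = PLACEHOLDER_SECRET\n"
    "region = eu-west-1\n"
)
profiles = load_profiles(config_text, credentials_text)
print(profiles["default"]["region"])  # eu-west-1 (credentials overrides config)
```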
The keys in the profile will be used to set various parameters in the library
The algorithm for choosing the active profile to use is as follows:
The profile named "no" is a special profile that the netcdf-c library automatically defines. It should not be defined anywhere else. It signals to the library that no credentials are to be used. It is equivalent to the "--no-sign-request" option in the AWS CLI.
If the specified URL is of the form
Then this is rebuilt to this form:
However, this requires figuring out the region to use. The algorithm for picking a region is as follows.
Picking an access-key/secret-key pair is always determined by the current active profile. Choosing not to use keys requires that the active profile be "no".
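The relationship between the active profile and the key pair can be sketched as follows. This is a minimal Python illustration, assuming profiles are held as a dictionary of dictionaries keyed by the standard AWS names aws_access_key_id and aws_secret_access_key. The function name and the sample values are hypothetical.

```python
def keys_for_profile(active, profiles):
    """Return (access_key, secret_key) for the active profile, or None for "no"."""
    if active == "no":
        # Built-in profile: requests are sent unsigned, so no keys are needed.
        return None
    profile = profiles[active]
    return profile["aws_access_key_id"], profile["aws_secret_access_key"]


# Placeholder profile table; the values are not real credentials.
sample = {
    "default": {
        "aws_access_key_id": "PLACEHOLDER_ID",
        "aws_secret_access_key": "PLACEHOLDER_SECRET",
    }
}
print(keys_for_profile("no", sample))       # None
print(keys_for_profile("default", sample))  # ('PLACEHOLDER_ID', 'PLACEHOLDER_SECRET')
```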
[Note: minor text changes are not included.]
Author: Dennis Heimbigner
 Email: dmh at ucar dot edu
 Initial Version: 3/8/2023
 Last Revised: 3/8/2023