Definitions
A forecast model is a scientific model that predicts the time evolution of a system starting from some initial state. Its output is a time series for each of its variables, which may be state variables or derived variables. The data output by a particular model run is called a forecast model run, and may be written to a single file or to several files. The model is typically run periodically, so one may have a collection of forecast model runs, which we assume can be uniquely identified by the start of the model run, called the model run time (also called the analysis time or generating time).
The time series for a forecast model run is the list of forecast times, also known as the valid times. (For our purposes, the forecast times are just whatever the time series is, ignoring whether it's a forecast, a nowcast, etc.) The difference between the run time and the forecast time is the forecast offset, sometimes called the forecast hour.
Grid datasets with two time dimensions
The ucar.nc2.dt.GridDatatype is the CDM scientific datatype for grids. It has been generalized to allow two time dimensions, called the runtime and time dimensions, in order to represent collections of forecast model runs. Such a dataset can be created by writing a single file, or by using NcML aggregation to create a virtual dataset out of multiple files.
Both the runtime and time coordinates may have type String or char and hold ISO 8601 dates, or may carry a udunits date unit string in their units attribute. For example:
String runtime(run=8);
:long_name = "Run time for model";
:standard_name = "forecast_reference_time";
:_CoordinateAxisType = "RunTime";
data:
"2006-09-05T12:00:00Z", "2006-09-06T12:00:00Z", "2006-09-07T12:00:00Z", "2006-09-08T12:00:00Z",
"2006-09-09T12:00:00Z", "2006-09-10T12:00:00Z", "2006-09-11T12:00:00Z", "2006-09-12T12:00:00Z"
The time coordinate is the forecast (valid) time. Because it differs for each run, it is two-dimensional:
double time(run=8, time=16);
:units = "hours since 2006-09-05T12:00:00Z";
:long_name = "forecast (valid) time";
:standard_name = "time";
:_CoordinateAxisType = "Time";
data:
{90.0, 96.0, 102.0, 108.0, 114.0, 120.0, 126.0, 132.0, 138.0, 144.0, 150.0, 156.0, 162.0, 168.0, 174.0, 180.0},
{114.0, 120.0, 126.0, 132.0, 138.0, 144.0, 150.0, 156.0, 162.0, 168.0, 174.0, 180.0, 186.0, 192.0, 198.0, 204.0},
{138.0, 144.0, 150.0, 156.0, 162.0, 168.0, 174.0, 180.0, 186.0, 192.0, 198.0, 204.0, 210.0, 216.0, 222.0, 228.0},
{162.0, 168.0, 174.0, 180.0, 186.0, 192.0, 198.0, 204.0, 210.0, 216.0, 222.0, 228.0, 234.0, 240.0, 246.0, 252.0},
{186.0, 192.0, 198.0, 204.0, 210.0, 216.0, 222.0, 228.0, 234.0, 240.0, 246.0, 252.0, 258.0, 264.0, 270.0, 276.0},
{210.0, 216.0, 222.0, 228.0, 234.0, 240.0, 246.0, 252.0, 258.0, 264.0, 270.0, 276.0, 282.0, 288.0, 294.0, 300.0},
{234.0, 240.0, 246.0, 252.0, 258.0, 264.0, 270.0, 276.0, 282.0, 288.0, 294.0, 300.0, 306.0, 312.0, 318.0, 324.0},
{258.0, 264.0, 270.0, 276.0, 282.0, 288.0, 294.0, 300.0, 306.0, 312.0, 318.0, 324.0, 330.0, 336.0, 342.0, 348.0}
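To make the relationship between the two coordinates concrete, here is a minimal sketch in plain Java (java.time only, no netCDF-Java calls). The class and method names are hypothetical and exist only for illustration. It converts values from the 2D time coordinate into absolute valid times using the base date in the units attribute, then subtracts the run time to get the forecast offset. Only the first three values of the first two rows above are used.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;

// Hypothetical helper, not part of the CDM library.
class TwoDTimeSketch {
    // Convert a value expressed as "hours since <base>" into an absolute valid time.
    static Instant validTime(Instant base, double hours) {
        return base.plusSeconds(Math.round(hours * 3600.0));
    }

    // Forecast offsets in hours for one run: valid time minus run time.
    static long[] offsetsInHours(Instant base, Instant runTime, double[] timeRow) {
        long[] offsets = new long[timeRow.length];
        for (int i = 0; i < timeRow.length; i++) {
            offsets[i] = Duration.between(runTime, validTime(base, timeRow[i])).toHours();
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Base date taken from the units attribute of the time coordinate.
        Instant base = Instant.parse("2006-09-05T12:00:00Z");
        // First two run times and the first three time values of the matching rows.
        Instant[] runTimes = {
            Instant.parse("2006-09-05T12:00:00Z"),
            Instant.parse("2006-09-06T12:00:00Z")
        };
        double[][] time = { {90.0, 96.0, 102.0}, {114.0, 120.0, 126.0} };
        for (int run = 0; run < runTimes.length; run++) {
            long[] offsets = offsetsInHours(base, runTimes[run], time[run]);
            System.out.println(runTimes[run] + " -> " + Arrays.toString(offsets));
        }
    }
}
```

Both rows yield the offsets 90, 96 and 102 hours. The valid times shift from run to run, but in this example the forecast offsets are the same for every run.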
The data variables will generally have both the runtime and time dimensions, as well as the z, y, and x dimensions:
float Dew_point_temperature(run=8, time=16, height_above_ground1=1, y=689, x=1073);
:units = "K";
:long_name = "Dew point temperature @ height_above_ground";
A dataset with a runtime dimension and a 2D time dimension as described here is called an FMRC (Forecast Model Run Collection) dataset. You can open it as an ordinary dataset and manipulate it through the NetcdfFile or NetcdfDataset APIs.
More typically you want to open it as a ucar.nc2.dt.GridDataset, so that the grid variables are found and made into ucar.nc2.dt.GridDatatype objects, and especially so that the time coordinates are found through methods on the ucar.nc2.dt.GridCoordSystem:
public CoordinateAxis1DTime getRunTimeAxis();
public CoordinateAxis1DTime getTimeAxisForRun(int run_index);
Possibly more interesting is to make it into a ucar.nc2.dt.fmrc.ForecastModelRunCollection object, which allows you to view the dataset in several ways. This option is described in section 4 below.
Aggregating Forecast Model Runs
A common case is that the model output is spread out in multiple files. A special kind of NcML aggregation can be used to create an FMRC dataset.
Case 1: All data for each forecast model run is in a single file
This case is similar to a JoinNew aggregation, in that a new, outer dimension is created, and each file becomes one slice of the new dataset.
<?xml version="1.0" encoding="UTF-8"?>
(1)<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" enhance="true" >
(2) <aggregation dimName="runtime" type="forecastModelRunCollection">
(3) <netcdf location="file:/data/ldm/NAM_CONUS_80km/Run_20060910_0000.grib1" coordValue="2006-09-10T00:00:00Z" enhance="true" />
<netcdf location="file:/data/ldm/NAM_CONUS_80km/Run_20060910_0600.grib1" coordValue="2006-09-10T06:00:00Z" enhance="true" />
<netcdf location="file:/data/ldm/NAM_CONUS_80km/Run_20060910_1200.grib1" coordValue="2006-09-10T12:00:00Z" enhance="true" />
</aggregation>
</netcdf>
- The netcdf element always has enhance="true", which adds the coordinate systems needed for a GridDataset.
- A forecastModelRunCollection aggregation is declared, and an outer dimension called runtime will be created.
- All the files in the collection are explicitly named, as well as their runtime coordinate values. The values must be ISO 8601 formatted dates. The files themselves must contain all the output times from one model run. The attribute enhance="true" adds the coordinate systems needed to identify the (forecast) time coordinate.
Equivalently, you can use an NcML scan element:
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2" enhance="true" >
<aggregation dimName="run" type="forecastModelRunCollection">
(1) <scan location="/data/ldm/NAM_CONUS_80km/" suffix=".grib1" dateFormatMark="Run_#yyyyMMdd_HHmm" enhance="true" />
</aggregation>
</netcdf>
- All the files in the directory /data/ldm/NAM_CONUS_80km/ ending in .grib1 will be aggregated. The run time coordinate values will be extracted from the filename, using the dateFormatMark attribute.
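As an illustration of how a run time can be recovered from a filename with a mark such as Run_#yyyyMMdd_HHmm, here is a rough sketch in plain Java. It is not the library's implementation. It assumes that the characters before the # are skipped by count, that the text after the # is a fixed-width date pattern, and that the dates are in UTC. The class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of netCDF-Java.
class DateFormatMarkSketch {
    // Extract a run time from a filename, given a mark like "Run_#yyyyMMdd_HHmm".
    // Assumption: the prefix before '#' is skipped by length, and the pattern
    // after '#' has the same width as the date text it matches.
    static Instant runTimeFromFilename(String filename, String dateFormatMark) {
        int hash = dateFormatMark.indexOf('#');
        String pattern = dateFormatMark.substring(hash + 1);
        String datePart = filename.substring(hash, hash + pattern.length());
        LocalDateTime parsed = LocalDateTime.parse(datePart, DateTimeFormatter.ofPattern(pattern));
        return parsed.toInstant(ZoneOffset.UTC); // assumption: filenames carry UTC times
    }

    public static void main(String[] args) {
        Instant runTime = runTimeFromFilename("Run_20060910_0600.grib1", "Run_#yyyyMMdd_HHmm");
        System.out.println(runTime); // prints 2006-09-10T06:00:00Z
    }
}
```

The result matches the coordValue that the explicit listing above assigns to the same file.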
A runtime dimension and coordinate variable are added, and the time coordinate is made into a 2D coordinate, as required for an FMRC dataset:
double time(run=3, time=11);
:units = "hours since 2006-09-10T00:00:00Z";
:long_name = "Coordinate variable for time dimension";
:standard_name = "time";
:_CoordinateAxisType = "Time";
This example assumes that the time coordinates in all of the files have the same units, here "hours since 2006-09-10T00:00:00Z". If that is not the case, the time values must be read in and adjusted to a common unit, which is requested by adding the timeUnitsChange attribute to the aggregation element:
<aggregation dimName="run" type="forecastModelRunCollection" timeUnitsChange="true">
When you have a different number of forecast times in each model run, you must also use the timeUnitsChange attribute on the aggregation element (as of 4.0.18).
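The adjustment to a common unit amounts to shifting each file's time values by the difference between its own base date and the common one. A minimal sketch of that arithmetic in plain Java follows. It is not the library's code, the class and method names are hypothetical, and both units are assumed to be in hours.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;

// Hypothetical helper, not part of netCDF-Java.
class TimeUnitsRebaseSketch {
    // Re-express values given as "hours since fromBase" as "hours since toBase".
    static double[] rebaseHours(double[] values, Instant fromBase, Instant toBase) {
        double shift = Duration.between(toBase, fromBase).getSeconds() / 3600.0;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[i] + shift;
        }
        return out;
    }

    public static void main(String[] args) {
        // A file whose units are "hours since 2006-09-10T06:00:00Z" ...
        double[] fileValues = {0.0, 6.0, 12.0};
        Instant fileBase = Instant.parse("2006-09-10T06:00:00Z");
        // ... re-expressed in the common unit "hours since 2006-09-10T00:00:00Z".
        Instant commonBase = Instant.parse("2006-09-10T00:00:00Z");
        System.out.println(Arrays.toString(rebaseHours(fileValues, fileBase, commonBase)));
        // prints [6.0, 12.0, 18.0]
    }
}
```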
Case 2: Data for each forecast model run is in multiple files
In this case we can use nested aggregations: an inner aggregation joins together the files that make up one run, and an outer aggregation makes the runs into an FMRC dataset. The following is a single FMRC that shows three variations on how to do the inner aggregations: