Writing your own Java class to identify Coordinate Systems

Overview

In order to use a dataset at the scientific datatype layer, the dataset's coordinate systems must first be identified. This is done by an implementation of ucar.nc2.dataset.CoordSysBuilderIF whose job is to examine the contents of the dataset and create coordinate system objects that follow this object model:

For more details, see the CDM Object Model.

A CoordSysBuilderIF class must be created for each type of dataset that encodes their coordinate systems differently. This obviously is burdensome, and data providers are encouraged to use existing Conventions for writing their datasets. If those are inadequate, then the next best thing is to define and document a new Convention in collaboration with others with similar needs. If you do so, read Writing NetCDF Files: Best Practices, look at other Convention examples, and get feedback form others before committing to it. Send us a URL to your documentation, and we will add it to the NetCDF Conventions page.

The steps to using your CoordSysBuilderIF class in the Netcdf-Java library:

  1. Write a class that implements ucar.nc2.dataset.CoordSysBuilderIF, such as by subclassing ucar.nc2.dataset.CoordSysBuilder.
  2. Add the class to your classpath.
  3. From your application, call ucar.nc2.dataset.CoordSysBuilder.registerConvention( String conventionName, Class c). This is called "plugging in" your code at runtime.
  4. When you open the dataset in enhanced mode, eg by calling
       NetcdfDataset.openDataset(String location, boolean enhance, ucar.nc2.util.CancelTask cancelTask);

an instance of your class will be called to add coordinate system objects to the NetcdfDataset.

The Coordinate System objects are then available through the classes in the ucar.nc2.dataset package, for example:

ucar.nc2.dataset.VariableDS:
  public List getCoordinateSystems();

ucar.nc2.dataset.CoordinateSystem:
  public List getCoordinateAxes();
  public List getCoordinateTransforms();

ucar.nc2.dataset.CoordinateAxis:
  public List getAxisType();

ucar.nc2.dataset.CoordinateTransform:
  public List getParameters();
  public List getTransformType();

Writing a CoordSysBuilderIF class

These are the steps taken by CoordSysBuilder to add Coordinate Systems:

  1. Identify which subclass should be used.
  2. Create a new object of that class.
  3. Call augmentDataset( netcdfDataset, cancelTask) to make any changes to the dataset (add attributes, variables, etc).
  4. Call buildCoordinateSystems( netcdfDataset) to add the coordinate system objects.

Your class must implement this interface:

public interface CoordSysBuilderIF {
public void setConventionUsed( String convName); public void augmentDataset( NetcdfDataset ncDataset, CancelTask cancelTask) throws IOException; public void buildCoordinateSystems( NetcdfDataset ncDataset); public void addUserAdvice( String advice); }

You can override the buildCoordinateSystems() method and completely build the coordinate system objects yourself. However, its usually easier to take advantage of the code in the CoordSysBuilder superclass, which translates standard _Coordinate attributes into coordinate system objects. The job of the subclass may then reduce to adding these _Coordinate attributes to the file in the augmentDataset() method. The subclass may also need to create and add new Coordinate Variables to the file, and/or to create Coordinate Transforms. Examples of existing CoordSysBuilder subclasses are in the ucar.nc2.dataset.conv package.

The ucar.nc2.dataset.CoordSysBuilder class uses the " _Coordinate attributes" ("underscore Coordinate attributes", described fully here) to create Coordinate System objects. An attribute that starts with an underscore is a "system attribute", which usually implies some special processing or behavior within the NetCDF library (both C and Java).

If you are subclassing ucar.nc2.dataset.CoordSysBuilder, you can ignore the setConventionUsed and addUserAdvice methods and let the superclass hanlde them. If not, you can just implement dummy methods.

The ToolsUI application has a CoordSys tab that is designed to help with the process of building coordinate systems. Open up your dataset in that tab, and 3 tables are presented: The data variables, the cooordinate systems, and the coordinate axes. The Info button (top right) will show various information from the CoordSysBuilder class that was used for the dataset.

Identifying which datasets your class should operate on

If your datasets use the global attribute Convention, then you only need to pass in the value of that attribute into CoordSysBuilder.registerConvention(String conventionName, Class c), and you do not need to implement the isMine() method.

Otherwise, your class must implement a static method isMine() that returns true when it is given a dataset that it knows how to handle. For example:

  public static boolean isMine( NetcdfFile ncfile) {
String s = ncfile.findAttValueIgnoreCase(null, "sensor_name", "none");
return s.equalsIgnoreCase("CRAFT/NEXRAD");
}

looks to see if the global attribute sensor_name has the value CRAFT/NEXRAD. Its important that the isMine() method be efficient, ideally using only the dataset metadata (attributes, variable names, etc) rather than having to do any data reading.

Adding Attributes to the Dataset

For the simple case where you only need to add attributes to the file, you might do it as in this example:

  protected void augmentDataset( NetcdfDataset ncDataset, CancelTask cancelTask) throws IOException {
this.conventionName = "ATDRadar";
Variable time = ds.findVariable("time"); time.addAttribute( new Attribute("_CoordinateAxisType", "Time")); // etc }

You may find it easier to do the same thing using an NcML file, for example:

  protected void augmentDataset( NetcdfDataset ncDataset, CancelTask cancelTask) throws IOException {
this.conventionName = "ATDRadar";
NcMLReader.wrapNcML( ncDataset, "file:/MyResource/ATDRadar.ncml", cancelTask);
}

The NcMLReader.wrapNcML() method wraps a NetcdfDataset in an NcML file, making whatever modifications are specified in the NcML file. You pass in the URL location of the NcML to use, typically a local file as above, but it may also be a remote access over http. Alternatively, you could add the /MyResource directory to your classpath, and call this variation:

 NcMLReader.wrapNcMLresource( ncDataset, "ATDRadar.ncml", cancelTask);

The NcMLReader.wrapNcMLresource() looks for the NcML document by calling Class.getResource(). The example NcML file might look like:

<?xml version='1.0' encoding='UTF-8'?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"> <attribute name="Conventions" value="ATDRadar"/> <variable name="latitude"> <attribute name="_CoordinateAxisType" value="Lat" /> </variable> <variable name="longitude"> <attribute name="_CoordinateAxisType" value="Lon" /> </variable> <variable name="altitude"> <attribute name="_CoordinateAxisType" value="Height" /> <attribute name="_CoordinateZisPositive" value="up" /> </variable> <variable name="time"> <attribute name="_CoordinateAxisType" value="Time" /> </variable> </netcdf>

The NcML adds the appropriate _CoordinateAxisType attribute to existing Coordinate Axes. Because the data variables all use coordinate variables, implicit Coordinate System objects are created and assigned. There is no need for Coordinate Transforms because all the coordinates are reference coordinates (lat, lon, height). Here is complete info on NcML.

If all you need to do is wrap the dataset in NcML, and the dataset already has a Convention attribute in it (before it is wrapped), then you can simply register the NcML directly, without having to write any code. For this, you use:

 CoordSysBuilder.registerNcML( String conventionName, String ncmlLocation);

Adding Coordinate Axes to the Dataset

When a Coordinate Axis is missing, you must add it. You can do this programatically or through an NcML file, for example:

  <variable name="latitude" shape="row" type="double">
<attribute name="long_name" value="latitide coordinate" />
<attribute name="units" value="degrees_north" />
<attribute name="_CoordinateAxisType" value="Lat" />
<values start="90.0" incr="5.0" />
</variable>

creates a new coordinate axis variable, and gives it evenly spaced values. You can also enumerate the values:

  <values>90.0 88.3 72.6 66.9</values>

When the values must be computed, then you need to do this programatically, for example:

 protected void augmentDataset( NetcdfDataset ds, CancelTask cancelTask) throws IOException {
    this.conventionName = "Zebra";
(1) NcMLReader.wrapNcMLresource( ds, CoordSysBuilder.resourcesDir+"Zebra.ncml", cancelTask);

    // the time coord var is created in the NcML
    // set its values = base_time + time_offset(time)
    Dimension timeDim = ds.findDimension("time");
    Variable base_time = ds.findVariable("base_time");
    Variable time_offset = ds.findVariable("time_offset");
(2) Variable time = ds.findVariable("time");
    Attribute att = base_time.findAttribute("units");
    String units = (att != null) ? att.getStringValue() : "seconds since 1970-01-01 00:00 UTC";
(3) time.addAttribute( new Attribute("units", units));

    Array data;
    try {
(4)   double baseValue = base_time.readScalarDouble();
(5)   data = time_offset.read();
      IndexIterator iter = data.getIndexIterator();
      while (iter.hasNext()) {
(6)     iter.setDoubleCurrent( iter.getDoubleNext() + baseValue);
(7)     if ((cancelTask != null) && cancelTask.isCancel()) return;
      }
     } catch (java.io.IOException ioe) {
(8)  parseInfo.append("ZebraConvention failed to create time Coord Axis for "+ ds.getLocation()+"\n"+ioe+"\n");
     return;
    }
(9) time.setCachedData( data, true);
(10)ds.finish();
}
  1. Its convenient to wrap the dataset in NcML, even when you also have to do some programming. For one thing, you can change the NcML file without recompiling.
  2. The time coordinate is created in the NcML file, and we will set its values here, based on other data in the file
  3. Set time coordinate units are set to be the same as the units on the base_time variable.
  4. Read in the (scalar) base_time.
  5. Read in the time_offset array.
  6. Add the baseValue to each value of the time_offset.
  7. For potentially long running calculations, you should check to see if the user has cancelled, and return ASAP.
  8. Error message if theres an excception.
  9. Set the data values of the time coordinate to the computed values.
  10. When adding new variables to a dataset, you must call finish() when all done.

Identifying Coordinate Axis Types

Another simple case to handle is when you are using Coordinate Variables for all data variables. Coordinate Variables are 1D variables with the same name as their dimension, which encode the coordinate values for that dimension. In that case, you only need to identify the Coordinate Axes types, which you do by overriding thye getAxisType() method. This will pass in all variables that have been identified as coordinate axes, and your job is to return theier AxisType, if they have one:

protected AxisType getAxisType( NetcdfDataset ncDataset, VariableEnhanced v) {
  String unit = v.getUnitsString();
  if (unit == null)
    return null;
  if ( unit.equalsIgnoreCase("degrees_east") ||
   unit.equalsIgnoreCase("degrees_E") ||
   unit.equalsIgnoreCase("degreesE") ||
   unit.equalsIgnoreCase("degree_east") ||
   unit.equalsIgnoreCase("degree_E") ||
   unit.equalsIgnoreCase("degreeE"))
     return AxisType.Lon;
  
  if ( unit.equalsIgnoreCase("degrees_north") ||
    unit.equalsIgnoreCase("degrees_N") ||
    unit.equalsIgnoreCase("degreesN") ||
    unit.equalsIgnoreCase("degree_north") ||
    unit.equalsIgnoreCase("degree_N") ||
    unit.equalsIgnoreCase("degreeN"))
      return AxisType.Lat;
      
  if (SimpleUnit.isDateUnit(unit) || SimpleUnit.isTimeUnit(unit))
    return AxisType.Time;
 
    // look for other z coordinate
  if (SimpleUnit.isCompatible("m", unit))
    return AxisType.Height;
  if (SimpleUnit.isCompatible("mbar", unit))
    return AxisType.Pressure;
  if (unit.equalsIgnoreCase("level") || unit.equalsIgnoreCase("layer") || unit.equalsIgnoreCase("sigma_level"))
    return AxisType.GeoZ;
   
  String positive = ncDataset.findAttValueIgnoreCase((Variable) v, "positive", null);
  if (positive != null) {
    if (SimpleUnit.isCompatible("m", unit))
      return AxisType.Height;
    else
      return AxisType.GeoZ;
  }
  return null;
}

Creating Coordinate Transformations

A more complex task is to create Coordinate Transforms, which map your coordinates to reference coordinates, such as lat/lon. A Coordinate Transform is typically represented by a Coordinate Transform Variable, which may be a dummy variable (ie has no data in it), and whose attributes document the meaning and specify any needed parameters for it. You can create arbitrary transforms by creating ucar.nc2.dataset.CoordinateTransform objects, which your code will have access to when it opens a NetcdfDataset.

However, for your Transform to be used by the Netcdf Java library and standard applications built on top of it, the CoordinateTransform must have a reference to a ucar.unidata.geoloc.Projection or a ucar.unidata.geoloc.vertical.VerticalTransform object which knows how to do the actual mathematical transformation. The Netcdf-Java library has a number of these, mostly following the CF-1.0 specification (Appendix F for projections, Appendix D for vertical transforms). You can also write your own implementation and add them at run time.

For this lesson, we will concentrate on what your CoordSysBuilder needs to do to use an existing standard or user written Projection or VerticalTransform class.

You can create the Coordinate Transform objects yourself, by overriding the makeCoordinateTransforms() and assignCoordinateTransforms() methods in CoordSysBuilder. Much easier is to use the existing machinery and create a Coordinate Transform Variable which represents the parameters of the transform in a way recognized by a CoordTransBuilder class.

Here is an example of one way to do that:

 public void augmentDataset( NetcdfDataset ds, CancelTask cancelTask) throws IOException {
   // read global parameters
1) double lat_origin = findAttributeDouble( ds, "LAT0");
   double lon_origin = findAttributeDouble( ds, "LON0");
   double scale = findAttributeDouble( ds, "SCALE");
   if (Double.isNaN(scale)) scale = 1.0; 

2) VariableDS v = new VariableDS( ds, null, null, "ProjectionPS", DataType.CHAR, "", null, null);
   v.addAttribute( new Attribute("grid_mapping_name", "polar_stereographic"));
   v.addAttribute( new Attribute("straight_vertical_longitude_from_pole", lon_origin));
   v.addAttribute( new Attribute("latitude_of_projection_origin", lat_origin));
   v.addAttribute( new Attribute("scale_factor_at_projection_origin", scale)); 
   
3) v.addAttribute( new Attribute(_Coordinate.TransformType, TransformType.Projection.toString());
4) v.addAttribute( new Attribute(_Coordinate.AxisTypes, "GeoX GeoY");
   // fake data
5) Array data = Array.factory(DataType.CHAR.getPrimitiveClassType(), new int[] {}, new char[] {' '});
   v.setCachedData(data, true);
6) ds.addVariable(v);
   ds.finish();
}
  1. Read the projection values that happen to be stored as non-standard global attributes in your dataset.
  2. A Coordinate Transform Variable is created, and the parameters are renamed according to the CF-1.0
  3. The _CoordinateTransformType identifies this variable unambiguously as a Coordinate Transform Variable.
  4. The _CoordinateAxisTypes attribute indicates that the transform is to be used for all Coordinate Systems that have a GeoX and GeoY coordinate axis. To be CF compliant, you would have to identify all data variables and add the attribute grid_mapping="ProjectionPS" to each.
  5. Fake data is added, in case someone accidently tries to read it.
  6. The Coordinate Transform Variable is added to the dataset. When adding new variables to a dataset, you must call finish() when all done.

This creates a Coordinate Transform Variable in your dataset that looks like this:

 char Projection;
:grid_mapping_name = "polar_stereographic";
:straight_vertical_longitude_from_pole = "-150.0";
:latitude_of_projection_origin = "90.0";
:scale_factor_at_projection_origin = "0.996";
:_CoordinateTransformType = "Projection";
:_CoordinateAxisTypes = "GeoX GeoY";

A similar way to do this, which creates the same result, creates ProjectionImpl and ProjectionCT objects, and calls the makeCoordinateTransformVariable utility method in CoordSysBuilder to handle the details:

 public void augmentDataset( NetcdfDataset ds, CancelTask cancelTask) throws IOException {
// read global parameters 1) double lat_origin = findAttributeDouble( ds, "LAT0"); double lon_origin = findAttributeDouble( ds, "LON0"); double scale = findAttributeDouble( ds, "SCALE"); if (Double.isNaN(scale)) scale = 1.0; 2) ProjectionImpl proj = new ucar.unidata.geoloc.projection.Stereographic( lat_origin, lon_origin, scale); 3) ProjectionCT projCT = new ProjectionCT("ProjectionPS", "FGDC", proj); 4) VariableDS v = makeCoordinateTransformVariable(ds, projCT); 5) v.addAttribute( new Attribute(_Coordinate.AxisTypes, "GeoX GeoY")); 6) ds.addVariable(v); ds.finish(); }
  1. Read the projection values that happen to be stored as non-standard global attributes in your dataset.
  2. A ProjectionImpl is created out of those parameters.
  3. A ProjectionCT wraps the ProjectionImpl
  4. The makeCoordinateTransformVariable method handles the details of creating the Coordinate Transform Variable. The ProjectionImpl knows what the standard names of its parameters are.
  5. The _CoordinateAxisTypes attribute indicates that the transform is to be used for all Coordinate Systems that have a GeoX and GeoY coordinate axis.
  6. The Coordinate Transform Variable is added to the dataset.

CoordSysBuilder Reference

These are the steps taken by NetcdfDataset to add Coordinate Systems:

  1. Identify which subclass should be used
  2. Create a new object of that class
  3. Call augmentDataset( ds, cancelTask)
  4. Call buildCoordinateSystems( ds)

The augmentDataset() method is where subclasses should modify the underlying dataset.

The buildCoordinateSystems() method is where CoordSysBuilder constructs the Coordinate Systems and adds them to the dataset. In some special cases, the subclass may need to override some of the methods that are called by buildCoordinateSystems.

protected void buildCoordinateSystems( NetcdfDataset ncDataset) {
// put status info into parseInfo to be shown to someone trying to debug this process
parseInfo.append("Parsing with Convention = "+conventionName+"\n"); // Bookeeping info for each variable is kept in the VarProcess inner class List vars = ncDataset.getVariables(); for (int i = 0; i < vars.size(); i++) { VariableEnhanced v = (VariableEnhanced) vars.get(i); varList.add( new VarProcess(ncDataset, v)); } // identify which variables are coordinate axes findCoordinateAxes( ncDataset); // identify which variables are used to describe coordinate system findCoordinateSystems( ncDataset); // identify which variables are used to describe coordinate transforms findCoordinateTransforms( ncDataset); // turn Variables into CoordinateAxis objects makeCoordinateAxes( ncDataset); // make Coordinate Systems for all Coordinate Systems Variables makeCoordinateSystems( ncDataset); // assign explicit CoordinateSystem objects to variables assignExplicitCoordinateSystems( ncDataset); // assign implicit CoordinateSystem objects to variables makeCoordinateSystemsImplicit( ncDataset); // optionally assign implicit CoordinateSystem objects to variables that dont have one yet if (useMaximalCoordSys) makeCoordinateSystemsMaximal( ncDataset); // make Coordinate Transforms makeCoordinateTransforms( ncDataset); // assign Coordinate Transforms assignCoordinateTransforms( ncDataset); }

To work at this level, you will need to study the source code of CoordSysBuilder, and existing subclasses in the ucar.nc2.dataset.conv package. As a subclass, you will have access to the list of VarProcess objects, which wrap each variable in the Dataset, and keep track of various information about them.


This document was last updated July 20113