Our goal in this section is to aggregate data files using NcML.
Example 1: JoinExisting
-
In the ToolsUI Viewer tab, open /machine/tds/data/ncmlExamples/aggAdvancedNcmlOne/data/archv.2012_240_00.nc. Note that the variable MT has a shape of 1: there is only one time in the file.
-
In the data file path, change archv.2012_240_00.nc to archv.2012_240_01.nc. Did you notice any changes between the two files?
The units for MT in each file is
-
Okay, both have an MT dimension, which is the dimension of the time variable, so let’s aggregate on that. Go to the NcML tab of ToolsUI and enter the following:
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="MT" type="joinExisting">
    <scan location="data/" suffix=".nc" subdirs="false"/>
  </aggregation>
</netcdf>
Save the file as /machine/tds/data/ncmlExamples/aggAdvancedNcmlOne/joinExisting.ncml.
Now, switch back to the Viewer tab and open the NcML file you just created. Note that the MT variable now has a size of 2. Yay! We aggregated the files. All done. Let’s close up shop… not so fast.
Notice anything funky? What is the difference between the variable MT and the variable Date? Which one should be used to obtain the time values? How would someone know?
-
Open the NcML file we created in the CoordSys tab. Notice that there are five coordinate-related variables listed in the bottom pane. Notice also that the two Coordinate Systems listed in the middle pane include both MT and Date. Is this correct?
-
In the bottom pane of the CoordSys tab, right-click on the MT variable and select Show Values as Date. Do the same for the variable Date. What do you think we should do?
Question: Why is Date even being used as a coordinate variable?
-
Open the NCDump tab in ToolsUI and look at the attributes of a variable, say u. Notice anything?
The metadata explicitly states that Date is a coordinate variable, in addition to the other coordinate variables.
-
Go back to the NcML tab and add the following below the aggregation section of the XML:
<variable name="u">
  <attribute name="coordinates" value="MT Depth Latitude Longitude"/>
</variable>
Save the NcML edits and return to the CoordSys tab. What happens? Is this what we want?
-
Rinse, wash, and repeat for each variable.
Note: Aggregation often involves much more than simply combining files! You really have to know the data you are aggregating.
Warning: Don’t do this with GRIB files! Use the GribFeatureCollection instead.
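The steps of Example 1 can be collected into a single NcML file. The following is a minimal sketch assembled only from the fragments shown in this example. It overrides the coordinates attribute for u alone; the closing comment marks where the remaining data variables, whose names are not listed in this section, would presumably need the same treatment.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <!-- Join every .nc file under data/ along the existing MT (time) dimension -->
  <aggregation dimName="MT" type="joinExisting">
    <scan location="data/" suffix=".nc" subdirs="false"/>
  </aggregation>

  <!-- Override the coordinates attribute of u so that Date is no longer listed -->
  <variable name="u">
    <attribute name="coordinates" value="MT Depth Latitude Longitude"/>
  </variable>

  <!-- Assumption: each remaining data variable gets an equivalent variable block;
       check its attributes in the NCDump tab before copying the value above -->
</netcdf>
```

Reopening the file in the CoordSys tab after each change is a quick way to confirm that the listed coordinate systems are the ones intended.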
Example 2: JoinNew
In this example, we will use NcML to aggregate data files produced from the same model (same run, actually). However, something key is missing, and we will have to add it ourselves. Once again, we will see that joining these files is only part of the battle!
-
In the ToolsUI Viewer tab, open /machine/tds/data/ncmlExamples/aggAdvancedNcmlTwo/data/umwmout_2013-06-04_23-00-00.nc.
-
In the data file path, change umwmout_2013-06-04_23-00-00.nc to umwmout_2013-06-05_00-00-00.nc. Did you notice any changes between the two files?
Do you notice anything missing? What dimension will we use to aggregate?
-
Open the file in the CoordSys tab. Anything important missing?
Oh, no worries, the TIME IS ENCODED IN THE FILE NAME! Good enough, right?
This is more common than should be advertised (no need to promote this behavior), so NcML has a method to grab the date from file names. We will need to add a time dimension and variable. Open the NcML tab and enter the following:
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="time" type="joinNew">
    <scan dateFormatMark="umwmout_#yyyy-MM-dd_HH-mm-ss" location="data/" suffix=".nc" subdirs="false"/>
  </aggregation>
</netcdf>
Save the file as /machine/tds/data/ncmlExamples/aggAdvancedNcmlTwo/joinNew.ncml.
-
Open your NcML file in the Viewer tab. Looks good, right? Ok, cool. All done. Let’s close up shop… not so fast. Ugh.
As it is, this will add a time dimension to all variables in the file. Is that what we want? What about the 1D coordinate variables?
-
We should explicitly list the variables that we want to aggregate. This can be very tedious. Go ahead and add the following to your NcML file inside the aggregation tag:
<variableAgg name="u_stokes" />
<variableAgg name="v_stokes" />
<variableAgg name="seamask" />
<variableAgg name="depth" />
<variableAgg name="wspd" />
<variableAgg name="wdir" />
<variableAgg name="uc" />
<variableAgg name="vc" />
<variableAgg name="rhoa" />
<variableAgg name="rhow" />
<variableAgg name="momx" />
<variableAgg name="momy" />
<variableAgg name="cgmxx" />
<variableAgg name="cgmxy" />
<variableAgg name="cgmyy" />
<variableAgg name="taux_form" />
<variableAgg name="tauy_form" />
<variableAgg name="taux_form_1" />
<variableAgg name="tauy_form_1" />
<variableAgg name="taux_form_2" />
<variableAgg name="tauy_form_2" />
<variableAgg name="taux_form_3" />
<variableAgg name="tauy_form_3" />
<variableAgg name="taux_skin" />
<variableAgg name="tauy_skin" />
<variableAgg name="taux_ocn" />
<variableAgg name="tauy_ocn" />
<variableAgg name="taux_bot" />
<variableAgg name="tauy_bot" />
<variableAgg name="taux_snl" />
<variableAgg name="tauy_snl" />
<variableAgg name="tailatmx" />
<variableAgg name="tailatmy" />
<variableAgg name="tailocnx" />
<variableAgg name="tailocny" />
<variableAgg name="cd" />
<variableAgg name="swh" />
<variableAgg name="mwp" />
<variableAgg name="mwl" />
<variableAgg name="mwd" />
<variableAgg name="dwp" />
<variableAgg name="dwl" />
<variableAgg name="dwd" />
-
Open the NcML file in FeatureTypes → Grids, click on a variable (say seamask), and click the Red Alien to visualize the data.
Again, you really need to know your data to do this! Is seamask something that should be aggregated? Maybe, maybe not.
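For orientation, the pieces of Example 2 might be assembled as follows. This is a sketch rather than the complete file: only three of the variableAgg entries are shown, and placing them ahead of the scan element is an assumption based on the element order usually seen in NcML aggregation examples. Whether seamask and depth belong in the list is the judgment call raised above, so they are left out of this sketch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <!-- Create a new outer "time" dimension with one step per file -->
  <aggregation dimName="time" type="joinNew">
    <!-- Only the variables named here receive the new time dimension -->
    <variableAgg name="u_stokes" />
    <variableAgg name="v_stokes" />
    <variableAgg name="swh" />
    <!-- ... remaining variableAgg entries from the list above go here ... -->

    <!-- The text after # is a date pattern that is applied to each file name -->
    <scan dateFormatMark="umwmout_#yyyy-MM-dd_HH-mm-ss" location="data/" suffix=".nc" subdirs="false"/>
  </aggregation>
</netcdf>
```

With an explicit list like this, variables that are not named, such as the 1D coordinate variables, should keep their original shape instead of gaining a time dimension.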