Our goal in this section is to aggregate data files using NcML.
Example 1: JoinExisting
-
In the ToolsUI Viewer tab, open /machine/tds/data/ncmlExamples/aggAdvancedNcmlOne/data/archv.2012_240_00.nc
Note that the variable MT has a shape of 1 – there is only one time in the file.
-
In the data file path, change archv.2012_240_00.nc to archv.2012_240_01.nc. Did you notice any changes between the two files?
The units for MT in each file are "days since 1900-12-31 00:00:00" – this is an important observation!
-
Okay, both have an MT dimension, which is the dimension of the time variable, so let’s aggregate on that. Go to the NcML tab of ToolsUI and enter the following:

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="MT" type="joinExisting">
    <scan location="data/" suffix=".nc" subdirs="false"/>
  </aggregation>
</netcdf>

Save the file as /machine/tds/data/ncmlExamples/aggAdvancedNcmlOne/joinExisting.ncml
Now, switch back to the Viewer tab and open the NcML file you just created. Note that the MT variable now has a size of 2 – Yay! We aggregated the files. All done. Let’s close up shop… not so fast.
Notice anything funky? What is the difference between the variable MT vs. Date? Which one should be used to obtain the time values? How would someone know?
-
Open up the NcML file we created in the CoordSys tab. Notice that there are five coordinate-related variables listed in the bottom pane? Notice that the two Coordinate Systems listed in the middle pane include both MT and Date? Is this correct?
-
In the bottom pane of the CoordSys tab, right click on the MT variable and select Show Values as Date.
Do the same for the variable Date. What do you think we should do?
Question: Why is Date even being used as a coordinate variable?
-
Open the NCDump tab in ToolsUI and look at the attributes of a variable, say u. Notice anything?
The metadata explicitly states that Date is a coordinate variable, in addition to the other coordinate variables.
-
Go back to the NcML tab and add the following below the aggregation section of the xml:

<variable name="u">
  <attribute name="coordinates" value="MT Depth Latitude Longitude"/>
</variable>

Save the NcML edits and return to the CoordSys tab. What happens? Is this what we want?
-
Rinse, wash, and repeat for each variable.
Note: Aggregation often involves much more than simply combining files! You really have to know the data you are aggregating.
Warning: Don’t do this with GRIB files! Use the GribFeatureCollection instead.
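Putting the Example 1 steps together, the finished joinExisting.ncml might look roughly like the sketch below. It assumes the variable element sits as a sibling of the aggregation element inside netcdf, which is how "below the aggregation section" is read here. Only u comes from the steps above; the names of the other variables depend on your files, so they are left as a comment rather than guessed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">

  <!-- Concatenate every .nc file in data/ along the existing MT dimension -->
  <aggregation dimName="MT" type="joinExisting">
    <scan location="data/" suffix=".nc" subdirs="false"/>
  </aggregation>

  <!-- Override the coordinates attribute so that u no longer lists Date -->
  <variable name="u">
    <attribute name="coordinates" value="MT Depth Latitude Longitude"/>
  </variable>

  <!-- Repeat the same override for each remaining data variable whose
       coordinates attribute lists Date (names depend on your files) -->

</netcdf>
```

After saving, reopening the file in the CoordSys tab is a quick way to check whether the override changed the coordinate system reported for u.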
Example 2: JoinNew
In this example, we will use NcML to aggregate data files produced from the same model (same run, actually). However, something key is missing, and we will have to add it ourselves. Once again, we will see that joining these files is only part of the battle!
-
In the ToolsUI Viewer tab, open /machine/tds/data/ncmlExamples/aggAdvancedNcmlTwo/data/umwmout_2013-06-04_23-00-00.nc.
-
In the data file path, change umwmout_2013-06-04_23-00-00.nc to umwmout_2013-06-05_00-00-00.nc. Did you notice any changes between the two files?
Do you notice anything missing? What dimension will we use to aggregate?
-
Open the file in the CoordSys tab. Anything important missing?
Oh, no worries, the TIME IS ENCODED IN THE FILE NAME! Good enough, right?
This is more common than should be advertised (no need to promote this behavior), so NcML has a method to grab the date from file names. We will need to add a time dimension and variable. Open the NcML tab and enter the following:

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="time" type="joinNew">
    <scan dateFormatMark="umwmout_#yyyy-MM-dd_HH-mm-ss" location="data/" suffix=".nc" subdirs="false"/>
  </aggregation>
</netcdf>

Save the file as /machine/tds/data/ncmlExamples/aggAdvancedNcmlTwo/joinNew.ncml
-
Open your NcML file in the Viewer tab. Looks good, right? Ok, cool. All done. Let’s close up shop… not so fast. Ugh.
As it is, this will add a time dimension to all variables in the file. Is that what we want? What about the 1D coordinate variables?
-
We should explicitly list the variables that we want to aggregate. This can be very tedious. Go ahead and add the following to your NcML file inside the aggregation tag:

<variableAgg name="u_stokes" />
<variableAgg name="v_stokes" />
<variableAgg name="seamask" />
<variableAgg name="depth" />
<variableAgg name="wspd" />
<variableAgg name="wdir" />
<variableAgg name="uc" />
<variableAgg name="vc" />
<variableAgg name="rhoa" />
<variableAgg name="rhow" />
<variableAgg name="momx" />
<variableAgg name="momy" />
<variableAgg name="cgmxx" />
<variableAgg name="cgmxy" />
<variableAgg name="cgmyy" />
<variableAgg name="taux_form" />
<variableAgg name="tauy_form" />
<variableAgg name="taux_form_1" />
<variableAgg name="tauy_form_1" />
<variableAgg name="taux_form_2" />
<variableAgg name="tauy_form_2" />
<variableAgg name="taux_form_3" />
<variableAgg name="tauy_form_3" />
<variableAgg name="taux_skin" />
<variableAgg name="tauy_skin" />
<variableAgg name="taux_ocn" />
<variableAgg name="tauy_ocn" />
<variableAgg name="taux_bot" />
<variableAgg name="tauy_bot" />
<variableAgg name="taux_snl" />
<variableAgg name="tauy_snl" />
<variableAgg name="tailatmx" />
<variableAgg name="tailatmy" />
<variableAgg name="tailocnx" />
<variableAgg name="tailocny" />
<variableAgg name="cd" />
<variableAgg name="swh" />
<variableAgg name="mwp" />
<variableAgg name="mwl" />
<variableAgg name="mwd" />
<variableAgg name="dwp" />
<variableAgg name="dwl" />
<variableAgg name="dwd" />
-
Open the NcML file in FeatureTypes → Grids, click on a variable (say seamask), and click the Red Alien to visualize the data.
Again, you really need to know your data to do this! Is seamask something that should be aggregated? Maybe, maybe not.
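As a recap of Example 2, the sketch below shows where the variableAgg elements sit relative to scan in joinNew.ncml. For brevity it lists only three of the variables named in the list above (u_stokes, v_stokes, wspd); the full file would carry the whole list. Placing the variableAgg elements before scan is an ordering assumption; the steps above only require that they sit inside the aggregation tag.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="time" type="joinNew">

    <!-- Only the variables listed here gain the new time dimension;
         unlisted variables (for example 1D coordinate variables) are left as they are -->
    <variableAgg name="u_stokes"/>
    <variableAgg name="v_stokes"/>
    <variableAgg name="wspd"/>
    <!-- ...the remaining variableAgg elements from the list above go here... -->

    <!-- The text after # in dateFormatMark is a date pattern matched against
         each file name, e.g. umwmout_2013-06-04_23-00-00.nc -->
    <scan dateFormatMark="umwmout_#yyyy-MM-dd_HH-mm-ss"
          location="data/" suffix=".nc" subdirs="false"/>
  </aggregation>
</netcdf>
```

Whether a field such as seamask belongs in the variableAgg list is a data question, not a syntax one: leaving it out keeps it as a single, time-independent variable.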