Class CollectionLevelScanner


  • public class CollectionLevelScanner
    extends Object
    CollectionLevelScanner maps between the CrawlableDataset realm and the InvCatalog/InvDataset realm. It scans a single level of a dataset collection and generates a catalog. The generated catalog contains InvCatalogRef objects for all contained collection datasets.

    Three different levels of the dataset collection must be provided to properly map from CrawlableDataset to InvCatalog/InvDataset:

    1. the collection level is the top of the data collection (the data root);
    2. the catalog level is the level in the collection for which a catalog is to be constructed; and
    3. the current level is the level in the collection for which a catalog is currently being constructed. It differs from the catalog level only when the resulting single-level catalog will be used in the construction of a multi-level catalog.

    Besides the three CrawlableDatasets that define the collection to be cataloged, there are a variety of ways to modify or enhance the resulting catalog. For more details, see the documentation for the various setters (setCollectionId(), setIdentifier(), setNamer(), setDoAddDataSize(), setSorter(), setProxyDsHandlers(), addChildEnhancer()).

    Example

    Here we'll look at the parameters used to construct a CollectionLevelScanner and to generate a catalog for the following request:

     http://my.server:8080/thredds/ncep/nam/80km/catalog.xml
     

    In the constructor, we have:

    • collection ID: "ncep";
    • collectionLevel.getPath(): "/my/data/collection/model/ncep";
    • catalogLevel.getPath(): "/my/data/collection/model/ncep/nam/80km";
    • currentLevel = null (so the catalogLevel is used);
    • filter and service: not really important for this example, so we'll ignore them for now.

    The two datasets we'll use in the example are:

    • childAtomicCrDs.getPath(): "/my/data/collection/model/ncep/nam/80km/20060208_1200_nam80km.grib"
    • childCollectionCrDs.getPath(): "/my/data/collection/model/ncep/nam/80km/2000archive"

    Following are the details on how the resulting InvDataset and InvCatalogRef objects are created.

    • The name of a dataset element (and the xlink:title of a catalogRef element) is the name of the corresponding CrawlableDataset. Example:
       <dataset name="20060208_1200_nam80km.grib"/>
       <catalogRef xlink:title="2000archive"/>
       
    where the values were determined as follows:
    • name = childAtomicCrDs.getName()
    • xlink:title = childCollectionCrDs.getName()
  • The ID of a catalog dataset element is the ID of the parent dataset and the name of the corresponding CrawlableDataset, separated by a "/". In effect, the ID is the collection ID (set with setCollectionId()) followed by a "/" and the part of the corresponding CrawlableDataset path that comes after the collection CrawlableDataset path. Example:
     <dataset name="20060208_1200_nam80km.grib" ID="NCEP/nam/80km/20060208_1200_nam80km.grib"/>
     <catalogRef xlink:title="2000archive" ID="NCEP/nam/80km/2000archive" />
     
    where the values were determined as follows:
    • ID = collectionId + "/" + childAtomicCrDs.getPath().substring( collectionLevel.getPath().length() + 1 )
    • ID = collectionId + "/" + childCollectionCrDs.getPath().substring( collectionLevel.getPath().length() + 1 )
  • The urlPath of a dataset element is the collectionPath plus the path of the corresponding CrawlableDataset starting at the point where the collection CrawlableDataset path ends. Example:
     <dataset name="20060208_1200_nam80km.grib" ID="NCEP/nam/80km/20060208_1200_nam80km.grib"
     urlPath="ncep/nam/80km/20060208_1200_nam80km.grib" />
     
    where the values were determined as follows:
    • urlPath = collectionPath + "/" + childAtomicCrDs.getPath().substring( collectionLevel.getPath().length() + 1 )
  • The xlink:href of a catalogRef element is the path of the corresponding CrawlableDataset starting at the point where the catalogLevel CrawlableDataset ends plus "/catalog.xml". Example:
     <catalogRef xlink:title="2000archive" xlink:href="2000archive/catalog.xml"/>
     
    where the values were determined as follows:
    • xlink:href = childCollectionCrDs.getPath().substring( catalogLevel.getPath().length() + 1 ) + "/catalog.xml"
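
The string arithmetic above can be checked with a small standalone sketch. It works on plain path strings rather than CrawlableDataset objects, and the class and method names (CatalogPathSketch, datasetId, urlPath, catalogRefHref) are hypothetical helpers for illustration, not part of the THREDDS API. It assumes a "/" separates the collection ID from the relative path, as the example IDs show, and it takes the collection ID ("NCEP" in the example XML) and the collection path ("ncep") as separate arguments.

```java
// Illustrative only: plain-string versions of the derivations described above.
// None of these names exist in the THREDDS library.
final class CatalogPathSketch {

    // Part of childPath that follows levelPath and the "/" after it.
    static String relativeTo(String levelPath, String childPath) {
        return childPath.substring(levelPath.length() + 1);
    }

    // ID = collectionId + "/" + path relative to the collection level.
    static String datasetId(String collectionId, String collectionLevelPath, String childPath) {
        return collectionId + "/" + relativeTo(collectionLevelPath, childPath);
    }

    // urlPath = collectionPath + "/" + path relative to the collection level.
    static String urlPath(String collectionPath, String collectionLevelPath, String childPath) {
        return collectionPath + "/" + relativeTo(collectionLevelPath, childPath);
    }

    // xlink:href = path relative to the catalog level + "/catalog.xml".
    static String catalogRefHref(String catalogLevelPath, String childCollectionPath) {
        return relativeTo(catalogLevelPath, childCollectionPath) + "/catalog.xml";
    }

    public static void main(String[] args) {
        String collectionLevel = "/my/data/collection/model/ncep";
        String catalogLevel = collectionLevel + "/nam/80km";
        String atomic = catalogLevel + "/20060208_1200_nam80km.grib";
        String collection = catalogLevel + "/2000archive";

        System.out.println(datasetId("NCEP", collectionLevel, atomic));
        System.out.println(urlPath("ncep", collectionLevel, atomic));
        System.out.println(catalogRefHref(catalogLevel, collection));
    }
}
```

With the example paths, the three printed values match the ID, urlPath, and xlink:href attributes shown in the XML fragments above.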

See DatasetScanCatalogBuilder for more details on how a THREDDS server config file (catalog.xml) and the contained datasetScan elements map into CollectionLevelScanner.

Multi-level Catalogs

The resulting single-level catalogs can be used to construct multi-level catalogs by replacing InvCatalogRef objects with the catalogs generated for the corresponding CrawlableDataset objects. Construction of multi-level catalogs is supported in several ways:

  • The getCatRefInfo() method provides access to the list of InvCatalogRef objects and their corresponding CrawlableDataset objects.
  • In the CollectionLevelScanner constructor, the currentLevel parameter indicates the level for which the current single-level catalog is to be created rather than the top-level of the resulting multi-level catalog.
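
The replacement idea can be sketched independently of the THREDDS classes. In the following, Entry is a hypothetical stand-in for a catalog entry (a dataset or a catalog reference), and scanLevel is a hypothetical function that returns the single-level listing for a path, playing the role of one scan per level. Neither is part of the real API. The sketch only shows the recursion of swapping each catalog reference for the expanded catalog of its level.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Illustrative only: a toy model of expanding single-level catalogs
// into one multi-level tree. Not THREDDS code.
final class MultiLevelSketch {

    // Hypothetical stand-in for a catalog entry.
    static final class Entry {
        final String name;
        final boolean isCatalogRef;
        final List<Entry> children = new ArrayList<>();

        Entry(String name, boolean isCatalogRef) {
            this.name = name;
            this.isCatalogRef = isCatalogRef;
        }
    }

    // Scans one level, then replaces every catalog reference with a
    // collection entry holding the expanded listing of the child level.
    static List<Entry> expand(String path, Function<String, List<Entry>> scanLevel) {
        List<Entry> result = new ArrayList<>();
        for (Entry e : scanLevel.apply(path)) {
            if (e.isCatalogRef) {
                Entry collection = new Entry(e.name, false);
                collection.children.addAll(expand(path + "/" + e.name, scanLevel));
                result.add(collection);
            } else {
                result.add(e);
            }
        }
        return result;
    }
}
```

In the real classes, the pairing of each catalog reference with its CrawlableDataset comes from getCatRefInfo(), and the level being scanned is passed as the currentLevel constructor argument; the sketch replaces both with a path string.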

NOTE: The StandardCatalogBuilder class is an example of using CollectionLevelScanner to construct multi-level catalogs.

Since:
Jun 14, 2005T12:41:23 PM