CollectionLevelScanner (NetCDF-Java All API v5.9.0)

java.lang.Object
- thredds.cataloggen.CollectionLevelScanner

```
public class CollectionLevelScanner
extends java.lang.Object
```
CollectionLevelScanner maps between the CrawlableDataset realm and the InvCatalog/InvDataset realm. It scans a single level of a dataset collection and generates a catalog. The generated catalog contains InvCatalogRef objects for all contained collection datasets.
Three different levels of the dataset collection must be provided to properly map from CrawlableDataset to InvCatalog/InvDataset:
1. the collection level is the top of the data collection (the data root);
2. the catalog level is the level in the collection for which a catalog is to be constructed; and
3. the current level (only different from the catalog level when the resulting single-level catalog will be used in the construction of a multi-level catalog) is the level in the collection for which a catalog is currently being constructed.
Besides the three CrawlableDatasets that define the collection to be cataloged, there are a variety of ways to modify or enhance the resulting catalog. For more details, see the documentation for the various setters (setCollectionId(), setIdentifier(), setNamer(), setDoAddDataSize(), setSorter(), setProxyDsHandlers()) and for addChildEnhancer().
Example

Here we'll look at the parameters used to construct a CollectionLevelScanner and to generate a catalog for the following request:
```
 http://my.server:8080/thredds/ncep/nam/80km/catalog.xml
 
```
In the constructor, we have:
- collectionPath: "ncep";
- collectionLevel.getPath(): "/my/data/collection/model/ncep";
- catalogLevel.getPath(): "/my/data/collection/model/ncep/nam/80km";
- currentLevel = null (so the catalogLevel is used);
- filter and service: not really important for this example, so we'll ignore them for now.
The two datasets we'll use in the example are:
- childAtomicCrDs.getPath(): "/my/data/collection/model/ncep/nam/80km/20060208_1200_nam80km.grib"
- childCollectionCrDs.getPath(): "/my/data/collection/model/ncep/nam/80km/2000archive"
Following are the details on how the resulting InvDataset and InvCatalogRef objects are created.
- The name of a dataset element (and the xlink:title of a catalogRef element) is the name of the corresponding CrawlableDataset. Example:
```
 <dataset name="20060208_1200_nam80km.grib"/>
 <catalogRef xlink:title="2000archive"/>
 
```
- name = childAtomicCrDs.getName()
- xlink:title = childCollectionCrDs.getName()
The ID of a dataset element (or of a catalogRef element) is the ID of the parent dataset and the name of the corresponding CrawlableDataset, separated by a "/". In effect, it is the collection ID (set with setCollectionId()) followed by the path of the corresponding CrawlableDataset, starting at the point where the collection-level CrawlableDataset path ends. Example:
```
 <dataset name="20060208_1200_nam80km.grib" ID="NCEP/nam/80km/20060208_1200_nam80km.grib"/>
 <catalogRef xlink:title="2000archive" ID="NCEP/nam/80km/2000archive" />
 
```
where the values were determined as follows:
- ID = collectionId + "/" + childAtomicCrDs.getPath().substring( collectionLevel.getPath().length() + 1 )
- ID = collectionId + "/" + childCollectionCrDs.getPath().substring( collectionLevel.getPath().length() + 1 )
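The ID rule above can be sketched with plain strings. This is a minimal illustration, not the library's implementation: the class `DatasetIdSketch` and its methods are hypothetical, paths stand in for CrawlableDataset.getPath() values, and the collection ID is assumed to be "NCEP" as in the example IDs.

```java
// Illustrative sketch only: paths are plain strings, not CrawlableDataset objects.
final class DatasetIdSketch {

    /** Returns the part of datasetPath that follows levelPath and its "/" separator. */
    static String pathBelow(String levelPath, String datasetPath) {
        if (!datasetPath.startsWith(levelPath + "/")) {
            throw new IllegalArgumentException(datasetPath + " is not below " + levelPath);
        }
        return datasetPath.substring(levelPath.length() + 1);
    }

    /** ID = collection ID, then "/", then the dataset path below the collection level. */
    static String datasetId(String collectionId, String collectionLevelPath, String datasetPath) {
        return collectionId + "/" + pathBelow(collectionLevelPath, datasetPath);
    }

    public static void main(String[] args) {
        String collectionLevel = "/my/data/collection/model/ncep";
        System.out.println(datasetId("NCEP", collectionLevel,
                "/my/data/collection/model/ncep/nam/80km/20060208_1200_nam80km.grib"));
        System.out.println(datasetId("NCEP", collectionLevel,
                "/my/data/collection/model/ncep/nam/80km/2000archive"));
    }
}
```

The "/" separator is written out explicitly so that the result matches the ID attributes shown in the example.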
The urlPath of a dataset element is the collectionPath plus the path of the corresponding CrawlableDataset starting at the point where the collection CrawlableDataset path ends. Example:
```
 <dataset name="20060208_1200_nam80km.grib" ID="NCEP/nam/80km/20060208_1200_nam80km.grib"
 urlPath="ncep/nam/80km/20060208_1200_nam80km.grib" />
 
```
where the values were determined as follows:
- urlPath = collectionPath + "/" + childAtomicCrDs.getPath().substring( collectionLevel.getPath().length() + 1 )
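As a rough sketch of the urlPath rule, again using plain strings: the class `UrlPathSketch` is hypothetical. A null collectionPath is treated as "", as the constructor documentation states. Omitting the "/" separator when collectionPath is empty is an assumption of this sketch, not documented behavior.

```java
// Illustrative sketch only: not the library's implementation.
final class UrlPathSketch {

    /** urlPath = collectionPath + "/" + dataset path below the collection level. */
    static String urlPath(String collectionPath, String collectionLevelPath, String datasetPath) {
        // The constructor documentation says a null collectionPath is treated as "".
        String prefix = (collectionPath == null) ? "" : collectionPath;
        if (!datasetPath.startsWith(collectionLevelPath + "/")) {
            throw new IllegalArgumentException(datasetPath + " is not below " + collectionLevelPath);
        }
        String relative = datasetPath.substring(collectionLevelPath.length() + 1);
        // Assumption of this sketch: with an empty prefix, no leading "/" is emitted.
        return prefix.isEmpty() ? relative : prefix + "/" + relative;
    }

    public static void main(String[] args) {
        System.out.println(urlPath("ncep", "/my/data/collection/model/ncep",
                "/my/data/collection/model/ncep/nam/80km/20060208_1200_nam80km.grib"));
    }
}
```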
The xlink:href of a catalogRef element is the path of the corresponding CrawlableDataset starting at the point where the catalogLevel CrawlableDataset ends plus "/catalog.xml". Example:
```
 <catalogRef xlink:title="2000archive" xlink:href="2000archive/catalog.xml"/>
 
```
where the values were determined as follows:
- xlink:href = childCollectionCrDs.getPath().substring( catalogLevel.getPath().length() + 1 ) + "/catalog.xml"
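The following sketch shows why the xlink:href is taken relative to the catalog level: a client resolves it against the URL of the catalog that contains it. The class `CatalogRefHrefSketch` is hypothetical and works on plain strings. The URL resolution uses the standard java.net.URI class and is not part of CollectionLevelScanner.

```java
import java.net.URI;

// Illustrative sketch only: paths are plain strings, not CrawlableDataset objects.
final class CatalogRefHrefSketch {

    /** href = child collection path below the catalog level, plus "/catalog.xml". */
    static String catalogRefHref(String catalogLevelPath, String childCollectionPath) {
        if (!childCollectionPath.startsWith(catalogLevelPath + "/")) {
            throw new IllegalArgumentException(childCollectionPath + " is not below " + catalogLevelPath);
        }
        return childCollectionPath.substring(catalogLevelPath.length() + 1) + "/catalog.xml";
    }

    /** Resolves a relative href against the URL of the catalog that contains it. */
    static URI resolve(String catalogUrl, String href) {
        return URI.create(catalogUrl).resolve(href);
    }

    public static void main(String[] args) {
        String href = catalogRefHref("/my/data/collection/model/ncep/nam/80km",
                "/my/data/collection/model/ncep/nam/80km/2000archive");
        System.out.println(href);
        System.out.println(resolve("http://my.server:8080/thredds/ncep/nam/80km/catalog.xml", href));
    }
}
```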

See DatasetScanCatalogBuilder for more details on how a THREDDS server config file (catalog.xml) and the contained datasetScan elements map into CollectionLevelScanner.

Multi-level Catalogs

The resulting single-level catalogs can be used to construct multi-level catalogs by replacing InvCatalogRef objects with the catalogs generated for the corresponding CrawlableDataset objects. Construction of multi-level catalogs is supported in two ways:

- The getCatRefInfo() method provides access to the list of InvCatalogRef objects and their corresponding CrawlableDataset objects.
- In the CollectionLevelScanner constructor, the currentLevel parameter indicates the level for which the current single-level catalog is to be created rather than the top-level of the resulting multi-level catalog.

NOTE: The StandardCatalogBuilder class is an example of using CollectionLevelScanner to construct multi-level catalogs.
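The replacement idea can be sketched with a toy model. Everything here is hypothetical and independent of the library: `Node` stands in for a catalog entry, a map from collection path to child names stands in for the CrawlableDataset hierarchy, and `singleLevel` plays the role of scanning one level. It does not reproduce how StandardCatalogBuilder or getCatRefInfo() work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: a toy model of catalogs, not the thredds.catalog types.
final class MultiLevelCatalogSketch {

    /** Simplified catalog entry; catalogRef == true marks a placeholder for a sub-catalog. */
    static final class Node {
        final String name;
        final String path;
        final boolean catalogRef;
        final List<Node> children = new ArrayList<>();

        Node(String name, String path, boolean catalogRef) {
            this.name = name;
            this.path = path;
            this.catalogRef = catalogRef;
        }
    }

    /** Builds a single-level catalog: child collections become catalogRef placeholders. */
    static Node singleLevel(Map<String, List<String>> collections, String path) {
        Node top = new Node(path.substring(path.lastIndexOf('/') + 1), path, false);
        for (String child : collections.getOrDefault(path, Collections.<String>emptyList())) {
            String childPath = path + "/" + child;
            top.children.add(new Node(child, childPath, collections.containsKey(childPath)));
        }
        return top;
    }

    /** Replaces each catalogRef placeholder with the catalog generated for its path. */
    static Node multiLevel(Map<String, List<String>> collections, String path) {
        Node top = singleLevel(collections, path);
        for (int i = 0; i < top.children.size(); i++) {
            Node child = top.children.get(i);
            if (child.catalogRef) {
                top.children.set(i, multiLevel(collections, child.path));
            }
        }
        return top;
    }

    /** Renders the tree as indented lines, one entry per line. */
    static String render(Node node, String indent) {
        StringBuilder sb = new StringBuilder(indent).append(node.name);
        if (node.catalogRef) {
            sb.append(" (catalogRef)");
        }
        sb.append('\n');
        for (Node child : node.children) {
            sb.append(render(child, indent + "  "));
        }
        return sb.toString();
    }

    /** A two-level toy collection used for the demonstration. */
    static Map<String, List<String>> exampleCollections() {
        Map<String, List<String>> collections = new LinkedHashMap<>();
        collections.put("/data/80km", Arrays.asList("a.grib", "2000archive"));
        collections.put("/data/80km/2000archive", Arrays.asList("old.grib"));
        return collections;
    }

    public static void main(String[] args) {
        Map<String, List<String>> collections = exampleCollections();
        System.out.print(render(singleLevel(collections, "/data/80km"), ""));
        System.out.print(render(multiLevel(collections, "/data/80km"), ""));
    }
}
```

In the single-level result, 2000archive appears as a placeholder. In the multi-level result, it is replaced by the catalog generated for that collection.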

Since: Jun 14, 2005T12:41:23 PM

Constructor Summary

Constructors
Constructor and Description
`CollectionLevelScanner(CollectionLevelScanner cs)` Copy constructor
`CollectionLevelScanner(java.lang.String collectionPath, CrawlableDataset collectionLevel, CrawlableDataset catalogLevel, CrawlableDataset currentLevel, CrawlableDatasetFilter filter, InvService service)` Construct a CollectionLevelScanner.

Method Summary

Modifier and Type	Method and Description
`void`	`addChildEnhancer(DatasetEnhancer childEnhancer)` Add the given DatasetEnhancer to the list that will be applied to each of the child datasets.
`InvCatalogImpl`	`generateCatalog()`
`InvCatalogImpl`	`generateProxyDsResolverCatalog(ProxyDatasetHandler pdh)` Generate the catalog for a resolver request of the given ProxyDatasetHandler.
`protected java.lang.String`	`getCollectionId()`
`protected java.lang.String`	`getCollectionName()`
`protected boolean`	`getDoAddDataSize()`
`protected CrawlableDatasetLabeler`	`getIdentifier()`
`protected CrawlableDatasetLabeler`	`getNamer()`
`java.util.Map`	`getProxyDsHandlers()`
`CrawlableDatasetSorter`	`getSorter()`
`void`	`scan()` Scan the collection and gather information on contained datasets.
`void`	`setCollectionId(java.lang.String collectionId)` Set the value of the base dataset ID.
`void`	`setCollectionName(java.lang.String collectionName)` Set the value of the collection name.
`void`	`setDoAddDataSize(boolean doAddDataSize)` Determines if datasetSize metadata will be added to each InvDataset built during catalog generation.
`void`	`setIdentifier(CrawlableDatasetLabeler identifier)` Set the CrawlableDatasetLabeler used to determine the ID of the InvDataset built during catalog generation.
`void`	`setNamer(CrawlableDatasetLabeler namer)` Set the CrawlableDatasetLabeler used to determine the name of each InvDataset built during catalog generation.
`void`	`setProxyDsHandlers(java.util.Map proxyDsHandlers)`
`void`	`setSorter(CrawlableDatasetSorter sorter)` Set the sorter with which to sort the list of child CrawlableDatasets.
`void`	`setTopLevelMetadataContainer(InvDatasetImpl topLevelMetadataContainer)` Set the InvDatasetImpl that contains the metadata for the top level dataset.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - CollectionLevelScanner
```
public CollectionLevelScanner(java.lang.String collectionPath,
                              CrawlableDataset collectionLevel,
                              CrawlableDataset catalogLevel,
                              CrawlableDataset currentLevel,
                              CrawlableDatasetFilter filter,
                              InvService service)
```
    Construct a CollectionLevelScanner.
    The collectionLevel and catalogLevel parameters are used to properly determine the dataset urlPath. The catalogLevel must either be the collectionLevel or be a descendant of the collectionLevel. The currentLevel, if not null, must either be the catalogLevel or be a descendant of the catalogLevel.
    - If the service base is relative to the catalog (i.e., an empty string), the urlPath must be relative to the catalog as well, so it becomes the dataset path minus the catalogLevel path.
    - If the service base is relative to the collection (absolute? relative to the server?), e.g., "/thredds/dodsC/", the urlPath must be relative to the collection as well, so it becomes the dataset path minus the collectionLevel path.
    The currentLevel parameter indicates what level is to be scanned. It is the same as the catalogLevel except for the case when catalogRefs are not used for all collection levels. (The urlPath is still determined as described above. Only the location of the datasets is changed.)
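    The level constraints and the two urlPath cases described above can be sketched as follows. The class `UrlPathBaseSketch` and its methods are hypothetical and operate on plain path strings. The checks mirror the documented constraints but are not the constructor's actual validation, and the collectionPath prefix is left out to keep the focus on which level is subtracted.

```java
// Illustrative sketch only: not the constructor's actual validation or urlPath logic.
final class UrlPathBaseSketch {

    /** True if path equals ancestor or lies below it. */
    static boolean isAtOrBelow(String ancestor, String path) {
        return path.equals(ancestor) || path.startsWith(ancestor + "/");
    }

    /** Checks the documented relationships between the three levels. */
    static void checkLevels(String collectionLevel, String catalogLevel, String currentLevel) {
        if (!isAtOrBelow(collectionLevel, catalogLevel)) {
            throw new IllegalArgumentException("catalogLevel must be at or below collectionLevel");
        }
        // A null currentLevel means "same as catalogLevel", so there is nothing to check.
        if (currentLevel != null && !isAtOrBelow(catalogLevel, currentLevel)) {
            throw new IllegalArgumentException("currentLevel must be at or below catalogLevel");
        }
    }

    /** Subtracts the catalog level for an empty service base, else the collection level. */
    static String urlPath(String serviceBase, String collectionLevel,
                          String catalogLevel, String datasetPath) {
        String base = serviceBase.isEmpty() ? catalogLevel : collectionLevel;
        if (!datasetPath.startsWith(base + "/")) {
            throw new IllegalArgumentException(datasetPath + " is not below " + base);
        }
        return datasetPath.substring(base.length() + 1);
    }

    public static void main(String[] args) {
        String collection = "/my/data/collection/model/ncep";
        String catalog = collection + "/nam/80km";
        String dataset = catalog + "/20060208_1200_nam80km.grib";
        checkLevels(collection, catalog, null);
        System.out.println(urlPath("", collection, catalog, dataset));
        System.out.println(urlPath("/thredds/dodsC/", collection, catalog, dataset));
    }
}
```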
    Parameters:
    
    collectionPath - the path of the collection, used as the base of all resulting dataset@urlPath values (may be ""; if null, "" is used).
    
    collectionLevel - the root of the collection to be cataloged (must not be a CrawlableDatasetAlias).
    
    catalogLevel - the location, within the collection, for which a catalog is being generated.
    
    currentLevel - the location, at or below the catalog level, which is to be scanned for datasets. Only necessary when multiple catalogs are to be aggregated. May be null. If null, assumed to be same as catalog level.
    
    filter - determines which CrawlableDatasets are accepted as part of the collection.
    
    service - the default service of all InvDatasets in the generated catalog.
    
    Throws:
    
    java.lang.IllegalArgumentException
  - CollectionLevelScanner
```
public CollectionLevelScanner(CollectionLevelScanner cs)
```
    Copy constructor
- Method Detail
  - getSorter
```
public CrawlableDatasetSorter getSorter()
```
  - setSorter
```
public void setSorter(CrawlableDatasetSorter sorter)
```
    Set the sorter with which to sort the list of child CrawlableDatasets.
    
    Parameters:
    
    sorter - the CrawlableDatasetSorter that will be used to sort the list of child CrawlableDatasets.
  - getProxyDsHandlers
```
public java.util.Map getProxyDsHandlers()
```
  - setProxyDsHandlers
```
public void setProxyDsHandlers(java.util.Map proxyDsHandlers)
```
  - setCollectionId
```
public void setCollectionId(java.lang.String collectionId)
```
    Set the value of the base dataset ID. The value is used to construct the value of the dataset@ID attribute for all datasets.
    
    Parameters:
    
    collectionId - the base dataset ID.
  - getCollectionId
```
protected java.lang.String getCollectionId()
```
  - setCollectionName
```
public void setCollectionName(java.lang.String collectionName)
```
    Set the value of the collection name. The value is used to name the top-level dataset in the top-level collection catalog (that is, only when the catalog level is the same as the collection level).
    
    Parameters:
    
    collectionName - the collection name.
  - getCollectionName
```
protected java.lang.String getCollectionName()
```
  - setIdentifier
```
public void setIdentifier(CrawlableDatasetLabeler identifier)
```
    Set the CrawlableDatasetLabeler used to determine the ID of the InvDataset built during catalog generation. The labeler is applied to the CrawlableDataset that corresponds to each InvDataset built.
    
    Parameters:
    
    identifier - the CrawlableDatasetLabeler used to determine dataset IDs.
  - getIdentifier
```
protected CrawlableDatasetLabeler getIdentifier()
```
  - setNamer
```
public void setNamer(CrawlableDatasetLabeler namer)
```
    Set the CrawlableDatasetLabeler used to determine the name of each InvDataset built during catalog generation. The labeler is applied to the CrawlableDataset that corresponds to each InvDataset built.
    
    Parameters:
    
    namer - the CrawlableDatasetLabeler used to determine dataset names.
  - getNamer
```
protected CrawlableDatasetLabeler getNamer()
```
  - setDoAddDataSize
```
public void setDoAddDataSize(boolean doAddDataSize)
```
    Determines if datasetSize metadata will be added to each InvDataset built during catalog generation. The CrawlableDataset.length() method is used to determine the size of the dataset.
    
    Parameters:
    
    doAddDataSize -
  - getDoAddDataSize
```
protected boolean getDoAddDataSize()
```
  - addChildEnhancer
```
public void addChildEnhancer(DatasetEnhancer childEnhancer)
```
    Add the given DatasetEnhancer to the list that will be applied to each of the child datasets. A DatasetEnhancer only modifies InvDataset objects but can use the corresponding CrawlableDataset for information.
    
    Parameters:
    
    childEnhancer - the DatasetEnhancer to add.
  - setTopLevelMetadataContainer
```
public void setTopLevelMetadataContainer(InvDatasetImpl topLevelMetadataContainer)
```
    Set the InvDatasetImpl that contains the metadata for the top level dataset.
    
    Parameters:
    
    topLevelMetadataContainer - the InvDatasetImpl that contains the metadata for the top level dataset.
  - scan
```
public void scan()
          throws java.io.IOException
```
    Scan the collection and gather information on contained datasets.
    
    Throws:
    
    java.io.IOException - if an I/O error occurs while locating the contained datasets.
  - generateCatalog
```
public InvCatalogImpl generateCatalog()
                               throws java.io.IOException
```
    Throws:
    
    java.io.IOException
  - generateProxyDsResolverCatalog
```
public InvCatalogImpl generateProxyDsResolverCatalog(ProxyDatasetHandler pdh)
```
    Generate the catalog for a resolver request of the given ProxyDatasetHandler.
    
    Parameters:
    
    pdh - the ProxyDatasetHandler corresponding to the resolver request.
    
    Returns:
    
    the catalog for a resolver request of the given proxy dataset.
    
    Throws:
    
    java.lang.IllegalStateException - if this collection has not yet been scanned.
    
    java.lang.IllegalArgumentException - if the given ProxyDatasetHandler is not known by this CollectionLevelScanner.
