Class CollectionLevelScanner
- java.lang.Object
-
- thredds.cataloggen.CollectionLevelScanner
-
public class CollectionLevelScanner extends Object
CollectionLevelScanner maps between the CrawlableDataset realm and the InvCatalog/InvDataset realm. It scans a single level of a dataset collection and generates a catalog. The generated catalog contains InvCatalogRef objects for all contained collection datasets.
Three different levels of the dataset collection must be provided to properly map from CrawlableDataset to InvCatalog/InvDataset:
- the collection level is the top of the data collection (the data root);
- the catalog level is the level in the collection for which a catalog is to be constructed; and
- the current level (only different from the catalog level when the resulting single-level catalog will be used in the construction of a multi-level catalog) is the level in the collection for which a catalog is currently being constructed.
Besides the three CrawlableDatasets that define the collection to be cataloged, there are a variety of ways to modify or enhance the resulting catalog. For more details, see the documentation for the various setters (setCollectionId(), setIdentifier(), setNamer(), setDoAddDataSize(), setSorter(), setProxyDsHandlers(), addChildEnhancer()).
Example
Here we'll look at the parameters used to construct a CollectionLevelScanner and to generate a catalog for the following request:
http://my.server:8080/thredds/ncep/nam/80km/catalog.xml
In the constructor, we have:
- collectionPath: "ncep";
- collectionLevel.getPath(): "/my/data/collection/model/ncep";
- catalogLevel.getPath(): "/my/data/collection/model/ncep/nam/80km";
- currentLevel = null (so the catalogLevel is used);
- filter and service: not really important for this example, so we'll ignore them for now.
The two datasets we'll use in the example are:
- childAtomicCrDs.getPath(): "/my/data/collection/model/ncep/nam/80km/20060208_1200_nam80km.grib"
- childCollectionCrDs.getPath(): "/my/data/collection/model/ncep/nam/80km/2000archive"
Following are the details on how the resulting InvDataset and InvCatalogRef objects are created.
- The name of a dataset element (and the xlink:title of a catalogRef element) is the name of the corresponding CrawlableDataset. Example:
<dataset name="20060208_1200_nam80km.grib"/> <catalogRef xlink:title="2000archive"/>
- name = childAtomicCrDs.getName()
- xlink:title = childCollectionCrDs.getName()
- The ID of a catalog dataset element is the ID of the parent dataset and the name of the corresponding CrawlableDataset, separated by a "/". It therefore ends up being the collection ID (set with setCollectionId()) followed by a "/" and the path of the corresponding CrawlableDataset, starting at the point where the collection CrawlableDataset path ends. Example:
<dataset name="20060208_1200_nam80km.grib" ID="NCEP/nam/80km/20060208_1200_nam80km.grib"/> <catalogRef xlink:title="2000archive" ID="NCEP/nam/80km/2000archive" />
where the values were determined as follows:
- ID = collectionId + "/" + childAtomicCrDs.getPath().substring( collectionLevel.getPath().length() + 1 )
- ID = collectionId + "/" + childCollectionCrDs.getPath().substring( collectionLevel.getPath().length() + 1 )
- The urlPath of a dataset element is the collectionPath plus the path of the corresponding CrawlableDataset starting at the point where the collection CrawlableDataset path ends. Example:
<dataset name="20060208_1200_nam80km.grib" ID="NCEP/nam/80km/20060208_1200_nam80km.grib" urlPath="ncep/nam/80km/20060208_1200_nam80km.grib" />
where the values were determined as follows:
- urlPath = collectionPath + "/" + childAtomicCrDs.getPath().substring( collectionLevel.getPath().length() + 1 )
- The xlink:href of a catalogRef element is the path of the corresponding CrawlableDataset starting at the point where the catalogLevel CrawlableDataset ends plus "/catalog.xml". Example:
<catalogRef xlink:title="2000archive" xlink:href="2000archive/catalog.xml"/>
where the values were determined as follows:
- xlink:href = childCollectionCrDs.getPath().substring( catalogLevel.getPath().length() + 1 ) + "/catalog.xml"
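The derivations above can be reproduced with plain string operations. The following is a minimal sketch, not the CollectionLevelScanner implementation: the class CatalogPathSketch and its helper methods are hypothetical, paths are handled as plain strings instead of CrawlableDataset objects, and the collection ID is assumed to be "NCEP" (as the ID attributes in the example suggest) with a collectionPath of "ncep".

```java
// Hypothetical helper class; it only mirrors the string arithmetic described above.
class CatalogPathSketch {

    /** Returns the part of childPath below basePath, without the leading "/". */
    static String relativeTo(String basePath, String childPath) {
        if (!childPath.startsWith(basePath + "/")) {
            throw new IllegalArgumentException(childPath + " is not below " + basePath);
        }
        return childPath.substring(basePath.length() + 1);
    }

    /** ID = collectionId + "/" + path below the collection level. */
    static String datasetId(String collectionId, String collectionLevelPath, String childPath) {
        return collectionId + "/" + relativeTo(collectionLevelPath, childPath);
    }

    /** urlPath = collectionPath + "/" + path below the collection level. */
    static String urlPath(String collectionPath, String collectionLevelPath, String childPath) {
        return collectionPath + "/" + relativeTo(collectionLevelPath, childPath);
    }

    /** xlink:href = path below the catalog level + "/catalog.xml". */
    static String catalogRefHref(String catalogLevelPath, String childPath) {
        return relativeTo(catalogLevelPath, childPath) + "/catalog.xml";
    }

    public static void main(String[] args) {
        String collectionLevel = "/my/data/collection/model/ncep";
        String catalogLevel = collectionLevel + "/nam/80km";
        String childAtomic = catalogLevel + "/20060208_1200_nam80km.grib";
        String childCollection = catalogLevel + "/2000archive";

        // Assumed values: collection ID "NCEP", collectionPath "ncep".
        System.out.println(datasetId("NCEP", collectionLevel, childAtomic));
        System.out.println(datasetId("NCEP", collectionLevel, childCollection));
        System.out.println(urlPath("ncep", collectionLevel, childAtomic));
        System.out.println(catalogRefHref(catalogLevel, childCollection));
    }
}
```

Running main prints the two ID values, the urlPath, and the xlink:href for the example datasets, one per line.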
See DatasetScanCatalogBuilder for more details on how a THREDDS server config file (catalog.xml) and the contained datasetScan elements map into CollectionLevelScanner.
Multi-level Catalogs
Resulting single level catalogs can be used to construct multi-level catalogs by replacing InvCatalogRef objects with the catalogs generated for the corresponding CrawlableDataset objects. Construction of multi-level catalogs is supported in several ways:
- The getCatRefInfo() method provides access to the list of InvCatalogRef objects and their corresponding CrawlableDataset objects.
- In the CollectionLevelScanner constructor, the currentLevel parameter indicates the level for which the current single-level catalog is to be created rather than the top-level of the resulting multi-level catalog.
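The replacement of catalog references by nested catalogs can be pictured as a recursion over collection levels. The sketch below is only an illustration under stated assumptions: Node and Entry are hypothetical stand-ins for CrawlableDataset and the InvDataset/InvCatalogRef objects, scanLevel() imitates a single-level scan, and the file name "old.grib" is invented for the example. It does not use the THREDDS classes and is not how StandardCatalogBuilder is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MultiLevelCatalogSketch {

    /** Hypothetical stand-in for a CrawlableDataset; children == null marks an atomic dataset. */
    static final class Node {
        final String name;
        final List<Node> children;
        Node(String name, List<Node> children) { this.name = name; this.children = children; }
        boolean isCollection() { return children != null; }
    }

    /** Hypothetical stand-in for a catalog entry, remembering the Node it was built from. */
    static final class Entry {
        final String kind;          // "dataset", "catalogRef", or "collection"
        final String name;
        final List<Entry> nested;   // filled only for "collection" entries
        final Node source;
        Entry(String kind, String name, List<Entry> nested, Node source) {
            this.kind = kind; this.name = name; this.nested = nested; this.source = source;
        }
    }

    /** Single-level scan: atomic children become datasets, collections become catalogRefs. */
    static List<Entry> scanLevel(Node level) {
        List<Entry> entries = new ArrayList<>();
        for (Node child : level.children) {
            String kind = child.isCollection() ? "catalogRef" : "dataset";
            entries.add(new Entry(kind, child.name, Collections.<Entry>emptyList(), child));
        }
        return entries;
    }

    /** Multi-level build: each catalogRef is replaced by the entries generated for its own level. */
    static List<Entry> buildMultiLevel(Node level) {
        List<Entry> result = new ArrayList<>();
        for (Entry e : scanLevel(level)) {
            if (e.kind.equals("catalogRef")) {
                result.add(new Entry("collection", e.name, buildMultiLevel(e.source), e.source));
            } else {
                result.add(e);
            }
        }
        return result;
    }

    /** Renders entries as indented text, one entry per line. */
    static String render(List<Entry> entries, String indent) {
        StringBuilder sb = new StringBuilder();
        for (Entry e : entries) {
            sb.append(indent).append(e.kind).append(' ').append(e.name).append('\n');
            sb.append(render(e.nested, indent + "  "));
        }
        return sb.toString();
    }

    /** Example level: one atomic dataset and one collection with an invented child. */
    static Node example() {
        Node grib = new Node("20060208_1200_nam80km.grib", null);
        Node archived = new Node("old.grib", null); // invented file name
        Node archive = new Node("2000archive", Arrays.asList(archived));
        return new Node("80km", Arrays.asList(grib, archive));
    }

    public static void main(String[] args) {
        System.out.print(render(scanLevel(example()), ""));
        System.out.print(render(buildMultiLevel(example()), ""));
    }
}
```

Keeping a reference from each entry back to its source node plays the role that getCatRefInfo() is described as playing: it tells the caller which dataset to scan next for each catalog reference.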
NOTE: The StandardCatalogBuilder class is an example of using CollectionLevelScanner to construct multi-level catalogs.
- Since:
- Jun 14, 2005T12:41:23 PM
-
-
Constructor Summary
Constructors
Constructor Description
CollectionLevelScanner(String collectionPath, CrawlableDataset collectionLevel, CrawlableDataset catalogLevel, CrawlableDataset currentLevel, CrawlableDatasetFilter filter, InvService service)
Construct a CollectionLevelScanner.
CollectionLevelScanner(CollectionLevelScanner cs)
Copy constructor
-
Method Summary
Modifier and Type Method Description
void
addChildEnhancer(DatasetEnhancer childEnhancer)
Add the given DatasetEnhancer to the list that will be applied to each of the child datasets.
InvCatalogImpl
generateCatalog()
InvCatalogImpl
generateProxyDsResolverCatalog(ProxyDatasetHandler pdh)
Generate the catalog for a resolver request of the given ProxyDatasetHandler.
protected String
getCollectionId()
protected String
getCollectionName()
protected boolean
getDoAddDataSize()
protected CrawlableDatasetLabeler
getIdentifier()
protected CrawlableDatasetLabeler
getNamer()
Map
getProxyDsHandlers()
CrawlableDatasetSorter
getSorter()
void
scan()
Scan the collection and gather information on contained datasets.
void
setCollectionId(String collectionId)
Set the value of the base dataset ID.
void
setCollectionName(String collectionName)
Set the value of the collection Name.
void
setDoAddDataSize(boolean doAddDataSize)
Determines if datasetSize metadata will be added to each InvDataset built during catalog generation.
void
setIdentifier(CrawlableDatasetLabeler identifier)
Set the CrawlableDatasetLabeler used to determine the ID of the InvDataset built during catalog generation.
void
setNamer(CrawlableDatasetLabeler namer)
Set the CrawlableDatasetLabeler used to determine the name of each InvDataset built during catalog generation.
void
setProxyDsHandlers(Map proxyDsHandlers)
void
setSorter(CrawlableDatasetSorter sorter)
Set the sorter with which to sort the list of child CrawlableDatasets.
void
setTopLevelMetadataContainer(InvDatasetImpl topLevelMetadataContainer)
Set the InvDatasetImpl that contains the metadata for the top level dataset.
-
-
-
Constructor Detail
-
CollectionLevelScanner
public CollectionLevelScanner(String collectionPath, CrawlableDataset collectionLevel, CrawlableDataset catalogLevel, CrawlableDataset currentLevel, CrawlableDatasetFilter filter, InvService service)
Construct a CollectionLevelScanner.
The collectionLevel and catalogLevel parameters are used to properly determine the dataset urlPath. The catalogLevel must either be the collectionLevel or be a descendant of the collectionLevel. The currentLevel, if not null, must either be the catalogLevel or be a descendant of the catalogLevel.
- If the service base is relative to the catalog (i.e., an empty string), the urlPath needs to be relative to the catalog as well. The urlPath therefore becomes the dataset path minus the catalogLevel path.
- If the service base is relative to the collection (absolute? relative to the server?), e.g., "/thredds/dodsC/", the urlPath needs to be relative to the collection as well. The urlPath therefore becomes the dataset path minus the collectionLevel path.
The currentLevel parameter indicates what level is to be scanned. It is the same as the catalogLevel except for the case when catalogRefs are not used for all collection levels. (The urlPath is still determined as described above. Only the location of the datasets is changed.)
- Parameters:
collectionPath
- the path of the collection, used as the base of all resulting dataset@urlPath values (may be ""; if null, "" is used).
collectionLevel
- the root of the collection to be cataloged (must not be a CrawlableDatasetAlias).
catalogLevel
- the location, within the collection, for which a catalog is being generated.
currentLevel
- the location, at or below the catalog level, which is to be scanned for datasets. Only necessary when multiple catalogs are to be aggregated. May be null. If null, assumed to be the same as the catalog level.
filter
- determines which CrawlableDatasets are accepted as part of the collection.
service
- the default service of all InvDatasets in the generated catalog.
- Throws:
IllegalArgumentException
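The level constraints and the two urlPath cases described for this constructor can be sketched with plain path strings. This is an assumption-laden illustration, not the constructor's code: UrlPathSketch, isAtOrBelow(), checkLevels(), and urlPath() are hypothetical names, an empty service base is taken to mean "relative to the catalog", any other base is treated as relative to the collection, and the collectionPath prefix is left out so that the result is exactly "the dataset path minus the level path".

```java
// Hypothetical helper class working on path strings instead of CrawlableDataset objects.
class UrlPathSketch {

    /** True if path equals ancestorPath or lies somewhere below it. */
    static boolean isAtOrBelow(String ancestorPath, String path) {
        return path.equals(ancestorPath) || path.startsWith(ancestorPath + "/");
    }

    /** Mirrors the documented constraints on the three levels; currentLevel may be null. */
    static void checkLevels(String collectionLevel, String catalogLevel, String currentLevel) {
        if (!isAtOrBelow(collectionLevel, catalogLevel)) {
            throw new IllegalArgumentException("catalogLevel is not at or below collectionLevel");
        }
        if (currentLevel != null && !isAtOrBelow(catalogLevel, currentLevel)) {
            throw new IllegalArgumentException("currentLevel is not at or below catalogLevel");
        }
    }

    /**
     * Dataset path minus the catalogLevel path when the service base is empty,
     * otherwise dataset path minus the collectionLevel path.
     */
    static String urlPath(String serviceBase, String collectionLevel,
                          String catalogLevel, String datasetPath) {
        String base = serviceBase.isEmpty() ? catalogLevel : collectionLevel;
        if (!datasetPath.startsWith(base + "/")) {
            throw new IllegalArgumentException(datasetPath + " is not below " + base);
        }
        return datasetPath.substring(base.length() + 1);
    }

    public static void main(String[] args) {
        String collectionLevel = "/my/data/collection/model/ncep";
        String catalogLevel = collectionLevel + "/nam/80km";
        String dataset = catalogLevel + "/20060208_1200_nam80km.grib";

        checkLevels(collectionLevel, catalogLevel, null);
        System.out.println(urlPath("", collectionLevel, catalogLevel, dataset));
        System.out.println(urlPath("/thredds/dodsC/", collectionLevel, catalogLevel, dataset));
    }
}
```

With the example paths, the first call yields a urlPath relative to the catalog level and the second a urlPath relative to the collection level.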
-
CollectionLevelScanner
public CollectionLevelScanner(CollectionLevelScanner cs)
Copy constructor
-
-
Method Detail
-
getSorter
public CrawlableDatasetSorter getSorter()
-
setSorter
public void setSorter(CrawlableDatasetSorter sorter)
Set the sorter with which to sort the list of child CrawlableDatasets.
- Parameters:
sorter
- the CrawlableDatasetSorter that will be used to sort the list of child CrawlableDatasets.
-
getProxyDsHandlers
public Map getProxyDsHandlers()
-
setProxyDsHandlers
public void setProxyDsHandlers(Map proxyDsHandlers)
-
setCollectionId
public void setCollectionId(String collectionId)
Set the value of the base dataset ID. The value is used to construct the value of the dataset@ID attribute for all datasets.
- Parameters:
collectionId
-
-
getCollectionId
protected String getCollectionId()
-
setCollectionName
public void setCollectionName(String collectionName)
Set the value of the collection Name. The value is used to name the top-level dataset in the top-level collection catalog (that is, only when the catalog level is the same as the collection level).
- Parameters:
collectionName
-
-
getCollectionName
protected String getCollectionName()
-
setIdentifier
public void setIdentifier(CrawlableDatasetLabeler identifier)
Set the CrawlableDatasetLabeler used to determine the ID of the InvDataset built during catalog generation. The labeler is applied to the CrawlableDataset that corresponds to each InvDataset built.
- Parameters:
identifier
-
-
getIdentifier
protected CrawlableDatasetLabeler getIdentifier()
-
setNamer
public void setNamer(CrawlableDatasetLabeler namer)
Set the CrawlableDatasetLabeler used to determine the name of each InvDataset built during catalog generation. The labeler is applied to the CrawlableDataset that corresponds to each InvDataset built.
- Parameters:
namer
-
-
getNamer
protected CrawlableDatasetLabeler getNamer()
-
setDoAddDataSize
public void setDoAddDataSize(boolean doAddDataSize)
Determines if datasetSize metadata will be added to each InvDataset built during catalog generation. The CrawlableDataset.length() method is used to determine the size of the dataset.
- Parameters:
doAddDataSize
-
-
getDoAddDataSize
protected boolean getDoAddDataSize()
-
addChildEnhancer
public void addChildEnhancer(DatasetEnhancer childEnhancer)
Add the given DatasetEnhancer to the list that will be applied to each of the child datasets. A DatasetEnhancer only modifies InvDataset objects but can use the corresponding CrawlableDataset for information.
- Parameters:
childEnhancer
-
-
setTopLevelMetadataContainer
public void setTopLevelMetadataContainer(InvDatasetImpl topLevelMetadataContainer)
Set the InvDatasetImpl that contains the metadata for the top level dataset.
- Parameters:
topLevelMetadataContainer
-
-
scan
public void scan() throws IOException
Scan the collection and gather information on contained datasets.
- Throws:
IOException
- if an I/O error occurs while locating the contained datasets.
-
generateCatalog
public InvCatalogImpl generateCatalog() throws IOException
- Throws:
IOException
-
generateProxyDsResolverCatalog
public InvCatalogImpl generateProxyDsResolverCatalog(ProxyDatasetHandler pdh)
Generate the catalog for a resolver request of the given ProxyDatasetHandler.
- Parameters:
pdh
- the ProxyDatasetHandler corresponding to the resolver request.
- Returns:
- the catalog for a resolver request of the given proxy dataset.
- Throws:
IllegalStateException
- if this collection has not yet been scanned.
IllegalArgumentException
- if the given ProxyDatasetHandler is not known by this CollectionLevelScanner.
-
-