Class DirectoryHarvester
- java.lang.Object
  - de.pangaea.metadataportal.harvester.Harvester
    - de.pangaea.metadataportal.harvester.SingleFileEntitiesHarvester
      - de.pangaea.metadataportal.harvester.DirectoryHarvester
public class DirectoryHarvester extends SingleFileEntitiesHarvester
Harvester for traversing file system directories. Identifiers are built from the path of each file relative to the base directory.
This harvester supports the following additional harvester properties:
- directory: file system directory to harvest
- recursive: also traverse subdirectories (default: false)
- identifierPrefix: prefix prepended to all relative file system paths, which serve as the identifiers of the documents (default: "")
- filenameFilter: regex to match the filename (default: none)
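The way identifierPrefix, the relative path, and filenameFilter combine can be illustrated with a minimal sketch. It is not taken from the DirectoryHarvester source: the class `DirectoryIdentifierSketch` and the helpers `buildIdentifier` and `accepts` are hypothetical, and both the normalization of separators to `/` and the use of a full-match regex are assumptions.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

class DirectoryIdentifierSketch {

    // Hypothetical helper: identifierPrefix + path of the file relative to the base directory.
    static String buildIdentifier(Path baseDir, Path file, String identifierPrefix) {
        Path relative = baseDir.relativize(file);
        // Assumption: separators are normalized to '/' so identifiers do not depend on the platform.
        String separator = relative.getFileSystem().getSeparator();
        return identifierPrefix + relative.toString().replace(separator, "/");
    }

    // Hypothetical helper: a null filter accepts every file (the documented default is "none").
    // Assumption: the regex must match the whole filename (matches), not only a part of it (find).
    static boolean accepts(Path file, Pattern filenameFilter) {
        return filenameFilter == null
            || filenameFilter.matcher(file.getFileName().toString()).matches();
    }

    public static void main(String[] args) {
        Path base = Paths.get("data", "base");
        Path file = Paths.get("data", "base", "sub", "doc.xml");
        System.out.println(buildIdentifier(base, file, "dir:"));        // prints dir:sub/doc.xml
        System.out.println(accepts(file, Pattern.compile(".*\\.xml"))); // prints true
    }
}
```

With an empty identifierPrefix (the default), the identifier in this sketch is simply the relative path.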
- Author: Uwe Schindler
Field Summary
-
Fields inherited from class de.pangaea.metadataportal.harvester.Harvester
fromDateReference, harvestCount, HARVESTER_METADATA_FIELD_LAST_HARVESTED, harvestMessageStep, iconfig, log, processor
Constructor Summary
- DirectoryHarvester(HarvesterConfig iconfig)
Method Summary
- protected void enumerateValidHarvesterPropertyNames(Set<String> props): This method is used by subclasses to enumerate all available harvester properties that are implemented by them.
- void harvest(): This method is called by the harvester after it has been opened with Harvester.open(de.pangaea.metadataportal.processor.ElasticsearchConnection, java.lang.String).
- void open(ElasticsearchConnection es, String targetIndex): Opens harvester for harvesting documents described by the given HarvesterConfig.
Methods inherited from class de.pangaea.metadataportal.harvester.SingleFileEntitiesHarvester
addDocument, addDocument, cancelMissingDocumentDelete, close
-
Methods inherited from class de.pangaea.metadataportal.harvester.Harvester
addDocument, createMetadataDocumentInstance, deleteDocument, finishReindex, getValidHarvesterPropertyNames, isAllIndexes, isClosed, isDocumentOutdated, main, prepareReindex, runHarvester, runHarvester, setHarvestingDateReference, setValidIdentifiers
Constructor Detail
-
DirectoryHarvester
public DirectoryHarvester(HarvesterConfig iconfig)
Method Detail
-
open
public void open(ElasticsearchConnection es, String targetIndex) throws Exception
Description copied from class: Harvester
Opens harvester for harvesting documents described by the given HarvesterConfig. Opens Harvester.processor for usage in Harvester.harvest() method.
-
harvest
public void harvest() throws Exception
Description copied from class: Harvester
This method is called by the harvester after it has been opened with Harvester.open(de.pangaea.metadataportal.processor.ElasticsearchConnection, java.lang.String). Override this method in your harvester class. This method should harvest files from somewhere, generate MetadataDocuments and add them with Harvester.addDocument(de.pangaea.metadataportal.processor.MetadataDocument).
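For a directory-based harvester, the "harvest files from somewhere" step amounts to walking the base directory. The following is a minimal sketch of that walk only, not the actual DirectoryHarvester implementation: the class `DirectoryTraversalSketch` and the method `listFiles` are hypothetical, and mapping recursive=false to a walk depth of 1 is an assumption about what "traverse subdirectories" means.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DirectoryTraversalSketch {

    // Hypothetical helper: lists the regular files below baseDir.
    // Assumption: recursive=false visits only the direct children of baseDir (depth 1).
    static List<Path> listFiles(Path baseDir, boolean recursive) throws IOException {
        int maxDepth = recursive ? Integer.MAX_VALUE : 1;
        // Files.walk returns a lazily populated stream that must be closed after use.
        try (Stream<Path> paths = Files.walk(baseDir, maxDepth)) {
            return paths
                .filter(Files::isRegularFile) // skip directories, including baseDir itself
                .sorted()                     // stable order for reproducible runs
                .collect(Collectors.toList());
        }
    }
}
```

In a real harvester, each returned path would then be turned into a MetadataDocument and passed to addDocument; that part depends on the library and is left out here.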
-
enumerateValidHarvesterPropertyNames
protected void enumerateValidHarvesterPropertyNames(Set<String> props)
Description copied from class: Harvester
This method is used by subclasses to enumerate all available harvester properties that are implemented by them. Override this method in your own implementation and append all harvester property names to the supplied Set. The public API for client code requesting property names is Harvester.getValidHarvesterPropertyNames().
- Overrides: enumerateValidHarvesterPropertyNames in class SingleFileEntitiesHarvester
- See Also: Harvester.getValidHarvesterPropertyNames()
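The override pattern described here can be sketched with stand-in classes. `HarvesterSketch` and `DirectoryHarvesterSketch` are hypothetical and only mimic the two methods discussed; how the real Harvester.getValidHarvesterPropertyNames() collects and returns the names is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Stand-in for the real Harvester base class, reduced to the two methods discussed above.
abstract class HarvesterSketch {

    // Subclasses override this and add the property names they implement.
    protected void enumerateValidHarvesterPropertyNames(Set<String> props) {
        // The base class contributes no properties in this sketch.
    }

    // Assumption: the public accessor collects the names through the protected hook.
    public final Set<String> getValidHarvesterPropertyNames() {
        Set<String> props = new TreeSet<>();
        enumerateValidHarvesterPropertyNames(props);
        return Collections.unmodifiableSet(props);
    }
}

class DirectoryHarvesterSketch extends HarvesterSketch {

    @Override
    protected void enumerateValidHarvesterPropertyNames(Set<String> props) {
        // Call super first so that properties of parent classes stay valid.
        super.enumerateValidHarvesterPropertyNames(props);
        props.addAll(Arrays.asList(
            "directory", "recursive", "identifierPrefix", "filenameFilter"));
    }
}
```

Calling `super` before adding the subclass's own names keeps inherited properties available at every level of the class hierarchy.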