3.16. Crawl Filter
Note
This page is under construction.
- Plugin Key
mediatype_crawl_filter_factory, where mediatype is a media type like text/html
- Plugin Value Type
- Plugin Value Format
The value is the fully qualified name of a Java class implementing the org.lockss.plugin.FilterFactory interface.
- Sample
<entry>
  <string>text/html_crawl_filter_factory</string>
  <string>edu.example.plugin.publisherx.PublisherXHtmlCrawlFilterFactory</string>
</entry>
- Description
If files of a given media type need to be pre-processed (filtered) before URLs are extracted by the crawler using a Link Extractor, this plugin feature can be used to point at custom filtering code.
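Crawl filters are related to hash filters: both pre-process content through a filter, but a crawl filter is applied before URLs are extracted during a crawl, whereas a hash filter is applied before content is hashed.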
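As a rough illustration of the pre-processing described above, the following self-contained Java sketch removes a block of markup from an HTML stream before links are extracted from it. It does not use the LOCKSS classes. The class name CrawlFilterSketch, the method signatures, the "related" div, and the regex-based stand-in for a Link Extractor are all assumptions made for this example. An actual crawl filter would be a class implementing org.lockss.plugin.FilterFactory.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a stand-alone sketch, not the LOCKSS FilterFactory API. */
public class CrawlFilterSketch {

  // Hypothetical block to hide from the crawler, e.g. "related articles" links.
  // A non-greedy regex is enough for this sample; it does not handle nested divs.
  private static final Pattern RELATED_BLOCK = Pattern.compile(
      "<div\\s+id=\"related\">.*?</div>",
      Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

  // Naive href matcher used as a stand-in for a real link extractor.
  private static final Pattern HREF =
      Pattern.compile("href=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

  /** Returns a stream over the input with the hypothetical block removed. */
  public static InputStream createFilteredInputStream(InputStream in, String encoding)
      throws IOException {
    Charset cs = (encoding == null) ? StandardCharsets.UTF_8 : Charset.forName(encoding);
    String html = new String(in.readAllBytes(), cs);
    String filtered = RELATED_BLOCK.matcher(html).replaceAll("");
    return new ByteArrayInputStream(filtered.getBytes(cs));
  }

  /** Collects every href value found in the stream, in document order. */
  public static List<String> extractLinks(InputStream in, String encoding)
      throws IOException {
    Charset cs = (encoding == null) ? StandardCharsets.UTF_8 : Charset.forName(encoding);
    String html = new String(in.readAllBytes(), cs);
    List<String> links = new ArrayList<>();
    Matcher m = HREF.matcher(html);
    while (m.find()) {
      links.add(m.group(1));
    }
    return links;
  }

  /** Convenience wrapper: filter first, then extract links. */
  public static List<String> linksAfterFiltering(String html) {
    try {
      InputStream raw = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
      return extractLinks(createFilteredInputStream(raw, "UTF-8"), "UTF-8");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    String page = "<a href=\"/toc\">TOC</a>"
        + "<div id=\"related\"><a href=\"/other-journal/1\">Other</a></div>";
    // The link inside the filtered block is never seen by the extractor.
    System.out.println(linksAfterFiltering(page)); // prints [/toc]
  }
}
```

In this sketch the filter runs first and the extractor only sees what remains, so the link inside the removed block is never returned. Reading the whole stream into memory keeps the example short; a filter for large files would more likely process the stream incrementally.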