<html><head><META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>9. Writing a Filter</title><link href="../docbook.css" rel="stylesheet" type="text/css"><meta content="DocBook XSL Stylesheets V1.67.2" name="generator"><link rel="start" href="index.html" title="Heritrix developer documentation"><link rel="up" href="index.html" title="Heritrix developer documentation"><link rel="prev" href="ar01s08.html" title="8. Writing a Frontier"><link rel="next" href="ar01s10.html" title="10. Writing a Scope"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table summary="Navigation header" width="100%"><tr><th align="center" colspan="3">9. Writing a Filter</th></tr><tr><td align="left" width="20%"><a accesskey="p" href="ar01s08.html">Prev</a> </td><th align="center" width="60%"> </th><td align="right" width="20%"> <a accesskey="n" href="ar01s10.html">Next</a></td></tr></table><hr></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="writefilter"></a>9. Writing a Filter</h2></div></div></div><p>Filters<sup>[<a href="#ftn.footnote_scope_problems">3</a>]</sup> are modules that take a CrawlURI and determine whether it matches the filter's criteria. A filter returns true if the URI matches and false otherwise.</p><p>Filters can be used in several places in the crawler, most notably in the Scope. Filters are also used in processors, where they always filter URIs out: any URI that matches a filter on a processor skips that processor. This can be useful, for instance, to disable link extraction on documents coming from a specific section of a given website.</p><p>All filters should subclass the <a href="http://crawler.archive.org/apidocs/org/archive/crawler/framework/Filter.html" target="_top">Filter</a> class. 
Creating a filter is just a matter of implementing the <a href="http://crawler.archive.org/apidocs/org/archive/crawler/framework/Filter.html#innerAccepts(java.lang.Object)" target="_top">innerAccepts(Object)</a> method. Because of the planned overhaul of the scopes and filters, we do not provide an extensive example of how to create a filter at this point. The directions in the javadoc should be easy to follow, though. For your filter to show up in the application interface, you'll need to edit <code class="filename">src/conf/modules/Filter.options</code>.</p></div><div class="navfooter"><hr><table summary="Navigation footer" width="100%"><tr><td align="left" width="40%"><a accesskey="p" href="ar01s08.html">Prev</a> </td><td align="center" width="20%"> </td><td align="right" width="40%"> <a accesskey="n" href="ar01s10.html">Next</a></td></tr><tr><td valign="top" align="left" width="40%">8. Writing a Frontier </td><td align="center" width="20%"><a accesskey="h" href="index.html">Home</a></td><td valign="top" align="right" width="40%"> 10. Writing a Scope</td></tr></table></div></body></html>