Basically, you create classes to set up the DOM to work with your XML document, filter the XML nodes, and process the results of that filter.
A semantic focused crawler needs to judge document content, compute the similarity between the topic and each document, separate relevant results from irrelevant ones, and store the relevant results in a database.
With TRaX, it's not too difficult to preprocess an XML document through a filter.
Accepts all changes in the document, ignoring filter settings.
The technique described in this document uses a servlet filter to block FM requests.