Research and Implementation of a Tree-Structure-Based Method for Automatic Web Page Data Extraction
A search engine first collects web page data from the Internet with a crawler, then analyzes the collected data with an indexer and builds an index.
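The indexing step can be illustrated with a minimal sketch of an inverted index, one common structure for this purpose. The source does not say which index structure is used, so this choice is an assumption, as are the names build_index and search and the sample pages that stand in for crawler output.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    # Start from the first word's URLs and intersect with the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical pages standing in for what a crawler would have collected.
pages = {
    "http://example.com/a": "Tree structure of a web page",
    "http://example.com/b": "Web data extraction",
}
index = build_index(pages)
hits = search(index, "web page")
```

Here a query is answered by intersecting the URL sets of its words, so only pages containing all query words are returned.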
Each indexing machine is responsible for collecting and indexing information for a specific domain name, and web page data stored on different machines can be searched in parallel.
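A rough sketch of this arrangement follows, with in-process objects standing in for the separate machines and a thread pool standing in for parallel retrieval. The Shard class, the add_page and parallel_search helpers, and the sample domains are all hypothetical and only illustrate the idea of one index per domain.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit


class Shard:
    """Stands in for one index machine holding the pages of one domain."""

    def __init__(self):
        self.pages = {}

    def add(self, url, text):
        self.pages[url] = text

    def search(self, word):
        """Return URLs on this shard whose text contains the word."""
        w = word.lower()
        return [u for u, t in self.pages.items() if w in t.lower().split()]


def add_page(shards, url, text):
    """Route a page to the shard responsible for its domain name."""
    domain = urlsplit(url).hostname
    shards[domain].add(url, text)


def parallel_search(shards, word):
    """Query every shard concurrently and merge the results."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda s: s.search(word), shards.values()))
    return sorted(url for hits in results for url in hits)


# Hypothetical domains, one shard each.
shards = {"example.com": Shard(), "example.org": Shard()}
add_page(shards, "http://example.com/a", "web page data")
add_page(shards, "http://example.org/b", "web crawler")
add_page(shards, "http://example.org/c", "index storage")
found = parallel_search(shards, "web")
```

In a real deployment each shard would be a remote service, and the merge step would also have to rank results rather than simply sort URLs.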
A self-test function for a TOEFL-style question bank has been implemented, furthering research on data mining in Web pages.
A common scenario in Web applications is form handling, in which the user enters data such as a name, address, and preferences into a Web page through a form.
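On the server side, handling such a form usually starts with decoding the submitted fields. The sketch below assumes the form is posted as application/x-www-form-urlencoded and uses the standard library's urllib.parse.parse_qs; the parse_form helper and the sample field values are illustrative only.

```python
from urllib.parse import parse_qs


def parse_form(body):
    """Decode a URL-encoded form body into a dict of field name to value.

    If a field is repeated, only its first value is kept.
    """
    fields = parse_qs(body, keep_blank_values=True)
    return {name: values[0] for name, values in fields.items()}


# Hypothetical submission with the fields named in the text.
form = parse_form("name=Ada+Lovelace&address=12+Main+St&preferences=email")
```

Keeping blank values means a field the user left empty still appears in the result as an empty string, which makes validation of required fields simpler.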