This thesis designs a web information extraction system based on the semantic structure of a website. The system consists of three main components: a website spider, a website semantic structure generator, and a web information extractor.