Chinese Academy of Sciences Institutional Repositories Grid
Design and Implementation of a Network Information Query System Based on Vertical Search Technology (基于垂直搜索技术的网络信息查询系统的设计与实现)

Document type: Dissertation

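Author: Hu Jingjing (胡晶晶)
Degree type: Doctorate
Defense date: 2007-05-26
Degree-granting institution: Institute of Acoustics, Chinese Academy of Sciences
Place of conferral: Institute of Acoustics
Keywords: vertical search; Ajax; web page information extraction; regular expressions; structured processing
Alternative title: Network Information Search System Design and Implementation Based on Vertical Search Technology
Degree discipline: Signal and Information Processing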
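Chinese abstract (translated): The rapid growth of the Internet is driving the volume of information on Web pages upward at an astonishing rate, and users increasingly need search engines that are more refined, accurate, and specialized to meet their search needs. As search engine technology has developed, vertical search has been well received by users. Such domain-specific search engines let users obtain suitable results quickly. Specialized vertical search engines, as the third generation of Internet search engines, will become the mainstream of future search engine development.
This dissertation builds on the "E游天下" (eYooWorld) integrated network new-media service platform of the Network and New Media Technology Research Center, Institute of Acoustics, Chinese Academy of Sciences. It studies key network information processing technologies, including vertical search, template-based web page information extraction, and an FFT-based algorithm for extracting the main text of web pages. It proposes a design for a network information query system based on vertical search and, combining Ajax, databases, the HTTP protocol, and related technologies, implements a hotel information query system and an airline ticket price comparison query system.
The specific research content and results are as follows:
(1) Proposed the design framework of the hotel information query system, divided the system structure into modules, and determined the technical implementation plan.
(2) Implemented the hotel information query system, including the web page crawling module, information extraction module, structured processing module, and integrated storage module.
(3) Designed the system framework of the airline ticket price comparison query system and implemented its main functional modules.
(4) Proposed a scheme for optimizing the network layer of the airline ticket price comparison query system to improve search efficiency, and tested and compared the system before and after optimization.
(5) Participated in a research project on page-repository-level web information extraction and completed the related experiments and analysis of results.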
English abstract: With the rapid development of the Internet, the number of web pages is increasing quickly. Users need a search engine that is more precise, more exact, and more specialized, so that they can find resources more easily. As search engine technology develops, vertical search technology is attracting more and more attention. This kind of search engine targets a single domain, so users can get suitable results much more quickly. As a third-generation search engine, the specialized vertical search engine will become dominant in the near future.
Based on eYooWorld, the integrated online travel service platform of the Network and New-media Technology Research Center of the Institute of Acoustics, this dissertation studies several key technologies of network and signal processing: object-based vertical search, template-based information extraction, and an FFT-based algorithm for extracting the effective content of a web page. It presents a design method for network information search systems based on vertical search and, using Ajax, databases, the HTTP protocol, and other related technologies, implements the Hotel Search System and the Airline Tickets Search System.
The major research work and achievements of this dissertation are:
(1) Presented the architecture of the Hotel Search System, divided the architecture into several modules, and presented the implementation technology of the system.
(2) Implemented the Hotel Search System, including the web page crawling module, information extraction module, structured processing module, integrated storage module, and other functional modules.
(3) Designed the Airline Tickets Search System and implemented its functional modules.
(4) Presented a method for optimizing the network layer of the Airline Tickets Search System, which improves search performance, and compared the optimized system with the original system.
(5) Participated in a research project on web page information extraction: analyzed web page properties, carried out experiments, and analyzed the experimental results.
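The abstract names template-based extraction, regular expressions, and structured processing but gives no implementation details. The following is only a minimal sketch of how those three ideas could fit together, not the dissertation's actual method. The page markup, the `HOTEL_TEMPLATE` pattern, the `Hotel` record, and the `extract_hotels` function are all hypothetical names invented for illustration.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical template for one hotel entry. The markup it matches is
# invented for this sketch and does not come from the dissertation.
HOTEL_TEMPLATE = re.compile(
    r'<div class="hotel">\s*'
    r'<h2>(?P<name>[^<]+)</h2>\s*'
    r'<span class="price">(?P<price>[^<]+)</span>\s*'
    r'<span class="addr">(?P<address>[^<]+)</span>\s*'
    r'</div>'
)


@dataclass
class Hotel:
    """Structured record produced from one template match (assumed schema)."""
    name: str
    price_cny: float
    address: str


def extract_hotels(html: str) -> List[Hotel]:
    """Apply the template to a page and normalize each match into a record."""
    hotels = []
    for match in HOTEL_TEMPLATE.finditer(html):
        # Structured processing: drop the currency sign, spaces, and
        # thousands separators so the price can be stored as a number.
        digits = re.sub(r"[^\d.]", "", match.group("price"))
        hotels.append(
            Hotel(
                name=match.group("name").strip(),
                price_cny=float(digits),
                address=match.group("address").strip(),
            )
        )
    return hotels


# Invented sample page standing in for the output of a crawling module.
SAMPLE_PAGE = """
<div class="hotel">
  <h2>Example Hotel A</h2>
  <span class="price">¥ 1,280</span>
  <span class="addr">1 Sample Road, Beijing</span>
</div>
<div class="hotel">
  <h2>Example Hotel B</h2>
  <span class="price">¥ 460.50</span>
  <span class="addr">2 Sample Road, Beijing</span>
</div>
"""

if __name__ == "__main__":
    for hotel in extract_hotels(SAMPLE_PAGE):
        print(hotel)
```

A regex template of this kind is tied to one site's layout, so a system covering many sites would presumably need one template per source and a way to notice when a layout changes.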
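Language: Chinese
Date made public: 2011-05-07
Pages: 89
Source URL: http://159.226.59.140/handle/311008/244
Collection: Institute of Acoustics_Doctoral and Master's Theses of the Institute of Acoustics_Doctoral and Master's Theses, 1981-2009
Recommended citation
GB/T 7714
Hu Jingjing. Design and Implementation of a Network Information Query System Based on Vertical Search Technology [D]. Institute of Acoustics. Institute of Acoustics, Chinese Academy of Sciences. 2007.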

Ingest method: OAI harvesting

Source: Institute of Acoustics


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.