A Weighted Set-Based Implicit Feedback Algorithm for Search Engine
Abstract: With the rapid development of the Internet, online information resources have grown explosively. Traditional search engines can rarely infer a user's retrieval intention from the short query terms entered, so they can only return a large number of matching results for the user to sift through. To improve search precision, this paper proposes an implicit feedback algorithm based on weighted sets. By analyzing the characteristics of the result pages returned by search engines, we define a weighted set for representing Web page snippets, together with a method for calculating the weight of each element in the set. We also design an intersection operation on weighted sets, which extracts the user's implicit retrieval intention; this intention is then used for query expansion to improve search accuracy. Several experiments on the Google search engine verify the effectiveness of the algorithm.
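The pipeline outlined in the abstract (snippet to weighted set, intersection, query expansion) can be illustrated with a minimal Python sketch. The abstract gives neither the weight formula nor the intersection rule, so everything concrete here is an assumption made for illustration: weights are relative term frequencies, the intersection keeps shared terms with the smaller of the two weights, and the snippets are taken from results the user clicked. The function names `weighted_set`, `intersect`, and `expand_query` are hypothetical and do not come from the paper.

```python
from collections import Counter


def weighted_set(snippet, stopwords=frozenset()):
    """Map each term of a snippet to a weight.

    Assumption: weight = relative term frequency within the snippet.
    The paper's actual weighting method is not given in the abstract.
    """
    terms = [t for t in snippet.lower().split() if t not in stopwords]
    counts = Counter(terms)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def intersect(a, b):
    """Intersect two weighted sets.

    Assumption: keep terms present in both, with the smaller weight.
    """
    return {t: min(w, b[t]) for t, w in a.items() if t in b}


def expand_query(query, clicked_snippets, k=2, stopwords=frozenset()):
    """Append up to k shared high-weight terms to the original query."""
    sets = [weighted_set(s, stopwords) for s in clicked_snippets]
    if not sets:
        return query
    common = sets[0]
    for s in sets[1:]:
        common = intersect(common, s)
    query_terms = set(query.lower().split())
    # Rank by descending weight, then alphabetically for a stable order.
    candidates = sorted(
        ((t, w) for t, w in common.items() if t not in query_terms),
        key=lambda tw: (-tw[1], tw[0]),
    )
    return " ".join([query] + [t for t, _ in candidates[:k]])
```

For example, calling `expand_query("jaguar", ["jaguar car dealer price", "jaguar car review engine"])` keeps only the terms shared by both snippets and appends those not already in the query, which is one plausible way an implicit intention (here, the automotive sense of the query) could be turned into an expanded query.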
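Cite this article: Zhang Hui (张辉), Chen Yan (陈岩). A Weighted Set-Based Implicit Feedback Algorithm for Search Engine [J]. Computer Science and Application (计算机科学与应用), 2011, 1(3): 128-133. http://dx.doi.org/10.12677/csa.2011.13026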
