Research on Recommendation Algorithms Based on Collaborative Filtering (基于协同过滤的推荐算法研究)
Document type: Degree thesis
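Author | 汪家升 |
Degree | Master's |
Defense date | 2016-05-25 |
Degree-granting institution | Shenyang Institute of Automation, Chinese Academy of Sciences |
Advisor | 宋宏 |
Keywords | recommender systems, collaborative filtering, data imputation, supervised learning, singular value decomposition |
Alternative title | Research of Recommender Algorithm based on Collaborative Filtering |
Major | Pattern Recognition and Intelligent Systems |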
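Abstract (translated from Chinese) | As more of daily life moves onto the Internet, data surrounds us, and people have passed from an era of data scarcity into an ocean of data. While enjoying the convenience that massive data brings, people increasingly find it difficult to locate the information that is valuable to them; this is the "information overload" problem. Two kinds of solutions have emerged. One is the search engine, but a search engine requires the user to actively and accurately describe an intent, and it returns the same response to every user, so it cannot meet personalized information needs. In response, another kind of system has developed rapidly, one that pushes personalized information without requiring the user to state an intent: the recommender system. A recommender system analyzes a user's historical interaction data to identify the user's interests and then provides recommendations. Recommender systems have been studied in depth in both academia and industry and have been applied successfully to music, movies, e-commerce, news recommendation, and other areas, providing convenient information services to users while creating substantial commercial value. The main work of this thesis is as follows: 1) The research background and current state of research on recommender systems are introduced in detail, the definition and basic concepts of recommender systems are explained, and the principles, strengths, and weaknesses of today's mainstream recommendation algorithms are analyzed. 2) A collaborative filtering algorithm based on biclustering imputation is proposed. The algorithm fills the missing entries of the rating matrix using a biclustering algorithm, which increases the density of the matrix and mitigates its sparsity; it also introduces a variable weight matrix that assigns different weights to original values and filled values, and uses it to improve the similarity function and the prediction function. Experimental results show that the proposed biclustering-based collaborative filtering algorithm effectively mitigates data sparsity and improves prediction accuracy. 3) A collaborative filtering algorithm based on supervised learning is proposed. The algorithm first decomposes the rating matrix with a baseline SVD model into a user bias vector, an item bias vector, a user feature matrix, and an item feature matrix; it then combines these features into a new feature space, producing a feature matrix that a supervised learning algorithm can use; finally, it trains and predicts with the random forest algorithm. Experimental results show that the proposed algorithm achieves higher accuracy than current mainstream algorithms such as SVD, user-based collaborative filtering, item-based collaborative filtering, and the LFM model, and that the model is very stable, reaching very high recommendation accuracy with only a small number of features. In addition, the algorithm effectively alleviates the data sparsity problem and improves scalability. 4) The proposed algorithms are compared experimentally with existing algorithms on the MovieLens dataset, with the implementation written in Python. The experiments compare the SVD algorithm, the LFM algorithm, user-based collaborative filtering, and item-based collaborative filtering, and the reported results show that the proposed algorithms achieve higher prediction accuracy and recommendation quality than the existing algorithms. |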
English abstract | With the rapid development of the Internet, we are surrounded by data, and people have moved from an age of data scarcity into an ocean of data. While enjoying the convenience of big data, people also find it increasingly hard to locate the information they really need; this is the problem of "information overload". Two kinds of methods have been put forward to address it. One is the search engine, which returns information matching the keywords submitted by the user. However, a search engine requires the user to take the initiative and describe the intended query accurately, and it always responds with the same result to the same input, so it cannot offer a personalized information service. In view of this, another method has developed rapidly, one that pushes personalized information without requiring the user to enter an intent: the recommender system. A recommender system analyzes user interaction data to identify user interests and thereby provides information recommendation services. Recommender systems have been studied thoroughly in both academia and industry and have been applied successfully in music, film, e-commerce, news recommendation, and other fields. While bringing convenience to users, they also create considerable business value. The main work of this thesis is as follows: 1) The research background and current state of research on recommender systems are introduced in detail, the definition and basic concepts of recommender systems are described, and the principles, advantages, and disadvantages of today's mainstream recommendation algorithms are analyzed. 2) A collaborative filtering algorithm based on biclustering is proposed. The algorithm fills in the missing data of the rating matrix using a biclustering algorithm, which increases the density of the matrix and mitigates its sparsity. By introducing a variable weight matrix, original values and filled values are given different weights, and the similarity function and the prediction function are improved accordingly. The experimental results show that the proposed algorithm effectively mitigates the data sparsity problem and increases prediction accuracy. 3) A collaborative filtering algorithm based on supervised learning is proposed. The algorithm first uses a singular value decomposition model to decompose the rating matrix into a user bias vector, an item bias vector, a user feature matrix, and an item feature matrix. These four components are then transformed into a feature matrix on which a supervised learning algorithm can be trained and used for prediction. The experimental results show that the collaborative filtering algorithm based on supervised learning outperforms mainstream algorithms such as singular value decomposition (SVD), user-based collaborative filtering, item-based collaborative filtering, and the LFM model, and that the method is stable and achieves very good results with only a few features. In addition, it effectively alleviates the data sparsity problem and scales better. 4) The proposed algorithms are compared experimentally with existing algorithms on the MovieLens dataset, implemented in Python. A comparison with the SVD, LFM, user-based collaborative filtering, and item-based collaborative filtering algorithms is given based on the experimental results, which show that the proposed algorithms achieve better prediction accuracy and recommendation quality than the existing algorithms they were compared against. |
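Language | Chinese |
Ownership order | 1 |
Pages | 72 |
Source URL | [http://ir.sia.cn/handle/173321/19631] |
Collection | Shenyang Institute of Automation_Digital Factory Research Laboratory |
Recommended citation (GB/T 7714) | 汪家升. 基于协同过滤的推荐算法研究[D]. 中国科学院沈阳自动化研究所. 2016. |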
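The variable-weight idea described in item 2 of the abstract can be illustrated with a small sketch. It is not the thesis's implementation: the abstract gives neither the biclustering step nor the weight formula, so item-mean filling stands in for biclustering here, and `FILL_WEIGHT`, `fill_with_item_means`, `weighted_cosine`, and `predict` are hypothetical names and choices made for this example only.

```python
from math import sqrt

# Hypothetical weight for filled (imputed) entries; observed ratings get 1.0.
FILL_WEIGHT = 0.3


def fill_with_item_means(ratings):
    """Fill None entries with the item's mean observed rating.

    Stand-in for the biclustering-based filling named in the abstract.
    Returns the filled matrix and a weight matrix of the same shape.
    """
    n_items = len(ratings[0])
    means = []
    for j in range(n_items):
        col = [row[j] for row in ratings if row[j] is not None]
        means.append(sum(col) / len(col) if col else 0.0)
    filled = [[means[j] if r is None else float(r) for j, r in enumerate(row)]
              for row in ratings]
    weights = [[FILL_WEIGHT if r is None else 1.0 for r in row]
               for row in ratings]
    return filled, weights


def weighted_cosine(u, v, wu, wv):
    """Cosine similarity in which item k counts with weight wu[k] * wv[k]."""
    w = [x * y for x, y in zip(wu, wv)]
    num = sum(k * a * b for k, a, b in zip(w, u, v))
    den = (sqrt(sum(k * a * a for k, a in zip(w, u)))
           * sqrt(sum(k * b * b for k, b in zip(w, v))))
    return num / den if den else 0.0


def predict(filled, weights, user, item):
    """Similarity-weighted average of neighbours' ratings for `item`.

    A neighbour's filled value is down-weighted by its entry weight.
    """
    num = den = 0.0
    for other in range(len(filled)):
        if other == user:
            continue
        sim = weighted_cosine(filled[user], filled[other],
                              weights[user], weights[other])
        w = sim * weights[other][item]
        num += w * filled[other][item]
        den += abs(w)
    return num / den if den else filled[user][item]


if __name__ == "__main__":
    # Toy 3x3 rating matrix; None marks a missing rating.
    ratings = [[5, 4, None], [5, 4, 1], [1, 2, 5]]
    filled, weights = fill_with_item_means(ratings)
    print(predict(filled, weights, user=0, item=2))
```

Setting `FILL_WEIGHT` to 1.0 treats filled and observed values alike, while setting it to 0.0 ignores filled values entirely, which is one way to see what the weight matrix contributes.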
Ingest method: OAI harvesting
Source: Shenyang Institute of Automation
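The feature-construction step described in item 3 of the abstract can be sketched as follows, under stated assumptions. The abstract names a baseline SVD model with user and item biases but gives no training details, so this sketch assumes the common biased matrix factorization form (global mean + user bias + item bias + dot product of latent vectors) trained by stochastic gradient descent. The function names, hyperparameters, and toy data are hypothetical.

```python
import random


def train_biased_mf(triples, n_users, n_items, k=2, epochs=200,
                    lr=0.01, reg=0.05, seed=0):
    """Fit r_ui ~ mu + bu[u] + bi[i] + P[u].Q[i] by SGD on (u, i, r) triples."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in triples) / len(triples)
    bu = [0.0] * n_users
    bi = [0.0] * n_items
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in triples:
            err = r - (mu + bu[u] + bi[i]
                       + sum(a * b for a, b in zip(P[u], Q[i])))
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]  # use pre-update values for both
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return mu, bu, bi, P, Q


def predict_baseline(model, u, i):
    """Rating predicted by the factorization alone."""
    mu, bu, bi, P, Q = model
    return mu + bu[u] + bi[i] + sum(a * b for a, b in zip(P[u], Q[i]))


def feature_row(model, u, i):
    """Concatenate [user bias, item bias, user factors, item factors]."""
    _, bu, bi, P, Q = model
    return [bu[u], bi[i]] + P[u] + Q[i]


if __name__ == "__main__":
    # Toy (user, item, rating) triples, made up for illustration.
    triples = [(0, 0, 5), (0, 1, 4), (1, 0, 5), (1, 1, 4),
               (1, 2, 1), (2, 0, 1), (2, 1, 2), (2, 2, 5)]
    model = train_biased_mf(triples, n_users=3, n_items=3)
    X = [feature_row(model, u, i) for u, i, _ in triples]  # feature matrix
    y = [r for _, _, r in triples]                         # regression target
    print(len(X), len(X[0]))
```

Each row of `X` has 2 + 2k entries and is paired with the observed rating in `y`, so the pair can be passed to any regressor. The abstract names random forest for this last step; scikit-learn's `RandomForestRegressor` (`fit(X, y)` then `predict(...)`) would be one option, though the abstract does not say which library was used.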