Chinese Academy of Sciences Institutional Repositories Grid
Research on Recommendation Algorithms Based on User-Generated Content (基于用户生成内容的推荐算法研究)

Document type: Thesis

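Author: Xu Song (徐松)
Degree type: Master of Engineering
Defense date: 2015-05-21
Degree-granting institution: University of Chinese Academy of Sciences
Place of conferral: Institute of Automation, Chinese Academy of Sciences
Advisor: Wang Liang (王亮)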
Keywords: User-Generated Content; Recommendation Systems; Topic Model; Learning to Rank
Other title: Research on User-Generated Content based Recommendation Algorithms
Degree discipline: Computer Technology
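Chinese abstract (translated): As the Internet has entered the Web 2.0 era, user-generated content such as text, images, and video has attracted growing attention and has become an important data source for recommender systems. This thesis focuses on user-generated content in recommendation scenarios, such as users' reviews and tags of items, which describe user-item interactions. As a form of interaction information between users and items, such content reflects both user interests and item attributes, which is exactly what recommendation scenarios require. We therefore argue that mining user-generated content properly and effectively helps recommender systems achieve better results. How to mine large volumes of user-generated content effectively and extract useful information to support item recommendation is the focus of this thesis. Traditional recommendation algorithms tend to overfit sparse rating data, and the latent factors learned by matrix factorization are not interpretable. To address these problems, this thesis incorporates user-generated content and proposes improved algorithms for both rating prediction and top-N recommendation. The main contributions are as follows: 1. A recommendation algorithm based on a coupled topic model is proposed. Since user-generated content reflects both user interests and item attributes, we first extract user documents and item documents from the user-generated content, then use the coupled topic model to model the documents and the ratings jointly, uncover users' latent interests, and perform rating prediction and item recommendation. 2. A personalized semantic ranking model based on user-generated content is proposed. The model directly models the ternary "user-item-user-generated content" relation, learning latent user vectors that represent user interests and latent item vectors that represent item attributes. All items are then ranked according to the learned latent factor vectors, producing a personalized recommendation list for top-N recommendation.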
English abstract: With the development of Web 2.0, user-generated content (UGC) such as text, images, and video has attracted increasing attention. It has also become one of the most important data sources for recommendation systems. Here we focus on user-generated content in recommendation scenarios, such as the reviews or tags that users give to items. As information associated with user-item interaction, user-generated content provides clues about both user interests and item characteristics. This property is exactly what recommendation systems need. By incorporating UGC, recommendation systems have the potential to generate more meaningful and effective recommendations for users. In this thesis, we try to capture useful information in user-generated content to improve recommendation accuracy. Traditional algorithms tend to overfit when user feedback is sparse. In addition, the latent factor vectors in collaborative filtering approaches make it difficult to give convincing explanations of why the system's recommendations are reasonable. We develop algorithms for both rating prediction and top-N recommendation to improve recommendation accuracy. This thesis covers the following two topics: 1. We present a novel model, named Coupled Topic Model, for recommendation. It first extracts user documents and item documents from user-generated content, and then models the documents and the rating matrix simultaneously. After learning the latent vectors, we can perform rating prediction and recommendation. 2. We present a model, named Personalized Semantic Ranking, for top-N recommendation. It models the user-item-UGC relation directly. After learning the latent vectors, we rank all items for each user and then generate a personalized item list for recommendation.
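The final step described in the abstract, ranking all items for a user from learned latent vectors and keeping the top N, can be illustrated with a minimal sketch. This is not the thesis's Personalized Semantic Ranking model: the inner-product scoring rule, the function names `dot` and `top_n`, the `seen` filter, and the toy vectors are all assumptions made for illustration, and the vectors are taken as already learned.

```python
from typing import Dict, FrozenSet, List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length latent vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_n(
    user_vec: Sequence[float],
    item_vecs: Dict[str, Sequence[float]],
    n: int,
    seen: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Return the n highest-scoring item ids for one user.

    Assumption: the relevance score is the inner product of the user and
    item latent vectors. Items in `seen` (hypothetical: items the user has
    already interacted with) are excluded from the list.
    """
    scores = {
        item: dot(user_vec, vec)
        for item, vec in item_vecs.items()
        if item not in seen
    }
    # Sort by descending score; break ties by item id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Toy latent vectors, invented for this example only.
user = [1.0, 0.0]
items = {"a": [0.9, 0.1], "b": [0.2, 0.8], "c": [0.5, 0.5]}
recommended = top_n(user, items, n=2)
```

In this toy setup the user vector weights only the first latent dimension, so items are ordered by their first coordinate. Any other scoring function learned by a model could be substituted for `dot` without changing the ranking step.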
Language: Chinese
Other identifier: 2012E8014661100
Source URL: [http://ir.ia.ac.cn/handle/173211/7778]
Collection: Graduates_Master's Theses
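Recommended citation
GB/T 7714
Xu Song. Research on Recommendation Algorithms Based on User-Generated Content [D]. Institute of Automation, Chinese Academy of Sciences. University of Chinese Academy of Sciences. 2015.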

Ingest method: OAI harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.