Chinese Academy of Sciences Institutional Repositories Grid
Research and Application of Time-Dynamic Recommendation Algorithms for Online Teaching Resources

Document type: Dissertation

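Author: Zhang Haidong
Degree: Doctor of Engineering
Defense date: 2016-11-30
Degree-granting institution: Graduate University of the Chinese Academy of Sciences
Place of conferral: Beijing
Advisor: Yang Yiping
Keywords: collaborative filtering; recommender systems; hidden Markov model; temporal dynamics; hidden state transition; course sequence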
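Chinese abstract (translated): As information technology penetrates ever deeper into education, teaching information is growing explosively. The resulting information overload makes it hard for teachers and learners to find useful resources and seriously limits their efficiency in work and study. Recommender systems, which analyze user information and historical behavior to proactively select resources that users are interested in, are an effective way to address information overload. In education, however, users’ interest in teaching resources is shaped by factors such as course knowledge points, the user’s knowledge level, and comprehension ability, and therefore changes markedly over time; recommendation of online teaching resources must take this temporal evolution of user interest into account. Meanwhile, in current educational informatization applications, the construction of teaching resource libraries tends to emphasize rich content while neglecting the collection of user behavior data, so sparse behavior data is very common and commonly used recommendation algorithms struggle to achieve good results and performance. Sparsity is especially severe in the early stage of a teaching website’s operation: time-dynamic recommendation algorithms can hardly learn the patterns of interest change from the behavior data, and many teaching resources face a cold-start problem because they have no user behavior records.
To address evolving user interests, missing user behavior data, and extremely sparse data in online educational resource recommendation, this thesis studies time-dynamic recommendation algorithms for online teaching resources and applies them to the enterprise education cloud platform of Beijing Honghe Technology. The specific research content is as follows: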
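(1) Because users’ interest in teaching resources shifts with the progression of knowledge points and thus shows clear temporal dynamics, we propose a collaborative filtering recommendation algorithm based on a hidden semi-Markov model. The algorithm uses hidden states to represent a user’s latent interests and introduces a state sojourn time to describe how long a latent interest persists. Using the state transitions and state durations of the semi-Markov process, it tracks both the changes in latent interest and their dwell times. By drawing on the hidden states and state transitions at multiple time points, together with the probability distribution of durations, it infers the user’s next latent interest and the selection probability of each resource more stably and accurately. Experimental analysis shows that the algorithm effectively models and characterizes the heterogeneity of interest dwell times (different patterns of change), and thus significantly improves recommendation quality in application scenarios with temporal dynamics.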
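(2) For incomplete user behavior data (data sparsity), we propose a collaborative filtering recommendation algorithm based on an inhibited hidden Markov model. The algorithm introduces hidden states with two conditions (inhibited or active) to represent a user’s latent interests. The inhibited condition represents an idle latent interest and models time points at which user behavior is missing because of data sparsity; the active condition represents an activated latent interest and models the user’s actions on resources. Combined with the state transitions of the Markov process, the model tracks the dynamic evolution of user interest. Based on the user’s current interest and the state transition probability distribution, it then infers the probability distribution over resources selected at the next time step, conditional on the interest being active. Experiments show that the method effectively handles the temporal discontinuity of user behavior caused by data sparsity and thus improves performance across different sparsity levels.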
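(3) Furthermore, for real online teaching environments, where a newly launched system faces extremely sparse user behavior data and newly added resources have no user data (resource cold start), we propose a hybrid recommendation algorithm based on course sequence structure. The algorithm takes the course a user is currently following as the representation of the user’s interest, and uses a decision tree and a Markov chain model to capture the hierarchical classification of course textbooks and prior knowledge of course ordering. It analyzes the text content of resources to compute the relationships among courses, resources, and users, which lets it track user interest and resolves the resource cold-start problem. It also incorporates, in weighted form, resource association scores computed from user behavior, so that the limited behavior data is fully used. The method thus recommends by fusing prior knowledge of course sequences, user behavior, and resource content. Experiments show that it can track user interest dynamically under extremely sparse data and improve recommendation quality.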
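The hidden-state tracking described in items (1) and (2) can be illustrated with a minimal Python sketch. It is not the thesis’s algorithm: it assumes a plain first-order hidden Markov model with hand-set toy parameters, and it treats a time step with no recorded behavior (`None`) as an inhibited observation, in the spirit of item (2). The duration modelling of item (1) is omitted. The names `next_item_distribution`, `start`, `trans`, `emit`, and `active` are hypothetical.

```python
from typing import List, Optional, Sequence


def normalize(values: Sequence[float]) -> List[float]:
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("cannot normalize a vector with zero total mass")
    return [v / total for v in values]


def step(belief: Sequence[float], trans: Sequence[Sequence[float]]) -> List[float]:
    """Propagate a belief over hidden states through one transition."""
    n = len(belief)
    return [sum(belief[i] * trans[i][j] for i in range(n)) for j in range(n)]


def next_item_distribution(
    history: Sequence[Optional[int]],
    start: Sequence[float],
    trans: Sequence[Sequence[float]],
    emit: Sequence[Sequence[float]],
    active: Sequence[float],
) -> List[float]:
    """Return P(next item | history, next interest is active).

    history: item indices per time step; None marks a step with no behavior.
    start:   initial distribution over hidden interest states (assumed).
    trans:   state transition matrix, rows sum to 1 (assumed).
    emit:    emit[k][m] = P(item m | state k, active) (assumed).
    active:  active[k] = P(state k is active rather than inhibited) (assumed).
    """
    n_states = len(start)
    belief = list(start)
    for t, obs in enumerate(history):
        if t > 0:
            belief = step(belief, trans)
        if obs is None:
            # Missing behavior: explained by the inhibited condition.
            likelihood = [1.0 - active[k] for k in range(n_states)]
        else:
            # Observed behavior: state is active and emits the item.
            likelihood = [active[k] * emit[k][obs] for k in range(n_states)]
        belief = normalize([b * l for b, l in zip(belief, likelihood)])

    predicted = step(belief, trans)
    # Condition the next state on being active before emitting an item.
    active_next = normalize([p * a for p, a in zip(predicted, active)])
    n_items = len(emit[0])
    return [
        sum(active_next[k] * emit[k][m] for k in range(n_states))
        for m in range(n_items)
    ]
```

With two interest states and three resources, a history such as `[0, None, 0]` keeps the belief on the state that favors resource 0 across the gap, so resource 0 remains the top recommendation; the gap is absorbed by the inhibited condition and does not count as evidence against the interest.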
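Finally, the research results are applied to a real engineering project at Beijing Honghe Technology: we developed a hybrid recommender system with a three-layer structure (data model analysis, offline algorithm analysis, and online resource recommendation), which delivers personalized recommendation of educational resources in the company’s repository across 12 grades, 9 textbook editions, and 10 subjects, and we completed the integration of the hybrid recommender system with the company’s software products.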
English abstract: With the growing use of information technology in education, the explosive growth of online teaching resources has led to information overload. This makes it hard for teaching staff and students to find useful information and reduces their efficiency in work and study. Recommender systems, an effective way to overcome information overload, actively filter the information that users are interested in by analyzing users’ information and historical records. However, in recommender systems for the education field, users’ interests in teaching resources change over time as a result of many factors, such as the progression of courses, users’ knowledge levels, and their capacity for understanding. Hence, users’ time-dependent interests must be taken into consideration in online teaching resource recommendation. Moreover, in education informatization, builders of teaching resource libraries attach importance to resource content but neglect the collection of users’ historical records. As a result, data sparsity is very common in these record datasets, and typical recommendation methods perform poorly. The situation is even worse when a website platform has just launched: time-dependent dynamic recommendation methods can hardly learn users’ changing patterns from the historical records, and a cold-start problem exists for a large number of online teaching resources because they lack user behavior records.
    Considering users’ dynamic interests, missing user behaviors, and extremely sparse data in online teaching resource recommendation, we study time-dependent recommendation methods for online teaching resources and apply them to the teaching cloud platform of Beijing Honghe Technology. The contents of this thesis are as follows:
    (1) Users’ interests in teaching resources, which are affected by the progression of knowledge points, change over time. Considering this, we propose a hidden semi-Markov model for collaborative filtering. It uses hidden states to denote users’ latent interests and introduces a state duration to denote how long a latent interest lasts. It tracks the transition and duration of users’ interests with the state transitions and state durations of the semi-Markov process. Based on the distributions of state transitions and state durations, we use the hidden states of the last several time periods to infer users’ next interests and the probability that each item will be preferred. The experiments show that this method can model the heterogeneity of users’ interest durations (different changing patterns), which explains why our algorithm outperforms other methods in time-dependent applications.
    (2) Considering data sparsity in users’ historical records, we propose an inhibited hidden Markov model for collaborative filtering. It introduces a binary variable (inhibited or active) into the hidden states that denote users’ latent interests. The inhibited hidden states represent idle interests and model the records that are missing at certain time points because of data sparsity. The active hidden states represent activated interests and model users’ behaviors. The model tracks users’ changing interests over time with the state transitions of the Markov process. Moreover, based on the distributions of users’ current states and state transitions, we predict the distribution of users’ next active latent states and the probability that each item will be preferred. The experiments show that the method can handle the time discontinuity of users’ records and improves performance under different levels of data sparsity.
    (3) Furthermore, when an online teaching platform has just launched, it faces extremely sparse historical data and new teaching resources that lack user records (cold start). Hence, we propose a hybrid recommendation method based on the sequential structure of courses. This algorithm represents users’ interests by the courses they are studying. It combines a decision tree and a Markov model to capture prior knowledge of the hierarchical structure of textbooks and the sequential structure of courses. We perform content-based analysis on the keywords of teaching resources and courses to obtain their association, as well as the association between courses and users, in order to track users’ dynamic interests; this addresses the cold start of teaching resources. Finally, to make full use of the historical records, we employ a weighted method to combine the above results with association rule mining on users’ records. The method thus combines prior knowledge of course sequences, users’ records, and resource content for recommendation. The experiments show that it can track users’ dynamic changes under high data sparsity and achieves better performance.
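The weighted combination in item (3) can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis’s method: it uses a first-order Markov chain estimated from course sequences, Jaccard overlap of keyword sets as a stand-in for the content analysis, and a fixed weight of 0.3 for the behavior-based score. The decision tree over the textbook hierarchy is omitted, and every function name and parameter here is hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Set


def fit_course_transitions(sequences: Iterable[List[str]]) -> Dict[str, Dict[str, float]]:
    """Estimate P(next course | current course) from ordered course lists."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
        for current, nexts in counts.items()
    }


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Keyword overlap between two sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_scores(
    current_course: str,
    transitions: Dict[str, Dict[str, float]],
    course_keywords: Dict[str, Set[str]],
    resource_keywords: Dict[str, Set[str]],
    behavior_scores: Dict[str, float],
    weight: float = 0.3,  # assumed weight of the behavior-based score
) -> Dict[str, float]:
    """Blend a content score driven by the course sequence with a behavior score."""
    next_probs = transitions.get(current_course, {})
    scores = {}
    for resource, keywords in resource_keywords.items():
        # Content score: expected keyword overlap with the likely next courses.
        content = sum(
            prob * jaccard(course_keywords.get(course, set()), keywords)
            for course, prob in next_probs.items()
        )
        # Resources without behavior records fall back to the content score.
        behavior = behavior_scores.get(resource, 0.0)
        scores[resource] = (1.0 - weight) * content + weight * behavior
    return scores
```

Because the content term does not depend on behavior records, a newly added resource whose keywords match the likely next course can still receive a non-zero score, which is the cold-start property the paragraph describes.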
    Finally, we apply the research in practice and develop a hybrid recommender system consisting of three layers: a data layer for data analysis, an offline layer with a collection of recommendation methods, and an online layer for making recommendations in real time. It recommends teaching resources associated with twelve grades, nine textbook versions, and ten subjects to users. We have also integrated this hybrid recommender system into the enterprise software.
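Subject: Computer Application Technology
Source URL: http://ir.ia.ac.cn/handle/173211/13022
Collection: Graduates_Doctoral Dissertations
Author affiliation: Institute of Automation, Chinese Academy of Sciences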
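Recommended citation
GB/T 7714
Zhang Haidong. Research and Application of Time-Dynamic Recommendation Algorithms for Online Teaching Resources[D]. Beijing. Graduate University of the Chinese Academy of Sciences. 2016.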

Ingest method: OAI harvesting

Source: Institute of Automation

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.