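CAS IR GRID
Research Units & Topics: Institute of Software
The Institute of Software, Chinese Academy of Sciences was founded on March 1, 1985. It is a comprehensive, base-type research institute devoted to research and development in computer science theory and advanced software technology. It is located in the CAS Software Park, No. 4 South Fourth Street, Zhongguancun, Haidian District, Beijing. The institute's key disciplines are computer science, computer software, computer application technology, and information security. Its research directions are: computer science and software theory; fundamental software technologies and systems; theories, methods, and technologies for Internet information processing; and integrated information system technology.
http://www.irgrid.ac.cn:8080/handle/1471x/24371
2024-03-29T05:09:00Z
2024-03-29T05:09:00Z

Research on High-Performance Parallel Solution Techniques for Sparse Triangular Systems in the Numerical Solution of Partial Differential Equations
陈道琨
http://www.irgrid.ac.cn:8080/handle/1471x/7189229
2022-10-11T16:20:56Z
2022-10-09T14:51:06Z
Title: Research on High-Performance Parallel Solution Techniques for Sparse Triangular Systems in the Numerical Solution of Partial Differential Equations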
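Author: 陈道琨
2022-10-09T14:51:06Z

Research on Text Representation Methods Incorporating Sentence-Level Context Information for Classification Tasks
亢良伊
http://www.irgrid.ac.cn:8080/handle/1471x/7179098
2022-09-27T16:16:48Z
2022-09-23T16:15:47Z
Title: Research on Text Representation Methods Incorporating Sentence-Level Context Information for Classification Tasks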
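Author: 亢良伊
2022-09-23T16:15:47Z

Research on Attack Analysis and Detection Techniques for Blockchain Protocols
夏清
http://www.irgrid.ac.cn:8080/handle/1471x/7157978
2022-07-19T16:22:13Z
2022-07-15T11:15:08Z
Title: Research on Attack Analysis and Detection Techniques for Blockchain Protocols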
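Author: 夏清
2022-07-15T11:15:08Z

Research on Key Techniques for Video Recommendation Based on User Preference Modeling
李贝贝
http://www.irgrid.ac.cn:8080/handle/1471x/7157977
2022-07-19T16:22:13Z
2022-07-14T18:08:35Z
Title: Research on Key Techniques for Video Recommendation Based on User Preference Modeling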
Author: 李贝贝
Abstract: <div>
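Existing personalized recommendation techniques fall broadly into two categories: collaborative-filtering-based recommendation and feature-based recommendation. Collaborative-filtering-based recommendation takes historical user-item interaction data as input, whereas feature-based recommendation additionally requires feature information about users, items, or the interaction context. Collaborative-filtering-based recommendation can be divided into four classes: neighborhood-based, random-walk-based, matrix-factorization-based, and deep-learning-based collaborative filtering. Feature-based recommendation uses user features, item features, scenario features, and similar information to predict the probability of a user-item interaction. Traditional feature-based recommendation models can be divided into logistic-regression-based models, feature-combination-based models, model-combination-based models, and models based on deep learning and features. There is also some research on video recommendation; by video type, personalized video recommendation can be divided into personalized long-video recommendation and personalized micro-video recommendation.</div>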
<div>
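This dissertation focuses on video recommendation based on user preference modeling. For complex and diverse video recommendation scenarios, it analyzes the characteristics of user interaction behavior in depth, models user preference vectors, and then predicts users' interaction scores or probabilities for candidate items to produce the final recommendations. The work covers three lines of research: first, micro-video recommendation based on multi-interest modeling of individual users; second, video list recommendation based on multi-level preferences of individual users; and third, on-demand movie recommendation based on group user preferences. Specifically, the main contributions are as follows:</div>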
<div>
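(1) Because users in micro-video scenarios exhibit multiple interests, this dissertation proposes a novel heterogeneous multi-interest model, OPAL. Through two-stage training (pre-training followed by fine-tuning), the model learns multiple interest vectors for each user from the user's historical interaction sequence, generates one set of recommendations from each interest vector, and then merges these sets into the final recommendation result. In particular, in the pre-training stage OPAL uses conservatively disentangled soft user interests to raise the confidence of interest disentanglement; in the fine-tuning stage it introduces aggressively disentangled hard user interests to increase the heterogeneity among interests and to model how each individual interest evolves over time. Offline experiments on real datasets and online A/B tests in a real production environment verify that OPAL improves the accuracy and diversity of recommendations and, in turn, online metrics such as average daily playtime per user. In addition, because users' positive feedback in micro-video scenarios is noisy, this dissertation introduces contrastive learning into multi-interest modeling and proposes a contrastive multi-interest model, CMI. The model performs data augmentation by random sampling and assumes that two augmented versions of the same sequence reflect the same user preferences, which reduces the influence of noisy data and makes the user interest vectors more robust. Offline experiments on real datasets demonstrate the effectiveness of CMI.</div>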
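The contrastive idea behind CMI (two randomly sampled views of one interaction sequence should yield matching preference vectors) can be sketched roughly as follows. This is an illustration under stated assumptions, not the dissertation's implementation: the function names, the `keep_ratio` and `temperature` parameters, and the use of a standard InfoNCE loss in place of the unspecified contrastive multi-interest loss are all hypothetical choices.

```python
import numpy as np


def random_subsequence(seq, keep_ratio, rng):
    """Augment a sequence by randomly sampling items while keeping their order.

    Assumption: "random sampling" is taken to mean order-preserving subsampling.
    """
    n_keep = max(1, int(round(len(seq) * keep_ratio)))
    idx = np.sort(rng.choice(len(seq), size=n_keep, replace=False))
    return [seq[i] for i in idx]


def info_nce(view_a, view_b, temperature=0.1):
    """InfoNCE loss (a stand-in for the CMI loss, which the abstract does not define).

    view_a, view_b: (batch, d) arrays; row i of each is assumed to encode an
    augmented view of the same user's sequence. Other rows act as negatives.
    """
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                      # cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))           # positives on the diagonal
```

In a real model the two views would pass through a sequence encoder before the loss; here the encoder is omitted, so the loss function only shows how matching views are pulled together relative to views from other users.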
<div>
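(2) Because users' clicks on lists in video list scenarios are influenced by multiple factors, this dissertation proposes MULTIPLE, a list recommendation model based on multi-level user preferences. MULTIPLE learns users' video-level and list-level preferences from user-video and user-list interaction data, respectively. Since the videos in a list differ in importance to a given user, MULTIPLE uses a distillation attention mechanism to compute how well each video in the list matches the user's video-level preference, then aggregates the videos with these weights into a personalized list content representation, from which it learns the user's hybrid-level preference. To predict a user's preference score for a candidate item, MULTIPLE jointly considers how well the overall style of the target list fits the list-level preference, how well the videos in the list fit the video-level preference, and how well the list's content embedding fits the hybrid-level preference. Offline experiments on real datasets show that MULTIPLE improves the recall of list recommendation; A/B tests in the MX Player production environment show that it improves online metrics such as click-through rate.</div>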
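The weighted aggregation step described above (score each video in a list against the user's video-level preference, then combine the videos with those weights) might look like the following sketch. It uses plain scaled dot-product attention as an assumption; the abstract does not specify the "distillation attention mechanism", and the function name and array shapes are hypothetical.

```python
import numpy as np


def personalized_list_embedding(user_pref, video_embs):
    """Aggregate a list's videos into one user-specific content vector.

    user_pref:  (d,) video-level preference vector of one user (assumed given).
    video_embs: (n, d) embeddings of the n videos in the list.
    Returns (weights, list_emb), where weights is (n,) and sums to 1.
    """
    # Match score of each video against the user's preference.
    scores = video_embs @ user_pref / np.sqrt(len(user_pref))
    scores = scores - scores.max()          # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    # Videos that match the user better contribute more to the list content.
    return weights, weights @ video_embs
```

The same list therefore yields different content embeddings for different users, which is what makes the list representation "personalized" in this sketch.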
<div>
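(3) Because movie recommendation for on-demand cinemas must take into account the preferences of all of a cinema's potential audience, this dissertation models the preferences of the cinema's potential audience group and then produces movie recommendations for the cinema. For cinemas already in operation, it proposes Pegasus, a spatio-temporally aware on-demand movie recommendation model. Pegasus models spatial influence from two aspects, the influence of spatial neighbors and the influence of location on movie popularity, and models the temporal dynamics of preferences from three aspects: the periodicity of audience group preferences, the recency effect, and audience group drift. Offline experiments on real on-demand data from iQIYI on-demand cinemas show that Pegasus effectively improves recommendation accuracy. For newly opened cinemas, audience group preferences cannot be learned directly because historical on-demand data is lacking. The dissertation therefore infers the preferences of the user group from POI data around the cinema and proposes a cinema cold-start recommendation model for newly opened on-demand cinemas. The model also exploits the co-popularity relationships between videos to improve the quality of video embeddings. Offline experiments show that the model provides effective recommendations for newly opened on-demand cinemas.</div>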
Abstract: <p>
With the development of wireless communication technology and the Internet, video platforms have grown rapidly and have accumulated large user bases and high market value. As the numbers of users and videos grow explosively, a platform's ability to accurately and efficiently show users the videos they are interested in directly affects the user experience. Video recommendation has therefore become an important and popular research direction. In recent years, new video application scenarios such as micro-video platforms and offline on-demand cinemas have emerged one after another, and video applications have become increasingly diverse and complex, which poses great challenges for video recommendation. However, existing video recommendation models do not analyze and model user behavior in depth according to the characteristics of these emerging, complex scenarios, which limits the accuracy and diversity of their recommendations. In addition, because existing video recommendation models mainly target individual users, research on video recommendation for group users is lacking. A thorough study of video recommendation is therefore needed.</p>
<p>
The key to achieving high-quality video recommendations is to accurately mine user preferences. Starting from the scenario characteristics of different video applications, this dissertation presents a comprehensive and in-depth study of preference modeling for individual and group users and proposes multiple effective video recommendation models that achieve highly accurate and diverse video recommendations. The main research work is summarized as follows.</p>
<p>
(1) Micro-video recommendation based on multiple interests of individual users. For the micro-video scenario, this dissertation disentangles multiple interests of users from their historical interaction sequences and proposes a heterogeneous multi-interest micro-video recommendation model named OPAL. OPAL relies on a set of implicit, mutually orthogonal micro-video category vectors to conservatively soft-classify and aggressively hard-classify each micro-video that a user has historically interacted with, and then aggregates the micro-videos in the same category to form the user's soft and hard interests. In addition, OPAL adopts a two-stage training strategy of pre-training and fine-tuning: it uses soft interests to improve the confidence of interest disentanglement in the pre-training stage, and applies hard interests to improve interest heterogeneity and to model the evolution of individual interests over time in the fine-tuning stage. The offline experimental results show that OPAL improves the recall of recommendation results compared with existing multi-interest models and improves the diversity of recommendations compared with its single-interest variant. OPAL was also evaluated in an online A/B test on the MX TakaTak micro-video platform, and the results show that it effectively improves online metrics such as playtime. Further, this dissertation applies contrastive learning to multi-interest-based recommendation and proposes a contrastive multi-interest loss as well as a contrastive multi-interest model, CMI. The experimental results show that combining contrastive learning with multi-interest modeling is feasible and effective.</p>
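The soft and hard classification against orthogonal category vectors can be illustrated with a small NumPy sketch. It assumes that soft classification is a softmax over item-category similarities, that hard classification is the corresponding argmax, and that aggregation is a weighted sum of item embeddings; none of these details are confirmed by the abstract, and all function and parameter names are hypothetical.

```python
import numpy as np


def orthogonal_categories(k, d, seed=0):
    """Build k mutually orthogonal unit vectors in d dimensions (requires k <= d).

    In a trained model these would be learned; QR of a random matrix is used
    here only to obtain an orthonormal set for illustration.
    """
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.normal(size=(d, k)))  # q has orthonormal columns
    return q.T                                    # shape (k, d)


def soft_and_hard_interests(item_emb, category_vecs, temperature=1.0):
    """Form per-category interest vectors from a user's interacted items.

    item_emb:      (n, d) embeddings of the items in the user's history.
    category_vecs: (k, d) category vectors.
    Returns (soft_interests, hard_interests), each of shape (k, d).
    """
    scores = item_emb @ category_vecs.T / temperature      # (n, k)
    scores = scores - scores.max(axis=1, keepdims=True)
    soft = np.exp(scores)
    soft /= soft.sum(axis=1, keepdims=True)                # soft assignment
    hard = np.zeros_like(soft)
    hard[np.arange(len(soft)), soft.argmax(axis=1)] = 1.0  # one category per item
    # Aggregate the items assigned to each category into one interest vector.
    return soft.T @ item_emb, hard.T @ item_emb
```

With the hard assignment every item contributes to exactly one interest, so the interests overlap less; with the soft assignment every item contributes a little to all interests, which is the more cautious choice in this sketch.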
<p>
(2) Video list recommendation based on multi-level preferences of individual users. For the video list recommendation scenario, this dissertation explicitly models users' preferences at the list level, the video level, and a hybrid level, and proposes the video list recommendation model MULTIPLE on the basis of these multi-level user preferences. A sequence learning module learns users' list-level and video-level preferences from user-list and user-video interaction data, respectively. Hybrid-level user preferences are then learned by using the hierarchical list-video structure as a bridge. For prediction, MULTIPLE integrates users' preferences at multiple levels to predict preference scores for candidate lists. Offline evaluation on real datasets shows that MULTIPLE effectively improves the recall of recommendation results. In addition, online A/B tests on MX Player show that it effectively improves online metrics such as CTR.</p>
<p>
(3) On-demand movie recommendation based on group user preferences. To address on-demand movie recommendation for offline on-demand cinemas, this dissertation models the preferences of potential audience groups and then provides movie recommendations for on-demand cinemas. For cinemas already in operation, it proposes a spatial-temporal on-demand movie recommendation model, Pegasus, which mines historical on-demand records, POI (Point Of Interest) data around the cinemas, and content descriptions of the movies to model the influence of time and space on group preferences. The offline experimental results show that Pegasus is highly interpretable and achieves high recommendation accuracy. For newly opened cinemas, audience preferences cannot be modeled directly because interaction data is lacking. The dissertation therefore estimates the location category of each cinema from the POI information around it, predicts audience preferences indirectly, and proposes a cold-start recommendation model based on matrix factorization. To learn more expressive movie embeddings, the model constructs a shifted positive pointwise mutual information (SPPMI) matrix from the co-popularity relationships of movies and factorizes it jointly with the interaction matrix. The experimental results show that the cold-start model effectively improves movie recommendation for newly opened cinemas.</p>
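As a rough illustration of the SPPMI construction mentioned above, the sketch below builds the matrix from a movie-by-movie co-occurrence count matrix using the standard definition, SPPMI = max(PMI - log k, 0). How the dissertation counts "co-popular" movie pairs is not stated, so the input matrix, the function name, and the `shift_k` parameter are assumptions; the joint factorization with the interaction matrix is not shown.

```python
import numpy as np


def sppmi(cooc, shift_k=1.0):
    """Shifted positive PMI matrix from symmetric co-occurrence counts.

    cooc:    (n, n) array; cooc[i, j] is assumed to count how often movies
             i and j are popular together (hypothetical definition).
    shift_k: shift constant k >= 1; larger values give a sparser matrix.
    """
    cooc = np.asarray(cooc, dtype=float)
    total = cooc.sum()
    row = cooc.sum(axis=1, keepdims=True)
    col = cooc.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(cooc * total / (row * col))
    pmi[~np.isfinite(pmi)] = -np.inf   # pairs that never co-occur
    return np.maximum(pmi - np.log(shift_k), 0.0)
```

Only pairs that co-occur more often than chance (after the shift) keep a non-zero entry, so the resulting matrix is sparse and non-negative, which is convenient for factorizing it alongside an interaction matrix.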
<p>
In summary, this dissertation focuses on video recommendation and proposes a series of recommendation models for complex and multifaceted video application scenarios. To improve recommendation performance, it investigates multi-level preference and multi-interest modeling for individual users and preference modeling for group users. The proposed models cover individual and group users, online and offline applications, long-video and micro-video applications, and video streaming and video list streaming applications, which indicates that they have significant practical value.</p>
<p>
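</p>
2022-07-14T18:08:35Z

Design and Implementation of Online Graph Partitioning Algorithms for Heterogeneous Graphs
赵新朋
http://www.irgrid.ac.cn:8080/handle/1471x/7157976
2022-07-19T16:22:13Z
2022-07-13T19:00:51Z
Title: Design and Implementation of Online Graph Partitioning Algorithms for Heterogeneous Graphs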
Author: 赵新朋
2022-07-13T19:00:51Z

Research on Key Techniques for Cache Optimization in Spark
李慧
http://www.irgrid.ac.cn:8080/handle/1471x/7157453
2022-07-12T16:22:22Z
2022-07-11T15:41:05Z
Title: Research on Key Techniques for Cache Optimization in Spark
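Author: 李慧
2022-07-11T15:41:05Z

Research on Methods for Automatic Analysis and Exploitation of Software Vulnerabilities
黄桦烽
http://www.irgrid.ac.cn:8080/handle/1471x/7157452
2022-07-12T16:22:22Z
2022-07-08T16:31:04Z
Title: Research on Methods for Automatic Analysis and Exploitation of Software Vulnerabilities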
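Author: 黄桦烽
2022-07-08T16:31:04Z

Research on Data Protection Algorithms Based on Local Differential Privacy
叶宇桐
http://www.irgrid.ac.cn:8080/handle/1471x/7157454
2022-07-12T16:22:22Z
2022-07-08T15:48:39Z
Title: Research on Data Protection Algorithms Based on Local Differential Privacy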
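Author: 叶宇桐
2022-07-08T15:48:39Z

Research on Metamorphic Testing Methods for Object Detection in Complex Scenes Based on Simulation Environments
魏松江
http://www.irgrid.ac.cn:8080/handle/1471x/7153718
2022-07-05T16:18:27Z
2022-07-05T14:48:54Z
Title: Research on Metamorphic Testing Methods for Object Detection in Complex Scenes Based on Simulation Environments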
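Author: 魏松江
2022-07-05T14:48:54Z

Unified Structure Generation for Universal Information Extraction
陆垚杰
http://www.irgrid.ac.cn:8080/handle/1471x/7153712
2022-07-05T16:18:27Z
2022-07-04T16:22:12Z
Title: Unified Structure Generation for Universal Information Extraction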
Author: 陆垚杰
2022-07-04T16:22:12Z