Chinese Academy of Sciences Institutional Repositories Grid: Research on Vision-Based Representation and Recognition of Human Actions

Research on Vision-Based Representation and Recognition of Human Actions (基于视觉的人的行为表达与识别方法研究)

Document type: Thesis


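Author	Wang Shiquan (王时全)
Degree	Doctor of Engineering
Defense date	2013-06-02
Degree-granting institution	University of Chinese Academy of Sciences
Place conferred	Institute of Automation, Chinese Academy of Sciences
Advisor	Tan Tieniu (谭铁牛)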
Keywords	semantic interpretation of dynamic image sequences; action recognition; directional statistics
Alternative title	Visual Representation and Classification of Human Actions
Degree discipline	Pattern Recognition and Intelligent Systems
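Abstract (translated from Chinese)	Human action recognition is an important research problem in computer vision and pattern recognition. Its goal is to have a computer automatically recognize human actions in video sequences. Many applications, intelligent video surveillance in particular, have a strong need for it. In general, human action recognition first extracts features from a video sequence, then describes the spatial and temporal information of the action and classifies it. Starting from the mechanisms by which humans visually perceive and learn actions, this thesis analyzes the information needed to represent actions and combines directional statistics with gradients and optical flow to represent and recognize actions. The main work and contributions are: 1. Three kinds of information for representing actions are proposed: states, the process of state transition, and the sequence of state transitions. Related work on action recognition is organized along these lines and serves as the foundation for the thesis. 2. Directional statistics is combined with gradients to propose a feature that describes states, together with a corresponding similarity measure. The feature confirms that histograms of oriented gradients are effective at describing the structural information of a target, and it performs well in action recognition. 3. Directional statistics is combined with optical flow to propose a compact state-transition feature. The feature is robust to irregular actions and multi-view actions, is cheap enough to compute for real-time applications, and generalizes well across datasets. 4. A single-person action recognition dataset is built. It is large-scale, recorded in real scenes, multi-view, and close to the practical needs of intelligent video surveillance. Based on this dataset, a metadata relational model is proposed for jointly managing multiple action recognition datasets. Overall, this thesis makes a useful exploration of action recognition and obtains some innovative results.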
English abstract	Human action recognition is an important problem in computer vision and pattern recognition. Many applications, especially intelligent video surveillance, need human action recognition. It is usually addressed by extracting features that capture spatial and temporal information from video sequences. This thesis analyzes how humans visually recognize and learn actions in order to identify the information needed to represent actions visually. Directional statistics is applied to gradients and optical flow for action representation and recognition. The contributions of this thesis include: 1. We propose three kinds of information for representing actions visually: the state, the process of state transition, and the sequence of state transitions. Related action recognition work is organized accordingly to put our work in context. 2. We propose a new descriptor for states based on directional statistics and histograms of oriented gradients. A corresponding similarity measure is also proposed. 3. We develop a compact descriptor for the process of state transition based on directional statistics and optical flow. The descriptor is robust to irregular activities and view changes, and it is efficient enough for real-time applications. 4. We create an action recognition dataset that is large-scale, recorded in real scenes, and multi-view. We also propose a method for organizing multiple action recognition datasets.
Language	Chinese
Other identifier	200818014628064
Source URL	[http://ir.ia.ac.cn/handle/173211/6558]
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	王时全. 基于视觉的人的行为表达与识别方法研究[D]. 中国科学院自动化研究所. 中国科学院大学. 2013.
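
Note on "directional statistics": the abstract says directional statistics is applied to gradient and optical-flow orientations, but this record does not define the resulting descriptors. As a minimal illustration of the kind of quantity directional statistics provides, and not the thesis's actual descriptor, the sketch below computes the magnitude-weighted circular mean and mean resultant length of optical-flow directions. The function names `flow_directions` and `circular_summary` and the sample flow vectors are hypothetical, chosen only for this example.

```python
import math


def flow_directions(flow_vectors):
    """Split (dx, dy) flow vectors into direction angles (radians) and magnitudes."""
    angles = [math.atan2(dy, dx) for dx, dy in flow_vectors]
    magnitudes = [math.hypot(dx, dy) for dx, dy in flow_vectors]
    return angles, magnitudes


def circular_summary(angles, weights=None):
    """Return (mean direction, mean resultant length) of angles in radians.

    Angles are averaged as unit vectors, so 350 degrees and 10 degrees
    average to about 0 degrees rather than 180 degrees. The resultant
    length is near 1 when directions agree and near 0 when they cancel.
    """
    if weights is None:
        weights = [1.0] * len(angles)
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    c = sum(w * math.cos(a) for a, w in zip(angles, weights)) / total
    s = sum(w * math.sin(a) for a, w in zip(angles, weights)) / total
    return math.atan2(s, c), math.hypot(c, s)


# Hypothetical flow field: three vectors pointing roughly to the right.
angles, magnitudes = flow_directions([(1.0, 0.1), (0.9, -0.1), (1.1, 0.0)])
mean_direction, concentration = circular_summary(angles, weights=magnitudes)
```

Treating orientations as points on a circle avoids the wrap-around error that an ordinary arithmetic mean of angles would make, which is the general motivation for using directional statistics on gradient and flow directions.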

Ingest method: OAI harvesting

Source: Institute of Automation

Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.