Chinese Academy of Sciences Institutional Repositories Grid

Vision-based Human-robot Interaction and Navigation of Intelligent Service Robots (智能服务型机器人的视觉人机交互与导航)

Document type: Thesis


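Author	Wei Yucheng (魏育成)
Degree	Doctor of Engineering
Defense date	2004-07-01
Degree-granting institution	Graduate School of the Chinese Academy of Sciences (中国科学院研究生院)
Place of conferral	Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Supervisor	Tan Tieniu (谭铁牛)
Keywords	intelligent service robots; vision-based human-robot interaction; head pose detection and estimation; dynamic hand gesture tracking and recognition; mobile robot navigation; self-localization; place recognition
Alternative title	Vision-based Human-robot Interaction and Navigation of Intelligent Service Robots
Major	Pattern Recognition and Intelligent Systems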
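Chinese abstract (English translation)	Intelligent service robots, also called third-generation robots, are robotic devices that can perceive, judge, and act independently, and that serve people in their daily work and life. "Perception" and "judgment" are their two main characteristics. Perception means that the robot can understand human poses and expressions, understand human speech, and interact with people in a friendly way. Judgment means that the robot can autonomously reason about and decide its own navigation behavior from the external information it perceives. This thesis focuses on perception and judgment based on visual information, covering two areas: "pose-recognition-based human-robot interaction" and "autonomous navigation of mobile robots in unstructured scenes". The problems involved include face localization, head pose recognition and tracking, hand tracking, dynamic hand gesture recognition, self-localization, path planning, and scene learning. Using an intelligent wheelchair as the platform, we study these common and key problems of intelligent service robots one by one. The environments in which intelligent service robots operate are often dynamic and unstructured, and environmental uncertainty makes perception and judgment harder. We therefore use probabilistic models to describe scene uncertainty, and recast object detection, tracking, and place recognition as state estimation from observations or observation sequences; this idea runs through every chapter of the thesis. The contributions of this work are fourfold: ① Following Asimov's laws of robotics, we propose four principles that intelligent wheelchair design should follow, and build a multimodal interactive intelligent wheelchair. The platform features multiple information channels in its hardware architecture, multitasking in its software architecture, and multiple types of control mode. ② We study in depth the recognition, localization, and tracking of head pose in dynamic scenes. By defining a state variable for head pose, we give a unified probabilistic-inference interpretation of head pose recognition, localization, and tracking, in which head pose recognition serves as the observation model of the system state. To make pose recognition more robust to changing illumination, we first extract relatively stable head pose appearance information with a bowl-shaped band-pass filter, and then model the pose clusters with probabilistic principal component analysis. Head pose detection and tracking amount to estimating the posterior probability of the system; we estimate it with a particle filter, incorporate the skin color, shape, and motion information of the head into particle sampling, and achieve real-time detection and tracking through a partitioned sampling strategy. ③ We propose an improved particle filter and apply it to hand tracking. This filter adds a mean shift of the particles to the particle filtering process, which to some extent avoids sample degeneracy and sample impoverishment and increases the number of effective particles. Building on real-time hand tracking, we propose, based on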
English abstract	Intelligent service robots, namely the third generation of robots, are robots that serve people in their daily lives and are capable of interactive perception, independent judgment, and autonomous navigation. The two characteristics of service robots are perception and judgment. Perception means that robots can recognize human gestures or expressions and understand human voice and intentions. The perceptual interface is thus an important channel of human-robot interaction. Judgment means that robots can infer and decide their navigation actions independently and autonomously, based on current environment information acquired through the perceptual interface. The thesis takes a human-centered perspective, focusing on gesture-based human-robot interaction and autonomous navigation within an unstructured environment. Some common and essential subtopics, such as head pose estimation and tracking, hand gesture tracking and recognition, self-localization during navigation, and path planning, are discussed in the thesis based on a robot prototype, namely NLPRwheelchair. Service robots usually work in a highly dynamic and unstructured environment whose uncertainties challenge the perception and judgment of robots. Probabilistic models are therefore better suited than deterministic models to describing these uncertainties. A defining feature of this thesis is that it treats problems of object detection, tracking, and recognition as problems of state estimation within probabilistic frameworks. In this way, the intelligent wheelchair can perform vision-based human-robot interaction and navigation even in an unstructured environment such as a lobby. The novelty of this thesis comes from four points: ① Following Asimov's rules, four criteria for intelligent wheelchair design are proposed. Based on those criteria, an intelligent wheelchair is developed with multi-modal information channels, a software architecture for parallel task implementation, and multiple control modes.
② Problems of head pose estimation, localization, and tracking are unified as a Maximum a Posteriori (MAP) problem of the head pose state, which is composed of a position sub-state and a pose sub-state. Within the probabilistic framework, head pose estimation is proposed as the system observation model (likelihood). A bowl-like band-pass filter is designed to obtain discriminating appearance features of head poses under variable illumination. The likelihood of head poses is defined by statistical modelling of pose clusters with Probabilistic Principal Component Analysis (PPCA). A particle filter is used to estimate the MAP value of the pose state, and multi-cue fusion is realized by partitioned particle sampling. ③ The Mean Shift embedded Particle Filter (MSEPF) is proposed for hand tracking in a cluttered environment. Compared with the conventional particle filter, MSEPF adds a mean shift step before particle wei
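Language	Chinese
Other identifier	801
Source URL	[http://ir.ia.ac.cn/handle/173211/5827]
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	Wei Yucheng. Vision-based Human-robot Interaction and Navigation of Intelligent Service Robots [D]. Institute of Automation, Chinese Academy of Sciences. Graduate School of the Chinese Academy of Sciences. 2004.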
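
The abstracts describe a particle filter with an embedded mean shift step (MSEPF) but the record is truncated before any detail is given. The following is only a minimal one-dimensional sketch of that general idea, not the thesis's algorithm: the random-walk motion model, the Gaussian likelihood, the grid-based mean shift, and all names (`gaussian_likelihood`, `mean_shift`, `msepf_step`, `track_static_target`) and parameter values are assumptions made for illustration. It moves each predicted particle toward a nearby likelihood mode before weighting, then reports the effective sample size.

```python
import math
import random


def gaussian_likelihood(observation, state, obs_std):
    """Unnormalised Gaussian likelihood of an observation given a 1-D state."""
    return math.exp(-0.5 * ((observation - state) / obs_std) ** 2)


def mean_shift(x, likelihood, bandwidth=1.0, n_samples=21, iterations=3):
    """Shift x toward a nearby mode of `likelihood`.

    Each iteration replaces x with the likelihood-weighted mean of grid
    points inside a flat window of half-width `bandwidth` centred on x.
    """
    for _ in range(iterations):
        points = [x + bandwidth * (2.0 * k / (n_samples - 1) - 1.0)
                  for k in range(n_samples)]
        weights = [likelihood(p) for p in points]
        total = sum(weights)
        if total <= 0.0:  # likelihood underflowed everywhere: leave x alone
            break
        x = sum(p * w for p, w in zip(points, weights)) / total
    return x


def msepf_step(particles, observation, rng, motion_std=1.0, obs_std=0.5):
    """One predict / mean-shift / weight / resample cycle (illustrative only)."""
    def likelihood(state):
        return gaussian_likelihood(observation, state, obs_std)

    # 1. Predict: random-walk motion model (an assumption of this sketch).
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # 2. Mean shift: move every particle toward a local likelihood mode.
    shifted = [mean_shift(p, likelihood) for p in predicted]
    # 3. Weight the shifted particles and normalise.
    weights = [likelihood(p) for p in shifted]
    total = sum(weights)
    if total > 0.0:
        weights = [w / total for w in weights]
    else:
        weights = [1.0 / len(shifted)] * len(shifted)
    estimate = sum(p * w for p, w in zip(shifted, weights))
    n_eff = 1.0 / sum(w * w for w in weights)  # effective sample size
    # 4. Resample in proportion to the weights (multinomial resampling).
    resampled = rng.choices(shifted, weights=weights, k=len(shifted))
    return resampled, estimate, n_eff


def track_static_target(target=5.0, n_particles=200, n_steps=5, seed=0):
    """Run the filter on noise-free observations of a fixed 1-D target."""
    rng = random.Random(seed)
    particles = [rng.uniform(0.0, 10.0) for _ in range(n_particles)]
    estimate, n_eff = None, None
    for _ in range(n_steps):
        particles, estimate, n_eff = msepf_step(particles, target, rng)
    return estimate, n_eff
```

Calling `track_static_target()` returns the final weighted-mean estimate and the effective sample size. The sketch does not correct the importance weights for the deterministic shift, which a more careful formulation would need to address; a real hand tracker would also use image-based likelihoods rather than a scalar Gaussian.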

Ingest method: OAI harvesting

Source: Institute of Automation

Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.