Self-Attention Based Visual-Tactile Fusion Learning for Predicting Grasp Outcomes
Document Type: Journal Article
Authors | Cui, Shaowei (1,2); Wang, Rui; Wei, Junhang; Hu, Jingyi; Wang, Shuo
Journal | IEEE ROBOTICS AND AUTOMATION LETTERS
Publication Date | 2020-10-01
Volume/Issue/Pages | 5(4): 5827-5834
Keywords | Grasping; perception for grasping and manipulation; multi-modal perception; force and tactile sensing
ISSN | 2377-3766
DOI | 10.1109/LRA.2020.3010720
Corresponding Author | Wang, Shuo (shuo.wang@ia.ac.cn)
English Abstract | Predicting whether a particular grasp will succeed is critical to performing stable grasping and manipulating tasks. Robots need to combine vision and touch as humans do to accomplish this prediction. The primary problem to be solved in this process is how to learn effective visual-tactile fusion features. In this letter, we propose a novel Visual-Tactile Fusion learning method based on the Self-Attention mechanism (VTFSA) to address this problem. We compare the proposed method with the traditional methods on two public multimodal grasping datasets, and the experimental results show that the VTFSA model outperforms traditional methods by a margin of 5+% and 7+%. Furthermore, visualization analysis indicates that the VTFSA model can further capture some position-related visual-tactile fusion features that are beneficial to this task and is more robust than traditional methods.
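To make the fusion idea in the abstract concrete, the following is a minimal, illustrative PyTorch sketch of self-attention-based fusion of visual and tactile feature vectors for binary grasp-outcome prediction. It is not the paper's VTFSA architecture; the class name, feature dimensions, pooling choice, and classifier head are assumptions made for illustration only.

```python
# Illustrative sketch only: self-attention fusion of visual and tactile
# feature vectors for grasp-outcome prediction (not the authors' VTFSA model).
import torch
import torch.nn as nn

class SelfAttentionFusion(nn.Module):  # hypothetical name
    def __init__(self, visual_dim=512, tactile_dim=128, embed_dim=256, num_heads=4):
        super().__init__()
        # Project each modality into a shared embedding space.
        self.visual_proj = nn.Linear(visual_dim, embed_dim)
        self.tactile_proj = nn.Linear(tactile_dim, embed_dim)
        # Self-attention over the two modality tokens.
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Single logit: grasp success vs. failure.
        self.classifier = nn.Linear(embed_dim, 1)

    def forward(self, visual_feat, tactile_feat):
        # visual_feat: (batch, visual_dim); tactile_feat: (batch, tactile_dim)
        tokens = torch.stack(
            [self.visual_proj(visual_feat), self.tactile_proj(tactile_feat)], dim=1
        )  # (batch, 2, embed_dim)
        fused, _ = self.attn(tokens, tokens, tokens)  # self-attention fusion
        pooled = fused.mean(dim=1)                    # average over modality tokens
        return self.classifier(pooled)                # (batch, 1) logit

# Usage with random placeholder features standing in for learned encoders.
model = SelfAttentionFusion()
logit = model(torch.randn(8, 512), torch.randn(8, 128))
prob_success = torch.sigmoid(logit)  # predicted probability of a successful grasp
```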
Funding Projects | National Key Research and Development Program of China [2018AAA0103003]; National Natural Science Foundation of China [61773378]; Opening Project of Guangdong Provincial Key Lab of Robotics and Intelligent System; Youth Innovation Promotion Association CAS
WOS Research Area | Robotics
Language | English
WOS Accession Number | WOS:000554894900003
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Funding Organizations | National Key Research and Development Program of China; National Natural Science Foundation of China; Opening Project of Guangdong Provincial Key Lab of Robotics and Intelligent System; Youth Innovation Promotion Association CAS
Source URL | http://ir.ia.ac.cn/handle/173211/40268
Collection | Intelligent Robot Systems Research
Author Affiliations | 1. Univ Chinese Acad Sci, Sch Future Technol, Beijing 100049, Peoples R China; 2. Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China; 3. Chinese Acad Sci, Shenzhen Inst Adv Technol, Guangdong Prov Key Lab Robot & Intelligent Syst, Shenzhen 518055, Peoples R China; 4. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 5. Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Shanghai 200031, Peoples R China
Recommended Citation (GB/T 7714) | Cui, Shaowei, Wang, Rui, Wei, Junhang, et al. Self-Attention Based Visual-Tactile Fusion Learning for Predicting Grasp Outcomes[J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2020, 5(4): 5827-5834.
APA | Cui, Shaowei, Wang, Rui, Wei, Junhang, Hu, Jingyi, & Wang, Shuo. (2020). Self-Attention Based Visual-Tactile Fusion Learning for Predicting Grasp Outcomes. IEEE ROBOTICS AND AUTOMATION LETTERS, 5(4), 5827-5834.
MLA | Cui, Shaowei, et al. "Self-Attention Based Visual-Tactile Fusion Learning for Predicting Grasp Outcomes." IEEE ROBOTICS AND AUTOMATION LETTERS 5.4 (2020): 5827-5834.
Ingestion Method: OAI Harvesting
Source: Institute of Automation