Self-Attention Based Visual-Tactile Fusion Learning for Predicting Grasp Outcomes
Document Type: Journal Article
Authors | Cui, Shaowei (1,2); Wang, Rui; Wei, Junhang; Hu, Jingyi; Wang, Shuo
Journal | IEEE ROBOTICS AND AUTOMATION LETTERS
Publication Date | 2020-10-01
Volume/Issue/Pages | 5(4): 5827-5834
Keywords | Grasping; perception for grasping and manipulation; multi-modal perception; force and tactile sensing
ISSN | 2377-3766
DOI | 10.1109/LRA.2020.3010720
Corresponding Author | Wang, Shuo (shuo.wang@ia.ac.cn)
English Abstract | Predicting whether a particular grasp will succeed is critical to performing stable grasping and manipulating tasks. Robots need to combine vision and touch as humans do to accomplish this prediction. The primary problem to be solved in this process is how to learn effective visual-tactile fusion features. In this letter, we propose a novel Visual-Tactile Fusion learning method based on the Self-Attention mechanism (VTFSA) to address this problem. We compare the proposed method with the traditional methods on two public multimodal grasping datasets, and the experimental results show that the VTFSA model outperforms traditional methods by a margin of 5+% and 7+%. Furthermore, visualization analysis indicates that the VTFSA model can further capture some position-related visual-tactile fusion features that are beneficial to this task and is more robust than traditional methods.
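To make the fusion idea in the abstract concrete, the following is a minimal, illustrative PyTorch sketch of self-attention-based fusion of visual and tactile feature vectors for binary grasp-outcome prediction. It is not the paper's VTFSA architecture; the class name, feature dimensions, pooling choice, and classifier head are assumptions made for illustration only.

```python
# Illustrative sketch only: self-attention fusion of visual and tactile
# feature vectors for grasp-outcome prediction (not the authors' VTFSA model).
import torch
import torch.nn as nn

class SelfAttentionFusion(nn.Module):  # hypothetical name
    def __init__(self, visual_dim=512, tactile_dim=128, embed_dim=256, num_heads=4):
        super().__init__()
        # Project each modality into a shared embedding space.
        self.visual_proj = nn.Linear(visual_dim, embed_dim)
        self.tactile_proj = nn.Linear(tactile_dim, embed_dim)
        # Self-attention over the two modality tokens.
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Single logit: grasp success vs. failure.
        self.classifier = nn.Linear(embed_dim, 1)

    def forward(self, visual_feat, tactile_feat):
        # visual_feat: (batch, visual_dim); tactile_feat: (batch, tactile_dim)
        tokens = torch.stack(
            [self.visual_proj(visual_feat), self.tactile_proj(tactile_feat)], dim=1
        )  # (batch, 2, embed_dim)
        fused, _ = self.attn(tokens, tokens, tokens)  # self-attention fusion
        pooled = fused.mean(dim=1)                    # average over modality tokens
        return self.classifier(pooled)                # (batch, 1) logit

# Usage with random placeholder features standing in for learned encoders.
model = SelfAttentionFusion()
logit = model(torch.randn(8, 512), torch.randn(8, 128))
prob_success = torch.sigmoid(logit)  # predicted probability of a successful grasp
```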
Funding Projects | National Key Research and Development Program of China [2018AAA0103003]; National Natural Science Foundation of China [61773378]; Opening Project of Guangdong Provincial Key Lab of Robotics and Intelligent System; Youth Innovation Promotion Association CAS
WOS Research Area | Robotics
Language | English
WOS Accession Number | WOS:000554894900003
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Funding Organizations | National Key Research and Development Program of China; National Natural Science Foundation of China; Opening Project of Guangdong Provincial Key Lab of Robotics and Intelligent System; Youth Innovation Promotion Association CAS
Source URL | http://ir.ia.ac.cn/handle/173211/40268
Collection | Intelligent Robot Systems Research
Author Affiliations | 1. Univ Chinese Acad Sci, Sch Future Technol, Beijing 100049, Peoples R China; 2. Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China; 3. Chinese Acad Sci, Shenzhen Inst Adv Technol, Guangdong Prov Key Lab Robot & Intelligent Syst, Shenzhen 518055, Peoples R China; 4. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 5. Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Shanghai 200031, Peoples R China
Recommended Citation (GB/T 7714) | Cui, Shaowei, Wang, Rui, Wei, Junhang, et al. Self-Attention Based Visual-Tactile Fusion Learning for Predicting Grasp Outcomes[J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2020, 5(4): 5827-5834.
APA | Cui, Shaowei, Wang, Rui, Wei, Junhang, Hu, Jingyi, & Wang, Shuo. (2020). Self-Attention Based Visual-Tactile Fusion Learning for Predicting Grasp Outcomes. IEEE ROBOTICS AND AUTOMATION LETTERS, 5(4), 5827-5834.
MLA | Cui, Shaowei, et al. "Self-Attention Based Visual-Tactile Fusion Learning for Predicting Grasp Outcomes." IEEE ROBOTICS AND AUTOMATION LETTERS 5.4 (2020): 5827-5834.
Ingestion Method: OAI Harvesting
Source: Institute of Automation