Chinese Academy of Sciences Institutional Repositories Grid
Self-Attention Based Visual-Tactile Fusion Learning for Predicting Grasp Outcomes

Document Type: Journal Article

Authors: Cui, Shaowei (1,2); Wang, Rui (2,3); Wei, Junhang (1,2); Hu, Jingyi (1,2); Wang, Shuo (2,4,5)
Journal: IEEE ROBOTICS AND AUTOMATION LETTERS
Publication Date: 2020-10-01
Volume: 5, Issue: 4, Pages: 5827-5834
Keywords: Grasping; perception for grasping and manipulation; multi-modal perception; force and tactile sensing
ISSN: 2377-3766
DOI: 10.1109/LRA.2020.3010720
Corresponding Author: Wang, Shuo (shuo.wang@ia.ac.cn)
Abstract: Predicting whether a particular grasp will succeed is critical to performing stable grasping and manipulation tasks. Robots need to combine vision and touch, as humans do, to make this prediction. The primary problem to be solved in this process is how to learn effective visual-tactile fusion features. In this letter, we propose a novel Visual-Tactile Fusion learning method based on the Self-Attention mechanism (VTFSA) to address this problem. We compare the proposed method with traditional methods on two public multimodal grasping datasets, and the experimental results show that the VTFSA model outperforms traditional methods by margins of more than 5% and 7%, respectively. Furthermore, visualization analysis indicates that the VTFSA model captures position-related visual-tactile fusion features that are beneficial to this task and is more robust than traditional methods.

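The abstract describes fusing visual and tactile features with a self-attention mechanism and classifying grasp success or failure. Below is a minimal, illustrative PyTorch sketch of that general idea; it is not the authors' actual VTFSA architecture. It assumes the two modalities are already encoded as token sequences, projects them to a shared dimension, fuses them with one multi-head self-attention layer, and predicts a binary grasp outcome. All layer sizes, token layouts, and names (e.g. SelfAttentionFusion) are assumptions made for illustration.

# Illustrative sketch only -- not the published VTFSA model.
import torch
import torch.nn as nn

class SelfAttentionFusion(nn.Module):
    def __init__(self, visual_dim=512, tactile_dim=64, embed_dim=128, num_heads=4):
        super().__init__()
        # Project both modalities into a shared embedding space.
        self.visual_proj = nn.Linear(visual_dim, embed_dim)
        self.tactile_proj = nn.Linear(tactile_dim, embed_dim)
        # Multi-head self-attention over the concatenated token sequence.
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        # Binary head: grasp success vs. failure.
        self.classifier = nn.Sequential(
            nn.Linear(embed_dim, 64), nn.ReLU(), nn.Linear(64, 2)
        )

    def forward(self, visual_tokens, tactile_tokens):
        # visual_tokens:  (batch, Nv, visual_dim), e.g. a flattened CNN feature map
        # tactile_tokens: (batch, Nt, tactile_dim), e.g. tactile time-series frames
        v = self.visual_proj(visual_tokens)
        t = self.tactile_proj(tactile_tokens)
        tokens = torch.cat([v, t], dim=1)             # (batch, Nv + Nt, embed_dim)
        fused, _ = self.attn(tokens, tokens, tokens)  # self-attention fusion
        pooled = fused.mean(dim=1)                    # average-pool fused tokens
        return self.classifier(pooled)                # (batch, 2) logits

if __name__ == "__main__":
    model = SelfAttentionFusion()
    vis = torch.randn(8, 49, 512)   # e.g. a 7x7 CNN feature map per image
    tac = torch.randn(8, 20, 64)    # e.g. 20 tactile frames
    print(model(vis, tac).shape)    # torch.Size([8, 2])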

Funding Projects: National Key Research and Development Program of China [2018AAA0103003]; National Natural Science Foundation of China [61773378]; Opening Project of Guangdong Provincial Key Lab of Robotics and Intelligent System; Youth Innovation Promotion Association CAS
WOS Research Area: Robotics
Language: English
WOS Record Number: WOS:000554894900003
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Funding Organizations: National Key Research and Development Program of China; National Natural Science Foundation of China; Opening Project of Guangdong Provincial Key Lab of Robotics and Intelligent System; Youth Innovation Promotion Association CAS
Source URL: [http://ir.ia.ac.cn/handle/173211/40268]
Collection: Intelligent Robotic Systems Research
Affiliations:
1.Univ Chinese Acad Sci, Sch Future Technol, Beijing 100049, Peoples R China
2.Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
3.Chinese Acad Sci, Shenzhen Inst Adv Technol, Guangdong Prov Key Lab Robot & Intelligent Syst, Shenzhen 518055, Peoples R China
4.Univ Chinese Acad Sci, Beijing 100049, Peoples R China
5.Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Shanghai 200031, Peoples R China
Recommended Citation
GB/T 7714
Cui, Shaowei, Wang, Rui, Wei, Junhang, et al. Self-Attention Based Visual-Tactile Fusion Learning for Predicting Grasp Outcomes[J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2020, 5(4): 5827-5834.
APA Cui, Shaowei, Wang, Rui, Wei, Junhang, Hu, Jingyi, & Wang, Shuo. (2020). Self-Attention Based Visual-Tactile Fusion Learning for Predicting Grasp Outcomes. IEEE ROBOTICS AND AUTOMATION LETTERS, 5(4), 5827-5834.
MLA Cui, Shaowei, et al. "Self-Attention Based Visual-Tactile Fusion Learning for Predicting Grasp Outcomes." IEEE ROBOTICS AND AUTOMATION LETTERS 5.4 (2020): 5827-5834.

Deposit Method: OAI Harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.