Chinese Academy of Sciences Institutional Repositories Grid
Memory-Adaptive Vision-and-Language Navigation

Document type: Journal article

Authors: Keji He (4,5); Ya Jing (3); Yan Huang (4,5); Zhihe Lu (1); Dong An (2,4); Liang Wang (4,5)
Journal: Pattern Recognition
Publication date: 2024
Volume: 153; Pages: 110511
Keywords: Vision-and-Language Navigation; Memory bank; History noises; Memory-Adaptive Model
Abstract

Vision-and-Language Navigation (VLN) requires an agent to navigate 3D environments by following given instructions, where history is critical for decision-making during the dynamic navigation process. In particular, existing methods widely use a memory bank that stores histories and combine it with multimodal representations of the current scene for better decision-making. However, because they weight each history with a single scalar, these methods cannot isolate the informative cues that co-exist with detrimental content in each history, and thus inevitably introduce noise into decision-making. To address this, we propose a novel Memory-Adaptive Model (MAM) that dynamically restrains the detrimental content in histories and retains only the content that benefits navigation. Specifically, two key modules, the Visual and Textual Adaptive Modules, are designed to restrain history noise based on scene-related vision and text, respectively. A Reliability Estimator Module is further introduced to refine these adaptation operations. Experiments on the widely used RxR and R2R datasets show that MAM outperforms its baseline method by 4.0%/2.5% and 2%/1% on the validation unseen/test splits, respectively, with respect to the success rate (SR) metric.
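
The contrast the abstract draws, between weighting each history with one scalar and restraining content inside each history, can be illustrated with a small sketch. This is not the MAM implementation: the record gives no architectural details, so the gate below (a sigmoid over the channel-wise product of history and scene features), the `gate_scale` parameter, and all function names are assumptions chosen only to show the idea in plain Python.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_weights(memory, query):
    """One scalar weight per history entry (dot-product attention)."""
    return softmax([sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in memory])

def scalar_weighted_read(memory, query):
    """Baseline-style read: every channel of a history shares one weight,
    so useful and harmful channels are scaled together."""
    weights = attention_weights(memory, query)
    return [sum(w * h[d] for w, h in zip(weights, memory))
            for d in range(len(query))]

def gated_read(memory, query, gate_scale=4.0):
    """Hypothetical adaptive read: a per-channel gate in (0, 1), conditioned
    on the current scene feature, damps channels that disagree with the scene
    before the scalar-weighted sum. The gate form is an assumption."""
    weights = attention_weights(memory, query)
    gated = [[sigmoid(gate_scale * h_d * q_d) * h_d for h_d, q_d in zip(h, query)]
             for h in memory]
    return [sum(w * g[d] for w, g in zip(weights, gated))
            for d in range(len(query))]

# Toy data: two histories with two feature channels each; channel 1
# conflicts with the current scene feature (negative query component).
memory = [[1.0, 2.0], [3.0, 4.0]]
query = [1.0, -1.0]
baseline = scalar_weighted_read(memory, query)
adaptive = gated_read(memory, query)
```

In this toy setting the scalar-weighted read passes the conflicting channel through at full strength, while the gated read suppresses it and leaves the agreeing channel almost unchanged. A learned model would replace the fixed gate with trained visual and textual modules.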

Source URL: http://ir.ia.ac.cn/handle/173211/57628
Collection: Institute of Automation_Center for Research on Intelligent Perception and Computing
Author affiliations:
1. National University of Singapore
2. School of Future Technology, University of Chinese Academy of Sciences
3. ByteDance AI Lab
4. Center for Research on Intelligent Perception and Computing, State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences
5. School of Artificial Intelligence, University of Chinese Academy of Sciences
Recommended citation
GB/T 7714: Keji He, Ya Jing, Yan Huang, et al. Memory-Adaptive Vision-and-Language Navigation[J]. Pattern Recognition, 2024, 153: 110511.
APA: Keji He, Ya Jing, Yan Huang, Zhihe Lu, Dong An, & Liang Wang. (2024). Memory-Adaptive Vision-and-Language Navigation. Pattern Recognition, 153, 110511.
MLA: Keji He, et al. "Memory-Adaptive Vision-and-Language Navigation". Pattern Recognition 153 (2024): 110511.

Ingest method: OAI harvesting

Source: Institute of Automation

