Chinese Academy of Sciences Institutional Repositories Grid
Context-Aware Proposal-Boundary Network With Structural Consistency for Audiovisual Event Localization

Document type: Journal article

Authors: Wang, Hao (1); Zha, Zheng-Jun (1); Li, Liang (2); Chen, Xuejin (1); Luo, Jiebo (3)
Journal: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
Publication date: 2023-07-19
Pages: 11
ISSN: 2162-237X
Keywords: Audiovisual learning; context learning; event localization
DOI: 10.1109/TNNLS.2023.3290083
Abstract: Audiovisual event localization aims to localize the event that is both visible and audible in a video. Previous works focus on segment-level audio and visual feature sequence encoding and neglect the event proposals and boundaries, which are crucial for this task. The event proposal features provide event internal consistency between several consecutive segments constructing one proposal, while the event boundary features offer event boundary consistency to make segments located at boundaries be aware of the event occurrence. In this article, we explore the proposal-level feature encoding and propose a novel context-aware proposal-boundary (CAPB) network to address audiovisual event localization. In particular, we design a local-global context encoder (LGCE) to aggregate local-global temporal context information for visual sequence, audio sequence, event proposals, and event boundaries, respectively. The local context from temporally adjacent segments or proposals contributes to event discrimination, while the global context from the entire video provides semantic guidance of temporal relationship. Furthermore, we enhance the structural consistency between segments by exploiting the above-encoded proposal and boundary representations. CAPB leverages the context information and structural consistency to obtain context-aware event-consistent cross-modal representation for accurate event localization. Extensive experiments conducted on the audiovisual event (AVE) dataset show that our approach outperforms the state-of-the-art methods by clear margins in both supervised event localization and cross-modality localization.
Funding: National Key Research and Development Program of China [2020AAA0105702]; National Natural Science Foundation of China (NSFC) [62225207]; National Natural Science Foundation of China (NSFC) [U19B2038]; National Natural Science Foundation of China (NSFC) [62121002]; Youth Innovation Promotion Association of CAS [2020108]
WOS research areas: Computer Science; Engineering
Language: English
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
WOS accession number: WOS:001035833700001
Source URL: [http://119.78.100.204/handle/2XEOYT63/21291]
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding author: Zha, Zheng-Jun
Author affiliations:
1. Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei 230026, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, Beijing 100089, Peoples R China
3. Univ Rochester, Dept Comp Sci, Rochester, NY 14627 USA
Recommended citation:
GB/T 7714: Wang, Hao, Zha, Zheng-Jun, Li, Liang, et al. Context-Aware Proposal-Boundary Network With Structural Consistency for Audiovisual Event Localization[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023: 11.
APA: Wang, Hao, Zha, Zheng-Jun, Li, Liang, Chen, Xuejin, & Luo, Jiebo. (2023). Context-Aware Proposal-Boundary Network With Structural Consistency for Audiovisual Event Localization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 11.
MLA: Wang, Hao, et al. "Context-Aware Proposal-Boundary Network With Structural Consistency for Audiovisual Event Localization". IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023): 11.

Ingest method: OAI harvesting

Source: Institute of Computing Technology

