Self-supervised spatial-temporal feature enhancement for one-shot video object detection
Document type: Journal article
Authors | Yao, Xudong1; Yang, Xiaoshan2,3 |
Journal | NEUROCOMPUTING |
Publication date | 2024-10-01 |
Volume | 601; Pages: 11 |
Keywords | One-shot; Video object detection; Video understanding |
ISSN | 0925-2312 |
DOI | 10.1016/j.neucom.2024.128219 |
Corresponding author | Yang, Xiaoshan (xiaoshan.yang@nlpr.ia.ac.cn) |
Abstract | One-shot video object detection aims to locate and identify objects in video sequences given only a single video sample for each class. Exploration in this field is still in its infancy, and previous few-shot video object detection methods have limitations on this task. In this paper, we propose the Self-supervised Feature Enhancement (SFE) framework to address the one-shot video object detection task. SFE includes two important modules: Hybrid Spatial Self-supervised Feature Enhancement (HSSFE) and Dynamic Temporal Self-supervised Feature Enhancement (DTSFE). HSSFE enhances features from a spatial perspective with spatial self-supervised auxiliary tasks at the frame and instance levels. DTSFE, on the other hand, enhances features from a temporal perspective with a memory-based self-supervised constraint for the same object across different frames. We have conducted experiments on multiple benchmarks, and the results demonstrate that our method achieves state-of-the-art performance. |
WOS research area | Computer Science |
Language | English |
WOS accession number | WOS:001280068200001 |
Publisher | ELSEVIER |
Source URL | [http://ir.ia.ac.cn/handle/173211/59385] |
Collection | Institute of Automation_National Laboratory of Pattern Recognition_Multimedia Computing and Graphics Team |
Corresponding author | Yang, Xiaoshan |
Author affiliations | 1. Tianjin Univ Technol, 391 Binshui Xi Rd, Tianjin 300384, Peoples R China; 2. Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence S, 95 Zhongguancun East Rd, Beijing 100190, Peoples R China; 3. Univ Chinese Acad Sci, Sch Artificial Intelligence, 80 Zhongguancun East Rd, Beijing 100190, Peoples R China |
Recommended citation GB/T 7714 | Yao, Xudong,Yang, Xiaoshan. Self-supervised spatial-temporal feature enhancement for one-shot video object detection[J]. NEUROCOMPUTING,2024,601:11. |
APA | Yao, Xudong,&Yang, Xiaoshan.(2024).Self-supervised spatial-temporal feature enhancement for one-shot video object detection.NEUROCOMPUTING,601,11. |
MLA | Yao, Xudong,et al."Self-supervised spatial-temporal feature enhancement for one-shot video object detection".NEUROCOMPUTING 601(2024):11. |
Ingest method: OAI harvesting
Source: Institute of Automation