Chinese Academy of Sciences Institutional Repositories Grid
Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition

Document type: Journal article

Authors: Liu, Yuanyuan (2); Wei, Lin (2); Liu, Kejun (2); Chen, Zijing (5); Chen, Zhe (6); Tang, Chang (2); Chen, Jingying (1); Shan, Shiguang (3,4)
Journal: IEEE TRANSACTIONS ON AFFECTIVE COMPUTING
Publication date: 2025-10-01
Volume: 16; Issue: 4; Pages: 3404-3420
Keywords: Videos; Face recognition; Visualization; Emotion recognition; Transformers; Training; Accuracy; Data mining; Gaze tracking; Computational modeling; Video-based facial expression recognition; eye movement signals; pre-training; fine-tuning; instructed learning
ISSN: 1949-3045
DOI: 10.1109/TAFFC.2025.3599859
Abstract: Video-based facial expression recognition (VFER) is challenging due to variations caused by cultural background and expression camouflage. To tackle these problems, researchers introduced eye movement signals to complement visual information. However, existing methods either require expensive devices to capture high-quality eye movements or can only extract low-quality eye movements visually, making them ineffective in the real world. To address this, we propose an eye movement-instructed VFER (EM-VFER) that leverages high-quality eye movements to instruct the visual learning, obtaining robust performance without requiring costly devices during inference. Specifically, our EM-VFER operates in two stages: the high-quality eye movement pre-training stage and the eye movement-instructed video fine-tuning stage. In the pre-training, we compile an Eye-behavior-aided Multimodal Emotion Recognition (EMER) dataset and use it to train a multimodal Transformer. During the fine-tuning, we propose a novel progressive eye movement-instructed learning to take better advantage of the prior knowledge about high-quality eye movement signals from EMER. The instructed fine-tuning model could then make more robust predictions on downstream facial expression datasets. We evaluate our approach on three macro-expression datasets (DFEW, MAFW and Aff-Wild2) and two micro-expression datasets (CASME III and CASME II). The results demonstrate that EM-VFER significantly outperforms existing methods.
Funding: National Natural Science Foundation of China [62076227]; Natural Science Foundation of Hubei Province [2023AFB572]; Hubei Key Laboratory of Intelligent Geo-Information Processing [KLIGIP-2022-B10]
WOS research area: Computer Science
Language: English
WOS accession number: WOS:001626710800006
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Source URL: [http://119.78.100.204/handle/2XEOYT63/42815]
Collection: Institute of Computing Technology, Chinese Academy of Sciences
Corresponding authors: Chen, Zhe; Chen, Jingying; Shan, Shiguang
Author affiliations:
1. Cent China Normal Univ, Natl Engn Res Ctr E Learning, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China
2. China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China
3. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China
4. Univ Chinese Acad Sci, Beijing 100049, Peoples R China
5. La Trobe Univ, Cisco La Trobe Ctr Artificial Intelligence & Inter, Sch Comp Engn & Math Sci, Flora Hill, Vic 3550, Australia
6. La Trobe Univ, Cisco La Trobe Ctr Artificial Intelligence & Inter, Australian Ctr Artificial Intelligence Med Innovat, Sch Comp Engn & Math Sci, Flora Hill, Vic 3550, Australia
Recommended citation:
GB/T 7714: Liu, Yuanyuan, Wei, Lin, Liu, Kejun, et al. Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition[J]. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2025, 16(4): 3404-3420.
APA: Liu, Yuanyuan, Wei, Lin, Liu, Kejun, Chen, Zijing, Chen, Zhe, ... & Shan, Shiguang. (2025). Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 16(4), 3404-3420.
MLA: Liu, Yuanyuan, et al. "Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition." IEEE TRANSACTIONS ON AFFECTIVE COMPUTING 16.4 (2025): 3404-3420.

Ingest method: OAI harvesting

Source: Institute of Computing Technology

