Chinese Academy of Sciences Institutional Repositories Grid (CAS IR Grid): Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition

Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition

Document type: Journal article


Authors	Liu, Yuanyuan 2; Wei, Lin 2; Liu, Kejun 2; Chen, Zijing 5; Chen, Zhe 6; Tang, Chang 2; Chen, Jingying 1; Shan, Shiguang 3,4
Journal	IEEE TRANSACTIONS ON AFFECTIVE COMPUTING
Publication date	2025-10-01
Volume	16 Issue: 4 Pages: 3404-3420
Keywords	Videos; Face recognition; Visualization; Emotion recognition; Transformers; Training; Accuracy; Data mining; Gaze tracking; Computational modeling; Video-based facial expression recognition; eye movement signals; pre-training; fine-tuning; instructed learning
ISSN	1949-3045
DOI	10.1109/TAFFC.2025.3599859
Abstract	Video-based facial expression recognition (VFER) is challenging due to variations caused by cultural background and expression camouflage. To tackle these problems, researchers introduced eye movement signals to complement visual information. However, existing methods either require expensive devices to capture high-quality eye movements or can only extract low-quality eye movements visually, making them ineffective in the real world. To address this, we propose an eye movement-instructed VFER (EM-VFER) that leverages high-quality eye movements to instruct the visual learning, obtaining robust performance without requiring costly devices during inference. Specifically, our EM-VFER operates in two stages: the high-quality eye movement pre-training stage and the eye movement-instructed video fine-tuning stage. In the pre-training, we compile an Eye-behavior-aided Multimodal Emotion Recognition (EMER) dataset and use it to train a multimodal Transformer. During the fine-tuning, we propose a novel progressive eye movement-instructed learning to take better advantage of the prior knowledge about high-quality eye movement signals from EMER. The instructed fine-tuning model could then make more robust predictions on downstream facial expression datasets. We evaluate our approach on three macro-expression datasets (DFEW, MAFW and Aff-Wild2) and two micro-expression datasets (CASME III and CASME II). The results demonstrate that EM-VFER significantly outperforms existing methods.
Funding	National Natural Science Foundation of China[62076227] ; Natural Science Foundation of Hubei Province[2023AFB572] ; Hubei Key Laboratory of Intelligent Geo-Information Processing[KLIGIP-2022-B10]
WOS research area	Computer Science
Language	English
WOS accession number	WOS:001626710800006
Publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Source URL	[http://119.78.100.204/handle/2XEOYT63/42815]
Collection	Institute of Computing Technology, Chinese Academy of Sciences
Corresponding authors	Chen, Zhe; Chen, Jingying; Shan, Shiguang
Author affiliations	1. Cent China Normal Univ, Natl Engn Res Ctr E Learning, Natl Engn Lab Educ Big Data, Wuhan 430079, Peoples R China; 2. China Univ Geosci, Sch Comp Sci, Wuhan 430074, Peoples R China; 3. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China; 4. Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 5. La Trobe Univ, Cisco La Trobe Ctr Artificial Intelligence & Inter, Sch Comp Engn & Math Sci, Flora Hill, Vic 3550, Australia; 6. La Trobe Univ, Cisco La Trobe Ctr Artificial Intelligence & Inter, Australian Ctr Artificial Intelligence Med Innovat, Sch Comp Engn & Math Sci, Flora Hill, Vic 3550, Australia
Recommended citation (GB/T 7714)	Liu, Yuanyuan,Wei, Lin,Liu, Kejun,et al. Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition[J]. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING,2025,16(4):3404-3420.
APA	Liu, Yuanyuan.,Wei, Lin.,Liu, Kejun.,Chen, Zijing.,Chen, Zhe.,...&Shan, Shiguang.(2025).Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition.IEEE TRANSACTIONS ON AFFECTIVE COMPUTING,16(4),3404-3420.
MLA	Liu, Yuanyuan,et al."Leveraging Eye Movement for Instructing Robust Video-Based Facial Expression Recognition".IEEE TRANSACTIONS ON AFFECTIVE COMPUTING 16.4(2025):3404-3420.

Ingestion method: OAI harvesting

Source: Institute of Computing Technology
