Chinese Academy of Sciences Institutional Repositories Grid: End-to-End Paired Ambisonic-Binaural Audio Rendering

End-to-End Paired Ambisonic-Binaural Audio Rendering

Document type: Journal article


Authors	Yin Zhu; Qiuqiang Kong; Junjie Shi; Shilei Liu; Xuzhou Ye; Ju-Chiang Wang; Hongming Shan; Junping Zhang
Journal	IEEE/CAA Journal of Automatica Sinica
Publication date	2024
Volume	11, Issue 2, Pages 502-513
Keywords	Ambisonic; attention; binaural rendering; neural network
ISSN	2329-9266
DOI	10.1109/JAS.2023.123969
Abstract	Binaural rendering is of great interest to virtual reality and immersive media. Although humans naturally use their two ears to perceive the spatial information contained in sounds, binaural rendering is challenging for machines because describing a sound field often requires multiple channels and even metadata about the sound sources. In addition, the perceived sound varies from person to person even in the same sound field. Previous methods generally rely on individual-dependent head-related transfer function (HRTF) datasets and optimization algorithms that act on HRTFs. In practical applications, existing methods have two major drawbacks. The first is a high personalization cost, as traditional methods achieve personalization by measuring HRTFs. The second is insufficient accuracy, because traditional methods are optimized to retain the perceptually more important part of the information at the cost of discarding the rest. Therefore, it is desirable to develop novel techniques that achieve personalization and accuracy at a low cost. To this end, we focus on the binaural rendering of ambisonic signals and propose 1) a channel-shared encoder and channel-compared attention integrated into neural networks and 2) a loss function quantifying interaural level differences to deal with spatial information. To verify the proposed method, we collect and release the first paired ambisonic-binaural dataset and introduce three metrics to evaluate the content information and spatial information accuracy of end-to-end methods. Extensive experimental results on the collected dataset demonstrate the superior performance of the proposed method and the shortcomings of previous methods.
Source URL	http://ir.ia.ac.cn/handle/173211/54558
Collection	Institute of Automation_Academic Journals_IEEE/CAA Journal of Automatica Sinica
Recommended citation GB/T 7714	Yin Zhu, Qiuqiang Kong, Junjie Shi, et al. End-to-End Paired Ambisonic-Binaural Audio Rendering[J]. IEEE/CAA Journal of Automatica Sinica, 2024, 11(2): 502-513.
APA	Yin Zhu, Qiuqiang Kong, Junjie Shi, Shilei Liu, Xuzhou Ye, ... & Junping Zhang. (2024). End-to-End Paired Ambisonic-Binaural Audio Rendering. IEEE/CAA Journal of Automatica Sinica, 11(2), 502-513.
MLA	Yin Zhu, et al. "End-to-End Paired Ambisonic-Binaural Audio Rendering." IEEE/CAA Journal of Automatica Sinica 11.2 (2024): 502-513.
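
The abstract mentions a loss function that quantifies interaural level differences (ILD) but does not give its formula. As a rough illustration only, the sketch below assumes a broadband ILD defined as the left-ear RMS level minus the right-ear RMS level in decibels, and an absolute-error loss between a rendered and a reference binaural pair. The function names (`rms_level_db`, `ild_db`, `ild_loss`) and the toy signals are hypothetical and are not taken from the paper, which may well use a different formulation.

```python
import math


def rms_level_db(samples, eps=1e-12):
    """Return the RMS level of one channel in decibels (eps avoids log of zero)."""
    mean_square = sum(x * x for x in samples) / len(samples)
    return 10.0 * math.log10(mean_square + eps)


def ild_db(left, right, eps=1e-12):
    """Assumed ILD definition: left-ear level minus right-ear level, in dB."""
    return rms_level_db(left, eps) - rms_level_db(right, eps)


def ild_loss(pred_left, pred_right, ref_left, ref_right):
    """Absolute ILD error between a rendered pair and a reference pair."""
    return abs(ild_db(pred_left, pred_right) - ild_db(ref_left, ref_right))


# Toy data (illustrative only): one second of a 440 Hz tone at 16 kHz.
sr = 16000
tone = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]

# Reference: right ear at half the left-ear amplitude (ILD of about 6.02 dB).
ref_left, ref_right = tone, [0.5 * x for x in tone]
# Prediction: right ear at a quarter amplitude (ILD of about 12.04 dB).
pred_left, pred_right = tone, [0.25 * x for x in tone]

reference_ild = ild_db(ref_left, ref_right)
loss = ild_loss(pred_left, pred_right, ref_left, ref_right)
print(f"reference ILD: {reference_ild:.2f} dB, ILD loss: {loss:.2f} dB")
```

In a trainable model this quantity would normally be computed on tensors in an automatic-differentiation framework, and possibly per frequency band; the plain-Python version here is only meant to make the definition concrete.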

Ingest method: OAI harvesting

Source: Institute of Automation

