Chinese Academy of Sciences Institutional Repositories Grid

Audio Mixing Inversion via Embodied Self-supervised Learning

Document type: Journal article


Author	Haotian Zhou 2,3
Journal	Machine Intelligence Research
Publication date	2024
Volume	21, Issue: 1, Pages: 55-62
ISSN	2731-538X
Keywords	Audio mixing inversion, intelligent audio mixing, self-supervised learning, audio signal processing, deep learning
DOI	10.1007/s11633-023-1441-9
Abstract	Audio mixing is a crucial part of music production. To analyze or recreate a mix, it is important to estimate the mixing parameters that were used to create a mixdown from music recordings, a task known as audio mixing inversion. However, approaches to audio mixing inversion have rarely been explored. This work presents a method for estimating mixing parameters from raw tracks and a stereo mixdown via embodied self-supervised learning. Several commonly used audio effects are considered: gain, pan, equalization, reverb, and compression. The method learns an inference neural network that takes a stereo mixdown and the raw audio sources as input and estimates the mixing parameters used to create the mixdown, by iterating between a sampling step and a training step. During the sampling step, the inference network predicts a set of mixing parameters, which is sampled and fed to an audio-processing framework to generate audio data for the training step. During the training step, the same network is optimized on the data generated in the sampling step. The method explicitly models the mixing process in an interpretable way instead of using a black-box neural network model. A set of objective measures is used for evaluation. The experimental results show that this method performs better than current state-of-the-art methods.
Source URL	[http://ir.ia.ac.cn/handle/173211/54575]
Collection	Institute of Automation_Academic Journals_International Journal of Automation and Computing
Author affiliations	1. School of Intelligence Science and Technology, Peking University, Beijing 100871, China 2. Laboratory of Music Artificial Intelligence, Laboratory of Philosophy and Social Sciences, Ministry of Education, Beijing 100031, China 3. Department of AI Music and Music Information Technology, Central Conservatory of Music, Beijing 100031, China
Recommended citation (GB/T 7714)	Haotian Zhou. Audio Mixing Inversion via Embodied Self-supervised Learning[J]. Machine Intelligence Research, 2024, 21(1): 55-62.
APA	Haotian Zhou. (2024). Audio Mixing Inversion via Embodied Self-supervised Learning. Machine Intelligence Research, 21(1), 55-62.
MLA	Haotian Zhou. "Audio Mixing Inversion via Embodied Self-supervised Learning." Machine Intelligence Research 21.1 (2024): 55-62.
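
The sampling and training loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It rests on several assumptions that are not in the record: the audio-processing framework is reduced to a toy mixer with gain and constant-power pan only, the inference network is replaced by a linear model refit in closed form with least squares, sampling is done by adding Gaussian noise around the current prediction, and all names (mix, features, clip_params, invert_mix) are hypothetical.

```python
import numpy as np


def mix(tracks, params):
    """Toy stand-in for the audio-processing framework (gain and pan only).

    tracks: (n_tracks, n_samples); params: (n_tracks, 2) holding gain and pan,
    both in [0, 1]. Returns a stereo mixdown of shape (2, n_samples).
    """
    gain, pan = params[:, 0], params[:, 1]
    angle = pan * np.pi / 2.0              # constant-power pan law (assumption)
    left = (gain * np.cos(angle)) @ tracks
    right = (gain * np.sin(angle)) @ tracks
    return np.stack([left, right])


def features(tracks, mixdown):
    """Model input: each raw track correlated with each mixdown channel."""
    energy = np.sum(tracks ** 2, axis=1, keepdims=True)
    corr = tracks @ mixdown.T / energy     # shape (n_tracks, 2)
    return np.append(corr.ravel(), 1.0)    # trailing 1.0 acts as a bias input


def clip_params(params):
    """Keep gain in [0.05, 1] (pan is undefined at zero gain) and pan in [0, 1]."""
    out = params.copy()
    out[:, 0] = np.clip(out[:, 0], 0.05, 1.0)
    out[:, 1] = np.clip(out[:, 1], 0.0, 1.0)
    return out


def invert_mix(tracks, target_mixdown, n_iters=15, n_draws=64, sigma=0.05, seed=0):
    """Alternate a sampling step and a training step to estimate mixing parameters."""
    rng = np.random.default_rng(seed)
    n_tracks = tracks.shape[0]
    target_feat = features(tracks, target_mixdown)
    # Linear stand-in for the inference network; it initially predicts 0.5 everywhere.
    weights = np.zeros((target_feat.size, 2 * n_tracks))
    weights[-1, :] = 0.5
    for _ in range(n_iters):
        # Sampling step: predict parameters for the target, perturb them, and
        # render one mixdown per draw with the mixer to get labelled examples.
        predicted = clip_params((target_feat @ weights).reshape(n_tracks, 2))
        noise = rng.normal(0.0, sigma, size=(n_draws, n_tracks, 2))
        sampled = [clip_params(predicted + eps) for eps in noise]
        inputs = np.stack([features(tracks, mix(tracks, p)) for p in sampled])
        labels = np.stack([p.ravel() for p in sampled])
        # Training step: refit the model on the sampled (mixdown, parameters) pairs.
        weights, *_ = np.linalg.lstsq(inputs, labels, rcond=None)
    return clip_params((target_feat @ weights).reshape(n_tracks, 2))


# Usage: recover the gain and pan of two synthetic noise tracks.
rng = np.random.default_rng(1)
tracks = rng.standard_normal((2, 4096))
true_params = np.array([[0.8, 0.3], [0.5, 0.7]])   # rows: [gain, pan]
estimate = invert_mix(tracks, mix(tracks, true_params))
print(np.round(estimate, 2))
```

The training pairs come entirely from the model's own predictions rendered through the mixer, so no ground-truth parameters for the target mixdown are needed. A real system would presumably replace the linear model with a neural network trained by gradient descent and extend the mixer with equalization, reverb, and compression.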

Ingest method: OAI harvesting

Source: Institute of Automation
