Sequence-level Speaker Change Detection with Difference-based Continuous Integrate-and-fire
Document type: Journal article
| Authors | Fan ZY (范志赟)2,3; Dong LH (董林昊)1; Cai M (蔡猛)1; Ma ZJ (马泽君)1; Xu B (徐波)3 |
| Journal | Signal Processing Letters |
| Publication date | 2022 |
| Pages | 1551-1554 |
| DOI | 10.1109/LSP.2022.3185955 |
| Abstract | Speaker change detection is an important task in multi-party interactions such as meetings and conversations. In this paper, we address the speaker change detection task from the perspective of sequence transduction. Specifically, we propose a novel encoder-decoder framework that directly converts the input feature sequence to the speaker identity sequence. The difference-based continuous integrate-and-fire mechanism is designed to support this framework. It detects speaker changes by integrating the speaker difference between the encoder outputs frame-by-frame and transfers encoder outputs to segment-level speaker embeddings according to the detected speaker changes. The whole framework is supervised by the speaker identity sequence, a weaker label than the precise speaker change points. The experiments on the AMI and DIHARD-I corpora show that our sequence-level method consistently outperforms a strong frame-level baseline that uses the precise speaker change labels. |
| Source URL | [http://ir.ia.ac.cn/handle/173211/49731] |
| Collection | Research Center for Digital Content Technology and Services_Auditory Models and Cognitive Computing |
| Author affiliations | 1. ByteDance AI Lab; 2. School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; 3. Institute of Automation, Chinese Academy of Sciences, China |
| Recommended citation (GB/T 7714) | Fan ZY,Dong LH,Cai M,et al. Sequence-level Speaker Change Detection with Difference-based Continuous Integrate-and-fire[J]. Signal Processing Letters,2022:1551-1554. |
| APA | Fan ZY,Dong LH,Cai M,Ma ZJ,&Xu B.(2022).Sequence-level Speaker Change Detection with Difference-based Continuous Integrate-and-fire.Signal Processing Letters,1551-1554. |
| MLA | Fan ZY,et al."Sequence-level Speaker Change Detection with Difference-based Continuous Integrate-and-fire".Signal Processing Letters (2022):1551-1554. |
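
The integrate-and-fire idea summarized in the abstract can be illustrated with a toy sketch. This is not the authors' model: the paper's mechanism sits inside a trained encoder-decoder, whereas the sketch below assumes a fixed cosine distance as the "speaker difference", a running mean as the segment embedding, and a firing threshold of 1.0. The names `cosine_distance` and `difference_cif` are hypothetical and chosen only for illustration.

```python
import math


def cosine_distance(a, b):
    """Cosine distance in [0, 2]; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return 1.0 - dot / (norm_a * norm_b)


def difference_cif(frames, threshold=1.0):
    """Toy difference-based integrate-and-fire over a list of frame vectors.

    Assumption: the per-frame weight is the cosine distance between the frame
    and the running mean of the current segment, clipped to [0, 1]. Weights
    are accumulated frame by frame; when the total reaches `threshold`, a
    speaker change is "fired" at that frame and the segment mean is emitted.

    Returns (change_points, segment_embeddings).
    """
    if not frames:
        return [], []

    change_points = []
    segment_embeddings = []
    seg_sum = list(frames[0])  # running sum of frames in the current segment
    seg_len = 1
    accumulated = 0.0

    for t in range(1, len(frames)):
        frame = frames[t]
        seg_mean = [s / seg_len for s in seg_sum]
        weight = min(1.0, max(0.0, cosine_distance(frame, seg_mean)))
        accumulated += weight

        if accumulated >= threshold:
            # Fire: close the current segment and start a new one at frame t.
            change_points.append(t)
            segment_embeddings.append(seg_mean)
            seg_sum = list(frame)
            seg_len = 1
            accumulated = 0.0
        else:
            # Integrate: the frame joins the current segment.
            seg_sum = [s + x for s, x in zip(seg_sum, frame)]
            seg_len += 1

    # Emit the embedding of the final, still-open segment.
    segment_embeddings.append([s / seg_len for s in seg_sum])
    return change_points, segment_embeddings


if __name__ == "__main__":
    # Three frames of one "speaker" followed by three frames of another.
    frames = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
    points, embeddings = difference_cif(frames)
    print(points, embeddings)
```

In this toy version the change points fall out of hand-written rules. In the framework the abstract describes, the whole model is instead supervised only by the speaker identity sequence, without precise change-point labels.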
Ingest method: OAI harvesting
Source: Institute of Automation


