Chinese Academy of Sciences Institutional Repositories Grid

Interpretability of Neural Networks Based on Game-theoretic Interactions

Document type: Journal article


Authors	Huilin Zhou 1; Jie Ren 1; Huiqi Deng 1; Xu Cheng 1; Jinpeng Zhang 2; Quanshi Zhang 1
Journal	Machine Intelligence Research
Publication date	2024
Volume	21; Issue: 4; Pages: 718-739
Keywords	Model interpretability and transparency; explainable AI; game theory; interaction; deep learning
ISSN	2731-538X
DOI	10.1007/s11633-023-1419-7
Abstract	This paper introduces the system of game-theoretic interactions, which connects both the explanation of knowledge encoded in a deep neural network (DNN) and the explanation of the representation power of a DNN. In this system, we define two game-theoretic interaction indexes, namely the multi-order interaction and the multivariate interaction. More crucially, we use these interaction indexes to explain feature representations encoded in a DNN from the following four aspects: 1) Quantifying knowledge concepts encoded by a DNN; 2) Exploring how a DNN encodes visual concepts, and extracting prototypical concepts encoded in the DNN; 3) Learning optimal baseline values for the Shapley value, and providing a unified perspective to compare fourteen different attribution methods; 4) Theoretically explaining the representation bottleneck of DNNs. Furthermore, we prove the relationship between the interaction encoded in a DNN and the representation power of a DNN (e.g., generalization power, adversarial transferability, and adversarial robustness). In this way, game-theoretic interactions successfully bridge the gap between “the explanation of knowledge concepts encoded in a DNN” and “the explanation of the representation capacity of a DNN” as a unified explanation.
Source URL	[http://ir.ia.ac.cn/handle/173211/58569]
Collection	Institute of Automation_Academic Journals_International Journal of Automation and Computing
Author affiliations	1. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; 2. XLAB, The Second Academy of China Aerospace Science and Industry Corporation, Beijing 100854, China
Recommended citation (GB/T 7714)	Huilin Zhou, Jie Ren, Huiqi Deng, et al. Interpretability of Neural Networks Based on Game-theoretic Interactions[J]. Machine Intelligence Research, 2024, 21(4): 718-739.
APA	Huilin Zhou, Jie Ren, Huiqi Deng, Xu Cheng, Jinpeng Zhang, & Quanshi Zhang. (2024). Interpretability of Neural Networks Based on Game-theoretic Interactions. Machine Intelligence Research, 21(4), 718-739.
MLA	Huilin Zhou, et al. "Interpretability of Neural Networks Based on Game-theoretic Interactions." Machine Intelligence Research 21.4 (2024): 718-739.

Ingestion method: OAI harvesting

Source: Institute of Automation

