Chinese Academy of Sciences Institutional Repositories Grid
Interpretability of Neural Networks Based on Game-theoretic Interactions

Document type: Journal article

Authors: Huilin Zhou (1); Jie Ren (1); Huiqi Deng (1); Xu Cheng (1); Jinpeng Zhang (2); Quanshi Zhang (1)
Journal: Machine Intelligence Research
Publication date: 2024
Volume: 21, Issue: 4, Pages: 718-739
Keywords: Model interpretability and transparency; explainable AI; game theory; interaction; deep learning
ISSN: 2731-538X
DOI: 10.1007/s11633-023-1419-7
Abstract: This paper introduces the system of game-theoretic interactions, which connects both the explanation of knowledge encoded in a deep neural network (DNN) and the explanation of the representation power of a DNN. In this system, we define two game-theoretic interaction indexes, namely the multi-order interaction and the multivariate interaction. More crucially, we use these interaction indexes to explain feature representations encoded in a DNN from the following four aspects: 1) Quantifying knowledge concepts encoded by a DNN; 2) Exploring how a DNN encodes visual concepts, and extracting prototypical concepts encoded in the DNN; 3) Learning optimal baseline values for the Shapley value, and providing a unified perspective to compare fourteen different attribution methods; 4) Theoretically explaining the representation bottleneck of DNNs. Furthermore, we prove the relationship between the interaction encoded in a DNN and the representation power of a DNN (e.g., generalization power, adversarial transferability, and adversarial robustness). In this way, game-theoretic interactions successfully bridge the gap between “the explanation of knowledge concepts encoded in a DNN” and “the explanation of the representation capacity of a DNN” as a unified explanation.
Source URL: http://ir.ia.ac.cn/handle/173211/58569
Collection: Institute of Automation_Academic Journals_International Journal of Automation and Computing
Author affiliations:
1. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
2. XLAB, The Second Academy of China Aerospace Science and Industry Corporation, Beijing 100854, China
Recommended citation
GB/T 7714: Huilin Zhou, Jie Ren, Huiqi Deng, et al. Interpretability of Neural Networks Based on Game-theoretic Interactions[J]. Machine Intelligence Research, 2024, 21(4): 718-739.
APA: Huilin Zhou, Jie Ren, Huiqi Deng, Xu Cheng, Jinpeng Zhang, & Quanshi Zhang. (2024). Interpretability of Neural Networks Based on Game-theoretic Interactions. Machine Intelligence Research, 21(4), 718-739.
MLA: Huilin Zhou, et al. "Interpretability of Neural Networks Based on Game-theoretic Interactions." Machine Intelligence Research 21.4 (2024): 718-739.

Ingest method: OAI harvesting

Source: Institute of Automation

Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.