Chinese Academy of Sciences Institutional Repositories Grid: DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling

DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling

Document type: Journal article

Authors	Zhang, Xinbang1,2 ; Chang, Jianlong3,4 ; Guo, Yiwen5 ; Meng, Gaofeng1,2 ; Xiang, Shiming1,2 ; Lin, Zhouchen6 ; Pan, Chunhong1
Journal	IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Publication date	2021-09-01
Volume	43 Issue: 9 Pages: 2905-2920
Keywords	Computer architecture ; Search problems ; Optimization ; Task analysis ; Bridges ; Binary codes ; Estimation ; Neural architecture search (NAS) ; ensemble gumbel-softmax ; distribution guided sampling
ISSN	0162-8828
DOI	10.1109/TPAMI.2020.3020315
Corresponding author	Chang, Jianlong(jianlong.chang@huawei.com)
Abstract	Neural architecture search (NAS) is inherently subject to the gap of architectures during searching and validating. To bridge this gap effectively, we develop Differentiable ArchiTecture Approximation (DATA) with Ensemble Gumbel-Softmax (EGS) estimator and Architecture Distribution Constraint (ADC) to automatically approximate architectures during searching and validating in a differentiable manner. Technically, the EGS estimator consists of a group of Gumbel-Softmax estimators, which is capable of converting probability vectors to binary codes and passing gradients reversely, reducing the estimation bias in a differentiable way. To narrow the distribution gap between sampled architectures and supernet, further, the ADC is introduced to reduce the variance of sampling during searching. Benefiting from such modeling, architecture probabilities and network weights in the NAS model can be jointly optimized with the standard back-propagation, yielding an end-to-end learning mechanism for searching deep neural architectures in an extended search space. Conclusively, in the validating process, a high-performance architecture that approaches to the learned one during searching is readily built. Extensive experiments on various tasks including image classification, few-shot learning, unsupervised clustering, semantic segmentation and language modeling strongly demonstrate that DATA is capable of discovering high-performance architectures while guaranteeing the required efficiency. Code is available at https://github.com/XinbangZhang/DATA-NAS.
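As a rough illustration of the abstract's statement that a group of Gumbel-Softmax estimators converts a probability vector into a binary code, the following is a minimal forward-pass sketch in plain Python. The function names, the temperature parameter `tau`, and the choice to combine the individual one-hot samples by element-wise OR are assumptions made for this sketch; this record does not state the paper's exact aggregation rule, and the gradient (backward) path is not shown. The authors' actual implementation is the repository linked in the abstract.

```python
import math
import random


def gumbel_softmax_sample(probs, tau, rng):
    """Draw one relaxed (soft) one-hot sample from a probability vector.

    Hypothetical helper for this sketch, not the authors' code.
    """
    logits = []
    for p in probs:
        # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1);
        # U is clamped away from 0 and 1 to keep the logs finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        logits.append((math.log(p + 1e-12) + gumbel) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_gumbel_softmax(probs, num_samplers, tau=1.0, seed=None):
    """Combine several Gumbel-Softmax samples into one binary code.

    Assumption of this sketch: each sampler contributes the index of its
    largest entry, and the results are merged by element-wise OR, so the
    code can switch on between 1 and num_samplers positions.
    """
    rng = random.Random(seed)
    code = [0] * len(probs)
    for _ in range(num_samplers):
        soft = gumbel_softmax_sample(probs, tau, rng)
        winner = max(range(len(soft)), key=soft.__getitem__)
        code[winner] = 1
    return code
```

For example, `ensemble_gumbel_softmax([0.7, 0.2, 0.1], num_samplers=3, seed=0)` returns a list of three 0/1 entries with at least one entry set. With `num_samplers=1` the sketch reduces to a single one-hot draw, which is the case a single Gumbel-Softmax estimator covers.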
WOS keywords	NETWORKS
Funding projects	Major Project for New Generation of AI[2018AAA0100400] ; National Natural Science Foundation of China[91646207] ; National Natural Science Foundation of China[61976208] ; National Key R&D Program of China[2019AAA0105200] ; NSF China[61625301] ; NSF China[61731018] ; Major Scientific Research Project of Zhejiang Lab[2019KB0AC01] ; Major Scientific Research Project of Zhejiang Lab[2019KB0AB02] ; Beijing Academy of Artificial Intelligence ; Qualcomm
WOS research areas	Computer Science ; Engineering
Language	English
WOS accession number	WOS:000681124300007
Publisher	IEEE COMPUTER SOC
Funding agencies	Major Project for New Generation of AI ; National Natural Science Foundation of China ; National Key R&D Program of China ; NSF China ; Major Scientific Research Project of Zhejiang Lab ; Beijing Academy of Artificial Intelligence ; Qualcomm
Source URL	[http://ir.ia.ac.cn/handle/173211/45658]
Collection	Institute of Automation_National Laboratory of Pattern Recognition_Remote Sensing Image Processing Team
Author affiliations	1. Chinese Acad Sci, Inst Automat, Dept Natl Lab Pattern Recognit, Beijing 100190, Peoples R China ; 2. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China ; 3. Huawei Cloud & AI, Beijing 100095, Peoples R China ; 4. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100095, Peoples R China ; 5. Bytedance AI Lab, Beijing 100190, Peoples R China ; 6. Peking Univ, Sch EECS, Key Lab Machine Percept MoE, Beijing 100871, Peoples R China
Recommended citation GB/T 7714	Zhang, Xinbang,Chang, Jianlong,Guo, Yiwen,et al. DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,2021,43(9):2905-2920.
APA	Zhang, Xinbang.,Chang, Jianlong.,Guo, Yiwen.,Meng, Gaofeng.,Xiang, Shiming.,...&Pan, Chunhong.(2021).DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling.IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,43(9),2905-2920.
MLA	Zhang, Xinbang,et al."DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling".IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 43.9(2021):2905-2920.

Ingest method: OAI harvesting

Source: Institute of Automation
