Chinese Academy of Sciences Institutional Repositories Grid
DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling

Document Type: Journal Article

Authors: Zhang, Xinbang (1,2); Chang, Jianlong (3,4); Guo, Yiwen (5); Meng, Gaofeng (1,2); Xiang, Shiming (1,2); Lin, Zhouchen (6); Pan, Chunhong (1)
Journal: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Publication Date: 2021-09-01
Volume: 43, Issue: 9, Pages: 2905-2920
Keywords: Computer architecture; Search problems; Optimization; Task analysis; Bridges; Binary codes; Estimation; Neural architecture search (NAS); ensemble gumbel-softmax; distribution guided sampling
ISSN: 0162-8828
DOI: 10.1109/TPAMI.2020.3020315
Corresponding Author: Chang, Jianlong (jianlong.chang@huawei.com)
Abstract: Neural architecture search (NAS) is inherently subject to a gap between the architectures used during searching and validating. To bridge this gap effectively, we develop Differentiable ArchiTecture Approximation (DATA) with an Ensemble Gumbel-Softmax (EGS) estimator and an Architecture Distribution Constraint (ADC) to automatically approximate architectures during searching and validating in a differentiable manner. Technically, the EGS estimator consists of a group of Gumbel-Softmax estimators, which is capable of converting probability vectors to binary codes and passing gradients in the backward pass, reducing the estimation bias in a differentiable way. Further, to narrow the distribution gap between the sampled architectures and the supernet, the ADC is introduced to reduce the variance of sampling during searching. Benefiting from such modeling, architecture probabilities and network weights in the NAS model can be jointly optimized with standard back-propagation, yielding an end-to-end learning mechanism for searching deep neural architectures in an extended search space. Finally, in the validating process, a high-performance architecture that closely approximates the one learned during searching is readily built. Extensive experiments on various tasks including image classification, few-shot learning, unsupervised clustering, semantic segmentation, and language modeling strongly demonstrate that DATA is capable of discovering high-performance architectures while guaranteeing the required efficiency. Code is available at https://github.com/XinbangZhang/DATA-NAS.
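For readers unfamiliar with the mechanism the abstract refers to, below is a minimal sketch of a single straight-through Gumbel-Softmax estimator, the building block that EGS groups into an ensemble. This is not the authors' implementation (the ensemble construction and the ADC regularizer are omitted; see the repository linked above for the official code), and the function name and toy dimensions are illustrative only. The forward pass emits a binary one-hot code over candidate operations, while gradients flow back to the architecture logits through the soft relaxation.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """Draw a hard (one-hot) sample that stays differentiable w.r.t. `logits`."""
    # Gumbel(0, 1) noise: -log(-log(U)), sampled here via -log(Exp(1)).
    gumbels = -torch.empty_like(logits).exponential_().log()
    y_soft = F.softmax((logits + gumbels) / tau, dim=-1)        # relaxed sample
    index = y_soft.argmax(dim=-1, keepdim=True)
    y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)  # binary code
    # Straight-through trick: hard values forward, soft gradients backward.
    return y_hard - y_soft.detach() + y_soft

# Toy usage: architecture logits over 4 candidate operations on one edge.
logits = torch.randn(4, requires_grad=True)
code = gumbel_softmax_sample(logits, tau=0.5)  # one-hot, e.g. [0., 1., 0., 0.]
code.sum().backward()                          # gradients reach `logits`
```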

WOS Keywords: NETWORKS
Funding Projects: Major Project for New Generation of AI [2018AAA0100400]; National Natural Science Foundation of China [91646207]; National Natural Science Foundation of China [61976208]; National Key R&D Program of China [2019AAA0105200]; NSF China [61625301]; NSF China [61731018]; Major Scientific Research Project of Zhejiang Lab [2019KB0AC01]; Major Scientific Research Project of Zhejiang Lab [2019KB0AB02]; Beijing Academy of Artificial Intelligence; Qualcomm
WOS Research Areas: Computer Science; Engineering
Language: English
WOS Record Number: WOS:000681124300007
Publisher: IEEE COMPUTER SOC
Funding Organizations: Major Project for New Generation of AI; National Natural Science Foundation of China; National Key R&D Program of China; NSF China; Major Scientific Research Project of Zhejiang Lab; Beijing Academy of Artificial Intelligence; Qualcomm
Source URL: http://ir.ia.ac.cn/handle/173211/45658
Collection: Institute of Automation, National Laboratory of Pattern Recognition, Remote Sensing Image Processing Team
Author Affiliations:
1. Chinese Acad Sci, Inst Automat, Dept Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
2. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
3. Huawei Cloud & AI, Beijing 100095, Peoples R China
4. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100095, Peoples R China
5. Bytedance AI Lab, Beijing 100190, Peoples R China
6. Peking Univ, Sch EECS, Key Lab Machine Percept MoE, Beijing 100871, Peoples R China
Recommended Citation:
GB/T 7714
Zhang, Xinbang, Chang, Jianlong, Guo, Yiwen, et al. DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43(9): 2905-2920.
APA Zhang, Xinbang., Chang, Jianlong., Guo, Yiwen., Meng, Gaofeng., Xiang, Shiming., ... & Pan, Chunhong. (2021). DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 43(9), 2905-2920.
MLA Zhang, Xinbang, et al. "DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling". IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 43.9 (2021): 2905-2920.

Deposit Method: OAI Harvesting

Source: Institute of Automation

