DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling
Document type: Journal article
Authors | Zhang, Xinbang1,2 |
Journal | IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE |
Publication date | 2021-09-01 |
Volume | 43 ; Issue: 9 ; Pages: 2905-2920 |
Keywords | Computer architecture ; Search problems ; Optimization ; Task analysis ; Bridges ; Binary codes ; Estimation ; Neural architecture search (NAS) ; ensemble gumbel-softmax ; distribution guided sampling |
ISSN | 0162-8828 |
DOI | 10.1109/TPAMI.2020.3020315 |
Corresponding author | Chang, Jianlong(jianlong.chang@huawei.com) |
Abstract | Neural architecture search (NAS) is inherently subject to the gap of architectures during searching and validating. To bridge this gap effectively, we develop Differentiable ArchiTecture Approximation (DATA) with Ensemble Gumbel-Softmax (EGS) estimator and Architecture Distribution Constraint (ADC) to automatically approximate architectures during searching and validating in a differentiable manner. Technically, the EGS estimator consists of a group of Gumbel-Softmax estimators, which is capable of converting probability vectors to binary codes and passing gradients reversely, reducing the estimation bias in a differentiable way. To narrow the distribution gap between sampled architectures and supernet, further, the ADC is introduced to reduce the variance of sampling during searching. Benefiting from such modeling, architecture probabilities and network weights in the NAS model can be jointly optimized with the standard back-propagation, yielding an end-to-end learning mechanism for searching deep neural architectures in an extended search space. Conclusively, in the validating process, a high-performance architecture that approaches to the learned one during searching is readily built. Extensive experiments on various tasks including image classification, few-shot learning, unsupervised clustering, semantic segmentation and language modeling strongly demonstrate that DATA is capable of discovering high-performance architectures while guaranteeing the required efficiency. Code is available at https://github.com/XinbangZhang/DATA-NAS. |
WOS keywords | NETWORKS |
Funding projects | Major Project for New Generation of AI[2018AAA0100400] ; National Natural Science Foundation of China[91646207] ; National Natural Science Foundation of China[61976208] ; National Key R&D Program of China[2019AAA0105200] ; NSF China[61625301] ; NSF China[61731018] ; Major Scientific Research Project of Zhejiang Lab[2019KB0AC01] ; Major Scientific Research Project of Zhejiang Lab[2019KB0AB02] ; Beijing Academy of Artificial Intelligence ; Qualcomm |
WOS research areas | Computer Science ; Engineering |
Language | English |
WOS accession number | WOS:000681124300007 |
Publisher | IEEE COMPUTER SOC |
Funding organizations | Major Project for New Generation of AI ; National Natural Science Foundation of China ; National Key R&D Program of China ; NSF China ; Major Scientific Research Project of Zhejiang Lab ; Beijing Academy of Artificial Intelligence ; Qualcomm |
Source URL | [http://ir.ia.ac.cn/handle/173211/45658] |
Collection | Institute of Automation_National Laboratory of Pattern Recognition_Remote Sensing Image Processing Team |
Corresponding author | Chang, Jianlong |
Author affiliations | 1.Chinese Acad Sci, Inst Automat, Dept Natl Lab Pattern Recognit, Beijing 100190, Peoples R China 2.Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China 3.Huawei Cloud & AI, Beijing 100095, Peoples R China 4.Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100095, Peoples R China 5.Bytedance AI Lab, Beijing 100190, Peoples R China 6.Peking Univ, Sch EECS, Key Lab Machine Percept MoE, Beijing 100871, Peoples R China |
Recommended citation (GB/T 7714) | Zhang, Xinbang,Chang, Jianlong,Guo, Yiwen,et al. DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,2021,43(9):2905-2920. |
APA | Zhang, Xinbang.,Chang, Jianlong.,Guo, Yiwen.,Meng, Gaofeng.,Xiang, Shiming.,...&Pan, Chunhong.(2021).DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling.IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,43(9),2905-2920. |
MLA | Zhang, Xinbang,et al."DATA: Differentiable ArchiTecture Approximation With Distribution Guided Sampling".IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 43.9(2021):2905-2920. |
Ingest method: OAI harvesting
Source: Institute of Automation
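The abstract describes the EGS estimator as a group of Gumbel-Softmax estimators that converts a probability vector into a binary code. The sketch below illustrates only that forward sampling step and is not the authors' implementation (see the repository linked in the abstract for that). It assumes the individual one-hot samples are combined by an element-wise union, which is one plausible reading of the abstract; the function names, the temperature `tau`, and the `num_estimators` parameter are hypothetical, and the gradient path mentioned in the abstract is not shown.

```python
import math
import random


def gumbel_softmax_sample(probs, tau, rng):
    """Draw one relaxed (soft) categorical sample via the Gumbel-Softmax trick."""
    logits = []
    for p in probs:
        # Clamp u away from 0 and 1 so both logarithms below stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))  # Gumbel(0, 1) sample
        logits.append((math.log(p + 1e-12) + gumbel_noise) / tau)
    # Numerically stable softmax over the perturbed log-probabilities.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_gumbel_softmax(probs, num_estimators=3, tau=1.0, seed=0):
    """Combine several Gumbel-Softmax samples into one binary code.

    Assumption (not confirmed by the record): the ensemble is an
    element-wise union of the discretized one-hot samples.
    """
    rng = random.Random(seed)
    code = [0] * len(probs)
    for _ in range(num_estimators):
        soft = gumbel_softmax_sample(probs, tau, rng)
        winner = soft.index(max(soft))  # discretize the soft sample
        code[winner] = 1                # union across estimators
    return code


# Hypothetical probabilities over four candidate operations on one edge.
code = ensemble_gumbel_softmax([0.5, 0.3, 0.15, 0.05], num_estimators=3)
print(code)
```

Under this reading, a single estimator can select only one operation per edge, while an ensemble of `num_estimators` samples can switch on between one and `num_estimators` operations, which is how a probability vector can map to a multi-hot binary code.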