Chinese Academy of Sciences Institutional Repositories Grid
Transformer-Based Semantic Segmentation for Extraction of Building Footprints from Very-High-Resolution Images

Document type: Journal article

Authors: Song, Jia; Zhu, A-Xing; Zhu, Yunqiang
Journal: SENSORS
Publication date: 2023-05-29
Volume: 23; Issue: 11; Page: 5166
ISSN: 1424-8220
Keywords: vision transformer; hyperparameter; building; self-attention; deep learning
DOI: 10.3390/s23115166
Ownership ranking: 1
Document subtype: Article
Abstract: Semantic segmentation with deep learning networks has become an important approach to the extraction of objects from very high-resolution remote sensing images. Vision Transformer networks have shown significant improvements in performance compared to traditional convolutional neural networks (CNNs) in semantic segmentation. Vision Transformer networks have different architectures to CNNs. Image patches, linear embedding, and multi-head self-attention (MHSA) are several of the main hyperparameters. How we should configure them for the extraction of objects in VHR images and how they affect the accuracy of networks are topics that have not been sufficiently investigated. This article explores the role of vision Transformer networks in the extraction of building footprints from very-high-resolution (VHR) images. Transformer-based models with different hyperparameter values were designed and compared, and their impact on accuracy was analyzed. The results show that smaller image patches and higher-dimension embeddings result in better accuracy. In addition, the Transformer-based network is shown to be scalable and can be trained with general-scale graphics processing units (GPUs) with comparable model sizes and training times to convolutional neural networks while achieving higher accuracy. The study provides valuable insights into the potential of vision Transformer networks in object extraction using VHR images.
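Illustrative note (not part of the source record): the abstract names patch size, embedding dimension, and multi-head self-attention as the main hyperparameters. The minimal Python sketch below shows how these settings change the size of the problem in a generic vision Transformer. The function names, the 512-pixel tile, and the specific patch sizes, embedding dimensions, and head counts are assumptions chosen for illustration; they are not values reported in the article.

```python
def patch_token_count(image_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches (tokens) in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


def linear_embedding_params(patch_size: int, channels: int, embed_dim: int) -> int:
    """Weights plus biases of the linear layer that maps one flattened patch
    (patch_size * patch_size * channels values) to an embed_dim vector."""
    return patch_size * patch_size * channels * embed_dim + embed_dim


def attention_matrix_entries(tokens: int, heads: int) -> int:
    """Entries in the per-layer attention score matrices: each head compares
    every token with every token, so the cost grows quadratically."""
    return heads * tokens * tokens


if __name__ == "__main__":
    # Hypothetical settings: 512 x 512 RGB tile, 8 attention heads.
    for patch_size, embed_dim in [(32, 384), (16, 768), (8, 768)]:
        tokens = patch_token_count(512, patch_size)
        params = linear_embedding_params(patch_size, 3, embed_dim)
        scores = attention_matrix_entries(tokens, 8)
        print(f"patch={patch_size:>2} embed={embed_dim} "
              f"tokens={tokens} embed_params={params} attn_entries={scores}")
```

Halving the patch size quadruples the token count and multiplies the attention score entries by sixteen, which is the usual trade-off behind the choice of patch size: finer patches preserve more spatial detail but cost more memory and compute.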
Subject areas: Chemistry ; Engineering ; Instruments & Instrumentation
WOS keywords: CLASSIFICATION ; NETWORK
WOS research areas: Chemistry ; Engineering ; Instruments & Instrumentation
Publisher: MDPI
Source URL: [http://ir.igsnrr.ac.cn/handle/311030/193780]
Collection: State Key Laboratory of Resources and Environmental Information System_Foreign-language papers
Author affiliations: 1.Chinese Academy of Sciences
2.Institute of Geographic Sciences & Natural Resources Research, CAS
3.University of Wisconsin Madison
4.University of Wisconsin System
Recommended citation
GB/T 7714
Song, Jia,Zhu, A-Xing,Zhu, Yunqiang. Transformer-Based Semantic Segmentation for Extraction of Building Footprints from Very-High-Resolution Images[J]. SENSORS,2023,23(11):5166.
APA Song, Jia,Zhu, A-Xing,&Zhu, Yunqiang.(2023).Transformer-Based Semantic Segmentation for Extraction of Building Footprints from Very-High-Resolution Images.SENSORS,23(11),5166.
MLA Song, Jia,et al."Transformer-Based Semantic Segmentation for Extraction of Building Footprints from Very-High-Resolution Images".SENSORS 23.11(2023):5166.

Ingest method: OAI harvesting

Source: Institute of Geographic Sciences and Natural Resources Research


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.