CLADES: A classification-based machine learning method for species delimitation from population genetic data
Document type: Journal article
Authors | Pei, Jingwen2; Chu, Chong1,2; Li, Xin2; Lu, Bin3; Wu, Yufeng2 |
Journal | MOLECULAR ECOLOGY RESOURCES
Publication date | 2018-09-01 |
Volume | 18; Issue: 5; Pages: 1144-1156 |
Keywords | classification; machine learning; population genetics; species delimitation |
ISSN | 1755-098X |
DOI | 10.1111/1755-0998.12887 |
Affiliation rank | 3 |
Document subtype | Article |
Abstract | Species are considered to be the basic unit of ecological and evolutionary studies. As multilocus genomic data become increasingly available, there has been considerable interest in using DNA sequence data to delimit species. In this study, we show that machine learning can be used for species delimitation. Our method treats species delimitation as a classification problem: identifying the category of a new observation on the basis of training data. Extensive simulation is first conducted over a broad range of evolutionary parameters for training purposes. Each pair of known populations is combined to form a training sample labelled either same species or different species. We use a support vector machine (SVM) to train a classifier, with a set of summary statistics computed from the training samples as features. The trained classifier assigns a test sample to one of two outcomes: same species or different species. Given multilocus genomic data from multiple related organisms or populations, our method (called CLADES) performs species delimitation by first classifying pairs of populations. CLADES then delimits species by maximizing the likelihood of the species assignment for multiple populations. CLADES is evaluated through extensive simulation and is also tested on real genetic data. We show that CLADES is both accurate and efficient for species delimitation when compared with existing methods. CLADES can be especially useful when existing methods have difficulty with delimitation, for example with short species divergence times and gene flow. |
Subject | Environment ; Ecology |
WOS keywords | MULTISPECIES COALESCENT |
WOS research areas | Biochemistry & Molecular Biology ; Environmental Sciences & Ecology ; Evolutionary Biology |
Language | English |
WOS accession number | WOS:000441753000018 |
Publisher | WILEY |
Source URL | [http://210.75.237.14/handle/351003/30173] |
Collection | Food Safety and Environmental Governance_CAS Key Laboratory of Environmental and Applied Microbiology |
Author affiliations | 1. Harvard Med Sch, Dept Biomed Informat, Boston, MA USA; 2. Univ Connecticut, Dept Comp Sci & Engn, Storrs, CT 06269 USA; 3. Chinese Acad Sci, Chengdu Inst Biol, Chengdu, Sichuan, Peoples R China |
Recommended citation (GB/T 7714) | Pei, Jingwen, Chu, Chong, Li, Xin, et al. CLADES: A classification-based machine learning method for species delimitation from population genetic data[J]. MOLECULAR ECOLOGY RESOURCES, 2018, 18(5): 1144-1156. |
APA | Pei, Jingwen, Chu, Chong, Li, Xin, Lu, Bin, & Wu, Yufeng. (2018). CLADES: A classification-based machine learning method for species delimitation from population genetic data. MOLECULAR ECOLOGY RESOURCES, 18(5), 1144-1156. |
MLA | Pei, Jingwen, et al. "CLADES: A classification-based machine learning method for species delimitation from population genetic data". MOLECULAR ECOLOGY RESOURCES 18.5 (2018): 1144-1156. |
Ingestion method: OAI harvesting
Source: Chengdu Institute of Biology
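
The abstract describes two stages: classifying each pair of populations as same species or different species, then choosing the species assignment that maximizes a likelihood over all populations. The second stage can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the likelihood, so this sketch assumes the pairwise classifications are independent and that the classifier supplies a same-species probability for each pair. The names `set_partitions`, `log_likelihood`, `delimit` and `p_same`, and the probability values, are all hypothetical.

```python
from itertools import combinations
from math import log


def set_partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing block in turn...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ...or open a new block containing only `first`.
        yield [[first]] + partition


def log_likelihood(partition, p_same):
    """Log-likelihood of a partition, ASSUMING independent pairwise calls."""
    block_of = {pop: i for i, block in enumerate(partition) for pop in block}
    total = 0.0
    for a, b in combinations(sorted(block_of), 2):
        # Clamp so that probabilities of exactly 0 or 1 do not break log().
        p = min(max(p_same[frozenset((a, b))], 1e-9), 1.0 - 1e-9)
        total += log(p) if block_of[a] == block_of[b] else log(1.0 - p)
    return total


def delimit(populations, p_same):
    """Return the partition of `populations` with the highest log-likelihood."""
    return max(set_partitions(list(populations)),
               key=lambda part: log_likelihood(part, p_same))


# Hypothetical same-species probabilities for three populations,
# standing in for the output of a trained pairwise classifier.
p_same = {
    frozenset(("A", "B")): 0.95,
    frozenset(("A", "C")): 0.10,
    frozenset(("B", "C")): 0.20,
}
best = delimit(["A", "B", "C"], p_same)
species = sorted(sorted(block) for block in best)
print(species)  # [['A', 'B'], ['C']]
```

With these made-up probabilities, grouping A and B as one species and C as another scores highest. Exhaustive enumeration is only practical for a handful of populations, because the number of partitions grows as the Bell numbers; a larger problem would need a heuristic search.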