Chinese Academy of Sciences Institutional Repositories Grid
CLADES: A classification-based machine learning method for species delimitation from population genetic data

Document type: Journal article

Authors: Pei, Jingwen (2); Chu, Chong (1,2); Li, Xin (2); Lu, Bin (3); Wu, Yufeng (2)
Journal: MOLECULAR ECOLOGY RESOURCES
Publication date: 2018-09-01
Volume: 18; Issue: 5; Pages: 1144-1156
Keywords: classification; machine learning; population genetics; species delimitation
ISSN: 1755-098X
DOI: 10.1111/1755-0998.12887
Ownership ranking: 3
Document subtype: Article
Abstract: Species are considered to be the basic unit of ecological and evolutionary studies. As multilocus genomic data are increasingly available, there has been considerable interest in using DNA sequence data to delimit species. In this study, we show that machine learning can be used for species delimitation. Our method treats the species delimitation problem as a classification problem for identifying the category of a new observation on the basis of training data. Extensive simulation is first conducted over a broad range of evolutionary parameters for training purposes. Each pair of known populations is combined to form training samples with a label of same species or different species. We use a support vector machine (SVM) to train a classifier using a set of summary statistics computed from training samples as features. The trained classifier can assign a test sample to one of two outcomes: same species or different species. Given multilocus genomic data of multiple related organisms or populations, our method (called CLADES) performs species delimitation by first classifying pairs of populations. CLADES then delimits species by maximizing the likelihood of species assignment for multiple populations. CLADES is evaluated through extensive simulation and also tested on real genetic data. We show that CLADES is both accurate and efficient for species delimitation when compared with existing methods. CLADES can be useful especially when existing methods have difficulty in delimitation, for example with short species divergence time and gene flow.
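The second stage described in the abstract, turning pairwise same-species calls into one assignment for all populations, can be illustrated with a minimal sketch. It is not the CLADES implementation. It assumes the pairwise classifier has already produced a probability that each pair of populations belongs to the same species, and it assumes a simple likelihood in which pairs are treated as independent; the abstract does not state the likelihood CLADES actually uses. The function names and the example probabilities are hypothetical.

```python
import math
from itertools import combinations


def partitions(items):
    """Yield every set partition of `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Either `first` forms a new block of its own ...
        yield [[first]] + part
        # ... or it joins each existing block in turn.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def log_likelihood(partition, p_same, eps=1e-12):
    """Score a partition under an ASSUMED independent-pairs model.

    p_same[(a, b)] is the (hypothetical) classifier probability that
    populations a and b are the same species, keyed with a < b.
    """
    label = {pop: k for k, block in enumerate(partition) for pop in block}
    total = 0.0
    for a, b in combinations(sorted(label), 2):
        # Clamp to avoid log(0) when the classifier is fully confident.
        p = min(max(p_same[(a, b)], eps), 1.0 - eps)
        total += math.log(p if label[a] == label[b] else 1.0 - p)
    return total


def best_delimitation(populations, p_same):
    """Return the partition with the highest log-likelihood (exhaustive search)."""
    return max(partitions(list(populations)),
               key=lambda part: log_likelihood(part, p_same))


# Made-up pairwise probabilities for three populations.
p_same = {("A", "B"): 0.95, ("A", "C"): 0.10, ("B", "C"): 0.20}
best = best_delimitation(["A", "B", "C"], p_same)
species = sorted(sorted(block) for block in best)
print(species)  # [['A', 'B'], ['C']]
```

Exhaustive enumeration is only practical for a handful of populations, because the number of partitions grows as the Bell numbers (15 for four populations, 52 for five); a real tool would need a more economical search.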
Subject: Environment; Ecology
WOS keywords: MULTISPECIES COALESCENT
WOS research areas: Biochemistry & Molecular Biology; Environmental Sciences & Ecology; Evolutionary Biology
Language: English
WOS accession number: WOS:000441753000018
Publisher: WILEY
Source URL: [http://210.75.237.14/handle/351003/30173]
Collection: Food Safety and Environmental Governance_CAS Key Laboratory of Environmental and Applied Microbiology
Author affiliations:
1. Harvard Med Sch, Dept Biomed Informat, Boston, MA USA;
2. Univ Connecticut, Dept Comp Sci & Engn, Storrs, CT 06269 USA;
3. Chinese Acad Sci, Chengdu Inst Biol, Chengdu, Sichuan, Peoples R China
Recommended citation
GB/T 7714
Pei, Jingwen, Chu, Chong, Li, Xin, et al. CLADES: A classification-based machine learning method for species delimitation from population genetic data[J]. MOLECULAR ECOLOGY RESOURCES, 2018, 18(5): 1144-1156.
APA Pei, Jingwen, Chu, Chong, Li, Xin, Lu, Bin, & Wu, Yufeng. (2018). CLADES: A classification-based machine learning method for species delimitation from population genetic data. MOLECULAR ECOLOGY RESOURCES, 18(5), 1144-1156.
MLA Pei, Jingwen, et al. "CLADES: A classification-based machine learning method for species delimitation from population genetic data." MOLECULAR ECOLOGY RESOURCES 18.5 (2018): 1144-1156.

Ingest method: OAI harvesting

Source: Chengdu Institute of Biology
