Finding optimal threshold for correction error reads in DNA assembling
Document type: Conference paper
Authors | Chin Francis Y. L. ; Leung Henry C. M. ; Li Wei-Lin ; Yiu Siu-Ming |
Publication date | 2009 |
Conference name | 9th Asia Pacific Bioinformatics Conference |
Conference date | JAN 13-16, |
Conference location | Beijing, PEOPLES R CHINA |
Pages | - |
Abstract (English) | Background: DNA assembly is the problem of determining the nucleotide sequence of a genome from its substrings, called reads. Sequencing experiments may introduce errors into the reads, which degrade the performance of DNA assembly algorithms. Existing algorithms, e.g. ECINDEL and SRCorr, correct erroneous reads by considering the number of times each length-k substring of the reads appears in the input. They treat length-k substrings that appear at least M times as correct and correct the erroneous reads based on these substrings. However, since the threshold M is chosen without solid theoretical analysis, these algorithms cannot guarantee their error-correction performance. Results: In this paper, we propose a method to calculate the probabilities of false positives and false negatives when determining whether a length-k substring is correct using threshold M. Based on this calculation, we find the optimal threshold M that minimizes the total errors (false positives and false negatives). Experimental results on both real and simulated data showed that our calculation is correct and that we can reduce the total number of erroneous substrings by 77.6% and 65.1% compared to ECINDEL and SRCorr, respectively. Conclusion: We introduced a method to calculate the probabilities of false positives and false negatives for length-k substrings under different thresholds. Based on this calculation, we found the optimal threshold that minimizes the total error (false positives plus false negatives). |
Indexed in | SCI, ISTP |
Proceedings publisher | BMC BIOINFORMATICS |
Proceedings place of publication | CURRENT SCIENCE GROUP, MIDDLESEX HOUSE, 34-42 CLEVELAND ST, LONDON W1T 4LB, ENGLAND |
Language | English |
ISSN | 1471-2105 |
WOS accession number | WOS:000265601900015 |
Source URL | [http://124.16.136.157/handle/311060/8192] |
Collection | Institute of Software_Institute of Software Library_2009 Journal/Conference Papers |
Recommended citation (GB/T 7714) | Chin Francis Y. L., Leung Henry C. M., Li Wei-Lin, et al. Finding optimal threshold for correction error reads in DNA assembling[C]. In: 9th Asia Pacific Bioinformatics Conference. Beijing, PEOPLES R CHINA. JAN 13-16,. |
Ingest method: OAI harvesting
Source: Institute of Software
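
The abstract describes classifying length-k substrings as correct when they appear at least M times, and choosing M to minimize false positives plus false negatives. The following is a minimal sketch of that idea, not the authors' calculation: it assumes, purely for illustration, that occurrence counts of correct and erroneous substrings follow Poisson distributions with hypothetical means `lam_correct` and `lam_error`, and that the numbers of distinct correct and erroneous substrings (`n_correct`, `n_error`) are known. All function and parameter names are made up for this example.

```python
from collections import Counter
from math import exp


def count_kmers(reads, k):
    """Count every length-k substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def poisson_cdf(m, lam):
    """P(X <= m) for X ~ Poisson(lam); 0 when m < 0."""
    if m < 0:
        return 0.0
    term = exp(-lam)
    total = term
    for i in range(1, m + 1):
        term *= lam / i
        total += term
    return min(total, 1.0)


def expected_errors(m, n_correct, lam_correct, n_error, lam_error):
    """Expected (false positives, false negatives) at threshold m.

    ASSUMPTION: Poisson-distributed counts; the paper derives its own
    probabilities, which are not reproduced here.
    """
    # False negative: a correct substring seen fewer than m times.
    fn = n_correct * poisson_cdf(m - 1, lam_correct)
    # False positive: an erroneous substring seen at least m times.
    fp = n_error * max(0.0, 1.0 - poisson_cdf(m - 1, lam_error))
    return fp, fn


def best_threshold(n_correct, lam_correct, n_error, lam_error, max_m=50):
    """Return the threshold in 1..max_m minimizing expected FP + FN."""
    return min(
        range(1, max_m + 1),
        key=lambda m: sum(
            expected_errors(m, n_correct, lam_correct, n_error, lam_error)
        ),
    )


if __name__ == "__main__":
    counts = count_kmers(["ACGTACGT", "CGTACGTT"], k=4)
    m = best_threshold(n_correct=1000, lam_correct=20.0,
                       n_error=1000, lam_error=0.5)
    solid = {kmer for kmer, c in counts.items() if c >= m}
    print("threshold:", m, "solid k-mers:", sorted(solid))
```

Under these assumptions, raising M lowers the expected false positives and raises the expected false negatives, so the minimum of their sum lies between the two means when they are well separated.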