Developing machine learning approaches to identify candidate persistent, mobile and toxic (PMT) and very persistent and very mobile (vPvM) substances based on molecular structure
Document type: Journal article
Authors | Han, Min [2,3,4]; Jin, Biao [2,3,4] |
Journal | WATER RESEARCH |
Publication date | 2023-10-01 |
Volume | 244; Pages: 12 |
Keywords | Pollutants; Water quality; Machine learning; Chemicals |
ISSN | 0043-1354 |
DOI | 10.1016/j.watres.2023.120470 |
Abstract | Determining which substances on the global market could be classified as persistent, mobile and toxic (PMT) substances or very persistent and very mobile (vPvM) substances is essential to prevent or reduce drinking water contamination from them. This study developed machine learning models based on different molecular descriptors (MDs) and defined applicability domains for the screening of PMT/vPvM substances. The models were trained with 3111 substances with expert weight-of-evidence based PMT/vPvM hazard classifications that considered the highest quality data available. The models were based on the hypothesis that PMT/vPvM substances contain similar MDs that are representative of chemical structures resistant to degradation, are associated with low sorption (or high water solubility) and in some cases are associated with known toxic mechanisms. All possible model combinations were tested by integrating different molecular description methods, data balancing strategies and machine learning algorithms. Our model allows one-step prediction of candidate PMT/vPvM substances, and our method was compared with the approach of predicting P, M and T separately (i.e. three-step prediction). The results showed that the one-step model achieved a higher accuracy of 92% for PMT/vPvM identification (i.e. positive samples) for an internal test set, and also resulted in a higher accuracy of 90% for an external test set of chemical pollutants detected in Taihu Lake, China. Furthermore, the prediction mechanism of the model was interpreted using Shapley additive explanations (SHAP). This work presents an advance in big-data in silico screening models for the identification of substances that potentially meet the PMT/vPvM criteria. |
WOS research areas | Engineering ; Environmental Sciences & Ecology ; Water Resources |
Language | English |
WOS accession number | WOS:001063597800001 |
Source URL | [http://ir.gig.ac.cn/handle/344008/74804] |
Collection | State Key Laboratory of Organic Geochemistry |
Corresponding author | Jin, Biao |
Author affiliations | 1. Norwegian Univ Sci & Technol NTNU, NO-7491 Trondheim, Norway; 2. Chinese Acad Sci, Guangzhou Inst Geochem, State Key Lab Organ Geochem, Guangzhou 510640, Peoples R China; 3. CAS Ctr Excellence Deep Earth Sci, Guangzhou 510640, Peoples R China; 4. Univ Chinese Acad Sci, Beijing 10069, Peoples R China; 5. South China Normal Univ, Sch Software, Foshan 528225, Peoples R China; 6. Norwegian Geotech Inst NGI, POB 3930 Ullevaal Stad, N-0806 Oslo, Norway |
Recommended citation GB/T 7714 | Han, Min,Jin, Biao,Liang, Jun,et al. Developing machine learning approaches to identify candidate persistent, mobile and toxic (PMT) and very persistent and very mobile (vPvM) substances based on molecular structure[J]. WATER RESEARCH,2023,244:12. |
APA | Han, Min,Jin, Biao,Liang, Jun,Huang, Chen,&Arp, Hans Peter H..(2023).Developing machine learning approaches to identify candidate persistent, mobile and toxic (PMT) and very persistent and very mobile (vPvM) substances based on molecular structure.WATER RESEARCH,244,12. |
MLA | Han, Min,et al."Developing machine learning approaches to identify candidate persistent, mobile and toxic (PMT) and very persistent and very mobile (vPvM) substances based on molecular structure".WATER RESEARCH 244(2023):12. |
Ingest method: OAI harvesting
Source: Guangzhou Institute of Geochemistry