PMID- 36475377
OWN - NLM
STAT- MEDLINE
DCOM- 20221221
LR  - 20230105
IS  - 1520-5851 (Electronic)
IS  - 0013-936X (Linking)
VI  - 56
IP  - 24
DP  - 2022 Dec 20
TI  - Machine Learning-Based Models with High Accuracy and Broad Applicability
      Domains for Screening PMT/vPvM Substances.
PG  - 17880-17889
LID - 10.1021/acs.est.2c06155 [doi]
AB  - Persistent, mobile, and toxic (PMT) substances and very persistent and very
      mobile (vPvM) substances can be transported over long distances from various
      sources, increasing the public health risk. Rapid, high-throughput screening
      of PMT/vPvM substances is thus warranted to support risk prevention and
      mitigation measures. Herein, we construct a machine learning-based screening
      system integrating five models for high-throughput classification of
      PMT/vPvM substances. The models are constructed with 44 971 substances by
      conventional learning, deep learning, and ensemble learning algorithms, among
      which LightGBM and XGBoost outperform the other algorithms, with metrics
      exceeding 0.900. Good model interpretability is achieved through the number
      of free halogen atoms (fr_halogen) and the logarithm of partition coefficient
      (MolLogP) as the two most critical molecular descriptors representing the
      persistence and mobility of substances, respectively. Our screening system
      exhibits strong generalization capability, with area under the receiver
      operating characteristic curve (AUROC) above 0.951, and is successfully
      applied to persistent organic pollutants (POPs), prioritized PMT/vPvM
      substances, and pesticides. The screening system constructed in this study
      can serve as an efficient and reliable tool for high-throughput risk
      assessment and for prioritizing the management of emerging contaminants.
FAU - Zhao, Qiming
AU  - Zhao Q
AD  - Key Laboratory of Environment Remediation and Ecological Health, Ministry of
      Education, College of Environmental and Resource Sciences, Zhejiang
      University, Hangzhou 310058, China.
FAU - Yu, Yang
AU  - Yu Y
AD  - Solid Waste and Chemicals Management Center, Ministry of Ecology and
      Environment of the People's Republic of China, Beijing 100029, China.
FAU - Gao, Yuchen
AU  - Gao Y
AD  - Key Laboratory of Environment Remediation and Ecological Health, Ministry of
      Education, College of Environmental and Resource Sciences, Zhejiang
      University, Hangzhou 310058, China.
FAU - Shen, Lilai
AU  - Shen L
AD  - Key Laboratory of Environment Remediation and Ecological Health, Ministry of
      Education, College of Environmental and Resource Sciences, Zhejiang
      University, Hangzhou 310058, China.
FAU - Cui, Shixuan
AU  - Cui S
AD  - Key Laboratory of Environment Remediation and Ecological Health, Ministry of
      Education, College of Environmental and Resource Sciences, Zhejiang
      University, Hangzhou 310058, China.
AD  - Women's Reproductive Health Key Laboratory of Zhejiang Province, Women's
      Hospital, School of Medicine, Zhejiang University, Hangzhou 310006, China.
FAU - Gou, Yiyuan
AU  - Gou Y
AD  - Key Laboratory of Environment Remediation and Ecological Health, Ministry of
      Education, College of Environmental and Resource Sciences, Zhejiang
      University, Hangzhou 310058, China.
FAU - Zhang, Chunlong
AU  - Zhang C
AD  - Department of Environmental Sciences, University of Houston-Clear Lake, 2700
      Bay Area Blvd., Houston, Texas 77058, United States.
FAU - Zhuang, Shulin
AU  - Zhuang S
AUID- ORCID: 0000-0002-7774-7239
AD  - Key Laboratory of Environment Remediation and Ecological Health, Ministry of
      Education, College of Environmental and Resource Sciences, Zhejiang
      University, Hangzhou 310058, China.
AD  - Women's Reproductive Health Key Laboratory of Zhejiang Province, Women's
      Hospital, School of Medicine, Zhejiang University, Hangzhou 310006, China.
FAU - Jiang, Guibin
AU  - Jiang G
AUID- ORCID: 0000-0002-6335-3917
AD  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research
      Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing
      100085, China.
LA  - eng
PT  - Journal Article
PT  - Research Support, Non-U.S. Gov't
DEP - 20221206
PL  - United States
TA  - Environ Sci Technol
JT  - Environmental science & technology
JID - 0213155
SB  - IM
MH  - *Algorithms
MH  - *Machine Learning
OTO - NOTNLM
OT  - PMT/vPvM substances
OT  - hazard classification
OT  - high-throughput screening
OT  - machine learning
OT  - risk management
EDAT- 2022/12/08 06:00
MHDA- 2022/12/22 06:00
CRDT- 2022/12/07 03:02
PHST- 2022/12/08 06:00 [pubmed]
PHST- 2022/12/22 06:00 [medline]
PHST- 2022/12/07 03:02 [entrez]
AID - 10.1021/acs.est.2c06155 [doi]
PST - ppublish
SO  - Environ Sci Technol. 2022 Dec 20;56(24):17880-17889. doi:
      10.1021/acs.est.2c06155. Epub 2022 Dec 6.