PMID- 32169637
OWN - NLM
STAT- PubMed-not-MEDLINE
DCOM- 20200512
LR  - 20200518
IS  - 1879-1026 (Electronic)
IS  - 0048-9697 (Linking)
VI  - 721
DP  - 2020 Jun 15
TI  - Improving prediction of water quality indices using novel hybrid
      machine-learning algorithms.
PG  - 137612
LID - S0048-9697(20)31123-2 [pii]
LID - 10.1016/j.scitotenv.2020.137612 [doi]
AB  - River water quality assessment is one of the most important tasks to enhance
      water resources management plans. A water quality index (WQI) considers
      several water quality variables simultaneously. Traditionally, WQI
      calculations consume time and are often fraught with errors during
      derivations of sub-indices. In this study, 4 standalone (random forest (RF),
      M5P, random tree (RT), and reduced error pruning tree (REPT)) and 12 hybrid
      data-mining algorithms (combinations of standalones with bagging (BA), CV
      parameter selection (CVPS) and randomizable filtered classification (RFC))
      were used to create Iran WQI (IRWQI(sc)) predictions. Six years (2012 to 2018)
      of monthly data from two water quality monitoring stations within the Talar
      catchment were compiled. Using Pearson correlation coefficients, 10 different
      input combinations were constructed. The data were divided into two groups
      (ratio 70:30) for model building (training dataset) and model validation
      (testing dataset) using a 10-fold cross-validation technique. The models were
      evaluated using several statistical and visual evaluation metrics. Results
      show that fecal coliform (FC) and total solids (TS) had the greatest and least
      effect, respectively, on the prediction of IRWQI(sc). The best input
      combinations varied among the algorithms; generally, variables with very low
      correlations displayed weaker performance. Hybrid algorithms improved the
      prediction power of several of the standalone models, but not all. Hybrid
      BA-RT outperformed the other models (R(2) = 0.941, RMSE = 2.71, MAE = 1.87,
      NSE = 0.941, PBIAS = 0.500). PBIAS indicated that all algorithms, with the
      exceptions of RT, BA-RT and CVPS-REPT, overestimated WQI values.
CI  - Copyright (c) 2020 Elsevier B.V. All rights reserved.
FAU - Bui, Dieu Tien
AU  - Bui DT
AD  - Geographic Information Science Research Group, Ton Duc Thang University, Ho
      Chi Minh City, Viet Nam; Faculty of Environment and Labour Safety, Ton Duc
      Thang University, Ho Chi Minh City, Viet Nam. Electronic address:
      buitiendieu@tdtu.edu.vn.
FAU - Khosravi, Khabat
AU  - Khosravi K
AD  - School of Engineering, University of Guelph, Guelph, Canada. Electronic
      address: kkhosrav@uoguelph.ca.
FAU - Tiefenbacher, John
AU  - Tiefenbacher J
AD  - Department of Geography, Texas State University, San Marcos, TX 78666, USA.
      Electronic address: tief@txstate.edu.
FAU - Nguyen, Hoang
AU  - Nguyen H
AD  - Institute of Research and Development, Duy Tan University, Da Nang 550000,
      Viet Nam. Electronic address: nguyenhoang23@duytan.edu.vn.
FAU - Kazakis, Nerantzis
AU  - Kazakis N
AD  - Aristotle University of Thessaloniki, Department of Geology, Lab. of
      Engineering Geology & Hydrogeology, 54124 Thessaloniki, Greece. Electronic
      address: kazakis@geo.auth.gr.
LA  - eng
PT  - Journal Article
DEP - 20200303
PL  - Netherlands
TA  - Sci Total Environ
JT  - The Science of the total environment
JID - 0330500
SB  - IM
OTO - NOTNLM
OT  - Data mining
OT  - Novel hybrid algorithms
OT  - Prediction
OT  - Water quality index
COIS- Declaration of competing interest The authors declare that they have no known
      competing financial interests or personal relationships that could have
      appeared to influence the work reported in this paper.
EDAT- 2020/03/15 06:00
MHDA- 2020/03/15 06:01
CRDT- 2020/03/15 06:00
PHST- 2020/01/23 00:00 [received]
PHST- 2020/02/26 00:00 [revised]
PHST- 2020/02/26 00:00 [accepted]
PHST- 2020/03/15 06:00 [pubmed]
PHST- 2020/03/15 06:01 [medline]
PHST- 2020/03/15 06:00 [entrez]
AID - S0048-9697(20)31123-2 [pii]
AID - 10.1016/j.scitotenv.2020.137612 [doi]
PST - ppublish
SO  - Sci Total Environ. 2020 Jun 15;721:137612. doi:
      10.1016/j.scitotenv.2020.137612. Epub 2020 Mar 3.
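
The abstract reports RMSE, MAE, NSE and PBIAS for the compared models. As a rough illustration of how such metrics are commonly computed, the sketch below uses the textbook definitions. The record does not state the authors' exact formulas, so the function names, the sample values and the PBIAS sign convention (positive means underestimation, which is consistent with the abstract reporting a positive PBIAS for BA-RT, one of the models it lists as not overestimating) are all assumptions.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def pbias(obs, sim):
    """Percent bias. Assumed convention: negative values mean overestimation."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


# Hypothetical WQI values, invented for illustration only (not from the study).
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 71.0, 83.0]

print(f"RMSE  = {rmse(observed, predicted):.3f}")
print(f"MAE   = {mae(observed, predicted):.3f}")
print(f"NSE   = {nse(observed, predicted):.3f}")
print(f"PBIAS = {pbias(observed, predicted):.3f}")
```

With these made-up values the predictions sum to more than the observations, so PBIAS comes out negative under the assumed convention, which corresponds to overestimation.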