PMID- 32007881
OWN - NLM
STAT- PubMed-not-MEDLINE
DCOM- 20200303
LR  - 20200303
IS  - 1879-1026 (Electronic)
IS  - 0048-9697 (Linking)
VI  - 715
DP  - 2020 May 1
TI  - Enhancing nitrate and strontium concentration prediction in groundwater by
      using new data mining algorithm.
PG  - 136836
LID - S0048-9697(20)30346-6 [pii]
LID - 10.1016/j.scitotenv.2020.136836 [doi]
AB  - Groundwater resources constitute the main source of clean fresh water for
      domestic use and are essential for food production in the agricultural
      sector. Groundwater plays a vital role in water supply in the Campanian
      Plain in Italy, and hence the future sustainability of the resource is
      essential for the region. In the current paper, novel data mining
      algorithms including Gaussian Process (GP) were applied to a large
      groundwater quality database to predict nitrate (contaminant) and
      strontium (potential future increasing) concentrations in groundwater.
      The results were compared with M5P, random forest (RF) and random tree
      (RT) algorithms as a benchmark to test the robustness of the modeling
      process. The dataset includes 246 groundwater quality samples originating
      from different wells, both municipal and agricultural. For the modeling
      process, it was divided into two subgroups by using the 10-fold cross
      validation technique: 173 samples for model building (training dataset)
      and 73 samples for model validation (testing dataset). Different water
      quality variables including T, pH, EC, HCO(3)(-), F(-), Cl(-),
      SO(4)(2)(-), Na(+), K(+), Mg(2+), and Ca(2+) were used as inputs to the
      models. In the first stage, different input combinations were constructed
      based on the correlation coefficient, and the optimal combination was
      then chosen for the modeling phase. Different quantitative criteria,
      alongside a visual comparison approach, were used to evaluate the
      modeling capability.
      Results revealed that, to obtain reliable results, variables with low
      correlation should also be considered as inputs to the models together
      with those variables showing high correlation coefficients. According to
      the model evaluation criteria, the GP algorithm outperforms all the other
      models in predicting both nitrate and strontium concentrations, followed
      by RF, M5P and RT, respectively. Results also revealed that the model's
      structure, together with the accuracy and structure of the data, can have
      a relevant impact on the model's results.
CI  - Copyright (c) 2020 Elsevier B.V. All rights reserved.
FAU - Bui, Dieu Tien
AU  - Bui DT
AD  - Geographic Information Science Research Group, Ton Duc Thang University,
      Ho Chi Minh City, Viet Nam; Faculty of Environment and Labour Safety, Ton
      Duc Thang University, Ho Chi Minh City, Viet Nam. Electronic address:
      buitiendieu@tdtu.edu.vn.
FAU - Khosravi, Khabat
AU  - Khosravi K
AD  - School of Engineering, University of Guelph, ON, Canada.
FAU - Karimi, Mahshid
AU  - Karimi M
AD  - Department of Watershed Management, Sari Agricultural Science and Natural
      Resources University, Sari, Iran.
FAU - Busico, Gianluigi
AU  - Busico G
AD  - University of Campania "Luigi Vanvitelli", Department of Environmental,
      Biological and Pharmaceutical Sciences and Technologies, Via Vivaldi 43,
      81100, Caserta, Italy.
FAU - Khozani, Zohreh Sheikh
AU  - Khozani ZS
AD  - Department of Civil Engineering, Faculty of Engineering & Built
      Environment, Universiti Kebangsaan, Malaysia.
FAU - Nguyen, Hoang
AU  - Nguyen H
AD  - Institute of Research and Development, Duy Tan University, Da Nang 550000,
      Viet Nam. Electronic address: nguyenhoang23@duytan.edu.vn.
FAU - Mastrocicco, Micol
AU  - Mastrocicco M
AD  - University of Campania "Luigi Vanvitelli", Department of Environmental,
      Biological and Pharmaceutical Sciences and Technologies, Via Vivaldi 43,
      81100, Caserta, Italy.
FAU - Tedesco, Dario
AU  - Tedesco D
AD  - University of Campania "Luigi Vanvitelli", Department of Environmental,
      Biological and Pharmaceutical Sciences and Technologies, Via Vivaldi 43,
      81100, Caserta, Italy; Istituto Nazionale di Geofisica e Vulcanologia,
      sezione di Napoli - Osservatorio Vesuviano, Via Diocleziano 328 - Napoli,
      Italy.
FAU - Cuoco, Emilio
AU  - Cuoco E
AD  - University of Campania "Luigi Vanvitelli", Department of Environmental,
      Biological and Pharmaceutical Sciences and Technologies, Via Vivaldi 43,
      81100, Caserta, Italy.
FAU - Kazakis, Nerantzis
AU  - Kazakis N
AD  - Aristotle University of Thessaloniki, Department of Geology, Lab. of
      Engineering Geology & Hydrogeology, 54124 Thessaloniki, Greece. Electronic
      address: kazakis@geo.auth.gr.
LA  - eng
PT  - Journal Article
DEP - 20200124
PL  - Netherlands
TA  - Sci Total Environ
JT  - The Science of the total environment
JID - 0330500
SB  - IM
EIN - Sci Total Environ. 2020 Nov 10;742:141568. PMID: 32839005
OTO - NOTNLM
OT  - Data mining
OT  - Gaussian process
OT  - Italy
OT  - Nitrate
OT  - Prediction
OT  - Strontium
COIS- Declaration of competing interest The authors declare that they have no
      known competing financial interests or personal relationships that could
      have appeared to influence the work reported in this paper.
EDAT- 2020/02/03 06:00
MHDA- 2020/02/03 06:01
CRDT- 2020/02/03 06:00
PHST- 2019/11/10 00:00 [received]
PHST- 2020/01/19 00:00 [revised]
PHST- 2020/01/19 00:00 [accepted]
PHST- 2020/02/03 06:00 [pubmed]
PHST- 2020/02/03 06:01 [medline]
PHST- 2020/02/03 06:00 [entrez]
AID - S0048-9697(20)30346-6 [pii]
AID - 10.1016/j.scitotenv.2020.136836 [doi]
PST - ppublish
SO  - Sci Total Environ. 2020 May 1;715:136836. doi:
      10.1016/j.scitotenv.2020.136836. Epub 2020 Jan 24.