The application of machine learning in nanoparticle treated water: A review


Introduction
Organic matter is abundant in ecosystems, including soil and aquatic ecosystems. The term refers to any of the carbon-based compounds that abound in nature. Organic matter moves into the soil or water stream, where it decomposes and provides nutrients to living organisms. Living organisms also excrete or secrete material classified as organic, and this material becomes part of the environment. Domestic sewage (raw or treated), urban run-off, industrial (trade) effluents, and farm wastes all contribute to organic matter. However, organic matter in wastewater is diverse and complex, comprising substances with a wide range of molecular weights, from simple compounds like acetic acid to highly complex polymers [1]. Based on its biodegradability, the organic matter in wastewater can be divided into two groups: organic pollutants and persistent organic pollutants (POPs). The organic matter with simple structures and high with Fe, Cu is a "noble metal" that provides excellent antibacterial properties and serves as a protective layer over "corrosive metals" (Fe), making it a valuable material for aquatic environmental applications [17,18]. The presence of Cu also enables Fe to remove nitrate well at neutral pH, whereas Fe without Cu was effective for nitrate removal at pH 2 [19]. Iron and copper nanoparticles have played a crucial role in removing contaminants, including ammoniacal nitrogen [20,21], heavy metals [22], and organic compounds [23,24], from aquatic environments such as lentic, lotic, and wetland systems. Constructing bimetallic nanoparticles from two distinct metals has been shown to improve nanoparticle properties over the use of a single metal (monometallic).
Bimetallic nanoparticles such as iron-copper nanoparticles (nanoFeCu) exhibit fascinating properties that differ significantly from those of monometallic nanoparticles [25]. Hence, bimetallic nanoparticles have piqued the interest of researchers. An extensive range of organic compounds can be degraded and adsorbed by bimetallic nanoFeCu, including poly- and mono-aromatic hydrocarbons, chlorinated ethanes, chlorinated hydrocarbons, and polychlorinated biphenyls [9]. However, few studies have applied a machine learning (ML) model to pollutant removal using nanoFeCu.
ML is a subfield of artificial intelligence that aims to create algorithms capable of automatically learning from data [26]. It has been reported that ML can help reduce costs and optimise chemical usage in wastewater treatment, with expected savings of 20 to 30% of operational expenditure [27]. ML has played a significant role in different applications, including optimisation, classification, regression, and forecasting [28]. ML is extensively utilised to support wastewater treatment operations, water reuse, water saving, and cost reduction through prediction, diagnosis, assessment, and simulation [29]. When modeling complex data sets with possible nonlinearities or incomplete data, ML methods are often more robust and effective [28]. ML has also been shown to predict pollutant removal performance effectively. ML can be utilised to build relationships between predictive variables and target properties, evaluate the properties of nanoparticles from reaction conditions, and precisely control the synthesis of nanoparticles with desired functionalities, thereby allowing accurate predictions of nanoparticle properties and applications such as pollutant removal in wastewater treatment [30].
This paper aims to review the application of ML in nanoparticle-treated water. A critical review of this topic is essential to deliver an informative background for further enhancing these applications. Hence, this study reviews the application of ML in nanoparticle-treated water and the removal efficiency of different wastewater pollutants, including industrial wastewater, for single and multiple pollutants. Moreover, the need for and importance of ML in modeling the performance of nanoparticles in wastewater treatment are discussed. After that, the advantages of nanoparticles are presented. Finally, future perspectives regarding nanoparticles are discussed along with the application of machine learning. The use of nanoparticles for organic matter removal in wastewater treatment processes is summarised in Table 1; these studies mostly focused on modeling using multiple linear regression (MLR), response surface methodology (RSM), and artificial neural networks (ANN).

Organic matter removal in wastewater treatment
Generally, chemical oxygen demand (COD) is removed through aerobic and anaerobic processes that rely on microorganisms. However, microorganisms are sensitive to changes in environmental conditions. For example, at higher pH and temperature, their removal performance is affected and thus reduced. This results in slow removal and long retention times. An effective alternative to the traditional method is therefore required to provide better outcomes. Nanoparticles have attracted considerable interest for pollutant removal. Owing to their large surface area and correspondingly high reactivity, nanoparticles give better pollutant removal performance [31]. To comply with wastewater discharge limits and minimise undesired impacts on the environment, nanoparticles have been studied and used to treat organic compounds in polluted water, as tabulated in Table 1.
Notes for Table 1: qmax is the Langmuir saturation adsorption, qm is the theoretical saturation capacity, and N/A is not available.
Ayoub [32] studied the degradation of the antibiotic tetracycline (TC) using nanoFeCu, considering factors that influence the TC reaction such as nanoFeCu dose, stirring intensity, H2O2 concentration, and initial TC dose. The nanoFeCu dose affected the degradation of TC the most. The author concluded that the Fenton reaction removed 82.3% of TC from an initial TC dose of 8 µg/L when the pH, stirring intensity, nanoFeCu dose, and H2O2 dose were 7, 150 rpm, 0.6 g/L, and 1.0 g/L, respectively. These results correspond well with those obtained by Adel, Alalm, El-Etriby and Boffito [40], Abdel-Aziz, Farag and Abdel-Gawad [38] and Shih, Chen, Su and Tso [41]. This is because the number of vacant adsorption sites increases linearly with the adsorbent dose, which enhances removal performance. Multiple linear regression (MLR) was applied to predict TC removal because of its simplicity, directness, and good accuracy. MLR can predict the dependent variable, here the TC removal percentage, and, unlike simple linear regression, describe the relationships and interactions among several variables [42]. Although Ayoub [32] applied MLR at lab scale to predict TC removal, no pilot-scale application of MLR was carried out in that study.
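The MLR approach described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the model fitted by Ayoub [32]: the data are synthetic, the coefficient values in `assumed_coef` are invented for the example, and the function names `fit_mlr` and `predict_mlr` are hypothetical.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least-squares fit; returns [intercept, b_1, ..., b_p]."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predicted response for each row of X."""
    return coef[0] + np.asarray(X) @ coef[1:]

# Synthetic, illustrative data only (not measurements from any reviewed study).
rng = np.random.default_rng(0)
n = 40
X = np.column_stack([
    rng.uniform(0.1, 1.0, n),     # nanoFeCu dose, g/L
    rng.uniform(50.0, 250.0, n),  # stirring intensity, rpm
    rng.uniform(0.2, 1.5, n),     # H2O2 dose, g/L
    rng.uniform(2.0, 12.0, n),    # initial TC dose
])
assumed_coef = np.array([20.0, 45.0, 0.05, 10.0, -1.5])  # hypothetical values
y = assumed_coef[0] + X @ assumed_coef[1:] + rng.normal(0.0, 1.0, n)

coef = fit_mlr(X, y)
y_hat = predict_mlr(coef, X)
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print("coefficients:", np.round(coef, 3))
print("R^2:", round(float(r2), 3))
```

Because the predictors are on different scales (g/L versus rpm), the raw coefficients are not directly comparable; ranking the influence of each factor would require standardising the inputs first.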
Dlamini, Basson and Pullabhotla [34] investigated the synthesis of nanoFeCu using a polysaccharide bio-flocculant (glucose, monopotassium phosphate, ammonium sulfate, sodium chloride, urea, magnesium sulfate, and yeast extract). The monometallic nanoparticles were synthesised separately. To confirm the formation of the nanoparticles, physical observation and characterisation techniques such as scanning electron microscopy and Fourier transform infrared spectroscopy were applied. The authors evaluated the potential application of nanoFeCu to coal mine wastewater and found that nanoFeCu can remove 98% of phosphate, 79% of sulfate, 87% of calcium, 92% of COD, and 97% of biochemical oxygen demand (BOD) at pH 8.2 and the lowest dosage of 0.2 mg/mL. NanoFeCu thus provided better removal performance than the bio-flocculant, and the authors determined that the synthesised nanoFeCu is cost-effective. The efficiency of a pilot prototype system was studied to demonstrate the effectiveness of nanoFeCu in treating real textile wastewater [33]. The removal efficiencies of COD, BOD, colour, TN, TP, and TSS increased when 0.5 g/L ferric chloride was applied as coagulant under optimum adsorption conditions of pH 6, 1.4 g/L nanoFeCu dosage, 80 minutes contact time, and a stirring rate of 250 r/min at room temperature. The nanoFeCu dosage, with a relative importance of more than 40%, was the most influential factor. However, the authors found that nanoFeCu dosages higher than 1.4 g/L resulted in less enhancement of removal efficiency. A surplus of nanoFeCu may cause overlapping on the adsorbent's pore structure, shorten the interparticle distances, and shield the binding sites from pollutants because of the increased boundary layer screening effect.
The study also indicated that the effects of environmental factors such as adsorbent dosage, pH, contact time, stirring rate, and concentration can be demonstrated using an artificial neural network (ANN). By relating input data to each other and to the output data, ANN has demonstrated promising potential for deriving meaningful relationships between imprecise data.
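To make the idea of an ANN relating operating factors to removal more concrete, the following is a minimal sketch of a one-hidden-layer network trained by gradient descent. It does not reproduce the architecture or data of the reviewed studies: the inputs are synthetic and scaled to [0, 1], the target function is an assumed stand-in for removal fraction, and `train_mlp` and `predict_mlp` are hypothetical names.

```python
import numpy as np

def train_mlp(X, y, hidden=8, lr=0.05, epochs=2000, seed=0):
    """Train a tanh hidden layer + linear output on mean squared error."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)               # hidden activations
        err = (H @ W2 + b2).ravel() - y        # prediction error
        losses.append(float(np.mean(err ** 2)))
        g_out = (2.0 / len(y)) * err[:, None]  # dLoss/dOutput
        gW2 = H.T @ g_out
        gb2 = g_out.sum(axis=0)
        gH = (g_out @ W2.T) * (1.0 - H ** 2)   # backpropagate through tanh
        gW1 = X.T @ gH
        gb1 = gH.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return (W1, b1, W2, b2), losses

def predict_mlp(params, X):
    W1, b1, W2, b2 = params
    return (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()

# Synthetic inputs scaled to [0, 1]; assumed column order:
# pH, adsorbent dose, contact time, stirring rate, concentration.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (200, 5))
y = 1.0 - np.exp(-4.0 * X[:, 1] * X[:, 2])  # assumed removal fraction

X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]
params, losses = train_mlp(X_train, y_train)
test_mse = float(np.mean((predict_mlp(params, X_test) - y_test) ** 2))
print("first/last training MSE:", losses[0], losses[-1])
print("held-out MSE:", test_mse)
```

Holding out part of the data, as done here, is one simple way to check whether such a network generalises beyond the experiments it was trained on.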
Mahmoud and Mahmoud [35] studied the removal of organic load, represented as COD, using nanoFeCu. An increment of nanoFeCu dosage from 0.1 g/L to 0.6 g/L increased the COD removal efficiency from 100% to 69% via the adsorption process. This shows that nanoFeCu is a convincing adsorbent material for COD removal from domestic wastewater, owing to the increase in vacant sites for adsorption and free electrons for the degradation process. Moreover, the authors concluded that the ANN model can predict COD removal using pH, nanoFeCu dose, contact time, stirring rate, and concentration as input data. The response surface methodology (RSM) results also indicated that the model accurately described the relationship between the operating conditions (pH, nanoFeCu dose, contact time, stirring rate, and concentration) and the COD removal percentage.
Abdel-Aziz, Farag and Abdel-Gawad [36], Abdel-Aziz, Farag and Abdel-Gawad [37], and Abdel-Aziz, Farag and Abdel-Gawad [38] reported the use of nanoFeCu in caffeine (CAF), metformin hydrochloride (MF) and carbamazepine (CBZ) removal, respectively. In the studies of Abdel-Aziz, Farag and Abdel-Gawad [36] and Abdel-Aziz, Farag and Abdel-Gawad [37], the removal efficiency of nanoFeCu for CAF and MF at different CAF and MF concentrations was still over 68% and 50%, respectively, after the nanoFeCu had been reused five times. In the 1st, 2nd, 3rd, 4th, and 5th recycles, the removal efficiency of CAF was 82, 78, 83, 70, and 69%, while that of MF was 67, 61, 57, 54, and 52%, respectively. The removal efficiencies of CAF and MF thus generally decreased over the reuse cycles, yet remained high even in the fifth recycle. This confirms the advantages of nanoFeCu, such as superior stability and durability as well as reusability. The removal rates of CAF, MF, and CBZ were 86%, 97%, and 99%, respectively, and removal took place through the adsorption process. Optimisation was carried out using a kinetic model. However, optimisation of the effects of different parameters using ML was not examined in these studies. Hence, ML could be applied to further compare the optimisation of the effects of different parameters.
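The reuse figures quoted above can be checked with simple arithmetic. The short sketch below uses only the per-cycle values given in the text; the helper names `retention` and `is_non_increasing` are hypothetical and introduced purely for illustration.

```python
# Removal efficiency (%) in recycles 1 to 5, as quoted in the text.
caf = [82, 78, 83, 70, 69]
mf = [67, 61, 57, 54, 52]

def retention(series):
    """Last-cycle efficiency as a percentage of the first-cycle efficiency."""
    return 100.0 * series[-1] / series[0]

def is_non_increasing(series):
    """True if efficiency never rises from one cycle to the next."""
    return all(b <= a for a, b in zip(series, series[1:]))

print(round(retention(caf), 1), is_non_increasing(caf))  # 84.1 False
print(round(retention(mf), 1), is_non_increasing(mf))    # 77.6 True
```

On these numbers, nanoFeCu retains roughly 84% (CAF) and 78% (MF) of its first-cycle efficiency by the fifth recycle. The MF series falls at every cycle, whereas the CAF series rises once, between the second and third recycles.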
Nitrogen-doped TiO2 (N-TiO2) nanoparticles were produced via a manual grinding method, and N-TiO2 nanoparticles and graphene (G) nanoplatelets were deposited onto a titanium grid sheet using an electrophoretic deposition method [39]. The work aimed to remove the insecticide diazinon via an electrosorption-assisted photoelectrocatalytic process using the N-TiO2/G nanocomposite under visible light irradiation. The authors found that the catalyst performance remained constant even after five successive cycles, with the diazinon removal percentage staying nearly identical at about 80%. The number of times the catalyst was reused therefore did not affect the catalytic performance. ANN was applied to predict diazinon removal from solution. It was found to be capable of reproducing the experimental data and of predicting the behaviour of the electrosorption-assisted photoelectrocatalytic process. Other studies of diazinon removal using different sorbent materials and dosages reported reasonable efficiencies in reducing diazinon concentration under different operating conditions. Dehghani, Hassani, Karri, Younesi, Shayeghi, Salari, Zarei, Yousefi and Heidarinejad [43] showed the ability of pumice with a particle size range of 200 to 2000 µm to reduce an initial diazinon concentration of 50 mg/L to 6.288 mg/L using a 4 g/L pumice dosage at pH 3 and 30 min contact time. Esfandian, Samadi-Maybodi, Parvini and Khoshandam [44] studied the removal of diazinon at concentrations ranging from 50 mg/L to 120 mg/L and from 50 mg/L to 225 mg/L using zeolite modified with Cu2O nanoparticles (MZ) and acid-treated zeolite (ATZ), with adsorbent doses of 0.3 and 0.2 and contact times of 20 min and 6 min at pH 6, respectively. The results demonstrated that MZ was more effective than ATZ in the sorption of diazinon, removing 98.32% of diazinon at 20℃.
Ayoub [32], Mahmoud, Mostafa and Peters [33], Mahmoud and Mahmoud [35], Abdel-Aziz, Farag and Abdel-Gawad [37], and Ayoubi-Feiz, Mashhadizadeh and Sheydaei [39] adopted mathematical models such as RSM and machine learning models, including MLR and ANN, to predict removal. However, each of these studies applied only one model for prediction. MLR can determine the relative influence of predictor variables on the criterion value and identify outliers or anomalies [45]. Yet MLR suffers from collinearity, is sensitive to outliers, and is only applicable to linear datasets [46,47]. In addition, ANN is employed to predict the performance of organic matter removal reactions. The commonly studied organic matters included TC, OG, COD, MF, CAF, and CBZ. The important parameters that influence removal efficiency were used as inputs, and the removal efficiency as the output. By relating input data to each other and to the output data, ANN has shown promising potential in generating meaningful relationships between imprecise data [33]. Nevertheless, although ANN can accommodate large datasets, small datasets can result in overfitting of the model [48]. RSM, on the other hand, is used to discover the correlation between operating variables and pollutant removal efficiencies. Although RSM can evaluate interaction effects between independent input parameters, it is limited by its assumption of a quadratic nonlinear correlation.
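The quadratic assumption behind RSM can be illustrated with a second-order polynomial in two coded factors, y = b0 + b1*x1 + b2*x2 + b12*x1*x2 + b11*x1^2 + b22*x2^2, fitted by least squares. The sketch below is a minimal example under stated assumptions: the three-level factorial design, the coefficient values in `assumed_coef`, and the function names are hypothetical and are not taken from the reviewed studies.

```python
import numpy as np

def quad_design(x1, x2):
    """Design matrix for a full second-order model in two factors."""
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1 ** 2, x2 ** 2])

def fit_rsm(x1, x2, y):
    """Least-squares estimate of [b0, b1, b2, b12, b11, b22]."""
    coef, *_ = np.linalg.lstsq(quad_design(x1, x2), y, rcond=None)
    return coef

def stationary_point(coef):
    """Point where the gradient of the fitted surface is zero."""
    _, b1, b2, b12, b11, b22 = coef
    hessian = np.array([[2.0 * b11, b12], [b12, 2.0 * b22]])
    return np.linalg.solve(hessian, -np.array([b1, b2]))

# 3 x 3 factorial design in coded units (-1, 0, +1), e.g. dose and contact time.
levels = np.array([-1.0, 0.0, 1.0])
x1, x2 = [g.ravel() for g in np.meshgrid(levels, levels)]

# Assumed "true" surface, used only to generate noise-free example responses.
assumed_coef = np.array([80.0, 5.0, 3.0, -2.0, -10.0, -6.0])
y = quad_design(x1, x2) @ assumed_coef

coef = fit_rsm(x1, x2, y)
optimum = stationary_point(coef)
print("fitted coefficients:", np.round(coef, 3))
print("stationary point (coded units):", np.round(optimum, 4))
```

Because the model contains only terms up to second order, it can capture curvature and a two-factor interaction but not higher-order behaviour, which is the limitation noted above.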
To overcome these limitations, ML is needed to model the performance of nanoparticles in wastewater treatment in order (i) to estimate the performance of the wastewater treatment process and minimise the experimental and operational cost, (ii) to conduct optimisation and govern the monitoring and management of treated water, and (iii) to compare and propose the ideal techniques for applications in nanoparticles and practical wastewater treatment. To derive meaningful relationships between imprecise data by connecting the input data with the output data, the predictive performance of pollutant removal can be studied by applying models such as MLR, ANN, and RSM. Table 2 presents a detailed comparison of the ML methods currently used for nanoparticle-treated water.

Future perspective
Nanoparticles are broadly employed in industry owing to their advantages and removal performance on various contaminants, and they have proved to be an innovative, environmentally friendly, and advanced processing technology. Nonetheless, there are several concerns about nanoparticles. For instance, issues relating to toxicity are unavoidable as the production of nanoparticles and their industrial applications grow. In actual wastewater treatment plants, it is unclear whether and how nanoparticle toxicity occurs, since mostly lab-scale tests have been used to study the effects of nanoparticles on treated water and the underlying mechanisms [55]. Hence, to prevent or minimise nanoparticle toxicity in practical wastewater treatment plants, future studies in nanotoxicology and strict laws from government agencies are necessary to recognise and prevent toxic nanoparticles. Moreover, a life cycle analysis of nanoparticles can also be conducted. Predicting nanotoxicity is important for risk assessment, but it is a challenging task because various physicochemical characteristics (size, shape, surface ligands, zeta potential, etc.), as well as experimental conditions (dose, pH, initial concentration, contact time, stirring rate, exposure pathway, etc.), influence nanotoxicity. To address these limitations, mathematical approaches including ML can be used to predict nanotoxicity, which avoids costly experiments, saves time, and provides a useful reference for assessing nanotoxicity. Furthermore, nanoparticle synthesis is tedious and requires skilled workers. Here, the application of ML to nanoparticle synthesis is promising, as it can enable the development of efficient synthesis protocols and, possibly, of new nanoparticles.
With the vast amount of data available, ML models can be applied to construct complex structure-property and composition-property relationships for a desired compound or to generate new molecules and materials. Lastly, existing research has shown that nanoparticles can be regenerated for reuse. Thus, research is needed to identify cost-effective conditions for regenerating nanoparticles for real-life applications.

Conclusions
ML has the transformative potential to effectively supervise process operation and predict operational performance. Moreover, ML can improve operational efficiency and minimise cost. However, very few studies have reviewed the applications of ML in nanoparticle-treated water. Therefore, this study provides a critical review of the applications of ML in nanoparticle-treated water and the removal efficiency of different wastewater pollutants, including industrial wastewater, for single and multiple pollutants. Its contribution is to deliver an informative background for further improving these applications. Additionally, the need for and importance of ML in modeling the performance of nanoparticles in wastewater treatment are reported, and the advantages of nanoparticles are presented. This review found that mathematical models such as RSM and ML models including MLR and ANN are commonly used in nanoparticle research and water treatment. Moreover, nanoparticles remove organic matter efficiently, achieving an organic matter removal performance of more than 75%. Nevertheless, pollutant removal efficiencies decline after the nanoparticles have been reused. Since only a few studies are reported in this review, it can be said that little research has yet been conducted on predicting the performance of nanoparticles in the wastewater treatment process. In addition, no comparative study of model accuracy has been carried out, and no pilot-scale application has been reported in the current studies. Although the present study has reviewed the application of mathematical approaches such as ML in nanoparticle-treated water, no adaptive soft sensor has yet been applied.
Thus, a locally weighted least squares support vector regression model is suggested to further examine the predictive performance of ML models on pollutant removal.
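For readers unfamiliar with the suggested model, the following is a minimal sketch of one common way to formulate locally weighted least squares support vector regression. The text above only names the model, so every detail here is an assumption: the RBF kernel, the Gaussian distance weights, the use of the k nearest neighbours, the default parameter values, and the function names `rbf_kernel` and `lw_lssvr_predict`. For each query point, a weighted LS-SVR is solved on the nearest training samples, so that samples closer to the query influence the fit more strongly.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def lw_lssvr_predict(X, y, x_query, k=15, gamma=100.0, sigma=1.0, bandwidth=1.0):
    """Predict at x_query with a weighted LS-SVR fitted on the k nearest samples."""
    x_query = np.asarray(x_query, dtype=float)
    dist = np.linalg.norm(X - x_query, axis=1)
    idx = np.argsort(dist)[:k]                    # local neighbourhood
    Xl, yl = X[idx], y[idx]
    # Similarity weights: nearer samples receive a smaller error penalty relaxation.
    v = np.maximum(np.exp(-dist[idx] ** 2 / (2.0 * bandwidth ** 2)), 1e-6)
    m = len(idx)
    # Weighted LS-SVR linear system: [[0, 1^T], [1, K + diag(1/(gamma*v))]] [b; alpha] = [0; y]
    A = np.zeros((m + 1, m + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(Xl, Xl, sigma) + np.diag(1.0 / (gamma * v))
    sol = np.linalg.solve(A, np.concatenate([[0.0], yl]))
    b, alpha = sol[0], sol[1:]
    k_q = rbf_kernel(x_query[None, :], Xl, sigma)[0]
    return float(k_q @ alpha + b)

# Synthetic one-dimensional example (not data from the reviewed studies).
rng = np.random.default_rng(0)
X = np.linspace(0.0, 6.0, 61)[:, None]
y = np.sin(X[:, 0]) + rng.normal(0.0, 0.02, 61)
pred = lw_lssvr_predict(X, y, np.array([1.5]))
print("prediction:", pred, "reference sin(1.5):", float(np.sin(1.5)))
```

Because a separate local model is solved for each query, new samples can simply be appended to the training set, which is one reason this kind of model is considered for adaptive soft sensing.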