Chicken eggshell as biosorbent: Artificial intelligence as promising approach in optimizing study

Response Surface Methodology (RSM) is currently the most popular approach for optimization studies in various biochemical processes. Artificial Neural Network (ANN) has emerged as one of the most efficient methods for empirical modeling and optimization, particularly for non-linear systems. In this study, the estimation capabilities of RSM and ANN models were compared for copper removal from aqueous solution. The experiments were carried out based on a 3-level, 4-variable Central Composite Design (CCD). The RSM results revealed that the relationship between the response and the independent variables could be represented by a quadratic polynomial model. In the development of the ANN model, the optimal configuration was found to be 4-10-1. Estimated responses from both models were compared with the experimentally determined responses to assess the predictive capabilities of both techniques. The comparison showed that the ANN model was more accurate and exhibited better generalization capability than RSM, indicating a clear superiority over the latter in capturing the non-linear behaviour of the adsorption process using chicken eggshell as biosorbent.


Introduction
Chicken eggshells are waste material from hatcheries, homes and fast food industries, and are abundant and readily available worldwide [1]. Without proper management, their disposal contributes to environmental pollution. A recent report described that chicken eggshells contain calcium and trace amounts of other micro elements (boron, copper, iron, manganese, molybdenum, sulphur, magnesium, silicon and zinc). The presence of a cellulosic structure and amino acids suggests that eggshell is a good biosorbent [2]. Calcium carbonate was identified as the main component responsible for metal adsorption [3]. Several studies in recent years evaluating the possibility of heavy metal removal, particularly copper, fluoride, iron and chromium, from aqueous solution have shed light on the application of discarded chicken eggshell in wastewater treatment [2,4-6]. In Malaysia, chicken eggshells are discarded by the food industry or households as municipal solid waste and are normally disposed of in landfills. Even though these waste materials are available in abundance, their potential in heavy metal removal has received little investigation.
The efficiency of copper (heavy metal) biosorption using chicken eggshell powder can be influenced by many factors such as particle size, biosorption temperature, solution pH, ratio of sorbent to solution and duration of biosorption [2,7,8]. Consequently, the biosorption experiments must be designed and conducted carefully. Various experimental design techniques have been employed for biosorption processes, including the conventional One-Factor-At-A-Time (OFAT) method and statistical methods such as Central Composite Design (CCD) and Box-Behnken Design (BBD). Under circumstances where multiple process variables may influence the response variables, response surface methodology (RSM) emerges as a favourable technique for optimizing the process [3,9]. RSM has been extensively employed in various biochemical applications for the purpose of empirical modeling and optimization. It is a collection of statistical techniques for experimental design, model building, evaluation of the effects of process factors, and identification of optimum conditions, by fitting the experimental responses obtained from a design of experiments (DOE) to a quadratic function [10]. The application of RSM involves three major steps. First, an experimental design is established using CCD or BBD to set the experimental levels of the independent variables. Subsequently, regression analysis is conducted for the response surface modeling. Finally, the response surface model is used as the objective function for the process optimization [11]. RSM has been commonly employed in various bioprocesses such as extraction, fermentation and absorption processes. The ultimate objective of the application is to identify the optimum operating conditions for the respective processes.
In most RSM problems, the relationship between the response and the process variables is not known. Hence, a suitable approximation for the true functional relationship between the response, $y$, and the set of process variables, $x_1, x_2, x_3, \ldots, x_n$, must be established, where $n$ is the total number of process variables in the process or system. The function can be expressed as:
$$y = f(x_1, x_2, \ldots, x_n) + \varepsilon \qquad (1)$$
where $\varepsilon$ refers to the noise or error observed in the response $y$. The true response function $f$ is normally unknown, thus a low-order polynomial in some region of the independent variables is preferable for the approximation of $f$. If the response can be defined by a linear function of the independent variables, then the approximation function is a first-order model expressed by:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon \qquad (2)$$
In the case of curvature in the response surface, a higher degree polynomial can be used (e.g. the second-order model) as follows:
$$y = \beta_0 + \sum_{i=1}^{n} \beta_i x_i + \sum_{i=1}^{n} \beta_{ii} x_i^2 + \sum_{i<j} \beta_{ij} x_i x_j + \varepsilon \qquad (3)$$
Nevertheless, neural network-based models have recently been shown by several researchers to exhibit superior performance in terms of accuracy and predictive capability [12]. Artificial Neural Network (ANN) has emerged as an attractive tool for non-linear multivariate modeling in the last two decades. It has typically been used as a "black-box" tool, that is, no prior knowledge about the process is assumed and the goal is to develop a process model based only on observations of its input-output behavior [13]. ANN is generic in structure and is able to learn from data gathered by users, and is thus widely regarded as a reliable modeling tool for processes about which little information is available [10]. ANN can handle multiple independent and dependent variables simultaneously without prior knowledge of the functional relationship between them. Each neuron receives information through input connections, processes the information and produces an output, which is distributed through output connections. Each connection is associated with a real-valued quantity known as a weight [14]. In other words, each neuron sums its weighted inputs and then applies a linear or non-linear function to the resulting sum to determine the output. These neurons are arranged in layers and are combined through extensive connectivity [14]. A neural network in its basic form is usually composed of several layers of neurons: one input layer, one output layer and at least one hidden layer.
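The weighted-sum-and-activation behaviour of a neuron described above can be illustrated with a minimal Python sketch. The tanh hidden activation, the linear output neuron, the random weights and the sample input are all assumptions made for illustration and are not taken from this study.

```python
import math
import random


def neuron(inputs, weights, bias, activation=math.tanh):
    """Sum the weighted inputs, add the bias, then apply the activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(total)


def forward(x, hidden_layer, output_neuron):
    """Forward pass through one hidden layer and a single linear output neuron."""
    hidden = [neuron(x, w, b) for w, b in hidden_layer]
    w_out, b_out = output_neuron
    return neuron(hidden, w_out, b_out, activation=lambda s: s)


# Hypothetical network with 4 inputs, 10 hidden neurons and 1 output.
# Weights and biases are random placeholders, not trained values.
rng = random.Random(0)
hidden_layer = [([rng.random() for _ in range(4)], rng.random()) for _ in range(10)]
output_neuron = ([rng.random() for _ in range(10)], rng.random())

# Hypothetical scaled input vector (four process variables).
y_hat = forward([0.5, -1.0, 0.0, 1.0], hidden_layer, output_neuron)
```

Each hidden neuron is stored as a `(weights, bias)` pair, so changing the number of hidden neurons only changes the length of `hidden_layer`.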
The error or bias in prediction is then propagated through the system and the inter-unit connections are adjusted to minimize the error in prediction [15]. This is a continuous process with multiple training sets until the minimum error is attained. Training an ANN is therefore accomplished by adjusting these weights to minimize a non-linear error function [16]. However, a major disadvantage of ANN is the difficulty of explaining the relationship between the independent and response variables, because the weights have no clearly defined meaning; as mentioned earlier, the model is a black box [17]. According to Agatonovic-Kustrina et al. [15], three types of data sets are required: training data for network training, testing data for monitoring network performance during training, and validation (unseen) data for measuring the performance of the trained network. The number of layers and of processing elements in each layer varies from one process to another. There is no strict rule for identifying the optimal number of layers and neurons; in practice, it is generally determined by trial and error [18].
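The monitoring role of the testing data can be sketched as an early-stopping rule, which is also the stopping criterion this study reports for its own network: training ends when the monitored MSE starts to rise, and the weights from the previous step are kept. The function name and the MSE values below are hypothetical, and the sketch assumes one monitored MSE value per epoch.

```python
def early_stopping(mse_per_epoch):
    """Return (best_epoch, best_mse) for a non-empty list of monitored MSE values.

    Training stops at the first epoch whose MSE is higher than the previous
    one, and the previous epoch is reported as the result of the run.
    """
    best_epoch = 0
    for epoch in range(1, len(mse_per_epoch)):
        if mse_per_epoch[epoch] > mse_per_epoch[epoch - 1]:
            break  # MSE started to rise: keep the previous step
        best_epoch = epoch
    return best_epoch, mse_per_epoch[best_epoch]


# Hypothetical MSE trace (illustrative numbers only).
stop_epoch, stop_mse = early_stopping([0.9, 0.5, 0.3, 0.35, 0.2])
```

In a real training loop the weights would be saved at every epoch so that the set belonging to `stop_epoch` can be restored.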
Through the application of RSM, it is possible to investigate the effect of each independent process variable, alone or in combination, with the least number of experiments [18]. Yet, drawbacks associated with it have been pointed out recently. Desai et al. [16] and Lou and Nakai [17] stated that RSM is useful only for quadratic approximations and that sensitivity analysis of input variables is difficult to perform using RSM due to the presence of cross interactions. It is therefore hard to conclude that RSM is applicable to all optimization and modeling studies. Consequently, the main objective of the present work is to investigate the possibility of employing an artificial intelligence approach, particularly an artificial neural network, to construct a non-linear relationship between the process parameters and their corresponding responses. The ANN used in this work is the feedforward back-propagation network, which is commonly applied in analytical applications and is very powerful in function optimization modeling. To the best of our knowledge, this is the first report comparing RSM and ANN methodologies in predicting the copper removal efficiency from aqueous solution using eggshell as adsorbent. In addition, changes in the surface morphology of the eggshell used as biosorbent for copper removal were observed, and a comparison was made before and after the adsorption process.

Materials
Chicken eggshells were collected from restaurants or markets around UCSI University Kuala Lumpur (North Wing). The sample was washed with distilled water several times to remove dirt and contaminants, followed by drying in a hot air oven at 110 °C for at least 12 hours. The chicken eggshells were then ground or crushed using a pestle and mortar. The sample was stored in an airtight container for future use.

Biosorption Process
The adsorption process was conducted based on the method reported by Nabil et al. [5] with slight modifications. Chicken eggshell was mixed with copper(II) sulphate solution. The mixture was agitated at a constant stirring speed in the range of 150-300 rpm for 60-180 minutes. The remaining concentration of Cu(II) after adsorption was measured using a UV-Vis spectrophotometer.

Response Surface Methodology (RSM)
RSM was employed for experimental design, data analysis, and model building with the aid of the software Design Expert (Version 6.0.6, Stat-Ease Inc., Minneapolis, Minnesota, USA). Central Composite Design (CCD) with four variables was used to determine the response pattern and subsequently to establish a response model. According to CCD, the total number of experimental combinations is $2^k + 2k + n_0$, where $k$ is the number of independent variables and $n_0$ is the number of repetitions of the experiments at the centre point. The four independent variables used in this work were temperature ($X_1$), agitation speed ($X_2$), dosage ($X_3$), and time ($X_4$), while the dependent variable was the percentage of copper removal. The ranges and centre point values of all independent variables were based on the results of preliminary experiments (Table 1). The copper removal percentage was analyzed by multiple regression to fit the following polynomial equation:
$$Y = \beta_0 + \sum_{i=1}^{4} \beta_i X_i + \sum_{i=1}^{4} \beta_{ii} X_i^2 + \sum_{i<j} \beta_{ij} X_i X_j \qquad (4)$$
where $Y$ is the dependent variable (copper removal percentage), $\beta_0$ is the intercept, $\beta_i$, $\beta_{ii}$ and $\beta_{ij}$ are regression coefficients, $X_i$ denotes the linear terms, $X_i^2$ the quadratic terms for a single variable, and $X_i X_j$ the interaction terms. Four replicates at the central point of the designed model were used to estimate the pure error sum of squares. Thirty experiments were performed in random order to cover all combinations of the factor levels in the experimental design.
For the ANN model, all the data were scaled to zero mean and unit standard deviation to cope with the different magnitudes of the inputs and output. Single hidden layer neural networks with different numbers of hidden neurons were trained on the training data and tested on the testing data. The network with the lowest MSE on the testing data was considered to have the best network topology. In assessing the developed models, the MSE and correlation coefficient (R²) on the unseen validation data were used as the performance criteria.
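The CCD run count and the terms of the second-order model can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Design Expert procedure used in the study: `alpha = 1.0` (a face-centred design, which gives three coded levels per factor) and `n_center = 6` are assumptions, the latter picked only so that $2^k + 2k + n_0$ equals the 30 runs reported. The function names are hypothetical.

```python
from itertools import product


def central_composite_design(k, n_center, alpha=1.0):
    """Return coded CCD points: 2^k factorial, 2k axial and n_center centre runs."""
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(point)
    centre = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + centre


def quadratic_terms(x):
    """Expand one coded point into intercept, linear, squared and interaction terms."""
    k = len(x)
    squared = [xi * xi for xi in x]
    interactions = [x[i] * x[j] for i in range(k) for j in range(i + 1, k)]
    return [1.0] + list(x) + squared + interactions


# Four coded factors: temperature, agitation speed, dosage, time.
design = central_composite_design(k=4, n_center=6)  # n_center is an assumption
model_matrix = [quadratic_terms(point) for point in design]
```

For four factors each row of `model_matrix` has 15 entries (1 intercept, 4 linear, 4 quadratic and 6 interaction terms), which is the number of coefficients a least-squares fit of the quadratic model would estimate.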

Characterization Study
Scanning electron microscopy (SEM) analysis was performed to observe structural and morphological changes in the eggshell surface before and after the adsorption process. SEM images of the eggshells were obtained using a JEOL JSM 6400 LV scanning electron microscope at 15 kV.

Response Surface Methodology
All 30 experimental points were evaluated and the copper removal percentages based on the factorial design are shown in Table 2. Response surface optimization is favorable over conventional single parameter optimization in that it saves time, space, and raw materials [11]. The experiments were run in random order to minimize errors from systematic trends in the variables. The percentage of copper removal obtained ranged from 4.55 to 59.1 %. Exploration and optimization of a fitted response surface might produce poor or misleading results unless the model exhibits a good fit, which makes checking the model adequacy essential. Through the analysis of variance (ANOVA), the P-value of the model was less than 0.0001. Meanwhile, the lack of fit of the model was not significant, with a value of 0.1417. These values confirmed that the model fit was good.
In order to check the adequacy of the predictive model, three sets of experiments were repeated at the suggested optimum conditions to obtain experimentally the maximum copper removal using eggshell. As shown in Table 3, the percentage errors between the experimental and predicted values were always less than 1.5 %. Since the differences between the experimental and predicted responses were less than 5 %, the validity of the model was verified [10].

Network Topology and Training of Artificial Neural Network (ANN) Model
The copper removal adsorption process using eggshell was conducted in 30 batches, yielding 30 data samples. To create the model, these data were re-sampled using the bootstrap method into 90 data points for training purposes. Several structures were tested to identify the best network topology by trial and error. The number of neurons in the hidden layer was varied from 5 to 14, and the network was trained and tested after each addition of a neuron. There was a drastic drop in the MSE once more than 7 neurons were used (Figure 1). Among the outputs predicted by ANNs trained with different numbers of neurons, the least error between the predicted and validation (unseen) results was obtained with 10 neurons in the hidden layer. The configuration of the ANN model could therefore be defined as 4-10-1.
The predicted and actual values for the training data matched each other, as shown in Figure 2. In this study, the ANN training resulted in an R² value of unity, showing that the ANN model could fit the experimental data well. Validation of the trained model was conducted at the optimum conditions suggested by ANN, at which 59.37 % copper removal was predicted. Three sets of validation experiments were repeated. The percentage errors between the experimental and predicted values were less than 1 % (Table 4). Therefore, the validity of the trained ANN model was verified.
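The resampling and topology-selection steps can be sketched as follows. This is a minimal illustration, assuming that the bootstrap draws samples with replacement and that the topology is chosen by the lowest MSE. The placeholder runs and the MSE values are made up for the example and are not the data behind Figure 1.

```python
import random


def bootstrap_resample(samples, n_out, seed=0):
    """Draw n_out samples with replacement from the original samples."""
    rng = random.Random(seed)
    return [rng.choice(samples) for _ in range(n_out)]


def select_hidden_neurons(mse_by_neurons):
    """Return the hidden-layer size with the lowest MSE."""
    return min(mse_by_neurons, key=mse_by_neurons.get)


# Placeholder data: 30 (inputs, response) pairs standing in for the 30 batches.
runs = [((float(i), 0.0, 0.0, 0.0), float(i)) for i in range(30)]
training_set = bootstrap_resample(runs, n_out=90)

# Hypothetical MSE for 5 to 14 hidden neurons (illustrative numbers only).
mse_by_neurons = {n: abs(n - 10) + 0.5 for n in range(5, 15)}
best = select_hidden_neurons(mse_by_neurons)
```

Fixing `seed` makes the resampled training set reproducible, which helps when networks with different numbers of hidden neurons are compared on the same data.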

Validation and Comparison of Both Predictive Models
The estimation capabilities of the statistical and artificial intelligence-based optimization techniques, namely RSM and ANN, respectively, were compared and examined. The comparison was made on the basis of various parameters, including the correlation coefficient (R²). RSM is one of the most popular and widely used techniques in optimization studies of chemical and biochemical processes. Nonetheless, it has the limitation that the processes can be modeled only by equations of up to second order [14]. However, the relationship between the process variables and the response variable may be more complex than that. In the present study, the R² values of the training datasets showed that the ANN model had significantly higher generalization and predictive capability than RSM. This was also evidenced by the validation of the predictive results, in which the prediction suggested by RSM resulted in a greater deviation (error) than the ANN model prediction (Table 5).
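The comparison criteria can be computed as in the sketch below. It assumes that R² is the coefficient of determination, $1 - SS_{res}/SS_{tot}$, and that the percentage error is taken relative to the experimental value; the study does not state either definition explicitly. The function names are hypothetical.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def percentage_error(experimental, predicted):
    """Absolute error as a percentage of the experimental value (assumed definition)."""
    return abs(experimental - predicted) / experimental * 100.0
```

Applying the same three functions to the RSM and ANN predictions on the same unseen runs gives directly comparable figures for the two models.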

Characterization Result
Scanning electron microscopy (SEM) images of the eggshell before and after adsorption are shown in Figure 3.

Conclusion
Two models were developed to predict the percentage of copper removal, using response surface methodology (RSM) and artificial neural network (ANN). A feedforward artificial neural network (FANN) was trained on sets of input-output patterns using the error-back-propagation algorithm. The present study revealed that the ANN technique performed better than RSM. Thus, it could be concluded that even though RSM is widely applied in various optimization studies, the ANN optimization strategy might present a better alternative to the conventional method. The results of this study also suggested that powdered eggshell, with its low cost and abundant availability, has potential for use as an adsorbent for contaminants such as copper in wastewater.

Figure 1. Selection of number of neurons in the hidden layer.

Figure 2. Actual and predicted output for training data.
Figure 3(a) shows the morphology of powdered eggshell. It confirmed the crystallinity of the eggshell and the existence of pores on its surface. After the adsorption process, successful copper removal was evidenced by the deposition of flake-like copper crystals on the surface of the eggshell. No obvious porous structure was observed on the surface after the adsorption process (Figure 3(b)).

Figure 3. SEM image of eggshell (a) before adsorption and (b) after adsorption.

Table 1. Experimental range and values of the independent variables used in CCD

Using MATLAB (version 7.8.0.347, Mathworks, Natick, MA), a feedforward artificial neural network (FANN) trained with an error-back-propagation algorithm was designed. All weights and biases were randomly initialized in the range from 0 to 1. The network was trained on the neural network building data and the Mean Squared Error (MSE) was plotted against training epoch. When the MSE started to rise, the training was terminated and the weight values that the network had in the previous step were chosen as the result of the training run. The development of the ANN (activation functions) adopted in this study was in accordance with the theory and approach suggested by Ang et al. [19]. The FANN model was developed based on the experimental data of RSM (Table 2).

Table 2. Response surface CCD and results for copper removal percentage

Table 3. Validation of data and constructed RSM model

Table 4. Validation of data and constructed ANN model.
The parameters compared were the correlation coefficient (R²) for both training and validation sets, the copper removal percentage predicted by both techniques, and the average error of the experimental result based on the optimum conditions suggested by RSM: a temperature of 49.26 °C, an agitation speed of 233.38 rpm, a chicken eggshell dosage of 0.7 g and an adsorption time of 178.7 minutes. Desai et al. [16] stated that the generalization ability could best be judged with the unseen (validation) dataset. Therefore, the comparative values are tabulated in Table 5.