A Comparative Study of Feature Selection Techniques for Bat Algorithm in Various Applications

Feature selection is the process of choosing the most useful features from the large number of features in a dataset. The difficulty lies in selecting a subset that gives better performance under a given classifier. To improve classification results, feature selection has been applied in many classification studies as a preprocessing step, in which only a subset of the features is used rather than all the features of a dataset. This procedure not only removes irrelevant features but can, in some cases, also increase classification performance because the sample size is finite. In this study, Chi-Square (CH), Information Gain (IG) and Bat Algorithm (BA) are used to obtain feature subsets from fourteen well-known datasets from various applications. To measure the performance of the selected features, three benchmark classifiers are used: k-Nearest Neighbor (kNN), Naïve Bayes (NB) and Decision Tree (DT). The paper then analyzes the performance of all classifiers with feature selection in terms of accuracy, sensitivity, F-Measure and ROC area. The objective of this study is to determine which feature selection technique performs best among the conventional and heuristic techniques in various applications.


Introduction
Feature selection is a technique that decreases the number of attributes by eliminating the least significant features [1]. The problem in feature selection, however, is finding the optimum features. In most datasets it is not known in advance which features do not contribute to the end result. Unimportant or irrelevant features need to be removed in order to reduce classification complexity and processing time [1]. Feature selection therefore becomes an important step in improving classification performance. However, different feature selection techniques do not remove the same features from a dataset. For that reason, the choice of feature selection technique is crucial when a feature subset is needed for dimensionality reduction and better classification performance.
Previously, researchers tended to employ Chi-Square (CH), Attribute Evaluation (AE), Information Gain (IG), Correlation-based Attribute Evaluation (CB) and Symmetrical Uncertainty Attribute Evaluation (SU) [2], which are known as conventional techniques. Today, several metaheuristic optimization algorithms appear to have potential as feature selection techniques. Widely used metaheuristic optimization algorithms include Firefly Algorithm (FA), Genetic Algorithm (GA), Simulated Annealing (SA), Cuckoo Search (CS), Artificial Bee Colony (ABC), Bat Algorithm (BA) and Particle Swarm Optimization (PSO) [3]. Because it combines the strengths of several metaheuristic algorithms, BA has been reported to be more powerful than PSO, GA and Harmony Search (HS) [4]. Thus, BA is seen as one of the possible solutions to data mining problems such as feature selection and classification.
The aim of this study is to find the best-performing feature selection technique, taking into consideration the capability of BA as a feature selection method. The main contributions of this study are: 1) running experiments on 14 datasets from various applications in order to find the best-performing feature selection technique among CH, IG and BA; and 2) analysing the best-performing feature selection technique among conventional and heuristic techniques, where kNN, NB and DT were applied to evaluate the performance of the selected features using four performance measures: accuracy, sensitivity, F-Measure and ROC area. The rest of this paper is organized as follows. The Bat Algorithm section gives a brief explanation of BA. The following section discusses Related Work. The subsequent sections present the Methodology and the Experimental Results, and the last section concludes the findings with a Discussion and Conclusion.
The aim of feature selection is to choose a subset of input variables by eliminating features that are irrelevant or carry no predictive information. It has been shown in both theory and practice to be effective in enhancing learning efficiency, increasing predictive accuracy and reducing the complexity of learned results [6,7].
There are four basic steps in a typical feature selection process, as shown in Fig. 1 [8]:
- Generation procedure: generates the next candidate subset from the original feature set.
- Evaluation function: evaluates the subset to determine its relevance to the classification task, using measures such as distance, dependency, information and consistency.
- Stopping criterion: decides when to stop, at which point the relevant or optimal feature subset is determined.
- Validation procedure: checks whether the selected feature subset is valid.
Several feature selection methods are used by researchers. Some researchers tend to employ conventional methods such as information gain and chi-square [3,9,10]. Other studies use heuristic methods such as the genetic algorithm [11], ACO [12] and [13], applied to problems including memetic feature selection, noisy data, spam email and binary variables. Feature selection is also an active field of research in areas such as pattern recognition, machine learning and data mining [14,15]. The objective of feature selection is to reduce irrelevant data and find the most relevant features, which would increase classification accuracy. It has been shown in both theory and practice to be effective in enhancing learning efficiency, increasing predictive accuracy and reducing the complexity of learned results [16].
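The four steps of the feature selection process described above can be illustrated with a minimal sketch. It assumes a greedy forward search as the generation procedure, which is only one of many possible choices and is not the search used in this study. The names `forward_select`, `validate` and `toy_score` are hypothetical, and the toy evaluation function stands in for a real relevance measure or classifier score.

```python
from typing import Callable, FrozenSet

Subset = FrozenSet[int]


def forward_select(n_features: int,
                   evaluate: Callable[[Subset], float],
                   max_rounds: int = 100) -> Subset:
    """Greedy forward search organised around the four generic steps."""
    selected: Subset = frozenset()
    best_score = evaluate(selected)
    for _ in range(max_rounds):
        # Generation: each candidate adds one unused feature to the subset.
        candidates = [selected | {f} for f in range(n_features)
                      if f not in selected]
        if not candidates:
            break
        # Evaluation: score every candidate and keep the best one.
        top_score, top_subset = max(
            ((evaluate(c), c) for c in candidates), key=lambda pair: pair[0])
        # Stopping criterion: halt when no candidate improves the score.
        if top_score <= best_score:
            break
        selected, best_score = top_subset, top_score
    return selected


def validate(subset: Subset,
             holdout_evaluate: Callable[[Subset], float],
             baseline: Subset = frozenset()) -> bool:
    """Validation: the chosen subset should not score below the baseline
    on an evaluation function built from held-out data."""
    return holdout_evaluate(subset) >= holdout_evaluate(baseline)


# Toy evaluation function (hypothetical): features 1 and 3 are relevant,
# and every selected feature costs 0.1.
def toy_score(subset: Subset) -> float:
    return len(subset & {1, 3}) - 0.1 * len(subset)


chosen = forward_select(5, toy_score)
print(sorted(chosen))               # [1, 3]
print(validate(chosen, toy_score))  # True
```

In a real experiment the evaluation function would be a filter measure or the accuracy of a classifier, and validation would use data that played no part in the search.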
A wrapper feature selection approach based on BA and Optimum Path Forest was proposed by Nakamura [17]. This approach models feature selection as a binary optimization problem. Six datasets were used in the experiments, which demonstrated that the proposed approach yields statistically significantly more compact feature sets and, in some cases, improves classification effectiveness.
Binary Bat Algorithm (BBA) is a binary version of the bat algorithm proposed for feature selection, designed to find the most significant features in a search space [18]. BBA associates each bat with a set of binary coordinates that indicate whether each feature belongs to the final set of features or not. It combines the power of the bat algorithm and Optimum Path Forest to find the set of features that maximizes accuracy on a validating set. The proposed technique was shown to outperform other well-known techniques such as PSO, FFA and GSA.
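The binary encoding described above can be sketched as follows. This is a simplified illustration, not the implementation of [18]. It assumes the commonly described bat algorithm frequency and velocity updates with a sigmoid transfer function to obtain bits, it omits the loudness and pulse-rate mechanisms, and it accepts a new position only when it is no worse than the old one. The function names, parameter values and the toy fitness function are all assumptions. In a wrapper setting the fitness would be the accuracy of a classifier on a validating set.

```python
import math
import random
from typing import Callable, List, Sequence


def binary_bat_select(n_features: int,
                      fitness: Callable[[Sequence[int]], float],
                      n_bats: int = 10,
                      n_iter: int = 50,
                      f_min: float = 0.0,
                      f_max: float = 2.0,
                      v_max: float = 6.0,
                      seed: int = 0) -> List[int]:
    """Simplified binary bat search; returns the best bit mask found."""
    rng = random.Random(seed)
    # Each bat is a bit string: 1 keeps the feature, 0 drops it.
    positions = [[rng.randint(0, 1) for _ in range(n_features)]
                 for _ in range(n_bats)]
    velocities = [[0.0] * n_features for _ in range(n_bats)]
    scores = [fitness(p) for p in positions]
    best_idx = max(range(n_bats), key=lambda i: scores[i])
    best, best_score = positions[best_idx][:], scores[best_idx]

    for _ in range(n_iter):
        for i in range(n_bats):
            # Random frequency in [f_min, f_max] scales the velocity step.
            freq = f_min + (f_max - f_min) * rng.random()
            candidate = []
            for j in range(n_features):
                v = velocities[i][j] + (positions[i][j] - best[j]) * freq
                v = max(-v_max, min(v_max, v))  # clamp to keep exp() safe
                velocities[i][j] = v
                # Sigmoid transfer: the velocity becomes a bit probability.
                prob = 1.0 / (1.0 + math.exp(-v))
                candidate.append(1 if prob > rng.random() else 0)
            cand_score = fitness(candidate)
            # Simplified acceptance rule (assumption): keep non-worse moves.
            if cand_score >= scores[i]:
                positions[i], scores[i] = candidate, cand_score
            if cand_score > best_score:
                best, best_score = candidate[:], cand_score
    return best


# Toy fitness (hypothetical): features 0 and 2 help, each feature costs 0.1.
def toy_fitness(bits: Sequence[int]) -> float:
    return bits[0] + bits[2] - 0.1 * sum(bits)


mask = binary_bat_select(6, toy_fitness, seed=1)
print(mask)  # a list of six 0/1 flags, one per feature
```

Because the search is stochastic, a fixed seed is used so that repeated runs return the same mask.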
In [19], a bio-inspired method was proposed in which the Bat Algorithm is hybridized with a Naive Bayes classifier (BANB). Twelve benchmark datasets from different domains were used in experiments to compare its performance with three well-known feature selection techniques, GA, PSO and GPSO, in terms of the number of features selected from the original datasets. The results show that BANB significantly outperformed the other algorithms in the number of features selected, while maintaining or improving classification accuracy.
In general, these studies found that feature selection techniques can improve the performance of learning algorithms by removing irrelevant attributes and thereby increasing the accuracy of the classifier. With high-quality features, the classification process becomes more accurate and comprehensible and produces better results. For that reason, this research conducts experiments to determine which technique performs best among the conventional and heuristic techniques.

Experimental results
In this research, the experiment is set up to determine which feature selection technique performs best among the conventional and heuristic techniques. The conventional techniques involved in the experiments are Chi-Square (CH) and Information Gain (IG), while the Bat Algorithm (BA) is the heuristic method. This section discusses the experimental results for 14 datasets from the UCI repository, drawn from various applications, with attributes selected using CH, IG and BA. Three benchmark classifiers, kNN, NB and DT, were applied in these experiments to evaluate the performance of the selected features using four performance measures: accuracy, sensitivity, F-Measure and ROC area.
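The two conventional techniques named above score each attribute independently against the class label. The sketch below shows the textbook definitions of information gain and the chi-square statistic for a discrete attribute. It is an illustration under the assumption that numeric attributes have already been discretized, and it is not the Weka or Matlab code used in the experiments. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """Class entropy minus the class entropy conditioned on the feature."""
    n = len(labels)
    conditional = 0.0
    for value, count in Counter(feature).items():
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += (count / n) * entropy(subset)
    return entropy(labels) - conditional


def chi_square(feature: Sequence[Hashable],
               labels: Sequence[Hashable]) -> float:
    """Pearson chi-square statistic of the feature/class contingency table."""
    n = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    joint = Counter(zip(feature, labels))
    stat = 0.0
    for v, fv in feature_counts.items():
        for y, fy in label_counts.items():
            expected = fv * fy / n
            observed = joint.get((v, y), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


# Toy data (hypothetical): the first attribute predicts the class perfectly,
# the second is independent of it.
y = [0, 0, 1, 1]
informative = [0, 0, 1, 1]
uninformative = [0, 1, 0, 1]
print(information_gain(informative, y), chi_square(informative, y))      # 1.0 4.0
print(information_gain(uninformative, y), chi_square(uninformative, y))  # 0.0 0.0
```

Attributes are then ranked by score and the top-ranked ones are kept. How many to keep is a separate choice that this sketch does not address.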
Table 1 shows the characteristics of the 14 datasets: the number of attributes and instances. In general, the features selected by CH and IG improve the classification performance for all classifiers, as shown in Tables 2, 3 and 5 to 15. For the Hill Valley dataset (Table 4), the highest value is the ROC area for kNN, 0.59.
For the Image Segmentation dataset, the ROC area for NB, 0.946, is the highest value (see Table 5), while the ROC area with BA is 0.926. In Table 6, the ROC area for NB, 0.943, is the highest value, while the ROC area with BA is 0.938. In Table 7, the highest value is the ROC area for NB, 0.876. Table 8 shows that the ROC area for NB is the highest value for the Plant Species dataset with CH, IG and BA, at 0.996, 0.996 and 0.993, respectively. In Table 9, the highest value is the ROC area for NB with CH, IG and BA, at 0.965, 0.965 and 0.957, respectively. In Tables 10 and 12, the highest ROC area is 0.806 for the Automobile dataset and 0.817 for the Yeast dataset.
In Table 11, the highest performance is the ROC area for NB with CH, IG and BA, at 0.71, 0.71 and 0.704, respectively. For the Waveform dataset, shown in Table 13, the highest performance is the ROC area for NB, which is 0.96 with CH and IG and 0.954 with BA. In Tables 14 and 15, the highest value is the ROC area for NB, at 0.935 and 0.878, respectively.
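The four performance measures reported in the tables can be illustrated for the two-class case with a minimal sketch. It is not the evaluation code used in the experiments. It assumes binary labels where 1 is the positive class, and it computes the ROC area with the rank-based (Mann-Whitney) formulation. Multi-class datasets would need per-class values averaged in some way, which is not shown. The function names and the toy predictions are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int],
                   y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (recall) and F-Measure from hard predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }


def roc_area(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative
    (ties count one half). Requires at least one example of each class."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy predictions (hypothetical).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
print(binary_metrics(y_true, y_pred))
# {'accuracy': 0.75, 'sensitivity': 0.75, 'f_measure': 0.75}
print(roc_area(y_true, scores))  # 0.9375
```

Unlike the other three measures, the ROC area depends on the ranking of the classifier scores rather than on a single decision threshold.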

Discussion and conclusion
This study used three feature selection techniques and 14 datasets from various applications in the UCI repository. The experiments were run using Weka and Matlab. CH and IG are considered conventional feature selection methods, while BA is a heuristic feature selection method. These feature selection techniques were used to find feature subsets from the 14 datasets. The best-performing feature selection technique was then determined according to classification performance. Three benchmark classifiers (NB, kNN and DT) and four performance measures (accuracy, sensitivity, F-Measure and ROC area) were used in the experiments. The results show that the best-performing feature selection method, as determined by ROC area, is the conventional techniques, CH and IG. However, BA has the potential to be an outstanding feature selection method. The results in Table 4 show that BA improves all the performance measures for kNN. And also, results from Table 5

The results for CH and IG are shown in Tables 2, 3 and 5 to 15. From Tables 2 and 3, the ROC area for NB is the highest performance measure for the Credit Approval and Ecoli datasets, with values of 0.879 and 0.988, respectively. However, the result for the Hill Valley dataset is slightly different, as shown in Table 4. The features selected by CH and IG improve the classification performance only for NB, in all performance measures, whereas BA improves all the performance measures for kNN. Meanwhile, BA obtains the same result as CH and IG for DT in all classification performance measures.

Table 2. Results for classification performance for Credit Approval dataset

Table 3. Results for classification performance for Ecoli dataset

Table 4. Results for classification performance for Hill Valley dataset

Table 5. Results for classification performance for Image Segmentation dataset

Table 6. Results for classification performance for Libras Movement dataset

Table 7. Results for classification performance for Steel Plates Faults dataset

Table 8. Results for classification performance for Plant Species dataset

Table 9. Results for classification performance for Urban Land dataset

Table 10. Results for classification performance for Automobile dataset

Table 11. Results for classification performance for Abalone dataset

Table 12. Results for classification performance for Yeast dataset

Table 13. Results for classification performance for Waveform dataset

Table 14. Results for classification performance for Ionosphere dataset

Table 15. Results for classification performance for Water Treatment dataset