The Performance Analysis of K-Nearest Neighbors (K-NN) Algorithm for Motor Imagery Classification Based on EEG Signal

Most EEG-based motor imagery classification research focuses on the feature extraction phase of machine learning and neglects the phase that is crucial for accurate classification: the classifier itself. In contrast, this paper concentrates on classifier development and thoroughly studies the performance of the k-Nearest Neighbour (k-NN) classifier on EEG data. In the literature, the Euclidean distance metric is routinely applied for EEG data classification, yet no thorough study has evaluated the effect of other distance metrics on classification accuracy. Therefore, this paper studies the effectiveness of five distance metrics for k-NN: Manhattan, Euclidean, Minkowski, Chebychev and Hamming. The experiments show that the distance computation providing the highest classification accuracy is the Minkowski distance, at 70.08%. This demonstrates the significant effect of the distance metric on k-NN accuracy, with the Minkowski distance giving higher accuracy than the Euclidean. Our results also show that the accuracy of k-NN is comparable to that of the Support Vector Machine (SVM), with lower complexity, for EEG classification.


Introduction
Motor imagery classification uses brain signals to classify imagined tasks, such as hand and foot movements, and is commonly applied in Brain-Computer Interfaces (BCI). A BCI is an alternative communication pathway between the brain and a computer that does not require muscular movement or control [1] and has potential applications in a wide range of fields such as medicine, the military and entertainment [2].
The classification of brain signals is challenging because the signals have a low signal-to-noise ratio, are non-linear, and offer limited training data owing to the difficulty of collecting them [3]. Electroencephalography (EEG) is the technique researchers usually choose to collect the signals, rather than alternatives such as computed tomography (CT), magnetic resonance imaging (MRI) and functional MRI (fMRI). Signal collection with EEG is done by placing electrodes on the scalp to record electrical activity and brain waves.
The two main phases in data classification are the feature extraction phase and the training/classification phase (classifier development). Unlike existing studies [4][5][6][7][8], which focus on feature extraction, this paper focuses on classifier development.
The classification of EEG data can be done with many algorithms, one of which is k-Nearest Neighbors (k-NN). A k-NN classifier is used in [9] to discriminate between seizure and non-seizure events for automated seizure detection using EEG signals. In [10], k-NN is used as the classifier when comparing three different signal decomposition methods for motor imagery BCI systems. The k-NN classifier provides better classification accuracy and requires less training and testing time than the Multilayer Perceptron (MLP) and Support Vector Machine (SVM) [11]. In [12], the k-NN classifier obtained the best accuracy in classifying EEG signals to identify engagement, enjoyment, frustration and difficulty, compared to Bayes Network, Naïve Bayes, SVM, Multilayer Perceptron, Random Forest and J48. Besides that, [13] shows that k-NN also provides better accuracy in eye state classification than Multilayer Perceptron Neural Networks.
This research studies the classification technique using the k-NN algorithm. k-NN is a simple supervised learning algorithm [14] that classifies a sample by a majority vote of its neighbors: the sample is allocated to the most common class among its k closest neighbors. In k-NN, a distance metric is used to calculate the distance between a new sample and the existing samples in the dataset. The literature overwhelmingly defaults to the Euclidean distance metric [10][11][12][13][15][16][17][18]. In fact, we are not aware of any study focused on a performance comparison among different distance metrics for k-NN. This paper aims to thoroughly investigate the effect of the distance metric and the parameter k on k-NN performance; we experiment with five distance metrics.
Additionally, the paper verifies whether the k-NN algorithm is suitable as a classification method in BCI systems, as [15] states that k-NN is one of the most important non-parametric algorithms in BCI implementations, and compares it with the performance of SVM, which is commonly used for EEG data classification. This paper studies the performance of the k-NN algorithm using EEG data for BCI motor imagery classification. The results help in choosing a better k value and a proper metric for good classification performance. The rest of this paper is organized as follows: Section 2 introduces k-NN and its parameters. Section 3 covers the EEG processing using k-NN. Section 4 presents the results and discussion. Lastly, Section 5 concludes.

K-Nearest Neighbors
k-NN classification is a non-parametric model described as instance-based learning, in which the model is characterized by memorizing the training dataset. k-NN is also a typical example of a lazy learner. It is called lazy not because of its apparent simplicity, but because it does not learn a discriminative function from the training data and instead memorizes the training dataset. Lazy learning is a special case of instance-based learning associated with zero cost during the learning process [19].
The k-NN algorithm is suitable for classifying EEG data as it is a robust technique for large, noisy data [20]. A sample is classified by the majority vote of its neighbors' classes. To determine the class, the algorithm requires training data and a predefined k value; it searches the training sample space for the k most similar samples according to a similarity measure, a distance function [21]. The value of k and the distance metric affect the classification result [20]. Fig. 1 illustrates the concept of the k-NN algorithm when a distance metric is applied to determine the appropriate class of new data with k = 9. The data point to be classified is at (0.6, 0.45), shown with an "X". The large dotted circle represents the neighborhood under the Euclidean distance computation. There are two possible classes: the circle class with six instances and the triangle class with three instances. The algorithm classifies "X" into the circle class, as the circle class holds the majority of the data within the radius.
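The majority-vote rule described above can be sketched in a few lines of Python. The points below are hypothetical values mirroring the Fig. 1 example, not actual EEG features:

```python
from collections import Counter
import math

def knn_predict(train, labels, query, k):
    """Classify `query` by majority vote among its k nearest training
    points under the Euclidean distance. `train` is a list of 2-D tuples."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], query))
    top_k = [labels[i] for i in order[:k]]
    return Counter(top_k).most_common(1)[0][0]

# Hypothetical data mirroring Fig. 1: six circles and three triangles.
train = [(0.55, 0.40), (0.60, 0.50), (0.65, 0.45), (0.50, 0.50),
         (0.70, 0.40), (0.58, 0.35), (0.20, 0.90), (0.15, 0.85), (0.90, 0.10)]
labels = ["circle"] * 6 + ["triangle"] * 3

print(knn_predict(train, labels, (0.6, 0.45), k=9))  # → circle
```

With k = 9 every training point votes, and the circle class wins 6 to 3, matching the Fig. 1 outcome.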

Value of k
The value of k is usually a small positive integer. If k = 5, the class allocation of the sample is based on its five nearest neighbors within a certain distance. In this paper, values of k from 1 to 15 are used.

Distance metric
A distance metric is a method to find the distance between a new data point and the existing training dataset [22]. Five distance metrics are used in this research, which can be explained as follows:

Manhattan / City Block Distance
The Manhattan distance, also called the city block distance, between two points $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ is the sum of the absolute differences of their Cartesian coordinates, defined by:

$$d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$$

Euclidean Distance
The Euclidean distance is a measure of the distance between two points. In Cartesian coordinates, if $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ are two points in Euclidean n-space, then the distance $d$ from $x$ to $y$, or from $y$ to $x$, is given by Pythagoras's theorem:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$

Minkowski Distance
The Minkowski distance between two points $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ in a normed vector space is parameterized by a value $p$ and defined as:

$$d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$$

Special cases of the Minkowski distance: when p = 1, it gives the Manhattan distance; when p = 2, it gives the Euclidean distance; and as p → ∞, it gives the Chebychev distance.

Chebychev Distance
The Chebychev distance between two points $x$ and $y$ is the greatest of their differences along any coordinate dimension, defined by:

$$d(x, y) = \max_{i} |x_i - y_i|$$

Hamming Distance
The Hamming distance, which is the percentage of coordinates that differ, can be defined by:

$$d(x, y) = \frac{1}{n} \sum_{i=1}^{n} [x_i \neq y_i]$$

where $[x_i \neq y_i]$ equals 1 when the coordinates differ and 0 otherwise.
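To make the five definitions concrete, here is a minimal NumPy sketch of each metric; the sample vectors are arbitrary illustrations, not EEG data:

```python
import numpy as np

def manhattan(x, y):
    # Sum of absolute coordinate differences (city block distance).
    return np.sum(np.abs(x - y))

def euclidean(x, y):
    # Square root of the sum of squared coordinate differences.
    return np.sqrt(np.sum((x - y) ** 2))

def minkowski(x, y, p):
    # Generalization: p = 1 gives Manhattan, p = 2 gives Euclidean.
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

def chebychev(x, y):
    # Largest coordinate difference (Minkowski limit as p → ∞).
    return np.max(np.abs(x - y))

def hamming(x, y):
    # Fraction of coordinates that differ.
    return np.mean(x != y)

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 2.0, 1.0])
print(manhattan(x, y))      # 5.0
print(euclidean(x, y))      # ≈ 3.606
print(minkowski(x, y, 1))   # 5.0, same as Manhattan
print(chebychev(x, y))      # 3.0
print(hamming(x, y))        # 2 of 3 coordinates differ → ≈ 0.667
```

Note how the special cases line up: `minkowski(x, y, 1)` reproduces the Manhattan value, and `minkowski(x, y, 2)` reproduces the Euclidean value.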

Data description
The dataset used is Dataset 1 from BCI Competition IV [23], which contains recorded EEG data for motor imagery tasks performed by four healthy human subjects (a, b, f and g). In the experiment, each subject was asked to select two of three mental tasks to perform: left hand, right hand or foot movements. Arrows pointing left, right or down were presented as visual cues on a computer screen, representing left hand, right hand and foot movement respectively. Each cue was displayed for a period of 4 s, interleaved with 2 s of blank screen and 2 s with a fixation cross shown in the center of the screen. Each run has 50 trials of each of the two chosen classes, giving a total of 200 trials per subject [24]. Eleven channels are used in this research, namely FC5, FCz, FC6, C5, C3, Cz, C4, C6, CP5, CPz and CP6.

Feature Extraction
The extraction of the wanted signals from the raw EEG signals, while removing the irrelevant ones, is called the feature extraction process, as shown in Fig. 2. The EEG data should be processed in terms of frequency, as it is classified based on the frequency bands delta, theta, alpha, beta and gamma. Therefore, the Fast Fourier Transform (FFT) is used to transform the signal from the time domain to the frequency domain. The FFT features are then fed into the k-NN classifier for the classification process. Since a signal in the time domain can be split into a group of sinusoids, the lengthy and noisy EEG signals can easily be converted into the frequency domain, where hidden features become visible. The original signal can be restored by summing all the sinusoids after the FFT; therefore, no information is lost.
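The FFT step can be illustrated with a short NumPy sketch. The 100 Hz sampling rate and the synthetic two-sinusoid "EEG" trial below are illustrative assumptions, not the actual dataset parameters:

```python
import numpy as np

fs = 100                              # assumed sampling rate in Hz
t = np.arange(0, 4, 1 / fs)           # one 4 s trial, as in the cue period
# Synthetic "EEG": a 10 Hz alpha component, a weaker 22 Hz beta
# component, and a little noise.
rng = np.random.default_rng(0)
signal = (np.sin(2 * np.pi * 10 * t)
          + 0.5 * np.sin(2 * np.pi * 22 * t)
          + 0.1 * rng.normal(size=t.size))

power = np.abs(np.fft.rfft(signal)) ** 2        # power spectrum
freqs = np.fft.rfftfreq(t.size, d=1 / fs)       # frequency bins in Hz

# Band power features over the classical EEG bands.
bands = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 50)}
features = {name: power[(freqs >= lo) & (freqs < hi)].sum()
            for name, (lo, hi) in bands.items()}

print(max(features, key=features.get))  # → alpha (dominant component)
```

In the actual pipeline, the band-power (or raw FFT magnitude) features per channel would form the feature vector handed to the k-NN classifier.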

Classification
In machine learning, classification is the process of predicting the categories or classes of the samples in a dataset by applying a classification algorithm. The classifier used in this paper is the k-NN classifier, as shown in Fig. 3. This process is implemented in Python.
Five distance metrics (Manhattan, Euclidean, Minkowski, Chebychev and Hamming) are used to determine the best value of k to maximize the classification performance. The k value is searched from 1 to 15. The classification is performed for all subjects and the results are expressed as percentage accuracy.

Fig. 3 Block diagram of the classification process
SVM is usually used as the classifier for this dataset in BCI research [4][5][6][7][8]. Thus, in this paper, k-NN is investigated and compared with SVM.

Results and Discussion
The classification is between two classes: either motor imagery of the right hand versus the left hand, or the left hand versus the right foot. In this section, the results of EEG data classification using the k-NN classifier with various values of k and distance metrics are tabulated in tables and illustrated in graphs.
The results have been analyzed for the performance of the k-NN classifier. To establish the reliability of the classification results, they are verified using the 10-fold cross validation technique. This process randomly divides the data into 10 folds and runs 10 tests: one fold is used for testing while the remaining folds undergo training. The process is repeated until every fold has been used for both testing and training the classifier. The classification accuracy is calculated as the average over the 10 folds.
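The fold mechanics can be verified with a small scikit-learn sketch; the ten samples below are arbitrary placeholders:

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)    # 10 hypothetical samples
kf = KFold(n_splits=10, shuffle=True, random_state=0)

tested = []
for train_idx, test_idx in kf.split(X):
    # In each of the 10 rounds, 9 folds train and 1 fold tests;
    # the fold accuracies are then averaged.
    assert len(train_idx) == 9 and len(test_idx) == 1
    tested.extend(test_idx)

# Every sample is used for testing exactly once across the 10 folds.
print(sorted(tested))  # → [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```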

K-NN Accuracy with Varying Parameters
The data classification has been performed with the same k-NN classifier over various values of k and distance metrics. The left hand versus right foot imaginary movement classification is done for subjects a and f. For subject a, the highest accuracy is 67.67%, using the Minkowski distance at p = 11 and k = 14. For subject f, the highest accuracy is 63.13%, using the Manhattan distance at k = 13 and k = 15. Subjects b and g are used to classify the right hand versus left hand imaginary movements. For subject b, the highest accuracy is 53.87%, using the Minkowski distance at p = 11 and k = 14. For subject g, the best classification accuracy achieved is 70.08%, using the Minkowski distance at p = 11 and k = 15.
The accuracy comparisons between the different distance metrics and values of k for the motor imagery task classifications (left hand vs right foot and right hand vs left hand) are illustrated in Fig. 4. Based on the overall results, the Minkowski distance at p = 11 with a value of k between 13 and 15 gives the highest accuracy compared to the other distances and should be chosen to obtain good performance for BCI motor imagery classification.

K-NN vs SVM
Although the SVM classifier is commonly used for this dataset, comparing the classification accuracy of SVM and k-NN shows that k-NN outperforms SVM for most subjects. k-NN is also a very simple algorithm to implement and needs less time than SVM.
For this comparison, Linear Discriminant Analysis (LDA) is applied before running the classifiers to reduce the data dimensionality and obtain optimal results. As the Minkowski distance at p = 11 gives the highest accuracy compared to the other distances, this distance metric is used in the comparison with SVM. Table 2 shows the classification accuracy of k-NN and SVM, and Fig. 5 shows the graph comparing k-NN and SVM after applying LDA.
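This LDA-then-classify pipeline can be sketched as follows. The feature matrix is again a synthetic stand-in, and since the paper does not specify the SVM kernel or hyperparameters, the default scikit-learn `SVC` is an assumption:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
# Synthetic stand-in for one subject's feature matrix (200 trials).
X = np.vstack([rng.normal(0, 1, (100, 11)), rng.normal(0.8, 1, (100, 11))])
y = np.array([0] * 100 + [1] * 100)

# LDA reduces dimensionality before each classifier, as in the paper;
# k-NN uses the best-performing setting (Minkowski, p = 11, k = 15).
knn = make_pipeline(
    LinearDiscriminantAnalysis(),
    KNeighborsClassifier(n_neighbors=15, metric="minkowski", p=11))
svm = make_pipeline(LinearDiscriminantAnalysis(), SVC())

acc_knn = cross_val_score(knn, X, y, cv=10).mean()
acc_svm = cross_val_score(svm, X, y, cv=10).mean()
print(f"k-NN: {acc_knn:.3f}  SVM: {acc_svm:.3f}")
```

Wrapping LDA inside the cross-validation pipeline, rather than fitting it once on all data, keeps the dimensionality reduction from leaking test-fold information into training.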

Fig. 5 Comparison classification accuracy between k-NN and SVM
From the results obtained, it can be concluded that the k-NN algorithm is also suitable as a classification method in BCI systems, as its results are comparable to SVM.

Conclusion
This paper performs motor imagery classification of EEG signals for BCI applications. Features are extracted from the signals using the FFT, and the k-NN classifier is used to classify the data. The paper presents a detailed performance analysis of k-NN for EEG signal classification. The results show that k-NN attains different accuracies when applied with different k values and distance metrics, demonstrating that the distance metric and k value affect classifier performance and are therefore worth considering when designing an EEG signal classifier. The results also show that k-NN obtains accuracy comparable to SVM, which is commonly used for EEG signal classification. Given the expensive computational time of SVM, the lower complexity of k-NN makes it promising for low-cost EEG classification. The Hamming distance metric performs badly on all datasets, having the lowest accuracy of all the distances; thus, the Hamming distance is not suitable for EEG data classification. In future work, more algorithms will be thoroughly investigated for EEG classification.