Comparison of principal component analysis algorithm and local binary pattern for feature extraction on face recognition system

Feature extraction in face recognition is the step that obtains characteristic information from the image. The feature extraction algorithms were tested against several scenarios with different sunlight and artificial lighting, and with the subject facing and not facing the camera. The test data were taken from 4 people using a video file of 70 frames per recognizable face, processed with the Principal Component Analysis (PCA) and Local Binary Pattern (LBP) algorithms. The results show that, in the scenario of the subject facing the camera with sunlight in the room, the LBP algorithm achieved an accuracy of 98.59%, a recognition time of 812.817 milliseconds, a FAR of 1.41% and an FRR of 0%, while PCA achieved 98.59% accuracy, a recognition time of 1275.761 milliseconds, a FAR of 1.41% and an FRR of 0%. Based on these results, the Local Binary Pattern (LBP) algorithm is more efficient than Principal Component Analysis (PCA) for face recognition in the tested scenarios and for implementation in real-time video.


Introduction
The face recognition system has received a lot of attention in the security field. This system provides user authentication for access control, but matching an individual's identity with the face image in the database remains a problem. This is due to the variability of the human face under different operational conditions, e.g. illumination, rotation, expression, camera display and aging [1].
The choice of algorithm affects the quality of the resulting system; a previous comparison of feature extraction algorithms found the local binary pattern histogram algorithm more efficient than the eigenface algorithm [2]. However, that study reported difficulties with differences in face position during recognition.
A video is a sequence of frames containing the objects to be recognized by the face recognition system. The types of video used for face recognition are real-time video [2][3][4][5] and video files [6,7]. In real-time video, frames are captured directly for face recognition, whereas video files are videos stored as files, so face recognition is performed indirectly.

System Design
The face recognition system is basically part of an expert system based on image technology. Expert systems are systems that process data in an organized manner [8], seeking to adapt human knowledge to a computer so that the computer can solve problems as an expert usually would [9]. Based on many research results, expert systems have a very good ability in decision making; such systems have advantages in terms of good data accessibility, time efficiency [10], accuracy [11], appropriate decision support [12], economy [13], broad accessibility [14], improved user understanding [15], improved productivity [16], good presentation of data and information [17], and, in certain cases, use as data storage media [18].
The face recognition system consists of four stages: face localization using the haar cascade classifier detection method; face normalization, which crops the face image, resizes it and converts it from RGB to grayscale; feature extraction, which obtains the features of the face image; and feature matching, which compares the test image with the training images to identify the person in the image [7,19]. The stages performed in this study are described in sub-chapters 2.1 to 2.3 below.

Preprocessing
Preprocessing is the initial stage that processes the input data (the face image) before it enters the recognition stage. Its steps are explained as follows:

Face Detection
The face detection phase localizes the face in the frame using the haar cascade classifier algorithm, an object detection method with a high degree of accuracy and speed [20]. The haar cascade classifier algorithm consists of:
- creating the integral image;
- Adaboost training;
- the cascade classifier.
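The first of these steps, the integral image (summed-area table), is what lets Haar features be evaluated with only four array lookups per rectangle. The paper's system relies on OpenCV's built-in detector; the sketch below is an illustrative pure-Python version of the integral image idea only, with hypothetical function names.

```python
def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img[0..y][0..x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # add the running row sum to the column sum accumulated above
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, x0, y0, x1, y1):
    """Sum of pixels in the inclusive rectangle (x0,y0)-(x1,y1) via 4 lookups."""
    total = ii[y1][x1]
    if x0 > 0:
        total -= ii[y1][x0 - 1]
    if y0 > 0:
        total -= ii[y0 - 1][x1]
    if x0 > 0 and y0 > 0:
        total += ii[y0 - 1][x0 - 1]
    return total
```

A Haar feature is then just the difference of two or more such rectangle sums, which is why detection remains fast at every scale.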

Image Cropping
At this stage the image is cropped to remove the background of the face image in the frame, which is considered unnecessary.

Image Resizing
The next process resizes the face image to 128 x 128 pixels, because the initial image resolution is considered too large and would interfere with the next stage.
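In practice this resizing is done with OpenCV; as an illustrative sketch (hypothetical function, nearest-neighbour interpolation only), resizing a 2-D pixel array to a fixed size can be written as:

```python
def resize_nearest(img, new_w, new_h):
    """Nearest-neighbour resize of a 2-D pixel list to new_w x new_h."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        src_y = min(h - 1, y * h // new_h)  # map output row to source row
        row = [img[src_y][min(w - 1, x * w // new_w)] for x in range(new_w)]
        out.append(row)
    return out

# usage: face = resize_nearest(face, 128, 128)
```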

Converting RGB Image to Grayscale
Converting the image to grayscale aims to make the facial characteristics clearly visible so that pattern features can be extracted.
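The standard RGB-to-grayscale conversion, as used by OpenCV's cvtColor for RGB2GRAY, is the ITU-R BT.601 luma weighting. A minimal per-pixel sketch:

```python
def rgb_to_gray(r, g, b):
    """ITU-R BT.601 luma: gray = 0.299 R + 0.587 G + 0.114 B."""
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))
```

The green channel is weighted most heavily because the eye is most sensitive to it, which preserves facial contrast well.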

Feature Extraction
The process of feature extraction and classification is performed using the Principal Component Analysis (PCA) and Local Binary Pattern (LBP) algorithms. The k principal components of an observed vector x are obtained by projecting it onto the leading eigenvectors of the training-set covariance matrix, using the equation below.

Principal Component Analysis (PCA)
The projection of a face vector x onto the principal subspace is

y = W^T (x − μ) (4)

where W = (v1, v2, …, vk) (5) and μ is the mean training image. The eigenvectors in PCA are treated as facial characteristics, which is why this method is often referred to as the eigenface method [22].
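Equations (4) and (5) amount to centring the face vector and taking its dot product with each retained eigenvector. A minimal sketch (hypothetical function name, pure Python):

```python
def project(x, mean, eigvecs):
    """Project a face vector x onto the principal subspace: y = W^T (x - mean).

    eigvecs is the list (v1, ..., vk) of eigenvectors forming the columns of W;
    the result y is the k-dimensional feature vector used for matching.
    """
    centered = [xi - mi for xi, mi in zip(x, mean)]
    return [sum(v * c for v, c in zip(vec, centered)) for vec in eigvecs]
```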

Local Binary Pattern (LBP)
Local Binary Patterns are descriptors that describe local texture patterns in grayscale images [23]. LBP is defined over a set of circularly arranged neighbours with a centre pixel in the middle, as shown in Figure 1. The notation gi is the pixel value of the i-th neighbour, and gc is the central pixel used as the threshold that turns each neighbour into a binary digit. Thresholding the circular neighbourhood against the central pixel and multiplying by binary weights yields the LBP value. For example, for P = 8 sampling points and radius R = 1, the calculation of LBP values is illustrated in Figure 2.
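For P = 8 and R = 1 the circular neighbourhood reduces to the 8 immediate neighbours. A pure-Python sketch of the per-pixel LBP code (illustrative; the neighbour ordering is an assumption, any fixed ordering yields a valid descriptor):

```python
def lbp_pixel(img, y, x):
    """LBP code for pixel (y, x) with its 8 immediate neighbours (P=8, R=1).

    Each neighbour gi >= centre gc contributes its binary weight 2^i;
    the result is an 8-bit texture code in [0, 255].
    """
    gc = img[y][x]
    # neighbour offsets (dy, dx), clockwise from the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for i, (dy, dx) in enumerate(offsets):
        if img[y + dy][x + dx] >= gc:
            code |= 1 << i
    return code
```

A face image is then described by the histogram of these codes over the image (or over a grid of sub-regions), which is what the classifier compares.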

Distance Measurement (Classification)
The classification stage generally comprises two main processes, namely training and testing [25]. The training process uses a set of training images containing the characteristic parameters used to distinguish one object from another. The testing process then maps the test data to the training targets through a formula. There are several approaches, among them statistical (machine learning) and rule-based approaches [26]. In this study the researchers used the Euclidean Distance and Chi Square Distance algorithms, explained as follows:

Euclidean Distance
Euclidean Distance is the most commonly used metric for calculating the similarity of two vectors [27], with the following equation:

d(xi, xj) = sqrt( Σ_{k=1}^{n} (x_ik − x_jk)^2 ) (6)

where x_ik and x_jk are the k-th components of the training and test image vectors and n is their dimension. For histogram comparison, OpenCV implements the compareHist function [28]; its chi-square measure is given in the Chi Square Distance subsection.
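Equation (6) translates directly into code; a minimal self-contained sketch:

```python
import math

def euclidean_distance(xi, xj):
    """Euclidean distance between a training vector xi and a test vector xj:
    sqrt of the sum of squared componentwise differences (equation 6)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))
```

During recognition, the test feature vector is compared against every training vector and the identity with the smallest distance is returned.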

Research Methodology
This research comprises several related components, namely the system architecture, the application architecture and the application testing parameters, which are explained as follows [29].

System Architecture
The system architecture is designed to show how the system works when implemented and which components are involved in this research. As seen in Figure 3, several components support the running of the system, such as a webcam, a laptop, software and human subjects. The application is written in Java and runs on a computer, with camera hardware to capture the objects to be tested, using the Netbeans software and the OpenCV library with the PCA and LBP methods. Recognized data are then stored in the image dataset.

Testing Parameters
Test parameters are used to calculate the efficiency of the compared algorithms. The parameters used are described below.

Accuracy
Accuracy is a measure of how precisely the system recognizes the given input and produces the correct output:

Accuracy = (number of correctly recognized test images / total number of test images) × 100% (8)

Time of Recognition
Computational time is the time the system needs to perform a process, calculated as the completion time minus the start time.
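This completion-minus-start measurement can be sketched with a small timing wrapper (hypothetical helper; the paper's Java implementation would use the equivalent System.nanoTime pattern):

```python
import time

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms
```

Averaging elapsed_ms over all test frames gives the average recognition time reported in the results.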
False Acceptance Rate (FAR)
FAR is the percentage of unauthorized users that are authenticated as legitimate users [2]. In this system, FAR is an error in recognizing the identity of the input image: either an individual outside the database is recognized as an individual in the database, or an individual in the database is recognized as another individual [29].

FAR = (number of falsely accepted images / total number of test images) × 100% (9)
False Rejection Rate (FRR)
FRR is the percentage of legitimate users denied authentication [2]. In this system, FRR is an error in rejecting the input image, i.e. an input image that should be recognized (its identity is contained in the database) goes unrecognized [30].

FRR = (number of falsely rejected images / total number of test images) × 100% (10)
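The two error rates can be computed together from the test counts; a minimal sketch with hypothetical parameter names:

```python
def far_frr(false_accepts, impostor_total, false_rejects, genuine_total):
    """FAR (%) = unauthorised inputs wrongly accepted / all unauthorised inputs;
    FRR (%) = enrolled inputs wrongly rejected / all enrolled inputs."""
    far = false_accepts / impostor_total * 100.0
    frr = false_rejects / genuine_total * 100.0
    return far, frr
```

Lowering the matching threshold reduces FAR but raises FRR, so the two are usually reported together, as in Table 2.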

Face Dataset
The face dataset is a collection of data divided into two parts, namely the face image training set and the face image testing set, explained as follows:

Face Image Testing Dataset
The image testing dataset is the collection of image data obtained during the test process. The face images obtained at this stage are shown in Figure 5.

Result
The tests were performed using a combination of 4 scenarios for the Principal Component Analysis (PCA) and Local Binary Pattern (LBP) algorithms. The resulting comparisons of average accuracy, average recognition time, FAR and FRR for the PCA and LBP algorithms can be seen in Table 1 and Table 2.

Conclusion
The designed system is able to perform face recognition using the OpenCV library. The comparison of feature extraction algorithms shows that the Local Binary Pattern algorithm is the more time-efficient one. The most efficient scenario, as shown in Table 2, is the condition of the subject facing the camera with sunlight in the room, with an average accuracy of 98.59% and an average recognition time of 812.817 milliseconds.
Principal Component Analysis is a feature extraction method that reduces the dimensionality of the image data without losing the information contained in a face image. The steps of the Principal Component Analysis (PCA) algorithm are as follows [21]:
a. Prepare the data by creating a set S consisting of all m training images:
S = {x1, x2, ..., xm} (1)
b. Calculate the mean value of the images:
μ = (1/n) Σ_{i=1}^{n} xi (2)
where xi is the i-th data of variable x and n is the number of data. The mean is used to centre the data whose dimension is reduced in the next step.
c. Find the covariance matrix of the centred images; this covariance matrix simplifies the search for the eigenvalues and eigenvectors.
d. Calculate the eigenvalues λi and eigenvectors vi of the covariance matrix S:
S vi = λi vi, i = 1, 2, ..., n (3)
e. Sort the eigenvectors and eigenvalues from largest to smallest by eigenvalue. The k principal components are the eigenvectors corresponding to the largest eigenvalues.
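The steps above can be sketched in pure Python for tiny flattened "image" vectors; as an assumption for brevity, power iteration stands in for the full eigendecomposition (step d) and only the leading eigenvector is returned. This is illustrative only; the actual system uses OpenCV's PCA implementation.

```python
def pca_leading_eigenvector(vectors, iters=100):
    """Steps a-e on flattened image vectors: mean, covariance, leading eigenvector."""
    m = len(vectors)            # a. number of training images in the set S
    dim = len(vectors[0])
    # b. mean image
    mean = [sum(v[k] for v in vectors) / m for k in range(dim)]
    # centre the data around the mean
    centred = [[v[k] - mean[k] for k in range(dim)] for v in vectors]
    # c. covariance matrix S of the centred data
    S = [[sum(c[i] * c[j] for c in centred) / m for j in range(dim)]
         for i in range(dim)]
    # d./e. leading eigenvector of S via power iteration (S v = lambda v)
    v = [1.0] * dim
    for _ in range(iters):
        w = [sum(S[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v
```

Repeating the iteration on the deflated matrix would yield the remaining eigenvectors in decreasing eigenvalue order, completing step e.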
In equation (6), n is the number of vector components, x_ik the training image vector and x_jk the testing image vector.

Chi Square Distance
Chi Square Distance is used to compare the histogram values of the training images in the database with the histogram values of the test image. When comparing two histograms H1 and H2, a distance d(H1, H2) assesses how well the histograms match:

χ²(H1, H2) = Σ_i (H1(i) − H2(i))² / H1(i) (7)
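This chi-square measure corresponds to OpenCV's compareHist with the HISTCMP_CHISQR method; a minimal pure-Python sketch (bins where H1 is zero are skipped to avoid division by zero, an assumption about edge-case handling):

```python
def chi_square_distance(h1, h2):
    """Chi-square distance between two histograms:
    sum over bins of (H1 - H2)^2 / H1, skipping bins where H1 = 0."""
    return sum((a - b) ** 2 / a for a, b in zip(h1, h2) if a != 0)
```

A distance of 0 means the LBP histograms are identical; larger values mean less similar textures.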

Fig. 5.
Fig. 5. Face Image Testing Dataset. The total image data obtained from the video (the collection of training and test frames) comprise 80 training images and 241 test images, so the total number of face images is 321 out of 360 frames. An overview of the sequence of image data obtained is given in Figure 6.

Table 1.
Comparison of average accuracy and average recognition time.

Table 2.
Comparison of FAR and FRR, PCA and LBP.