The Cascaded Enhanced k-Means and Fuzzy c-Means Clustering Algorithms for Automated Segmentation of Malaria Parasites

Malaria remains one of the leading causes of death in the world, despite the massive efforts of the World Health Organization (WHO) to eradicate it. Given the large number of cases reported yearly, efficient control and proper treatment of the disease require early detection and accurate diagnosis. To this end, this paper proposes a malaria parasite segmentation approach based on cascaded clustering algorithms to automate the malaria diagnosis process. The cascaded clustering algorithms are compared in terms of the accuracy, sensitivity and specificity of the segmented malaria images. The qualitative and quantitative results show that using the final centres generated by enhanced k-means (EKM) clustering as the initial centres for fuzzy c-means (FCM) clustering produces well-segmented malaria images. The proposed cascaded EKM and FCM clustering segmented 100 malaria images of the Plasmodium vivax species with average segmentation accuracy, sensitivity and specificity of 99.22%, 88.84% and 99.56%, respectively. Therefore, of the three clustering algorithms cascaded with the FCM algorithm, EKM gave the best performance compared with k-means (KM) and moving k-means (MKM).


Introduction
Malaria is an infectious disease caused by the Plasmodium blood parasite, with high prevalence in tropical and subtropical regions. The World Health Organization (WHO) reported that in 2015 around 214 million people suffered from malaria, 90% of the victims being from the African Region, with a total of 438,000 deaths [1]. Across the globe, 306,000 children under the age of five lost their lives to malaria, 292,000 of them in Africa [1].
Malaria can become life-threatening without immediate action. Currently, microscopy-based diagnosis is the most widely used approach to diagnose malaria. Manual microscopic diagnosis is characterized by high sensitivity and accuracy. However, it is time-consuming and labour-intensive, taking approximately 20 to 60 minutes to examine each slide [2], which seems to outweigh these benefits. The final diagnosis ultimately depends on the ability and experience of the experts and may be prone to human error [3].
Recognition of the malaria parasite requires segmentation of the parasite or the infected red blood cells (RBCs) in the malaria image. However, the segmentation process is difficult because most parasites are part of the RBCs that they have infected and present a complex and irregular visual aspect. As a result, various image processing techniques for segmenting malaria parasites have been reported to facilitate the malaria diagnosis process. These include thresholding [4][5], marker-controlled watershed with a morphological approach [6][7], fuzzy rule-based segmentation [8], active contour-based segmentation [9] and, as the latest trend, clustering [10][11].
Clustering is an unsupervised procedure that aims to group a set of data into meaningful groups based on the distances between data points. K-means, fuzzy c-means and mean shift are among the popular clustering algorithms that have been used to segment the region of interest in various types of medical images. K-means (KM) clustering [12] allocates each data point to the nearest cluster centre based on the shortest Euclidean distance.
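The nearest-centre allocation used by KM can be illustrated with a minimal sketch on scalar pixel intensities. The function name `k_means_1d`, the stopping rule and the choice to leave an empty cluster's centre unchanged are illustrative assumptions, not details taken from [12].

```python
def k_means_1d(pixels, centres, max_iter=100):
    """Plain k-means on scalar intensities (illustrative sketch only)."""
    centres = list(centres)
    clusters = [[] for _ in centres]
    for _ in range(max_iter):
        # Assignment step: each pixel joins the centre with the smallest |p - c|.
        clusters = [[] for _ in centres]
        for p in pixels:
            j = min(range(len(centres)), key=lambda k: abs(p - centres[k]))
            clusters[j].append(p)
        # Update step: each centre moves to the mean of its members.
        # Assumption: a centre with no members keeps its old position.
        new_centres = [sum(m) / len(m) if m else c
                       for m, c in zip(clusters, centres)]
        if new_centres == centres:  # no centre moved, so the assignment is stable
            break
        centres = new_centres
    return centres, clusters
```

With a badly placed initial centre, for example `k_means_1d([10, 12, 14, 200, 202, 204], [0, 255, 1000])`, the third centre never attracts a pixel and stays at 1000, which is one way a centre can end up without members.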
Although the KM algorithm is very popular for general clustering, it is sensitive to the initialization of the cluster centres and suffers from the dead centre problem [13]. These limitations can lead to a poor representation of the data. As a result, several clustering algorithms, such as fuzzy c-means and enhanced k-means, have been proposed to overcome the limitations of the KM algorithm. In fuzzy c-means (FCM) clustering [14], each data point is assigned a membership grade indicating its degree of belonging to each centre, rather than being assigned to a single centre as in the KM algorithm. The centre is an average of all data points in the cluster, weighted by their degree of belonging to the centre. These attributes can minimize the dead centre problem in the FCM algorithm. However, the algorithm is still sensitive to the initialization of the cluster centres [15]. This limitation can lead to poor segmentation results in which the resultant cell images are either under-segmented or over-segmented, as shown in [16][17].
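For reference, the membership grade and the weighted centre described above are commonly written as follows. This is the standard textbook form of FCM rather than a formula quoted from this paper; the fuzziness exponent m > 1 and the membership symbol u_j(x,y) are notation assumed here, while p(x,y), c_j and n_c follow the paper's own symbols.

```latex
% Standard FCM membership of pixel p(x,y) in cluster j (assumed notation: u_j, m)
u_j(x,y) = \left[ \sum_{k=1}^{n_c}
  \left( \frac{\lvert p(x,y) - c_j \rvert}{\lvert p(x,y) - c_k \rvert} \right)^{\frac{2}{m-1}}
\right]^{-1}

% Standard FCM centre: membership-weighted average of all pixels
c_j = \frac{\sum_{x=1}^{X} \sum_{y=1}^{Y} u_j(x,y)^{m} \, p(x,y)}
           {\sum_{x=1}^{X} \sum_{y=1}^{Y} u_j(x,y)^{m}}
```

Because every pixel contributes to every centre with a non-zero weight, a centre is far less likely to be left without members than in KM, which is consistent with the remark on the dead centre problem above.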
Another clustering algorithm that has been shown to produce good segmentation performance on medical images is enhanced k-means (EKM) clustering [18]. In EKM clustering, the concept of variance and a new version of the transferring process for clustered members are used to assist the assignment of data to the proper centre during clustering, so that a well-segmented image can be generated. In addition, comparisons of different image segmentation techniques show that the EKM algorithm segments the malaria parasite from overlapping RBCs more successfully than KM and FCM clustering [18].
Based on these advantages of the EKM algorithm in image segmentation, the current study cascades EKM with the FCM algorithm. In addition, two other popular clustering algorithms, KM and moving k-means (MKM) [19], are also cascaded with the FCM algorithm so that the segmentation performance of the three cascaded algorithms can be compared.

Image Acquisition
The malaria thin blood smear samples were obtained from the Department of Microbiology and Parasitology, Hospital Universiti Sains Malaysia (HUSM), Malaysia. The malaria images were acquired using a Luminera Infinity-2 digital camera mounted on a Leica DLMA 1200 digital microscope at 100X magnification. Images were captured in bitmap (*.bmp) format at a resolution of 800×600 pixels. This study analysed a total of 100 images obtained from eight Plasmodium vivax malaria slides. The images contain malaria parasites in the ring, trophozoite, schizont and gametocyte stages.

The Proposed Cascaded Enhanced k-Means and Fuzzy c-Means Clustering Algorithms
As previously mentioned, the FCM clustering algorithm was introduced as a fuzzy version of the KM algorithm to reduce the weaknesses of KM. However, its performance depends on the choice of the initial cluster centres. If good initial cluster centres that are close to the actual final cluster centres are used, the FCM algorithm converges faster and the processing time is reduced. Without proper initialization, the algorithm generates a set of poor final cluster centres and hence poor clustering results. Even though FCM is sensitive to cluster centre initialization, the concept of fuzziness makes the algorithm more flexible in producing the best possible centres. Thus, to take advantage of both the EKM and FCM clustering algorithms, the cascaded EKM and FCM clustering algorithm is proposed. Technically, the cascaded clustering algorithm combines the fundamental theory of the EKM algorithm, which employs the variance concept for allocating data to cluster centres, with the concept of the FCM algorithm, which allows each data point to be assigned simultaneously to multiple clusters with different degrees of membership. Thus, the final centres generated by EKM clustering become the initial centres for FCM clustering. In addition, the fuzzy membership function is used along with the Euclidean distance and variance calculations to assist the assignment of members to the proper centre during the clustering process.
In order to describe the cascaded EKM and FCM clustering algorithms, consider an image with a resolution of X × Y pixels to be clustered into n_c regions. Let p(x,y) be an input pixel to be clustered and c_j be the j-th centre (cluster), where x = 1, 2, …, X, y = 1, 2, …, Y and j = 1, 2, …, n_c. The cascaded EKM and FCM clustering algorithm for image segmentation is described as follows:
1. Initialize the centres using Eq. (1), where min p(x,y) and max p(x,y) are the minimum and maximum pixel intensities in the image.
2. Set α_0 = 0.05, α_b = 0.05 and α_a = α_0, where α_0 and α_b are small constant values.
3. For each pixel of the image, calculate the Euclidean distance, d, using Eq. (2).
4. Assign all pixels to the nearest centre based on d and calculate the centre positions using Eq. (3).
5. Check the variance between the members and their centre using Eq. (4). Note: The variance calculation is applied for measuring the relationship between the centre and its members.
6. Find c_s and c_l, the centres that have the smallest and the largest variance between the members and their centre.
7. If σ²(c_s) < α_a σ²(c_l):
a. Find c_n, the nearest centre to c_l, by calculating the Euclidean distance between c_l and the other centres.
b. If c_n < c_l, assign the members of c_l to the nearest centre if p(x,y) < c_l, where x,y ∈ c_l, and leave the rest of the members to c_l.
c. Else if c_n > c_l, assign the members of c_l to the nearest centre if p(x,y) > c_l, where x,y ∈ c_l, and leave the rest of the members to c_l. Note: The transferring process of c_l members in (b) and (c) is performed in order to reduce the variance of c_l by giving up the members with p(x,y) < c_l and p(x,y) > c_l, respectively, to the appropriate cluster centre, so that the variance of c_l is kept at a reasonable value.
d. Recalculate the positions of c_l and c_n according to Eqs. (5) and (6). Note: n_l and n_n in (5) and (6) are the numbers of new members of c_l and c_n, respectively, after the reassigning process in step 7.
8. Update α_a according to α_a = α_a − α_a / n_c and repeat steps 6 and 7 until σ²(c_s) ≥ α_a σ²(c_l).
9. Reassign all pixels to the nearest centre and recalculate the centre positions using Eq. (3).
10. Update α_a and α_b according to α_a = α_0 and α_b = α_b − α_b / n_c, respectively, and repeat steps 5 to 9 until σ²(c_s) ≥ α_b σ²(c_l).
11. Calculate the fuzzy membership grade of each pixel to each centre, as in FCM clustering [14].
12. Calculate the centre positions using the FCM centre update, in which each centre is the average of all pixels weighted by their membership grades.
In order to ensure a better clustering process, the basic concept of the EKM algorithm suggests that each cluster must have a strong relationship between the members and their centre so that the differences between them are minimized. Once the final centres produced by the EKM algorithm are obtained, they are fed as initial values to the FCM algorithm for further clustering. In the FCM algorithm, the fuzziness concept allows each data point to be assigned a membership grade indicating its degree of belonging to each centre simultaneously. However, in the proposed cascaded EKM and FCM clustering algorithms, steps 11 and 12 are performed once rather than repeated as in the conventional FCM algorithm. This is done in order to test the capability of each single clustering algorithm to produce good initial cluster centres for the FCM algorithm. If good initial cluster centres are used, a set of good final cluster centres will be generated by the FCM algorithm after performing steps 11 and 12. Therefore, the proposed cascaded EKM and FCM clustering algorithms combine the characteristics of both clustering algorithms to produce the best clustering process.
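The final stage of the cascade, a single membership computation followed by a single centre update starting from centres supplied by a first-stage algorithm, can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: it uses the standard FCM formulas, a fuzziness exponent `m = 2.0` and a small `eps` guard against division by zero, none of which are specified in the text, and the function name `fcm_single_pass` is hypothetical.

```python
def fcm_single_pass(pixels, init_centres, m=2.0, eps=1e-12):
    """One FCM membership step and one centre update (no iteration).

    pixels       : flat list of scalar intensities
    init_centres : centres produced by a first-stage algorithm (e.g. KM, MKM or EKM)
    m, eps       : assumed fuzziness exponent and numerical guard (not from the paper)
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in pixels:
        # Distance from this pixel to every centre; eps avoids a zero distance.
        dists = [abs(p - c) + eps for c in init_centres]
        # Standard FCM membership: u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1)).
        u = [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
        memberships.append(u)

    new_centres = []
    for j in range(len(init_centres)):
        # Each centre is the average of all pixels weighted by u_j^m.
        weights = [u[j] ** m for u in memberships]
        new_centres.append(sum(w * p for w, p in zip(weights, pixels)) / sum(weights))
    return new_centres, memberships
```

Because the pass is not iterated, the quality of `new_centres` depends directly on `init_centres`, which is why the text treats this stage as a test of the first-stage algorithm.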

Segmentation of Malaria Parasites
In order to obtain the segmented malaria parasites, the image segmentation process has been conducted based on the procedures proposed in [20]. In [20], these procedures were used to compare the performance of malaria parasite segmentation using different colour components. This study uses similar segmentation procedures to compare the performance of three cascaded clustering algorithms in segmenting the malaria image. The three cascaded clustering algorithms are obtained by cascading KM, MKM and EKM with the FCM clustering algorithm. The procedures for the segmentation of malaria parasites are illustrated in Fig. 1.
Initially, a pre-processing step applies the modified global contrast stretching (MGCS) technique to generate a high-contrast image. Then, segmentation using the three cascaded clustering algorithms is performed on the green component image to isolate the RBCs and background regions, followed by a 7×7 pixel median filter for noise reduction. Finally, the seeded region growing area extraction (SRGAE) algorithm is used to remove large unwanted regions such as artefacts, platelets and RBCs in the image.
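Two of the steps above, extraction of the green component and the 7×7 median filter, are simple enough to sketch with the Python standard library. The sketch assumes an image stored as nested lists of (R, G, B) tuples and a filter window that is clipped at the image border; the paper does not state how borders are handled, and the MGCS and SRGAE steps are not reproduced here. The function names are hypothetical.

```python
from statistics import median_low


def green_channel(rgb_image):
    """Return the green component of an image given as rows of (R, G, B) tuples."""
    return [[pixel[1] for pixel in row] for row in rgb_image]


def median_filter(image, size=7):
    """Median filter with a size x size window (assumed to be clipped at the borders)."""
    half = size // 2
    height, width = len(image), len(image[0])
    filtered = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the neighbourhood, keeping only coordinates inside the image.
            window = [image[yy][xx]
                      for yy in range(max(0, y - half), min(height, y + half + 1))
                      for xx in range(max(0, x - half), min(width, x + half + 1))]
            # median_low returns an actual pixel value even for even-sized windows.
            filtered[y][x] = median_low(window)
    return filtered
```

An isolated bright pixel on a dark background is removed by the filter, which is the noise-reduction behaviour the pipeline relies on after clustering.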

Results and Discussions
The three cascaded clustering algorithms have been proposed in order to combine the advantages of KM, MKM and EKM with those of FCM clustering. Therefore, comparisons among the three cascaded clustering algorithms have been made in order to measure the segmentation performance of each. Figs. 2(a
In order to easily isolate the malaria parasite from the RBCs and background regions, the green component information has been extracted from the enhanced RGB image. Then, FCM and the three cascaded clustering algorithms have been applied to segment the malaria image. Fig. 3 shows the resulting images after applying the clustering algorithms to the green component image.
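The accuracy, sensitivity and specificity used to compare the algorithms can be computed from pixel-wise counts of true and false positives and negatives. The sketch below uses the standard definitions and assumes that each segmented image is compared pixel by pixel with a binary reference mask in which 1 marks a parasite pixel; the comparison procedure actually used in the study is not described in the text, and the function name is hypothetical.

```python
def segmentation_metrics(predicted, reference):
    """Pixel-wise accuracy, sensitivity and specificity for two binary masks."""
    tp = tn = fp = fn = 0
    for pred_row, ref_row in zip(predicted, reference):
        for pred, ref in zip(pred_row, ref_row):
            if pred and ref:
                tp += 1      # parasite pixel correctly segmented
            elif pred and not ref:
                fp += 1      # background pixel wrongly marked as parasite
            elif not pred and ref:
                fn += 1      # parasite pixel missed
            else:
                tn += 1      # background pixel correctly rejected
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against masks with no positive or no negative reference pixels.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity
```

Since parasites usually occupy a small fraction of the image, accuracy and specificity are dominated by background pixels, so sensitivity is the measure most affected by missed parasite pixels.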

Conclusions
This paper has proposed an image segmentation technique for malaria images that cascades two clustering algorithms, EKM and FCM. Overall, the EKM algorithm gave the best performance compared with the KM and MKM algorithms when all three algorithms were cascaded with the FCM algorithm. In addition, cascading two clustering algorithms achieved a better segmentation result than using a single clustering algorithm. Here, the segmentation performance of the cascaded EKM and FCM algorithms using 3 clusters was found to be slightly more effective than that of the EKM algorithm alone, with average segmentation accuracy, sensitivity and specificity of 99.22%, 88.84% and 99.56%, respectively. These findings show that the proposed cascaded EKM and FCM clustering algorithms have good potential for malaria image segmentation compared with the other segmentation techniques.

MATEC Web of Conferences 150, 06037 (2018), MUCET 2017
https://doi.org/10.1051/matecconf/201815006037

Steps 6, 7, 9 and 10 of the cascaded EKM and FCM clustering algorithm:
6. Find c_s and c_l, the centres that have the smallest and the largest variance between the members and their centre.
7. If σ²(c_s) < α_a σ²(c_l):
a. Find c_n, the nearest centre to c_l, by calculating the Euclidean distance between c_l and the other centres.
b. If c_n < c_l, assign the members of c_l to the nearest centre if p(x,y) < c_l, where x,y ∈ c_l, and leave the rest of the members to c_l.
c. Else if c_n > c_l, assign the members of c_l to the nearest centre if p(x,y) > c_l, where x,y ∈ c_l, and leave the rest of the members to c_l. Note: The transferring process of c_l members in (b) and (c) is performed in order to reduce the variance of c_l, so that it is kept at a reasonable value.
9. Reassign all pixels to the nearest centre and recalculate the centre positions using Eq. 3.
10. Update α_a and α_b according to α_a = α_0 and α_b = α_b − α_b / n_c, respectively, and repeat steps 5 to 9 until σ²(c_s) ≥ α_b σ²(c_l).

Table 1. Segmentation performances for the segmented malaria images.