Color feature extraction of HER2 Score 2+ overexpression on breast cancer using Image Processing

One of the major challenges in developing early diagnosis based on HER2 status lies in the gold-standard methods themselves. The accuracy, validity, and reproducibility of the gold-standard HER2 methods widely used in laboratories remain in question (Perez, et al., 2014). Methods for determining the status of HER2 (human epidermal growth factor receptor 2) are affected by reproducibility problems and are not reliable in predicting the benefit from anti-HER2 therapy (Nuciforo, et al., 2016). We extracted color features using statistics-based segmentation with a continuous-scale naïve Bayes approach. The study comprised three main groups of stages: image acquisition, image segmentation, and image testing. Image acquisition consisted of image data collection and color deconvolution. Image segmentation consisted of color feature calculation, classifier training, classifier prediction, and skeletonization. Image testing consisted of image testing, expert validation, and the expert validation results. The segmented membrane area contains false positives and false negatives; together these are called the area of system failure. Expert validation showed that the failures were segmented regions that are not HER2 membrane (noise) and segmented cytoplasm regions. Averaged over 40 HER2 score 2+ membrane images, 75.13% of the area was successfully recognized by the system.


Introduction
Accurate determination of Human Epidermal Growth Factor Receptor 2 (HER2) status is very important for optimizing the outcome of breast cancer treatment (Perez, et al., 2014). HER2 overexpression in breast cancer is an indicator of poor prognosis, and HER2 status is a prerequisite for treatment targeting the epidermal growth factor receptor (Cresti, et al., 2016). One of the major challenges in developing early diagnosis based on HER2 status lies in the gold-standard methods themselves. The accuracy, validity, and reproducibility of the gold-standard HER2 methods widely used in laboratories remain in question (Perez, et al., 2014). Methods for determining HER2 status are affected by reproducibility problems and are not reliable in predicting the benefit from anti-HER2 therapy (Nuciforo, et al., 2016).
Computer-assisted methods have been studied to improve the accuracy and reproducibility of breast tissue assessment, and image-based assessment of breast tissue specimens has been an active topic in recent years. Brugmann, et al. (2012) developed and validated software that performs IHC scoring automatically based on membrane connectivity in each image. Keller, Chen, and Gavrielides (2012) developed a system that classifies the IHC score based on color features, built using fuzzy c-means clustering in the HSV color space. Hall et al. (2008) produced a system that assigns an IHC score in multiple stages, namely image selection and capture, color decomposition, and a membrane isolation algorithm (MIA), with three features based on the average intensity of the membrane area. Masmoudi

A. Color Deconvolution
Color deconvolution is needed to apply the membrane extraction method. The approach has been used in previous work on this method to separate the suspected regions of cell membrane overexpression. Ruifrok
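As an illustration of the idea, color deconvolution can be sketched as follows: each RGB pixel is converted to optical density and then projected onto a set of stain vectors. The stain vectors below are the hematoxylin, eosin, and DAB values commonly quoted for this technique; they are an assumption for this sketch and are not taken from this study. The function names are hypothetical.

```python
import math

# Assumed stain vectors (rows: hematoxylin, eosin, DAB), RGB order.
# These are commonly quoted values, not parameters reported in this paper.
STAINS = [
    [0.65, 0.70, 0.29],
    [0.07, 0.99, 0.11],
    [0.27, 0.57, 0.78],
]

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def invert3(m):
    """Invert a 3x3 matrix using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("stain matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]

M = [normalize(row) for row in STAINS]
M_INV = invert3(M)

def optical_density(rgb):
    """Convert 0..255 intensities to optical density (white maps to 0)."""
    return [-math.log10((v + 1) / 256.0) for v in rgb]

def deconvolve(rgb):
    """Return the [hematoxylin, eosin, DAB] amounts for one RGB pixel."""
    od = optical_density(rgb)
    # od is a row vector: amounts = od * M^-1
    return [sum(od[k] * M_INV[k][j] for k in range(3)) for j in range(3)]
```

In IHC images the membrane staining of interest would appear mainly in the DAB channel, so thresholding the third returned value is one possible way to obtain candidate membrane pixels.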

B. Calculation of Color Feature
The pixel values were obtained from the clustering produced by the color deconvolution method and were used as the criteria for locating pixel coordinates. The RGB values of each layer were then taken to calculate the color features. The chromaticities (r, g, b) are given by each raw color value divided by the sum of the three color values (Laursen, et al., 2014).
The excess green (ExG) and excess red (ExR) indices were also used (Meyer and Neto, 2008).
The characteristic assumptions of the color features include probability functions, which can be written as equation (10).
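The color features named in this subsection (chromaticities, ExG, and ExR) can be computed per pixel as in the following minimal sketch. The ExG and ExR expressions used here (2g - r - b and 1.4r - g) are the forms usually associated with Meyer and Neto (2008) and are an assumption, as are the function name and the convention of returning zero chromaticities for a black pixel.

```python
def color_features(R, G, B):
    """Return the eight color features for one pixel with 0..255 values."""
    total = R + G + B
    if total == 0:
        # Assumed convention: a black pixel has zero chromaticities.
        r = g = b = 0.0
    else:
        r, g, b = R / total, G / total, B / total
    exg = 2 * g - r - b   # excess green (assumed form)
    exr = 1.4 * r - g     # excess red (assumed form)
    return {"R": R, "G": G, "B": B,
            "r": r, "g": g, "b": b,
            "ExG": exg, "ExR": exr}
```

For example, `color_features(100, 50, 50)` yields chromaticities of 0.5, 0.25, and 0.25, which always sum to one for any non-black pixel.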
E. Skeletonization
Skeletonization was applied to the membrane extraction results to obtain thin lines that represent the shape of the extracted membrane objects. The skeleton of an object is an important topological description of a two-dimensional binary image and has been used in a wide range of applications (Levine, 1985; Serra, 1982; Pitas & Venetsanopoulos, 1990). The skeleton formula of Serra (1982) can be seen in equation (12), where Sn(X) denotes the part of the skeleton SK(X) corresponding to maximal disks of radius n.
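For reference, a commonly cited form of the morphological skeleton is sketched below. The notation is an assumption and may differ from the authors' equation (12): B is a disk-shaped structuring element, nB is that disk scaled to radius n, \ominus is erosion, \circ is opening, and N is the largest n for which the erosion of X is non-empty.

```latex
\[
SK(X) = \bigcup_{n=0}^{N} S_n(X),
\qquad
S_n(X) = (X \ominus nB) \setminus \bigl[(X \ominus nB) \circ B\bigr],
\qquad
N = \max\{\, n : X \ominus nB \neq \emptyset \,\}
\]
```

Under this form, each subset S_n(X) collects the points of the eroded object that are removed by a further opening, which are the centers of the maximal disks of radius n.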

Research Methodology
We extracted color features using statistics-based segmentation with a continuous-scale naïve Bayes approach. The study comprised three main groups of stages: image acquisition, image segmentation, and image testing. Image acquisition consisted of image data collection and color deconvolution. Image segmentation consisted of color feature calculation, classifier training, classifier prediction, and skeletonization. Image testing consisted of image testing, expert validation, and the expert validation results (see fig. 1).
The HER2 image datasets were obtained from the Histology Laboratory of Pathology, Universitas Muhammadiyah Yogyakarta. The images were captured with a Sigma HD microscope camera (Olympus) from patient preparations at 400x magnification. HER2 staining was done using IHC, which is less expensive than FISH and CISH (Dobson, et al., 2010). The dataset consisted of 40 images, all diagnosed as HER2 score 2+.
Color deconvolution served as an automatic annotation step that separated the suspected membranes from the cells in the image. The results of this annotation were used as the criteria for the object and background color features. The color features were calculated using equation (1), equation (2), and equation (3), and the resulting values were processed by classifier training and classifier prediction. The features formed an eight-layer image: red (R), green (G), blue (B), excess green (ExG), chromaticity red (r), chromaticity green (g), chromaticity blue (b), and excess red (ExR). The results of classifier training and classifier prediction were then skeletonized. The skeletonization supported the experts in justifying the membrane overexpression criteria.
We used the average threshold pixel value of the image as the basis for the experts' justification. The final images were then validated by experts, who determined the percentage of membrane successfully segmented by the system model.
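The classifier training and classifier prediction steps can be illustrated with a minimal Gaussian (continuous-valued) naïve Bayes sketch. It assumes that each color feature is modeled by an independent normal distribution per class, which is one common reading of a continuous-scale naïve Bayes approach; the function names, class labels, and toy data are hypothetical and not taken from the study.

```python
import math

def fit(samples, labels):
    """Estimate the prior, feature means, and feature variances per class."""
    model = {}
    for c in set(labels):
        rows = [s for s, y in zip(samples, labels) if y == c]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps the variance strictly positive.
        variances = [sum((x - m) ** 2 for x in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[c] = (n / len(samples), means, variances)
    return model

def log_posterior(model, x, c):
    """Unnormalized log posterior of class c for feature vector x."""
    prior, means, variances = model[c]
    lp = math.log(prior)
    for xi, m, v in zip(x, means, variances):
        lp += -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
    return lp

def predict(model, x):
    """Return the class with the highest log posterior."""
    return max(model, key=lambda c: log_posterior(model, x, c))

# Toy training data: two features per pixel (for example r and g).
samples = [[0.45, 0.30], [0.47, 0.31], [0.44, 0.29],
           [0.34, 0.33], [0.33, 0.34], [0.35, 0.33]]
labels = ["membrane"] * 3 + ["background"] * 3
model = fit(samples, labels)
```

In practice the feature vector would hold all eight layers per pixel, and the labels would come from the color deconvolution annotation rather than from hand-typed values.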
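One way to express the validated percentage is sketched below, assuming that the failure area of an image is the sum of its false-positive and false-negative areas and that success is the remaining share of the reviewed area. This formula, the function names, and the pixel counts are illustrative assumptions, not the authors' reported procedure or data.

```python
def success_percentage(reviewed_area, false_positive, false_negative):
    """Share of the reviewed area (in pixels) not marked as system failure."""
    if reviewed_area <= 0:
        raise ValueError("reviewed_area must be positive")
    failure = false_positive + false_negative
    return 100.0 * (reviewed_area - failure) / reviewed_area

def mean_success(records):
    """Average the per-image success percentages."""
    scores = [success_percentage(*rec) for rec in records]
    return sum(scores) / len(scores)

# Made-up pixel counts for two images: (reviewed, false positive, false negative)
toy_records = [(1000, 150, 100), (2000, 200, 200)]
```

Averaging per image, as done here, weights every image equally regardless of how much membrane it contains.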

Results
This section presents the results of the proposed method. Figure 2 is an example of the implementation of the system model, which was designed from the datasets. Ten samples from the image datasets were used for the modeling system. The end results were validated by physicians to determine the percentage accuracy of the membrane segmentation produced by the system model. Figure 2 (a) is the original image without image processing, as captured by the Sigma HD microscope camera (Olympus). Figure 2 (b) shows the results of color deconvolution, calculation of color features, classifier training, and classifier prediction. Figure 2 (c) is the result of applying skeletonization to the image. Figure 2 (d) is the final image, which was validated by experts to assess the percentage of membrane segmented by the system model.
The membrane features of the microscopic images of breast cancer cells stained with immunohistochemistry (IHC) are contained in the distribution of the RGB image layers. A sample histogram of the RGB layer values for the membrane can be seen in figure 3. The x-axis is the range of pixel intensity values, from 0 to 255 (grayscale). The y-axis represents the number of pixels in the image at each intensity. Figure 3 is a sample of the RGB histogram distribution for HER2 score 2+. The results are compared with the variation in membrane staining intensity of a positive control specimen. In the resulting distribution, the HER2 score 2+ membrane values are in the middle of the range