A Review of Image Segmentation Methods for Lung Nodule Detection Based on Computed Tomography Images

Abstract: The detection and segmentation of lung nodules in computed tomography (CT) images is a basic and significant step towards robotic needle biopsy. In this paper, we review some typical segmentation algorithms, including thresholding, active contour, differential operator, region growing and watershed. To analyse their performance on lung nodule detection, we applied them to four CT images of different kinds of lung nodules. The results show that thresholding, active contour and differential operator do well in the segmentation of solitary nodules, while region growing has an advantage over the others in segmenting nodules adhering to vessels. For the segmentation of semi-transparent nodules, the differential operator is an especially suitable choice. Watershed can segment nodules adhering to vessels and semi-transparent nodules well, but it has low sensitivity on solitary nodules.


INTRODUCTION
Lung cancer has long been the most common cancer. It often appears as a pulmonary nodule on a chest radiograph. Today, most nodules are detected by CT with a detection rate of 40-60% [1]. However, as shown in Fig. 1, lung nodules vary significantly in size and location, and many other diseases can produce a similar appearance, including metastatic cancer, hamartomas, and infectious granulomas such as tuberculosis, histoplasmosis and coccidioidomycosis [2]. In most cases, needle biopsy is necessary to determine whether a lung nodule is cancerous. Manual biopsy is relatively dangerous, as patients with pulmonary diseases usually cough badly and can hardly hold their breath for a few tens of seconds. With the development of CT technology and surgical robots, robotic needle biopsy, which can automatically follow the position of the lung nodule while the patient is coughing, is expected to be applied in surgery.
The basis of robotic needle biopsy is CT segmentation, which digitizes the information about lung nodules. Existing segmentation algorithms can mainly be divided into two categories: region-based and edge-based. Region-based segmentation relies on the similarity or homogeneity between pixels. Thresholding [3], watershed [4] and seeded region growing [5] are the most widely used region-based methods. Edge-based segmentation detects the edges and contours in the image, assuming that different objects are separated by edges. Some popular edge-detection algorithms are based on differential operators, including Sobel, Roberts, Prewitt, Laplacian and Canny, while others, such as the snake model [6] and the level set-based method [7], are based on active contours. In recent years, researchers have applied more specific theories to image segmentation, such as morphology, fuzzy sets, artificial neural networks and the wavelet transform. However, the performance of these algorithms on lung nodule detection remains unknown, considering that lung nodules adhere to vessels and pulmonary alveoli in some cases.

Figure 1. Lung nodule types: (a) solitary nodule; (b) semi-transparent nodule; (c) nodule adhering to vessels; (d) nodule adhering to the pulmonary surface.

In this paper, we review some typical image segmentation algorithms and list some of their applications. We also evaluate the performance of the most widely used segmentation methods for lung nodule detection, so that readers can get a brief idea of which kind of method is better suited to CT images. The rest of the paper is organized as follows: Section II introduces the principles of some typical segmentation methods; in Section III, we evaluate the performance of some typical methods and discuss the results; Section IV is the conclusion.

Thresholding
Thresholding achieves segmentation by turning pixels with intensity higher than a threshold value into white pixels and all other pixels into black pixels. The choice of threshold is the most important factor. The Otsu algorithm is a widely used thresholding method that automatically determines the optimal threshold by maximizing the inter-class variance [3]. Denote the probabilities of the two classes separated by threshold $t$ as $\omega_0$ and $\omega_1$, and their class mean gray levels as $\mu_0$ and $\mu_1$. The overall mean gray level $\mu$ and the inter-class variance $\sigma^2$ can be represented by (1) and (2) respectively.

$$\mu = \omega_0 \mu_0 + \omega_1 \mu_1 \tag{1}$$

$$\sigma^2 = \omega_0 (\mu_0 - \mu)^2 + \omega_1 (\mu_1 - \mu)^2 \tag{2}$$

The threshold maximizing the inter-class variance can be found by traversing all candidate gray levels.
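The traversal described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in the experiments: the function name `otsu_threshold`, the `levels` parameter and the convention that pixels with value at least `t` form the bright class are all assumptions made for this example, and the image is assumed to hold non-negative integer gray levels.

```python
import numpy as np

def otsu_threshold(image, levels=256):
    """Return the gray level t that maximizes the inter-class variance.

    Assumption: pixels < t form class 0 and pixels >= t form class 1.
    """
    hist = np.bincount(image.ravel(), minlength=levels).astype(float)
    prob = hist / hist.sum()            # gray-level probabilities
    gray = np.arange(levels)
    best_t, best_var = 0, -1.0
    for t in range(1, levels):          # exhaustive traversal of thresholds
        w0 = prob[:t].sum()             # class probabilities
        w1 = prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue                    # one class is empty, skip
        mu0 = (gray[:t] * prob[:t]).sum() / w0   # class means
        mu1 = (gray[t:] * prob[t:]).sum() / w1
        mu = w0 * mu0 + w1 * mu1                 # overall mean
        var = w0 * (mu0 - mu) ** 2 + w1 * (mu1 - mu) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

# Usage: binarize an 8-bit image
img = np.array([[10, 10, 200], [200, 10, 200]], dtype=np.uint8)
mask = img >= otsu_threshold(img)
```

When several thresholds give the same variance, this sketch keeps the first one found; other tie-breaking rules are equally valid.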
Thresholding is often combined with other methods. Ying et al. (2009) propose an optimized method combining 2D Otsu and a genetic algorithm to segment lung CT images [8]. Helen (2011) combines 2D Otsu with particle swarm optimization (PSO) to segment pulmonary parenchyma in CT images [9].

Region Growing
Given an initial seed pixel, the region growing method judges whether neighbouring pixels should be included in the region. Generally, every pixel belongs to some region once region growing segmentation is complete. As we only need to find the nodule region for lung nodule segmentation, the method can be simplified as follows. First, we manually choose a pixel inside the nodule region as the initial centre $p(0)$. The algorithm then includes the neighbouring pixels $p(i)$ (2D or 3D) that meet the growing requirement in (3),

$$d(i) \le T, \quad p(i) \in N, \quad i = 1, \dots, n \tag{3}$$

where $N$ is a neighbour pixel list containing $n$ pixels, $T$ is the set threshold and $d(i)$ is the gray scale difference, which can be represented as (4),

$$d(i) = \left| g(p(i)) - g(p(0)) \right| \tag{4}$$

where $g(p)$ represents the grayscale of pixel $p$. Taking the included pixel as the new centre, this step is repeated until no more pixels can be included. If $N$ includes more than one $p(i)$ at a time, the pixel with minimum $d(i)$ is chosen as the seed point.
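A simplified 2D version of this procedure might look like the sketch below. Several details are assumptions made for illustration only: the function name `region_grow`, the 4-connected neighbourhood, and the use of a first-in-first-out queue to visit accepted pixels instead of always picking the neighbour with the smallest gray-level difference. Each neighbour is compared with the pixel currently acting as the centre.

```python
from collections import deque
import numpy as np

def region_grow(image, seed, threshold):
    """Grow a region from `seed` (row, col) and return a boolean mask."""
    img = image.astype(float)
    mask = np.zeros(img.shape, dtype=bool)
    mask[seed] = True
    queue = deque([seed])               # pixels that still act as centres
    while queue:
        r, c = queue.popleft()
        # 4-connected neighbours of the current centre
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < img.shape[0] and 0 <= nc < img.shape[1]
            if not inside or mask[nr, nc]:
                continue
            # growing requirement: gray-level difference within threshold
            if abs(img[nr, nc] - img[r, c]) <= threshold:
                mask[nr, nc] = True
                queue.append((nr, nc))
    return mask

# Usage: a bright 3x3 blob on a dark background
img = np.zeros((5, 5))
img[1:4, 1:4] = 100
nodule = region_grow(img, seed=(2, 2), threshold=10)
```

Extending the neighbour offsets to three dimensions gives the 3D variant; because each pixel is compared with its immediate centre, a slow intensity ramp can let the region leak, which is one reason the threshold needs care.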
Region growing has been widely used in nodule detection. Parveen and Kavitha (2013) use the region growing method to realize automatic segmentation of lung cancer nodules [10]. Wu and Wang (2012) apply 3D region growing to reconstruct the original 3D shape of nodules, which helps doctors distinguish nodules from vessels more effectively [11].

Differential Operator (Canny)
An edge is the result of a discontinuity in grayscale. Derivatives, usually the first and second derivatives, can be used to detect edges. There are several widely used differential operators. Take the Sobel operator as an example. It uses two 3 × 3 kernels to calculate approximations of the horizontal and vertical derivatives, denoted as $G_x$ and $G_y$ respectively. Their computation is shown in (5):

$$G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} * A, \quad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix} * A \tag{5}$$

where $A$ represents an image smoothed by a Gaussian filter and $*$ represents the 2-dimensional signal processing convolution operation. From (5), the edge gradient $G$ and direction $\Theta$ can be determined:

$$G = \sqrt{G_x^2 + G_y^2} \tag{6}$$

$$\Theta = \arctan\left(\frac{G_y}{G_x}\right) \tag{7}$$

However, after the gradient calculation, the edge extracted from the gradient value is still quite blurred. As an improvement, the Canny operator combines the Sobel operator with an edge thinning technique called non-maximum suppression. Since there should be only one accurate response to an edge, non-maximum suppression sets all gradient values to 0 except the local maxima, which indicate the locations with the sharpest change of intensity. Sun and Wang (2011) introduce the principles of the Roberts, Prewitt, Sobel, LoG and Canny operators and compare the differences between them [18].
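The Sobel gradient step can be illustrated with the following NumPy sketch. It covers only the gradient magnitude and direction, not the Gaussian smoothing or the non-maximum suppression of the full Canny operator. The names `filter3x3` and `sobel_gradient` are hypothetical, and the helper applies the kernels by cross-correlation over the valid region; compared with true convolution this only flips the sign of both derivatives, so the magnitude is unchanged while the direction is rotated by π.

```python
import numpy as np

# Sobel kernels for the horizontal and vertical derivatives
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def filter3x3(image, kernel):
    """Cross-correlate `image` with a 3x3 kernel (valid region only)."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * image[i:i + h - 2, j:j + w - 2]
    return out

def sobel_gradient(image):
    """Return gradient magnitude and direction (radians)."""
    img = image.astype(float)
    gx = filter3x3(img, KX)             # horizontal derivative
    gy = filter3x3(img, KY)             # vertical derivative
    return np.hypot(gx, gy), np.arctan2(gy, gx)

# Usage: a vertical step edge between columns 2 and 3
img = np.zeros((5, 6))
img[:, 3:] = 1.0
magnitude, direction = sobel_gradient(img)
```

In this example the response is spread over two neighbouring columns, which shows the blurring that edge thinning is meant to remove.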

Active Contour (Snake Model and Level Set Model)
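The snake model is a typical parametric active contour model. After the initial contour is manually set, the deformable curve $v(s)$ can be represented as:

$$v(s) = (x(s), y(s)), \quad s \in [0, 1] \tag{8}$$

The energy function is defined as (9), where the first two terms denote the internal energy and the last term denotes the external energy. $\alpha(s)$ is the spring factor and $\beta(s)$ is the strength coefficient.

$$E = \int_0^1 \left[ \frac{1}{2} \left( \alpha(s) \left| v'(s) \right|^2 + \beta(s) \left| v''(s) \right|^2 \right) + E_{ext}(v(s)) \right] ds \tag{9}$$

Finally, the segmentation of an image reduces to finding the minimum of the energy function.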
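To make the energy concrete, the sketch below evaluates a discretized snake energy for a closed contour given as a list of points. It is an illustration under stated assumptions, not a full snake solver: the function name `snake_energy` is hypothetical, the spring factor and strength coefficient are taken as constants `alpha` and `beta`, the derivatives are replaced by finite differences between neighbouring points, and the external energy is passed in as values already sampled at the contour points.

```python
import numpy as np

def snake_energy(points, external, alpha=1.0, beta=1.0):
    """Discrete snake energy of a closed contour.

    points   : (n, 2) array of contour coordinates
    external : (n,) array of external energy sampled at each point
    """
    v = np.asarray(points, dtype=float)
    nxt = np.roll(v, -1, axis=0)        # v[i+1], wrapping around
    prv = np.roll(v, 1, axis=0)         # v[i-1], wrapping around
    d1 = nxt - v                        # first-derivative approximation
    d2 = nxt - 2.0 * v + prv            # second-derivative approximation
    internal = 0.5 * (alpha * (d1 ** 2).sum(axis=1)
                      + beta * (d2 ** 2).sum(axis=1))
    return float(internal.sum() + np.sum(external))

# Usage: unit square with no external force
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
energy = snake_energy(square, external=np.zeros(4))
```

With zero external energy a smaller contour always has lower energy, so the internal terms alone shrink the snake; the image-derived external term is what stops it at the object boundary.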
Furthermore, in order to handle the automatic splitting and merging of the contour, the geometric active contour was proposed. A typical example is the level set model. To describe the deformation of a boundary, the level set model defines the boundary as the intersection of a surface $\phi$ and the zero level plane. The surface is updated with a force derived from the image. A point $x(t)$ on the intersection at time $t$ satisfies (10).

$$\phi(x(t), t) = 0 \tag{10}$$

The geometric deformation is only concerned with the force $F$ normal to the surface.
Mirderikvand (2016) presents a combination of Graph Cut and active contour (snake) modelling for CT scan image segmentation [19]. Soltaninejad (2016) combines adaptive concave hulls with active contours for segmentation, which is especially useful for the automated detection of juxta-pleural pulmonary nodules [20]. Liu and Li (2006) propose a calibrating mechanism to automatically weight image intensity and gradient information in the level set speed function, which gives stable and accurate segmentation of complex organic structures, including the lung [21]. Farag et al. (2013) propose a general lung nodule shape model and use level sets for segmentation, overcoming the problem of nodules attached to the lung walls and vessels [21].

EXPERIMENTS
To evaluate the performance of the reviewed segmentation methods, we applied five of them, namely the Otsu thresholding algorithm, region growing, watershed, the Canny differential operator and active contour, to four different kinds of nodules. The segmentation results are shown in Fig. 2. The results show that all segmentation methods perform well on nodules without adhesion, but poorly on semi-transparent nodules and nodules adhering to the pulmonary surface.
The segmentation methods are evaluated on three measures: sensitivity, specificity and F-measure. Sensitivity measures the proportion of positives that are correctly identified as such. The Canny differential operator has the highest sensitivity for all categories of nodules, especially semi-transparent nodules (0.9025). Watershed always has low sensitivity, in other words, a high false negative rate. Specificity measures the proportion of negatives that are correctly identified as such. All segmentation methods have high specificity, larger than 0.99. The F-measure is the harmonic mean of precision and recall. It is especially low for nodules adhering to the pulmonary surface, lower than 0.6, which shows that the tested segmentation methods are not suitable for this kind of nodule. The methods also have a low F-measure for semi-transparent nodules, except for the Canny differential operator (0.9181) and watershed (0.8024).
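The three measures can be computed pixel-wise from a predicted mask and a reference mask, as in the sketch below. The function name `segmentation_scores` and the returned dictionary keys are choices made for this example, and the masks shown are toy data rather than results from the experiments. Undefined ratios (zero denominators) are reported as NaN.

```python
import numpy as np

def segmentation_scores(pred, truth):
    """Pixel-wise sensitivity, specificity and F-measure of two masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))      # nodule pixels found
    tn = int(np.sum(~pred & ~truth))    # background pixels kept out
    fp = int(np.sum(pred & ~truth))     # background marked as nodule
    fn = int(np.sum(~pred & truth))     # nodule pixels missed

    def ratio(num, den):
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)    # also called recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    f_measure = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {"sensitivity": sensitivity,
            "specificity": specificity,
            "f_measure": f_measure}

# Usage with toy masks (1 = nodule pixel)
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0]
scores = segmentation_scores(pred, truth)
```

Because nodules occupy a small fraction of a CT slice, true negatives dominate and specificity stays close to 1 for almost any method, which is why sensitivity and F-measure are the more informative of the three.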

CONCLUSIONS
In this paper, we reviewed some basic region-based and edge-based segmentation methods for lung nodules in CT images. Their segmentation quality was evaluated on four typical kinds of nodules. The thresholding algorithm has a high false negative rate on semi-transparent nodules, which can be accurately segmented by the Canny differential operator, and also on nodules adhering to the pulmonary surface. Active contour has average performance. The watershed algorithm has a high false negative rate on solitary nodules and nodules adhering to vessels, while region growing performs well on the latter kind of nodule.
To achieve robotic needle biopsy, we need an accurate and real-time lung nodule segmentation method. In this respect, active contour and region growing are unsuitable, because they require manual selection of the initial contour or seed point. The differential operator is not efficient enough for robotic needle biopsy, since its high sensitivity requires an effective filter to remove noise. Although the thresholding and watershed algorithms have a high false negative rate in some cases, they are suitable, considering that the exclusion of part of a nodule will not affect the accuracy of robotic biopsy. Nodules adhering to the pulmonary surface cannot be accurately segmented by the above-mentioned methods. Furthermore, a combination of the methods is expected to perform better. In the future, machine learning can be included to achieve higher segmentation accuracy for different kinds of nodule CT images.