An Overlapping and Integral Blurry Evaluation Method to Optimize Tissue Scanning with a Microscope

Virtual microscopes are devices that employ an automated XYZ mechanism to scan a sample, producing a series of small pictures that, when merged, compose a high-quality representation of the specimen. Due to assembly tolerances, these devices may suffer from zones that are out of focus, reducing the quality of the final image. To solve this problem, researchers employ evaluation methods that calculate the blurriness of the image and, when an out-of-focus picture is located, trigger the autofocus process. Because of the variation in the types of samples, especially in pathology, the existing evaluation methods may fail to deliver proper blur detection. This article proposes an optimized algorithm that detects blurriness while the sample scan is conducted in real time, ensuring that every scanned picture is in focus. For this purpose, the algorithm relies on two functions: the comparison of the overlapping zones of two consecutive images, and a multivariate linear regression over a series of focus functions. The algorithm proved to be a reliable tool when applied to different pathology samples.


Introduction
Whole-Slide Imaging (WSI) is a procedure that permits the digitization of pathology samples to obtain high-resolution images. Owing to the many applications of this technology, it has become a key factor in the innovation of digital pathology [1], leading to the construction of different microscopes based on this approach [2][3][4][5][6][7][8][9][10][11]. Unfortunately, there are limitations to the implementation of WSI for clinical use, among them its high cost due to the use of specialized software, hardware, and maintenance.
Various microscopes have been proposed to tackle this drawback. Notwithstanding, the proposed low-budget devices decrease the image quality of the sample: the tilt of the mechanical stage, produced by larger assembly tolerances, causes blurriness in small zones of the WSI. This inconvenience, although on a lesser scale, is still an issue in expensive microscopes [12]. Different researchers have proposed algorithms to detect the blurry zones and to conduct the autofocus of the microscope. The detection of the blurry zones is accomplished by selecting an adequate focus measure. Many subroutines were found in the literature to calculate the focus, such as the Tenengrad function, the energy gradient function, the Brenner function, and the entropy function [13][14][15][16][17][18][19][20][21]. The evaluation method should be selected based on its performance, unbiasedness, and signal-to-noise ratio [11][18][22]. Unfortunately, the wide variety of tissues makes the selection among those functions difficult, as for some samples the functions can report false blurriness.
To increase the generality of the evaluation function, we propose a new algorithm, the overlapping and integral blurry evaluation method (BIEMO). This method integrates the contrast measure function, the variance function, the lens distance, the Laplacian variance, and the Brenner function to calculate the focus of the microscope. Furthermore, since WSI is based on capturing a series of small pictures and then stitching them together, we also use this stitching process as a second evaluation function. The stitching process merges two subsequent images by calculating the feature difference between their overlapping regions. Thus, we propose the use of this overlapping region as a means to evaluate the degree of blurriness of two images.

Functions and correlations
Let us consider the WSI image, which is divided into a set of small pictures C(i), each in gray-scale. Then, for each pair of adjacent photos, a series of factors is considered to determine whether an image is out of focus. These factors are the key-points matcher Km, the data variance σ, Brenner's high (Bhq) and low (Blq) quality scores, the contrast measure function Cm, the Laplacian variance of the matrix LV, and the distance between the camera and the sample Fd. These functions are defined in the following section.

Functions definition
The key-points matcher Km is the function utilized for the overlapping. Given two images, the function searches for interest points to locate the area that is coincident in both pictures, returning a number that quantifies the percentage of coincidence. The key-points are calculated with the AKAZE key-point detector and the Flann-based matcher. Due to the complexity of those functions, the reader is referred to [23].
The mean and the variance functions are both expressions used to evaluate the blurriness of the image (in the next sections we will use N and M as the dimensions of the matrix, where N is the number of columns and M the number of rows). They are presented in the following: Brenner's high (Bhq) and low (Blq) quality scores calculate how much of an image has high and low quality, respectively; we then compute the ratio Bhq/Blq. The following equation defines both quality scores: The main feature of this Brenner coefficient is that when the image is focused its value is less than 1, and when it is not focused it is greater than or equal to 1. Figure 1 has a Brenner coefficient of 0.59.
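A minimal numpy sketch of these measures may help. The mean and variance follow the standard definitions over the N x M matrix; for Brenner we show the classical gradient (sum of squared differences between pixels two columns apart). The split of the per-pixel responses into the high- and low-quality scores Bhq and Blq follows the paper's equation and is not reproduced here.

```python
import numpy as np

def image_mean(img):
    # Mean over the N x M gray-scale matrix.
    return float(np.mean(img))

def image_variance(img):
    # Data variance sigma: average squared deviation from the mean.
    return float(np.var(img))

def brenner_gradient(img):
    """Classical Brenner focus measure: sum of squared differences
    between pixels two columns apart. The paper's Bhq/Blq scores
    threshold these per-pixel responses; that split is not shown."""
    f = img.astype(np.float64)
    diff = f[:, 2:] - f[:, :-2]
    return float(np.sum(diff ** 2))
```

Defocus smooths the image, which lowers both the variance and the Brenner response; this is what makes the Bhq/Blq ratio sensitive to blur.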


The Laplacian variance (equation 5) highlights the edges of the image using the gradients in both directions of the Cartesian plane. We rely on this measure to calculate how blurry the photo is: if we cannot differentiate the edges of the image, the picture is out of focus. Consider Figure 2, which has a Laplacian variance of 118.68 and a normal variance of 1004.65. Because the difference between the two measures is so large, we conclude that the picture is out of focus.
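A minimal numpy sketch of the Laplacian variance, assuming the standard 3x3 Laplacian kernel (applying cv2.Laplacian and then .var() computes the same quantity):

```python
import numpy as np

def laplacian_variance(img):
    """Variance of the response to the 3x3 Laplacian kernel
    [[0,1,0],[1,-4,1],[0,1,0]], computed with array shifts over the
    interior of the image (borders excluded)."""
    f = img.astype(np.float64)
    lap = (f[:-2, 1:-1] + f[2:, 1:-1] + f[1:-1, :-2] + f[1:-1, 2:]
           - 4.0 * f[1:-1, 1:-1])
    return float(np.var(lap))
```

A sharp image produces strong edge responses and hence a high Laplacian variance; smoothing collapses the responses toward zero even when the plain intensity variance stays high, which is exactly the gap observed in Figure 2.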
To calculate the contrast measure Cm, we use the Sobel operators. The Sobel operators are a measure of the robustness of the image to noise. We compute two gradients of the image, Gx and Gy, by applying the following kernels, respectively.
Then, applying the norm to the gradients, we obtain the contrast measure with the following equation:
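A numpy sketch of this step, assuming the common 3x3 Sobel kernels; aggregating the gradient norm with the mean is our assumption, as the paper's exact equation for Cm is given above.

```python
import numpy as np

def sobel_contrast(img):
    """Contrast measure Cm: mean magnitude of the Sobel gradients.
    Using the mean to aggregate is an assumption for illustration."""
    f = img.astype(np.float64)

    def apply3x3(f, k):
        # Valid 3x3 filtering written with array shifts.
        out = np.zeros((f.shape[0] - 2, f.shape[1] - 2))
        for i in range(3):
            for j in range(3):
                out += k[i, j] * f[i:i + out.shape[0], j:j + out.shape[1]]
        return out

    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T  # vertical-gradient kernel
    gx = apply3x3(f, kx)
    gy = apply3x3(f, ky)
    # Norm of the two gradients at each pixel, then aggregated.
    return float(np.mean(np.sqrt(gx ** 2 + gy ** 2)))
```

A flat region yields Cm = 0, while edges and texture raise it, so the measure drops as defocus washes the edges out.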

Multivariate linear regression
We utilize the functions defined in section 2 to obtain a multivariate linear regression model that predicts the focus of the picture. We picked those functions based on an ANOVA analysis, which showed that these evaluation methods had the greatest correlation with the blurriness of the image. The equation of the model is the following: Where a value of y close to 1 means that the image is focused; otherwise, the image is blurred. We obtained the previous equation by fitting the linear regression on a dataset of 100 images. Moreover, to prove the effectiveness of the model, we calculated the coefficient of determination R², which yielded a value of 0.98, implying that the equation fits the blurriness well. Figure 4 presents the pseudo-code of the BIEMO algorithm. The algorithm starts at the first position, C i−1, then goes to the next point, C i (with i starting at 1). With both images in gray-scale, the algorithm divides each image into ten equally spaced parts. Then, the algorithm compares the last division area (LDA) of the image C i−1 with the first division area (FDA) of the image C i; Figure 3 describes the process. The FDA and LDA are used to calculate Km; if this value is below 0.001, the microscope must conduct autofocus; otherwise, the microscope goes to the next image C i+1. If the value of i is less than the total number of images n, the algorithm calculates i mod 6. If this operation equals 0, the algorithm employs equation 8; then, if this value is less than or equal to 0.5, the microscope conducts autofocus. In the opposite case, the algorithm returns to step 2. The i mod 6 step was implemented to reduce the use of the regression (decreasing the computational cost) and to avoid mixing different zones of tissue, in the case of pathological samples.
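The scan loop just described can be sketched as follows. The matcher and the regression model are passed in as callables: the fitted coefficients of equation 8 are not reproduced here, so `regression_fn` is a stand-in for the trained model, an assumption for illustration only.

```python
import numpy as np

def biemo_scan(images, km_fn, regression_fn,
               km_threshold=0.001, focus_threshold=0.5):
    """Sketch of the BIEMO loop: compare the overlap strips of each
    pair of adjacent images, and run the regression every 6th image.

    images        -- list of gray-scale images (2-D arrays)
    km_fn         -- key-points matcher applied to the (LDA, FDA) strips
    regression_fn -- stand-in for equation 8 (y close to 1 => focused)
    Returns the indices where autofocus must be triggered.
    """
    autofocus_at = []
    for i in range(1, len(images)):
        prev, curr = images[i - 1], images[i]
        # Ten equally spaced vertical strips: compare the last strip
        # of the previous image with the first strip of the current one.
        w = prev.shape[1] // 10
        lda = prev[:, -w:]
        fda = curr[:, :w]
        if km_fn(lda, fda) < km_threshold:
            autofocus_at.append(i)
            continue
        # Every 6th image, double-check the focus with the regression.
        if i % 6 == 0 and regression_fn(curr) <= focus_threshold:
            autofocus_at.append(i)
    return autofocus_at
```

Because the regression runs only when i mod 6 = 0, the per-image cost of the loop is dominated by the cheap strip comparison, which matches the computational-cost argument above.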

Experiment and analysis
We captured six sequential images to test the validity and feasibility of the algorithm. To cover different scenarios, some of the pictures were out of focus and others lay in the transition zones between different tissues.
For the analysis we use two matrices: a state matrix (Tables 1 and 3) and a comparison matrix (Tables 2 and 4). In the analysis, we deliberately replace a focused image with a blurred one in order to see how much the number of key-point matches decreased. Two different results were found (Tables 1-4). The proportion reported in the tables is the relation between the number of key-point matches and the sum of the areas of the last division area of image 1 and the first division area of image 2.
From the results obtained in Tables 1-4, we conclude that any change in the blur level of the study area will make the algorithm trigger the autofocus process. In Table 2 we can see a proportion colored in red, as in Table 4; it means that in Comparison matrix 1 the proportion of key-point matches decreased by 94% with respect to Comparison matrix 2, due to the blurred image introduced in the first experiment (replacing the focused version with the blurred version). In this way, if we continue analyzing in groups of six images, we obtain high performance and a lower computational cost, because the regression model is used once every six images instead of once per image, while these two factors still allow us to detect in real time when the camera goes out of focus.

Conclusions
This article introduced a new algorithm for focusing tissue scanning in real time with low computational cost and high performance. According to the results shown above, the two factors used to decide whether or not to perform autofocus are highly relevant: they explain the behavior of the focus based on the data of two adjacent images and on groups of six images. There is no particular restriction on the use of the algorithm, the implementation is straightforward, and the precision is excellent.