A probability approach to can identification

The objective of this study is to classify can waste into three types from images, using a probability approach based on the trinomial distribution in regression form. The predictor variables are the red, green, and blue color intensities of images taken in the top, down, and side poses, respectively. An independence test between each predictor variable and the can waste type shows that only the red color intensity of the top-pose image is not associated with the can waste type. According to the Nagelkerke value, the predictor variables explain 59.1 percent of the variance in the can waste type. The final model shows that the significant predictor variables are the green and blue color intensities of the top-pose image, the red color intensity of the down-pose image, and the red, green, and blue color intensities of the side-pose image. The model correctly identifies the type of can waste from the images in 73.13% of cases.


Introduction
Aluminum cans are the most widely recycled solid waste in the world. Making them takes a lot of energy: throwing away one aluminum can wastes as much energy as pouring out half a can of gasoline, and an aluminum can thrown into a landfill takes between 100 and 500 years to decompose [1]. In recent years, solid waste management, especially recycling based on automatic sorting systems, has become a major challenge for large cities in developing countries [2].
A can is a metal container designed to hold a fixed portion of liquid such as a carbonated soft drink, fruit juice, herbal tea, and the like. Cans are made of aluminium (75% of international manufacture) or tin-plated steel (25% of international manufacture). Worldwide production of all beverage cans is approximately 52 billion units. An aluminium can is a packaging container made primarily of aluminium. It is commonly used for foods and beverages, but also for products such as oil, chemicals, and other liquids. A common 12-ounce can weighs 15 grams when empty.
Identifying can waste is a necessary step in an automatic sorting system for can recycling. Several approaches have been applied to this task, both parametric, such as discriminant analysis, and non-parametric, such as the back-propagation algorithm. An application of discriminant analysis to can waste identification can be found in [4], and an implementation of the back-propagation algorithm in [3].
Research on identification systems using automatic or semi-automatic techniques started in the 1960s and has intensified in the last two decades. With the increased availability of powerful computers, it has received significant attention from the research community. Current automatic systems perform well when the object images are captured under uniform and controlled conditions. However, these technologies have several limitations: they are difficult to develop, and they are sensitive to the quality of the captured image and to lighting problems.
An identification system is generally built on the choice of a color space in which the color-based identification method operates. Object identification is a feature-based approach that uses the geometry of the object's face, including shape and other features such as size, edge, and color. The algorithm requires 2-D images; intensity threshold values are used to count the pixels that make up the entire object feature area.
Trinomial logistic regression is another parametric approach. It is often considered in analyses because it does not assume normality, linearity, or homoscedasticity, whereas discriminant analysis requires these assumptions. In [10], trinomial logistic regression was applied to classify the health level of BUMN (Indonesian state-owned enterprises) based on financial ratios; in [9], to classify the severity of traffic accident victims based on the characteristics of the victims, the types of vehicles, and the characteristics and time of the collision; and in [8], under the name polynomial logistic regression, to classify third-party motor insurance risk. Applications of trinomial logistic regression in other fields can be found in [11], [12], and [13]. In this paper, trinomial logistic regression is applied to classify can waste into three types based on several images.
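The pixel-counting step mentioned above can be sketched as follows. This is only a minimal illustration, assuming a grayscale image held in a NumPy array and an object that is brighter than its background; the function name `object_area` and the threshold value are hypothetical and do not come from the original system.

```python
import numpy as np

def object_area(gray, threshold):
    """Count the pixels whose intensity exceeds `threshold` (assumed object pixels)."""
    return int((gray > threshold).sum())

# Synthetic 4 x 4 image: dark background with a bright 2 x 2 object.
gray = np.zeros((4, 4))
gray[1:3, 1:3] = 200.0
area = object_area(gray, threshold=128)
```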

Data collection and description
We collected three types of can waste: 100% recyclable (type 1), not 100% recyclable (type 2), and unrecyclable (type 3). The proportion of each can waste type is presented in Table 1.
Correctly identifying and sorting aluminium cans from non-degradable materials in solid waste requires many facets of an intelligent sorting system. First, for an item to be identified, the amount of information carried or stored by the automatic sorting system should be kept to a minimum. For example, the system must be able to distinguish between aluminium and non-aluminium cans, and to accept new objects for automatic sorting in the future. This ensures that the best methods can be chosen to produce significant and efficient results. We collected 3 pictures of each can. To store the feature values, a database with two tables was created in Excel: one table stores the raw sample image data and the other stores the template data.
An edge detection algorithm based on the Sobel function was applied to extract the contour of the object after noise had been eliminated from the gray image. Figure 4.4 shows the edges of the beverage cans after filtering the global edge.
Each can is placed on a conveyor belt that is connected to a webcam. Images of each can are taken with the webcam in three poses: top, down, and side. Each image is saved in JPEG format at a resolution of 320 x 240 pixels, and the red, green, and blue color intensities of each pose are then extracted using a Matlab command. Of this population, 50% is used as the database (for learning) and the rest for testing.
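A Sobel-based edge map of the kind described here can be sketched in a few lines. The sketch below is an illustration in Python with NumPy, not the implementation used in the study; the helper names, the border handling (edge replication), and the threshold of 100 are assumptions.

```python
import numpy as np

# Standard 3x3 Sobel kernels for the horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def filter3x3(gray, kernel):
    """Correlate a 2-D array with a 3x3 kernel, replicating the border pixels."""
    padded = np.pad(gray.astype(float), 1, mode="edge")
    h, w = gray.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def sobel_edges(gray, threshold=100.0):
    """Boolean edge map: True where the gradient magnitude exceeds `threshold`."""
    gx = filter3x3(gray, SOBEL_X)
    gy = filter3x3(gray, SOBEL_Y)
    return np.hypot(gx, gy) > threshold

# Synthetic 6 x 6 image: dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 255.0
edges = sobel_edges(image)
```

On this synthetic image the edge map marks only the two columns on either side of the dark-to-bright boundary.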
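The feature extraction and the 50/50 split can be illustrated as below. The paper does not say how the color intensity of an image is summarized or how the halves are chosen, so this sketch assumes the mean of each channel over all pixels and a random split; `mean_rgb` and `split_half` are hypothetical names.

```python
import numpy as np

def mean_rgb(image):
    """Mean red, green, and blue intensity of an H x W x 3 image array."""
    return image.reshape(-1, 3).mean(axis=0)

def split_half(n_items, seed=0):
    """Randomly split the indices 0..n_items-1 into learning and testing halves."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_items)
    half = n_items // 2
    return order[:half], order[half:]

# Synthetic frame at the 320 x 240 resolution given in the text, filled with one color.
frame = np.zeros((240, 320, 3), dtype=np.uint8)
frame[..., 0] = 200  # red channel
frame[..., 1] = 100  # green channel
frame[..., 2] = 50   # blue channel

features = mean_rgb(frame)
learn_idx, test_idx = split_half(10)
```

Applying `mean_rgb` to the top, down, and side images of one can would give the nine predictor values for that can under this assumption.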
The image capture stage consisted of image acquisition and image pre-processing steps. The feature extraction stage consisted of cropping and extraction steps: cropping comprised template matching and window feature, while extraction comprised edge and color extraction. Similarity measurement then measures the similarity between the measured features and the values stored in the database. Factors such as lighting conditions, conveyor belt speed, camera gain, contrast and brightness, and the height of the camera above the object may strongly influence the quality and sharpness of the final image; experiments were carried out to find an optimum configuration.
The feature extraction phase considered both color and grayscale images. For a color image, each of the three RGB color components is considered separately. For a grayscale image, the standard grayscale transformation is obtained from the original RGB image. Processing the gray level of a single image is called point processing. Once the raw image has been provided by the webcam and the grabbing routine, the edges are extracted and the shape of the can is identified. This procedure was designed to work even under bad lighting and high image noise, although the experimental conditions may not require all of the following image processing steps.
Image manipulation here means extracting the value and position of each pixel of the image. The image is loaded into a form or picture box using the LoadPicture function. The pixel values of the image are then obtained using the Point method, and the red, green, and blue values are separated from each pixel value. After the necessary operations, the new pixel values are placed at the appropriate positions in a form or picture box using the PSet method and the RGB function. Together, these steps constitute the image manipulation.
Figure: Cropping image
The identification process was handled by the system after each can was automatically detected by the webcam. The features of the captured object were then matched against the can waste data in the database. For can waste identification, a rule-based classification was developed using the energy distribution at different quantization levels.
The main process in developing the database system consisted of image pre-processing, feature extraction, and classification. Precision is defined as the ratio of the number of genuine records retrieved from the template database by the identification system to the total number of templates retrieved. Recall is defined as the ratio of the number of genuine records retrieved from the template database by the identification system to the total number of genuine records in the template database.
Identification is carried out by comparing the values of the image with the values in the database to find the closest values. To determine the accuracy of the identification system, the can images are stored in the database in a standard pose.
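The channel separation and grayscale transformation can be sketched as follows. The paper does not state which grayscale weights were used; the sketch assumes the common ITU-R BT.601 luminance weights, and the function names are hypothetical.

```python
import numpy as np

# Assumed luminance weights (ITU-R BT.601); the weights used in the study are not stated.
GRAY_WEIGHTS = np.array([0.299, 0.587, 0.114])

def split_channels(image):
    """Return the red, green, and blue planes of an H x W x 3 image."""
    return image[..., 0], image[..., 1], image[..., 2]

def to_gray(image):
    """Weighted sum of the three channels, giving an H x W grayscale image."""
    return image.astype(float) @ GRAY_WEIGHTS

# Two-pixel example: one white pixel and one pure red pixel.
image = np.array([[[255, 255, 255], [255, 0, 0]]], dtype=np.uint8)
red, green, blue = split_channels(image)
gray = to_gray(image)
```

Because each output pixel depends only on the corresponding input pixel, this transformation is an instance of point processing.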
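The two definitions translate directly into code. The sketch below is illustrative only; the template identifiers are invented for the example.

```python
def precision_recall(retrieved, genuine):
    """Precision and recall of a retrieval, given the retrieved and the genuine record ids."""
    retrieved, genuine = set(retrieved), set(genuine)
    hits = len(retrieved & genuine)  # genuine records that were actually retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(genuine) if genuine else 0.0
    return precision, recall

# Hypothetical example: 4 templates retrieved, 5 genuine templates in the database.
precision, recall = precision_recall(
    retrieved=["t1", "t2", "t3", "t7"],
    genuine=["t1", "t2", "t4", "t5", "t6"],
)
```

Here two of the four retrieved templates are genuine, so precision is 0.5, and two of the five genuine templates were retrieved, so recall is 0.4.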
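Finding the closest values can be sketched as a nearest-template search. The distance measure is not specified in the text, so Euclidean distance is assumed here, and the template values and labels are made up for illustration.

```python
import numpy as np

def closest_template(features, templates):
    """Index of the template row nearest to `features` (Euclidean distance, assumed)."""
    distances = np.linalg.norm(templates - features, axis=1)
    return int(np.argmin(distances))

# Hypothetical template database: one row of mean R, G, B values per can waste type.
templates = np.array([
    [200.0, 180.0, 170.0],  # type 1
    [120.0, 110.0, 100.0],  # type 2
    [60.0, 50.0, 40.0],     # type 3
])
labels = [1, 2, 3]

query = np.array([125.0, 100.0, 105.0])
predicted_type = labels[closest_template(query, templates)]
```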

Results and discussion
Extracting the red, green, and blue color intensities from the images of each can waste type yields intensity values that vary considerably. The mean and standard deviation of the range of color intensity values for each can waste type are shown in Table 2. Type 1 has the largest mean and standard deviation of the value range, whereas type 3 has the smallest mean.
To identify can waste from the red, green, and blue color intensities, we employed the probability approach, namely the trinomial distribution in regression form as described in Sec. 2. The predictor variables $X_1, X_2, X_3$ denote the red, green, and blue color intensities of the image taken in the top pose; $X_4, X_5, X_6$ denote those of the image taken in the down pose; and $X_7, X_8, X_9$ denote those of the image taken in the side pose, respectively.
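For reference, the usual baseline-category form of a trinomial logistic model with these predictors is written out below. This is a sketch of the standard formulation, not a transcription of the paper's own equations; the symbols $\beta_{jk}$, $g_j$, and $\pi_j$ are notation introduced here, and type 3 is taken as the reference category, as in the model reported in this section.

```latex
\begin{align}
g_j(x) &= \ln\frac{\pi_j(x)}{\pi_3(x)}
        = \beta_{j0} + \sum_{k} \beta_{jk} X_k, \qquad j = 1, 2, \\
\pi_j(x) &= P(Y = j \mid x)
        = \frac{\exp\{g_j(x)\}}{1 + \exp\{g_1(x)\} + \exp\{g_2(x)\}}, \qquad j = 1, 2, \\
\pi_3(x) &= \frac{1}{1 + \exp\{g_1(x)\} + \exp\{g_2(x)\}}.
\end{align}
```

The sum runs over the predictors retained in the model, and $\exp(\beta_{jk})$ is the factor by which the odds of type $j$ relative to type 3 change when $X_k$ increases by one unit.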
We first performed an independence test between each predictor variable and the can waste type. The chi-square test ($\alpha = 5\%$) reveals that only the red color intensity of the top-pose image ($X_1$) is not associated with the can waste type, so this variable is excluded from the trinomial regression model used to identify the type. The G statistic, which tests the remaining eight predictor variables simultaneously, indicates that at least one of them influences the identification of the can waste type, although some variables individually have p-value > $\alpha$. Using the backward stepwise procedure, we obtain the significant variables presented in Table 3. The parameters of the final trinomial regression model are given in Table 4, with type 3 as the reference category. According to the Nagelkerke value, the predictor variables explain 59.1 percent of the variance in the can waste type, while the rest is explained by predictor variables not included in this model.
From Table 4, for can waste type 1, the higher the green color intensity of the top-pose image ($X_2$), the tendency of an image to be identified as type 1 is 1.31 times that of type 3; the higher the blue color intensity of the top-pose image ($X_3$), the tendency to be identified as type 1 is 1.06 times that of type 3, and so on. For type 2, the higher the green color intensity of the top-pose image ($X_2$), the tendency of an image to be identified as type 2 is 1.75 times that of type 3; the higher the blue color intensity of the top-pose image ($X_3$), the tendency to be identified as type 2 is 0.68 times that of type 3, and so on.
The red color intensity of the side-pose image ($X_7$) gives the highest tendency for identification as type 1, whereas the blue color intensity of the side-pose image ($X_9$) gives the lowest. The blue color intensity of the side-pose image ($X_9$) gives the highest tendency for identification as type 2, whereas the blue color intensity of the top-pose image ($X_3$) gives the lowest. The percentage of correct identifications of the can waste type in testing is presented in Table 5. The detection algorithm was simplified for can classification: when a can moves on the conveyor belt through the detection area, it is identified automatically.
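If the factors quoted from Table 4 are read as odds ratios, that is, as exp(coefficient) in a trinomial logit with type 3 as reference, the interpretation can be checked numerically. The sketch below uses a single predictor; the intercepts are hypothetical, and the slopes are back-computed from the factors 1.31 and 1.75 quoted for $X_2$ rather than taken from Table 4.

```python
import math

def trinomial_probs(x, coef_type1, coef_type2):
    """Class probabilities for a trinomial logit with type 3 as the reference.

    Each coefficient list is [intercept, b_1, ..., b_p]; x holds the p predictors.
    """
    g1 = coef_type1[0] + sum(b * v for b, v in zip(coef_type1[1:], x))
    g2 = coef_type2[0] + sum(b * v for b, v in zip(coef_type2[1:], x))
    denom = 1.0 + math.exp(g1) + math.exp(g2)
    return math.exp(g1) / denom, math.exp(g2) / denom, 1.0 / denom

# Hypothetical intercepts; slopes chosen so that exp(slope) equals 1.31 and 1.75.
coef1 = [-2.0, math.log(1.31)]
coef2 = [-3.0, math.log(1.75)]

p_low = trinomial_probs([5.0], coef1, coef2)   # predictor value 5
p_high = trinomial_probs([6.0], coef1, coef2)  # predictor value 6 (one unit higher)

# Ratio of the type 1 vs type 3 odds after and before the one-unit increase.
odds_ratio_type1 = (p_high[0] / p_high[2]) / (p_low[0] / p_low[2])
```

Under these assumptions, a one-unit increase in the predictor multiplies the odds of type 1 relative to type 3 by 1.31, whatever the starting value of the predictor.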
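The percentage of correct identifications can be computed from a confusion matrix of actual against predicted types. The sketch below uses made-up labels for ten cans, not the data behind Table 5, and the function names are hypothetical.

```python
import numpy as np

def confusion_matrix(actual, predicted, n_types=3):
    """Rows are actual types, columns are predicted types (types numbered from 1)."""
    matrix = np.zeros((n_types, n_types), dtype=int)
    for a, p in zip(actual, predicted):
        matrix[a - 1, p - 1] += 1
    return matrix

def accuracy(matrix):
    """Share of cases on the diagonal, i.e. correctly identified."""
    return np.trace(matrix) / matrix.sum()

# Made-up test labels for ten cans.
actual = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]
predicted = [1, 1, 2, 2, 2, 3, 3, 3, 3, 1]

matrix = confusion_matrix(actual, predicted)
correct_rate = accuracy(matrix)
```

In this toy example seven of the ten cans fall on the diagonal, giving a correct identification rate of 70%.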

The higher the rate of correct identification of can waste, the higher the accuracy, and the better the performance of the sorting system. We found that the approach identifies more than 70% of cans correctly, which is enough for a good sorting system. This method is a useful procedure for analysing and comparing three types of cans. These results are a definite improvement on all other published results so far which quote a sorting accuracy of only.

Conclusion
Using the probability approach of the trinomial distribution in regression form, the predictor variables that affect the identification of the can waste type from the images are the green and blue color intensities of the top-pose image, the red color intensity of the down-pose image, and the red, green, and blue color intensities of the side-pose image.
The model correctly identifies the type of can waste from the images in 73.13% of cases. A correct identification rate above 70% is enough for a good sorting system. This method is a useful procedure for analysing and comparing three types of cans, but other approaches can be tried to improve the performance of the sorting system.
More investigation of the modern system is considered necessary. Although knowing how each component works is a step in the right direction, combining all components will give a better sense of the identification system's overall cycle time, failure rates, and problem areas.