Determination of the parameter color of fresh engine oils by color indices and predictive models

Color prediction tools for motor oils, which can be used to assess their quality, are presented in the paper. Three measuring devices working on an optical principle were used for color prediction and compared. Color models and color indices are defined by RGB, Lab and LCh color components. It was found that the combined use of seven color indices increases the accuracy of predicting the characteristic values of motor oils. The obtained results demonstrate that the algorithm used for the analysis of the color characteristics of motor lubricating oils allowed the creation of adequate regression models. The proposed research tools can be used in practice, enabling a quick assessment of the quality of motor oils based on the parameter color.


Introduction
The quality of fresh oils is crucial for the performance of both the equipment and the oil itself. Even minimal contamination, such as mechanical impurities and moisture, can be detrimental to providing a stable oil film, heat dissipation, and air and corrosion protection.
Color is one of the main characteristics of motor oils and gives information about the condition of both fresh and used oils. When moisture gets into the oil, its color changes and it becomes cloudy; when soot gets into the oil, it turns black. Upon oxidation, the oil changes color to reddish brown. The test is conducted in laboratory conditions and takes time. It would be much easier if more accessible, mobile devices that provide information about the quality of the product were used to perform this analysis. At the current level of development of science and technology, contact and non-contact measuring devices are being developed [1]. Technical instruments working on an optical principle, such as document cameras, digital cameras, colorimeters, and the video sensors of mobile phones and tablets, have the potential to be used for express analysis of the characteristics of motor oils. Video sensors have the advantage of being affordable and widely distributed, and the data obtained from them can be used to create models describing the relationship between the characteristics of the analysed product and its color components and indices.
The main stage in the development of recognition systems operating on an optical principle is the assessment of the separability of motor oils through the various features that describe them [2]. It is also possible to predict various characteristics of oils that are usually determined in laboratory conditions and require specific hardware and trained personnel [3]. Based on research [4] on two viscosity classes of motor oils, it was found that the combined use of color indices increases the accuracy of predicting motor oil characteristic values. It has been shown that seven color indices can be used to predict these characteristics. Analytical models have been created for automated prediction of the main characteristics of motor lubricating oils. The obtained data can be used to predict a change in their properties in laboratory conditions.
The main purpose of the present work is to deepen the research by covering all viscosity grades of motor oils available on the market and to determine their quality through color index analysis techniques.

Material and methods
The color of the 112 analysed samples of fresh engine oils was determined according to ASTM D 1500. Four viscosity grades were tested, designated as xW-20, xW-30, xW-40 and xW-50, where "x" is from 0 to 20. Measurements were made in a licensed laboratory at a room temperature of 22±3 °C and a relative air humidity of 50±5 %RH. The samples come from different batches of motor oils with different applications and operating levels according to the European Automobile Manufacturers' Association (ACEA). Table 1 shows the mean values and standard deviations of the parameter color measured for the four viscosity grades of motor oils.

Table 2 shows the mean values and standard deviations of the color of the engine oils, divided by color classes. The data presented are color values from the Lab color model.
To determine the optical characteristics of the motor oils, two video sensors and a colorimeter were used, and the examined sample of each oil was 5 ml.
The document camera was a Triumph Board A 405 (TRIUMPH BOARD a.s., Prague, Czech Republic). Measurements made with this camera's video sensor in daylight are labelled Dev1. Measurements made inside a 45x40x35 cm styrofoam box are labelled Dev2. The camera was placed vertically in the centre of the box, above the sample holder, at a distance of 20 cm from it. When measuring with video sensors, a region of interest (ROI) of 1x1 cm was selected from the resulting color digital image.
The colorimeter (Dev3) was a model PCE-RGB-2 (PCE Holding GmbH, Germany), manufactured according to standard DIN 5033. The device measures 10-bit RGB color. When measuring with the colorimeter, the color was determined at three points of the sample and their average value was taken. Table 3 summarizes the data on the measuring devices used.
According to [5], color indices are more effective than color components in separating object classes. They help distinguish the subject from the background in images. Color indices can also easily distinguish the levels of white, yellow, or brown between the different analysed objects.
The 10-bit values of the RGB color components obtained by the colorimeter were converted to 8-bit values. Color indices of the kind given in [6] were calculated from the RGB model, where R, G and B are the color components of the RGB model. The chroma (C) and hue (h) values of the LCh color model were determined from the Lab components. Since h is obtained in radians, it was converted to degrees. These values were used in the calculation of the color indices. The indices were determined by the formulas summarized by Pathare et al. [7].
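The conversions described above can be illustrated with a short Python sketch. It is not the authors' code: the linear 10-bit to 8-bit rescaling is an assumption (the exact formula is not reproduced in this text), the function names are hypothetical, and `atan2` is used instead of a plain arctangent so that the hue angle falls in the correct quadrant. The whiteness and yellowness indices are shown in their commonly cited forms as examples of Lab-based indices; which seven indices the study used is not specified here.

```python
import math

def to_8bit(value_10bit):
    """Rescale a 10-bit channel (0..1023) to 8 bits (0..255).

    Assumption: simple linear rescaling with rounding.
    """
    return round(value_10bit * 255 / 1023)

def lab_to_lch(L, a, b):
    """Return (L, C, h) where C is chroma and h is hue in degrees [0, 360)."""
    chroma = math.hypot(a, b)           # C = sqrt(a^2 + b^2)
    hue_rad = math.atan2(b, a)          # hue in radians
    hue_deg = math.degrees(hue_rad) % 360.0  # radians -> degrees
    return L, chroma, hue_deg

def whiteness_index(L, a, b):
    """Commonly cited whiteness index computed from Lab components."""
    return 100.0 - math.sqrt((100.0 - L) ** 2 + a ** 2 + b ** 2)

def yellowness_index(L, b):
    """Commonly cited yellowness index computed from Lab components."""
    return 142.86 * b / L

if __name__ == "__main__":
    rgb10 = (1023, 512, 0)              # hypothetical colorimeter reading
    print([to_8bit(v) for v in rgb10])
    print(lab_to_lch(50.0, 3.0, 4.0))
```

For a real measurement, the Lab values would come from the device or from an RGB to Lab conversion, which is outside the scope of this sketch.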

Methods used for the selection of informative color features
A commonly used method for this purpose is the correlation coefficient between the features [8]. This method has disadvantages related to the type and volume of data used [9]. For this reason, methods of successively improving estimates can be used, which significantly reduce the number of features. These methods are suitable for feature selection for both classification and regression of motor oil characteristics such as viscosity, alkalinity and color. The following methods [10] were used to select features describing the motor lubricating oils: SFCPP, FSRNCA and RRelieFf. The predictive ability of the obtained feature vectors was checked with the PCR and PLSR methods [2]. A second-order polynomial model [11] was used to predict the main characteristics of motor oils. The coefficient of determination, Fisher's test and Student's test were used to evaluate this model.
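As an illustration of the evaluation step, the following Python sketch fits a full second-order polynomial in two variables by ordinary least squares and computes the coefficient of determination, the F value and the t values of the coefficients. It is a minimal sketch under stated assumptions: the function name `fit_second_order` is hypothetical, the model form z = b0 + b1·x + b2·y + b3·x·y + b4·x² + b5·y² is the generic second-order form rather than the authors' final model, and critical values for the Fisher and Student tests are not computed here.

```python
import numpy as np

def fit_second_order(x, y, z):
    """Least-squares fit of z = b0 + b1*x + b2*y + b3*x*y + b4*x^2 + b5*y^2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    # Design matrix with one column per model term
    X = np.column_stack([np.ones_like(x), x, y, x * y, x ** 2, y ** 2])
    coef, *_ = np.linalg.lstsq(X, z, rcond=None)
    residuals = z - X @ coef
    n, p = X.shape
    sse = float(residuals @ residuals)            # sum of squared errors
    sst = float(np.sum((z - z.mean()) ** 2))      # total sum of squares
    r2 = 1.0 - sse / sst                          # coefficient of determination
    # F value: explained variance per term over residual variance
    f_value = ((sst - sse) / (p - 1)) / (sse / (n - p))
    # t values: coefficient divided by its standard error
    sigma2 = sse / (n - p)
    std_err = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    t_values = coef / std_err
    return {"coef": coef, "r2": r2, "F": f_value, "t": t_values}

if __name__ == "__main__":
    rng = np.random.default_rng(0)                # synthetic demo data
    x, y = rng.normal(size=60), rng.normal(size=60)
    z = 1.0 + 2.0 * x - 3.0 * y + 0.5 * x * y + 0.01 * rng.normal(size=60)
    print(fit_second_order(x, y, z))
```

The returned F and t values would then be compared with tabulated critical values for the corresponding degrees of freedom, and terms with non-significant coefficients could be dropped before refitting.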

Selection of informative color features and indices for color prediction (L)
Figure 1 shows the results of the selection of informative color features for color prediction. The data are for Dev1. The features are plotted on the horizontal axis, and the values of their weighting coefficients on the vertical axis.
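The idea of assigning a weight to each feature and plotting the weights can be sketched with the simplest criterion mentioned earlier, the correlation coefficient. The Python example below is a stand-in, not an implementation of SFCPP, FSRNCA or RRelieFf: it ranks features by the absolute Pearson correlation with the target. The function name and the feature names in the demo are hypothetical.

```python
import numpy as np

def rank_features_by_correlation(X, y, names):
    """Rank features by |Pearson correlation| with the target (largest first)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)                 # center each feature
    yc = y - y.mean()                       # center the target
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    weights = np.zeros(X.shape[1])
    valid = denom > 0                       # constant features keep weight 0
    weights[valid] = np.abs(Xc[:, valid].T @ yc) / denom[valid]
    order = np.argsort(weights)[::-1]
    return [(names[i], float(weights[i])) for i in order]

if __name__ == "__main__":
    target = np.arange(10, dtype=float)     # synthetic target values
    features = np.column_stack([
        2.0 * target,                       # perfectly correlated feature
        np.tile([1.0, -1.0], 5),            # weakly correlated feature
        np.full(10, 7.0),                   # constant, uninformative feature
    ])
    for name, w in rank_features_by_correlation(features, target, ["R", "G", "B"]):
        print(name, round(w, 3))
```

A bar chart of these weights against the feature names would have the same layout as the figure described above, although the weight values from the methods used in the study would differ.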

Figure 2 shows the results of the selection of informative color features for color prediction. The data are for Dev2.
Fig. 2. Feature selection for regression on data from Dev2.
Table 5 lists the resulting feature vectors for Dev2. The SFCPP method selected the most features compared to the other methods: two components from RGB and one each from Lab and LCh, together with three RGB indices and seven Lab indices. With the FSRNCA method, one component each from RGB, Lab and LCh was selected, along with four RGB indices and only one Lab index. With the RRelieFf method, all three components from the RGB model and two from Lab were selected; no RGB indices and three Lab indices were selected.
Table 5. Selected features based on data from Dev2.

Figure 3 shows the results of the selection of informative color features for color prediction. The data are for Dev3.
Table 6 lists the obtained feature vectors for Dev3. With the SFCPP method, one color component was selected, as well as three RGB indices and three Lab indices. With the FSRNCA method, all three color components from the RGB model and two from Lab were selected, along with one RGB index and three Lab indices. With the RRelieFf method, one color component each from RGB and LCh was selected, along with one RGB index and five Lab indices.
Table 6. Selected features based on data from Dev3.

Verification of the ability to predict the color of fresh motor oils by selected feature vectors
The ability of the selected feature vectors to predict the color of fresh motor oils was evaluated using partial least squares regression (PLSR) and principal component regression (PCR). The vectors were reduced to the number of components (latent variables, LVs, and principal components, PCs) that described more than 95% of the variance in the data. The sum of squared errors (SSE), the root mean squared error (RMSE) and the coefficient of determination (R²) were used as criteria. The assessment of predictability is based on the accuracy of a linear model between the principal components or latent variables to which the feature vectors were reduced and the motor oil characteristic to be predicted.
The results of testing the predictive ability of the selected feature vectors on data from Dev1 are shown in Table 7. With PLSR, coefficient of determination values of up to 0,38 were observed, slightly higher than those obtained by PCR (0,24-0,38). On the other hand, the errors of PLSR are slightly larger than those of PCR: the SSE reaches values as high as 2,22 for PLSR and 2,2 for PCR. The RMSE is the same for both regression methods and reaches a maximum value of 0,53.
The results of testing the predictive ability of the selected feature vectors on Dev2 data are shown in Table 8. With PLSR, coefficient of determination values of up to 0,43 were observed, slightly higher than those obtained by PCR (0,32-0,42). On the other hand, the errors of PLSR are slightly larger than those of PCR: the SSE reaches values as high as 2,31 for PLSR and 2,3 for PCR. The RMSE is the same for both regression methods and reaches a maximum value of 0,54.
The results of testing the predictive ability of the selected feature vectors on Dev3 data are shown in Table 9. With PLSR, coefficient of determination values of up to 0,24 were observed, slightly higher than those obtained by PCR (0,19-0,24). On the other hand, the errors of PLSR are slightly larger than those of PCR: the SSE reaches values up to 16,93 for PLSR and up to 12-17 for PCR. The RMSE is the same for both regression methods and reaches a maximum value of 0,46.
The obtained results show that the models obtained from document camera data under homogeneous light (Dev2) and those obtained with a colorimeter (Dev3) describe the experimental data with sufficient accuracy. Images captured in ambient light, and the color indices derived from them, are greatly affected by ambient noise. This is reflected in the insufficient accuracy with which the corresponding predictive models describe the experimental data and in their poor predictive power. Models obtained with a document camera under homogeneous lighting (Dev2) are about 5% more accurate than those obtained with the same device under daylight (Dev1). Among the compared devices, the colorimeter (Dev3) shows the highest accuracy and repeatability of the obtained results. Additional analysis was done on the models obtained from data from this device. Figure 4 shows the general form of a model obtained from Dev3 colorimeter data. The independent variables x and y, plotted on the horizontal axes, represent the first and second principal components. The dependent variable z is the predicted characteristic of motor oils. Analysis of the residuals of this model showed that they are normally distributed and lie close to the line on the normal probability plot.
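The PCR part of this procedure can be sketched in Python with NumPy alone. This is a hedged illustration, not the authors' implementation: the function name `pcr_fit` is hypothetical, features are only mean-centered (whether the study also scaled them is not stated), and the metrics are computed on the training data. PLSR is not shown because it would require an additional library.

```python
import numpy as np

def pcr_fit(X, y, variance_threshold=0.95):
    """Principal component regression keeping PCs that explain > threshold variance."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)                       # center the features
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)           # variance ratio per component
    cumulative = np.cumsum(explained)
    # Smallest number of components whose cumulative variance exceeds the threshold
    k = int(np.searchsorted(cumulative, variance_threshold, side="right")) + 1
    k = min(k, len(s))
    scores = Xc @ Vt[:k].T                        # projections on the first k PCs
    A = np.column_stack([np.ones(len(y)), scores])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # linear model on the scores
    y_hat = A @ beta
    sse = float(np.sum((y - y_hat) ** 2))         # sum of squared errors
    rmse = float(np.sqrt(sse / len(y)))           # root mean squared error
    r2 = 1.0 - sse / float(np.sum((y - y.mean()) ** 2))
    return {"n_components": k, "sse": sse, "rmse": rmse, "r2": r2, "y_hat": y_hat}

if __name__ == "__main__":
    rng = np.random.default_rng(1)                # synthetic, strongly collinear data
    t = rng.normal(size=100)
    X = np.column_stack([t, 2 * t + 0.01 * rng.normal(size=100),
                         -t + 0.01 * rng.normal(size=100)])
    result = pcr_fit(X, 3 * t + 1)
    print(result["n_components"], result["rmse"], result["r2"])
```

In practice the metrics would be estimated on held-out samples or by cross-validation, since training-set values tend to be optimistic.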

Establishment of prediction model for color of engine oils
The preliminary analyses of the selection of informative features and of the predictive ability of the resulting vectors showed that reducing the feature vectors with principal components gives lower values of the coefficient of determination than reducing them to latent variables with the PLSR method. On the other hand, the principal component errors are lower. This is a prerequisite for obtaining, after reduction of the feature vectors, regression predictive models for the main properties of motor oils that describe the experimental data with sufficient accuracy. For this reason, feature vectors reduced with principal components were used to create regression models. The preliminary analysis also found that the data obtained with Dev3 lacked sufficient informativeness. This is probably due to the specifics of the operation of this device. For this reason, no regression models were built on Dev3 data.

Establishment of regression models based on data for Dev 1
Based on data from Dev1, the feature vectors with the greatest predictive ability for the main characteristics of motor oils were selected. The selected feature vectors are summarized in Table 10.
Table 10. Vectors and their corresponding features on data from Dev1.
Data from the CFV2D1 feature vector were reduced to two principal components. After removing the non-significant coefficients (those with a p-value >> α) from the main model, a model for predicting L with three coefficients was obtained. The resulting model was evaluated, and Table 11 shows the results of this check. A low standard error (SE) is observed. The coefficients of the model are significant because their p-values are much lower than the accepted significance level α=0,05. According to Fisher's criterion, the calculated F is much higher than its critical value for the corresponding degrees of freedom. The t values of the model coefficients are also much higher than the critical one.
The coefficient of determination has a value above 0,6. The results show that the obtained model describes a significant part of the variation in the color of motor oils. These parameters are not a sufficient criterion to evaluate the model; it is also necessary to analyse the residuals.
0,16 -0,001 0,0001 -6,77 0,00 x*y -0,2 0,1 -0,00013 0,0001 -1,99 0,05
The resulting model and plots of its residuals are shown in Figure 5. The first two principal components, which are the independent variables in the model, are plotted on the horizontal axes. The values of the dependent predicted variable are plotted on the vertical axis. The dependent variable takes its largest values in the region where both factors are at their upper levels. A sign of the normal distribution of the residuals is their location on a straight line. The residuals lie close to the normal line in the normal probability plot, so their distribution is close to normal. The analysis of the residuals shows no systematic deviation of the actual data from the theoretical ones, which is also a sign of their normal distribution.

Establishment of regression models based on data for Dev 2
Based on data from Dev2, the feature vectors with the greatest predictive ability for the main characteristics of motor oils were selected. The selected feature vectors are summarized in Table 12. Table 13 shows data for evaluating the resulting model. A low standard error (SE) is observed.
The coefficients of the model are significant because their p-values are much lower than the accepted significance level α=0,05.
According to Fisher's criterion, the calculated F is much higher than its critical value for the corresponding degrees of freedom. The t values of the model coefficients are also much higher than the critical one.
The coefficient of determination has a value above 0,8. The results show that the obtained model describes a significant part of the variation in the color of motor oils.
These parameters are not a sufficient criterion to evaluate the model; it is also necessary to analyse the residuals. The resulting model and plots of its residuals are shown in Figure 6. The first two principal components, which are the independent variables in the model, are plotted on the horizontal axes. The values of the dependent predicted variable are plotted on the vertical axis. The dependent variable takes its largest values in the region where both factors are at their upper levels. A sign of the normal distribution of the residuals is their location on a straight line. The residuals lie close to the normal line in the normal probability plot, so their distribution is close to normal. The analysis of the residuals shows no systematic deviation of the actual data from the theoretical ones, which is also a sign of their normal distribution. It can therefore be considered that the prerequisites of the regression analysis are fulfilled.
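The residual check described above is visual, but a numerical counterpart can be sketched in Python. The example below computes the correlation between the sorted residuals and the theoretical normal quantiles, which is the straightness of the normal probability plot expressed as a single number. It is an illustration under assumptions: the function name is hypothetical, the plotting positions (i - 0.5)/n are one of several common choices, and no formal critical value is applied.

```python
import numpy as np
from statistics import NormalDist

def normal_plot_correlation(residuals):
    """Correlation between sorted residuals and standard normal quantiles.

    Values close to 1 mean the points lie near a straight line on the
    normal probability plot.
    """
    r = np.sort(np.asarray(residuals, dtype=float))
    n = r.size
    probs = (np.arange(1, n + 1) - 0.5) / n       # assumed plotting positions
    quantiles = np.array([NormalDist().inv_cdf(p) for p in probs])
    return float(np.corrcoef(quantiles, r)[0, 1])

if __name__ == "__main__":
    rng = np.random.default_rng(2)                # synthetic residuals
    print(normal_plot_correlation(rng.normal(size=100)))
    print(normal_plot_correlation(np.exp(np.linspace(0.0, 5.0, 100))))
```

A value close to 1 is consistent with normally distributed residuals, while a clearly lower value, as for the skewed second sample, points to a systematic departure from the straight line.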

Comparative analysis of the obtained results
The research in the present work assesses the possibility of predicting the main characteristics of motor lubricating oils through the analysis of their color features and indices. Informative features have been established for these products using color digital images and colorimeter data. Table 14 gives a comparative analysis of the methods used and presents the results of their use in the present study. During the selection of features, it was found that they have different predictive capabilities with respect to the investigated characteristics of motor oils, which is confirmed by the partial least squares regression and principal component regression results. This is also confirmed by the analysis of the created predictive models, which describe with sufficient accuracy the relationship between the selected features and the characteristics of motor oils.

Conclusions
The methods and algorithms adapted and applied in the present work for tracking changes in the main characteristics of motor lubricating oils from their color characteristics were chosen to be effective for quick and simplified classification while giving satisfactory accuracy with respect to the technological requirements for the characteristics of motor oils.
According to the obtained results, the FSRNCA and RRelieFf methods were found to be suitable for the selection of color features for creating regression models.
Tools have been developed to predict the main characteristics of motor oils, based on the main color components, the color indices calculated from them and a certain set of ratios between them.
A comparative study was made to evaluate the influence of the used technical measuring tools on the accuracy of forecasting the main characteristics of motor lubricating oils.
It was found that the use of a video camera with homogeneous lighting gives better results in predicting the specified characteristics than a video camera used in daylight or a colorimeter working with the light reflected from the object.
It has been found that the individual color components, in combination with Lab indices, are suitable for predicting the color of oils. Using color components alone to predict this characteristic of motor oils is inappropriate.
The obtained results demonstrate that the algorithm used for the analysis of color characteristics of motor lubricating oils allowed the creation of adequate regression models.
These models allow quick and non-destructive determination of the main characteristics of motor oils under the specified measurement conditions. As confirmed by the results, the obtained predictive models maintain low error values and high values of the coefficient of determination, especially in the class corresponding to the total alkalinity of the oils, which manufacturers use as a criterion for the sale of lubricant products. This opens the prospect of directly applying data from visual images, in combination with the selected predictive models, in systems for evaluating the quality of motor lubricating oils.

Fig. 1. Feature selection for regression on data from Dev1.
Table 4 lists the resulting feature vectors for Dev1. The SFCPP method selected the most features compared to the other selection methods, a total of 18 features. All components from the RGB model and two components each from Lab and LCh were selected, together with two RGB indices and eight Lab indices. In the FSRNCA method, one color component is selected from the RGB and

a) general view; b) normal probability plot of residual values; c) distribution of residual values.

a) general view; b) normal probability plot of residual values; c) distribution of residual values.

Table 1. The characteristics of the tested engine oils.

Table 2. The characteristics of the tested engine oils by color classes.

Table 4. Feature selection for regression on data from Dev1.

Table 7. PLSR and PCR results for color prediction on Dev 1 data.

Table 8. PLSR and PCR results for color prediction on Dev 2 data.

Table 9. PLSR and PCR results for color prediction on Dev 3 data.

Table 12. Vectors and their corresponding features on data from Dev 2.

Table 13. Data for estimating the resulting regression model for L on data from Dev2.

Table 14. A comparative analysis of the used methods.