The use of fuzzy logic in classification

When determining the degree of coincidence between multi-feature information, received in the form of a fuzzy vector, and a pre-established known pattern, two general steps should be followed. The first step is to eliminate the features that have little or no effect on the final results and to retain only those that influence the pattern recognition. This step can be defined as the classification process and is imperative for simplifying the problem. One example of classification that can considerably reduce system costs arises when sensors distributed along an industrial process are used to manage information at a central location. Several methods can be used for classification, such as statistical methods, rough sets, fuzzy logic, or information theory. The second step is to find the correlation between the received fuzzy vector and the vector defining the known pattern, using the previously selected features. For this part, the use of fuzzy logic is extremely convenient. The present work analyzes some of the methods used for classification and pattern recognition, based on concrete and practical examples.


Introduction
When analyzing an information system or a database, we frequently face problems such as attribute redundancy and missing or diffuse values, which are due in general to noise and partially missing data. Several approaches have been developed for minimizing the number of attributes necessary to represent the desired category structure by eliminating redundancy. The lack of data, or of complete knowledge of the system, makes developing a model a practically impossible task using conventional means. This lack of data can be attributed to sensor failure, or simply to incomplete system information. Finally, diffuse values can be related to noise or imprecise measurements from sensors. In many applications, the information is collected from different sensors and is corrupted by noise and outliers. Different methods have been presented [1,2,3,4,5,6]. The present work is devoted to the use of fuzzy logic as an instrument for solving this problem.
One frequently used procedure for dealing with interval-valued information systems is discretization. Yee Leung et al. [1] have presented a very useful method, based on rough sets, for obtaining rules that discriminate the minimum number of attributes, giving a first approach to object classification. Some aspects of the method are reproduced here.

Attribute reduction
Procedure:
1. Table preparation: from the original table, generate a new one presenting the minimum and maximum values of each attribute for each class.
2. Define the misclassification rates α_ij^k. The region of coincidence, or misclassification rate, of two classes for a given attribute may vary from 0, when there is no coincidence at all, to 1, when the two intervals coincide completely. In general, the probability that objects in class u_i are misclassified into class u_j according to attribute k can be represented by the fraction of class u_i's interval for attribute k covered by its intersection with class u_j's interval [1]. The permissible misclassification rate α gives the permitted error in the classification.
3. Define α_ij as the error of class u_i being misclassified into class u_j in the system: α_ij = min{α_ij^k : 1 ≤ k ≤ m}, where m is the number of attributes.
4. Find the maximum mutual classification error between classes, β_ij^k = max{α_ij^k, α_ji^k}.

Note that in general α_ij^k ≠ α_ji^k.
5. For each pair of classes, find the permissible misclassification rate between classes u_i and u_j in the system: β_ij = min{β_ij^k : 1 ≤ k ≤ m}. Let α be the permissible misclassification rate between classes. If β_ij ≤ α, there must exist an attribute A_k such that, by using A_k, the two classes u_i and u_j can be separated within the permissible misclassification rate α.
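The steps above can be sketched in code. This is a minimal illustration, assuming each class is summarized by a (min, max) interval per attribute and that α_ij^k is the overlap ratio described in step 2; the table values are illustrative, not taken from [1] or the examples below.

```python
def alpha(i, j, k, table):
    """alpha_ij^k: fraction of class i's interval for attribute k
    covered by its intersection with class j's interval.
    Assumes non-degenerate intervals (max > min)."""
    (a1, a2), (b1, b2) = table[i][k], table[j][k]
    return max(0.0, min(a2, b2) - max(a1, b1)) / (a2 - a1)

def beta(i, j, k, table):
    """beta_ij^k: maximum mutual classification error for attribute k
    (note that alpha_ij^k differs in general from alpha_ji^k)."""
    return max(alpha(i, j, k, table), alpha(j, i, k, table))

def separable(i, j, table, m, permissible=0.2):
    """beta_ij = min over attributes; classes i and j are separable
    when beta_ij <= the permissible misclassification rate alpha."""
    beta_ij = min(beta(i, j, k, table) for k in range(m))
    return beta_ij <= permissible

# table[class][attribute] = (min, max) interval -- illustrative values
table = {
    0: [(0.2, 0.5), (1.0, 2.0)],
    1: [(0.6, 0.9), (1.5, 2.5)],
}
print(separable(0, 1, table, m=2))  # True: attribute 0 intervals are disjoint
```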

Attribute Reduction Using Information Theory
Several methods using information theory have been developed. The method presented here is developed in detail in [6]. The general procedure is as follows. It is logical to think that when two classes coincide completely for some attribute k, the information obtained from that attribute for discriminating between classes i and j is 0, and that it increases as the coincidence diminishes. This leads, following the Shannon and Hartley definitions, to representing this information on a logarithmic scale. The logarithm provides the additivity property for independent uncertainties.
Expressed with logarithms base 10, it is given by I_ij^k = -log10 β_ij^k. Similarly, the minimum information required for the classification between two classes i and j with permissible misclassification rate α is given by I_α = -log10 α. If I_ij^k ≥ I_α, the two classes can be separated using the attribute k. An algorithm has been created for solving this task.
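Assuming the base-10 logarithmic forms above, the separability criterion can be sketched as follows; the β value and permissible rate are illustrative numbers, not results from the paper's tables.

```python
import math

def information(beta_ijk):
    """I_ij^k = -log10(beta_ij^k): information attribute k provides for
    separating classes i and j. Zero when the intervals fully coincide
    (beta = 1), growing as the overlap shrinks (beta must be > 0)."""
    return -math.log10(beta_ijk)

def required_information(permissible_alpha):
    """I_alpha = -log10(alpha): minimum information needed to separate
    two classes within the permissible misclassification rate alpha."""
    return -math.log10(permissible_alpha)

beta_12_k = 0.05  # illustrative mutual classification error for one attribute
alpha = 0.2       # permissible misclassification rate
# Classes separable by attribute k exactly when I_ij^k >= I_alpha,
# which is equivalent to beta_ij^k <= alpha since -log10 is decreasing.
print(information(beta_12_k) >= required_information(alpha))  # True
```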

Fuzzy logic classification
The major task of fuzzy-based pattern classification is the extraction of knowledge from numerical data to build a rule base that permits the classification of new data samples. One way of calculating the similarity is given below [7,8]. Let P*(X) be a family of fuzzy sets with A_i ≠ ∅ and A_i ⊆ X. For two fuzzy sets from this family, A, B ∈ P*(X), the approaching degree (A, B) = ½[A·B + (1 − A⊕B)], where A·B = max_x min(μ_A(x), μ_B(x)) is the inner product and A⊕B = min_x max(μ_A(x), μ_B(x)) is the outer product, describes the degree of similarity of the two sets A and B. When the approaching degree approaches unity, the two sets have a high degree of similarity; an approaching degree near zero implies the two fuzzy sets are distinct.
Given a new data sample B with m fuzzy attributes, B = (B_1, B_2, …, B_m), the approaching degree concept can be applied to compare the new data pattern with each known data pattern A_i. Each of the known patterns A_i is characterized by the same m attributes, A_i = (A_i1, A_i2, …, A_im). For each of the k known patterns, the weighted approaching degree is (B, A_i) = Σ_{j=1..m} w_j (B_j, A_ij), where w_j is a normalizing weighting factor, taken as unity in this work.
If the different attributes are not considered to be of equal importance, then relative weights must be calculated. The method consists in creating a fuzzy model for each attribute based on the previously obtained information and, for each newly received sample, finding its compatibility index (CI), given by the previous expression, with each class. The class yielding the maximum value is the class to which the received information belongs. These concepts are applied in the following examples.
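A minimal sketch of this classifier, assuming discrete membership vectors sampled on a common support, with the max-min inner product and min-max outer product defined above; the two patterns and the sample are illustrative values, not data from the examples.

```python
def approaching_degree(a, b):
    """(A, B) = 1/2 [A.B + (1 - A(+)B)], where A.B is the max-min inner
    product and A(+)B the min-max outer product. Values near 1 mean the
    two fuzzy sets are highly similar; near 0, distinct."""
    inner = max(min(x, y) for x, y in zip(a, b))
    outer = min(max(x, y) for x, y in zip(a, b))
    return 0.5 * (inner + (1.0 - outer))

def classify(sample, patterns, weights=None):
    """Compatibility index (CI) of the sample against each known pattern:
    weighted sum of per-attribute approaching degrees (unit weights by
    default, as in this work); the class with maximum CI wins."""
    m = len(sample)
    w = weights or [1.0] * m
    ci = {}
    for name, pattern in patterns.items():
        ci[name] = sum(w[j] * approaching_degree(sample[j], pattern[j])
                       for j in range(m))
    return max(ci, key=ci.get), ci

# two classes, two fuzzy attributes, memberships on a common 3-point support
patterns = {
    "I":  [[0.1, 0.8, 0.3], [0.9, 0.4, 0.0]],
    "II": [[0.0, 0.2, 0.9], [0.1, 0.5, 0.8]],
}
sample = [[0.2, 0.7, 0.2], [0.8, 0.5, 0.1]]
best, scores = classify(sample, patterns)
print(best)  # "I"
```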

Abalone database
This database [9] was created to predict the age of abalone from physical measurements. Here it is used only as an example, and the conclusions obtained do not necessarily reflect all the possibilities of the classification.
The authors selected 150 entries from the database and divided them into three groups: Group I, from 6 to 10 years old; Group II, from 11 to 15 years old; and Group III, from 16 to 20 years old. The attributes are as follows: A-Length, B-Diameter, C-Height, D-Whole Weight, E-Shucked Weight, F-Viscera Weight, and G-Shell Weight. The elaborated information is presented in Table 1.
From Table 2, which shows the maximum mutual classification error between classes in logarithmic form, it is clear that it is not possible to discriminate between groups II and III using the given attributes (I_ij^k < I_α for all k, 1 ≤ k ≤ m) and the given misclassification error α = 0.2. It is possible to discriminate between groups I and III using attributes D or A. Finally, there is a possibility of discriminating between groups I and II using fuzzy logic and practically all the attributes. In general, from the results obtained in the example, the conclusion could be that the selected attributes are not useful for determining abalone age using intervals of 5 years. As per Table 2, discrimination between all the selected classes can only be done with an error of 0.9, which is in general not acceptable. Figures 1 a) and b) show the membership functions of groups I and II for attributes A, B, C, and D. Attribute D is presented only to make obvious that it cannot be used for discrimination. Three values from group I were taken at random from the abalone table [9] as examples, and their compatibility index (CI) was calculated with MATLAB using the membership functions of group I and of group II. From Table 3 it is clear that the CI of the selected examples from group I, when calculated against group I (G1-G1), is much larger than the CI when calculated against group II (G1-G2).
From Table 3 it was clear that, for comparing groups I and II, the quantity of information available for discrimination is very small, so the analysis has been focused on attributes A, B, and C. It is important to say that a deep analysis of abalone classification is not made here. The intention has been to present the possibilities of the method, and the database is used only as an example.

Vertical handoff target selection in a heterogeneous wireless network
The information for this example has been taken basically from [10]; variations are introduced in the way of solving the problem. Wireless and mobile networking is becoming an increasingly important and popular way to provide global information access to users on the move.
The handoff process has two major stages: handoff initiation and handoff execution [2]. In the handoff initiation phase, a decision is made regarding the selection of the new Base Station (BS), or Access Point (AP), to which the Mobile Station (MS) will be transferred. In the execution phase, new radio links are formed between the BS/AP and the MS, and resources are allocated [10]. Several methods exist for solving this problem [10]. Here a simple method is presented for selecting the best network based on the received signal strength (RSS), the velocity, and the cost. Table 4 shows the input parameters for the selection. There are four target-network alternatives, A1, A2, A3, and A4, from which it is necessary to select an optimum target network for the user. The desired condition for each decision maker is shown in Table 5, where the selected fuzzy variables are: very low (VL), low (L), medium (M), high (H), and very high (VH).
The membership functions are shown in Figure 2 a) and b). As there is no criterion for the selection of the membership functions, all of them have been given similar shapes, only taking into consideration the different units. Table 6 shows the compatibility index calculated for each network.
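The selection procedure can be sketched with triangular membership functions. Everything numeric below is an illustrative assumption: the attributes are normalized to a common [0, 1] universe, the five linguistic terms are evenly spaced triangles, and the desired condition and per-network readings are invented, not the values of Tables 4-6.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# five linguistic terms on a normalized [0, 1] universe; the feet of VL
# and VH extend past the universe so their peaks reach membership 1
TERMS = {"VL": (-0.25, 0.0, 0.25), "L": (0.0, 0.25, 0.5),
         "M": (0.25, 0.5, 0.75), "H": (0.5, 0.75, 1.0),
         "VH": (0.75, 1.0, 1.25)}

desired = {"rss": "VH", "cost": "L", "velocity": "M"}  # decision maker's wish

networks = {  # normalized readings per attribute -- illustrative
    "A1": {"rss": 0.4, "cost": 0.8, "velocity": 0.5},
    "A2": {"rss": 0.7, "cost": 0.6, "velocity": 0.3},
    "A3": {"rss": 0.9, "cost": 0.2, "velocity": 0.5},
    "A4": {"rss": 0.5, "cost": 0.4, "velocity": 0.9},
}

def compatibility(readings, desired):
    """CI of a candidate network: sum over attributes of the membership of
    its reading in the desired linguistic value (unit weights)."""
    return sum(tri(readings[k], *TERMS[v]) for k, v in desired.items())

best = max(networks, key=lambda n: compatibility(networks[n], desired))
print(best)  # with these illustrative numbers, A3 scores highest
```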
From the table it can be seen that, considering the given conditions, the best option is network A3.

Conclusions
Several methods have been developed for classification with imprecise or missing information. The lack of data can be attributed to sensor failure, or simply to incomplete system information; diffuse values can be related to noise or imprecise measurements from sensors. It is not possible to affirm that one method is always better than another. In any case, selection using fuzzy logic is simple, and extremely useful when dealing with large databases. Rough sets and information theory permit minimizing the number of attributes. Both methods are based on the definition of the misclassification rates α_ij^k. For any two classes, if this rate exceeds, for every attribute, the defined permissible misclassification rate between classes α, discrimination between the two classes is not possible. Fuzzy logic permits obtaining results from models created using MATLAB or other specialized software.
The first example is the abalone database; in its analysis only the dimensions are taken. In general, from the results obtained in the example, the conclusion could be that the selected attributes are not always useful for determining abalone age using intervals of 5 years. Using fuzzy logic, the differentiation between groups I and II was made, but it was not possible between groups II and III.
The network selected in the second example is A3. The fundamental objective of this example was to show the simplicity of the method. To obtain trustworthy results, it is necessary to make a complete analysis of the membership functions used, which can be done by having a group of experts solve this task.
