Soundness and Completeness of Inference Rules for New Vague Functional Dependencies

In this paper we introduce a new definition of vague functional dependency based on the application of appropriately chosen similarity measures. The definition is adjusted so as to be applicable to both imprecise and precise vague functional dependencies. Finally, a set of inference rules for the new vague functional dependencies is given and proven to be sound and complete.


Introduction
Let U be the universe of discourse.
Suppose that V is a vague set in U.
We shall write $V = \{\langle u, [t_V(u), 1 - f_V(u)] \rangle : u \in U\}$, where $[t_V(u), 1 - f_V(u)] \subseteq [0, 1]$ is the vague value joined to $u \in U$.
The ordinary case $t_V(u) = 1 - f_V(u) = 0$ is read as: the element $u$ does not belong to the vague set $V$. In such a scenario, we write where $U = \{u_1, u_2, u_3\}$ is some universe of discourse, and $V$ is some vague set in $U$.
Let $R(A_1, A_2, \ldots, A_n)$ be a relation scheme on domains $U_1, U_2, \ldots, U_n$, where $A_i$ is an attribute on the universe of discourse $U_i$, $i \in \{1, 2, \ldots, n\} = I$.
Suppose that $V(U_i)$ is the family of all vague sets in $U_i$, $i \in I$.
A vague relation $r$ on $R(A_1, A_2, \ldots, A_n)$ is a subset of the cross product $V(U_1) \times V(U_2) \times \cdots \times V(U_n)$. A tuple $t$ of $r$ is then of the form $t = (t[A_1], t[A_2], \ldots, t[A_n])$, where $t[A_i] \in V(U_i)$, $i \in I$. Note that we may (speaking more freely) consider $t[A_i]$ the value of the attribute $A_i$ on $t$.
* e-mail: dzenang@pmf.unsa.ba
A vague relation $r$ on $R(A_1, A_2, \ldots, A_n)$ can be visualized as a two-dimensional table with $n$ columns and the table headings $A_1, A_2, \ldots, A_n$, where each row of the table is a tuple of $r$, and each column of the table contains the attribute values under the corresponding heading.
Let r be the vague relation instance on R(Name, Int, Succ) given by Table 1. Since the truth value 0.7 is quite high, the false value 0.1 = 1 − 0.9 is pretty small, and the difference 0.9 − 0.7 = 0.2 is also very small, we conclude that Sara's intelligence must be close to 115. However, 0.9 > 0.7, 0.05 = 1 − 0.95 < 0.1, and 0.95 − 0.9 = 0.05 < 0.2 = 0.9 − 0.7, so Sara's intelligence is definitely closer to 130 (from below) than to 115 (note that Sara's value of Int is {⟨115, [0.7, 0.9]⟩, ⟨130, [0.9, 0.95]⟩}). Reasoning in the same way, we conclude that Sara's success is between 5 and 10, and that it is closer to 10 than to 5. The data about Katie are quite precise. As opposed to Ted, however, she is a very intelligent person who is not so successful. Compared to Ted and Katie, Sara is a relatively intelligent person who is relatively successful. Finally, Jim is a pretty intelligent person who is also very successful.
For the basic relational concepts, see, e.g., [10]. Let r_1 be the fuzzy relation instance on R(Name, Int, Succ) given by Table 2 (now, we assume that Int and Succ are fuzzy attributes on U_2 and U_3, respectively).
The authors in [8] and [1], for example, apply fuzzy membership values to incorporate fuzzy data into relational database theory.
Note that the fuzzy relation instance r_1 may be represented as the vague relation instance given by Table 3. Similarly, the relation instance r_2 on R(Name, Int, Succ) given by Table 4 (now, we assume that the attributes Int and Succ are ordinary attributes on U_2 and U_3, respectively) may be represented as the vague relation instance given by Table 5.

{Tina} {145} {10}
For ordinary relational database theory, see [16]. The aforementioned examples show clearly that the vague relation concept is a natural generalization of both the ordinary relation concept and the fuzzy relation concept. While ordinary relational theory is almost unable to handle imprecise data, and the fuzzy approach has its own limitations, the information carried by vague data is considerably more refined.
Let $a_1 \subseteq [0, 1]$ and $a_2 \subseteq [0, 1]$ be the vague values joined to $u_1 \in U_1$ and $u_2 \in U_2$, respectively. We define the similarity measure $SE(a_1, a_2)$ between the vague values $a_1$ and $a_2$ following Lu-Ng [12]. Note that several authors, including Chen [5], [6], Hong-Kim [7], Li-Xu [11], Szmidt-Kacprzyk [15], and Grzegorzewski [9], proposed various definitions of similarity measures between vague sets and distances between intuitionistic fuzzy sets. According to Lu-Ng [12], however, the similarity measure given above reflects reality more appropriately in the more general cases. Let $A$ and $B$ be two vague sets in some universe of discourse $U$. We define the similarity measure $SE(A, B)$ between the vague sets $A$ and $B$ as follows: $SE(A, B) = \frac{1}{|U|} \sum_{u \in U} SE\left([t_A(u), 1 - f_A(u)], [t_B(u), 1 - f_B(u)]\right)$, where $|U|$ denotes the number of elements in $U$.
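The two similarity measures can be illustrated by a short Python sketch. The closed form used for vague values is an assumption: it is the expression commonly attributed to Lu-Ng [12] and is not reproduced in the text above, so it should be checked against [12] before use. Averaging over the universe for vague sets is likewise an assumption suggested by the role of $|U|$. The names `se_value` and `se_set` are hypothetical. A vague value is stored as the pair (t, 1 − f), and an element missing from a vague set is given the value (0, 0), i.e., it does not belong to the set.

```python
from math import sqrt


def se_value(a, b):
    """Similarity of two vague values a = (t_a, 1 - f_a) and b = (t_b, 1 - f_b).

    ASSUMPTION: closed form recalled from Lu-Ng [12]; verify against the source.
    """
    t_a, f_a = a[0], 1.0 - a[1]
    t_b, f_b = b[0], 1.0 - b[1]
    dt, df = t_a - t_b, f_a - f_b
    first = 1.0 - abs(dt - df) / 2.0
    second = 1.0 - abs(dt + df)
    # Clamp at zero to guard against tiny negative products caused by rounding.
    return sqrt(max(0.0, first * second))


def se_set(A, B, universe):
    """Similarity of two vague sets, given as dicts element -> (t, 1 - f).

    ASSUMPTION: the set-level measure is the average of the value-level
    similarities over the whole universe.
    """
    absent = (0.0, 0.0)  # the element does not belong to the vague set
    total = sum(se_value(A.get(u, absent), B.get(u, absent)) for u in universe)
    return total / len(universe)


# Vague values taken from the discussion of Sara's intelligence.
sara_int = {115: (0.7, 0.9), 130: (0.9, 0.95)}
print(se_value(sara_int[115], sara_int[130]))
print(se_set(sara_int, sara_int, [115, 130, 145]))
```

Under the assumed form, identical vague values have similarity 1, and the values [1, 1] and [0, 0] have similarity 0.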
As usual, we write $A \subseteq B$ (and say that the vague are tuples of the vague relation instance r on R(Name, Int, Succ) given by Table 1.
We obtain the following results: where, for example, 0.78 means that .
Let $t_1$ and $t_2$ be any two (vague) tuples in $r$. Finally, let $X \subseteq \{A_1, A_2, \ldots, A_n\}$ be some set of attributes.
We define the similarity measure $SE_X(t_1, t_2)$ between the tuples $t_1$ and $t_2$ on the attribute set $X$ as $SE_X(t_1, t_2) = \min_{A \in X} SE(t_1[A], t_2[A])$. Now, we are able to derive some auxiliary results.
for any t 1 and t 2 in r.
where $t_1$, $t_2$ and $t_3$ are some three mutually distinct tuples in $r$, and $X$ is a subset of $\{A_1, A_2, \ldots, A_n\}$. Proof. Let r be the vague relation instance on R(Name, Int, Succ) given by Table 1.
Suppose that X = {Int, Succ}. We have (see matrices I and S), Moreover, .., $U_n\}$ is the universe of discourse that corresponds to the attribute $A \in X$.

Vague functional dependencies
Let $R(A_1, A_2, \ldots, A_n)$ be a relation scheme on domains $U_1, U_2, \ldots, U_n$, where $A_i$ is an attribute on the universe of discourse $U_i$, $i \in I$. Suppose that $r$ is a relation instance on $R(A_1, A_2, \ldots, A_n)$. Furthermore, let $X$ and $Y$ be subsets of $\{A_1, A_2, \ldots, A_n\}$. Relation instance $r$ is said to satisfy the functional dependency $X \to Y$ if, for every pair of tuples $t_1$ and $t_2$ in $r$, $t_1[X] = t_2[X]$ implies $t_1[Y] = t_2[Y]$. As is well known, the relational model restricts the attribute values to be atomic (if the attribute value is precise and crisp, then the value is atomic), i.e., $t[A_i] \in U_i$, $i \in I$, for every $t \in r$. Moreover, each $U_j$, $j \in I$, is equipped with the identity relation $i_j : U_j \times U_j \to \{0, 1\}$, where $i_j(x, y) = 1$ if $x = y$, and $i_j(x, y) = 0$ otherwise. Unfortunately, the ordinary relational database model is far from sufficient to capture all of the information about real-world facts. Namely, attribute values are often imprecise, i.e., fuzzy. In order to store such a fuzzy attribute value, one stores a set of crisp values in place of the fuzzy value, where the crisp values are mutually distinct elements of the attribute domain that are similar to the fuzzy value. This motivates the following definition.
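As a point of comparison for the fuzzy and vague cases, the classical condition (tuples that agree on X must agree on Y) can be checked directly. The following Python sketch is illustrative only: the function name `satisfies_fd` and the sample rows are hypothetical and are not taken from the tables of this paper.

```python
from itertools import combinations


def satisfies_fd(rows, X, Y):
    """Return True if every pair of rows that agrees on X also agrees on Y."""
    for t1, t2 in combinations(rows, 2):
        agree_on_x = all(t1[a] == t2[a] for a in X)
        agree_on_y = all(t1[a] == t2[a] for a in Y)
        if agree_on_x and not agree_on_y:
            return False
    return True


# Hypothetical crisp relation instance on R(Name, Int, Succ).
rows = [
    {"Name": "Ann", "Int": 115, "Succ": 5},
    {"Name": "Bob", "Int": 115, "Succ": 5},
    {"Name": "Cid", "Int": 130, "Succ": 10},
]

print(satisfies_fd(rows, ["Int"], ["Succ"]))   # Int -> Succ
print(satisfies_fd(rows, ["Succ"], ["Name"]))  # Succ -> Name
```

Here the test for equality plays the role of the identity relation on each domain; the fuzzy and vague definitions replace it by a graded similarity.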
Let $R(A_1, A_2, \ldots, A_n)$ be a relation scheme on domains $U_1, U_2, \ldots, U_n$, where $A_i$ is an attribute on the universe of discourse $U_i$, $i \in I$. A fuzzy relation instance $r$ on $R(A_1, A_2, \ldots, A_n)$ is a subset of the cross product $2^{U_1} \times 2^{U_2} \times \cdots \times 2^{U_n}$ of the power sets of the domains of the attributes. A tuple $t$ of $r$ is then of the form $t = (t[A_1], t[A_2], \ldots, t[A_n])$, where $t[A_i] \subseteq U_i$, $i \in I$. As we already noted, any fuzzy attribute value is described by some set of crisp values, each of which is similar to the fuzzy value. More precisely, each attribute domain $U_j$, $j \in I$, is equipped with some similarity relation $s_j : U_j \times U_j \to [0, 1]$, where $s_j$ is said to be a similarity relation on $U_j$ if, for every $x, y, z \in U_j$, the conditions $s_j(x, x) = 1$, $s_j(x, y) = s_j(y, x)$, and $s_j(x, z) \ge \max_{q \in U_j} \min\{s_j(x, q), s_j(q, z)\}$ hold. Hence, we calculate $\varphi(A_l[t_j, t_k])$ instead of calculating $i_l(t_j[A_l], t_k[A_l])$, i.e., instead of checking whether or not $t_j[A_l] = t_k[A_l]$. Consequently, we calculate $\varphi(X[t_j, t_k])$ instead of checking whether or not $t_j[X] = t_k[X]$, where $X$ is a subset of $\{A_1, A_2, \ldots, A_n\}$, and $\varphi(X[t_j, t_k])$ is the conformance of the attribute set $X$ on tuples $t_j$ and $t_k$, defined by $\varphi(X[t_j, t_k]) = \min_{A \in X} \varphi(A[t_j, t_k])$. For the similarity-based fuzzy relational database approach, we refer to [2]-[4]. Now, the condition that $t_1[X] = t_2[X]$ implies $t_1[Y] = t_2[Y]$ is replaced by the following one: if $\varphi(X[t_1, t_2])$ resp. $\varphi(Y[t_1, t_2])$ is the conformance of the attribute set $X$ resp. $Y$ on tuples $t_1$ and $t_2$, then $\varphi(Y[t_1, t_2]) \ge \varphi(X[t_1, t_2])$. More precisely, we could say that some fuzzy relation instance $r$ on $R(A_1, A_2, \ldots, A_n)$ satisfies the fuzzy functional dependency $X \to_F Y$ if, for every pair of tuples $t_1$ and $t_2$ in $r$, $\varphi(Y[t_1, t_2]) \ge \varphi(X[t_1, t_2])$. Consider the following example. Let R(Tea, Exp, Sal) be a relation scheme on domains $U_1$ = {Grace, Harry, Oscar}, $U_2$ = {low, high}, $U_3$ = {3800 USD, 4500 USD}, where Exp (experience) and Sal (salary) are fuzzy attributes on the universes $U_2$ and $U_3$, respectively, and Tea (teachers) is an ordinary attribute on the universe of discourse $U_1$. Suppose that the similarity relation $s_3$ on $U_3$ satisfies $s_3$(3800 USD, 4500 USD) = 0.5.
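The three conditions imposed on a similarity relation (reflexivity, symmetry, and a transitivity condition, read here as max-min transitivity) can be verified mechanically on a finite domain. The sketch below is a minimal illustration under that reading; the function name `is_similarity_relation` and the dictionary encoding of the relation are assumptions made for the example. The salary domain with similarity 0.5 between 3800 USD and 4500 USD is the one mentioned in the text.

```python
from itertools import product


def is_similarity_relation(domain, s, tol=1e-12):
    """Check reflexivity, symmetry and max-min transitivity of s on domain.

    s maps ordered pairs (x, y) of domain elements to values in [0, 1].
    """
    # Reflexivity: s(x, x) = 1.
    if any(abs(s[(x, x)] - 1.0) > tol for x in domain):
        return False
    # Symmetry: s(x, y) = s(y, x).
    if any(abs(s[(x, y)] - s[(y, x)]) > tol for x, y in product(domain, repeat=2)):
        return False
    # Max-min transitivity: s(x, z) >= max over q of min(s(x, q), s(q, z)).
    for x, z in product(domain, repeat=2):
        bound = max(min(s[(x, q)], s[(q, z)]) for q in domain)
        if s[(x, z)] + tol < bound:
            return False
    return True


# Salary domain from the example; 0.5 is the similarity stated in the text.
U3 = ["3800USD", "4500USD"]
s3 = {
    ("3800USD", "3800USD"): 1.0,
    ("4500USD", "4500USD"): 1.0,
    ("3800USD", "4500USD"): 0.5,
    ("4500USD", "3800USD"): 0.5,
}
print(is_similarity_relation(U3, s3))
```

On a two-element domain every reflexive and symmetric relation passes the transitivity check; the condition only becomes restrictive from three elements on.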
Let $r_3$ be the fuzzy relation instance on R(Tea, Exp, Sal) given by Table 6. Let us check whether the fuzzy relation instance $r_3$ satisfies $X \to_F Y$. We obtain, Note that the dependency "teachers with similar experiences should have similar salaries" reflects the real world. Obviously, both scenarios $t_2[Exp]$ = {low} and $t_2[Exp]$ = {low, high} are possible in reality. In the first case, Grace and Oscar have similar experiences and similar salaries, where the salaries are more similar than the experiences are. In the second case, their salaries are similar, but not identical, although their experiences are identical. This discussion shows that the dependency "teachers with similar experiences should have similar salaries" makes sense by itself, and that the instance $r_3$ should satisfy this dependency in both cases. Consequently, the condition $\varphi(Y[t_1, t_2]) \ge \varphi(X[t_1, t_2])$ is not adequate for determining whether or not the fuzzy relation instance $r$ satisfies the fuzzy functional dependency $X \to_F Y$. If this condition is satisfied, the instance $r$ certainly satisfies the dependency $X \to_F Y$. Otherwise, if the condition fails, the instance $r$ may or may not satisfy $X \to_F Y$.
In order to overcome these difficulties and correct the irregularities, Sozat and Yazici [14] introduced the following definition.
Let $R(A_1, A_2, \ldots, A_n)$ be a relation scheme on domains $U_1, U_2, \ldots, U_n$, where $A_i$ is an attribute on the universe of discourse $U_i$, $i \in I$. Suppose that $r$ is a fuzzy relation instance on $R(A_1, A_2, \ldots, A_n)$. Furthermore, let $X$ and $Y$ be subsets of $\{A_1, A_2, \ldots, A_n\}$, and $\theta \in [0, 1]$. Fuzzy relation instance $r$ is said to satisfy the fuzzy functional dependency $X \xrightarrow{\theta}_F Y$ if, for every pair of tuples $t_1$ and $t_2$ in $r$, $\varphi(Y[t_1, t_2]) \ge \min\{\theta, \varphi(X[t_1, t_2])\}$. Note that $\varphi(Y[t_1, t_2]) \ge \varphi(X[t_1, t_2])$ implies $\varphi(Y[t_1, t_2]) \ge \min\{\theta, \varphi(X[t_1, t_2])\}$ for every $t_1, t_2 \in r$, and $\theta \in [0, 1]$, i.e., $r$ satisfies $X \xrightarrow{\theta}_F Y$ for every $\theta \in [0, 1]$ whenever $r$ satisfies $X \to_F Y$. More generally, if it happens that for every $t_1$, In particular, the instance $r_3$ satisfies the dependency , Table 6). Furthermore, if $t_2[Exp]$ = {low, high}, then the instance $r_3$ satisfies resp. violates the dependency $X$ The value $\theta \in [0, 1]$ that appears in the notation $X \xrightarrow{\theta}_F Y$ is called the linguistic strength of the fuzzy functional dependency. If $\theta = 1$, the fuzzy functional dependency $X$ Now, one could try to say that some vague relation instance $r$ on $R(A_1, A_2, \ldots, A_n)$ satisfies the vague functional dependency $X \to_V Y$ if, for every pair of tuples $t_1$ and $t_2$ in $r$, $SE_Y(t_1, t_2) \ge SE_X(t_1, t_2)$ (see, e.g., [12], [17]).
Recall the vague relation instance r given by Table 1. Consider the dependency: the intelligence level of a person more or less determines the degree of success.
Since the values of the attributes intelligence and success may be imprecise, we may consider this dependency as a vague functional dependency. We can write it in the form $X \to_V Y$, where X = {Int} and Y = {Succ} (see Table 1).
Let's check if the vague relation instance r satisfies X → V Y.
Since (see matrices I and S), Reasoning as in the case of fuzzy functional dependencies, we conclude that the condition $SE_Y(t_1, t_2) \ge SE_X(t_1, t_2)$, $t_1, t_2 \in r$, must be adapted. We introduce the following definition.
Let $R(A_1, A_2, \ldots, A_n)$ be a relation scheme on domains $U_1, U_2, \ldots, U_n$, where $A_i$ is an attribute on the universe of discourse $U_i$, $i \in I$. Suppose that $r$ is a vague relation instance on $R(A_1, A_2, \ldots, A_n)$. Furthermore, let $X$ and $Y$ be subsets of $\{A_1, A_2, \ldots, A_n\}$, and $\theta \in [0, 1]$. Vague relation instance $r$ is said to satisfy the vague functional dependency $X \xrightarrow{\theta}_V Y$ if, for every pair of tuples $t_1$ and $t_2$ in $r$, $SE_Y(t_1, t_2) \ge \min\{\theta, SE_X(t_1, t_2)\}$. Now, the vague relation instance $r$ given by Table 1 satisfies resp. violates the vague functional dependency $X$ For yet another definition of vague functional dependency, called α-vague functional dependency, see [13].
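A check of a vague functional dependency with linguistic strength θ can be sketched in Python, starting from per-attribute matrices of pairwise tuple similarities (in the spirit of the matrices I and S used for Table 1). Two points are assumptions rather than statements of the text: the criterion is taken to be $SE_Y(t_1, t_2) \ge \min\{\theta, SE_X(t_1, t_2)\}$, and the similarity on an attribute set is taken to be the minimum over its attributes. The function names and the matrix entries below are hypothetical and do not reproduce the values of Table 1.

```python
def se_on(sim, attrs, i, j):
    """Similarity of tuples i and j on an attribute set.

    ASSUMPTION: the minimum of the per-attribute similarities.
    """
    return min(sim[a][i][j] for a in attrs)


def satisfies_vfd(sim, n, X, Y, theta):
    """Check SE_Y(t_i, t_j) >= min(theta, SE_X(t_i, t_j)) for all tuple pairs."""
    for i in range(n):
        for j in range(n):
            if se_on(sim, Y, i, j) < min(theta, se_on(sim, X, i, j)):
                return False
    return True


def max_strength(sim, n, X, Y):
    """Largest theta in [0, 1] for which the dependency is satisfied."""
    best = 1.0
    for i in range(n):
        for j in range(n):
            sx, sy = se_on(sim, X, i, j), se_on(sim, Y, i, j)
            if sy < sx:  # only such pairs restrict theta
                best = min(best, sy)
    return best


# Hypothetical similarity matrices for three tuples (not the paper's data).
sim = {
    "Int":  [[1.0, 0.9, 0.4], [0.9, 1.0, 0.5], [0.4, 0.5, 1.0]],
    "Succ": [[1.0, 0.8, 0.6], [0.8, 1.0, 0.5], [0.6, 0.5, 1.0]],
}
print(satisfies_vfd(sim, 3, ["Int"], ["Succ"], 0.8))
print(satisfies_vfd(sim, 3, ["Int"], ["Succ"], 0.85))
print(max_strength(sim, 3, ["Int"], ["Succ"]))
```

Under the assumed criterion, satisfaction is monotone in θ: if the instance satisfies the dependency with strength θ, it satisfies it with every smaller strength, which is why a single largest admissible θ can be reported.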

Soundness of inference rules for vague functional dependencies
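The following rules are the inference rules for vague functional dependencies (VFDs). Suppose that $r$ satisfies $X \xrightarrow{\theta_1}_V Y$, where $\theta_1 \ge \theta_2$, and that $r$ violates $X \xrightarrow{\theta_2}_V Y$. Since $r$ violates $X \xrightarrow{\theta_2}_V Y$, we know that there are tuples $t_1$ and $t_2$ in $r$ such that $SE_Y(t_1, t_2) < \min\{\theta_2, SE_X(t_1, t_2)\}$. The instance $r$ satisfies the dependency $X \xrightarrow{\theta_1}_V Y$, so $SE_Y(t_1, t_2) \ge \min\{\theta_1, SE_X(t_1, t_2)\}$. If $\min\{\theta_1, SE_X(t_1, t_2)\} = \theta_1$, then $SE_Y(t_1, t_2) \ge \theta_1 \ge \theta_2$. This contradicts the fact that $SE_Y(t_1, t_2) < \theta_2$.
If $\min\{\theta_1, SE_X(t_1, t_2)\} = SE_X(t_1, t_2)$, then $SE_Y(t_1, t_2) \ge SE_X(t_1, t_2)$. This contradicts the fact that $SE_Y(t_1, t_2) < SE_X(t_1, t_2)$.
Consequently, $r$ satisfies $X \xrightarrow{\theta_2}_V Y$. The cases VF2-VF4 are discussed similarly. This completes the proof.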

Soundness of additional inference rules for vague functional dependencies
The following inference rules are additional inference rules for vague functional dependencies. We shall prove that these rules follow from the rules: VF1, VF2, VF3 and VF4. This will mean that the vague functional dependencies obtained by the additional rules can certainly be obtained by successive application of the rules: VF1, VF2, VF3 and VF4. The additional inference rules, however, can make such an effort much shorter and easier.
(proof II for VF5) We have: 3) X The cases VF6 and VF7 are discussed similarly. This completes the proof.

Completeness of inference rules for vague functional dependencies
Let $R(A_1, A_2, \ldots, A_n)$ be a relation scheme on domains $U_1, U_2, \ldots, U_n$, where $A_i$ is an attribute on the universe of discourse $U_i$, $i \in I$. Suppose that $V$ is some set of vague functional dependencies on $\{A_1, A_2, \ldots, A_n\}$. The closure $V^+$ of $V$ is the set of all vague functional dependencies that can be derived from $V$ by repeated applications of the inference rules VF1, VF2, VF3 and VF4.
Note that the set $V^+$ is infinite regardless of whether the set $V$ is finite or not. Namely, if X