Probabilistic Matrix Factorization Recommendation Algorithm with User Trust Similarity

Whether the user similarity calculation in the traditional collaborative filtering recommendation algorithm is reasonable directly affects the quality of its recommendations. This paper proposes a probabilistic matrix factorization recommendation algorithm with user trust similarity, which combines an improved user trust similarity with the probabilistic matrix factorization recommendation method. The results show that the proposed algorithm relieves the user cold-start problem and effectively reduces recommendation error.


Introduction
With the rapid growth of data, "information overload" has become one of the most pressing problems to be solved. Achieving accurate, personalized recommendation over massive data is particularly important for major e-commerce platforms, mobile applications and other fields. The traditional collaborative filtering (CF) recommendation algorithm is one of the most widely used and most successful recommendation methods. [1][2][3] However, CF relies on user history information, such as users' ratings of products.
The main drawback is that, as the amount of data continues to increase, the number of ratings provided by each user remains limited. The resulting sparsity of the user history rating matrix severely restricts the similarity calculation in collaborative filtering and reduces the recommendation accuracy of the algorithm. [4][5] Therefore, generating accurate recommendations for target users through user similarity calculation remains a problem to be solved.
Under these circumstances, some existing solutions focus on trust relationships between users. [6][7] The work in [8] proposed the PIP method, which considers the similarity between users from three aspects: Proximity, Impact and Popularity. The work in [9] proposed the NHSM method, which builds on the PIP method and achieved good results. The work in [10] calculates user similarity with a matrix decomposition method based on ACOS and AMSD [11][12] and obtained better results in application.
To solve the problems of the traditional recommendation algorithm and make full use of the existing data, this paper proposes a collaborative filtering recommendation algorithm that integrates implicit user trust. The algorithm introduces a new similarity measure that takes into account both the differences between users' ratings on their commonly rated items and the differences in the sets of items that each user has rated. The results show that the accuracy of the recommendation is improved.

User trust similarity
The traditional rating-based trust relationship assumes that if user A trusts user B, then user B also trusts user A, which is equivalent to a symmetric trust matrix. In real life, however, trust relationships are often not mutual. This is especially evident in social networks, so we consider the real trust relationship matrix to be asymmetric. In other words, user A's trust in user B may not equal user B's trust in user A. The main source of this difference is that the two users have rated different numbers of items. If two users have rated the same number of items, they trust each other to the same degree. If they have rated different numbers of items, then for a fixed number of commonly rated items, the user with more ratings has a lower degree of trust in the user with fewer ratings.
MATEC Web of Conferences 208, 05004 (2018) https://doi.org/10.1051/matecconf/201820805004 ICMIE 2018
For two users A and B, considering the relative proportion of their numbers of ratings, the trust coefficient of A toward B is defined as formula (1), where $I_A$ is the set of items that user A has rated and $I_B$ is the set of items that user B has rated. In addition to overlooking the trust difference between users, traditional algorithms only consider the directional difference between users' rating vectors when calculating user similarity and do not take the specific rating values into consideration. When two vectors point in the same direction but their values differ, the traditional calculation is unreasonable. If the interest vectors of two users both point in the same direction as that of a third user, the user whose ratings differ less from those of the third user has the stronger trust relationship with that user.
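The asymmetry of the trust coefficient can be illustrated with a small sketch. The code does not reproduce formula (1); it assumes one simple ratio with the property described in the text, namely the share of A's rated items that B has also rated, |I_A ∩ I_B| / |I_A|. Both this form and the name `trust_coefficient` are illustrative assumptions.

```python
def trust_coefficient(items_a, items_b):
    """Asymmetric trust weight of user A toward user B.

    ASSUMPTION: uses |I_A & I_B| / |I_A| as a stand-in with the
    property described in the text; the paper's formula (1) may differ.
    """
    items_a, items_b = set(items_a), set(items_b)
    if not items_a:
        return 0.0  # a user with no ratings carries no trust signal
    return len(items_a & items_b) / len(items_a)


# User A rated four items, user B rated two; they share two items.
rated_by_a = {"i1", "i2", "i3", "i4"}
rated_by_b = {"i1", "i2"}

trust_a_to_b = trust_coefficient(rated_by_a, rated_by_b)  # 2 / 4
trust_b_to_a = trust_coefficient(rated_by_b, rated_by_a)  # 2 / 2
print(trust_a_to_b, trust_b_to_a)  # prints: 0.5 1.0
```

With the same two common items, the user with more ratings (A) ends up with the lower trust value, which matches the asymmetry argued for in this section.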
In response to this problem, the differences in rating values between user A and user B are taken into account. The score difference coefficient between users is defined as formula (2), where $I$ is the set of items that user A and user B have both rated, $R_{A,j}$ and $R_{B,j}$ are the ratings of item $j$ given by user A and user B respectively, Max is the maximum score, Min is the minimum score, and $\theta$ is a constant between 0 and 1. Finally, the improved trust similarity is defined as formula (3), in which $\mathrm{sim}(A, B)$ is the Pearson similarity, the measure chosen for improvement in this paper.
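The Pearson similarity used as the base measure can be sketched as follows. This is a minimal version that averages over co-rated items only; other variants use each user's overall mean rating, and the source does not say which one it uses. The function name and the toy ratings are illustrative assumptions.

```python
import math


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation over the items both users rated.

    ratings_a, ratings_b: dicts mapping item id -> rating.
    Returns 0.0 when fewer than two co-rated items exist or when
    either user's co-rated ratings have zero variance.
    """
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[j] for j in common) / len(common)
    mean_b = sum(ratings_b[j] for j in common) / len(common)
    cov = sum((ratings_a[j] - mean_a) * (ratings_b[j] - mean_b) for j in common)
    var_a = sum((ratings_a[j] - mean_a) ** 2 for j in common)
    var_b = sum((ratings_b[j] - mean_b) ** 2 for j in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


# Same rating trend, but B scores every item two points higher than A.
user_a = {"i1": 1, "i2": 2, "i3": 3}
user_b = {"i1": 3, "i2": 4, "i3": 5}
sim_ab = pearson_similarity(user_a, user_b)
print(sim_ab)  # prints: 1.0
```

The two users receive the maximum similarity even though their scores never agree, which is the situation the score difference coefficient is meant to correct.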

User trust similarity probabilistic matrix factorization
The user trust similarity matrix is decomposed into two low-dimensional latent feature matrices. Let $m$ denote the number of users. The user feature matrix $U$ is an $l \times m$ matrix, and the trust feature matrix $Z$ is also an $l \times m$ matrix. The most suitable $U$ and $Z$ are those that minimize the factorization error of the user trust similarity matrix $C$; that is, the inner product of $U$ and $Z$ should approximate $C$ with minimum error. The error is defined by formula (4), in which the vector $U_i$ represents the characteristics of user $i$, $Z_k$ represents the trust characteristics of user $k$, and $N(x \mid \mu, \sigma_C^2)$ is the probability density function of a Gaussian distribution with mean $\mu$ and variance $\sigma_C^2$. When there is a trust relationship between user $i$ and user $k$, $I_{ik}^{C} = 1$; otherwise it is 0. The Gaussian distributions of the user features and the user trust features are defined by formula (5), in which $\sigma_U^2$ is the variance of the distribution of $U$, $\sigma_Z^2$ is the variance of the distribution of $Z$, and $I$ is the identity matrix.
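For orientation, formulas (4) and (5) can be sketched in the form used by standard probabilistic matrix factorization. This is an assumption based on the symbol definitions in this section, not a transcription of the paper's equations; in particular, the paper may pass the inner product $U_i^{T} Z_k$ through a logistic function before applying the Gaussian.

```latex
\begin{align*}
p(C \mid U, Z, \sigma_C^2) &= \prod_{i=1}^{m} \prod_{k=1}^{m}
  \left[ \mathcal{N}\left( C_{ik} \mid U_i^{T} Z_k, \sigma_C^2 \right) \right]^{I_{ik}^{C}} \\
p(U \mid \sigma_U^2) &= \prod_{i=1}^{m} \mathcal{N}\left( U_i \mid 0, \sigma_U^2 I \right) \\
p(Z \mid \sigma_Z^2) &= \prod_{k=1}^{m} \mathcal{N}\left( Z_k \mid 0, \sigma_Z^2 I \right)
\end{align*}
```

The exponent $I_{ik}^{C}$ restricts the product to observed trust pairs, so unobserved entries of $C$ do not contribute to the likelihood.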
As mentioned in Section 2.1, the trust similarity between users takes into account both the number and the scale of the users' preferences for items. $C_{ik}$, the degree of trust of user $i$ in user $k$, is given by formula (6), in which $I_i$ is the set of items rated by user $i$, $I_k$ is the set of items rated by user $k$, $R_{i,j}$ is the rating of item $j$ by user $i$, and $R_{k,j}$ is the rating of item $j$ by user $k$. According to Bayes' theorem, the posterior over the user feature matrix and the user trust feature matrix obtained by decomposing the user trust similarity matrix is:
$$p(U, Z \mid C, \sigma_C^2, \sigma_U^2, \sigma_Z^2) \propto p(C \mid U, Z, \sigma_C^2)\, p(U \mid \sigma_U^2)\, p(Z \mid \sigma_Z^2) \qquad (7)$$

User-item similarity probabilistic matrix factorization
Similarly, the user-item rating matrix is decomposed into two low-dimensional latent feature matrices. Let $n$ denote the number of items. The user feature matrix $U$ is an $l \times m$ matrix, and the item feature matrix $V$ is an $l \times n$ matrix. The most suitable $U$ and $V$ are those that minimize the factorization error of the user-item rating matrix $R$; that is, the inner product of $U$ and $V$ should approximate $R$ with minimum error. The error is defined by formula (8), in which the vector $V_j$ represents the characteristics of item $j$ and $N(x \mid \mu, \sigma_R^2)$ is the probability density function of a Gaussian distribution with mean $\mu$ and variance $\sigma_R^2$. If item $j$ has been rated by user $i$, then $I_{ij}^{R} = 1$; otherwise it is 0. The Gaussian distributions of the user features and the item features are defined by formula (9), in which $\sigma_U^2$ is the variance of the distribution of $U$, $\sigma_V^2$ is the variance of the distribution of $V$, and $I$ is the identity matrix. According to Bayes' theorem, the user-item rating matrix is decomposed into the user feature matrix and the item feature matrix, as expressed by formula (10).
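By analogy with the trust side, the rating likelihood, the item prior and the posterior referred to in the Bayes step can be sketched as follows. The form is the standard probabilistic matrix factorization one and is an assumption here; the posterior mirrors formula (7) with $V$ and $R$ in place of $Z$ and $C$.

```latex
\begin{align*}
p(R \mid U, V, \sigma_R^2) &= \prod_{i=1}^{m} \prod_{j=1}^{n}
  \left[ \mathcal{N}\left( R_{ij} \mid U_i^{T} V_j, \sigma_R^2 \right) \right]^{I_{ij}^{R}} \\
p(V \mid \sigma_V^2) &= \prod_{j=1}^{n} \mathcal{N}\left( V_j \mid 0, \sigma_V^2 I \right) \\
p(U, V \mid R, \sigma_R^2, \sigma_U^2, \sigma_V^2) &\propto
  p(R \mid U, V, \sigma_R^2)\, p(U \mid \sigma_U^2)\, p(V \mid \sigma_V^2)
\end{align*}
```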

Joint probabilistic matrix factorization
Joint probabilistic matrix factorization yields a user feature matrix that satisfies both the user trust relationship and the user-item rating matrix. The missing values in the user-item rating matrix are then obtained from the inner product of the user feature matrix and the item feature matrix under these constraints. The logarithm of the joint posterior, $\ln p(U, V, Z \mid C, R, \sigma_C^2, \sigma_R^2, \sigma_U^2, \sigma_V^2, \sigma_Z^2)$, is given by formula (11), in which the additive term C is a constant that does not depend on the parameters (it is distinct from the trust similarity matrix $C$). Obtaining the minimum-error factorization of the user trust similarity matrix and the user-item rating matrix is equivalent to maximizing formula (11). Formula (11) is simplified to formula (12), so that the maximum of formula (11) corresponds to the minimum of formula (12).
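Under Gaussian likelihoods and priors, maximizing the log-posterior is equivalent to minimizing a regularized sum of squared errors. The sketch below assumes that objective takes the common form of a rating error term, a trust error term weighted by $\lambda_C$, and L2 penalties on $U$, $V$ and $Z$, and minimizes it with plain stochastic gradient descent. The function name `joint_pmf`, the single regularization weight `lam_reg`, the learning rate, the linear (non-logistic) inner product and the toy data are all illustrative assumptions; the paper's formula (12) and optimizer may differ.

```python
import math
import random


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def joint_pmf(ratings, trust, n_users, n_items, dim=10, lam_c=10.0,
              lam_reg=0.01, lr=0.01, epochs=200, seed=0):
    """Factorize ratings (U, V) and trust (U, Z) with a shared user matrix U.

    ratings: list of (user, item, rating) triples, the observed entries of R.
    trust:   list of (user, user, trust) triples, the observed entries of C.
    Latent vectors are stored as rows of length dim (the text stores them
    as columns). Regularization is applied per observed entry, SGD-style.
    """
    rng = random.Random(seed)

    def init(rows):
        return [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(rows)]

    U, V, Z = init(n_users), init(n_items), init(n_users)
    for _ in range(epochs):
        # Rating term: pull U_i . V_j toward R_ij.
        for i, j, r in ratings:
            err = r - dot(U[i], V[j])
            for f in range(dim):
                u, v = U[i][f], V[j][f]
                U[i][f] += lr * (err * v - lam_reg * u)
                V[j][f] += lr * (err * u - lam_reg * v)
        # Trust term, weighted by lam_c: pull U_i . Z_k toward C_ik.
        for i, k, c in trust:
            err = c - dot(U[i], Z[k])
            for f in range(dim):
                u, z = U[i][f], Z[k][f]
                U[i][f] += lr * (lam_c * err * z - lam_reg * u)
                Z[k][f] += lr * (lam_c * err * u - lam_reg * z)
    return U, V, Z


def rating_rmse(U, V, ratings):
    """Training RMSE of the inner-product rating predictions."""
    sq = sum((r - dot(U[i], V[j])) ** 2 for i, j, r in ratings)
    return math.sqrt(sq / len(ratings))


# Toy data, made up for illustration: 3 users, 3 items.
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 2), (2, 1, 1), (2, 2, 5)]
trust = [(0, 1, 0.9), (1, 0, 0.6), (1, 2, 0.2)]

U0, V0, _ = joint_pmf(ratings, trust, 3, 3, dim=3, epochs=0)  # initial state
U, V, Z = joint_pmf(ratings, trust, 3, 3, dim=3, epochs=200)
print(rating_rmse(U0, V0, ratings), rating_rmse(U, V, ratings))
```

Because `U` is updated by both loops, the trust data influences the user vectors that are later used to predict missing ratings as `dot(U[i], V[j])`.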

Experiment
The experiment addresses the following questions:
- How the choice of $\theta$ in the user trust similarity formula influences the recommendation quality, and which $\theta$ value gives the best result.
- With $\theta$ fixed, which value of $\lambda_C$ for the user trust similarity matrix performs best.
- How well the TSPMF algorithm performs overall.

Experimental data set
The MovieLens-100k data set is used as the experimental data set: 80% of the data forms the training set and the remaining 20% forms the test set. Ratings range from 1 to 5, where 1 means the user dislikes the item and 5 means the user likes it.
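An 80/20 split of this kind might be produced as in the following sketch. It assumes a uniformly random split over rating records with a fixed seed; the source does not say how the split was drawn, and `split_ratings` is a hypothetical helper.

```python
import random


def split_ratings(records, train_fraction=0.8, seed=42):
    """Shuffle rating records and cut them into train and test lists.

    ASSUMPTION: a uniformly random split; the paper does not state
    how its 80/20 split was made.
    """
    shuffled = list(records)  # copy so the caller's list is not reordered
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up records in (user, item, rating) form: 10 users x 10 items.
records = [(u, i, 1 + (u + i) % 5) for u in range(10) for i in range(10)]
train, test = split_ratings(records)
print(len(train), len(test))  # prints: 80 20
```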

Experimental evaluation index
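Mean absolute error (MAE) and root mean squared error (RMSE) are used as the experimental evaluation indicators. They are defined as:
$$\mathrm{MAE} = \frac{1}{T} \sum_{i,j} \left| R_{i,j} - \bar{R}_{i,j} \right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{T} \sum_{i,j} \left( R_{i,j} - \bar{R}_{i,j} \right)^2}$$
where the sums run over the predicted user-item pairs, $T$ is the number of predicted ratings, $R_{i,j}$ is the rating predicted by the TSPMF algorithm, and $\bar{R}_{i,j}$ is the actual rating that user $i$ gave to item $j$.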
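Both indicators can be computed in a few lines. The sketch below uses the standard definitions of MAE and RMSE over paired lists of predicted and actual ratings; the function names and the sample values are illustrative.

```python
import math


def mae(predicted, actual):
    """Mean absolute error over paired predictions and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def rmse(predicted, actual):
    """Root mean squared error over paired predictions and true ratings."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


# Made-up example with T = 4 predictions; errors are -1, 0, 1, -2.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]

mae_value = mae(predicted, actual)    # (1 + 0 + 1 + 2) / 4 = 1.0
rmse_value = rmse(predicted, actual)  # sqrt((1 + 0 + 1 + 4) / 4) = sqrt(1.5)
print(mae_value, rmse_value)
```

RMSE squares each error before averaging, so the single error of 2 raises it above the MAE for the same predictions.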

Experimental analysis
Fig. 1 shows the MAE for different values of $\theta$ when different numbers K of nearest neighbors are selected. In Fig. 2, the mean MAE is calculated to determine the value of $\theta$ in the user trust similarity formula; MAE reaches its minimum at $\theta = 0.9$.
With $\theta$ fixed, the parameter $\lambda_C$ in the joint probabilistic matrix factorization is determined experimentally. With the factorization dimension set to d = 10, Fig. 3 shows the MAE for different values of $\lambda_C$; MAE reaches its minimum at $\lambda_C = 10$.
The experimental results show that the TSPMF algorithm proposed in this paper is clearly better than the other four algorithms of the same type. This demonstrates that integrating user trust into the recommendation process is effective and improves the recommendation results.

Conclusion
This paper proposes a user trust similarity probabilistic matrix factorization method (TSPMF), which integrates the trust relationship between users into traditional matrix factorization. The method not only obtains more accurate user similarity but also further enhances the effectiveness of the recommendations through the trust relationships between users. The comparison experiments show that the trust relationship between users plays an important role in recommendation, and that the TSPMF method is more stable and performs better than the compared methods.