The growth of the Internet has resulted in a tremendous amount of information available
and a vast array of choices for consumers. Recommender systems are designed to
help a user cope with this situation by selecting a small number of options to present
to the user. They filter and recommend items based on a user’s preference model. Various
types of recommender systems have been proposed so far, but their filtering techniques
fall into two categories. One is content-based filtering (e.g. [12]) and the other
is collaborative filtering or social filtering (e.g. [16]).
In content-based filtering, a user’s preference model is constructed for the individual
based upon the user’s ratings and descriptions (usually textual) of the
rated items. Such systems try to find regularities in the descriptions that can be used
to distinguish highly rated items from others. Collaborative filtering, on the other hand,
tries to find desired items based on the preferences of a set of similar users. In order
to find like-minded users, it compares other users’ ratings with the target user’s
ratings. Because it is not necessary to analyze the contents of items, it can be applied
to many kinds of domains where a textual description is not available or regularities in
the words used in the textual description are not informative (e.g. [4]). One of the
most popular algorithms in collaborative filtering is a correlation-based approach. In
this paper, we report experimental results comparing this collaborative filtering approach with
the Simple Bayesian Classifier as an alternative approach.
This paper is organized as follows. We first present the central ideas of current typical
collaborative filtering algorithms. We then define two alternative formulations of the
Simple Bayesian Classifier for collaborative filtering. Next, we evaluate our algorithms
on user-rating data for movies and jokes, and show that our approach
outperforms the correlation-based collaborative filtering algorithm. Finally, we discuss
the results and summarize this paper.
2 Collaborative Filtering
The main idea of collaborative filtering is to recommend new items of interest for a
particular user based on other users’ opinions. A variety of collaborative filtering algorithms
have been reported and their performance has been evaluated empirically
([2], [15], [16]). These algorithms are based on a simple intuition: predictions for a
user should be based on the preference patterns of other people who have similar interests.
Therefore, the first step of these algorithms is to find similarities between user
ratings. Resnick et al. [15] use the Pearson correlation coefficient as a measure of
preference similarity. The correlation between users j and j’ is:
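\[
w(j, j') = \frac{\sum_i (v_{ji} - \bar{v}_j)(v_{j'i} - \bar{v}_{j'})}
                {\sqrt{\sum_i (v_{ji} - \bar{v}_j)^2 \, \sum_i (v_{j'i} - \bar{v}_{j'})^2}}
\]
where v_{ji} is the rating of user j for item i, \bar{v}_j is the mean rating of user j,
and all summations over i are over the items which have been rated by both j and j’.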
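The correlation described above can be illustrated with a minimal Python sketch. It is not the implementation used in the experiments. It assumes that each user's ratings are stored as a dictionary mapping item identifiers to numeric ratings, and that each user's mean is taken over the co-rated items only; the function name `pearson_similarity` and the sample users are hypothetical.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation between two users over their co-rated items.

    ratings_a, ratings_b: dicts mapping item id -> numeric rating
    (a hypothetical data layout chosen for this sketch).
    Returns 0.0 when the correlation is undefined.
    """
    # Only items rated by both users enter the sums.
    common = ratings_a.keys() & ratings_b.keys()
    if len(common) < 2:
        return 0.0

    # Assumption: each user's mean is taken over the co-rated items.
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)

    cov = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    var_a = sum((ratings_a[i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings_b[i] - mean_b) ** 2 for i in common)

    denom = sqrt(var_a * var_b)
    # A user with constant ratings has zero variance; treat as no correlation.
    return cov / denom if denom > 0 else 0.0


# Made-up ratings for illustration only.
alice = {"m1": 5, "m2": 3, "m3": 1}
bob = {"m1": 4, "m2": 2, "m3": 0, "m4": 5}
carol = {"m1": 1, "m2": 3, "m3": 5}

print(pearson_similarity(alice, bob))    # same preference pattern
print(pearson_similarity(alice, carol))  # opposite preference pattern
```

In this toy data, bob rates every co-rated item one point lower than alice, so the two users are perfectly correlated even though their absolute ratings differ; carol's ratings run in the opposite order and yield a negative correlation.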
The predicted value of user j for item i is computed as a weighted sum of other users’