Collaborative Filtering with the Simple Bayesian Classifier

The growth of the Internet has resulted in a tremendous amount of available information
and a vast array of choices for consumers. Recommender systems are designed to
help a user cope with this situation by selecting a small number of options to present
to the user. They filter and recommend items based on a user’s preference model. Various
types of recommender systems have been proposed so far, and their filtering techniques
fall into two categories. One is content-based filtering (e.g. [12]) and the other
is collaborative filtering or social filtering (e.g. [16]).
In content-based filtering, a user’s preference model is constructed for the individual
based upon the user’s ratings and the descriptions (usually textual) of the
rated items. Such systems try to find regularities in the descriptions that can be used
to distinguish highly rated items from others. Collaborative filtering, on the other hand,
tries to find desired items based on the preferences of a set of similar users. In order
to find like-minded users, it compares other users’ ratings with the target user’s
ratings. It is not necessary to analyze the contents of items, so it can be applied
to many kinds of domains where a textual description is not available or regularities in
the words used in the textual description are not informative (e.g. [4]). One of the
most popular algorithms in collaborative filtering is a correlation-based approach. In
this paper, we report experimental results comparing this approach with collaborative
filtering based on the Simple Bayesian Classifier as an alternative.
This paper is organized as follows. We first present the central ideas of current typical
collaborative filtering algorithms. We then define two alternative formulations of the
Simple Bayesian Classifier for collaborative filtering. Next, we evaluate our algorithms
on databases of user ratings for movies and jokes, and show that our approach
outperforms the correlation-based collaborative filtering algorithm. Finally, we discuss
the results and summarize this paper.
2 Collaborative Filtering
The main idea of collaborative filtering is to recommend new items of interest for a
particular user based on other users’ opinions. A variety of collaborative filtering algorithms
have been reported and their performance has been evaluated empirically
([2], [15], [16]). These algorithms are based on a simple intuition: predictions for a
user should be based on the preference patterns of other people who have similar interests.
Therefore, the first step of these algorithms is to find similarities between user
ratings. Resnick et al. [15] use the Pearson correlation coefficient as a measure of
preference similarity. The correlation between users j and j’ is:

where all summations over i are over the items which have been rated by both j and j’.
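
As an illustration of this similarity measure, the following is a minimal Python sketch of the Pearson correlation between two users, restricted to the items both have rated. The function name, the dictionary-based data layout, and the example users are hypothetical and chosen only for this sketch. It also assumes that each user's mean rating is taken over the co-rated items and that a correlation of 0 is returned when the overlap is too small or a user's ratings are constant; other conventions exist.

```python
from math import sqrt


def pearson_similarity(ratings_j, ratings_k):
    """Pearson correlation between two users over their co-rated items.

    Each argument is a dict mapping item id -> numeric rating
    (a hypothetical data layout chosen for this sketch).
    """
    # Only items rated by both users enter the sums.
    common = set(ratings_j) & set(ratings_k)
    if len(common) < 2:
        return 0.0  # assumption: too little overlap is treated as no correlation

    # Assumption: means are computed over the co-rated items only.
    mean_j = sum(ratings_j[i] for i in common) / len(common)
    mean_k = sum(ratings_k[i] for i in common) / len(common)

    cov = sum((ratings_j[i] - mean_j) * (ratings_k[i] - mean_k) for i in common)
    var_j = sum((ratings_j[i] - mean_j) ** 2 for i in common)
    var_k = sum((ratings_k[i] - mean_k) ** 2 for i in common)

    if var_j == 0 or var_k == 0:
        return 0.0  # assumption: constant ratings leave the correlation undefined
    return cov / sqrt(var_j * var_k)


# Hypothetical users: bob rates like alice, carol rates in the opposite order.
alice = {"m1": 5, "m2": 3, "m3": 1, "m4": 4}
bob = {"m1": 4, "m2": 2, "m3": 0, "m5": 5}
carol = {"m1": 1, "m2": 3, "m3": 5}

print(pearson_similarity(alice, bob))    # like-minded users
print(pearson_similarity(alice, carol))  # opposite tastes
```

Items rated by only one of the two users (m4 and m5 here) do not influence the result, which matches the restriction of the summations to co-rated items.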
The predicted value of user j for item i is computed as a weighted sum of other users’
