Collaborative filtering

Author: unknown. Date: 2004-11-25

Collaborative filtering (CF) is a method of making automatic predictions (filtering) about the interests of a user by collecting taste information from many users (collaborating). The underlying assumption of the CF approach is that those who agreed in the past tend to agree again in the future. For example, a collaborative filtering or recommendation system for music tastes could predict which music a user should like, given a partial list of that user's tastes (likes or dislikes). Note that these predictions are specific to the user, but use information gleaned from many users. This differs from the simpler approach of giving an average (non-specific) score for each item of interest, for example based on its number of votes.

Collaborative filtering systems usually take two steps:

  1. Look for users who share the same rating patterns as the active user (the user for whom the prediction is made).
  2. Use the ratings from the like-minded users found in step 1 to calculate a prediction for the active user.
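
The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the toy `ratings` data, the function names, and the choice of cosine similarity as the measure of "like-mindedness" are all assumptions made for the example (Pearson correlation is another common choice).

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"song_a": 5, "song_b": 3, "song_c": 4},
    "bob":   {"song_a": 5, "song_b": 3, "song_c": 4, "song_d": 5},
    "carol": {"song_a": 1, "song_b": 5, "song_d": 2},
}

def cosine_similarity(u, v):
    """Cosine similarity computed over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict(ratings, active, item):
    """Predict the active user's rating for an item.

    Step 1: weigh every other user by similarity to the active user.
    Step 2: return the similarity-weighted average of their ratings.
    Returns None when no similar user has rated the item.
    """
    num = den = 0.0
    for other, their_ratings in ratings.items():
        if other == active or item not in their_ratings:
            continue
        weight = cosine_similarity(ratings[active], their_ratings)
        num += weight * their_ratings[item]
        den += weight
    return num / den if den else None

print(predict(ratings, "alice", "song_d"))
```

Because bob's ratings match alice's exactly while carol's do not, the prediction for alice is pulled toward bob's rating of `song_d` rather than sitting at the plain average of the two.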

Alternatively, item-based collaborative filtering, popularized by Amazon.com ("users who bought x also bought y") and first proposed in the context of rating-based collaborative filtering by Vucetic and Obradovic in 2000, proceeds in an item-centric manner:

  1. Build an item-item matrix that records the relationships between pairs of items.
  2. Using the matrix and the data on the current user, infer that user's taste.
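
As a rough illustration of the item-centric procedure, the sketch below builds the item-item matrix from co-purchase counts, in the spirit of "users who bought x also bought y". The `purchases` data and the function names are hypothetical, and counting co-occurrences is only one assumed way of relating pairs of items; rating-based variants use other measures.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical purchase histories: user -> set of items bought.
purchases = {
    "alice": {"x", "y"},
    "bob":   {"x", "y", "z"},
    "carol": {"x", "y"},
    "dave":  {"z", "w"},
}

def build_item_matrix(purchases):
    """Step 1: for each pair of items, count how many users bought both."""
    matrix = defaultdict(lambda: defaultdict(int))
    for items in purchases.values():
        for a, b in combinations(sorted(items), 2):
            matrix[a][b] += 1
            matrix[b][a] += 1
    return matrix

def recommend(matrix, owned, top_n=3):
    """Step 2: score items the user lacks by co-occurrence with owned items."""
    scores = defaultdict(int)
    for item in owned:
        for other, count in matrix.get(item, {}).items():
            if other not in owned:
                scores[other] += count
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]

matrix = build_item_matrix(purchases)
print(recommend(matrix, {"x"}))
```

The matrix depends only on the items, so it can be built ahead of time and reused for every user; only the cheap second step runs per request.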

See, for example, the Slope One item-based collaborative filtering family.
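
For readers unfamiliar with Slope One, here is a minimal sketch of its weighted variant. The idea is to store, for each pair of items, the average difference between their ratings, and to predict a rating by adding that difference to what the user gave the other item. The data and function names are made up for the example, and this is a simplified reading of the scheme rather than a faithful copy of any published code.

```python
from collections import defaultdict

# Hypothetical ratings: user -> {item: rating}.
ratings = {
    "john": {"a": 5, "b": 3, "c": 2},
    "mark": {"a": 3, "b": 4},
    "lucy": {"b": 2, "c": 5},
}

def build_deviations(ratings):
    """Return (dev, count), where dev[a][b] is the mean of (rating of a minus
    rating of b) over the users who rated both, and count[a][b] is the number
    of such users."""
    total = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for items in ratings.values():
        for a, rating_a in items.items():
            for b, rating_b in items.items():
                if a != b:
                    total[a][b] += rating_a - rating_b
                    count[a][b] += 1
    dev = {a: {b: total[a][b] / count[a][b] for b in total[a]} for a in total}
    return dev, count

def predict(dev, count, user_ratings, target):
    """Weighted Slope One: average (deviation + user's rating) over the items
    the user has rated, weighting each by the number of co-rating users."""
    num = den = 0.0
    for item, rating in user_ratings.items():
        n = count.get(target, {}).get(item, 0)
        if n == 0:
            continue  # no user rated both items (or item is the target itself)
        num += (dev[target][item] + rating) * n
        den += n
    return num / den if den else None

dev, count = build_deviations(ratings)
print(predict(dev, count, ratings["lucy"], "a"))
```

In this toy data, item a is rated 0.5 higher than b on average (two users) and 3 higher than c (one user), so lucy's predicted rating for a is (2 × 2.5 + 1 × 8) / 3, roughly 4.33.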

Another form of collaborative filtering can be based on implicit observations of normal user behavior (as opposed to the artificial behavior imposed by a rating task). These systems observe what a user has done together with what all users have done (what music they have listened to, what items they have bought) and use that data to predict the user's behavior in the future, or to predict how a user might like to behave given the chance. These predictions then have to be filtered through business logic to determine how they should affect what a business system does. It is, for instance, not useful to offer to sell somebody some music if they have already demonstrated that they own that music.
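
A small sketch may make the split between prediction and business logic concrete. It assumes hypothetical listening logs, uses Jaccard overlap between users' listening sets as an assumed similarity measure, and applies a single business rule: never offer a track the user already owns. All names and data are invented for the illustration.

```python
# Hypothetical implicit data: user -> set of tracks they have played.
listened = {
    "ann":  {"t1", "t2", "t3"},
    "ben":  {"t1", "t2", "t4"},
    "cara": {"t5", "t6"},
}

# Hypothetical business data: tracks each user is known to own.
owned = {
    "ann":  {"t1", "t2", "t3"},
    "ben":  {"t1"},
    "cara": set(),
}

def jaccard(a, b):
    """Overlap between two sets of observed behavior (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def score_candidates(listened, active):
    """Predict interest: each track gets the summed similarity of the
    other users who played it."""
    scores = {}
    mine = listened[active]
    for other, theirs in listened.items():
        if other == active:
            continue
        weight = jaccard(mine, theirs)
        if weight == 0.0:
            continue
        for track in theirs:
            scores[track] = scores.get(track, 0.0) + weight
    return scores

def apply_business_rules(scores, already_owned):
    """Business logic: drop tracks the user owns, then rank the rest."""
    keep = {t: s for t, s in scores.items() if t not in already_owned}
    return sorted(keep, key=lambda t: (-keep[t], t))

scores = score_candidates(listened, "ann")
print(apply_business_rules(scores, owned["ann"]))
```

Keeping the two stages separate means the prediction step can stay purely behavioral, while rules such as "skip owned items" can change without retraining anything.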

In the age of information explosion, such techniques can prove very useful, as the number of items in even a single category (such as music, movies, books, news, web pages) has become so large that a single person cannot possibly view them all in order to select the relevant ones. Relying on a scoring or rating system that is averaged across all users ignores the specific demands of a user, and performs particularly poorly in tasks where there is large variation in interest, for example in the recommendation of music. Other methods to combat information explosion exist, such as web search and clustering.

More recently, collaborative filtering has been used in e-learning to promote and benefit from students' collaboration.

Contents

  • 1 Commercial systems
  • 2 Non-commercial systems
  • 3 Software libraries
  • 4 See also
  • 5 External links

Commercial systems

There are commercial sites that implement collaborative filtering systems. For example:

  • AlexLit.com
  • Amazon
  • Barnes and Noble
  • Findory.com
  • GenieLab - music
  • half.ebay.com
  • Hollywood Video
  • jimmys.tv - video
  • Loomia - web service
  • Musicmatch
  • Netflix
  • radiolibre.ca
  • Sourcelight Technologies Inc
  • StoryCode - books
  • TiVo

Non-commercial systems

There are also non-commercial collaborative filtering systems:

  • Alongtail - movies
  • AmphetaRate - RSS articles
  • Last.fm - music
  • Clinko - music & movies
  • Everyone's a Critic - movies
  • FilmAffinity - movies
  • GiveALink.org - websites
  • Gnod (The Global Network of Dreams) - music, movies, and authors of books
  • Gnomoradio - free music
  • Indy - free music
  • iRATE radio - free music
  • KindaKarma - authors, video games, movies and music
  • Moonranker - music, movies, and books
  • MovieCritic - movies (closed by Macromedia)
  • MovieLens - movies
  • MusicStrands - music
  • Music Recommendation System for iTunes - music
  • Musicmobs - music
  • Popularism - movies
  • Rate Your Music - music
  • StumbleUpon - websites
  • Upto11 - music
  • Wikilens - various

Software libraries

There are also software libraries which allow a developer to add collaborative filtering to an application or web site:

  • Taste - open-source, Java
  • Cofi - open-source, Java
  • CoFE - open-source, Java
  • RACOFI - open-source, Java
  • MultiLens - open-source, Java, an old version of the code that runs MovieLens. See also the author's page.
  • SUGGEST - free, C (a library, not open source)
  • Rating-Based Item-to-Item - public domain, PHP
  • Vogoo PHP Lib - open-source, PHP
  • Music - open-source, PHP/SQL
  • consensus - open-source, Python

See also

  • Collective intelligence
  • The Long Tail
  • Recommendation system
  • Reputation system

External links

  • Collaborative Filtering Research Papers by James Thornton
  • Collaborative Filtering by Francis Heylighen
  • Collaborative Filtering Resources by Jun Wang
  • Evaluating collaborative filtering recommender systems (DOI: 10.1145/963770.963772)
  • GroupLens research papers. GroupLens is one of the research labs that did a lot of pioneering research in collaborative filtering.
  • 'Social Information Filtering: Algorithms for Automating "Word of Mouth"' by Upendra Shardanand
  • 'Learning utility graphs for multi-issue negotiation using collaborative filtering' by Valentin Robu
  • A collection of past and present "information filtering" projects (including collaborative filtering) at MIT Media Lab
  • Collaborative filtering visualized as a network using Amazon data on political book purchases
Retrieved from "https://secure.wikimedia.org/wikipedia/en/wiki/Collaborative_filtering"