Duplicate Detection in Click Streams

Posted: 2004-12-13

by: Ahmed Metwally, Divyakant Agrawal, and Amr El Abbadi

Abstract:

We consider the problem of finding duplicates in data streams. Duplicate detection in data streams is used in various applications, including fraud detection. We develop a solution based on Bloom Filters, and discuss the space and time requirements for running the proposed algorithm in the contexts of both sliding and landmark stream windows. We run a comprehensive set of experiments, using both real and synthetic click streams, to evaluate the performance of the proposed solution. The results demonstrate that the proposed solution yields extremely low error rates.
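
To make the idea concrete, here is a minimal sketch of Bloom-filter duplicate flagging over landmark windows. It is not the authors' algorithm, which the abstract does not spell out. The names `BloomFilter` and `flag_duplicates`, the salted SHA-256 hashing, the filter size, and the rule of starting a new landmark every `window_size` clicks are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over strings (illustrative, not from the paper)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly seen" (false positives can occur);
        # False means "definitely not seen since the last clear".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )

    def clear(self):
        self.bits = bytearray(len(self.bits))


def flag_duplicates(clicks, window_size, num_bits=1 << 16, num_hashes=4):
    """Yield (click_id, is_duplicate) pairs.

    Assumption: a new landmark starts every window_size clicks, at which
    point the filter is reset and earlier clicks are forgotten.
    """
    seen = BloomFilter(num_bits, num_hashes)
    for index, click_id in enumerate(clicks):
        if index > 0 and index % window_size == 0:
            seen.clear()  # new landmark: drop everything seen so far
        duplicate = click_id in seen
        if not duplicate:
            seen.add(click_id)
        yield click_id, duplicate
```

For example, `list(flag_duplicates(["a", "b", "a"], window_size=4))` flags the second `"a"` as a duplicate. A sliding window is harder because expired clicks must be removed, and a plain Bloom filter like this one does not support deletion, so this sketch covers only the landmark case.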

Keywords:

Data Streams, Duplicate Detection, Bloom Filters, Advertising Networks, Sliding Windows, Landmark Windows

Date:

September 2004

Document: 2004-23
