Several clustering algorithms can be applied to clustering in
large multimedia databases. The effectiveness and efficiency
of the existing algorithms, however, are somewhat limited,
since clustering in multimedia databases requires clustering
high-dimensional feature vectors and since multimedia
databases often contain large amounts of noise. In this paper,
we therefore introduce a new algorithm for clustering
in large multimedia databases called DENCLUE (DENsity-based
CLUstEring). The basic idea of our new approach is
to model the overall point density analytically as the sum
of influence functions of the data points. Clusters can then
be identified by determining density-attractors, and clusters
of arbitrary shape can be easily described by a simple equation
based on the overall density function. The advantages
of our new approach are that (1) it has a firm mathematical basis,
(2) it has good clustering properties in data sets with large
amounts of noise, (3) it allows a compact mathematical ...
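
The basic idea described above, an overall density built as the sum of
per-point influence functions, with clusters found through
density-attractors, can be illustrated with a minimal sketch. It is
not a definitive implementation of the algorithm. It assumes a Gaussian
influence function and a simple fixed-step hill-climb to locate
attractors; the parameters sigma, step, and merge_radius, as well as
all function names, are illustrative choices that do not come from the
text above.

```python
import math

def gaussian_influence(x, y, sigma):
    """Influence of data point y at location x (assumed Gaussian kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def density(x, data, sigma):
    """Overall density at x: the sum of the influences of all data points."""
    return sum(gaussian_influence(x, p, sigma) for p in data)

def density_gradient(x, data, sigma):
    """Gradient direction of the density at x (positive 1/sigma^2 factor dropped)."""
    grad = [0.0] * len(x)
    for p in data:
        w = gaussian_influence(x, p, sigma)
        for i in range(len(x)):
            grad[i] += (p[i] - x[i]) * w
    return grad

def find_attractor(x, data, sigma, step=0.05, max_iter=500):
    """Hill-climb from x along the normalized gradient (assumed procedure).

    Stops when a step no longer increases the density, so the result is
    only an approximation of a local density maximum.
    """
    current = list(x)
    current_density = density(current, data, sigma)
    for _ in range(max_iter):
        grad = density_gradient(current, data, sigma)
        norm = math.sqrt(sum(g * g for g in grad))
        if norm == 0.0:
            break
        candidate = [c + step * g / norm for c, g in zip(current, grad)]
        candidate_density = density(candidate, data, sigma)
        if candidate_density <= current_density:
            break
        current, current_density = candidate, candidate_density
    return current

def cluster(data, sigma, merge_radius):
    """Label each point by its attractor; nearby attractors share a label."""
    attractors = []
    labels = []
    for p in data:
        a = find_attractor(p, data, sigma)
        for idx, b in enumerate(attractors):
            if math.dist(a, b) < merge_radius:
                labels.append(idx)
                break
        else:
            attractors.append(a)
            labels.append(len(attractors) - 1)
    return labels, attractors
```

Calling cluster(points, sigma=0.5, merge_radius=1.0) on two
well-separated groups of points should give each group its own label.
Noise handling (for example, discarding attractors whose density falls
below a threshold) is deliberately left out of this sketch.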

