Automatic Subspace Clustering of High Dimensional Data for D |
Clustering is a descriptive task that seeks to identify homogeneous groups of objects based on the values of their attributes (dimensions) [24] [25]. Clustering techniques have been studied extensively in statistics [3], pattern recognition [11] [19], and machine learning [9] [31]. Recent work in the database community includes CLARANS [33], Focused CLARANS [14], BIRCH [45], and DBSCAN [13].

Current clustering techniques can be broadly classified into two categories [24] [25]: partitional and hierarchical. Given a set of objects and a clustering criterion [39], partitional clustering obtains a partition of the objects into clusters such that the objects in a cluster are more similar to each other than to objects in different clusters. The popular K-means and K-medoid methods determine K cluster representatives and assign each object to the cluster whose representative is closest to it, such that the sum of the squared distances between the objects and their representatives is minimized. CLARANS [33], Focused CLARANS [14], and BIRCH [45] can be viewed as extensions of this approach that work against large databases. Mode-seeking clustering methods identify clusters by searching for regions of the data space in which the object density is large. DBSCAN [13] finds dense regions that are separated by low-density regions and clusters together the objects in the same dense region.

A hierarchical clustering is a nested sequence of partitions. An agglomerative hierarchical clustering starts by placing each object in its own cluster and then merges these atomic clusters into larger and larger clusters until all objects are in a single cluster. Divisive hierarchical clustering reverses the process by starting with all objects in one cluster and subdividing it into smaller pieces [24].
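To make the K-means description above concrete, here is a minimal sketch in Python. It is not code from the paper: it uses the standard Lloyd iteration (alternating nearest-representative assignment with recomputing each representative as the mean of its members), and the names `squared_distance`, `assign`, `sse`, and `kmeans`, as well as the `iterations` and `seed` parameters, are hypothetical choices made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """For each point, return the index of its closest representative."""
    return [
        min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
        for p in points
    ]


def sse(points, centers, labels):
    """Sum of squared distances between objects and their representatives."""
    return sum(squared_distance(p, centers[l]) for p, l in zip(points, labels))


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd-style K-means (an assumed, textbook variant).

    Starts from k representatives sampled from the data, then repeats
    'recompute means, reassign objects' until the assignment stops changing
    or the iteration budget runs out.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = assign(points, centers)
    for _ in range(iterations):
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                dim = len(members[0])
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                # An empty cluster keeps its previous representative.
                new_centers.append(centers[j])
        new_labels = assign(points, new_centers)
        centers = new_centers
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, labels = kmeans(data, k=2)
    print(labels, sse(data, centers, labels))
```

Each iteration of this scheme cannot increase the sum of squared distances, but it only reaches a local minimum that depends on the initial representatives, which is why the starting sample is made reproducible through `seed`.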