Cluster analysis divides data into meaningful or useful groups (clusters). If meaningful clusters are the goal, then the resulting clusters should capture the “natural” structure of the data. For example, cluster analysis has been used to group related documents for browsing, to find genes and proteins that have similar functionality, and to provide a grouping of spatial locations prone to earthquakes. However, in other cases, cluster analysis is only a useful starting point for other purposes, e.g., data compression or efficiently finding the nearest neighbors of points. Whether for understanding or utility, cluster analysis has long been used in a wide variety of fields: psychology and other social sciences, biology, statistics, pattern recognition, information retrieval, machine learning, and data mining.
The scope of this paper is modest: to provide an introduction to cluster analysis in the field of data mining, where we define data mining to be the discovery of useful, but non-obvious, information or patterns in large collections of data. Much of this paper is necessarily devoted to providing a general background for cluster analysis, but we also discuss a number of clustering techniques that have recently been developed specifically for data mining. While the paper strives to be self-contained from a conceptual point of view, many details have been omitted. Consequently, many references to relevant books and papers are provided.

