
Cluster Analysis


One big area of Data Mining is Cluster Analysis. We are given a set of data points, each having a set of attributes, and a similarity measure among them. The task is to discover clusters such that data points within one cluster are more similar to one another, while data points in separate clusters are less similar to one another.
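The goal stated above can be checked numerically. The following is a minimal sketch, assuming Euclidean distance as the (dis)similarity measure and a small hand-made 2D data set with known labels; the function names and the data are illustrative and not part of the original implementation.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_within_and_between(points, labels):
    """Average pairwise distance inside clusters and across clusters."""
    within, between = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        # Same label: intra-cluster pair, otherwise inter-cluster pair.
        (within if lp == lq else between).append(euclidean(p, q))
    return sum(within) / len(within), sum(between) / len(between)


# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 1.2), (0.8, 0.9),
          (8.0, 8.0), (8.3, 7.7), (7.9, 8.4)]
labels = [0, 0, 0, 1, 1, 1]

within, between = mean_within_and_between(points, labels)
print("mean distance within clusters: ", within)
print("mean distance between clusters:", between)
```

For a good clustering the first value should be clearly smaller than the second. Any other distance or similarity function could be swapped in for `euclidean`.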


[Image taken from: Pang-Ning Tan et al.]

Clustering is an ill-posed problem because several different notions of a cluster coexist (density, size and shape, hierarchy, etc.). Generally, two different approaches to clustering exist: hierarchical (nested) and partitional (unnested) clustering. In partitional clustering the points are divided into non-overlapping (exclusive) clusters, whereas in hierarchical clustering the clusters are nested, so a point belongs to one cluster at every level of the hierarchy. Clusterings in which a point may belong to several clusters at once are called non-exclusive; e.g. a person at a university can be enrolled as a student and also work there as an employee.
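To make the nested versus unnested distinction concrete, here is a minimal sketch of agglomerative clustering with single linkage (one common hierarchical scheme, chosen here only for illustration and not necessarily one of the algorithms implemented for this page). It records the clustering at every level; each level on its own is a partitional clustering, and the levels together form the nested hierarchy. The data and all names are assumptions.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def single_linkage_levels(points):
    """Return the clusterings from n singletons down to one cluster.

    Each cluster is a frozenset of point indices.
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    levels = [list(clusters)]
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] | clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        levels.append(list(clusters))
    return levels


# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 1.2), (0.8, 0.9),
          (8.0, 8.0), (8.3, 7.7), (7.9, 8.4)]

levels = single_linkage_levels(points)
two_clusters = levels[-2]  # cut the hierarchy where two clusters remain
for level in levels:
    print([sorted(c) for c in level])
```

Cutting the hierarchy at one level (here `levels[-2]`) yields an exclusive partition of the points, while reading all levels together shows how every cluster is contained in a larger one further up.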

There are a number of other properties and distinctions that characterize a set of clusters (or that at least try to describe clearly what counts as a cluster and how clusters are told apart). One such property is density: each cluster has a considerably higher density of points than the region outside the cluster (see figure).
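The density property can be illustrated by counting, for each point, how many other points lie within a fixed radius. This is only a sketch under assumptions: the radius `eps`, the threshold `min_neighbors`, and the data are made up for the example, and the idea of thresholding a neighborhood count is borrowed from density-based methods such as DBSCAN rather than taken from this page.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def neighbor_counts(points, eps):
    """For each point, count the other points within distance eps."""
    return [
        sum(1 for j, q in enumerate(points)
            if j != i and euclidean(p, q) <= eps)
        for i, p in enumerate(points)
    ]


# Hypothetical data: two dense groups and one isolated point in between.
points = [(1.0, 1.0), (1.5, 1.2), (0.8, 0.9),
          (8.0, 8.0), (8.3, 7.7), (7.9, 8.4),
          (4.5, 4.5)]

eps = 1.0            # assumed neighborhood radius
min_neighbors = 2    # assumed density threshold

counts = neighbor_counts(points, eps)
dense = [c >= min_neighbors for c in counts]
for point, count, is_dense in zip(points, counts, dense):
    print(point, count, "dense" if is_dense else "sparse")
```

Points inside the two groups have neighbors within the radius, while the isolated point has none, which is the kind of contrast a density-based notion of a cluster relies on.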



I implemented a few clustering algorithms that try to discover the true clusters in the provided data sets. Because each algorithm makes its own assumption about what a cluster is, the results can turn out quite different from one another.