Papers
Topics
Authors
Recent
Search
2000 character limit reached

Clustering validity based on the most similarity

Published 16 Feb 2013 in cs.LG and stat.ML | (1302.3956v1)

Abstract: One basic requirement of many studies is the necessity of classifying data. Clustering is a proposed method for summarizing networks. Clustering methods can be divided into two categories named model-based approaches and algorithmic approaches. Since the most of clustering methods depend on their input parameters, it is important to evaluate the result of a clustering algorithm with its different input parameters, to choose the most appropriate one. There are several clustering validity techniques based on inner density and outer density of clusters that represent different metrics to choose the most appropriate clustering independent of the input parameters. According to dependency of previous methods on the input parameters, one challenge in facing with large systems, is to complete data incrementally that effects on the final choice of the most appropriate clustering. Those methods define the existence of high intensity in a cluster, and low intensity among different clusters as the measure of choosing the optimal clustering. This measure has a tremendous problem, not availing all data at the first stage. In this paper, we introduce an efficient measure in which maximum number of repetitions for various initial values occurs.

Citations (1)

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.