Title of article :
Statistical modeling of dissimilarity increments for d-dimensional data: Application in partitional clustering
Aidos, Helena (author) and Fred, Ana (author)
Issue Information :
Journal, serial issue, year 2012
This paper addresses the use of high-order dissimilarity models in data mining problems. We explore dissimilarities between triplets of nearest neighbors, called dissimilarity increments (DIs). We derive a statistical model of DIs for d-dimensional data (d-DID), assuming that the objects follow a multivariate Gaussian distribution. Empirical evidence shows that the d-DID is well approximated by the particular case d=2. We propose applying this model to clustering through a partitional algorithm that uses a merge strategy on Gaussian components. Experimental results on synthetic and real datasets show that clustering algorithms using DID usually outperform well-known clustering algorithms.
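As an illustration of the triplet construction mentioned in the abstract, the following is a minimal sketch of how dissimilarity increments might be computed. It assumes the commonly used definition, which the abstract itself does not spell out: for a point x_i with nearest neighbor x_j, and x_k the nearest neighbor of x_j other than x_i, the increment is |d(x_i, x_j) - d(x_j, x_k)|. The function name `dissimilarity_increments` and the choice of Euclidean distance are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def dissimilarity_increments(points):
    """Return one dissimilarity increment per point (assumed definition).

    For each x_i: x_j is its nearest neighbor, x_k is the nearest
    neighbor of x_j other than x_i, and the increment is
    |d(x_i, x_j) - d(x_j, x_k)| with Euclidean distance d.
    """
    x = np.asarray(points, dtype=float)
    n = x.shape[0]
    if n < 3:
        raise ValueError("need at least three points to form a triplet")

    # Pairwise Euclidean distances; the diagonal is set to infinity so
    # that a point is never selected as its own nearest neighbor.
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)

    increments = np.empty(n)
    for i in range(n):
        j = int(np.argmin(dist[i]))   # nearest neighbor of x_i
        row = dist[j].copy()
        row[i] = np.inf               # x_k must differ from x_i
        k = int(np.argmin(row))       # nearest neighbor of x_j, excluding x_i
        increments[i] = abs(dist[i, j] - dist[j, k])
    return increments
```

For three collinear points at 0, 1 and 3 on the x-axis, `dissimilarity_increments([[0, 0], [1, 0], [3, 0]])` gives increments of 1, 2 and 1. The sketch builds the full distance matrix, which is quadratic in the number of points; a nearest-neighbor index would be a natural substitute for larger datasets.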
Dissimilarity increments, Likelihood-ratio test, Minimum Description Length, Gaussian mixture decomposition, Partitional clustering
Journal title :