4/2017 - 10
K-Linkage: A New Agglomerative Approach for Hierarchical Clustering
YILDIRIM, P., BIRANT, D.
clustering, data mining, data processing, knowledge discovery, unsupervised learning
About this article
Date of Publication: 2017-11-30
Volume 17, Issue 4, Year 2017, On page(s): 77 - 88
ISSN: 1582-7445, e-ISSN: 1844-7600
Digital Object Identifier: 10.4316/AECE.2017.04010
Web of Science Accession Number: 000417674300010
SCOPUS ID: 85035794377
In agglomerative hierarchical clustering, the traditional approaches to computing cluster distances are single, complete, average and centroid linkages. However, the single-link and complete-link approaches cannot always reflect the true underlying relationship between clusters, because they consider only a single pair of observations between two clusters. This may promote the formation of spurious clusters. To overcome this problem, this paper proposes a novel approach, named k-Linkage, which calculates the distance by considering k observations from two clusters separately. The article also introduces two novel concepts: k-min linkage (the average of the k closest pairs) and k-max linkage (the average of the k farthest pairs). In the experimental studies, the improved hierarchical clustering algorithm based on k-Linkage was executed on five well-known benchmark datasets with varying k values to demonstrate its efficiency. The results show that the proposed k-Linkage method can often produce clusters with better accuracy than the single, complete, average and centroid linkages.
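The two linkage definitions in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `k_linkage`, the use of Euclidean distance, the choice to rank all cross-cluster pairs, and the clamping of k to the number of available pairs are all assumptions made here for illustration.

```python
import numpy as np


def k_linkage(cluster_a, cluster_b, k, mode="min"):
    """Average distance over the k closest ("min") or k farthest ("max")
    cross-cluster pairs. Hypothetical helper; Euclidean distance assumed."""
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    if k < 1:
        raise ValueError("k must be at least 1")

    a = np.asarray(cluster_a, dtype=float)
    b = np.asarray(cluster_b, dtype=float)

    # Distances between every point of cluster A and every point of cluster B,
    # flattened into one sorted vector (ascending).
    distances = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1).ravel()
    distances.sort()

    # Assumption: if k exceeds the number of pairs, use all pairs.
    k = min(k, distances.size)
    chosen = distances[:k] if mode == "min" else distances[-k:]
    return float(chosen.mean())


# Two small one-dimensional clusters embedded in the plane.
A = [[0.0, 0.0], [1.0, 0.0]]
B = [[3.0, 0.0], [5.0, 0.0]]

k_min_2 = k_linkage(A, B, k=2, mode="min")  # mean of the 2 closest pairs
k_max_2 = k_linkage(A, B, k=2, mode="max")  # mean of the 2 farthest pairs
```

Under these assumptions, k = 1 reduces k-min linkage to single linkage and k-max linkage to complete linkage, while a k that covers every cross-cluster pair gives the average linkage.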
Faculty of Electrical Engineering and Computer Science
Stefan cel Mare University of Suceava, Romania
All rights reserved: Advances in Electrical and Computer Engineering is a registered trademark of the Stefan cel Mare University of Suceava. No part of this publication may be reproduced, stored in a retrieval system, photocopied, recorded or archived, without the written permission from the Editor. When authors submit their papers for publication, they agree that the copyright for their article be transferred to the Faculty of Electrical Engineering and Computer Science, Stefan cel Mare University of Suceava, Romania, if and only if the articles are accepted for publication. The copyright covers the exclusive rights to reproduce and distribute the article, including reprints and translations.