Clustering Approach using Hierarchical Coupling Learning for Categorical Data
Pages : 534-537
Abstract
Large volumes of data are generated from various sources. Many real-world applications generate categorical data with a finite set of unordered feature values. Unlike numerical data, categorical data cannot be processed directly with algebraic operations. Hence many machine learning algorithms designed for numerical data cannot be applied directly to categorical datasets. The categorical data is first converted to a numerical form, after which such numerical machine learning algorithms can be applied. Much work has been done in the literature on data representation. A good data representation should effectively capture the intrinsic characteristics of the data. Some techniques in the literature focus on low-level strong couplings between feature values, while others focus on high-level clusters of feature values. The proposed system uses the Coupled Unsupervised categorical data Representation (CURE) framework. It uses a hierarchical learning structure and captures value-to-value couplings as well as value-cluster couplings. Along with the data representation, the Coupled Data Embedding (CDE) algorithm is proposed to cluster categorical data through its numerical representation. The clusters of numerical data are generated using the k-means algorithm. The results are compared with clustering of the categorical data using the k-modes algorithm.
Keywords: Categorical data representation, clustering, unsupervised learning, k-means
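The pipeline described in the abstract (convert categorical values to numbers, then cluster the numerical vectors with k-means) can be illustrated with a minimal sketch. It is not the CURE/CDE method: plain one-hot encoding is used here as a simple stand-in for the learned coupling-based representation, and the toy data, the function names (`one_hot_encode`, `kmeans`) and the farthest-first initialization are all assumptions made for illustration only.

```python
def one_hot_encode(rows):
    """Map each categorical row to a 0/1 vector, one slot per (feature, value).

    Assumption: one-hot encoding stands in for the CDE embedding, which
    is NOT reproduced here.
    """
    n_features = len(rows[0])
    # Sorted value lists keep the column order deterministic.
    values = [sorted({row[j] for row in rows}) for j in range(n_features)]
    index = {}
    for j, vals in enumerate(values):
        for v in vals:
            index[(j, v)] = len(index)
    vectors = []
    for row in rows:
        vec = [0.0] * len(index)
        for j, v in enumerate(row):
            vec[index[(j, v)]] = 1.0
        vectors.append(vec)
    return vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=20):
    """Plain k-means with deterministic farthest-first initialization."""
    # Illustrative choice: start from the first point, then repeatedly add
    # the point farthest from all centers chosen so far.
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centers),
        )
        centers.append(list(farthest))

    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels and _ > 0:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy dataset: (colour, size, shape) per object.
rows = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "flat"),
    ("blue", "huge", "square"),
]

vectors = one_hot_encode(rows)
labels = kmeans(vectors, k=2)
print(labels)
```

One-hot encoding treats every pair of distinct values as equally dissimilar, which is exactly the limitation that a coupling-based representation aims to address; the sketch only shows where such a representation would plug in before the k-means step.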