Survey on Clustering of Text using COATES Methodology
Pages: 900-903
Abstract
In many text mining applications, side-information is available along with the text documents. Such side-information may include document provenance, the links in the document, user-access behavior from web logs, or other non-textual attributes embedded in the documents. These attributes can lead to better clustering results. However, the relative importance of this side-information may be difficult to estimate, especially when some of it is noisy. A principled way to perform the mining process is therefore required, so as to maximize the advantages of using side-information. In this paper, we design an algorithm that combines classical partitioning algorithms with probabilistic models in order to create an effective clustering approach.
Keywords: Clustering, Data Mining
Article published in International Journal of Current Engineering and Technology, Vol.6, No.3 (June-2016)
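
The idea stated in the abstract, combining a partitioning step on the text with a probabilistic model of the side-information, can be illustrated with a small sketch. This is not the algorithm from the paper; it is a minimal illustration under stated assumptions: documents are term-count dictionaries paired with a set of side attributes, the content step is a k-means-style cosine assignment, and the side step uses Laplace-smoothed estimates of P(cluster | attribute). The names `cluster`, `side_posteriors`, and the `side_weight` parameter are hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroids(docs, labels, k):
    """Sum the term counts of the documents assigned to each cluster."""
    sums = [defaultdict(float) for _ in range(k)]
    for (terms, _), c in zip(docs, labels):
        for t, v in terms.items():
            sums[c][t] += v
    return sums


def side_posteriors(docs, labels, k):
    """Estimate P(cluster | attribute) with Laplace smoothing (assumed model)."""
    counts = defaultdict(lambda: [0] * k)
    for (_, attrs), c in zip(docs, labels):
        for a in attrs:
            counts[a][c] += 1
    return {a: [(n + 1) / (sum(cs) + k) for n in cs] for a, cs in counts.items()}


def cluster(docs, k, side_weight=0.5, iters=10):
    """Alternate content-based and side-information-based evidence.

    docs: list of (term_counts: dict, side_attributes: set) pairs.
    side_weight: hypothetical knob for how much side-information counts.
    """
    labels = [i % k for i in range(len(docs))]  # simple deterministic start
    for _ in range(iters):
        cents = centroids(docs, labels, k)
        post = side_posteriors(docs, labels, k)
        new = []
        for terms, attrs in docs:
            scores = []
            for j in range(k):
                s = cosine(terms, cents[j])
                if attrs:
                    # Average posterior of cluster j over the doc's attributes.
                    s += side_weight * sum(post[a][j] for a in attrs) / len(attrs)
                scores.append(s)
            new.append(max(range(k), key=scores.__getitem__))
        if new == labels:  # converged
            break
        labels = new
    return labels


if __name__ == "__main__":
    docs = [
        ({"ball": 2, "goal": 1}, {"sports"}),
        ({"vote": 2, "law": 1}, {"politics"}),
        ({"goal": 2, "team": 1}, {"sports"}),
        ({"law": 2, "senate": 1}, {"politics"}),
        ({"win": 1}, {"politics"}),  # text alone is ambiguous
    ]
    print(cluster(docs, k=2, side_weight=0.0))  # text only
    print(cluster(docs, k=2, side_weight=2.0))  # text plus side-information
```

In this toy data the last document shares no terms with the others, so its assignment depends on the side attribute once `side_weight` is large enough. A fixed weight is the simplest choice; since the abstract notes that side-information can be noisy, a fuller treatment would need some way to discount uninformative attributes.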