Survey on Clustering Algorithm & Diagnosing Unsupervised Anomalies for Network Security
Pages : 2122-2125
Abstract
Data clustering is the process of putting similar data into groups. A clustering algorithm partitions a data set into several groups such that the similarity within a group is larger than the similarity between groups. This paper reviews four clustering techniques: K-Means Clustering, K-Median Clustering, Density Based Clustering, and Filtered Clustering. The performance of the four techniques is presented and compared. We also discuss a completely unsupervised approach to detecting network attacks that does not rely on signatures, labeled traffic, or training, and we discuss the limitations of supervised detection of network attacks in an increasingly complex and ever-evolving Internet. To show the feasibility of such a knowledge-independent (unsupervised) approach, we develop UNADA, an Unsupervised Network Anomaly Detection Algorithm. UNADA uses a novel and robust multi-clustering-based detection technique, and we evaluate its ability to detect and characterize network attacks without any prior knowledge. The evidence of traffic structure provided by these multiple clusterings is then combined to produce an abnormality ranking of traffic flows using a correlation-distance-based approach. Additionally, we compare its performance against previous unsupervised detection methods using traffic from two different networks.
Keywords: Data clustering, Density based Clustering, Filtered cluster, K-Means clustering, K-Median clustering, Unsupervised Anomaly Detection.
Article published in International Journal of Current Engineering and Technology, Vol. 3, No. 5 (Dec. 2013)
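
As an illustration of the partitioning idea in the abstract, the following minimal sketch contrasts two of the surveyed techniques, K-Means and K-Median. It is not the paper's implementation. The function names (`lloyd`, `k_means`, `k_median`), the naive initialization from the first k points, and the toy `flows` data are all assumptions made for this example. The two variants differ only in the center update (coordinate-wise mean versus coordinate-wise median) and in the distance used for assignment (squared Euclidean versus Manhattan).

```python
from statistics import mean, median


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def lloyd(points, k, center_fn, dist_fn, iters=20):
    """Alternate assignment and center update; return (labels, centers)."""
    # Assumption: naive deterministic initialization from the first k points.
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: dist_fn(p, centers[c]))
                  for p in points]
        # Update step: recompute each center coordinate by coordinate.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(center_fn(col) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return labels, centers


def k_means(points, k):
    return lloyd(points, k, mean, squared_euclidean)


def k_median(points, k):
    return lloyd(points, k, median, manhattan)


# Hypothetical two-feature "flows" forming two well-separated groups.
flows = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (0.9, 1.1), (8.2, 7.9), (7.8, 8.1)]

if __name__ == "__main__":
    print(k_means(flows, 2))
    print(k_median(flows, 2))
```

On well-separated data like this, both variants produce the same grouping. The median update is less sensitive to extreme values within a cluster, which is the usual reason for choosing K-Median over K-Means.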