Efficient Ranked Multi-Keyword Search using Machine Learning Algorithms
Pages: 978-983
Abstract
The growing number of documents in the search index of an information retrieval system makes ranking those documents difficult. The problem has reached the point where machine learning is the most effective way to optimize the ranking function. Keyword search is an effective method of retrieving information from such collections. The aim of keyword search is to find answers that cover all or part of the queried keywords. A challenge in keyword search systems is ranking the answers according to their importance, which lies in their textual content and structural compactness. Classification is the process of grouping text documents, based on words, phrases, or combinations of them, into a set of predefined categories. Document classification has many applications, such as mail routing, email filtering, content classification, news monitoring, and narrow-casting. Keywords are retrieved from documents in order to classify them. Keywords are the subset of words that carry the most important information about the content of a document, and keyword retrieval is the process of extracting these words. In the proposed system, keywords are extracted from documents using the TF-IDF and Naïve Bayes algorithms. The TF-IDF algorithm is used to select candidate words, and the words with high similarity are taken as keywords. The study uses the Naïve Bayes algorithm, and its performance is analyzed using machine learning. The system supports keyword-based top-k ranked search over data held on a secure server. It provides accurate document ranking and efficient search through the use of a tree-based index and an efficient search algorithm.
Keywords: Keyword-based search, machine learning, Naïve Bayes algorithm, TF-IDF algorithm, ranking, classification.
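
As a rough illustration of the TF-IDF keyword extraction and top-k ranked search that the abstract describes, the following minimal Python sketch may help. It is not the paper's implementation: the function names (`tokenize`, `tf_idf`, `top_keywords`, `top_k_search`), the particular TF-IDF variant (term frequency normalized by document length, IDF as log(N / df)), and the scoring rule (sum of TF-IDF weights of the query terms) are all assumptions made for illustration. The Naïve Bayes classifier, the tree-based index, and the secure-server setting are omitted.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


def top_keywords(weight, n=3):
    """Pick the n highest-weighted terms of one document as its keywords."""
    ranked = sorted(weight.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


def top_k_search(weights, query, k=2):
    """Rank documents by the summed TF-IDF weight of the query terms.

    Documents matching none of the query terms are dropped, so an answer
    may cover all or only part of the queried keywords.
    """
    terms = tokenize(query)
    scored = [(sum(w.get(t, 0.0) for t in terms), i) for i, w in enumerate(weights)]
    hits = [(s, i) for s, i in scored if s > 0]
    hits.sort(key=lambda si: (-si[0], si[1]))
    return [i for _, i in hits[:k]]


if __name__ == "__main__":
    # Hypothetical toy corpus, not data from the paper.
    docs = [
        "spam filter email email",
        "news monitoring content",
        "email routing news",
    ]
    w = tf_idf(docs)
    print(top_keywords(w[2], n=1))
    print(top_k_search(w, "email routing", k=2))
```

The sketch scans every document on each query, which is adequate only for small collections; an index structure such as the tree-based index mentioned in the abstract would be needed to avoid the linear scan.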