Text Chunker for Punjabi
Pages : 3349-3353
Abstract
Parsing is the process of assigning a parse tree to a sentence. Full parsing poses many problems, and shallow parsing, or chunking, is an alternative to it. In chunking, the words of a sentence are grouped into phrases (chunks). Chunking is more efficient and robust than full parsing: it takes less time and always produces a solution. It is often deterministic, giving only one solution to a problem. Chunkers are used in a large number of NLP applications, such as information extraction, named entity recognition, spell checkers, and search. Chunkers are relatively difficult to build for Indian languages because many problems arise during system development. A chunker identifies chunks such as noun chunks and verb chunks, and these chunks are non-overlapping regions of the sentence. In this work, the first standardized text chunker for the Punjabi language is built, and a greedy algorithm is used for machine learning and training on the data set.
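To make the idea of chunks as non-overlapping regions concrete, here is a minimal sketch in Python. It is not the chunker described in this work: the BIO tagging scheme (B-X begins a chunk, I-X continues it, O is outside any chunk), the function name bio_to_chunks, and the English example sentence with its tags are all assumptions chosen for illustration. The sketch only shows how per-word chunk tags map to contiguous spans that cannot overlap.

```python
from typing import List, Tuple

# A chunk is (label, start index, end index exclusive) over the token list.
Chunk = Tuple[str, int, int]


def bio_to_chunks(tags: List[str]) -> List[Chunk]:
    """Convert per-token BIO tags (assumed scheme) into non-overlapping spans."""
    chunks: List[Chunk] = []
    label, start = None, 0
    for i, tag in enumerate(tags):
        # An "I-X" tag continues the open chunk only if that chunk has label X.
        if tag.startswith("I-") and label == tag[2:]:
            continue
        # Any other tag closes the open chunk, if there is one.
        if label is not None:
            chunks.append((label, start, i))
            label = None
        # "B-X" (or a stray "I-X") opens a new chunk; "O" opens nothing.
        if tag != "O":
            label, start = tag[2:], i
    # Close a chunk that runs to the end of the sentence.
    if label is not None:
        chunks.append((label, start, len(tags)))
    return chunks


# Hypothetical example sentence and tags, for illustration only.
tokens = ["the", "boy", "is", "reading", "a", "book"]
tags = ["B-NP", "I-NP", "B-VP", "I-VP", "B-NP", "I-NP"]

for label, start, end in bio_to_chunks(tags):
    print(label, tokens[start:end])
```

Because each token carries exactly one tag, every token falls in at most one span, which is what makes the resulting chunks non-overlapping by construction.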
Keywords: Natural Language Processing (NLP), Part of Speech Tagger (POS), Punjabi chunker
Article published in International Journal of Current Engineering and Technology, Vol.5, No.5 (Oct-2015)