Design and evaluation of text pre-processor: A tool for text pre-processing

Amit Prasad Rauth, Anjan Pal

Research output: Contribution to journal › Article › peer-review

Abstract

This paper introduces the Text Pre-processor, a tool that integrates several text pre-processing tasks, such as tokenization, part-of-speech tagging, and stop-word removal. These tasks are prerequisites for downstream text processing tasks such as sentiment analysis and text summarization. However, no one-stop solution exists for performing multiple text pre-processing tasks, and the Text Pre-processor is intended to fill this gap. The tool comprises five modules: a text editor, single-file processing, file-to-file processing, multiple-file processing, and file splitting and merging. Informed by the technology acceptance model, a qualitative user study was conducted to evaluate the efficacy of the tool. Participants generally found the tool efficacious.
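
To make two of the pre-processing tasks named in the abstract concrete, the following is a minimal Python sketch of tokenization followed by stop-word removal. It is not the Text Pre-processor's implementation, which the abstract does not describe. The function names (`tokenize`, `remove_stop_words`, `preprocess`), the regex-based tokenizer, and the tiny stop-word list are all assumptions made for illustration. Part-of-speech tagging is omitted because it normally requires a trained model.

```python
import re

# Hypothetical, deliberately small stop-word list for illustration only;
# real tools typically ship a much larger, language-specific list.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "for"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens.

    This is a simple regex tokenizer (an assumption, not the paper's method):
    punctuation and hyphens act as separators.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens that are not in the stop-word set, preserving order."""
    return [token for token in tokens if token not in stop_words]


def preprocess(text):
    """Chain the two steps: tokenize, then drop stop words."""
    return remove_stop_words(tokenize(text))


if __name__ == "__main__":
    sample = "The tool is a one-stop solution for text pre-processing."
    print(preprocess(sample))
```

Chaining the steps in a single `preprocess` function mirrors the idea of a one-stop tool: each task stays a separate, testable function, while the caller makes only one call.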

Original language: English
Pages (from-to): 169-184
Number of pages: 16
Journal: Advances in Modelling and Analysis A
Volume: 54
Issue number: 2
Publication status: Published - 28 Sept 2017

Bibliographical note

Publisher Copyright:
© 2017 AMSE Press. All rights reserved.

Keywords

  • Natural language processing
  • Text mining tool
  • Text pre-processing
  • Text processing
