Feature Extraction to Filter Out Low-Quality Answers from Social Question Answering Sites

Pradeep Kumar Roy, Zishan Ahmad, Jyoti Prakash Singh, Snehasish Banerjee

Research output: Contribution to journal › Article › peer-review


Social Question Answering sites (SQAs) are online platforms that allow Internet users to ask questions and obtain answers from others in the community. SQAs have been marred by the problem of low-quality answers. Worryingly, answer quality on SQAs has been reported to be following a downward trajectory in recent years. In response, existing research has predominantly focused on finding the best answer, or identifying high-quality answers among the available responses. However, such scholarly efforts have not reduced the volume of low-quality answers on SQAs. Therefore, the goal of this research is to extract features in order to weed out low-quality answers as soon as they are posted on SQAs. Data from Stack Exchange were used to carry out the investigation. Informed by the literature, 26 features were extracted. Thereafter, machine learning algorithms were implemented that could correctly identify 85% to 96% of low-quality answers. The key contribution of this research is the development of a system to detect subpar answers on the fly at the time of posting. It is intended to be used as an early warning system that alerts users to answer quality at the point of posting.
Original language: English
Pages (from-to): 7933-7944
Number of pages: 12
Journal: IETE Journal of Research
Issue number: 11
Early online date: 21 Mar 2022
Publication status: Published - 2023

Bibliographical note

© 2022 Informa UK Limited. This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.


  • answer quality
  • data balancing
  • low-quality answers
  • social question answering
