Is this question going to be closed? Answering question closibility on Stack Exchange

Pradeep Kumar Roy, Jyoti Prakash Singh, Snehasish Banerjee

Research output: Contribution to journal, Article, peer-review

Abstract

Community question answering sites (CQAs) are often flooded with questions that are never answered. To cope with the problem, experienced users of Stack Exchange are now allowed to mark newly posted questions as closed if they are of poor quality. Once closed, a question is no longer eligible to receive answers. However, identifying and closing subpar questions takes time. Therefore, the purpose of this paper is to develop a supervised machine learning system that predicts question closibility, the likelihood that a newly posted question will eventually be closed. Building on extant research on CQA question quality, the supervised machine learning system uses 17 features that were grouped into four categories, namely, asker features, community features, question content features, and textual features. The performance of the developed system was tested on questions posted on Stack Exchange from 11 randomly chosen topics. The classification performance was generally promising and outperformed the baseline. Most of the measures of precision, recall, F1-score, and AUC were above 0.90 irrespective of the topic of questions. By conceptualizing question closibility, the paper extends previous CQA research on question quality. Unlike previous studies, which were mostly limited to programming-related questions from Stack Overflow, this one empirically tests question closibility on questions from 11 randomly selected topics. The set of features used for classification offers a framework of question closibility that is not only more comprehensive but also more parsimonious compared with prior works.
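For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how precision, recall, F1-score, and AUC can be computed for a binary closed/not-closed classifier. It is not the authors' implementation: the function names `binary_metrics` and `auc`, the label convention (1 = closed, 0 = not closed), and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall and F1 for the positive class (assumed: 1 = closed)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # closed, predicted closed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # open, predicted closed
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # closed, predicted open
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen closed question
    receives a higher score than a randomly chosen open one (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the paper): 1 = closed, 0 = not closed.
toy_true = [1, 1, 1, 0, 0, 0]
toy_pred = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

if __name__ == "__main__":
    print(binary_metrics(toy_true, toy_pred))
    print(auc(toy_true, toy_scores))
```

The pairwise AUC above is quadratic in the number of questions, which is fine for a small illustration; on a real dataset a rank-based computation or an established library routine would be the more practical choice.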
Original language: English
Number of pages: 17
Journal: Journal of Information Science
Early online date: 13 Oct 2022
Publication status: E-pub ahead of print - 13 Oct 2022

Bibliographical note

© The Author(s) 2022. This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.

Keywords

  • closed question
  • community question answering
  • machine learning
  • question quality
  • Stack Exchange
  • unanswered question
