2024

Data Set Size Analysis for Detecting the Urgency of Discussion Forum Posts

ŠVÁBENSKÝ, Valdemar; François BOUCHET; Francine TARRAZONA; Michael LOPEZ II and Ryan S. BAKER

Basic information

Original name

Data Set Size Analysis for Detecting the Urgency of Discussion Forum Posts

Authors

ŠVÁBENSKÝ, Valdemar; François BOUCHET; Francine TARRAZONA; Michael LOPEZ II and Ryan S. BAKER

Edition

14th International Conference on Learning Analytics and Knowledge, 2024

Other information

Language

English

Type of outcome

Conference abstracts

Country of publisher

United States of America

Confidentiality degree

is not subject to a state or trade secret

Marked to be transferred to RIV

No

Organization

Repository – Repository

Keywords in English

learning analytics; educational data mining; urgency detection; replication

Changed: 23/3/2024 03:48, RNDr. Daniel Jakubík

Abstract

In the original language

In both Massive Open Online Courses (MOOCs) and private courses, instructors face a large number of queries in discussion forum posts that may merit a response. There has been ongoing research on how to employ machine learning to predict a post’s urgency in order to focus instructors’ attention. However, it is unclear how large a course is needed to develop these models. We took a publicly available data set of 3,503 labeled forum posts and code from one such prior study. We re-trained the six models described in the study, but with progressively smaller sample sizes, to determine whether the models’ performance would be preserved. We demonstrate that using random subsets even as small as 10% of the original data set achieves comparable performance to the full data set in five out of six models.
