Crowdsourced comparative judgement for evaluating learner texts: How reliable are judges recruited from an online crowdsourcing platform?

Authors: P Thwaites, N Vandeweerd, M Paquot

Published: 2025

Publication: Applied Linguistics, 2025 - academic.oup.com

The study demonstrates that crowdsourcing platforms can recruit judges to evaluate learner texts with reliability and validity comparable to assessments conducted by trained linguists.

Methods: Judges recruited via an online crowdsourcing platform conducted comparative judgement assessments of learner texts to measure writing proficiency.
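
Illustration: This record does not say how the pairwise judgements were turned into scores. Comparative judgement data are commonly scaled with a Bradley-Terry model, so the sketch below assumes that model; it is not the authors' analysis. The function name `bradley_terry`, its parameters, and the toy comparisons are hypothetical and only show the general idea: each judge picks the better of two texts, and the model estimates one strength value per text from the accumulated wins and losses.

```python
from collections import defaultdict


def bradley_terry(comparisons, n_iter=500, tol=1e-10):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Uses the standard iterative (minorisation-maximisation) update.
    Assumes every text wins at least once and loses at least once;
    otherwise the maximum-likelihood estimate does not exist.
    Returns a dict mapping each text to a strength; strengths sum to 1.
    """
    wins = defaultdict(int)         # total wins per text
    pair_counts = defaultdict(int)  # number of comparisons per unordered pair
    items = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        items.update((winner, loser))

    # Start from equal strengths.
    strength = {i: 1.0 / len(items) for i in items}
    for _ in range(n_iter):
        updated = {}
        for i in items:
            denom = 0.0
            for j in items:
                if j == i:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0)
                if n_ij:
                    denom += n_ij / (strength[i] + strength[j])
            updated[i] = wins[i] / denom if denom else strength[i]
        # Normalise so the strengths sum to 1.
        total = sum(updated.values())
        updated = {i: s / total for i, s in updated.items()}
        delta = max(abs(updated[i] - strength[i]) for i in items)
        strength = updated
        if delta < tol:
            break
    return strength


# Invented toy data: three learner texts, each pair judged four times.
toy_comparisons = (
    [("A", "B")] * 3 + [("B", "A")] * 1
    + [("B", "C")] * 3 + [("C", "B")] * 1
    + [("A", "C")] * 3 + [("C", "A")] * 1
)
scores = bradley_terry(toy_comparisons)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy set, text A wins most often and text C least often, so the estimated strengths order the texts A, B, C. A real study would also need many more comparisons per text and a reliability estimate for the resulting scale.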

Key Findings: Evaluations of learner texts by crowdsourced judges showed reliability and concurrent validity comparable to evaluations by trained linguists.

Limitations: Potential variability in judge expertise and availability on crowdsourcing platforms may affect evaluation consistency in broader applications.

Institution: Université catholique de Louvain (UCLouvain), Radboud University Nijmegen, Fonds de la Recherche Scientifique – FNRS

Research Area: Applied Linguistics, Educational Assessment, Crowdsourcing

Discipline: Applied Linguistics

Citations: 10