Are You a Real Software Engineer? Best Practices in Online Recruitment for Software Engineering Studies

14 citations

Abstract

Online research platforms, such as Prolific, offer rapid access to diverse participant pools but also pose unique challenges in participant qualification and skill verification. Previous studies reported mixed outcomes and challenges in leveraging online platforms for the recruitment of qualified software engineers. Drawing from our experience in conducting three different studies using Prolific, we propose refined best practices for recruiting and screening participants to enhance the quality and relevance of both qualitative and quantitative software engineering (SE) research samples: (1) iterative and controlled prescreening, enabling focused and manageable assessment of submissions; (2) task-oriented and targeted questions that assess technical skills, knowledge of basic SE concepts, and professional engagement; (3) AI detection to verify the authenticity of free-text responses; (4) qualitative and manual assessment of responses, ensuring authenticity and relevance in participant answers; (5) additional layers of prescreening, when necessary, to collect data relevant to the topic of the study; and (6) fair or generous compensation post-qualification to incentivize genuine participation. By sharing our experiences and lessons learned, we contribute to the development of effective and rigorous methods for SE empirical research, particularly the ongoing effort to establish guidelines to ensure reliable data collection. These practices have the potential to transfer to other participant recruitment platforms.


Peer Review & Critical Discussion

3 threads

Potential Selection Bias in 2023 Cohort

Dr. Sarah J.
Verified PhD Candidate
12 replies

The participant pool shows a concerning overrepresentation of users from high-income demographics. Looking at Table 3, we can see that 78% of respondents had annual incomes above $75k, which significantly limits the generalizability of these findings to broader populations.

2 hours ago

Non-naive Participants Issue

M. Chen (OpenAI)
Data Scientist
8 replies

I've noticed a methodological concern regarding participant naivety. Given that Prolific users often complete multiple studies, there's a real risk that participants had prior exposure to similar experimental paradigms, which could confound the results.

5 hours ago

RLHF Applicability to This Study Design

Prof. R. Williams
Verified Researcher
15 replies

The implications for RLHF training pipelines are understated. If we accept the authors' conclusions about preference stability, this has direct consequences for how we should structure reward model training. The temporal decay effect described in Section 4.2 is particularly relevant.

1 day ago
