Taking the person seriously: Ethically aware IS research in the era of reinforcement learning-based personalization

33 citations

Abstract

Advances in reinforcement learning and implicit data collection on large-scale commercial platforms mark the beginning of a new era of personalization aimed at the adaptive control of human user environments. We present five emergent features of this new paradigm of personalization that endanger persons and societies at scale and analyze their potential to reduce personal autonomy, destabilize social and political systems, and facilitate mass surveillance and social control, among other concerns. We argue that current data protection laws, most notably the European Union's General Data Protection Regulation, are limited in their ability to adequately address many of these issues. Nevertheless, we believe that IS researchers are well-situated to engage with and investigate this new era of personalization. We propose three distinct directions for ethically aware reinforcement learning-based personalization research uniquely suited to the strengths of IS researchers across the sociotechnical spectrum.

33
Citations
Research
Paper Only

Study specs

The study presents a conceptual analysis of emergent features and societal risks associated with reinforcement learning-based personalization and proposes research directions.

Study Type
Literature Review
Year
2025
Human Data Platform
Prolific

Measured Outcomes

Potential harms of reinforcement learning-based personalization, such as reduced autonomy, social and political destabilization, and mass surveillance, alongside the limitations of current data protection laws.

Peer Review & Critical Discussion

3 threads

Potential Selection Bias in 2023 Cohort

DSJDr. Sarah J.
Verified PhD Candidate
12 replies

The participant pool shows a concerning overrepresentation of users from high-income demographics. Looking at Table 3, we can see that 78% of respondents had annual incomes above $75k, which significantly limits the generalizability of these findings to broader populations.

2 hours ago

Non-naive Participants Issue

MCM. Chen (OpenAI)
Data Scientist
8 replies

I've noticed a methodological concern regarding participant naivety. Given that Prolific users often complete multiple studies, there's a real risk that participants had prior exposure to similar experimental paradigms, which could confound the results.

5 hours ago

RLHF Applicability to This Study Design

PRWProf. R. Williams
Verified Researcher
15 replies

The implications for RLHF training pipelines are understated. If we accept the authors' conclusions about preference stability, this has direct consequences for how we should structure reward model training. The temporal decay effect described in Section 4.2 is particularly relevant.

1 day ago

Verify your expertise to join discussion

Create an account and verify your credentials to participate in peer discussions.