LLM-based Semantic Augmentation for Harmful Content Detection

Authors: Elyas Meguellati¹, Assad Zeghina², Shazia Sadiq¹, Gianluca Demartini¹

Published: 2025

Publication: arXiv

The paper introduces LLM-based semantic augmentation for harmful content detection on social media. The approach achieves performance comparable to models trained on human-annotated data, at reduced cost.

Methods: The researchers use LLMs to clean noisy text and to generate explanations that add context during preprocessing. They then evaluate the augmented training sets on several high-context datasets, including SemEval 2024 Persuasive Meme, Google Jigsaw toxic comments, and Facebook hateful memes.
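
The two augmentation steps described above (cleaning, then explanation generation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the `[SEP]` separator, the `Example` type, and the `stub_llm` stand-in are all assumptions introduced here, and the summary does not say how the paper combines the cleaned text with the explanation.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


@dataclass
class Example:
    text: str
    label: int


def augment(example: Example, llm: LLM, sep: str = " [SEP] ") -> Example:
    """Return a new example: LLM-cleaned text followed by an LLM explanation.

    The label is copied unchanged and is never shown to the LLM.
    """
    # Step 1: ask the LLM to clean the noisy input (assumed prompt wording).
    cleaned = llm(f"Clean the following social media text:\n{example.text}").strip()
    # Step 2: ask the LLM for an explanation of the cleaned text.
    explanation = llm(f"Explain the context and intent of this text:\n{cleaned}").strip()
    # Assumed combination strategy: simple concatenation with a separator.
    return Example(text=f"{cleaned}{sep}{explanation}", label=example.label)


def augment_dataset(examples: List[Example], llm: LLM) -> List[Example]:
    """Apply the two-step augmentation to every training example."""
    return [augment(e, llm) for e in examples]


def stub_llm(prompt: str) -> str:
    """Offline stand-in for a real LLM call, so the sketch runs without an API."""
    instruction, _, body = prompt.partition("\n")
    if instruction.startswith("Clean"):
        return " ".join(body.split())  # collapse runs of whitespace
    return f"Explanation of: {body}"


if __name__ == "__main__":
    train = [Example("u  r   so dumb", 1)]
    for ex in augment_dataset(train, stub_llm):
        print(ex.label, ex.text)
```

In practice `stub_llm` would be replaced by a call to an actual model, and the augmented examples would then be used to train a supervised classifier in place of the raw text.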

Key Findings: LLM-based semantic augmentation is effective at enhancing training sets for social media tasks such as propaganda detection, hateful meme classification, and toxicity identification.

Limitations: LLMs underperform in zero-shot classification for complex tasks compared to supervised models; the approach may depend on dataset-specific tuning and lacks exploration of potential biases introduced by LLM-generated augmentations.

Institution: University of Queensland, University of Strasbourg

Research Area: Natural Language Processing, Harmful Content Detection

Discipline: Natural Language Processing