Authors: P Schmidtová, O Dušek, S Mahamood
Year: 2025
Published in: arXiv
Institution: Charles University, Trivago
Research Area: Summarization evaluation, Natural Language Processing, LLM-as-a-Judge, AI Evaluation
Discipline: Natural Language Processing
Summary: Simple metrics such as word overlap correlate surprisingly well with human judgments in summarization evaluation and outperform more complex methods in out-of-domain settings, while LLM-as-a-judge approaches remain unreliable assessors due to annotation biases.
Methods: Human evaluation campaigns with categorical error assessment, span-level annotations, and comparison of traditional metrics, trainable models, and LLM-as-a-judge approaches.
Key Findings: How well different summarization evaluation methods correlate with human judgment, and the business impact of incorrect information in generated summaries.
Citations: 1