Authors: D Testa, G Bonetta, R Bernardi, A Bondielli
Year: 2025
Published in: arXiv preprint arXiv:2502.16989, 2025
Institution: Università di Roma La Sapienza
Research Area: Multimodal Reasoning, AI Benchmarking
Discipline: Artificial Intelligence
MAIA is a benchmark designed to evaluate the reasoning abilities of Vision Language Models (VLMs) on video-based tasks, with a focus on Italian language and culture; it reveals their fragility in consistent, visually grounded language comprehension and generation.
Methods: MAIA comprises a set of video-related questions evaluated through two tasks, visual statement verification and open-ended visual question answering, with questions categorized into twelve reasoning types to disentangle language-vision relations.
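To make the two task formats and the per-category breakdown concrete, here is a minimal Python sketch of how a MAIA-style benchmark item and evaluation loop could be represented. The field names, task labels, and exact-match scoring are illustrative assumptions for this summary, not the released dataset schema or the authors' evaluation protocol.

```python
# Hypothetical sketch of a MAIA-style benchmark item and per-category scoring.
# Field names, task labels, and exact-match scoring are assumptions, not the
# actual dataset schema or the paper's evaluation method.
from dataclasses import dataclass
from typing import Callable, Literal

TaskType = Literal["statement_verification", "open_ended_qa"]


@dataclass
class BenchmarkItem:
    video_id: str             # identifier of the source video clip
    task: TaskType            # one of the two task formats
    reasoning_category: str   # one of the twelve fine-grained reasoning types
    prompt: str               # statement to verify, or open-ended question
    reference: str            # gold label or reference answer


def evaluate(
    items: list[BenchmarkItem],
    model: Callable[[str, str], str],
) -> dict[str, float]:
    """Return accuracy per reasoning category using naive exact-match scoring."""
    correct: dict[str, int] = {}
    total: dict[str, int] = {}
    for item in items:
        prediction = model(item.video_id, item.prompt)
        total[item.reasoning_category] = total.get(item.reasoning_category, 0) + 1
        if prediction.strip().lower() == item.reference.strip().lower():
            correct[item.reasoning_category] = correct.get(item.reasoning_category, 0) + 1
    return {cat: correct.get(cat, 0) / n for cat, n in total.items()}
```

Grouping results by reasoning category, as sketched above, is what allows the benchmark to disentangle which language-vision relations a model handles consistently and where it breaks down.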
Key Findings: Current VLMs struggle to perform consistent, visually grounded natural language understanding and generation across fine-grained reasoning categories.
DOI: https://doi.org/10.48550/arXiv.2502.16989