Discover 3 peer-reviewed studies in Vision Models (2024–2025). Explore research findings powered by Prolific's diverse participant panel.
This page lists 3 peer-reviewed papers in the research area of Vision Models in the Prolific Citations Library, a curated collection of research powered by high-quality human data from Prolific.
-
Authors: LM Schulze Buschoff, E Akata, M Bethge
Year: 2025
Published in: Nature Machine ..., 2025 - nature.com
Institution: Max Planck Institute
Research Area: Visual Cognition, Multimodal Large Language Models (MLLMs), Vision-Language Models (VLMs)
Discipline: Cognitive Science, Artificial Intelligence, Computer Vision
Vision-based large language models show proficiency in visual data interpretation but fall short in human-like abilities for causal reasoning, intuitive physics, and social cognition.
Methods: Controlled experiments evaluating model performance on tasks related to intuitive physics, causal reasoning, and intuitive psychology using visual processing benchmarks.
Key Findings: The models interpret visual data proficiently but fall short of human-like performance in understanding physical interactions, causal relationships, and social preferences.
DOI: https://doi.org/10.1038/s42256-024-00963-y
Citations: 70
-
Authors: M Ku, T Li, K Zhang, Y Lu, X Fu, W Zhuang
Year: 2024
Published in: arXiv preprint arXiv …, 2023 - arxiv.org
Institution: University of Waterloo, Ohio State University, University of California Santa Barbara, University of Pennsylvania
Research Area: AI alignment, Representation learning, Cognitive computational modeling, Vision foundation models evaluation, Multimodal, Vision models
Discipline: Computer Science, Artificial Intelligence, Machine Learning
This paper presents a method for aligning machine vision model representations with human visual similarity judgments across different abstraction levels, improving how well models reflect human perceptual and conceptual organization and enhancing generalization and uncertainty prediction.
DOI: https://doi.org/10.48550/arXiv.2310.01596
Citations: 59
-
Authors: Tianwei Yin, Michaël Gharbi, Taesung Park, Richard Zhang, Eli Shechtman, Frédo Durand, William T. Freeman
Year: 2024
Published in: ArXiv
Institution: Adobe Research, Massachusetts Institute of Technology
Research Area: Computer Vision, Image Synthesis, Diffusion Models
Discipline: Artificial Intelligence