WildScore: Benchmarking MLLMs in-the-Wild Symbolic Music Reasoning
EMNLP 2025 - Main Conference
WildScore is a music-intelligence benchmark for evaluating the ability of multimodal large language models (MLLMs) to reason about symbolic music in real-world scenarios. It comprises 482 audio-query cases for comprehensive evaluation of vision and audio reasoning capabilities.
