Authors: Yilun Hua, Yoav Artzi
Abstract: Humans spontaneously use increasingly efficient language as interactions
progress, by adapting and forming ad-hoc conventions. This phenomenon has been
studied extensively using reference games, showing properties of human language
that go beyond relaying intents. It remains unexplored whether multimodal large
language models (MLLMs) similarly increase communication efficiency during
interactions, and what mechanisms they may adopt for this purpose. We introduce
ICCA, an automated framework to evaluate such conversational adaptation as an
in-context behavior in MLLMs. We evaluate several state-of-the-art MLLMs, and
observe that while they may understand the increasingly efficient language of
their interlocutor, they do not spontaneously make their own language more
efficient over time. This latter ability can only be elicited in some models
(e.g., GPT-4) with heavy-handed prompting. This shows that this property of
linguistic interaction does not arise from current training regimes, even
though it is a common hallmark of human language. ICCA is available at
https://github.com/lil-lab/ICCA.
Source: http://arxiv.org/abs/2408.01417v1