Measuring Bullshit in the Language Games played by ChatGPT

Authors: Alessandro Trevisan, Harry Giddens, Sarah Dillon, Alan F. Blackwell

Abstract: Generative large language models (LLMs), which create text without direct
correspondence to truth value, are widely understood to produce language that
resembles the uses described in Frankfurt’s popular monograph On Bullshit. In this paper,
we offer a rigorous investigation of this topic, identifying how the phenomenon
has arisen and how it might be analysed. We elaborate on this
argument to propose that LLM-based chatbots play the ‘language game of
bullshit’. We use statistical text analysis to investigate the features of this
Wittgensteinian language game, based on a dataset constructed to contrast the
language of 1,000 scientific publications with typical pseudo-scientific text
generated by ChatGPT. We then explore whether the same language features can be
detected in two well-known contexts of social dysfunction: George Orwell’s
critique of politics and language, and David Graeber’s characterisation of
bullshit jobs. Using simple hypothesis-testing methods, we demonstrate that a
statistical model of the language of bullshit can reliably relate the
Frankfurtian artificial bullshit of ChatGPT to the political and workplace
functions of bullshit as observed in natural human language.
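
As a rough illustration of what a simple hypothesis test over text features can look like, the sketch below scores each text by the share of its tokens that fall in a marker-word list, then runs a two-sided permutation test on the difference in mean scores between two groups. This is not the authors' method: the abstract does not specify their features or tests, so the `MARKERS` list, the helper names `marker_rate` and `permutation_test`, and the toy sentences are all assumptions made for illustration.

```python
import random
import re
from statistics import mean

# Hypothetical marker words, chosen for illustration only;
# the abstract does not list the features the paper actually uses.
MARKERS = {"crucial", "pivotal", "landscape", "leverage", "holistic"}


def marker_rate(text, markers=MARKERS):
    """Return the fraction of word tokens in `text` that are marker words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return sum(t in markers for t in tokens) / len(tokens)


def permutation_test(xs, ys, n_iter=10000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns an estimated p-value for the null hypothesis that the
    group labels are exchangeable.
    """
    rng = random.Random(seed)
    observed = abs(mean(xs) - mean(ys))
    pooled = list(xs) + list(ys)
    n_x = len(xs)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_x]) - mean(pooled[n_x:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_iter + 1)


if __name__ == "__main__":
    # Toy sentences invented for this sketch; not data from the paper.
    group_a = [
        "We measured the decay rate at three temperatures.",
        "The sample was heated and weighed twice.",
        "Results are reported with standard errors.",
    ]
    group_b = [
        "It is crucial to leverage a holistic landscape.",
        "This pivotal landscape plays a crucial role.",
        "A holistic approach is pivotal to leverage synergy.",
    ]
    rates_a = [marker_rate(t) for t in group_a]
    rates_b = [marker_rate(t) for t in group_b]
    print("p-value:", permutation_test(rates_a, rates_b))
```

A permutation test is used here only because it needs no distributional assumptions and runs on the standard library; any other two-sample test could take its place in the same pipeline.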

Source: http://arxiv.org/abs/2411.15129v1