Study reveals insights into language generation by large language models

The study examines how large language models (LLMs) generate language, challenging the widely held view that these models learn primarily by inferring rules from their training data. Instead, it suggests that LLMs rely heavily on stored examples and, much as humans do, draw analogies when they encounter unfamiliar words.

Researchers compared human judgments with those of GPT-J, an open-source LLM released by EleutherAI in 2021. The focus was a common English word-formation pattern in which adjectives become nouns by adding '-ness' or '-ity': 'happy' becomes 'happiness', and 'available' becomes 'availability'. The team created 200 fictional adjectives, such as 'cormasive' and 'friquish', that GPT-J had never encountered, and asked the model to turn each one into a noun using either '-ness' or '-ity'. Its responses were then compared with those of human participants and with two cognitive models: one that generalizes by rules and one that generalizes by analogy.
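The article does not spell out the paper's prompting or scoring protocol, but the basic measurement can be pictured as follows: score both candidate nouns under GPT-J and keep the likelier one. The sketch below is a minimal illustration, assuming a hypothetical prompt, a naive orthographic join, and a simple log-probability comparison using the Hugging Face transformers checkpoint of GPT-J; it is not the authors' code.

```python
# Minimal sketch (not the study's exact protocol): ask GPT-J which of the
# two candidate nominalisations of a nonce adjective it finds likelier.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-J-6B is large; real use would add dtype/device options.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")
model.eval()

def nominalise(adj: str, suffix: str) -> str:
    """Naive join: 'happy'+'ness' -> 'happiness', 'sensitive'+'ity' -> 'sensitivity'."""
    if suffix == "ity" and adj.endswith("e"):
        return adj[:-1] + suffix
    if suffix == "ness" and adj.endswith("y"):
        return adj[:-1] + "i" + suffix
    return adj + suffix

def sequence_logprob(text: str) -> float:
    """Sum of token log-probabilities GPT-J assigns to `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)  # position i predicts token i+1
    targets = ids[0, 1:]
    return log_probs[torch.arange(targets.size(0)), targets].sum().item()

def preferred_suffix(adjective: str) -> str:
    """Return 'ness' or 'ity', whichever derived noun GPT-J scores higher."""
    prompt = f"The quality of being {adjective} is called"  # illustrative prompt, not the paper's
    scores = {s: sequence_logprob(f"{prompt} {nominalise(adjective, s)}.")
              for s in ("ness", "ity")}
    return max(scores, key=scores.get)

print(preferred_suffix("friquish"))   # the study reports a '-ness' preference here
print(preferred_suffix("cormasive"))  # and an '-ity' preference here
```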

Findings indicated that GPT-J's behavior resembled human analogical reasoning rather than rule-based derivation: it based its decisions on similarities to words it had seen in training. For instance, 'friquish' became 'friquishness' by analogy with words like 'selfish', while the choice for 'cormasive' drew on pairs like 'sensitive'/'sensitivity'.
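As a rough intuition for what generalizing by analogy means here, consider a toy nearest-neighbour sketch over a hypothetical mini-lexicon of attested adjectives. This is only an illustration of the idea, not the cognitive model the researchers actually used:

```python
# Toy analogical generaliser: pick the suffix favoured by the most
# orthographically similar known adjectives (illustrative only).
import difflib

# Hypothetical mini-lexicon of attested adjective -> preferred suffix.
LEXICON = {
    "selfish": "ness", "foolish": "ness", "happy": "ness", "dark": "ness",
    "sensitive": "ity", "active": "ity", "available": "ity", "curious": "ity",
}

def analogical_suffix(nonce: str, k: int = 3) -> str:
    """Vote among the k known adjectives most similar to the nonce word."""
    neighbours = sorted(
        LEXICON,
        key=lambda w: difflib.SequenceMatcher(None, nonce, w).ratio(),
        reverse=True,
    )[:k]
    votes = [LEXICON[w] for w in neighbours]
    return max(set(votes), key=votes.count)

print(analogical_suffix("friquish"))   # likely 'ness', via neighbours like 'selfish'
print(analogical_suffix("cormasive"))  # likely 'ity', via neighbours like 'sensitive'
```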

The study also noted subtle effects of word-form frequency in the training data. When tested on nearly 50,000 real English adjectives, GPT-J's predictions closely tracked the statistical patterns of its training data, suggesting that the model had formed memory traces of every individual word it encountered during training.
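One way to picture that comparison, using an assumed setup with made-up placeholder numbers rather than the authors' analysis, is to correlate the model's preference for '-ity' with the corpus share of '-ity' forms across adjectives:

```python
# Hedged sketch of the frequency comparison (hypothetical values, not study data):
# does the model's '-ity' preference rise with the corpus share of '-ity' forms?
from scipy.stats import spearmanr

# Hypothetical per-adjective figures: fraction of attested nominalisations
# using '-ity' in a training-like corpus, and the model's '-ity' probability.
corpus_ity_share = [0.02, 0.10, 0.85, 0.97]  # e.g. dark, foolish, active, available
model_ity_prob   = [0.05, 0.15, 0.80, 0.95]

rho, p = spearmanr(corpus_ity_share, model_ity_prob)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")  # high rho = frequency matching
```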

A significant difference between humans and LLMs lies in what their analogies are formed over. Humans maintain a mental dictionary containing every meaningful word form they know, regardless of how often it occurs, so they readily recognize that strings like 'friquish' are not words of English, and they generalize analogically over this mental store. LLMs, by contrast, generalize directly over the individual instances in their training data, without consolidating repeated occurrences of a word into a single dictionary entry.

Janet Pierrehumbert, Professor of Language Modelling at the University of Oxford and senior author of the study, said: "Although LLMs can generate language in a very impressive manner, it turns out that they do not think as abstractly as humans do." She added that this may explain why LLMs need far more language data than humans do in order to learn a language.

Co-lead author Dr Valentin Hofmann, of Ai2 and the University of Washington, commented: "This study is a great example of synergy between Linguistics and AI as research areas." He emphasized that the findings offer insight into what LLMs are doing when they generate language, which could inform future work on making AI more robust and efficient.

The research involved collaborators from LMU Munich and Carnegie Mellon University.
