Differences in Typological Alignment in Language Models' Treatment of Differential Argument Marking
arXiv:2602.17653v1
Abstract: Recent work has shown that language models (LMs) trained on synthetic corpora can exhibit typological preferences that resemble cross-linguistic regularities in human languages, particularly for syntactic phenomena such as word order. In this paper, we extend this paradigm to differential argument marking (DAM), a semantic licensing system in which morphological marking depends on semantic prominence. Using a controlled synthetic learning method, we train GPT-2 models on 18 corpora implementing distinct DAM systems and evaluate their generalization using minimal pairs. Our results reveal a dissociation between two typological dimensions of DAM. Models reliably exhibit human-like preferences for natural markedness direction, favoring systems in which overt marking targets semantically atypical arguments. In contrast, models do not reproduce the strong object preference in human languages, in which overt marking in DAM more often targets objects rather than subjects. These findings suggest that different typological tendencies may arise from distinct underlying sources.
Executive Summary
This article examines the typological alignment of language models in their treatment of differential argument marking (DAM), a semantic licensing system in which morphological marking depends on semantic prominence. The authors train GPT-2 models on 18 corpora implementing distinct DAM systems and evaluate their generalization using minimal pairs. The results reveal a dissociation between two typological dimensions of DAM, with models exhibiting human-like preferences for natural markedness direction but not reproducing the strong object preference in human languages. This study contributes to our understanding of how language models internalize typological regularities and highlights the need to re-examine the sources of these tendencies.
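To make the evaluation setup concrete, below is a minimal sketch of minimal-pair scoring with a GPT-2 language model. It assumes the HuggingFace transformers API and, for illustration only, the public gpt2 checkpoint; the paper's own models are trained from scratch on synthetic DAM corpora, and the sentence pair and the "dom" marker here are hypothetical placeholders rather than the paper's actual stimuli.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Total log-probability of a sentence under the model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # out.loss is the mean negative log-likelihood per predicted token;
    # multiply by the number of predicted tokens to recover the summed log-prob
    return -out.loss.item() * (ids.size(1) - 1)

# Hypothetical minimal pair from a synthetic SOV DAM language: the object is
# animate (semantically atypical), so natural markedness predicts overt marking
marked = "the farmer dom the child saw"
unmarked = "the farmer the child saw"

delta = sentence_logprob(marked) - sentence_logprob(unmarked)
print(f"preference for marked variant: {delta:+.3f} (positive = marked preferred)")
```

A model whose behavior tracks the natural markedness direction should assign a higher probability to overt marking on semantically atypical arguments than on typical ones, which is the kind of contrast the paper's minimal-pair comparisons probe.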
Key Points
- ▸ GPT-2 models trained on synthetic corpora show only partial typological alignment in their treatment of differential argument marking (DAM)
- ▸ Models reliably exhibit human-like preferences for natural markedness direction
- ▸ Models do not reproduce the strong object preference in human languages
Merits
Strength in methodology
The study employs a controlled synthetic-language learning paradigm, which allows precise manipulation of the DAM systems in the training data and meaningful comparisons between models.
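As an illustration of what such a controlled synthetic grammar might look like, here is a hedged sketch of a toy DAM corpus generator. The lexicon, the "dom" marker, the SOV word order, and the rule that animate objects receive overt marking are assumptions made for exposition, not a reconstruction of the paper's 18 actual systems.

```python
import random

# Toy lexicon; all items are illustrative placeholders
ANIMATE = ["child", "farmer", "dog"]
INANIMATE = ["stone", "book", "cart"]
VERBS = ["saw", "pushed", "found"]
MARKER = "dom"  # hypothetical overt object marker

def make_sentence(rng: random.Random) -> str:
    """Generate one SOV sentence; animate (atypical) objects get overt marking."""
    subj = rng.choice(ANIMATE)  # subjects are kept prototypically animate
    obj_is_animate = rng.random() < 0.5
    obj = rng.choice(ANIMATE if obj_is_animate else INANIMATE)
    verb = rng.choice(VERBS)
    obj_phrase = f"{MARKER} the {obj}" if obj_is_animate else f"the {obj}"
    return f"the {subj} {obj_phrase} {verb} ."

rng = random.Random(0)
corpus = [make_sentence(rng) for _ in range(100_000)]
print("\n".join(corpus[:5]))
```

Varying which argument is eligible for marking (subject vs. object) and whether typical or atypical arguments receive the marker yields the family of contrasting DAM systems the paper compares.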
Contribution to field
The study provides new evidence on which typological regularities language models can internalize from distributional input alone, supporting the conclusion that different cross-linguistic tendencies may arise from distinct underlying sources.
Demerits
Limitation in scope
The study focuses solely on GPT-2 models, so its findings may not generalize to other architectures, training scales, or natural languages.
Need for longitudinal analysis
The study offers a static snapshot and may not capture how models' typological alignment evolves; a longitudinal analysis would help track such changes over the course of training or across model versions.
Expert Commentary
The study's findings on the typological alignment of language models in DAM are thought-provoking. By revealing a dissociation between natural markedness direction and the object preference, the authors challenge the assumption that typological tendencies in human languages share a single underlying source. The result calls for further investigation into how language models relate to human linguistic preferences, particularly in the context of DAM, and carries implications for natural language processing, language education, and our broader understanding of linguistic universals.
Recommendations
- ✓ Future studies should investigate the role of linguistic diversity and cultural context in shaping the typological alignment of language models.
- ✓ Researchers should explore using language models for cross-linguistic analysis and for comparison with human linguistic preferences, which could support the development of more effective and culturally sensitive language models.