
N-gram-like Language Models Predict Reading Time Best

arXiv:2603.09872v1

Abstract: Recent work has found that contemporary language models such as transformers can become so good at next-word prediction that the probabilities they calculate become worse for predicting reading time. In this paper, we propose that this can be explained by reading time being sensitive to simple n-gram statistics rather than the more complex statistics learned by state-of-the-art transformer language models. We demonstrate that the neural language models whose predictions are most correlated with n-gram probability are also those that calculate probabilities that are the most correlated with eye-tracking-based metrics of reading time on naturalistic text.

James A. Michaelov, Roger P. Levy

Executive Summary

This paper proposes that reading time is best predicted by n-gram-like language models: models whose probability estimates track simple n-gram statistics rather than the more complex statistics learned by state-of-the-art transformers. On the authors' account, this explains a recent puzzle: as transformers get better at next-word prediction, the probabilities they calculate become worse predictors of human reading time. The authors demonstrate that the neural language models whose predictions correlate most strongly with n-gram probability are also those whose probabilities correlate most strongly with eye-tracking-based metrics of reading time on naturalistic text. The finding has implications for building more accurate reading time prediction models, challenges the assumption that transformers are superior across all aspects of language processing, and suggests that simpler statistics can be more effective for specific tasks such as reading time prediction.
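To make the contrast concrete, here is a minimal sketch of the kind of simple statistic at issue: a bigram model with add-k smoothing, used to compute per-word surprisal (negative log probability), the standard quantity linking language model predictions to reading time. The toy corpus, the smoothing constant, and all function names are illustrative assumptions, not the authors' actual setup.

```python
# A minimal sketch of an n-gram statistic: a bigram model with add-k
# smoothing, used to compute per-word surprisal in bits. The corpus and
# constants are toy illustrations, not the paper's implementation.
import math
from collections import Counter

def train_bigram(corpus, k=0.5):
    """Count unigrams and bigrams over a list of tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    vocab_size = len(unigrams)

    def prob(prev, word):
        # Add-k smoothed conditional probability P(word | prev).
        return (bigrams[(prev, word)] + k) / (unigrams[prev] + k * vocab_size)

    return prob

def surprisal(prob, sent):
    """Per-word surprisal in bits: -log2 P(w_i | w_{i-1})."""
    tokens = ["<s>"] + sent
    return [-math.log2(prob(prev, word)) for prev, word in zip(tokens, tokens[1:])]

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
prob = train_bigram(corpus)
print(surprisal(prob, ["the", "cat", "ran"]))
```

On the surprisal theory of reading, higher surprisal on a word corresponds to longer expected reading time on that word.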

Key Points

  • Language models whose predictions track n-gram statistics predict reading time more accurately than state-of-the-art transformers
  • Reading time appears sensitive to simple n-gram statistics rather than the more complex statistics learned by transformers
  • N-gram probability correlates positively with eye-tracking-based metrics of reading time on naturalistic text (see the sketch following this list)
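As a rough illustration of the evaluation logic behind the last key point, the sketch below correlates per-word surprisal with a hypothetical eye-tracking measure (gaze duration in milliseconds). The numeric values are invented for illustration; the paper's actual analysis spans many models and naturalistic corpora, and evaluations of this kind commonly use regression with controls for word length and frequency rather than a raw correlation.

```python
# Illustrative only: correlate per-word surprisal with a reading-time
# measure. Both arrays below are made-up values, not data from the paper.
import numpy as np
from scipy.stats import pearsonr, spearmanr

surprisal_bits = np.array([2.1, 7.8, 3.4, 9.2, 1.5, 6.0])    # per-word surprisal (bits)
gaze_duration_ms = np.array([180, 320, 210, 350, 160, 290])  # gaze durations (ms)

r, p = pearsonr(surprisal_bits, gaze_duration_ms)
rho, p_rho = spearmanr(surprisal_bits, gaze_duration_ms)
print(f"Pearson r = {r:.3f} (p = {p:.3f})")
print(f"Spearman rho = {rho:.3f} (p = {p_rho:.3f})")
```

Spearman's rho is shown alongside Pearson's r because reading-time distributions are typically right-skewed, so a rank-based measure is less sensitive to outliers.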

Merits

Strength

The study provides a novel explanation for the limitations of transformer models in predicting reading time, challenging the prevailing wisdom in the field.

Originality

The authors' proposal to use N-gram-like models for reading time prediction is an innovative approach that offers a fresh perspective on the problem.

Demerits

Limitation

The study's findings are based on a specific dataset and may not generalize to all types of text or reading environments.

Scope

The study's focus on reading time prediction may not capture the full range of applications and limitations of transformer models.

Expert Commentary

This study offers a nuanced perspective on the limitations of transformer models and on the potential of simpler predictors such as n-gram-like language models. The findings bear on the design of reading time prediction models and underline the need for a clearer understanding of which statistics different language models capture and which of those statistics human processing is actually sensitive to. While the results are promising, further research is needed to establish why reading time tracks n-gram statistics and how far the pattern extends beyond the corpora and reading measures tested here.

Recommendations

  • Future research should investigate the applicability of N-gram-like models to other tasks and datasets, to determine the scope of their effectiveness.
  • The study's findings should be replicated and validated on a broader range of datasets and reading environments, to confirm the generalizability of the results.

Sources

  • James A. Michaelov, Roger P. Levy. N-gram-like Language Models Predict Reading Time Best. arXiv:2603.09872v1