Are they human? Detecting large language models by probing human memory constraints

Simon Schug, Brenden M. Lake

arXiv:2604.00016v1

Abstract: The validity of online behavioral research relies on study participants being human rather than machine. In the past, it was possible to detect machines by posing simple challenges that were easily solved by humans but not by machines. General-purpose agents based on large language models (LLMs) can now solve many of these challenges, threatening the validity of online behavioral research. Here we explore the idea of detecting humanness by using tasks that machines can solve too well to be human. Specifically, we probe for the existence of an established human cognitive constraint: limited working memory capacity. We show that cognitive modeling on a standard serial recall task can be used to distinguish online participants from LLMs even when the latter are specifically instructed to mimic human working memory constraints. Our results demonstrate that it is viable to use well-established cognitive phenomena to distinguish LLMs from humans.

Executive Summary

This paper presents a novel approach to detecting large language models (LLMs) in online behavioral research by probing human memory constraints. The key inversion is to look for tasks that machines solve too well to be human: the authors use a standard serial recall task, on which human performance is bounded by limited working memory capacity, to distinguish human participants from LLMs, even when the LLMs are explicitly instructed to mimic human memory limits. The results suggest that cognitive modeling of recall behavior can detect LLMs effectively, with significant implications for protecting the validity of online behavioral research against machine-generated responses.

Key Points

  • The authors propose detecting LLMs in online behavioral research by probing an established human cognitive constraint: limited working memory capacity.
  • The methodology relies on a standard serial recall task to distinguish human participants from LLMs (a minimal sketch of such a task follows this list).
  • Cognitive modeling of recall performance detects LLMs even when they are explicitly instructed to mimic human working memory constraints.
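The abstract does not spell out the trial structure, so the following Python sketch is purely illustrative: the list length, digit pool, and strict positional scoring are assumptions of this sketch, not the paper's protocol. It shows the two ingredients a serial recall analysis needs: scoring each trial position by position, and averaging into a serial-position curve.

```python
import random

def make_list(length: int = 9, pool: str = "0123456789") -> list[str]:
    """Sample a study list of distinct items (length and pool are assumptions)."""
    return random.sample(list(pool), length)

def score_recall(studied: list[str], recalled: list[str]) -> list[int]:
    """Strict serial scoring: position i counts only if the studied item
    reappears at position i of the response."""
    return [
        int(i < len(recalled) and recalled[i] == item)
        for i, item in enumerate(studied)
    ]

def serial_position_curve(trials: list[tuple[list[str], list[str]]]) -> list[float]:
    """Average per-position accuracy across trials of equal list length."""
    length = len(trials[0][0])
    totals = [0] * length
    for studied, recalled in trials:
        for i, hit in enumerate(score_recall(studied, recalled)):
            totals[i] += hit
    return [t / len(trials) for t in totals]
```

Humans typically produce a bowed curve (strong primacy, some recency) that collapses once the list exceeds working memory span, whereas an LLM with the full list in its context can recall near ceiling at every position; that asymmetry is the signal the authors exploit.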

Merits

Strength in Cognitive Modeling

The core strength of this approach is its use of cognitive modeling: rather than relying on superficial behavioral cues, it grounds detection in a well-established human cognitive phenomenon, the capacity limit of working memory, which is hard for a machine to fake convincingly.
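The abstract does not specify the authors' cognitive model, so the sketch below only illustrates the general idea: fit a one-parameter forgetting model to a participant's serial-position curve and flag profiles whose fitted capacity or overall accuracy is implausibly high for a human. The exponential form, the capacity threshold of 10, and the 0.95 accuracy ceiling are all assumptions of this sketch, not the paper's model.

```python
import numpy as np
from scipy.optimize import curve_fit

def recall_model(position: np.ndarray, capacity: float) -> np.ndarray:
    """Toy forgetting model: accuracy decays with serial position
    at a rate governed by a single capacity parameter."""
    return np.exp(-position / capacity)

def looks_human(curve: np.ndarray) -> bool:
    """Fit the toy model to one participant's serial-position curve
    and apply (assumed) plausibility cutoffs."""
    positions = np.arange(len(curve), dtype=float)
    (capacity,), _ = curve_fit(
        recall_model, positions, curve, p0=[3.0], bounds=(0.1, 1e6)
    )
    # A human-like curve decays and yields a small capacity estimate;
    # near-ceiling recall at every position drives capacity toward the
    # upper bound and fails both checks.
    return capacity < 10.0 and curve.mean() < 0.95
```

For a curve like [1.0, 0.9, 0.7, 0.5, 0.3, 0.2] the fit recovers a small capacity and the check passes; a flat curve of 0.99s fails it. A real deployment would fit a principled memory model and account for response noise, but the classification logic is the same.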

Empirical Evidence

The paper backs the proposal with empirical evidence: cognitive modeling applied to serial recall data separates online participants from LLMs, including LLMs explicitly prompted to imitate human memory limits, demonstrating the method's viability in practice.

Demerits

Limited Generalizability

The results may not generalize to other types of LLMs or research paradigms, limiting the scope of the proposed methodology.

Potential for LLMs to Adapt

Future LLMs could be prompted or fine-tuned to reproduce human-like serial-position effects, which would erode the method's effectiveness over time; this highlights the need for ongoing research into complementary detection signals.
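The abstract reports that instructed mimicry still fails detection. One plausible reason (an assumption of this sketch, not the paper's analysis) is that a model told to "forget some items" tends to drop them uniformly across positions, whereas human forgetting is strongly position-dependent. The crude simulations below, usable with the scoring helpers sketched earlier, contrast the two profiles.

```python
import random

def human_like_recall(studied: list[str], span: int = 4) -> list[str]:
    """Crude human simulation: keep early items (primacy) and the last
    item (recency); positions in between are mostly lost."""
    kept = set(range(min(span, len(studied)))) | {len(studied) - 1}
    return [item if i in kept else "?" for i, item in enumerate(studied)]

def mimic_llm_recall(studied: list[str], drop_prob: float = 0.3) -> list[str]:
    """Crude instructed-LLM simulation: drop items uniformly at random,
    with no dependence on serial position."""
    return [item if random.random() > drop_prob else "?" for item in studied]
```

Averaged over many trials, the first produces a bowed serial-position curve while the second is flat; a cognitive model sensitive to position-dependent decay can separate the two even when their overall accuracy matches, which may be why simple mimicry instructions are not enough.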

Expert Commentary

This paper makes a significant contribution to online behavioral research by showing that well-established cognitive constraints can serve as a humanness check, and by underscoring the need for detection methods that keep LLM-generated responses from contaminating behavioral data. While the methodology has limitations, the results demonstrate its viability. The implications extend beyond the academic community, raising questions about the ethics of using AI in research and about guidelines for responsible AI use. Future research should focus on making detection robust to adaptive LLMs and on extending cognitive modeling to other human constraints.

Recommendations

  • Develop more sophisticated detection methods to keep LLM-generated responses from contaminating research data.
  • Explore the potential applications of cognitive modeling in online behavioral research and other domains.

Sources

Original: arXiv:2604.00016 (cs.AI), https://arxiv.org/abs/2604.00016