How LLMs Distort Our Written Language
arXiv:2603.18161v1 Announce Type: new Abstract: Large language models (LLMs) are used by over a billion people globally, most often to assist with writing. In this work, we demonstrate that LLMs not only alter the voice and tone of human writing, but also consistently alter the intended meaning. First, we conduct a human user study to understand how people actually interact with LLMs when using them for writing. Our findings reveal that extensive LLM use led to a nearly 70% increase in essays that remained neutral in answering the topic question. Significantly more heavy LLM users reported that the writing was less creative and not in their voice. Next, using a dataset of human-written essays that was collected in 2021 before the widespread release of LLMs, we study how asking an LLM to revise the essay based on the human-written feedback in the dataset induces large changes in the resulting content and meaning. We find that even when LLMs are prompted with expert feedback and asked to only make grammar edits, they still change the text in a way that significantly alters its semantic meaning. We then examine LLM-generated text in the wild, specifically focusing on the 21% of AI-generated scientific peer reviews at a recent top AI conference. We find that LLM-generated reviews place significantly less weight on clarity and significance of the research, and assign scores that, on average, are a full point higher. These findings highlight a misalignment between the perceived benefit of AI use and an implicit, consistent effect on the semantics of human writing, motivating future work on how widespread AI writing will affect our cultural and scientific institutions.
Executive Summary
This study critically examines the impact of Large Language Models (LLMs) on human written language, revealing significant alterations in voice, tone, and intended meaning. The research demonstrates that extensive LLM use leads to a nearly 70% increase in essays that remain neutral on the topic question, and that heavy users perceive the writing as less creative and not in their voice. Furthermore, the analysis shows that even when LLMs are prompted with expert feedback and asked to make only grammar edits, they induce large changes in the text's semantic meaning. The study also examines AI-generated scientific peer reviews at a recent top AI conference, finding that they place significantly less weight on the clarity and significance of the research and assign scores that are, on average, a full point higher. These findings reveal a misalignment between the perceived benefits of AI use and its implicit, consistent effects on human writing, highlighting the need for future research on the impact of AI writing on cultural and scientific institutions.
Key Points
- ▸ LLMs alter human writing, changing voice, tone, and intended meaning.
- ▸ Extensive LLM use leads to a nearly 70% increase in neutral essays.
- ▸ Heavy LLM users perceive the writing as less creative and not in their voice.
- ▸ LLMs induce large changes in text's semantic meaning even with expert feedback.
- ▸ AI-generated scientific peer reviews place less weight on clarity and significance, and assign scores a full point higher on average.
Merits
Robust Research Design
The study employs a multi-faceted approach, including human user studies, dataset analysis, and examination of AI-generated text in the wild, providing a comprehensive understanding of LLMs' impact on human writing.
Significant Findings
The research reveals substantial alterations in human writing, highlighting the need for further investigation into the effects of LLMs on cultural and scientific institutions.
Demerits
Limited Generalizability
The study's focus on a specific dataset and user group may limit the generalizability of its findings to broader populations and contexts.
Methodological Limitations
The reliance on self-reported measures and the use of a single dataset for analysis may introduce biases and limit the study's validity.
Expert Commentary
The study's findings have significant implications for our understanding of the role of LLMs in shaping human written language. While users perceive clear benefits from LLM writing assistance, the research raises important concerns about the impact of these tools on the integrity of human writing. As AI-generated content becomes increasingly prevalent, it is essential to investigate the effects of LLMs on cultural and scientific institutions, and to develop strategies for mitigating the biases and semantic alterations they introduce. This study provides a critical step in that direction, underscoring the need for further research into the consequences of widespread AI writing.
Recommendations
- ✓ Recommendation 1: Future research should investigate the long-term effects of LLM use on human writing and cognitive abilities, examining the potential implications for education and writing practices.
- ✓ Recommendation 2: Institutions and regulatory bodies should establish clear guidelines and standards for the use of LLMs in writing and publishing, ensuring that their impact is transparent and their use is responsible.