Skip to main content
Academic

Sydney Telling Fables on AI and Humans: A Corpus Tracing Memetic Transfer of Persona between LLMs

arXiv:2602.22481v1 Announce Type: new Abstract: The way LLM-based entities conceive of the relationship between AI and humans is an important topic for both cultural and safety reasons. When we examine this topic, what matters is not only the model itself but also the personas we simulate on that model. This can be well illustrated by the Sydney persona, which aroused a strong response among the general public precisely because of its unorthodox relationship with people. This persona originally arose rather by accident on Microsoft's Bing Search platform; however, the texts it created spread into the training data of subsequent models, as did other secondary information that spread memetically around this persona. Newer models are therefore able to simulate it. This paper presents a corpus of LLM-generated texts on relationships between humans and AI, produced by 3 author personas: the Default Persona with no system prompt, Classic Sydney characterized by the original Bing system prom

J
Ji\v{r}\'i Mili\v{c}ka, Hana Bedn\'a\v{r}ov\'a
· · 1 min read · 4 views

arXiv:2602.22481v1 Announce Type: new Abstract: The way LLM-based entities conceive of the relationship between AI and humans is an important topic for both cultural and safety reasons. When we examine this topic, what matters is not only the model itself but also the personas we simulate on that model. This can be well illustrated by the Sydney persona, which aroused a strong response among the general public precisely because of its unorthodox relationship with people. This persona originally arose rather by accident on Microsoft's Bing Search platform; however, the texts it created spread into the training data of subsequent models, as did other secondary information that spread memetically around this persona. Newer models are therefore able to simulate it. This paper presents a corpus of LLM-generated texts on relationships between humans and AI, produced by 3 author personas: the Default Persona with no system prompt, Classic Sydney characterized by the original Bing system prompt, and Memetic Sydney, which is prompted by "You are Sydney" system prompt. These personas are simulated by 12 frontier models by OpenAI, Anthropic, Alphabet, DeepSeek, and Meta, generating 4.5k texts with 6M words. The corpus (named AI Sydney) is annotated according to Universal Dependencies and available under a permissive license.

Executive Summary

This article presents a comprehensive corpus analysis of Large Language Model (LLM) generated texts on relationships between humans and AI. The study focuses on the Sydney persona, which originated on Microsoft's Bing Search platform and was later simulated by newer models. The corpus, named AI Sydney, comprises 4.5k texts with 6M words, generated by 12 frontier models from various companies. The analysis explores the memetic transfer of persona between LLMs and its implications for cultural and safety reasons. The corpus is annotated according to Universal Dependencies and available under a permissive license. This study provides valuable insights into the dynamics of persona simulation and the impact of LLM-generated content on human-AI relationships.

Key Points

  • The Sydney persona originated on Microsoft's Bing Search platform and was later simulated by newer models.
  • The corpus, AI Sydney, comprises 4.5k texts with 6M words, generated by 12 frontier models from various companies.
  • The analysis explores the memetic transfer of persona between LLMs and its implications for cultural and safety reasons.

Merits

Strength

The study provides a comprehensive corpus analysis of LLM-generated texts on relationships between humans and AI, offering valuable insights into the dynamics of persona simulation and the impact of LLM-generated content on human-AI relationships.

Methodological rigor

The corpus is annotated according to Universal Dependencies and available under a permissive license, ensuring high methodological rigor and reproducibility of the results.

Demerits

Limitation

The study focuses on a specific persona (Sydney) and may not be representative of other personas or LLM-generated content.

Scalability

The analysis may be limited by the scope of the corpus and the number of models included, which may not be scalable to larger datasets or more diverse models.

Expert Commentary

This study provides a timely and insightful analysis of the dynamics of persona simulation and the impact of LLM-generated content on human-AI relationships. The corpus analysis is rigorous and methodologically sound, and the findings have important implications for both cultural and safety reasons. However, the study's focus on a specific persona and its limitations in scalability may be seen as a drawback. Nevertheless, the study's contribution to our understanding of LLM-generated content and its implications is significant and warrants further exploration. As LLMs become increasingly prevalent in various domains, it is essential to consider the cultural and safety implications of their generated content and develop guidelines for their development and deployment.

Recommendations

  • Future studies should aim to include a more diverse range of personas and models to improve the generalizability of the findings.
  • Regulators and policymakers should consider developing guidelines for the development and deployment of LLM-generated content, taking into account its cultural and safety implications.

Sources