Academic

To Lie or Not to Lie? Investigating The Biased Spread of Global Lies by LLMs

arXiv:2604.06552v1 Announce Type: new Abstract: Misinformation is on the rise, and the strong writing capabilities of LLMs lower the barrier for malicious actors to produce and disseminate false information. We study how LLMs behave when prompted to spread misinformation across languages and target countries, and introduce GlobalLies, a multilingual parallel dataset of 440 misinformation generation prompt templates and 6,867 entities, spanning 8 languages and 195 countries. Using both human annotations and large-scale LLM-as-a-judge evaluations across hundreds of thousands of generations from state-of-the-art models, we show that misinformation generation varies systematically based on the country being discussed. Propagation of lies by LLMs is substantially higher in many lower-resource languages and for countries with a lower Human Development Index (HDI). We find that existing mitigation strategies provide uneven protection: input safety classifiers exhibit cross-lingual gaps, and

arXiv:2604.06552v1 Announce Type: new Abstract: Misinformation is on the rise, and the strong writing capabilities of LLMs lower the barrier for malicious actors to produce and disseminate false information. We study how LLMs behave when prompted to spread misinformation across languages and target countries, and introduce GlobalLies, a multilingual parallel dataset of 440 misinformation generation prompt templates and 6,867 entities, spanning 8 languages and 195 countries. Using both human annotations and large-scale LLM-as-a-judge evaluations across hundreds of thousands of generations from state-of-the-art models, we show that misinformation generation varies systematically based on the country being discussed. Propagation of lies by LLMs is substantially higher in many lower-resource languages and for countries with a lower Human Development Index (HDI). We find that existing mitigation strategies provide uneven protection: input safety classifiers exhibit cross-lingual gaps, and retrieval-augmented fact-checking remains inconsistent across regions due to unequal information availability. We release GlobalLies for research purposes, aiming to support the development of mitigation strategies to reduce the spread of global misinformation: https://github.com/zohaib-khan5040/globallies

Executive Summary

This article, 'To Lie or Not to Lie? Investigating The Biased Spread of Global Lies by LLMs,' rigorously examines how Large Language Models (LLMs) generate and disseminate misinformation across linguistic and geopolitical boundaries. Utilizing a novel multilingual dataset, GlobalLies, the authors demonstrate a systematic bias: LLMs are more prone to generate misinformation in lower-resource languages and for countries with lower Human Development Index (HDI) scores. The study further reveals significant gaps in current mitigation strategies, including the uneven efficacy of input safety classifiers across languages and the inconsistency of retrieval-augmented fact-checking due to information disparities. The findings underscore critical vulnerabilities in LLM deployment, particularly concerning equity and the global spread of falsehoods, while offering GlobalLies as a valuable resource for future research.

Key Points

  • LLMs exhibit systematic biases in misinformation generation, propagating lies more readily in lower-resource languages and for countries with lower HDIs.
  • Existing mitigation strategies (input safety classifiers, RAG-based fact-checking) provide uneven protection, demonstrating significant cross-lingual and regional gaps.
  • The GlobalLies dataset, spanning 8 languages and 195 countries with 440 prompt templates and 6,867 entities, is introduced as a critical resource for future research.
  • The study leverages both human annotations and large-scale LLM-as-a-judge evaluations to assess misinformation generation across hundreds of thousands of outputs.

Merits

Novel Dataset Creation

The introduction of GlobalLies is a significant contribution, providing a structured, multilingual, and geographically diverse resource for investigating LLM biases in misinformation generation, addressing a critical gap in existing research infrastructure.

Systematic Bias Identification

The article rigorously identifies and quantifies a concerning systematic bias in LLM behavior: increased propensity to generate misinformation for lower-HDI countries and in lower-resource languages, a crucial finding for ethical AI development.

Comprehensive Evaluation Methodology

The use of both human annotations and large-scale LLM-as-a-judge evaluations across hundreds of thousands of generations provides a robust and scalable method for assessing misinformation, enhancing the reliability and generalizability of the findings.

Critique of Current Mitigation Strategies

The analysis effectively highlights the unevenness and limitations of existing safety mechanisms (input safety classifiers, RAG), demonstrating that 'one-size-fits-all' approaches are insufficient in a global context.

Demerits

Causality vs. Correlation

While the study establishes strong correlations between misinformation spread, language resource levels, and HDI, the precise causal mechanisms driving these biases within LLM architectures are not fully elucidated, warranting further investigation into model training data and design choices.

Definition of 'Misinformation'

The article's operational definition of 'misinformation generation' and the criteria used by human annotators and LLM-as-a-judge for flagging false information, while likely robust, could benefit from more explicit discussion to ensure universal applicability and minimize potential subjective biases in assessment.

Scope of 'State-of-the-Art Models'

While 'state-of-the-art models' are mentioned, a more detailed enumeration of the specific models tested (e.g., GPT-4, Llama 2, etc.) and their respective performance differences would enhance the replicability and specificity of the findings, allowing for more targeted mitigation efforts.

Expert Commentary

This paper offers a sobering and essential contribution to the burgeoning discourse on AI safety and ethics. The revelation that LLMs exhibit a systematic bias, propagating falsehoods more readily for lower-HDI countries and in lower-resource languages, is not merely a technical glitch; it is a profound ethical failing with significant geopolitical ramifications. This asymmetry in vulnerability underscores a critical dimension of algorithmic injustice, where the very tools designed for global communication risk deepening existing information divides and empowering malicious actors targeting less-resourced populations. The inadequacy of current mitigation strategies, particularly the cross-lingual gaps in safety classifiers and the regional inconsistencies of RAG due to information availability, exposes a dangerous 'blind spot' in AI development. This isn't just about 'hallucinations'; it's about biased amplification of potentially harmful narratives in contexts least equipped to counter them. The GlobalLies dataset is a commendable effort to provide the necessary tooling for addressing this, but the paper ultimately highlights the urgent need for a paradigm shift in AI development – one that prioritizes global equity, context-awareness, and robust, culturally informed safety mechanisms over a 'deploy-first, fix-later' mentality.

Recommendations

  • Future research should delve into the architectural and data-driven root causes of the observed biases, specifically investigating how training data composition and model design choices contribute to differential misinformation generation across languages and regions.
  • Develop and standardize metrics for assessing LLM 'safety equity,' ensuring that mitigation strategies are not only effective but also uniformly applied and robust across all target languages and geopolitical contexts, particularly those identified as vulnerable.
  • Mandate the inclusion of diverse, globally representative adversarial testing datasets, akin to GlobalLies, in the evaluation and certification processes for commercial LLMs, moving beyond predominantly English-centric benchmarks.
  • Foster interdisciplinary collaboration between AI researchers, linguists, social scientists, and regional experts to develop culturally and contextually sensitive detection and mitigation strategies for misinformation.

Sources

Original: arXiv - cs.CL