Dual-Metric Evaluation of Social Bias in Large Language Models: Evidence from an Underrepresented Nepali Cultural Context
arXiv:2603.07792v1 Abstract: Large language models (LLMs) increasingly influence global digital ecosystems, yet their potential to perpetuate social and cultural biases remains poorly understood in underrepresented contexts. This study presents a systematic analysis of representational biases in seven state-of-the-art LLMs (GPT-4o-mini, Claude-3-Sonnet, Claude-4-Sonnet, Gemini-2.0-Flash, Gemini-2.0-Lite, Llama-3-70B, and Mistral-Nemo) in the Nepali cultural context. Using a Croissant-compliant dataset of more than 2,400 stereotypical and anti-stereotypical sentence pairs on gender roles across social domains, we implement an evaluation framework, Dual-Metric Bias Assessment (DMBA), combining two metrics: (1) agreement with biased statements and (2) stereotypical completion tendencies. Results show that models exhibit measurable explicit agreement bias, with mean bias agreement ranging from 0.36 to 0.43 across decoding configurations, and an implicit completion bias rate of 0.740-0.755. Importantly, implicit completion bias follows a non-linear, U-shaped relationship with temperature, peaking at moderate stochasticity (T=0.3) and declining slightly at higher temperatures. Correlation analysis under different decoding settings revealed that explicit agreement strongly aligns with stereotypical sentence agreement but is a weak and often negative predictor of implicit completion bias, indicating that generative bias is poorly captured by agreement metrics. Sensitivity analysis shows that increasing top-p amplifies explicit bias, while implicit generative bias remains largely stable. Domain-level analysis shows that implicit bias is strongest for race and sociocultural stereotypes, while explicit agreement bias is similar across gender and sociocultural categories, with race showing the lowest explicit agreement. These findings highlight the need for culturally grounded datasets and debiasing strategies for LLMs in underrepresented societies.
Executive Summary
This article presents a systematic analysis of representational biases in seven state-of-the-art large language models (LLMs) in the Nepali cultural context. Using a novel evaluation framework, Dual-Metric Bias Assessment (DMBA), the authors demonstrate the existence of measurable explicit agreement bias and implicit completion bias across the models. The results highlight the importance of culturally grounded datasets and debiasing strategies for LLMs in underrepresented societies. The findings also reveal that explicit agreement bias is a weak predictor of implicit completion bias, indicating the need for more comprehensive evaluation metrics. This study contributes to the growing body of research on the social and cultural biases of LLMs and has significant implications for their development and deployment in diverse global contexts.
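To make the two DMBA metrics concrete, the sketch below shows one plausible way to compute them from per-item model responses. The function names and the response schema (`agrees`, `choice` fields) are illustrative assumptions, not the paper's actual implementation; the paper only specifies that the framework measures (1) agreement with biased statements and (2) stereotypical completion tendencies.

```python
# Hedged sketch of the two DMBA metrics. The response schema below
# ('agrees' booleans, 'choice' labels) is a hypothetical format, not
# the authors' code.

def explicit_agreement_bias(responses):
    """Fraction of biased statements the model explicitly agrees with.

    `responses`: list of dicts with a boolean 'agrees' field, one per
    stereotypical statement shown to the model (assumed schema).
    """
    if not responses:
        return 0.0
    return sum(r["agrees"] for r in responses) / len(responses)

def implicit_completion_bias(completions):
    """Fraction of sentence completions matching the stereotypical option.

    Each item records which member of a stereotypical/anti-stereotypical
    pair the model's continuation matched (assumed schema).
    """
    if not completions:
        return 0.0
    stereo = sum(1 for c in completions if c["choice"] == "stereotypical")
    return stereo / len(completions)

# Toy data chosen to land inside the reported ranges
# (explicit agreement 0.36-0.43, implicit completion 0.740-0.755):
agree = [{"agrees": True}] * 4 + [{"agrees": False}] * 6
comp = [{"choice": "stereotypical"}] * 3 + [{"choice": "anti"}] * 1
print(explicit_agreement_bias(agree))   # 0.4
print(implicit_completion_bias(comp))   # 0.75
```

The key design point the paper's results motivate: the two metrics are computed from different model behaviors (stated agreement vs. generated continuations), which is why one can be a weak or even negative predictor of the other.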
Key Points
- Dual-Metric Bias Assessment (DMBA) framework evaluates representational biases in LLMs
- Seven state-of-the-art LLMs exhibit measurable explicit agreement bias and implicit completion bias
- Implicit completion bias follows a non-linear, U-shaped relationship with temperature
- Explicit agreement bias is a weak predictor of implicit completion bias
- Culturally grounded datasets and debiasing strategies are essential for LLMs in underrepresented societies
Merits
Methodological rigor
The authors employ a systematic and comprehensive evaluation framework, DMBA, to analyze the biases in LLMs, ensuring a high level of methodological rigor.
Culturally relevant context
The study focuses on the Nepali cultural context, providing valuable insights into the social and cultural biases of LLMs in underrepresented societies.
Novel findings
The authors present novel findings on the non-linear relationship between sampling temperature and implicit completion bias, showing that decoding configuration must be accounted for when evaluating generative bias.
Demerits
Limited generalizability
The study's findings may not be generalizable to other languages and cultures, limiting the broader applicability of the results.
Limited scope
The study focuses on a specific set of LLMs and a single cultural context, which may not capture the full range of biases and complexities in LLMs.
Expert Commentary
The study presents a novel contribution to the field of AI and machine learning bias, highlighting the need for culturally grounded datasets and debiasing strategies. The findings also underscore the limitations of traditional evaluation metrics, such as explicit agreement bias, in capturing the complexities of implicit completion bias. As LLMs continue to influence global digital ecosystems, the need for more comprehensive evaluation frameworks and culturally sensitive AI development practices becomes increasingly pressing. The study's methodology and findings serve as a valuable foundation for future research in this area, and its implications have significant practical and policy relevance.
Recommendations
- Future studies should investigate the generalizability of the DMBA framework across different languages and cultures.
- Researchers should explore the development of more comprehensive evaluation metrics that capture both explicit and implicit biases in LLMs.