
CLARIN-PT-LDB: An Open LLM Leaderboard for Portuguese to assess Language, Culture and Civility

arXiv:2603.12872v1. Abstract: This paper reports on the development of a leaderboard of Open Large Language Models (LLM) for European Portuguese (PT-PT), and on its associated benchmarks. This leaderboard comes as a way to address a gap in the evaluation of LLM for European Portuguese, which so far had no leaderboard dedicated to this variant of the language. The paper also reports on novel benchmarks, including some that address aspects of performance that so far have not been available in benchmarks for European Portuguese, namely model safeguards and alignment to Portuguese culture. The leaderboard is available at https://huggingface.co/spaces/PORTULAN/portuguese-llm-leaderboard.

João Silva, Luís Gomes, António Branco · March 16, 2026

#cs.CL

Executive Summary

This article presents CLARIN-PT-LDB, an open leaderboard for evaluating open Large Language Models (LLMs) in European Portuguese (PT-PT), together with its associated benchmarks. According to the authors, no leaderboard had previously been dedicated to this variant of the language. The benchmarks include novel ones covering aspects that earlier European Portuguese benchmarks did not address, namely model safeguards and alignment with Portuguese culture. The leaderboard is hosted on Hugging Face at https://huggingface.co/spaces/PORTULAN/portuguese-llm-leaderboard. The work also highlights the need for comparable evaluations in other languages, and it gives researchers and developers a shared resource for assessing and improving LLMs in European Portuguese.

Key Points

▸ CLARIN-PT-LDB is an open leaderboard for evaluating open Large Language Models in European Portuguese (PT-PT).
▸ Until now, no leaderboard had been dedicated to this variant of the language.
▸ Novel benchmarks cover aspects not previously available for European Portuguese, namely model safeguards and alignment with Portuguese culture.

Merits

Strength in filling a significant gap

The CLARIN-PT-LDB leaderboard addresses a long-standing need for evaluation of LLMs in European Portuguese, providing a valuable resource for researchers and developers.

Demerits

Limited scope and applicability

The leaderboard targets European Portuguese specifically, so its results may not transfer directly to other variants of Portuguese or to other languages.

Expert Commentary

The CLARIN-PT-LDB leaderboard is a useful contribution to natural language processing for European Portuguese. It gives this language variant a dedicated leaderboard for the first time and extends evaluation beyond language ability to model safeguards and cultural alignment. It also highlights the need for similar evaluations in other languages and for evaluation metrics tailored to each language. Beyond research, the results could inform improvements to natural language processing applications and may be of interest to policy makers and regulators. Overall, the work is a step towards more accurate and effective LLMs.

Recommendations

✓ Further research is needed to develop language-specific evaluation metrics and leaderboards for other language variants.
✓ The development of more nuanced and culturally sensitive benchmarks is essential to accurately assess the performance of LLMs in different language contexts.

Sources

arXiv - cs.CL

CLARIN-PT-LDB: An Open LLM Leaderboard for Portuguese to assess Language, Culture and Civility

