Right at My Level: A Unified Multilingual Framework for Proficiency-Aware Text Simplification
arXiv:2604.05302v1 | Announce Type: new

Abstract: Text simplification supports second language (L2) learning by providing comprehensible input, consistent with the Input Hypothesis. However, constructing personalized parallel corpora is costly, while existing large language model (LLM)-based readability control methods rely on pre-labeled sentence corpora and primarily target English. We propose Re-RIGHT, a unified reinforcement learning framework for adaptive multilingual text simplification without parallel corpus supervision. We first show that prompting-based lexical simplification at target proficiency levels (CEFR, JLPT, TOPIK, and HSK) performs poorly at easier levels and for non-English languages, even with state-of-the-art LLMs such as GPT-5.2 and Gemini 2.5. To address this, we collect 43K vocabulary-level data points across four languages (English, Japanese, Korean, and Chinese) and train a compact 4B policy model using Re-RIGHT, which integrates three reward modules: vocabulary coverage, semantic preservation, and coherence. Compared to these stronger LLM baselines, Re-RIGHT achieves higher lexical coverage at target proficiency levels while maintaining original meaning and fluency.
Executive Summary
The article introduces Re-RIGHT, a reinforcement learning framework designed to enhance multilingual text simplification for second language (L2) learners by adapting text to specific proficiency levels (e.g., CEFR, JLPT). Unlike existing approaches that rely on pre-labeled corpora or English-centric models, Re-RIGHT operates without parallel corpus supervision and demonstrates improved performance across English, Japanese, Korean, and Chinese. By integrating vocabulary coverage, semantic preservation, and coherence rewards, the 4B policy model achieves superior lexical alignment with target proficiency levels while maintaining semantic integrity and fluency, outperforming state-of-the-art LLMs like GPT-5.2 and Gemini 2.5 in controlled comparisons.
Key Points
- ▸ Re-RIGHT addresses a critical gap in multilingual text simplification by eliminating the dependency on parallel corpora, which are costly to construct and predominantly English-focused.
- ▸ The framework leverages reinforcement learning with three reward modules—vocabulary coverage, semantic preservation, and coherence—to optimize simplification at target proficiency levels (CEFR, JLPT, TOPIK, HSK).
- ▸ Empirical validation across 43K vocabulary-level data points in four languages (English, Japanese, Korean, Chinese) shows Re-RIGHT's superiority over advanced LLMs, particularly in non-English languages and lower proficiency levels.
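The paper does not release its reward implementation, but the three modules named above suggest a weighted composite reward. The sketch below is an illustrative Python approximation only: the weights, function names, and scoring inputs are assumptions, not the authors' actual formulation, and the semantic and coherence scores are taken as precomputed values (in practice they would come from learned scorers).

```python
def vocabulary_coverage(tokens, level_vocab):
    """Fraction of output tokens found in the target proficiency-level word list."""
    if not tokens:
        return 0.0
    return sum(t in level_vocab for t in tokens) / len(tokens)

def composite_reward(tokens, level_vocab, semantic_sim, coherence_score,
                     weights=(0.5, 0.3, 0.2)):
    """Weighted sum of the three reward modules (weights are illustrative)."""
    w_cov, w_sem, w_coh = weights
    return (w_cov * vocabulary_coverage(tokens, level_vocab)
            + w_sem * semantic_sim
            + w_coh * coherence_score)

# Toy example: a tiny A2-level word list and a fully in-vocabulary candidate.
a2_vocab = {"the", "cat", "sat", "on", "mat", "is", "happy"}
tokens = ["the", "cat", "sat", "on", "the", "mat"]
reward = composite_reward(tokens, a2_vocab, semantic_sim=0.9, coherence_score=0.8)
```

With full coverage (1.0), the toy reward reduces to 0.5·1.0 + 0.3·0.9 + 0.2·0.8 = 0.93; a policy trained against such a signal is pushed toward target-level vocabulary without sacrificing meaning or fluency.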
Merits
Methodological Innovation
Re-RIGHT's reinforcement learning approach with unsupervised reward modules (vocabulary coverage, semantic preservation, coherence) represents a significant advancement over traditional rule-based or supervised LLM methods, enabling adaptive multilingual simplification without labeled data.
Empirical Rigor
The study's extensive evaluation across 43K data points and four languages, including non-English targets, provides robust evidence of the framework's generalizability and effectiveness, outperforming state-of-the-art LLMs in controlled settings.
Practical Relevance
By aligning with established proficiency frameworks (CEFR, JLPT, etc.), Re-RIGHT directly addresses a practical need in L2 education, offering a scalable solution for personalized language learning tools.
Demerits
Limited Generalization to Complex Syntax
The focus on vocabulary-level simplification may underrepresent challenges in syntactic complexity, which are critical for advanced proficiency levels. The framework's performance in simplifying structurally complex sentences remains unexamined.
Dependency on High-Quality Vocabulary Data
The 43K vocabulary-level dataset, while substantial, may not capture the full linguistic diversity required for real-world L2 learning contexts, particularly for low-resource languages or dialects.
Computational Overhead
Despite the policy model's compact 4B parameter size, reinforcement learning training is computationally expensive, and even 4B-scale inference is non-trivial, potentially limiting accessibility for researchers or institutions with constrained infrastructure.
Expert Commentary
Re-RIGHT represents a paradigm shift in multilingual text simplification by decoupling the process from the constraints of parallel corpora and English-centric models. The integration of reinforcement learning with unsupervised reward modules is particularly commendable, as it addresses a longstanding bottleneck in the field. However, the study's focus on vocabulary-level simplification, while pragmatic, may overlook higher-order linguistic challenges, such as discourse coherence or idiomatic expressions, which are critical for advanced proficiency. Additionally, the reliance on established proficiency frameworks (e.g., CEFR) may inadvertently perpetuate Western-centric biases in language assessment. Future work should explore the framework's scalability to low-resource languages and its adaptability to dynamic proficiency metrics. The methodological rigor and empirical validation are strengths, but the computational demands of reinforcement learning warrant consideration in real-world deployments.
Recommendations
- ✓ Expand the framework to incorporate syntactic and discourse-level simplification, ensuring broader applicability across proficiency levels and linguistic structures.
- ✓ Collaborate with linguists and educators to validate the framework against diverse proficiency assessment tools, mitigating potential biases in standardized frameworks like CEFR.
- ✓ Develop open-source toolkits or APIs to facilitate accessibility, enabling researchers and educators to fine-tune the model for specific languages or dialects without prohibitive computational costs.
Sources
Original: arXiv - cs.CL