Towards interpretable models for language proficiency assessment: Predicting the CEFR level of Estonian learner texts
arXiv:2602.13102v1 Announce Type: new Abstract: Using NLP to analyze authentic learner language helps to build automated assessment and feedback tools. It also offers new and extensive insights into the development of second language production. However, there is a lack of research explicitly combining these aspects. This study aimed to classify Estonian proficiency examination writings (levels A2-C1), assuming that careful feature selection can lead to more explainable and generalizable machine learning models for language testing. Various linguistic properties of the training data were analyzed to identify relevant proficiency predictors associated with increasing complexity and correctness, rather than the writing task. Such lexical, morphological, surface, and error features were used to train classification models, which were compared to models that also allowed for other features. The pre-selected features yielded a similar test accuracy but reduced variation in the classification of different text types. The best classifiers achieved an accuracy of around 0.9. Additional evaluation on an earlier exam sample revealed that the writings have become more complex over a 7-10-year period, while accuracy still reached 0.8 with some feature sets. The results have been implemented in the writing evaluation module of an Estonian open-source language learning environment.
Executive Summary
The article 'Towards interpretable models for language proficiency assessment: Predicting the CEFR level of Estonian learner texts' explores the application of Natural Language Processing (NLP) to automate the assessment of language proficiency. The study focuses on Estonian learner texts, aiming to classify them according to the Common European Framework of Reference (CEFR) levels A2 to C1. By carefully selecting linguistic features (lexical, morphological, surface, and error features associated with complexity and correctness rather than the writing task), the researchers trained machine learning models that achieved high accuracy in predicting proficiency levels. An additional evaluation on an earlier exam sample revealed that the writings have become more complex over a 7-10-year period. The findings have been integrated into the writing evaluation module of an open-source Estonian language learning environment, demonstrating practical applications of the research.
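To make the feature-based approach concrete, the sketch below extracts a few surface and lexical features of the kind the study describes (text length, lexical diversity, word and sentence length). The specific feature names and formulas here are illustrative assumptions, not the authors' exact feature set, which also includes morphological and error features derived from linguistic analysis of the training data.

```python
import re

def extract_features(text: str) -> dict:
    """Illustrative surface/lexical features for proficiency classification.

    These are generic complexity proxies, not the paper's actual feature set.
    """
    tokens = text.split()
    # Lexical types: lowercase tokens stripped of trailing punctuation.
    types = {t.lower().strip(".,!?;:") for t in tokens}
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]", text) if s.strip()]
    n_tok = max(len(tokens), 1)
    return {
        "token_count": len(tokens),                      # surface: text length
        "type_token_ratio": len(types) / n_tok,          # lexical diversity
        "mean_word_length": sum(len(t) for t in tokens) / n_tok,
        "mean_sentence_length": len(tokens) / max(len(sentences), 1),
    }

feats = extract_features("Ma elan Tallinnas. Mulle meeldib eesti keel.")
```

Feature vectors like this would then be fed to a standard classifier; restricting the model to such interpretable, task-independent features is what the study argues reduces variation across text types while keeping test accuracy comparable.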
Key Points
- ▸ The study combines NLP and language proficiency assessment to build automated tools.
- ▸ Careful feature selection leads to more explainable and generalizable models.
- ▸ The best classifiers achieved an accuracy of around 0.9.
- ▸ The complexity of writings has increased over a 7-10-year period.
- ▸ The results have been implemented in an open-source language learning environment.
Merits
Innovative Approach
The study innovatively combines NLP with language proficiency assessment, offering new insights into automated assessment and feedback tools.
High Accuracy
The models achieved high accuracy, demonstrating the effectiveness of the selected features in predicting CEFR levels.
Practical Implementation
The findings have been practically applied in an open-source language learning environment, showcasing the real-world utility of the research.
Demerits
Limited Scope
The study is limited to Estonian learner texts, which may restrict the generalizability of the findings to other languages.
Feature Selection Bias
The pre-selected features, while effective, may introduce bias and limit the model's ability to capture a broader range of linguistic nuances.
Temporal Limitations
The study's observation of increased complexity over a 7-10 year period is based on a single earlier exam sample, which may not be representative of broader trends.
Expert Commentary
The study 'Towards interpretable models for language proficiency assessment: Predicting the CEFR level of Estonian learner texts' represents a significant advancement in the field of automated language assessment. By focusing on interpretable models, the researchers address a critical need for transparency and explainability in machine learning applications within education. The high accuracy achieved by the models, coupled with the practical implementation in an open-source language learning environment, underscores the potential for NLP to revolutionize language proficiency assessment. However, the study's limitations, such as the focus on Estonian learner texts and the potential bias in feature selection, highlight areas for future research. Expanding the scope to include other languages and refining feature selection methods could enhance the generalizability and robustness of the models. Additionally, the observation of increased complexity in writings over time suggests a need for continuous evaluation and adaptation of assessment tools to keep pace with evolving language proficiency standards. Overall, the study provides a solid foundation for further exploration and development in the intersection of NLP and language education.
Recommendations
- ✓ Future research should expand the scope to include a broader range of languages to enhance the generalizability of the findings.
- ✓ Refinement of feature selection methods is recommended to capture a wider array of linguistic nuances and reduce potential bias in the models.