Parameter-Efficient Token Embedding Editing for Clinical Class-Level Unlearning
arXiv:2603.19302v1 Abstract: Machine unlearning is increasingly important for clinical language models, where privacy regulations and institutional policies may require removing sensitive information from deployed systems without retraining from scratch. In practice, deletion requests must balance effective forgetting of targeted information with preservation of model utility and minimal parameter modification. We introduce Sparse Token Embedding Unlearning (STEU), a parameter-efficient method for behavioral class-level unlearning that updates only PMI-selected token embeddings together with a small classifier head while keeping all encoder layers frozen. Across experiments on MIMIC-IV, MIMIC-III, and eICU using BioClinicalBERT, BERT-base, and DistilBERT, STEU consistently suppresses the target class while largely preserving retained task performance. In the primary MIMIC-IV setting, STEU achieves near-complete forgetting (forget F1 = 0.0004) while maintaining competitive retained utility (retain avg F1 = 0.4766) after modifying only 0.19% of model parameters. These results suggest that targeted behavioral unlearning can be achieved through sparse embedding edits without modifying deeper encoder representations.
Executive Summary
This article summarizes a paper that proposes a novel approach to machine unlearning for clinical language models. The authors introduce Sparse Token Embedding Unlearning (STEU), a parameter-efficient method for class-level unlearning. STEU updates only token embeddings selected by Pointwise Mutual Information (PMI) and a small classifier head, while keeping all encoder layers frozen. Evaluated on several clinical datasets (MIMIC-IV, MIMIC-III, and eICU), the method achieves near-complete forgetting of the targeted class while largely preserving performance on the retained classes. This addresses a critical need in clinical applications, where model utility must be balanced against patient privacy. The results demonstrate the potential of STEU for targeted behavioral unlearning without modifying deeper encoder representations.
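The trade-off above is reported as a forget-class F1 versus an average F1 over the retained classes. A minimal sketch of these standard per-class metrics (plain Python; illustrative, not code from the paper):

```python
def per_class_f1(y_true, y_pred, cls):
    """F1 score for a single class, from true/false positives and negatives."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    prec, rec = tp / (tp + fp), tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

def forget_retain_f1(y_true, y_pred, forget_cls):
    """Forget-class F1 (should drop toward 0 after unlearning) and the
    average F1 over all other classes (should stay high)."""
    classes = sorted(set(y_true))
    forget = per_class_f1(y_true, y_pred, forget_cls)
    retain = [per_class_f1(y_true, y_pred, c) for c in classes if c != forget_cls]
    return forget, sum(retain) / len(retain)
```

A successfully unlearned model scores near 0.0 on the forget class (as in the reported forget F1 = 0.0004) while the retained average stays close to its pre-edit value.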
Key Points
- ▸ STEU is a parameter-efficient method for class-level unlearning in clinical language models.
- ▸ The method updates only PMI-selected token embeddings and a small classifier head.
- ▸ STEU achieves near-complete forgetting of targeted information while preserving retained task performance.
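The PMI-based selection in the second point can be sketched with document-level co-occurrence counts. The paper's exact PMI variant is not specified, so the following uses the standard formulation PMI(t, c) = log(P(t, c) / (P(t)P(c))):

```python
import math
from collections import Counter

def pmi_select(docs, labels, target_class, top_k=10):
    """Rank tokens by PMI with the target class and return the top_k.
    docs: list of token lists; labels: per-document class labels."""
    token_count = Counter()   # documents containing each token
    joint_count = Counter()   # target-class documents containing each token
    n_docs = len(docs)
    n_target = sum(1 for y in labels if y == target_class)
    for tokens, y in zip(docs, labels):
        for t in set(tokens):         # document-level co-occurrence
            token_count[t] += 1
            if y == target_class:
                joint_count[t] += 1
    p_c = n_target / n_docs
    scores = {}
    for t, n_t in token_count.items():
        p_t = n_t / n_docs
        p_tc = joint_count[t] / n_docs
        if p_tc > 0:                  # skip tokens never seen with the class
            scores[t] = math.log(p_tc / (p_t * p_c))
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Only the embedding rows of the returned tokens (plus the classifier head) would then be unfrozen for the unlearning update.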
Merits
Strength in Addressing Confidentiality Concerns
STEU addresses the critical need for balancing model utility and user confidentiality in clinical applications, aligning with emerging privacy regulations and institutional policies.
Parameter Efficiency
STEU modifies only a small fraction of model parameters, reducing the computational overhead and potential impact on model utility.
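As a back-of-envelope illustration of that fraction, the trainable set reduces to the selected embedding rows plus a linear head. The hidden size, class count, and total parameter count below are illustrative assumptions (BERT-base-scale), not figures from the paper beyond the reported 0.19%:

```python
def steu_trainable_fraction(n_selected_tokens, hidden_dim=768,
                            n_classes=10, total_params=110_000_000):
    """Fraction of parameters an STEU-style edit touches: only the
    selected token embedding rows plus a linear classifier head."""
    embedding_params = n_selected_tokens * hidden_dim    # edited rows only
    head_params = hidden_dim * n_classes + n_classes     # weights + bias
    return (embedding_params + head_params) / total_params
```

Even editing a few hundred embedding rows of a BERT-base-sized model stays well under one percent of its parameters, consistent in spirit with the 0.19% reported in the abstract.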
Wide Applicability
The method is tested on multiple clinical datasets, demonstrating its potential for wide applicability across different clinical language models.
Demerits
Assumes Availability of PMI Information
STEU depends on PMI statistics computed over labeled data, which may not always be readily available or cheap to compute, potentially limiting the method's applicability.
Potential Impact on Model Interpretability
The modification of token embeddings and classifier heads might affect model interpretability, particularly in high-stakes clinical applications.
Expert Commentary
The proposal of STEU marks a significant step forward in addressing the need for machine unlearning in clinical applications. While the method's performance is impressive, its applicability and potential impact on model interpretability warrant further investigation. Moreover, its alignment with emerging privacy regulations and institutional policies underscores the need for ongoing discussion of regulatory frameworks for AI in healthcare.
Recommendations
- ✓ Future work should investigate the method's applicability in real-world clinical settings, including its scalability and potential impact on model interpretability.
- ✓ Developers and policymakers should continue to engage in discussions on regulatory frameworks that support the responsible development and deployment of AI in healthcare.
Sources
Original: arXiv - cs.LG