Missingness Bias Calibration in Feature Attribution Explanations
arXiv:2603.04831v1 Announce Type: new Abstract: Popular explanation methods often produce unreliable feature importance scores due to missingness bias, a systematic distortion that arises when models are probed with ablated, out-of-distribution inputs. Existing solutions treat this as a deep representational flaw that requires expensive retraining or architectural modifications. In this work, we challenge this assumption and show that missingness bias can be effectively treated as a superficial artifact of the model's output space. We introduce MCal, a lightweight post-hoc method that corrects this bias by fine-tuning a simple linear head on the outputs of a frozen base model. Surprisingly, we find this simple correction consistently reduces missingness bias and is competitive with, or even outperforms, prior heavyweight approaches across diverse medical benchmarks spanning vision, language, and tabular domains.
Executive Summary
The article introduces MCal, a post-hoc method for correcting missingness bias in feature attribution explanations. By fine-tuning a simple linear head on the outputs of a frozen base model, MCal reduces missingness bias without expensive retraining or architectural modifications. Across diverse medical benchmarks spanning vision, language, and tabular domains, the approach is shown to be competitive with, or to outperform, prior heavyweight methods, offering a lightweight solution to a significant problem in explanation methods.
Key Points
- ▸ MCal is a post-hoc method to correct missingness bias in feature attribution explanations
- ▸ The approach involves fine-tuning a simple linear head on the outputs of a frozen base model
- ▸ MCal is competitive with or outperforms prior methods across diverse medical benchmarks
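The core idea behind the key points above can be sketched in a few lines. The snippet below is a minimal toy illustration, not the paper's actual recipe: it assumes a frozen linear "base model", simulates ablated inputs by zeroing random features (the out-of-distribution probes that cause missingness bias), and fits a simple linear head on the frozen model's outputs so that predictions on ablated inputs better match the clean-input predictions. All names (`base_model`, `W_head`, the least-squares objective) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen "base model": a fixed random linear map to logits.
D, C, N = 8, 3, 256
W_base = rng.normal(size=(D, C))

def base_model(x):
    # Frozen: never updated by the correction step.
    return x @ W_base

X = rng.normal(size=(N, D))

# Ablated inputs: zero out roughly half the features per sample,
# mimicking the out-of-distribution probes used by attribution methods.
mask = rng.random((N, D)) < 0.5
X_abl = X * mask

# MCal-style correction (a sketch under the assumptions above): fit a
# simple linear head on the frozen model's outputs so that predictions
# on ablated inputs approximate the clean-input predictions.
Z_abl = base_model(X_abl)      # frozen outputs on ablated inputs
target = base_model(X)         # frozen outputs on clean inputs
W_head, *_ = np.linalg.lstsq(Z_abl, target, rcond=None)

def calibrated(x):
    # Lightweight post-hoc head applied on top of the frozen model.
    return base_model(x) @ W_head

# Mean squared gap to the clean-input outputs, before and after the head.
raw_err = np.mean((Z_abl - target) ** 2)
cal_err = np.mean((Z_abl @ W_head - target) ** 2)
print(f"raw: {raw_err:.3f}  calibrated: {cal_err:.3f}")
```

Because the identity map is one feasible head, the fitted least-squares head can never increase the gap on the training probes; on this toy data the calibrated error is strictly smaller, echoing the article's claim that a linear head over a frozen model's outputs already removes much of the bias.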
Merits
Efficiency
MCal offers a lightweight solution that does not require expensive retraining or architectural modifications, making it an efficient approach to correcting missingness bias.
Demerits
Generalizability
The article primarily focuses on medical benchmarks, and the generalizability of MCal to other domains or more complex models may require further investigation.
Expert Commentary
The introduction of MCal marks a significant step forward in addressing the long-standing issue of missingness bias in feature attribution explanations. By demonstrating that this bias can be effectively treated as a superficial artifact of the model's output space, the authors challenge the conventional view that it demands retraining or architectural change, and offer a practical correction that can be widely adopted. The efficiency and competitiveness of MCal make it an attractive option for researchers and practitioners seeking to improve the reliability of their feature attribution explanations.
Recommendations
- ✓ Further research should investigate the applicability of MCal to other domains and more complex models to fully explore its potential.
- ✓ Practitioners should consider integrating MCal into their workflow to enhance the reliability of feature attribution explanations in their models.