The Mean is the Mirage: Entropy-Adaptive Model Merging under Heterogeneous Domain Shifts in Medical Imaging

arXiv:2602.21372v1 Announce Type: cross Abstract: Model merging under unseen test-time distribution shifts often renders naive strategies, such as mean averaging, unreliable. This challenge is especially acute in medical imaging, where models are fine-tuned locally at clinics on private data, producing domain-specific models that differ by scanner, protocol, and population. When deployed at an unseen clinical site, test cases arrive in unlabeled, non-i.i.d. batches, and the model must adapt immediately without labels. In this work, we introduce an entropy-adaptive, fully online model-merging method that yields a batch-specific merged model via only forward passes, effectively leveraging target information. We further demonstrate why mean merging is prone to failure and misaligned under heterogeneous domain shifts. Next, we mitigate encoder-classifier mismatch by decoupling the encoder and classification head, merging with separate merging coefficients. We extensively evaluate our method with state-of-the-art baselines using two backbones across nine medical and natural-domain generalization image classification datasets, showing consistent gains across standard evaluation and challenging scenarios. These performance gains are achieved while retaining single-model inference at test time, thereby demonstrating the effectiveness of our method.

Executive Summary

This article presents an entropy-adaptive, fully online model-merging method for medical imaging under heterogeneous domain shifts. The approach addresses the limitations of naive strategies such as mean averaging in two ways: it uses only forward passes on unlabeled target batches to produce a batch-specific merged model, and it decouples the encoder from the classification head so that each is merged with its own coefficients. The method is evaluated against state-of-the-art baselines with two backbones on nine medical and natural-domain generalization image classification datasets, with consistent gains reported across standard evaluation and challenging scenarios. The merged model is still a single model at test time, which makes the approach a practical candidate for real-world medical image classification.
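As a rough illustration of what "entropy-adaptive" merging from forward passes could look like, the sketch below scores each source model by the mean entropy of its predictions on the current unlabeled batch and turns those scores into merging coefficients. The abstract does not give the paper's actual formula, so the softmax-over-negative-entropy rule, the `temperature` parameter, and the function names are all assumptions made for illustration.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def mean_entropy(logits):
    """Mean Shannon entropy (in nats) of softmax predictions over a batch.

    logits: array of shape (batch_size, num_classes).
    """
    p = softmax(logits, axis=-1)
    return float(-(p * np.log(p + 1e-12)).sum(axis=-1).mean())


def entropy_coefficients(batch_logits, temperature=1.0):
    """Map per-model batch entropies to coefficients that sum to 1.

    ASSUMPTION: lower entropy (a more confident source model on this batch)
    gets a larger coefficient via a softmax over negative entropies. This is
    an illustrative rule, not the formula from the paper.

    batch_logits: list of (batch_size, num_classes) arrays, one per model,
    each obtained from a forward pass on the same unlabeled batch.
    """
    entropies = np.array([mean_entropy(l) for l in batch_logits])
    return softmax(-entropies / temperature)


if __name__ == "__main__":
    confident = np.array([[10.0, 0.0, 0.0], [0.0, 10.0, 0.0]])
    uncertain = np.zeros((2, 3))  # uniform predictions, maximal entropy
    print(entropy_coefficients([confident, uncertain]))
```

No labels or gradient steps are involved: the coefficients depend only on the logits each source model produces for the incoming batch, so they can be recomputed for every batch.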

Key Points

  • The proposed method addresses the limitations of naive strategies like mean averaging under heterogeneous domain shifts.
  • The method leverages target information from unlabeled test batches and decouples the encoder from the classification head, merging each with separate coefficients to mitigate encoder-classifier mismatch.
  • The proposed method is evaluated against state-of-the-art baselines, using two backbones, on nine medical and natural-domain generalization image classification datasets.
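The decoupling idea in the second point can be sketched as a weighted average in parameter space that uses one coefficient vector for encoder parameters and another for head parameters. This is a minimal sketch under stated assumptions, not the authors' implementation: the dictionary-of-arrays model representation, the `head.` name prefix, and the function `merge_params` are hypothetical.

```python
import numpy as np


def merge_params(models, enc_coef, head_coef, head_prefix="head."):
    """Merge source models into one parameter set.

    models:    list of dicts mapping parameter name -> np.ndarray; all
               models are assumed to share the same names and shapes.
    enc_coef:  one coefficient per model, applied to encoder parameters.
    head_coef: one coefficient per model, applied to head parameters.

    ASSUMPTION: parameters whose name starts with `head_prefix` belong to
    the classification head; everything else is treated as the encoder.
    """
    merged = {}
    for name in models[0]:
        coef = head_coef if name.startswith(head_prefix) else enc_coef
        merged[name] = sum(c * m[name] for c, m in zip(coef, models))
    return merged


if __name__ == "__main__":
    clinic_a = {"encoder.w": np.array([1.0, 1.0]), "head.w": np.array([0.0])}
    clinic_b = {"encoder.w": np.array([3.0, 3.0]), "head.w": np.array([4.0])}
    merged = merge_params(
        [clinic_a, clinic_b], enc_coef=[0.5, 0.5], head_coef=[0.25, 0.75]
    )
    print(merged)
```

Setting both coefficient vectors to the uniform value 1/K for K source models reduces this to plain mean merging. The output is a single parameter set, so prediction on the batch needs only one model.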

Merits

Strength in Addressing Domain Shifts

The proposed method effectively addresses the challenges of domain shifts in medical imaging, a critical issue in real-world applications.

Improved Performance

The method demonstrates consistent performance gains across standard evaluation and challenging scenarios, making it a valuable contribution to the field.

Retained Single-Model Inference

The proposed method retains single-model inference at test-time, making it a practical solution for real-world medical image classification tasks.

Demerits

Limited Evaluation on Real-World Data

The abstract reports results on nine medical and natural-domain benchmark datasets; the method's performance in a live clinical deployment remains to be seen.

Potential Overfitting to Specific Datasets

The reported gains come from a fixed set of benchmark datasets. If design choices were tuned to those benchmarks, performance on genuinely unseen data could be lower than reported.

Expert Commentary

The proposed method addresses a critical issue in medical imaging, where models need to adapt to new data distributions and environments. Its ability to leverage target information and to decouple the encoder from the classification head makes it a valuable contribution to the field. However, the lack of evidence from live clinical deployment and the potential for overfitting to specific datasets remain concerns that need to be addressed. The method has implications for real-world medical image classification tasks and can inform the development of policies for model deployment and adaptation in medical imaging applications.

Recommendations

  • Further evaluation of the proposed method on real-world data is necessary to assess its performance and robustness.
  • The method should be adapted to address potential overfitting to specific datasets and improve its generalizability.
