Federated Active Learning Under Extreme Non-IID and Global Class Imbalance
arXiv:2603.10341v1. Abstract: Federated active learning (FAL) seeks to reduce annotation cost under privacy constraints, yet its effectiveness degrades in realistic settings with severe global class imbalance and highly heterogeneous clients. We conduct a systematic study of query-model selection in FAL and uncover a central insight: the model that achieves more class-balanced sampling, especially for minority classes, consistently leads to better final performance. Moreover, global-model querying is beneficial only when the global distribution is highly imbalanced and client data are relatively homogeneous; otherwise, the local model is preferable. Based on these findings, we propose FairFAL, an adaptive class-fair FAL framework. FairFAL (1) infers global imbalance and local-global divergence via lightweight prediction discrepancy, enabling adaptive selection between global and local query models; (2) performs prototype-guided pseudo-labeling using global features to promote class-aware querying; and (3) applies a two-stage uncertainty-diversity balanced sampling strategy with k-center refinement. Experiments on five benchmarks show that FairFAL consistently outperforms state-of-the-art approaches under challenging long-tailed and non-IID settings. The code is available at https://github.com/chenchenzong/FairFAL.
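The abstract names a two-stage uncertainty-diversity sampling strategy with k-center refinement but does not spell out the procedure. The following is a minimal, generic sketch of that idea, not the authors' implementation: stage one shortlists the highest-entropy samples, and stage two applies greedy k-center selection over feature vectors. The function names, the `pool_factor` parameter, and the use of Euclidean distance are all assumptions made for illustration.

```python
import math

def entropy(probs):
    """Shannon entropy of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def k_center_greedy(features, candidates, budget):
    """Greedy k-center: repeatedly add the candidate farthest from the chosen set."""
    if budget <= 0 or not candidates:
        return []
    chosen = [candidates[0]]  # seed with the first (most uncertain) candidate
    remaining = list(candidates[1:])
    # Distance from each remaining candidate to its nearest chosen point.
    min_dist = {i: math.dist(features[i], features[chosen[0]]) for i in remaining}
    while remaining and len(chosen) < budget:
        nxt = max(remaining, key=lambda i: min_dist[i])
        chosen.append(nxt)
        remaining.remove(nxt)
        for i in remaining:
            min_dist[i] = min(min_dist[i], math.dist(features[i], features[nxt]))
    return chosen

def two_stage_query(probs, features, budget, pool_factor=3):
    """Hypothetical two-stage query: uncertainty shortlist, then diversity refinement."""
    # Stage 1: shortlist the most uncertain samples (highest entropy).
    ranked = sorted(range(len(probs)), key=lambda i: entropy(probs[i]), reverse=True)
    candidates = ranked[: budget * pool_factor]
    # Stage 2: refine the shortlist for diversity in feature space.
    return k_center_greedy(features, candidates, budget)
```

Shortlisting before k-center keeps the pairwise distance computation limited to the uncertain candidates, which is one common reason to order the two stages this way.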
Executive Summary
This article presents FairFAL, a federated active learning (FAL) framework designed for settings with severe global class imbalance and highly heterogeneous clients. From a systematic study of query-model selection, the authors conclude that the query model that samples classes more evenly, especially minority classes, consistently yields better final performance. They also find that querying with the global model helps only when the global distribution is highly imbalanced and client data are relatively homogeneous; otherwise the local model is preferable. FairFAL therefore infers global imbalance and local-global divergence from prediction discrepancy to choose between the two query models, uses prototype-guided pseudo-labeling with global features to promote class-aware querying, and applies a two-stage uncertainty-diversity balanced sampling strategy with k-center refinement. In experiments on five benchmarks, the authors report that FairFAL consistently outperforms state-of-the-art approaches under long-tailed and non-IID settings.
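The selection rule reported in the abstract (use the global model only when the global distribution is highly imbalanced and client data are relatively homogeneous) can be illustrated with a small sketch. Everything concrete here is an assumption: imbalance is estimated from the histogram of global-model predictions, divergence from the disagreement rate between global and local predictions, and both thresholds are arbitrary placeholders rather than values from the paper.

```python
from collections import Counter

def imbalance_ratio(pred_labels, num_classes):
    """Ratio of most to least frequent predicted class (add-one smoothed)."""
    counts = Counter(pred_labels)
    freqs = [counts.get(c, 0) + 1 for c in range(num_classes)]
    return max(freqs) / min(freqs)

def disagreement_rate(global_preds, local_preds):
    """Fraction of samples on which the global and local models disagree."""
    if not global_preds:
        raise ValueError("need at least one prediction")
    differing = sum(g != l for g, l in zip(global_preds, local_preds))
    return differing / len(global_preds)

def choose_query_model(global_preds, local_preds, num_classes,
                       imb_threshold=5.0, div_threshold=0.3):
    """Hypothetical rule: global model only if imbalanced AND homogeneous."""
    imbalanced = imbalance_ratio(global_preds, num_classes) >= imb_threshold
    homogeneous = disagreement_rate(global_preds, local_preds) <= div_threshold
    return "global" if imbalanced and homogeneous else "local"
```

For instance, a client whose global-model predictions are split 18 to 2 across two classes, with the local model disagreeing on only one sample, would be routed to the global query model under these placeholder thresholds.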
Key Points
- ▸ FairFAL addresses the challenges of global class imbalance and heterogeneous clients in federated active learning.
- ▸ The authors find that the query model that achieves more class-balanced sampling, especially for minority classes, consistently leads to better final performance.
- ▸ FairFAL employs adaptive query-model selection, prototype-guided pseudo-labeling, and uncertainty-diversity balanced sampling.
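To make the prototype-guided pseudo-labeling point more concrete, here is a hedged sketch of one plausible reading: class prototypes are mean feature vectors of labeled samples, unlabeled samples take the label of the nearest prototype, and queries are then drawn round-robin across pseudo-classes. The helper names and the round-robin quota are illustrative assumptions, not details confirmed by the source.

```python
import math
from collections import defaultdict

def class_prototypes(features, labels):
    """Mean feature vector per class, computed from labeled samples."""
    sums, counts = {}, defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def pseudo_label(features, prototypes):
    """Assign each sample the class of its nearest prototype (Euclidean)."""
    return [min(prototypes, key=lambda c: math.dist(x, prototypes[c]))
            for x in features]

def class_balanced_pick(pseudo_labels, scores, budget):
    """Round-robin over pseudo-classes, taking the highest-scoring sample each turn."""
    by_class = defaultdict(list)
    for i in sorted(range(len(scores)), key=lambda i: scores[i], reverse=True):
        by_class[pseudo_labels[i]].append(i)
    picked = []
    while len(picked) < budget and any(by_class.values()):
        for c in sorted(by_class):
            if by_class[c] and len(picked) < budget:
                picked.append(by_class[c].pop(0))
    return picked
```

The round-robin step is what lets a minority pseudo-class receive a query even when its samples score lower on uncertainty than those of the majority class.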
Merits
Strength in Addressing Global Class Imbalance
FairFAL targets global class imbalance and client heterogeneity directly, and the authors report that it consistently outperforms state-of-the-art approaches on five benchmarks under long-tailed and non-IID settings.
Demerits
Limited Evaluation of Real-World Scenarios
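The evaluation relies on five benchmarks, and the abstract does not indicate that any of them is a real-world federated deployment. Benchmark results may not fully capture the complexities of real-world scenarios, which leaves the practicality and scalability of FairFAL an open question.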
Expert Commentary
The article presents a novel approach to global class imbalance and heterogeneous clients in federated active learning, and the reported results show FairFAL consistently outperforming state-of-the-art approaches under long-tailed and non-IID settings. Further evaluation in real-world scenarios would help establish its practicality and scalability. The framework also raises policy questions regarding data access, sharing, and privacy in federated learning environments. Overall, the article is a valuable contribution to federated learning and active learning.
Recommendations
- ✓ Future research should prioritize the evaluation of FairFAL in real-world scenarios to better understand its practicality and scalability.
- ✓ The authors should further explore the policy implications of FairFAL, particularly regarding data access, sharing, and privacy in federated learning environments.