Cross-Domain Uncertainty Quantification for Selective Prediction: A Comprehensive Bound Ablation with Transfer-Informed Betting
arXiv:2603.08907v1. Abstract: We present a comprehensive ablation of nine finite-sample bound families for selective prediction with risk control, combining concentration inequalities (Hoeffding, Empirical Bernstein, Clopper-Pearson, Wasserstein DRO, CVaR) with multiple-testing corrections (union bound, Learn Then Test fixed-sequence) and betting-based confidence sequences (WSR). Our main theoretical contribution is Transfer-Informed Betting (TIB), which warm-starts the WSR wealth process using a source domain's risk profile, achieving tighter bounds in data-scarce settings with a formal dominance guarantee. We prove that the TIB wealth process remains a valid supermartingale under all source-target divergences, that TIB dominates standard WSR when domains match, and that no data-independent warm-start can achieve better convergence. The combination of betting-based confidence sequences, LTT monotone testing, and cross-domain transfer is, to our knowledge, a three-way novelty not present in the literature. We evaluate all nine bound families on four benchmarks, MASSIVE (n=1,102), NyayaBench (n=280), CLINC-150 (n=22.5K), and Banking77 (n=13K), across 18 (alpha, delta) configurations. On MASSIVE at alpha=0.10, LTT eliminates the ln(K) union-bound penalty, achieving 94.0% guaranteed coverage versus 73.8% for Hoeffding, a 27% relative improvement. On NyayaBench, where the small calibration set makes Hoeffding-family bounds infeasible below alpha=0.20, Transfer-Informed Betting achieves 18.5% coverage at alpha=0.10, a 5.4x improvement over LTT + Hoeffding. We additionally compare with split-conformal prediction, showing that conformal methods produce prediction sets (avg. 1.67 classes) whereas selective prediction provides single-prediction risk guarantees. We apply these methods to agentic caching systems, formalizing a progressive trust model where the guarantee determines when cached responses can be served autonomously.
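To make the ln(K) union-bound penalty concrete, the following is a minimal sketch, not the paper's code. It assumes K candidate confidence thresholds ordered from most to least conservative, each with an empirical selective risk and a count of accepted calibration examples, and losses in [0, 1]. The function names and the toy numbers are hypothetical. The union bound tests every threshold at level delta/K, which adds ln(K) inside the Hoeffding slack; fixed-sequence testing checks thresholds in order at the full level delta and stops at the first failure.

```python
import math


def hoeffding_ucb(emp_risk, n, delta):
    """One-sided Hoeffding upper bound on the mean of n losses in [0, 1]."""
    return emp_risk + math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def certify_union(emp_risks, counts, alpha, delta):
    """Test all K thresholds at level delta / K (union bound)."""
    k = len(emp_risks)
    return [
        i
        for i, (risk, n) in enumerate(zip(emp_risks, counts))
        if hoeffding_ucb(risk, n, delta / k) <= alpha
    ]


def certify_fixed_sequence(emp_risks, counts, alpha, delta):
    """Test thresholds in a pre-specified order at full level delta.

    Stops at the first threshold whose bound exceeds alpha, so no
    ln(K) correction is needed (assumed ordering: most conservative first).
    """
    certified = []
    for i, (risk, n) in enumerate(zip(emp_risks, counts)):
        if hoeffding_ucb(risk, n, delta) > alpha:
            break
        certified.append(i)
    return certified


# Hypothetical toy inputs: two thresholds, alpha = delta = 0.10.
emp_risks = [0.02, 0.05]
counts = [200, 300]
union_ok = certify_union(emp_risks, counts, alpha=0.10, delta=0.10)
sequence_ok = certify_fixed_sequence(emp_risks, counts, alpha=0.10, delta=0.10)
```

With these toy numbers the fixed-sequence procedure certifies the first threshold while the union bound certifies none, which is the qualitative effect the abstract reports at larger scale.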
Executive Summary
The paper ablates nine finite-sample bound families for selective prediction with risk control, spanning concentration inequalities, multiple-testing corrections, and betting-based confidence sequences. Its main contribution is Transfer-Informed Betting (TIB), which warm-starts the WSR wealth process using a source domain's risk profile and yields tighter bounds when calibration data is scarce. Across four benchmarks and 18 (alpha, delta) configurations, the headline results are that LTT reaches 94.0% guaranteed coverage on MASSIVE at alpha=0.10 versus 73.8% for Hoeffding, and that TIB reaches 18.5% coverage on NyayaBench at alpha=0.10, a 5.4x improvement over LTT + Hoeffding. The paper also applies the methods to agentic caching systems through a progressive trust model in which the guarantee determines when cached responses can be served autonomously.
Key Points
- ▸ Transfer-Informed Betting (TIB) achieves tighter bounds in data-scarce settings
- ▸ TIB warm-starts the WSR wealth process using a source domain's risk profile
- ▸ On MASSIVE at alpha=0.10, LTT reaches 94.0% guaranteed coverage versus 73.8% for Hoeffding; on NyayaBench, TIB reaches 18.5% coverage, a 5.4x improvement over LTT + Hoeffding
- ▸ Application to agentic caching systems provides a progressive trust model for autonomous response serving
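The warm-start idea can be illustrated with a small betting test. This is a hypothetical sketch and not the paper's TIB construction, which the abstract does not spell out. It tests the null hypothesis that the expected loss is at least alpha, using a wealth process that starts at 1 and multiplies by 1 - lambda_t * (x_t - alpha) at each step. The bet lambda_t is a plug-in rule in the spirit of predictable betting schemes, and the parameters prior_mean, prior_var, and prior_weight are assumed names for source-domain statistics that seed the running estimates. Because each bet depends only on the prior and on past observations, the wealth stays a nonnegative supermartingale under the null no matter how poorly the source matches the target; a bad prior can only slow rejection.

```python
import math


def betting_rejects(losses, alpha, delta,
                    prior_mean=0.5, prior_var=0.25, prior_weight=1.0):
    """Return True if H0: E[loss] >= alpha is rejected at level delta.

    Losses must lie in [0, 1]. The prior_* arguments are hypothetical
    source-domain statistics; they only shape the bets, not validity.
    """
    lam_cap = 0.5 / (1.0 - alpha)      # keeps every wealth factor >= 0.5
    threshold = math.log(1.0 / delta)  # Ville's inequality: reject at 1/delta
    log_wealth = 0.0                   # wealth starts at 1

    # Running estimates seeded from the (assumed) source domain.
    weight = prior_weight
    mean = prior_mean
    var_sum = prior_weight * prior_var

    for t, x in enumerate(losses, start=1):
        # The bet uses only the prior and losses seen before step t.
        var = max(var_sum / weight, 1e-6)
        lam = math.sqrt(2.0 * threshold / (var * t * math.log(t + 1.0)))
        lam = min(lam, lam_cap)

        # Wealth grows when the observed loss is below alpha.
        log_wealth += math.log(1.0 - lam * (x - alpha))
        if log_wealth >= threshold:
            return True

        # Update the estimates only after the bet has been placed.
        weight += 1.0
        mean += (x - mean) / weight
        var_sum += (x - mean) ** 2
    return False


# Hypothetical usage: default prior versus a source-informed prior.
cold = betting_rejects([0.0] * 200, alpha=0.10, delta=0.10)
warm = betting_rejects([0.0] * 200, alpha=0.10, delta=0.10,
                       prior_mean=0.02, prior_var=0.02, prior_weight=50.0)
```

Working in log-wealth avoids overflow on long sequences, and capping the bet keeps each factor strictly positive so the logarithm is always defined.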
Merits
Strength in Theoretical Contributions
The article introduces TIB and proves three results about it: the TIB wealth process remains a valid supermartingale under all source-target divergences, TIB dominates standard WSR when domains match, and no data-independent warm-start can achieve better convergence.
Improvements in Practical Applications
The gains are largest where calibration data is scarce. On NyayaBench (n=280), Hoeffding-family bounds are infeasible below alpha=0.20, while TIB still achieves 18.5% coverage at alpha=0.10.
Interdisciplinary Approach
The article applies the methods to agentic caching systems, providing a progressive trust model for autonomous response serving, which highlights the interdisciplinary potential of the research.
Demerits
Complexity of Methods
The TIB method and other approaches presented in the article may be complex and challenging to implement, particularly for non-experts.
Limited Evaluation on Real-World Data
The study evaluates the methods on four benchmarks, two of which are small (MASSIVE, n=1,102; NyayaBench, n=280). Extending the evaluation to data from deployed systems would further validate the findings.
Expert Commentary
This article contributes to transfer learning and risk control for selective prediction. TIB and the LTT-based bounds improve on Hoeffding-style baselines, most clearly in data-scarce settings, and the application to agentic caching systems shows how the guarantees can gate autonomous response serving. The complexity of the methods and the evaluation on only four benchmarks are notable limitations. Even so, the results matter for both theory and practice and warrant further research.
Recommendations
- ✓ Future research should focus on extending the evaluation to real-world data and exploring the application of the TIB method and other approaches to various domains.
- ✓ Developing more accessible and user-friendly implementations of the TIB method and other approaches would facilitate broader adoption and application.