UpSkill: Mutual Information Skill Learning for Structured Response Diversity in LLMs
arXiv:2602.22296v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has improved the reasoning abilities of large language models (LLMs) on mathematics and programming tasks, but standard approaches that optimize single-attempt accuracy can inadvertently suppress response diversity across repeated...
Predicting Multi-Drug Resistance in Bacterial Isolates Through Performance Comparison and LIME-based Interpretation of Classification Models
arXiv:2602.22400v1 Announce Type: new Abstract: The rise of Antimicrobial Resistance, particularly Multi-Drug Resistance (MDR), presents a critical challenge for clinical decision-making due to limited treatment options and delays in conventional susceptibility testing. This study proposes an interpretable machine learning framework...
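The truncated abstract doesn't show the paper's framework, but the LIME mechanism it names — perturb an instance, weight samples by proximity, fit a local linear surrogate whose coefficients act as feature attributions — can be sketched as follows (`predict_fn` and the perturbation scale are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def lime_explain(predict_fn, x, n_samples=500, scale=0.5, seed=0):
    """Local linear surrogate in the spirit of LIME.

    Perturbs x with Gaussian noise, weights each perturbation by an
    exponential kernel on its distance to x, and fits a weighted
    least-squares linear model; its coefficients serve as local
    feature attributions for the black-box predict_fn near x.
    """
    rng = np.random.default_rng(seed)
    X = x + rng.normal(0.0, scale, size=(n_samples, x.size))  # perturb x
    y = predict_fn(X)                                         # black-box scores
    dist = np.linalg.norm(X - x, axis=1)
    w = np.exp(-dist**2 / (2 * scale**2))                     # proximity kernel
    A = np.hstack([X, np.ones((n_samples, 1))])               # intercept column
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[:-1]                                          # per-feature weights
```

On a model that is already linear, the recovered attributions match the true weights exactly, which is a quick sanity check before applying the idea to a real MDR classifier.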
MolFM-Lite: Multi-Modal Molecular Property Prediction with Conformer Ensemble Attention and Cross-Modal Fusion
arXiv:2602.22405v1 Announce Type: new Abstract: Most machine learning models for molecular property prediction rely on a single molecular representation (either a sequence, a graph, or a 3D structure) and treat molecular geometry as static. We present MolFM-Lite, a multi-modal model...
Revisiting Chebyshev Polynomial and Anisotropic RBF Models for Tabular Regression
arXiv:2602.22422v1 Announce Type: new Abstract: Smooth-basis models such as Chebyshev polynomial regressors and radial basis function (RBF) networks are well established in numerical analysis. Their continuously differentiable prediction surfaces suit surrogate optimisation, sensitivity analysis, and other settings where the response...
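The smooth-basis property the abstract emphasizes — a continuously differentiable prediction surface with analytic derivatives, useful for surrogate optimisation and sensitivity analysis — can be illustrated with numpy's built-in Chebyshev tooling (the 1-D target function here is a made-up example, not the paper's benchmark):

```python
import numpy as np
from numpy.polynomial import Chebyshev

# Hypothetical smooth tabular response on [-1, 1]
x = np.linspace(-1.0, 1.0, 200)
y = np.sin(3 * x) + 0.1 * x**2

# Least-squares fit in the Chebyshev basis
cheb = Chebyshev.fit(x, y, deg=10)

# The fitted surface has an exact analytic derivative,
# handy for gradient-based surrogate optimisation.
dcheb = cheb.deriv()
```

Because the basis is polynomial, both `cheb` and `dcheb` are cheap to evaluate anywhere in the domain, in contrast to models whose derivatives must be estimated by finite differences.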
Beyond performance-wise Contribution Evaluation in Federated Learning
arXiv:2602.22470v1 Announce Type: new Abstract: Federated learning offers a privacy-friendly collaborative learning framework, yet its success, like any joint venture, hinges on the contributions of its participants. Existing client evaluation methods predominantly focus on model performance, such as accuracy or...
Reinforcement-aware Knowledge Distillation for LLM Reasoning
arXiv:2602.22495v1 Announce Type: new Abstract: Reinforcement learning (RL) post-training has recently driven major gains in long chain-of-thought reasoning large language models (LLMs), but the high inference cost of such models motivates distillation into smaller students. Most existing knowledge distillation (KD)...
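For context on the distillation baseline such work builds on, the classic soft-target KD objective (temperature-scaled KL between teacher and student distributions, after Hinton et al.) can be sketched in a few lines — this is the standard recipe, not the reinforcement-aware variant the paper proposes:

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_kl(teacher_logits, student_logits, T=2.0):
    """Temperature-scaled KL(teacher || student), the classic KD loss.

    The T**2 factor keeps gradient magnitudes comparable across
    choices of temperature.
    """
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(T**2 * np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean())
```

The loss is zero exactly when the student matches the teacher's logits, and strictly positive otherwise, which is what makes it usable as a pure imitation signal.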
TEFL: Prediction-Residual-Guided Rolling Forecasting for Multi-Horizon Time Series
arXiv:2602.22520v1 Announce Type: new Abstract: Time series forecasting plays a critical role in domains such as transportation, energy, and meteorology. Despite their success, modern deep forecasting models are typically trained to minimize point-wise prediction loss without leveraging the rich information...
Predicting Tennis Serve Directions with Machine Learning
arXiv:2602.22527v1 Announce Type: new Abstract: Serves, especially first serves, are pivotal in professional tennis. Servers choose their serve directions strategically to maximize their winning chances while trying to remain unpredictable. On the other hand, returners try to predict serve...
Coarse-to-Fine Learning of Dynamic Causal Structures
arXiv:2602.22532v1 Announce Type: new Abstract: Learning the dynamic causal structure of time series is a challenging problem. Most existing approaches rely on distributional or structural invariance to uncover underlying causal dynamics, assuming stationary or partially stationary causality. However, these assumptions...
Employees at Google and OpenAI support Anthropic’s Pentagon stand in open letter
While Anthropic has an existing partnership with the Pentagon, the AI company has remained firm that its technology not be used for mass domestic surveillance or fully autonomous weaponry.
Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning
arXiv:2602.21420v1 Announce Type: cross Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has become the leading paradigm for enhancing reasoning in Large Language Models (LLMs). However, standard RLVR algorithms suffer from a well-documented pathology: while they improve Pass@1 accuracy through sharpened...
ECHOSAT: Estimating Canopy Height Over Space And Time
arXiv:2602.21421v1 Announce Type: cross Abstract: Forest monitoring is critical for climate change mitigation. However, existing global tree height maps provide only static snapshots and do not capture temporal forest dynamics, which are essential for accurate carbon accounting. We introduce ECHOSAT,...
Disaster Question Answering with LoRA Efficiency and Accurate End Position
arXiv:2602.21212v1 Announce Type: new Abstract: Natural disasters such as earthquakes, torrential rainfall, floods, and volcanic eruptions occur with extremely low frequency and affect limited geographic areas. When individuals face disaster situations, they often experience confusion and lack the domain-specific knowledge...
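The LoRA technique named in the title is well established: freeze the pretrained weight and learn only a low-rank update scaled by `alpha / r`, with the rank-r factor `B` initialized to zero so fine-tuning starts from the pretrained behavior. A minimal numpy sketch of that idea (shapes and hyperparameters are illustrative, not the paper's configuration):

```python
import numpy as np

class LoRALinear:
    """Frozen weight W plus a trainable rank-r update (alpha / r) * B @ A."""

    def __init__(self, W, r=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W = W                                      # frozen pretrained weight
        self.A = rng.normal(0, 0.01, (r, W.shape[1]))   # trainable, small init
        self.B = np.zeros((W.shape[0], r))              # trainable, zero init
        self.scale = alpha / r

    def __call__(self, x):
        return x @ (self.W + self.scale * self.B @ self.A).T
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly; only the small `A` and `B` matrices receive gradient updates, which is the source of LoRA's efficiency.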
Structured Prompt Language: Declarative Context Management for LLMs
arXiv:2602.21257v1 Announce Type: new Abstract: We present SPL (Structured Prompt Language), a declarative SQL-inspired language that treats large language models as generative knowledge bases and their context windows as constrained resources. SPL provides explicit WITH BUDGET/LIMIT token management, an automatic...
Under the Influence: Quantifying Persuasion and Vigilance in Large Language Models
arXiv:2602.21262v1 Announce Type: new Abstract: With increasing integration of Large Language Models (LLMs) into areas of high-stakes human decision-making, it is important to understand the risks they introduce as advisors. To be useful advisors, LLMs must sift through large amounts...
Evaluating the Usage of African-American Vernacular English in Large Language Models
arXiv:2602.21485v1 Announce Type: new Abstract: In AI, most evaluations of natural language understanding tasks are conducted in standardized dialects such as Standard American English (SAE). In this work, we investigate how accurately large language models (LLMs) represent African American Vernacular...
MixSarc: A Bangla-English Code-Mixed Corpus for Implicit Meaning Identification
arXiv:2602.21608v1 Announce Type: new Abstract: Bangla-English code-mixing is widespread across South Asian social media, yet resources for implicit meaning identification in this setting remain scarce. Existing sentiment and sarcasm models largely focus on monolingual English or high-resource languages and struggle...
When More Is Less: A Systematic Analysis of Spatial and Commonsense Information for Visual Spatial Reasoning
arXiv:2602.21619v1 Announce Type: new Abstract: Visual spatial reasoning (VSR) remains challenging for modern vision-language models (VLMs), despite advances in multimodal architectures. A common strategy is to inject additional information at inference time, such as explicit spatial cues, external commonsense knowledge,...
RuCL: Stratified Rubric-Based Curriculum Learning for Multimodal Large Language Model Reasoning
arXiv:2602.21628v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a prevailing paradigm for enhancing reasoning in Multimodal Large Language Models (MLLMs). However, relying solely on outcome supervision risks reward hacking, where models learn spurious reasoning...
Scalable Multilingual Multimodal Machine Translation with Speech-Text Fusion
arXiv:2602.21646v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) have achieved notable success in enhancing translation performance by integrating multimodal information. However, existing research primarily focuses on image-guided methods, whose applicability is constrained by the scarcity of multilingual image-text...
Evaluating the relationship between regularity and learnability in recursive numeral systems using Reinforcement Learning
arXiv:2602.21720v1 Announce Type: new Abstract: Human recursive numeral systems (i.e., counting systems such as English base-10 numerals), like many other grammatical systems, are highly regular. Following prior work that relates cross-linguistic tendencies to biases in learning, we ask whether regular...
D-COT: Disciplined Chain-of-Thought Learning for Efficient Reasoning in Small Language Models
arXiv:2602.21786v1 Announce Type: new Abstract: Chain-of-Thought (CoT) distillation from Large Language Models (LLMs) often induces "overthinking" in Small Language Models (SLMs), leading to performance degradation and excessive token consumption. In this study, we propose Disciplined Chain-of-Thought (D-CoT), a novel framework...
ExpLang: Improved Exploration and Exploitation in LLM Reasoning with On-Policy Thinking Language Selection
arXiv:2602.21887v1 Announce Type: new Abstract: Current large reasoning models (LRMs) have shown strong ability on challenging tasks after reinforcement learning (RL) based post-training. However, previous work mainly focuses on English reasoning in expectation of the strongest performance, despite the demonstrated...
CxMP: A Linguistic Minimal-Pair Benchmark for Evaluating Constructional Understanding in Language Models
arXiv:2602.21978v1 Announce Type: new Abstract: Recent work has examined language models from a linguistic perspective to better understand how they acquire language. Most existing benchmarks focus on judging grammatical acceptability, whereas the ability to interpret meanings conveyed by grammatical forms...
Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data
arXiv:2602.21320v1 Announce Type: new Abstract: Large language models (LLMs) are becoming the foundation for autonomous agents that can use tools to solve complex tasks. Reinforcement learning (RL) has emerged as a common approach for injecting such agentic capabilities, but typically...
Efficient Opportunistic Approachability
arXiv:2602.21328v1 Announce Type: new Abstract: We study the problem of opportunistic approachability: a generalization of Blackwell approachability where the learner would like to obtain stronger guarantees (i.e., approach a smaller set) when their adversary limits themselves to a subset of...
HiPPO Zoo: Explicit Memory Mechanisms for Interpretable State Space Models
arXiv:2602.21340v1 Announce Type: new Abstract: Representing the past in a compressed, efficient, and informative manner is a central problem for systems trained on sequential data. The HiPPO framework, originally proposed by Gu et al., provides a principled approach...
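For reference, the best-known instance of the HiPPO framework, the LegS operator from the original Gu et al. paper, reduces to a fixed structured matrix pair; the "zoo" variants this paper introduces are not shown in the truncated abstract. A sketch of the standard LegS construction, whose dynamics are dc/dt = -(1/t) A c + (1/t) B f(t):

```python
import numpy as np

def hippo_legs(N):
    """HiPPO-LegS (A, B) matrices from the original HiPPO paper.

    A is lower triangular with
        A[n, k] = sqrt((2n + 1)(2k + 1))  for n > k,
        A[n, n] = n + 1,
        A[n, k] = 0                        for n < k,
    and B[n] = sqrt(2n + 1).
    """
    n = np.arange(N)
    r = np.sqrt(2 * n + 1)
    A = np.tril(np.outer(r, r), k=-1) + np.diag(n + 1.0)
    B = r.copy()
    return A, B
```

The lower-triangular structure is what makes the memory update cheap, and the state c(t) stores the coefficients of an optimal polynomial approximation of the input history — the "explicit memory" the title alludes to.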
Archetypal Graph Generative Models: Explainable and Identifiable Communities via Anchor-Dominant Convex Hulls
arXiv:2602.21342v1 Announce Type: new Abstract: Representation learning has been essential for graph machine learning tasks such as link prediction, community detection, and network visualization. Despite recent advances in achieving high performance on these downstream tasks, little progress has been made...
Generative Bayesian Computation as a Scalable Alternative to Gaussian Process Surrogates
arXiv:2602.21408v1 Announce Type: new Abstract: Gaussian process (GP) surrogates are the default tool for emulating expensive computer experiments, but cubic cost, stationarity assumptions, and Gaussian predictive distributions limit their reach. We propose Generative Bayesian Computation (GBC) via Implicit Quantile Networks...
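The Implicit Quantile Networks mentioned here are trained on the quantile (pinball) loss, whose minimizer over a scalar prediction is exactly the tau-quantile of the targets — which is how the approach escapes the Gaussian predictive distributions of GP surrogates. A minimal sketch of that loss (the network itself is omitted):

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Quantile (pinball) loss; minimized over q at the tau-quantile of y.

    An IQN trains a network q_theta(x, tau) on this loss with tau drawn
    uniformly at random, so a single model represents the entire
    predictive distribution rather than just its mean and variance.
    """
    e = np.asarray(y) - q
    return float(np.mean(np.maximum(tau * e, (tau - 1.0) * e)))
```

The loss is convex in `q`, and over- versus under-prediction is penalized asymmetrically by `tau` versus `1 - tau`, which is what pins the minimizer to the requested quantile.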
Provably Safe Generative Sampling with Constricting Barrier Functions
arXiv:2602.21429v1 Announce Type: new Abstract: Flow-based generative models, such as diffusion models and flow matching models, have achieved remarkable success in learning complex data distributions. However, a critical gap remains for their deployment in safety-critical domains: the lack of formal...