ATPO: Adaptive Tree Policy Optimization for Multi-Turn Medical Dialogue
arXiv:2603.02216v1 Announce Type: new Abstract: Effective information seeking in multi-turn medical dialogues is critical for accurate diagnosis, especially when dealing with incomplete information. Aligning Large Language Models (LLMs) for these interactive scenarios is challenging due to the uncertainty inherent in...
Neural Paging: Learning Context Management Policies for Turing-Complete Agents
arXiv:2603.02228v1 Announce Type: new Abstract: The proof that Large Language Models (LLMs) augmented with external read-write memory constitute a computationally universal system has established the theoretical foundation for general-purpose agents. However, existing implementations face a critical bottleneck: the finite and...
Using the SEKF to Transfer NN Models of Dynamical Systems with Limited Data
arXiv:2603.02439v1 Announce Type: new Abstract: Data-driven models of dynamical systems require extensive amounts of training data. For many practical applications, gathering sufficient data is not feasible due to cost or safety concerns. This work uses the Subset Extended Kalman Filter...
Weight Updates as Activation Shifts: A Principled Framework for Steering
arXiv:2603.00425v1 Announce Type: new Abstract: Activation steering promises to be an extremely parameter-efficient form of adaptation, but its effectiveness depends on critical design choices -- such as intervention location and parameterization -- that currently rely on empirical heuristics rather than...
U-CAN: Utility-Aware Contrastive Attenuation for Efficient Unlearning in Generative Recommendation
arXiv:2602.23400v1 Announce Type: new Abstract: Generative Recommendation (GenRec) typically leverages Large Language Models (LLMs) to redefine personalization as an instruction-driven sequence generation task. However, fine-tuning on user logs inadvertently encodes sensitive attributes into model parameters, raising critical privacy concerns. Existing...
Rudder: Steering Prefetching in Distributed GNN Training using LLM Agents
arXiv:2602.23556v1 Announce Type: new Abstract: Large-scale Graph Neural Networks (GNNs) are typically trained by sampling a vertex's neighbors to a fixed distance. Because large input graphs are distributed, training requires frequent irregular communication that stalls forward progress. Moreover, fetched data...
BTTackler: A Diagnosis-based Framework for Efficient Deep Learning Hyperparameter Optimization
arXiv:2602.23630v1 Announce Type: new Abstract: Hyperparameter optimization (HPO) is known to be costly in deep learning, especially when leveraging automated approaches. Most of the existing automated HPO methods are accuracy-based, i.e., accuracy metrics are used to guide the trials of...
MINT: Multimodal Imaging-to-Speech Knowledge Transfer for Early Alzheimer's Screening
arXiv:2602.23994v1 Announce Type: new Abstract: Alzheimer's disease is a progressive neurodegenerative disorder in which mild cognitive impairment (MCI) marks a critical transition between aging and dementia. Neuroimaging modalities, such as structural MRI, provide biomarkers of this transition; however, their high...
Buffer Matters: Unleashing the Power of Off-Policy Reinforcement Learning in Large Language Model Reasoning
arXiv:2602.20722v1 Announce Type: new Abstract: Traditional on-policy Reinforcement Learning with Verifiable Rewards (RLVR) frameworks suffer from experience waste and reward homogeneity, which directly hinders learning efficiency on difficult samples during large language models post-training. In this paper, we introduce Batch...
EQ-5D Classification Using Biomedical Entity-Enriched Pre-trained Language Models and Multiple Instance Learning
arXiv:2602.21216v1 Announce Type: cross Abstract: The EQ-5D (EuroQol 5-Dimensions) is a standardized instrument for the evaluation of health-related quality of life. In health economics, systematic literature reviews (SLRs) depend on the correct identification of publications that use the EQ-5D, but...
Task-Aware LoRA Adapter Composition via Similarity Retrieval in Vector Databases
arXiv:2602.21222v1 Announce Type: cross Abstract: Parameter efficient fine tuning methods like LoRA have enabled task specific adaptation of large language models, but efficiently composing multiple specialized adapters for unseen tasks remains challenging. We present a novel framework for dynamic LoRA...
SideQuest: Model-Driven KV Cache Management for Long-Horizon Agentic Reasoning
arXiv:2602.22603v1 Announce Type: new Abstract: Long-running agentic tasks, such as deep research, require multi-hop reasoning over information distributed across multiple webpages and documents. In such tasks, the LLM context is dominated by tokens from external retrieval, causing memory usage to...
Decomposing Physician Disagreement in HealthBench
arXiv:2602.22758v1 Announce Type: new Abstract: We decompose physician disagreement in the HealthBench medical AI evaluation dataset to understand where variance resides and what observable features can explain it. Rubric identity accounts for 15.8% of met/not-met label variance but only 3.6-6.9%...
Enhancing CVRP Solver through LLM-driven Automatic Heuristic Design
arXiv:2602.23092v1 Announce Type: new Abstract: The Capacitated Vehicle Routing Problem (CVRP), a fundamental combinatorial optimization challenge, focuses on optimizing fleet operations under vehicle capacity constraints. While extensively studied in operational research, the NP-hard nature of CVRP continues to pose significant...
Improving Neural Argumentative Stance Classification in Controversial Topics with Emotion-Lexicon Features
arXiv:2602.22846v1 Announce Type: new Abstract: Argumentation mining comprises several subtasks, among which stance classification focuses on identifying the standpoint expressed in an argumentative text toward a specific target topic. While arguments-especially about controversial topics-often appeal to emotions, most prior work...
Copyright Protection for AI-Generated Works
Since the 2010s, artificial intelligence (AI) has quickly grown from another subset of machine learning (ie deep learning) in particular with recent advances in generative AI, such as ChatGPT. The use of generative AI has gone beyond leisure purposes. It...
Precision Medicine and Data Privacy: Balancing Innovation with Patient Rights
The rapid advancement of precision medicine creates unprecedented opportunities for personalized treatment while raising complex data privacy and consent challenges.
SymTorch: A Framework for Symbolic Distillation of Deep Neural Networks
arXiv:2602.21307v1 Announce Type: new Abstract: Symbolic distillation replaces neural networks, or components thereof, with interpretable, closed-form mathematical expressions. This approach has shown promise in discovering physical laws and mathematical relationships directly from trained deep learning models, yet adoption remains limited...
Benchmarking State Space Models, Transformers, and Recurrent Networks for US Grid Forecasting
arXiv:2602.21415v1 Announce Type: new Abstract: Selecting the right deep learning model for power grid forecasting is challenging, as performance heavily depends on the data available to the operator. This paper presents a comprehensive benchmark of five modern neural architectures: two...
MINAR: Mechanistic Interpretability for Neural Algorithmic Reasoning
arXiv:2602.21442v1 Announce Type: new Abstract: The recent field of neural algorithmic reasoning (NAR) studies the ability of graph neural networks (GNNs) to emulate classical algorithms like Bellman-Ford, a phenomenon known as algorithmic alignment. At the same time, recent advances in...
When Learning Hurts: Fixed-Pole RNN for Real-Time Online Training
arXiv:2602.21454v1 Announce Type: new Abstract: Recurrent neural networks (RNNs) can be interpreted as discrete-time state-space models, where the state evolution corresponds to an infinite-impulse-response (IIR) filtering operation governed by both feedforward weights and recurrent poles. While, in principle, all parameters...
Effects of Training Data Quality on Classifier Performance
arXiv:2602.21462v1 Announce Type: new Abstract: We describe extensive numerical experiments assessing and quantifying how classifier performance depends on the quality of the training data, a frequently neglected component of the analysis of classifiers. More specifically, in the scientific context of...
GradAlign: Gradient-Aligned Data Selection for LLM Reinforcement Learning
arXiv:2602.21492v1 Announce Type: new Abstract: Reinforcement learning (RL) has become a central post-training paradigm for large language models (LLMs), but its performance is highly sensitive to the quality of training problems. This sensitivity stems from the non-stationarity of RL: rollouts...
ID-LoRA: Efficient Low-Rank Adaptation Inspired by Matrix Interpolative Decomposition
arXiv:2602.20727v1 Announce Type: new Abstract: LoRA has become a universal Parameter-Efficient Fine-Tuning (PEFT) technique that equips Large Language Models (LLMs) to adapt quickly to new tasks. However, when these models are scaled up, even the latest LoRA variants still introduce...
Actor-Curator: Co-adaptive Curriculum Learning via Policy-Improvement Bandits for RL Post-Training
arXiv:2602.20532v1 Announce Type: cross Abstract: Post-training large foundation models with reinforcement learning typically relies on massive and heterogeneous datasets, making effective curriculum learning both critical and challenging. In this work, we propose ACTOR-CURATOR, a scalable and fully automated curriculum learning...
RMIT-ADM+S at the MMU-RAG NeurIPS 2025 Competition
arXiv:2602.20735v1 Announce Type: cross Abstract: This paper presents the award-winning RMIT-ADM+S system for the Text-to-Text track of the NeurIPS~2025 MMU-RAG Competition. We introduce Routing-to-RAG (R2RAG), a research-focused retrieval-augmented generation (RAG) architecture composed of lightweight components that dynamically adapt the retrieval...
FedAvg-Based CTMC Hazard Model for Federated Bridge Deterioration Assessment
arXiv:2602.20194v1 Announce Type: new Abstract: Bridge periodic inspection records contain sensitive information about public infrastructure, making cross-organizational data sharing impractical under existing data governance constraints. We propose a federated framework for estimating a Continuous-Time Markov Chain (CTMC) hazard model of...
KnapSpec: Self-Speculative Decoding via Adaptive Layer Selection as a Knapsack Problem
arXiv:2602.20217v1 Announce Type: new Abstract: Self-speculative decoding (SSD) accelerates LLM inference by skipping layers to create an efficient draft model, yet existing methods often rely on static heuristics that ignore the dynamic computational overhead of attention in long-context scenarios. We...
Imputation of Unknown Missingness in Sparse Electronic Health Records
arXiv:2602.20442v1 Announce Type: new Abstract: Machine learning holds great promise for advancing the field of medicine, with electronic health records (EHRs) serving as a primary data source. However, EHRs are often sparse and contain missing data due to various challenges...
Nonparametric Teaching of Attention Learners
arXiv:2602.20461v1 Announce Type: new Abstract: Attention learners, neural networks built on the attention mechanism, e.g., transformers, excel at learning the implicit relationships that relate sequences to their corresponding properties, e.g., mapping a given sequence of tokens to the probability of...