Lipschitz-Based Robustness Certification Under Floating-Point Execution
arXiv:2603.13334v1 Announce Type: new Abstract: Sensitivity-based robustness certification has emerged as a practical approach for certifying neural network robustness, including in settings that require verifiable guarantees. A key advantage of these methods is that certification is performed by concrete numerical...
PolyGLU: State-Conditional Activation Routing in Transformer Feed-Forward Networks
arXiv:2603.13347v1 Announce Type: new Abstract: Biological neural systems employ diverse neurotransmitters -- glutamate, GABA, dopamine, acetylcholine -- to implement distinct signal-processing modalities within shared neural circuits. In contrast, modern transformers apply a single fixed activation function across all feed-forward neurons....
AI Planning Framework for LLM-Based Web Agents
arXiv:2603.12710v1 Announce Type: new Abstract: Developing autonomous agents for web-based tasks is a core challenge in AI. While Large Language Model (LLM) agents can interpret complex user requests, they often operate as black boxes, making it difficult to diagnose why...
Structured Distillation for Personalized Agent Memory: 11x Token Reduction with Retrieval Preservation
arXiv:2603.13017v1 Announce Type: new Abstract: Long conversations with an AI agent create a simple problem for one user: the history is useful, but carrying it verbatim is expensive. We study personalized agent memory: one user's conversation history with an agent,...
DART: Input-Difficulty-AwaRe Adaptive Threshold for Early-Exit DNNs
arXiv:2603.12269v1 Announce Type: cross Abstract: Early-exit deep neural networks enable adaptive inference by terminating computation when sufficient confidence is achieved, reducing cost for edge AI accelerators in resource-constrained settings. Existing methods, however, rely on suboptimal exit policies, ignore input difficulty,...
Context is all you need: Towards autonomous model-based process design using agentic AI in flowsheet simulations
arXiv:2603.12813v1 Announce Type: new Abstract: Agentic AI systems integrating large language models (LLMs) with reasoning and tool-use capabilities are transforming various domains, in particular software development. In contrast, their application in chemical process flowsheet modelling remains largely unexplored. In...
The DIME Architecture: A Unified Operational Algorithm for Neural Representation, Dynamics, Control and Integration
arXiv:2603.12286v1 Announce Type: cross Abstract: Modern neuroscience has accumulated extensive evidence on perception, memory, prediction, valuation, and consciousness, yet still lacks an explicit operational architecture capable of integrating these phenomena within a unified computational framework. Existing theories address specific aspects...
Global Evolutionary Steering: Refining Activation Steering Control via Cross-Layer Consistency
arXiv:2603.12298v1 Announce Type: cross Abstract: Activation engineering enables precise control over Large Language Models (LLMs) without the computational cost of fine-tuning. However, existing methods deriving vectors from static activation differences are susceptible to high-dimensional noise and layer-wise semantic drift, often...
On Using Machine Learning to Early Detect Catastrophic Failures in Marine Diesel Engines
arXiv:2603.12733v1 Announce Type: new Abstract: Catastrophic failures of marine engines imply severe loss of functionality and destroy or damage the systems irreversibly. Being sudden and often unpredictable events, they pose a severe threat to navigation, crew, and passengers. The abrupt...
Steve-Evolving: Open-World Embodied Self-Evolution via Fine-Grained Diagnosis and Dual-Track Knowledge Distillation
arXiv:2603.13131v1 Announce Type: new Abstract: Open-world embodied agents must solve long-horizon tasks where the main bottleneck is not single-step planning quality but how interaction experience is organized and evolved. To this end, we present Steve-Evolving, a non-parametric self-evolving framework that...
Synthetic Data Generation for Brain-Computer Interfaces: Overview, Benchmarking, and Future Directions
arXiv:2603.12296v1 Announce Type: cross Abstract: Deep learning has achieved transformative performance across diverse domains, largely driven by large-scale, high-quality training data. In contrast, the development of brain-computer interfaces (BCIs) is fundamentally constrained by the limited, heterogeneous, and privacy-sensitive neural...
Optimizing Task Completion Time Updates Using POMDPs
arXiv:2603.12340v1 Announce Type: cross Abstract: Managing announced task completion times is a fundamental control problem in project management. While extensive research exists on estimating task durations and task scheduling, the problem of when and how to update completion times communicated...
Revisiting Model Stitching In the Foundation Model Era
arXiv:2603.12433v1 Announce Type: cross Abstract: Model stitching, connecting early layers of one model (source) to later layers of another (target) via a light stitch layer, has served as a probe of representational compatibility. Prior work finds that models trained on...
TRACE: Temporal Rule-Anchored Chain-of-Evidence on Knowledge Graphs for Interpretable Stock Movement Prediction
arXiv:2603.12500v1 Announce Type: cross Abstract: We present a Temporal Rule-Anchored Chain-of-Evidence (TRACE) on knowledge graphs for interpretable stock movement prediction that unifies symbolic relational priors, dynamic graph exploration, and LLM-guided decision making in a single end-to-end pipeline. The approach performs...
When LLM Judge Scores Look Good but Best-of-N Decisions Fail
arXiv:2603.12520v1 Announce Type: cross Abstract: Large language models are often used as judges to score candidate responses, then validated with a single global metric such as correlation with reference labels. This can be misleading when the real deployment task is...
ActTail: Global Activation Sparsity in Large Language Models
arXiv:2603.12272v1 Announce Type: new Abstract: Activation sparsity is a promising approach for accelerating large language model (LLM) inference by reducing computation and memory movement. However, existing activation sparsity methods typically apply uniform sparsity across projections, ignoring the heterogeneous statistical properties...
Interpreting Negation in GPT-2: Layer- and Head-Level Causal Analysis
arXiv:2603.12423v1 Announce Type: new Abstract: Negation remains a persistent challenge for modern language models, often causing reversed meanings or factual errors. In this work, we conduct a causal analysis of how GPT-2 Small internally processes such linguistic transformations. We examine...
AgentDrift: Unsafe Recommendation Drift Under Tool Corruption Hidden by Ranking Metrics in LLM Agents
arXiv:2603.12564v1 Announce Type: new Abstract: Tool-augmented LLM agents increasingly serve as multi-turn advisors in high-stakes domains, yet their evaluation relies on ranking-quality metrics that measure what is recommended but not whether it is safe for the user. We introduce a...
Continual Learning in Large Language Models: Methods, Challenges, and Opportunities
arXiv:2603.12658v1 Announce Type: new Abstract: Continual learning (CL) has emerged as a pivotal paradigm to enable large language models (LLMs) to dynamically adapt to evolving knowledge and sequential tasks while mitigating catastrophic forgetting, a critical limitation of the static pre-training paradigm...
SteerRM: Debiasing Reward Models via Sparse Autoencoders
arXiv:2603.12795v1 Announce Type: new Abstract: Reward models (RMs) are critical components of alignment pipelines, yet they exhibit biases toward superficial stylistic cues, preferring better-presented responses over semantically superior ones. Existing debiasing methods typically require retraining or architectural modifications, while direct...
Rethinking Multiple-Choice Questions for RLVR: Unlocking Potential via Distractor Design
arXiv:2603.12826v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) significantly enhances the reasoning capabilities of Large Language Models. When applied to RLVR, Multiple-Choice Questions (MCQs) offer a scalable source of verifiable data but risk inducing reward hacking, where...
DS$^2$-Instruct: Domain-Specific Data Synthesis for Large Language Models Instruction Tuning
arXiv:2603.12932v1 Announce Type: new Abstract: Adapting Large Language Models (LLMs) to specialized domains requires high-quality instruction tuning datasets, which are expensive to create through human annotation. Existing data synthesis methods focus on general-purpose tasks and fail to capture domain-specific terminology...
Mending the Holes: Mitigating Reward Hacking in Reinforcement Learning for Multilingual Translation
arXiv:2603.13045v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated remarkable capability in machine translation on high-resource language pairs, yet their performance on low-resource translation still lags behind. Existing post-training methods rely heavily on high-quality parallel data, which are...
Neuron-Aware Data Selection In Instruction Tuning For Large Language Models
arXiv:2603.13201v1 Announce Type: new Abstract: Instruction Tuning (IT) has been proven to be an effective approach to unlock the powerful capabilities of large language models (LLMs). Recent studies indicate that excessive IT data can degrade LLM performance, while carefully selecting...
Speech-Worthy Alignment for Japanese SpeechLLMs via Direct Preference Optimization
arXiv:2603.12565v1 Announce Type: cross Abstract: SpeechLLMs typically combine ASR-trained encoders with text-based LLM backbones, leading them to inherit written-style output patterns unsuitable for text-to-speech synthesis. This mismatch is particularly pronounced in Japanese, where spoken and written registers differ substantially in...
No More DeLuLu: Physics-Inspired Kernel Networks for Geometrically-Grounded Neural Computation
arXiv:2603.12276v1 Announce Type: new Abstract: We introduce the yat-product, a kernel operator combining quadratic alignment with inverse-square proximity. We prove it is a Mercer kernel, analytic, Lipschitz on bounded domains, and self-regularizing, admitting a unique RKHS embedding. Neural Matter Networks...
A Reduction Algorithm for Markovian Contextual Linear Bandits
arXiv:2603.12530v1 Announce Type: new Abstract: Recent work shows that when contexts are drawn i.i.d., linear contextual bandits can be reduced to single-context linear bandits. This "contexts are cheap" perspective is highly advantageous, as it allows for sharper finite-time analyses and...
CA-HFP: Curvature-Aware Heterogeneous Federated Pruning with Model Reconstruction
arXiv:2603.12591v1 Announce Type: new Abstract: Federated learning on heterogeneous edge devices requires personalized compression while preserving aggregation compatibility and stable convergence. We present Curvature-Aware Heterogeneous Federated Pruning (CA-HFP), a practical framework that enables each client to perform structured, device-specific pruning guided...
Maximizing Incremental Information Entropy for Contrastive Learning
arXiv:2603.12594v1 Announce Type: new Abstract: Contrastive learning has achieved remarkable success in self-supervised representation learning, often guided by information-theoretic objectives such as mutual information maximization. Motivated by the limitations of static augmentations and rigid invariance constraints, we propose IE-CL (Incremental-Entropy...
When Drafts Evolve: Speculative Decoding Meets Online Learning
arXiv:2603.12617v1 Announce Type: new Abstract: Speculative decoding has emerged as a widely adopted paradigm for accelerating large language model inference, where a lightweight draft model rapidly generates candidate tokens that are then verified in parallel by a larger target model....