NeuroSymActive: Differentiable Neural-Symbolic Reasoning with Active Exploration for Knowledge Graph Question Answering
arXiv:2602.15353v1 Announce Type: new Abstract: Large pretrained language models and neural reasoning systems have advanced many natural language tasks, yet they remain challenged by knowledge-intensive queries that require precise, structured multi-hop inference. Knowledge graphs provide a compact symbolic substrate for...
ExpertWeaver: Unlocking the Inherent MoE in Dense LLMs with GLU Activation Patterns
arXiv:2602.15521v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) effectively scales model capacity while preserving computational efficiency through sparse expert activation. However, training high-quality MoEs from scratch is prohibitively expensive. A promising alternative is to convert pretrained dense models into sparse MoEs....
STAPO: Stabilizing Reinforcement Learning for LLMs by Silencing Rare Spurious Tokens
arXiv:2602.15620v1 Announce Type: new Abstract: Reinforcement Learning (RL) has significantly improved large language model reasoning, but existing RL fine-tuning methods rely heavily on heuristic techniques such as entropy regularization and reweighting to maintain stability. In practice, they often experience late-stage...
Language Model Representations for Efficient Few-Shot Tabular Classification
arXiv:2602.15844v1 Announce Type: cross Abstract: The Web is a rich source of structured data in the form of tables, from product catalogs and knowledge bases to scientific datasets. However, the heterogeneity of the structure and semantics of these tables makes...
DeepContext: Stateful Real-Time Detection of Multi-Turn Adversarial Intent Drift in LLMs
arXiv:2602.16935v1 Announce Type: new Abstract: While Large Language Model (LLM) capabilities have scaled, safety guardrails remain largely stateless, treating multi-turn dialogues as a series of disconnected events. This lack of temporal awareness facilitates a "Safety Gap" where adversarial tactics, like...
Mind the GAP: Text Safety Does Not Transfer to Tool-Call Safety in LLM Agents
arXiv:2602.16943v1 Announce Type: new Abstract: Large language models deployed as agents increasingly interact with external systems through tool calls--actions with real-world consequences that text outputs alone do not carry. Safety evaluations, however, overwhelmingly measure text-level refusal behavior, leaving a critical...
IntentCUA: Learning Intent-level Representations for Skill Abstraction and Multi-Agent Planning in Computer-Use Agents
arXiv:2602.17049v1 Announce Type: new Abstract: Computer-use agents operate over long horizons under noisy perception, multi-window contexts, and evolving environment states. Existing approaches, from RL-based planners to trajectory retrieval, often drift from user intent and repeatedly solve routine subproblems, leading to error...
Representation Collapse in Machine Translation Through the Lens of Angular Dispersion
arXiv:2602.17287v1 Announce Type: new Abstract: Modern neural translation models based on the Transformer architecture are known for their high performance, particularly when trained on high-resource datasets. A standard next-token prediction training strategy, while widely adopted in practice, may lead to...
The Role of the Availability Heuristic in Multiple-Choice Answering Behaviour
arXiv:2602.17377v1 Announce Type: new Abstract: When students are unsure of the correct answer to a multiple-choice question (MCQ), guessing is common practice. The availability heuristic, proposed by A. Tversky and D. Kahneman in 1973, suggests that the ease with which...
Sink-Aware Pruning for Diffusion Language Models
arXiv:2602.17664v1 Announce Type: new Abstract: Diffusion Language Models (DLMs) incur high inference cost due to iterative denoising, motivating efficient pruning. Existing pruning heuristics, largely inherited from autoregressive (AR) LLMs, typically preserve attention sink tokens because AR sinks serve as stable...
TopoFlow: Physics-guided Neural Networks for high-resolution air quality prediction
arXiv:2602.16821v1 Announce Type: new Abstract: We propose TopoFlow (Topography-aware pollutant Flow learning), a physics-guided neural network for efficient, high-resolution air quality prediction. To explicitly embed physical processes into the learning framework, we identify two critical factors governing pollutant dynamics: topography...
Beyond Message Passing: A Symbolic Alternative for Expressive and Interpretable Graph Learning
arXiv:2602.16947v1 Announce Type: new Abstract: Graph Neural Networks (GNNs) have become essential in high-stakes domains such as drug discovery, yet their black-box nature remains a significant barrier to trustworthiness. While self-explainable GNNs attempt to bridge this gap, they often rely...
Adam Improves Muon: Adaptive Moment Estimation with Orthogonalized Momentum
arXiv:2602.17080v1 Announce Type: new Abstract: Efficient stochastic optimization typically integrates an update direction that performs well in the deterministic regime with a mechanism adapting to stochastic perturbations. While Adam uses adaptive moment estimates to promote stability, Muon utilizes the weight...
Distributed physics-informed neural networks via domain decomposition for fast flow reconstruction
arXiv:2602.15883v1 Announce Type: new Abstract: Physics-Informed Neural Networks (PINNs) offer a powerful paradigm for flow reconstruction, seamlessly integrating sparse velocity measurements with the governing Navier-Stokes equations to recover complete velocity and latent pressure fields. However, scaling such models to large...
Anatomy of Capability Emergence: Scale-Invariant Representation Collapse and Top-Down Reorganization in Neural Networks
arXiv:2602.15997v1 Announce Type: new Abstract: Capability emergence during neural network training remains mechanistically opaque. We track five geometric measures across five model scales (405K-85M parameters), 120+ emergence events in eight algorithmic tasks, and three Pythia language models (160M-2.8B). We find:...
AI-CARE: Carbon-Aware Reporting Evaluation Metric for AI Models
arXiv:2602.16042v1 Announce Type: new Abstract: As machine learning (ML) continues its rapid expansion, the environmental cost of model training and inference has become a critical societal concern. Existing benchmarks overwhelmingly focus on standard performance metrics such as accuracy, BLEU, or...
Muon with Spectral Guidance: Efficient Optimization for Scientific Machine Learning
arXiv:2602.16167v1 Announce Type: new Abstract: Physics-informed neural networks and neural operators often suffer from severe optimization difficulties caused by ill-conditioned gradients, multi-scale spectral behavior, and stiffness induced by physical constraints. Recently, the Muon optimizer has shown promise by performing orthogonalized...
ModalImmune: Immunity Driven Unlearning via Self Destructive Training
arXiv:2602.16197v1 Announce Type: new Abstract: Multimodal systems are vulnerable to partial or complete loss of input channels at deployment, which undermines reliability in real-world settings. This paper presents ModalImmune, a training framework that enforces modality immunity by intentionally and controllably...
Learning Data-Efficient and Generalizable Neural Operators via Fundamental Physics Knowledge
arXiv:2602.15184v1 Announce Type: new Abstract: Recent advances in scientific machine learning (SciML) have enabled neural operators (NOs) to serve as powerful surrogates for modeling the dynamic evolution of physical systems governed by partial differential equations (PDEs). While existing approaches focus...
Complex-Valued Unitary Representations as Classification Heads for Improved Uncertainty Quantification in Deep Neural Networks
arXiv:2602.15283v1 Announce Type: new Abstract: Modern deep neural networks achieve high predictive accuracy but remain poorly calibrated: their confidence scores do not reliably reflect the true probability of correctness. We propose a quantum-inspired classification head architecture that projects backbone features...
ExLipBaB: Exact Lipschitz Constant Computation for Piecewise Linear Neural Networks
arXiv:2602.15499v1 Announce Type: new Abstract: It has been shown that a neural network's Lipschitz constant can be leveraged to derive robustness guarantees, to improve generalizability via regularization or even to construct invertible networks. Therefore, a number of methods varying in...
On the Geometric Coherence of Global Aggregation in Federated GNN
arXiv:2602.15510v1 Announce Type: new Abstract: Federated Learning (FL) enables distributed training across multiple clients without centralized data sharing, while Graph Neural Networks (GNNs) model relational data through message passing. In federated GNN settings, client graphs often exhibit heterogeneous structural and...
Accelerated Predictive Coding Networks via Direct Kolen-Pollack Feedback Alignment
arXiv:2602.15571v1 Announce Type: new Abstract: Predictive coding (PC) is a biologically inspired algorithm for training neural networks that relies only on local updates, allowing parallel learning across layers. However, practical implementations face two key limitations: error signals must still propagate...
Out-of-Support Generalisation via Weight Space Sequence Modelling
arXiv:2602.13550v1 Announce Type: new Abstract: As breakthroughs in deep learning transform key industries, models are increasingly required to extrapolate to data points outside the range of the training set, a challenge we term out-of-support (OoS) generalisation. However, neural networks...
Sufficient Conditions for Stability of Minimum-Norm Interpolating Deep ReLU Networks
arXiv:2602.13910v1 Announce Type: new Abstract: Algorithmic stability is a classical framework for analyzing the generalization error of learning algorithms. It predicts that an algorithm has small generalization error if it is insensitive to small perturbations in the training set such...
Review of Hanna Schebesta and Kai Purnhagen, EU Food Law, Oxford, Oxford University Press, 2024, 432 pp, hb, £110.00
Anyone interested in food system reform should acknowledge the importance of EU law and learn to recognise its strengths and weaknesses, so as to fully harness its transformative potential. This is no easy task, for EU food law is a...