Generalist Large Language Models for Molecular Property Prediction: Distilling Knowledge from Specialist Models
arXiv:2603.12344v1 Announce Type: new Abstract: Molecular Property Prediction (MPP) is a central task in drug discovery. While Large Language Models (LLMs) show promise as generalist models for MPP, their current performance remains below the threshold for practical adoption. We propose...
Spatial PDE-aware Selective State-space with Nested Memory for Mobile Traffic Grid Forecasting
arXiv:2603.12353v1 Announce Type: new Abstract: Traffic forecasting in cellular networks is a challenging spatiotemporal prediction problem due to strong temporal dependencies, spatial heterogeneity across cells, and the need for scalability to large network deployments. Traditional cell-specific models incur prohibitive training...
SpectralGuard: Detecting Memory Collapse Attacks in State Space Models
arXiv:2603.12414v1 Announce Type: new Abstract: State Space Models (SSMs) such as Mamba achieve linear-time sequence processing through input-dependent recurrence, but this mechanism introduces a critical safety vulnerability. We show that the spectral radius $\rho(\bar{A})$ of the discretized transition operator governs...
Overcoming the Modality Gap in Context-Aided Forecasting
arXiv:2603.12451v1 Announce Type: new Abstract: Context-aided forecasting (CAF) holds promise for integrating domain knowledge and forward-looking information, enabling AI systems to surpass traditional statistical methods. However, recent empirical studies reveal a puzzling gap: multimodal models often fail to outperform their...
Bases of Steerable Kernels for Equivariant CNNs: From 2D Rotations to the Lorentz Group
arXiv:2603.12459v1 Announce Type: new Abstract: We present an alternative way of solving the steerable kernel constraint that appears in the design of steerable equivariant convolutional neural networks. We find explicit real and complex bases which are ready to use, for...
Modal Logical Neural Networks for Financial AI
arXiv:2603.12487v1 Announce Type: new Abstract: The financial industry faces a critical dichotomy in AI adoption: deep learning often delivers strong empirical performance, while symbolic logic offers interpretability and rule adherence expected in regulated settings. We use Modal Logical Neural Networks...
Adaptive Conditional Forest Sampling for Spectral Risk Optimisation under Decision-Dependent Uncertainty
arXiv:2603.12507v1 Announce Type: new Abstract: Minimising a spectral risk objective, defined as a convex combination of expected cost and Conditional Value-at-Risk (CVaR), is challenging when the uncertainty distribution is decision-dependent, making both surrogate modelling and simulation-based ranking sensitive to tail...
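The objective named in this abstract, a convex combination of expected cost and CVaR, can be illustrated with a small sample-based sketch. This is not the paper's method: the mixing weight `lam`, the tail level `alpha`, and the simple "mean of the worst tail" CVaR estimator are assumptions chosen for illustration, and both function names are hypothetical.

```python
import math


def empirical_cvar(costs, alpha):
    """Average of the worst (1 - alpha) fraction of sampled costs.

    Assumption: higher cost is worse, and the tail size is rounded up
    so that at least one sample is always included.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if not costs:
        raise ValueError("costs must be non-empty")
    ordered = sorted(costs, reverse=True)  # worst costs first
    tail_size = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    return sum(ordered[:tail_size]) / tail_size


def spectral_risk(costs, lam, alpha):
    """Convex combination (1 - lam) * mean cost + lam * CVaR_alpha.

    lam = 0 recovers the plain expected cost; lam = 1 is pure CVaR.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    mean_cost = sum(costs) / len(costs)
    return (1.0 - lam) * mean_cost + lam * empirical_cvar(costs, alpha)
```

Because the CVaR term averages only a few tail samples, a single extreme draw can move the estimate noticeably, which is consistent with the tail sensitivity the abstract mentions.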
Byzantine-Robust Optimization under $(L_0, L_1)$-Smoothness
arXiv:2603.12512v1 Announce Type: new Abstract: We consider distributed optimization under Byzantine attacks in the presence of $(L_0,L_1)$-smoothness, a generalization of standard $L$-smoothness that captures functions with state-dependent gradient Lipschitz constants. We propose Byz-NSGDM, a normalized stochastic gradient descent method with...
Learning Pore-scale Multiphase Flow from 4D Velocimetry
arXiv:2603.12516v1 Announce Type: new Abstract: Multiphase flow in porous media underpins subsurface energy and environmental technologies, including geological CO$_2$ storage and underground hydrogen storage, yet pore-scale dynamics in realistic three-dimensional materials remain difficult to characterize and predict. Here we introduce...
Curriculum Sampling: A Two-Phase Curriculum for Efficient Training of Flow Matching
arXiv:2603.12517v1 Announce Type: new Abstract: Timestep sampling $p(t)$ is a central design choice in Flow Matching models, yet common practice increasingly favors static middle-biased distributions (e.g., Logit-Normal). We show that this choice induces a speed--quality trade-off: middle-biased sampling accelerates early...
A Reduction Algorithm for Markovian Contextual Linear Bandits
arXiv:2603.12530v1 Announce Type: new Abstract: Recent work shows that when contexts are drawn i.i.d., linear contextual bandits can be reduced to single-context linear bandits. This ``contexts are cheap'' perspective is highly advantageous, as it allows for sharper finite-time analyses and...
Embedded Quantum Machine Learning in Embedded Systems: Feasibility, Hybrid Architectures, and Quantum Co-Processors
arXiv:2603.12540v1 Announce Type: new Abstract: Embedded quantum machine learning (EQML) seeks to bring quantum machine learning (QML) capabilities to resource-constrained edge platforms such as IoT nodes, wearables, drones, and cyber-physical controllers. In 2026, EQML is technically feasible only in limited...
As Language Models Scale, Low-order Linear Depth Dynamics Emerge
arXiv:2603.12541v1 Announce Type: new Abstract: Large language models are often viewed as high-dimensional nonlinear systems and treated as black boxes. Here, we show that transformer depth dynamics admit accurate low-order linear surrogates within context. Across tasks including toxicity, irony, hate...
CALF: Communication-Aware Learning Framework for Distributed Reinforcement Learning
arXiv:2603.12543v1 Announce Type: new Abstract: Distributed reinforcement learning policies face network delays, jitter, and packet loss when deployed across edge devices and cloud servers. Standard RL training assumes zero-latency interaction, causing severe performance degradation under realistic network conditions. We introduce...
Deep Distance Measurement Method for Unsupervised Multivariate Time Series Similarity Retrieval
arXiv:2603.12544v1 Announce Type: new Abstract: We propose the Deep Distance Measurement Method (DDMM) to improve retrieval accuracy in unsupervised multivariate time series similarity retrieval. DDMM enables learning of minute differences within states in the entire time series and thereby recognition...
Asymptotic and Finite-Time Guarantees for Langevin-Based Temperature Annealing in InfoNCE
arXiv:2603.12552v1 Announce Type: new Abstract: The InfoNCE loss in contrastive learning depends critically on a temperature parameter, yet its dynamics under fixed versus annealed schedules remain poorly understood. We provide a theoretical analysis by modeling embedding evolution under Langevin dynamics...
Scaling Laws and Pathologies of Single-Layer PINNs: Network Width and PDE Nonlinearity
arXiv:2603.12556v1 Announce Type: new Abstract: We establish empirical scaling laws for Single-Layer Physics-Informed Neural Networks on canonical nonlinear PDEs. We identify a dual optimization failure: (i) a baseline pathology, where the solution error fails to decrease with network width, even...
Lyapunov Stable Graph Neural Flow
arXiv:2603.12557v1 Announce Type: new Abstract: Graph Neural Networks (GNNs) are highly vulnerable to adversarial perturbations in both topology and features, making the learning of robust representations a critical challenge. In this work, we bridge GNNs with control theory to introduce...
A Spectral Revisit of the Distributional Bellman Operator under the Cram\'er Metric
arXiv:2603.12576v1 Announce Type: new Abstract: Distributional reinforcement learning (DRL) studies the evolution of full return distributions under Bellman updates rather than focusing on expected values. A classical result is that the distributional Bellman operator is contractive under the Cram\'er metric,...
CA-HFP: Curvature-Aware Heterogeneous Federated Pruning with Model Reconstruction
arXiv:2603.12591v1 Announce Type: new Abstract: Federated learning on heterogeneous edge devices requires personalized compression while preserving aggregation compatibility and stable convergence. We present Curvature-Aware Heterogeneous Federated Pruning (CA-HFP), a practical framework that enables each client to perform structured, device-specific pruning guided...
Maximizing Incremental Information Entropy for Contrastive Learning
arXiv:2603.12594v1 Announce Type: new Abstract: Contrastive learning has achieved remarkable success in self-supervised representation learning, often guided by information-theoretic objectives such as mutual information maximization. Motivated by the limitations of static augmentations and rigid invariance constraints, we propose IE-CL (Incremental-Entropy...
Swap-guided Preference Learning for Personalized Reinforcement Learning from Human Feedback
arXiv:2603.12595v1 Announce Type: new Abstract: Reinforcement Learning from Human Feedback (RLHF) is a widely used approach to align large-scale AI systems with human values. However, RLHF typically assumes a single, universal reward, which overlooks diverse preferences and limits personalization. Variational...
FastDSAC: Unlocking the Potential of Maximum Entropy RL in High-Dimensional Humanoid Control
arXiv:2603.12612v1 Announce Type: new Abstract: Scaling Maximum Entropy Reinforcement Learning (RL) to high-dimensional humanoid control remains a formidable challenge, as the ``curse of dimensionality'' induces severe exploration inefficiency and training instability in expansive action spaces. Consequently, recent high-throughput paradigms have...
When Drafts Evolve: Speculative Decoding Meets Online Learning
arXiv:2603.12617v1 Announce Type: new Abstract: Speculative decoding has emerged as a widely adopted paradigm for accelerating large language model inference, where a lightweight draft model rapidly generates candidate tokens that are then verified in parallel by a larger target model....
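The draft-then-verify loop described in this abstract can be sketched with toy models. This is a minimal greedy variant under stated assumptions, not the paper's method and not the sampling-based acceptance rule used in practice: `draft_next` and `target_next` are hypothetical callables that map a token list to the next token, and the verification loop stands in for what a real system would do in one batched forward pass of the target model.

```python
def speculative_step(prefix, draft_next, target_next, k):
    """Return the tokens accepted in one draft-and-verify round."""
    # 1) The cheap draft model proposes k tokens autoregressively.
    proposed = []
    context = list(prefix)
    for _ in range(k):
        token = draft_next(context)
        proposed.append(token)
        context.append(token)

    # 2) The target model checks each proposed position in order.
    accepted = []
    context = list(prefix)
    for token in proposed:
        target_token = target_next(context)
        if target_token != token:
            # First mismatch: keep the target's token and stop.
            accepted.append(target_token)
            return accepted
        accepted.append(token)
        context.append(token)

    # 3) Every draft token matched, so the target adds one more token.
    accepted.append(target_next(context))
    return accepted


def generate(prompt, draft_next, target_next, k, max_new_tokens):
    """Repeat speculative rounds until max_new_tokens are produced."""
    output = list(prompt)
    while len(output) - len(prompt) < max_new_tokens:
        output.extend(speculative_step(output, draft_next, target_next, k))
    return output[: len(prompt) + max_new_tokens]
```

In this greedy variant the output always matches what the target model alone would produce; the draft model only changes how many target calls can be batched per round, so a better-matched draft yields longer accepted runs.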
Human-AI Collaborative Autonomous Experimentation With Proxy Modeling for Comparative Observation
arXiv:2603.12618v1 Announce Type: new Abstract: Optimization for different tasks like material characterization, synthesis, and functional properties for desired applications over multi-dimensional control parameters needs a rapid strategic search through active learning such as Bayesian optimization (BO). However, such high-dimensional experimental...
Spend Less, Reason Better: Budget-Aware Value Tree Search for LLM Agents
arXiv:2603.12634v1 Announce Type: new Abstract: Test-time scaling has become a dominant paradigm for improving LLM agent reliability, yet current approaches treat compute as an abundant resource, allowing agents to exhaust token and tool budgets on redundant steps or dead-end trajectories....
Adaptive Diffusion Posterior Sampling for Data and Model Fusion of Complex Nonlinear Dynamical Systems
arXiv:2603.12635v1 Announce Type: new Abstract: High-fidelity numerical simulations of chaotic, high dimensional nonlinear dynamical systems are computationally expensive, necessitating the development of efficient surrogate models. Most surrogate models for such systems are deterministic, for example when neural operators are involved....
Sobolev--Ricci Curvature
arXiv:2603.12652v1 Announce Type: new Abstract: Ricci curvature is a fundamental concept in differential geometry for encoding local geometric structure, and its graph-based analogues have recently gained prominence as practical tools for reweighting, pruning, and reshaping network geometry. We propose Sobolev-Ricci...
RetroReasoner: A Reasoning LLM for Strategic Retrosynthesis Prediction
arXiv:2603.12666v1 Announce Type: new Abstract: Retrosynthesis prediction is a core task in organic synthesis that aims to predict reactants for a given product molecule. Traditionally, chemists select a plausible bond disconnection and derive corresponding reactants, which is time-consuming and requires...
Disentangled Latent Dynamics Manifold Fusion for Solving Parameterized PDEs
arXiv:2603.12676v1 Announce Type: new Abstract: Generalizing neural surrogate models across different PDE parameters remains difficult because changes in PDE coefficients often make learning harder and optimization less stable. The problem becomes even more severe when the model must also predict...