Hybrid Autoencoder-Isolation Forest approach for time series anomaly detection in C70XP cyclotron operation data at ARRONAX
arXiv:2603.20335v1 Announce Type: new Abstract: The Public Interest Group ARRONAX's C70XP cyclotron, used for radioisotope production for medical and research applications, relies on complex and costly systems that are prone to failures, leading to operational disruptions. In this context, this...
KV Cache Optimization Strategies for Scalable and Efficient LLM Inference
arXiv:2603.20397v1 Announce Type: new Abstract: The key-value (KV) cache is a foundational optimization in Transformer-based large language models (LLMs), eliminating redundant recomputation of past token representations during autoregressive generation. However, its memory footprint scales linearly with context length, imposing critical...
SDE-Driven Spatio-Temporal Hypergraph Neural Networks for Irregular Longitudinal fMRI Connectome Modeling in Alzheimer's Disease
arXiv:2603.20452v1 Announce Type: new Abstract: Longitudinal neuroimaging is essential for modeling disease progression in Alzheimer's disease (AD), yet irregular sampling and missing visits pose substantial challenges for learning reliable temporal representations. To address this challenge, we propose SDE-HGNN, a stochastic...
AE-LLM: Adaptive Efficiency Optimization for Large Language Models
arXiv:2603.20492v1 Announce Type: new Abstract: Large Language Models (LLMs) have achieved remarkable success across diverse applications, yet their deployment remains challenging due to substantial computational costs, memory requirements, and energy consumption. Recent empirical studies have demonstrated that no single efficiency...
RMNP: Row-Momentum Normalized Preconditioning for Scalable Matrix-Based Optimization
arXiv:2603.20527v1 Announce Type: new Abstract: Preconditioned adaptive methods have gained significant attention for training deep neural networks, as they capture rich curvature information of the loss landscape. The central challenge in this field lies in balancing preconditioning effectiveness with...
MKA: Memory-Keyed Attention for Efficient Long-Context Reasoning
arXiv:2603.20586v1 Announce Type: new Abstract: As long-context language modeling becomes increasingly important, the cost of maintaining and attending to large Key/Value (KV) caches grows rapidly, becoming a major bottleneck in both training and inference. While prior works such as Multi-Query...
Neural collapse in the orthoplex regime
arXiv:2603.20587v1 Announce Type: new Abstract: When training a neural network for classification, the feature vectors of the training set are known to collapse to the vertices of a regular simplex, provided the dimension $d$ of the feature space and the...
Beyond Token Eviction: Mixed-Dimension Budget Allocation for Efficient KV Cache Compression
arXiv:2603.20616v1 Announce Type: new Abstract: Key-value (KV) caching is widely used to accelerate transformer inference, but its memory cost grows linearly with input length, limiting long-context deployment. Existing token eviction methods reduce memory by discarding less important tokens, which can...
Diffusion Model for Manifold Data: Score Decomposition, Curvature, and Statistical Complexity
arXiv:2603.20645v1 Announce Type: new Abstract: Diffusion models have become a leading framework in generative modeling, yet their theoretical understanding -- especially for high-dimensional data concentrated on low-dimensional structures -- remains incomplete. This paper investigates how diffusion models learn such structured...
Breaking the $O(\sqrt{T})$ Cumulative Constraint Violation Barrier while Achieving $O(\sqrt{T})$ Static Regret in Constrained Online Convex Optimization
arXiv:2603.20671v1 Announce Type: new Abstract: The problem of constrained online convex optimization is considered, where at each round, once a learner commits to an action $x_t \in \mathcal{X} \subset \mathbb{R}^d$, a convex loss function $f_t$ and a convex constraint function...
Centrality-Based Pruning for Efficient Echo State Networks
arXiv:2603.20684v1 Announce Type: new Abstract: Echo State Networks (ESNs) are a reservoir computing framework widely used for nonlinear time-series prediction. However, despite their effectiveness, the randomly initialized reservoir often contains redundant nodes, leading to unnecessary computational overhead and reduced efficiency....
Neuronal Self-Adaptation Enhances Capacity and Robustness of Representation in Spiking Neural Networks
arXiv:2603.20687v1 Announce Type: new Abstract: Spiking Neural Networks (SNNs) are promising for energy-efficient, real-time edge computing, yet their performance is often constrained by the limited adaptability of conventional leaky integrate-and-fire (LIF) neurons. Existing LIF models struggle with restricted information capacity...
Court appears ready to overturn state law allowing for late-arriving mail-in ballots
The Supreme Court on Monday appeared ready to overturn a Mississippi law that allows mail-in ballots to be counted as long as they are postmarked by, and then received within […]
SCOTUStoday for Monday, March 23
Good morning, and welcome to the March argument session, which includes the argument on birthright citizenship on Wednesday, April 1. This Thursday, March 26, SCOTUSblog is teaming up with Briefly […] The post SCOTUStoday for Monday, March 23 appeared first on SCOTUSblog.
Littlebird raises $11M for its AI-assisted ‘recall’ tool that reads your computer screen
Littlebird is building an AI that reads your screen in real time to capture context, answer questions, and automate tasks, without relying on screenshots.
PA2D-MORL: Pareto Ascent Directional Decomposition based Multi-Objective Reinforcement Learning
arXiv:2603.19579v1 Announce Type: new Abstract: Multi-objective reinforcement learning (MORL) provides an effective solution for decision-making problems involving conflicting objectives. However, achieving high-quality approximations to the Pareto policy set remains challenging, especially in complex tasks with continuous or high-dimensional state-action space....
HATL: Hierarchical Adaptive-Transfer Learning Framework for Sign Language Machine Translation
arXiv:2603.19260v1 Announce Type: cross Abstract: Sign Language Machine Translation (SLMT) aims to bridge communication between Deaf and hearing individuals. However, its progress is constrained by scarce datasets, limited signer diversity, and large domain gaps between sign motion patterns and pretrained...
Experience is the Best Teacher: Motivating Effective Exploration in Reinforcement Learning for LLMs
arXiv:2603.20046v1 Announce Type: new Abstract: Reinforcement Learning (RL) with rubric-based rewards has recently shown remarkable progress in enhancing general reasoning capabilities of Large Language Models (LLMs), yet still suffers from ineffective exploration confined to current policy distribution. In fact, RL...
Learning Dynamic Belief Graphs for Theory-of-mind Reasoning
arXiv:2603.20170v1 Announce Type: new Abstract: Theory of Mind (ToM) reasoning with Large Language Models (LLMs) requires inferring how people's implicit, evolving beliefs shape what they seek and how they act under uncertainty -- especially in high-stakes settings such as disaster...
When both Grounding and not Grounding are Bad -- A Partially Grounded Encoding of Planning into SAT (Extended Version)
arXiv:2603.19429v1 Announce Type: new Abstract: Classical planning problems are typically defined using lifted first-order representations, which offer compactness and generality. While most planners ground these representations to simplify reasoning, this can cause an exponential blowup in size. Recent approaches instead...
ItinBench: Benchmarking Planning Across Multiple Cognitive Dimensions with Large Language Models
arXiv:2603.19515v1 Announce Type: new Abstract: Large language models (LLMs) with advanced cognitive capabilities are emerging as agents for various reasoning and planning tasks. Traditional evaluations often focus on specific reasoning or planning questions within controlled environments. Recent studies have explored...
A Subgoal-driven Framework for Improving Long-Horizon LLM Agents
arXiv:2603.19685v1 Announce Type: new Abstract: Large language model (LLM)-based agents have emerged as powerful autonomous controllers for digital environments, including mobile interfaces, operating systems, and web browsers. Web navigation, for example, requires handling dynamic content and long sequences of actions,...
Generative Active Testing: Efficient LLM Evaluation via Proxy Task Adaptation
arXiv:2603.19264v1 Announce Type: cross Abstract: With the widespread adoption of pre-trained Large Language Models (LLM), there exists a high demand for task-specific test sets to benchmark their performance in domains such as healthcare and biomedicine. However, the cost of labeling...
When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the Suppression of Genesis in Language Models
arXiv:2603.19265v1 Announce Type: cross Abstract: This paper investigates the ontological consequences of fine-tuning Large Language Models (LLMs) on "impossible objects" -- entities defined by mutually exclusive predicates (e.g., "Artifact Alpha is a Square" and "Artifact Alpha is a Circle"). Drawing...
Probing to Refine: Reinforcement Distillation of LLMs via Explanatory Inversion
arXiv:2603.19266v1 Announce Type: cross Abstract: Distilling robust reasoning capabilities from large language models (LLMs) into smaller, computationally efficient student models remains an unresolved challenge. Despite recent advances, distilled models frequently suffer from superficial pattern memorization and subpar generalization. To overcome...
Transformers are Stateless Differentiable Neural Computers
arXiv:2603.19272v1 Announce Type: cross Abstract: Differentiable Neural Computers (DNCs) were introduced as recurrent architectures equipped with an addressable external memory supporting differentiable read and write operations. Transformers, in contrast, are nominally feedforward architectures based on multi-head self-attention. In this work...
LSR: Linguistic Safety Robustness Benchmark for Low-Resource West African Languages
arXiv:2603.19273v1 Announce Type: cross Abstract: Safety alignment in large language models relies predominantly on English-language training data. When harmful intent is expressed in low-resource languages, refusal mechanisms that hold in English frequently fail to activate. We introduce LSR (Linguistic Safety...
CURE: A Multimodal Benchmark for Clinical Understanding and Retrieval Evaluation
arXiv:2603.19274v1 Announce Type: cross Abstract: Multimodal large language models (MLLMs) demonstrate considerable potential in clinical diagnostics, a domain that inherently requires synthesizing complex visual and textual data alongside consulting authoritative medical literature. However, existing benchmarks primarily evaluate MLLMs in end-to-end...
Improving Automatic Summarization of Radiology Reports through Mid-Training of Large Language Models
arXiv:2603.19275v1 Announce Type: cross Abstract: Automatic summarization of radiology reports is an essential application to reduce the burden on physicians. Previous studies have widely used the "pre-training, fine-tuning" strategy to adapt large language models (LLMs) for summarization. This study proposed...
HypeLoRA: Hyper-Network-Generated LoRA Adapters for Calibrated Language Model Fine-Tuning
arXiv:2603.19278v1 Announce Type: cross Abstract: Modern Transformer-based models frequently suffer from miscalibration, producing overconfident predictions that do not reflect true empirical frequencies. This work investigates the calibration dynamics of LoRA: Low-Rank Adaptation and a novel hyper-network-based adaptation framework as parameter-efficient...