Discovering What You Can Control: Interventional Boundary Discovery for Reinforcement Learning
arXiv:2603.18257v1 Announce Type: new Abstract: Selecting relevant state dimensions in the presence of confounded distractors is a causal identification problem: observational statistics alone cannot reliably distinguish dimensions that correlate with actions from those that actions cause. We formalize this as...
Enactor: From Traffic Simulators to Surrogate World Models
arXiv:2603.18266v1 Announce Type: new Abstract: Traffic microsimulators are widely used to evaluate road network performance under various ``what-if" conditions. However, the behavior models controlling the actions of the actors are overly simplistic and fails to capture realistic actor-actor interactions. Deep...
Detection Is Cheap, Routing Is Learned: Why Refusal-Based Alignment Evaluation Fails
arXiv:2603.18280v1 Announce Type: new Abstract: Current alignment evaluation mostly measures whether models encode dangerous concepts and whether they refuse harmful requests. Both miss the layer where alignment often operates: routing from concept detection to behavioral policy. We study political censorship...
Path-Constrained Mixture-of-Experts
arXiv:2603.18297v1 Announce Type: new Abstract: Sparse Mixture-of-Experts (MoE) architectures enable efficient scaling by activating only a subset of parameters for each input. However, conventional MoE routing selects each layer's experts independently, creating N^L possible expert paths -- for N experts...
ALIGN: Adversarial Learning for Generalizable Speech Neuroprosthesis
arXiv:2603.18299v1 Announce Type: new Abstract: Intracortical brain-computer interfaces (BCIs) can decode speech from neural activity with high accuracy when trained on data pooled across recording sessions. In realistic deployment, however, models must generalize to new sessions without labeled data, and...
Approximate Subgraph Matching with Neural Graph Representations and Reinforcement Learning
arXiv:2603.18314v1 Announce Type: new Abstract: Approximate subgraph matching (ASM) is a task that determines the approximate presence of a given query graph in a large target graph. Being an NP-hard problem, ASM is critical in graph analysis with a myriad...
Learning to Reason with Curriculum I: Provable Benefits of Autocurriculum
arXiv:2603.18325v1 Announce Type: new Abstract: Chain-of-thought reasoning, where language models expend additional computation by producing thinking tokens prior to final responses, has driven significant advances in model capabilities. However, training these reasoning models is extremely costly in terms of both...
Escaping Offline Pessimism: Vector-Field Reward Shaping for Safe Frontier Exploration
arXiv:2603.18326v1 Announce Type: new Abstract: While offline reinforcement learning provides reliable policies for real-world deployment, its inherent pessimism severely restricts an agent's ability to explore and collect novel data online. Drawing inspiration from safe reinforcement learning, exploring near the boundary...
A Family of Adaptive Activation Functions for Mitigating Failure Modes in Physics-Informed Neural Networks
arXiv:2603.18328v1 Announce Type: new Abstract: Physics-Informed Neural Networks(PINNs) are a powerful and flexible learning framework that has gained significant attention in recent years. It has demonstrated strong performance across a wide range of scientific and engineering problems. In parallel, wavelets...
Mathematical Foundations of Deep Learning
arXiv:2603.18387v1 Announce Type: new Abstract: This draft book offers a comprehensive and rigorous treatment of the mathematical principles underlying modern deep learning. The book spans core theoretical topics, from the approximation capabilities of deep neural networks, the theory and algorithms...
RE-SAC: Disentangling aleatoric and epistemic risks in bus fleet control: A stable and robust ensemble DRL approach
arXiv:2603.18396v1 Announce Type: new Abstract: Bus holding control is challenging due to stochastic traffic and passenger demand. While deep reinforcement learning (DRL) shows promise, standard actor-critic algorithms suffer from Q-value instability in volatile environments. A key source of this instability...
FlowMS: Flow Matching for De Novo Structure Elucidation from Mass Spectra
arXiv:2603.18397v1 Announce Type: new Abstract: Mass spectrometry (MS) stands as a cornerstone analytical technique for molecular identification, yet de novo structure elucidation from spectra remains challenging due to the combinatorial complexity of chemical space and the inherent ambiguity of spectral...
Self-Tuning Sparse Attention: Multi-Fidelity Hyperparameter Optimization for Transformer Acceleration
arXiv:2603.18417v1 Announce Type: new Abstract: Sparse attention mechanisms promise to break the quadratic bottleneck of long-context transformers, yet production adoption remains limited by a critical usability gap: optimal hyperparameters vary substantially across layers and models, and current methods (e.g., SpargeAttn)...
Towards Noise-Resilient Quantum Multi-Armed and Stochastic Linear Bandits
arXiv:2603.18431v1 Announce Type: new Abstract: Quantum multi-armed bandits (MAB) and stochastic linear bandits (SLB) have recently attracted significant attention, as their quantum counterparts can achieve quadratic speedups over classical MAB and SLB. However, most existing quantum MAB algorithms assume ideal...
MLOW: Interpretable Low-Rank Frequency Magnitude Decomposition of Multiple Effects for Time Series Forecasting
arXiv:2603.18432v1 Announce Type: new Abstract: Separating multiple effects in time series is fundamental yet challenging for time-series forecasting (TSF). However, existing TSF models cannot effectively learn interpretable multi-effect decomposition by their smoothing-based temporal techniques. Here, a new interpretable frequency-based decomposition...
Discounted Beta--Bernoulli Reward Estimation for Sample-Efficient Reinforcement Learning with Verifiable Rewards
arXiv:2603.18444v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR) has emerged as an effective post-training paradigm for improving the reasoning capabilities of large language models. However, existing group-based RLVR methods often suffer from severe sample inefficiency. This inefficiency...
AcceRL: A Distributed Asynchronous Reinforcement Learning and World Model Framework for Vision-Language-Action Models
arXiv:2603.18464v1 Announce Type: new Abstract: Reinforcement learning (RL) for large-scale Vision-Language-Action (VLA) models faces significant challenges in computational efficiency and data acquisition. We propose AcceRL, a fully asynchronous and decoupled RL framework designed to eliminate synchronization barriers by physically isolating...
AIMER: Calibration-Free Task-Agnostic MoE Pruning
arXiv:2603.18492v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) language models increase parameter capacity without proportional per-token compute, but the deployment still requires storing all experts, making expert pruning important for reducing memory and serving overhead. Existing task-agnostic expert pruning methods are...
Balancing the Reasoning Load: Difficulty-Differentiated Policy Optimization with Length Redistribution for Efficient and Robust Reinforcement Learning
arXiv:2603.18533v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) have shown exceptional reasoning capabilities, but they also suffer from the issue of overthinking, often generating excessively long and redundant answers. For problems that exceed the model's capabilities, LRMs tend to...
Data-efficient pre-training by scaling synthetic megadocs
arXiv:2603.18534v1 Announce Type: new Abstract: Synthetic data augmentation has emerged as a promising solution when pre-training is constrained by data rather than compute. We study how to design synthetic data algorithms that achieve better loss scaling: not only lowering loss...
Beyond Passive Aggregation: Active Auditing and Topology-Aware Defense in Decentralized Federated Learning
arXiv:2603.18538v1 Announce Type: new Abstract: Decentralized Federated Learning (DFL) remains highly vulnerable to adaptive backdoor attacks designed to bypass traditional passive defense metrics. To address this limitation, we shift the defensive paradigm toward a novel active, interventional auditing framework. First,...
Birthright citizenship: why the text, history, and structure of a landmark 1952 statute doom Trump’s executive order
Brothers in Law is a recurring series by brothers Akhil and Vikram Amar, with special emphasis on measuring what the Supreme Court says against what the Constitution itself says. For more content from […]The postBirthright citizenship: why the text, history,...
Justices to consider rules pardoning omissions by bankrupt debtors
Next week’s argument in Keathley v. Buddy Ayers Construction involves a technical question about bankruptcy procedure – the standards for overlooking the failure of a debtor in bankruptcy to mention […]The postJustices to consider rules pardoning omissions by bankrupt debtorsappeared...
Justices to consider the rights of asylum seekers at the U.S.-Mexico border
The Supreme Court will hear oral arguments next week in a challenge to the government’s policy of systematically turning back asylum seekers before they can reach the U.S. border with […]The postJustices to consider the rights of asylum seekers at...
Uninjured class members, hindsight harmlessness, presidential cronies, and the mistaken use of deadly force
The Relist Watch column examines cert petitions that the Supreme Court has “relisted” for its upcoming conference. A short explanation of relists is available here. There are 261 petitions and applications […]The postUninjured class members, hindsight harmlessness, presidential cronies, and...
SCOTUStoday for Thursday, March 19
Chief Justice Earl Warren was born on this day in 1891. Before Warren joined the court, he served as governor of California for a decade.The postSCOTUStoday for Thursday, March 19appeared first onSCOTUSblog.
FBI started buying Americans' location data again, Kash Patel confirms
Tom Cotton supports FBI data purchasing, compares it to searching people's trash.
DoorDash launches a new ‘Tasks’ app that pays couriers to submit videos to train AI
Delivery couriers will be able to earn money by completing activities like filming everyday tasks or recording themselves speaking in another language.
Amazon brings Alexa+ to the UK
The company is currently letting users in the U.K. try out Alexa+ for free via an early access program.
Volume 2026, No. 1 – Wisconsin Law Review – UW–Madison
Contract Law and Civil Justice in Local Courts by Cathy Hwang & Justin Weinstein-Tull; Preempting Drug Price Reform by Shweta Kumar; Lessons Learned? COVID’s Continued Impact on Remote Work Disability Accommodations by D’Andra Millsap Shu; Unbundling AI Openness by Parth...