International Law

LOW Academic International

SLEA-RL: Step-Level Experience Augmented Reinforcement Learning for Multi-Turn Agentic Training

arXiv:2603.18079v1 Announce Type: new Abstract: Large Language Model (LLM) agents have shown strong results on multi-turn tool-use tasks, yet they operate in isolation during training, failing to leverage experiences accumulated across episodes. Existing experience-augmented methods address this by organizing trajectories...

1 min 1 month ago

LOW Academic European Union

Probabilistic Federated Learning on Uncertain and Heterogeneous Data with Model Personalization

arXiv:2603.18083v1 Announce Type: new Abstract: Conventional federated learning (FL) frameworks often suffer from training degradation due to data uncertainty and heterogeneity across local clients. Probabilistic approaches such as Bayesian neural networks (BNNs) can mitigate this issue by explicitly modeling uncertainty,...

1 min 1 month ago

LOW Academic International

Enhancing Reinforcement Learning Fine-Tuning with an Online Refiner

arXiv:2603.18088v1 Announce Type: new Abstract: Constraints are essential for stabilizing reinforcement learning fine-tuning (RFT) and preventing degenerate outputs, yet they inherently conflict with the optimization objective because stronger constraints limit the ability of a fine-tuned model to discover better solutions....

1 min 1 month ago

LOW Academic European Union

ARTEMIS: A Neuro Symbolic Framework for Economically Constrained Market Dynamics

arXiv:2603.18107v1 Announce Type: new Abstract: Deep learning models in quantitative finance often operate as black boxes, lacking interpretability and failing to incorporate fundamental economic principles such as no-arbitrage constraints. This paper introduces ARTEMIS (Arbitrage-free Representation Through Economic Models and Interpretable...

1 min 1 month ago

LOW Academic European Union

BoundAD: Boundary-Aware Negative Generation for Time Series Anomaly Detection

arXiv:2603.18111v1 Announce Type: new Abstract: Contrastive learning methods for time series anomaly detection (TSAD) heavily depend on the quality of negative sample construction. However, existing strategies based on random perturbations or pseudo-anomaly injection often struggle to simultaneously preserve temporal semantic...

1 min 1 month ago

LOW Academic United States

VC-Soup: Value-Consistency Guided Multi-Value Alignment for Large Language Models

arXiv:2603.18113v1 Announce Type: new Abstract: As large language models (LLMs) increasingly shape content generation, interaction, and decision-making across the Web, aligning them with human values has become a central objective in trustworthy AI. This challenge becomes even more pronounced when...

1 min 1 month ago

LOW Academic United States

Conflict-Free Policy Languages for Probabilistic ML Predicates: A Framework and Case Study with the Semantic Router DSL

arXiv:2603.18174v1 Announce Type: new Abstract: Conflict detection in policy languages is a solved problem -- as long as every rule condition is a crisp Boolean predicate. BDDs, SMT solvers, and NetKAT all exploit that assumption. But a growing class of...

1 min 1 month ago

LOW Academic European Union

Gradient-Informed Temporal Sampling Improves Rollout Accuracy in PDE Surrogate Training

arXiv:2603.18237v1 Announce Type: new Abstract: Researchers train neural simulators on uniformly sampled numerical simulation data. But under the same budget, does systematically sampled data provide the most effective information? A fundamental yet unformalized problem is how to sample training data...

1 min 1 month ago

LOW Academic International

AGRI-Fidelity: Evaluating the Reliability of Listenable Explanations for Poultry Disease Detection

arXiv:2603.18247v1 Announce Type: new Abstract: Existing XAI metrics measure faithfulness for a single model, ignoring model multiplicity where near-optimal classifiers rely on different or spurious acoustic cues. In noisy farm environments, stationary artifacts such as ventilation noise can produce explanations...

1 min 1 month ago

LOW Academic United States

MolRGen: A Training and Evaluation Setting for De Novo Molecular Generation with Reasoning Models

arXiv:2603.18256v1 Announce Type: new Abstract: Recent advances in reasoning-based large language models (LLMs) have demonstrated substantial improvements in complex problem-solving tasks. Motivated by these advances, several works have explored the application of reasoning LLMs to drug discovery and molecular design....

1 min 1 month ago

LOW Academic International

Discovering What You Can Control: Interventional Boundary Discovery for Reinforcement Learning

arXiv:2603.18257v1 Announce Type: new Abstract: Selecting relevant state dimensions in the presence of confounded distractors is a causal identification problem: observational statistics alone cannot reliably distinguish dimensions that correlate with actions from those that actions cause. We formalize this as...

1 min 1 month ago

LOW Academic United States

Enactor: From Traffic Simulators to Surrogate World Models

arXiv:2603.18266v1 Announce Type: new Abstract: Traffic microsimulators are widely used to evaluate road network performance under various "what-if" conditions. However, the behavior models controlling the actions of the actors are overly simplistic and fail to capture realistic actor-actor interactions. Deep...

1 min 1 month ago

LOW Academic United States

Detection Is Cheap, Routing Is Learned: Why Refusal-Based Alignment Evaluation Fails

arXiv:2603.18280v1 Announce Type: new Abstract: Current alignment evaluation mostly measures whether models encode dangerous concepts and whether they refuse harmful requests. Both miss the layer where alignment often operates: routing from concept detection to behavioral policy. We study political censorship...

1 min 1 month ago

LOW Academic United States

Path-Constrained Mixture-of-Experts

arXiv:2603.18297v1 Announce Type: new Abstract: Sparse Mixture-of-Experts (MoE) architectures enable efficient scaling by activating only a subset of parameters for each input. However, conventional MoE routing selects each layer's experts independently, creating N^L possible expert paths -- for N experts...

1 min 1 month ago

LOW Academic European Union

ALIGN: Adversarial Learning for Generalizable Speech Neuroprosthesis

arXiv:2603.18299v1 Announce Type: new Abstract: Intracortical brain-computer interfaces (BCIs) can decode speech from neural activity with high accuracy when trained on data pooled across recording sessions. In realistic deployment, however, models must generalize to new sessions without labeled data, and...

1 min 1 month ago

LOW Academic European Union

Approximate Subgraph Matching with Neural Graph Representations and Reinforcement Learning

arXiv:2603.18314v1 Announce Type: new Abstract: Approximate subgraph matching (ASM) is a task that determines the approximate presence of a given query graph in a large target graph. Being an NP-hard problem, ASM is critical in graph analysis with a myriad...

1 min 1 month ago

LOW Academic International

Learning to Reason with Curriculum I: Provable Benefits of Autocurriculum

arXiv:2603.18325v1 Announce Type: new Abstract: Chain-of-thought reasoning, where language models expend additional computation by producing thinking tokens prior to final responses, has driven significant advances in model capabilities. However, training these reasoning models is extremely costly in terms of both...

1 min 1 month ago

LOW Academic International

Escaping Offline Pessimism: Vector-Field Reward Shaping for Safe Frontier Exploration

arXiv:2603.18326v1 Announce Type: new Abstract: While offline reinforcement learning provides reliable policies for real-world deployment, its inherent pessimism severely restricts an agent's ability to explore and collect novel data online. Drawing inspiration from safe reinforcement learning, exploring near the boundary...

1 min 1 month ago

LOW Academic European Union

A Family of Adaptive Activation Functions for Mitigating Failure Modes in Physics-Informed Neural Networks

arXiv:2603.18328v1 Announce Type: new Abstract: Physics-Informed Neural Networks (PINNs) are a powerful and flexible learning framework that has gained significant attention in recent years. It has demonstrated strong performance across a wide range of scientific and engineering problems. In parallel, wavelets...

1 min 1 month ago

LOW Academic European Union

Mathematical Foundations of Deep Learning

arXiv:2603.18387v1 Announce Type: new Abstract: This draft book offers a comprehensive and rigorous treatment of the mathematical principles underlying modern deep learning. The book spans core theoretical topics, from the approximation capabilities of deep neural networks, the theory and algorithms...

1 min 1 month ago

LOW Academic International

RE-SAC: Disentangling aleatoric and epistemic risks in bus fleet control: A stable and robust ensemble DRL approach

arXiv:2603.18396v1 Announce Type: new Abstract: Bus holding control is challenging due to stochastic traffic and passenger demand. While deep reinforcement learning (DRL) shows promise, standard actor-critic algorithms suffer from Q-value instability in volatile environments. A key source of this instability...

1 min 1 month ago

LOW Academic International

FlowMS: Flow Matching for De Novo Structure Elucidation from Mass Spectra

arXiv:2603.18397v1 Announce Type: new Abstract: Mass spectrometry (MS) stands as a cornerstone analytical technique for molecular identification, yet de novo structure elucidation from spectra remains challenging due to the combinatorial complexity of chemical space and the inherent ambiguity of spectral...

1 min 1 month ago

LOW Academic European Union

Self-Tuning Sparse Attention: Multi-Fidelity Hyperparameter Optimization for Transformer Acceleration

arXiv:2603.18417v1 Announce Type: new Abstract: Sparse attention mechanisms promise to break the quadratic bottleneck of long-context transformers, yet production adoption remains limited by a critical usability gap: optimal hyperparameters vary substantially across layers and models, and current methods (e.g., SpargeAttn)...

1 min 1 month ago

LOW Academic International

Towards Noise-Resilient Quantum Multi-Armed and Stochastic Linear Bandits

arXiv:2603.18431v1 Announce Type: new Abstract: Quantum multi-armed bandits (MAB) and stochastic linear bandits (SLB) have recently attracted significant attention, as their quantum counterparts can achieve quadratic speedups over classical MAB and SLB. However, most existing quantum MAB algorithms assume ideal...

1 min 1 month ago

LOW Academic United States

MLOW: Interpretable Low-Rank Frequency Magnitude Decomposition of Multiple Effects for Time Series Forecasting

arXiv:2603.18432v1 Announce Type: new Abstract: Separating multiple effects in time series is fundamental yet challenging for time-series forecasting (TSF). However, existing TSF models cannot effectively learn interpretable multi-effect decomposition by their smoothing-based temporal techniques. Here, a new interpretable frequency-based decomposition...

1 min 1 month ago

LOW Academic International

Discounted Beta-Bernoulli Reward Estimation for Sample-Efficient Reinforcement Learning with Verifiable Rewards

arXiv:2603.18444v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR) has emerged as an effective post-training paradigm for improving the reasoning capabilities of large language models. However, existing group-based RLVR methods often suffer from severe sample inefficiency. This inefficiency...

1 min 1 month ago

LOW Academic International

AcceRL: A Distributed Asynchronous Reinforcement Learning and World Model Framework for Vision-Language-Action Models

arXiv:2603.18464v1 Announce Type: new Abstract: Reinforcement learning (RL) for large-scale Vision-Language-Action (VLA) models faces significant challenges in computational efficiency and data acquisition. We propose AcceRL, a fully asynchronous and decoupled RL framework designed to eliminate synchronization barriers by physically isolating...

1 min 1 month ago

LOW Academic United States

AIMER: Calibration-Free Task-Agnostic MoE Pruning

arXiv:2603.18492v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) language models increase parameter capacity without proportional per-token compute, but the deployment still requires storing all experts, making expert pruning important for reducing memory and serving overhead. Existing task-agnostic expert pruning methods are...

1 min 1 month ago

LOW Academic United States

Balancing the Reasoning Load: Difficulty-Differentiated Policy Optimization with Length Redistribution for Efficient and Robust Reinforcement Learning

arXiv:2603.18533v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) have shown exceptional reasoning capabilities, but they also suffer from the issue of overthinking, often generating excessively long and redundant answers. For problems that exceed the model's capabilities, LRMs tend to...

1 min 1 month ago

LOW Academic International

Data-efficient pre-training by scaling synthetic megadocs

arXiv:2603.18534v1 Announce Type: new Abstract: Synthetic data augmentation has emerged as a promising solution when pre-training is constrained by data rather than compute. We study how to design synthetic data algorithms that achieve better loss scaling: not only lowering loss...

1 min 1 month ago
