Bayesian Optimality of In-Context Learning with Selective State Spaces
arXiv:2602.17744v1 Announce Type: cross Abstract: We propose Bayesian optimal sequential prediction as a new principle for understanding in-context learning (ICL). Unlike interpretations framing Transformers as performing implicit gradient descent, we formalize ICL as meta-learning over latent sequence tasks. For tasks...
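The Bayesian-optimal predictor the abstract alludes to can be illustrated on a toy latent-task family. The sketch below uses coin-bias tasks purely for illustration (the task family and names are assumptions, not the paper's construction): the optimal "in-context" prediction is the posterior-predictive mixture over tasks given the observed context, with no weight updates.

```python
import numpy as np

# Toy latent tasks: coins with different biases; the context is a bit string.
thetas = np.array([0.2, 0.5, 0.8])      # latent task parameters
prior = np.full(3, 1 / 3)               # uniform prior over tasks

def predict_next(context):
    ones = sum(context)
    loglik = ones * np.log(thetas) + (len(context) - ones) * np.log(1 - thetas)
    post = prior * np.exp(loglik - loglik.max())
    post /= post.sum()
    return float(post @ thetas)          # P(next = 1 | context)

print(predict_next([1, 1, 0, 1]))        # shifts toward the high-bias task
```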
ADAPT: Hybrid Prompt Optimization for LLM Feature Visualization
arXiv:2602.17867v1 Announce Type: cross Abstract: Understanding what features are encoded by learned directions in LLM activation space requires identifying inputs that strongly activate them. Feature visualization, which optimizes inputs to maximally activate a target direction, offers an alternative to costly...
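Feature visualization by input optimization can be sketched generically: gradient ascent on a continuous (soft) input to maximize alignment with a target activation direction. The stand-in model, regularizer, and names below are assumptions; ADAPT's hybrid prompt optimization is more elaborate than this baseline.

```python
import torch

torch.manual_seed(0)
dim, seq = 32, 6
W = torch.randn(dim, dim)                      # frozen stand-in for a model layer
direction = torch.randn(dim)
direction /= direction.norm()                  # target direction in activation space

soft_tokens = torch.randn(seq, dim, requires_grad=True)
opt = torch.optim.Adam([soft_tokens], lr=0.05)
for _ in range(200):
    acts = torch.tanh(soft_tokens @ W.T)       # toy "activations"
    score = acts.mean(0) @ direction           # alignment with the target direction
    loss = -score + 1e-3 * soft_tokens.pow(2).mean()  # small prior keeps inputs bounded
    opt.zero_grad(); loss.backward(); opt.step()
print(float(score))
```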
NIMMGen: Learning Neural-Integrated Mechanistic Digital Twins with LLMs
arXiv:2602.18008v1 Announce Type: cross Abstract: Mechanistic models encode scientific knowledge about dynamical systems and are widely used in downstream scientific and policy applications. Recent work has explored LLM-based agentic frameworks to automatically construct mechanistic models from data; however, existing problem...
Gradient Regularization Prevents Reward Hacking in Reinforcement Learning from Human Feedback and Verifiable Rewards
arXiv:2602.18037v1 Announce Type: cross Abstract: Reinforcement Learning from Human Feedback (RLHF) and Reinforcement Learning from Verifiable Rewards (RLVR) are two key steps in the post-training of modern Language Models (LMs). A common problem is reward hacking, where the policy may exploit inaccuracies of...
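One plausible form of gradient regularization in this setting is a penalty on the squared norm of the policy gradient of the proxy objective; the paper's exact formulation may differ. A minimal PyTorch sketch on a toy REINFORCE step:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
policy = nn.Linear(4, 3)                           # toy policy: state -> logits
opt = torch.optim.SGD(policy.parameters(), lr=1e-2)

states = torch.randn(32, 4)
actions = torch.randint(0, 3, (32,))
rewards = torch.randn(32)                          # proxy reward-model scores

logp = torch.log_softmax(policy(states), dim=-1)
pg_loss = -(logp[torch.arange(32), actions] * rewards).mean()

# Penalize the squared norm of the policy gradient of the proxy objective,
# limiting how aggressively the policy can chase reward-model errors.
grads = torch.autograd.grad(pg_loss, list(policy.parameters()), create_graph=True)
grad_penalty = sum((g ** 2).sum() for g in grads)

(pg_loss + 0.1 * grad_penalty).backward()
opt.step()
```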
On the Semantic and Syntactic Information Encoded in Proto-Tokens for One-Step Text Reconstruction
arXiv:2602.18301v1 Announce Type: cross Abstract: Autoregressive large language models (LLMs) generate text token-by-token, requiring n forward passes to produce a sequence of length n. Recent work, Exploring the Latent Capacity of LLMs for One-Step Text Reconstruction (Mezentsev and Oseledets), shows...
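The one-step reconstruction setup can be mimicked with a frozen toy decoder: only a single vector (the "proto-token") is optimized so that one forward pass emits the whole target sequence. Everything below is an illustrative stand-in, not the cited method's model; dimensions are chosen so the frozen linear map is almost surely surjective and perfect reconstruction is attainable.

```python
import torch

# Frozen toy "decoder" maps one learned vector to logits for every position at
# once, i.e. a whole sequence from a single forward pass.
torch.manual_seed(0)
vocab, seq_len, dim = 16, 4, 64                    # dim == seq_len * vocab
decoder = torch.nn.Linear(dim, seq_len * vocab)
for p in decoder.parameters():
    p.requires_grad_(False)                        # the model stays frozen

target = torch.randint(0, vocab, (seq_len,))       # "text" to reconstruct
proto = torch.zeros(dim, requires_grad=True)       # the only trainable object
opt = torch.optim.Adam([proto], lr=0.1)

for _ in range(300):
    logits = decoder(proto).view(seq_len, vocab)
    loss = torch.nn.functional.cross_entropy(logits, target)
    opt.zero_grad(); loss.backward(); opt.step()

recon = decoder(proto).view(seq_len, vocab).argmax(-1)
print((recon == target).float().mean())            # typically 1.0 on this toy problem
```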
On the "Induction Bias" in Sequence Models
arXiv:2602.18333v1 Announce Type: cross Abstract: Despite the remarkable practical success of transformer-based language models, recent work has raised concerns about their ability to perform state tracking. In particular, a growing body of literature has shown this limitation primarily through failures...
Subgroups of $U(d)$ Induce Natural RNN and Transformer Architectures
arXiv:2602.18417v1 Announce Type: cross Abstract: This paper presents a direct framework for sequence models with hidden states on closed subgroups of U(d). We use a minimal axiomatic setup and derive recurrent and transformer templates from a shared skeleton in which...
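The recurrent template the abstract describes can be sketched with a hidden state evolved by input-dependent unitary matrices: exponentiating a skew-Hermitian generator stays inside U(d), so every update is exactly norm-preserving. The generator parameterization below is an assumption, not the paper's axiomatic construction.

```python
import numpy as np
from scipy.linalg import expm

def unitary_step(h, x, A_re, A_im):
    # Input selects a generator; projecting to skew-Hermitian and exponentiating
    # yields a unitary matrix, so the hidden state stays on the unit sphere.
    M = x * (A_re + 1j * A_im)
    G = (M - M.conj().T) / 2
    return expm(G) @ h

rng = np.random.default_rng(0)
d = 4
A_re, A_im = rng.standard_normal((d, d)), rng.standard_normal((d, d))
h = np.ones(d, dtype=complex) / np.sqrt(d)
for x in [0.3, -1.2, 0.7]:
    h = unitary_step(h, x, A_re, A_im)
print(np.linalg.norm(h))   # 1.0: the recurrence is exactly norm-preserving
```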
Probabilistic NDVI Forecasting from Sparse Satellite Time Series and Weather Covariates
arXiv:2602.17683v1 Announce Type: new Abstract: Accurate short-term forecasting of vegetation dynamics is a key enabler for data-driven decision support in precision agriculture. Normalized Difference Vegetation Index (NDVI) forecasting from satellite observations, however, remains challenging due to sparse and irregular sampling...
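A standard way to make such forecasts probabilistic is a head that emits a per-step mean and log-standard-deviation trained with a Gaussian negative log-likelihood; the sketch below is generic, not the paper's architecture.

```python
import torch

def gaussian_nll(mu, log_sigma, y):
    # NLL of y under N(mu, sigma^2), up to the 0.5*log(2*pi) constant; training
    # a forecast head with this yields predictive distributions, not points.
    return (log_sigma + 0.5 * ((y - mu) / log_sigma.exp()) ** 2).mean()

mu, log_sigma = torch.zeros(4), torch.zeros(4)   # toy 4-step-ahead forecast
print(gaussian_nll(mu, log_sigma, torch.tensor([0.1, 0.2, 0.0, -0.1])))
```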
Optimal Multi-Debris Mission Planning in LEO: A Deep Reinforcement Learning Approach with Co-Elliptic Transfers and Refueling
arXiv:2602.17685v1 Announce Type: new Abstract: This paper addresses the challenge of multi-target active debris removal (ADR) in Low Earth Orbit (LEO) by introducing a unified co-elliptic maneuver framework that combines Hohmann transfers, safety ellipse proximity operations, and explicit refueling...
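The Hohmann transfer primitive the framework builds on has a closed-form delta-v. A minimal sketch (coplanar circular orbits, impulsive burns; the paper's co-elliptic framework layers proximity operations and refueling on top of primitives like this):

```python
import math

MU = 3.986004418e14  # Earth's gravitational parameter, m^3 / s^2

def hohmann_delta_v(r1, r2):
    # Delta-v (m/s) of the two impulsive burns between circular orbits of
    # radii r1 and r2 (meters), via the standard vis-viva expressions.
    dv1 = math.sqrt(MU / r1) * (math.sqrt(2 * r2 / (r1 + r2)) - 1)
    dv2 = math.sqrt(MU / r2) * (1 - math.sqrt(2 * r1 / (r1 + r2)))
    return abs(dv1), abs(dv2)

R_EARTH = 6_378_137.0
print(hohmann_delta_v(R_EARTH + 600e3, R_EARTH + 800e3))  # ~53 m/s each burn
```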
Pimp My LLM: Leveraging Variability Modeling to Tune Inference Hyperparameters
arXiv:2602.17697v1 Announce Type: new Abstract: Large Language Models (LLMs) are being increasingly used across a wide range of tasks. However, their substantial computational demands raise concerns about the energy efficiency and sustainability of both training and inference. Inference, in particular,...
Provable Adversarial Robustness in In-Context Learning
arXiv:2602.17743v1 Announce Type: new Abstract: Large language models adapt to new tasks through in-context learning (ICL) without parameter updates. Current theoretical explanations for this capability assume test tasks are drawn from a distribution similar to that seen during pretraining. This...
Multi-material Multi-physics Topology Optimization with Physics-informed Gaussian Process Priors
arXiv:2602.17783v1 Announce Type: new Abstract: Machine learning (ML) has been increasingly used for topology optimization (TO). However, most existing ML-based approaches focus on simplified benchmark problems due to their high computational cost, spectral bias, and difficulty in handling complex physics....
Grassmannian Mixture-of-Experts: Concentration-Controlled Routing on Subspace Manifolds
arXiv:2602.17798v1 Announce Type: new Abstract: Mixture-of-Experts models rely on learned routers to assign tokens to experts, yet standard softmax gating provides no principled mechanism to control the tradeoff between sparsity and utilization. We propose Grassmannian MoE (GrMoE), a routing framework...
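A hypothetical reading of subspace routing: give each expert an orthonormal basis (a point on the Grassmannian Gr(k, d)) and gate by the squared projection of the token onto each subspace, with a concentration parameter sharpening the softmax. The sketch below assumes this form; GrMoE's actual mechanism may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n_experts = 16, 4, 8

# Each expert owns a k-dim subspace of R^d, represented by an orthonormal basis.
bases = [np.linalg.qr(rng.standard_normal((d, k)))[0] for _ in range(n_experts)]

def route(x, kappa=4.0):
    # Score = squared norm of the token's projection onto each expert subspace;
    # kappa acts as a concentration knob: larger kappa -> sparser routing.
    scores = np.array([np.linalg.norm(U.T @ x) ** 2 for U in bases])
    logits = kappa * scores
    p = np.exp(logits - logits.max())
    return p / p.sum()

x = rng.standard_normal(d)
print(route(x, kappa=1.0).round(3))   # diffuse assignment
print(route(x, kappa=16.0).round(3))  # concentrated on the best-aligned subspace
```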
Avoid What You Know: Divergent Trajectory Balance for GFlowNets
arXiv:2602.17827v1 Announce Type: new Abstract: Generative Flow Networks (GFlowNets) are a flexible family of amortized samplers trained to generate discrete and compositional objects with probability proportional to a reward function. However, learning efficiency is constrained by the model's ability to...
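For reference, the standard trajectory-balance objective that the proposed divergent variant builds on (Malkin et al.) is sketched below; the divergence-inducing term itself is not reproduced here.

```python
import torch

def trajectory_balance_loss(log_Z, log_pf, log_pb, log_reward):
    # Standard TB: (log Z + sum_t log P_F - log R(x) - sum_t log P_B)^2,
    # driven to zero when the sampler matches the reward distribution.
    # log_pf / log_pb: per-step forward/backward log-probs of one trajectory.
    return (log_Z + log_pf.sum() - log_reward - log_pb.sum()) ** 2

log_Z = torch.tensor(0.5, requires_grad=True)
print(trajectory_balance_loss(log_Z, torch.randn(5), torch.randn(5), torch.tensor(1.2)))
```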
Causality by Abstraction: Symbolic Rule Learning in Multivariate Timeseries with Large Language Models
arXiv:2602.17829v1 Announce Type: new Abstract: Inferring causal relations in timeseries data with delayed effects is a fundamental challenge, especially when the underlying system exhibits complex dynamics that cannot be captured by simple functional mappings. Traditional approaches often fail to produce...
MePoly: Max Entropy Polynomial Policy Optimization
arXiv:2602.17832v1 Announce Type: new Abstract: Stochastic Optimal Control provides a unified mathematical framework for solving complex decision-making problems, encompassing paradigms such as maximum entropy reinforcement learning (RL) and imitation learning (IL). However, conventional parametric policies often struggle to represent the multi-modality of...
Influence-Preserving Proxies for Gradient-Based Data Selection in LLM Fine-tuning
arXiv:2602.17835v1 Announce Type: new Abstract: Supervised fine-tuning (SFT) relies critically on selecting training data that most benefits a model's downstream performance. Gradient-based data selection methods such as TracIn and Influence Functions leverage influence to identify useful samples, but their computational...
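The gradient-based selection baseline the abstract mentions, TracIn (Pruthi et al. 2020), scores a training example by summed gradient dot-products across checkpoints; the function signature below is an assumption, and the paper's contribution is building cheap proxies for exactly this quantity.

```python
import torch

def tracin_score(model, loss_fn, train_batch, test_batch, checkpoints, lrs):
    # TracIn: influence of a training example on a test example is approximated
    # by sum over saved checkpoints of lr * <grad L(train), grad L(test)>.
    # `checkpoints` is a list of state_dicts; loss_fn(model, batch) -> scalar
    # is an assumed signature for this sketch.
    score = 0.0
    params = list(model.parameters())
    for state, lr in zip(checkpoints, lrs):
        model.load_state_dict(state)
        g_tr = torch.autograd.grad(loss_fn(model, train_batch), params)
        g_te = torch.autograd.grad(loss_fn(model, test_batch), params)
        score += lr * sum((a * b).sum() for a, b in zip(g_tr, g_te)).item()
    return score
```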
Two Calm Ends and the Wild Middle: A Geometric Picture of Memorization in Diffusion Models
arXiv:2602.17846v1 Announce Type: new Abstract: Diffusion models generate high-quality samples but can also memorize training data, raising serious privacy concerns. Understanding the mechanisms governing when memorization versus generalization occurs remains an active area of research. In particular, it is unclear...
Neural Prior Estimation: Learning Class Priors from Latent Representations
arXiv:2602.17853v1 Announce Type: new Abstract: Class imbalance induces systematic bias in deep neural networks by imposing a skewed effective class prior. This work introduces the Neural Prior Estimator (NPE), a framework that learns feature-conditioned log-prior estimates from latent representations. NPE...
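One way such a feature-conditioned log-prior could enter the model is via logit adjustment: a small head predicts a per-class log-prior from the latent code, and the classifier's logits are corrected by it. The module below is a hypothetical sketch, using the standard logit-adjustment convention of subtracting the log-prior at prediction time; NPE's actual parameterization may differ.

```python
import torch
import torch.nn as nn

class PriorAdjustedHead(nn.Module):
    # Hypothetical NPE-style head: one branch predicts class logits, another
    # predicts a feature-conditioned log-prior; subtracting the latter debiases
    # predictions under a skewed effective class prior.
    def __init__(self, dim: int, n_classes: int):
        super().__init__()
        self.classifier = nn.Linear(dim, n_classes)
        self.log_prior = nn.Linear(dim, n_classes)

    def forward(self, z):
        return self.classifier(z) - self.log_prior(z)

head = PriorAdjustedHead(16, 5)
print(head(torch.randn(2, 16)).shape)   # torch.Size([2, 5])
```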
JAX-Privacy: A library for differentially private machine learning
arXiv:2602.17861v1 Announce Type: new Abstract: JAX-Privacy is a library designed to simplify the deployment of robust and performant mechanisms for differentially private machine learning. Guided by design principles of usability, flexibility, and efficiency, JAX-Privacy serves both researchers requiring deep customization...
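For orientation, the core mechanism such libraries implement is DP-SGD: per-example gradient clipping plus calibrated Gaussian noise. The NumPy sketch below is the generic algorithm (Abadi et al. 2016), deliberately not JAX-Privacy's actual API.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_mult=1.1, rng=None):
    # Generic DP-SGD update direction: clip each per-example gradient to
    # clip_norm, average, then add Gaussian noise with
    # std = noise_mult * clip_norm / batch_size.
    if rng is None:
        rng = np.random.default_rng(0)
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    mean = np.mean(clipped, axis=0)
    return mean + rng.normal(0.0, noise_mult * clip_norm / len(clipped), mean.shape)

grads = [np.random.default_rng(i).standard_normal(10) for i in range(8)]
print(np.linalg.norm(dp_sgd_step(grads)))
```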
COMBA: Cross Batch Aggregation for Learning Large Graphs with Context Gating State Space Models
arXiv:2602.17893v1 Announce Type: new Abstract: State space models (SSMs) have recently emerged for modeling long-range dependencies in sequence data at much lower computational cost than modern alternatives such as transformers. Extending SSMs to graph-structured data, especially for large graphs,...
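The sequence backbone in question is a linear state-space recurrence; a minimal diagonal version is below. COMBA's context gating and cross-batch aggregation for graphs are additional machinery not shown here.

```python
import numpy as np

def ssm_scan(x, a, b, c):
    # Minimal diagonal linear SSM: h_t = a * h_{t-1} + b * x_t, y_t = <c, h_t>.
    # Cost is linear in sequence length, which is the efficiency argument the
    # abstract makes relative to attention.
    h = np.zeros_like(a)
    ys = []
    for xt in x:
        h = a * h + b * xt
        ys.append(float(c @ h))
    return np.array(ys)

a, b, c = np.full(8, 0.9), np.ones(8), np.ones(8) / 8
print(ssm_scan(np.sin(np.linspace(0, 3, 20)), a, b, c)[-3:])
```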
Breaking the Correlation Plateau: On the Optimization and Capacity Limits of Attention-Based Regressors
arXiv:2602.17898v1 Announce Type: new Abstract: Attention-based regression models are often trained by jointly optimizing Mean Squared Error (MSE) loss and Pearson correlation coefficient (PCC) loss, emphasizing the magnitude of errors and the order or shape of targets, respectively. A common...
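The joint objective described can be written down directly; the convex weighting below is an assumed, common choice rather than the paper's exact scheme.

```python
import torch

def mse_pcc_loss(pred, target, alpha=0.5, eps=1e-8):
    # alpha * MSE (error magnitude) + (1 - alpha) * (1 - Pearson r), where the
    # correlation term rewards getting the ordering/shape of the targets right.
    mse = torch.mean((pred - target) ** 2)
    p, t = pred - pred.mean(), target - target.mean()
    pcc = (p * t).sum() / (p.norm() * t.norm() + eps)
    return alpha * mse + (1 - alpha) * (1 - pcc)

print(mse_pcc_loss(torch.randn(16), torch.randn(16)))
```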
Distribution-Free Sequential Prediction with Abstentions
arXiv:2602.17918v1 Announce Type: new Abstract: We study a sequential prediction problem in which an adversary is allowed to inject arbitrarily many adversarial instances into a stream of i.i.d. instances, but at each round, the learner may also abstain from making...
Memory-Based Advantage Shaping for LLM-Guided Reinforcement Learning
arXiv:2602.17931v1 Announce Type: new Abstract: In environments with sparse or delayed rewards, reinforcement learning (RL) incurs high sample complexity due to the large number of interactions needed for learning. This limitation has motivated the use of large language models (LLMs)...
Causal Neighbourhood Learning for Invariant Graph Representations
arXiv:2602.17934v1 Announce Type: new Abstract: Graph data often contain noisy and spurious correlations that mask the true causal relationships, which are essential for enabling graph models to make predictions based on the underlying causal structure of the data. Dependence on...
Optimizing Graph Causal Classification Models: Estimating Causal Effects and Addressing Confounders
arXiv:2602.17941v1 Announce Type: new Abstract: Graph data is becoming increasingly prevalent due to the growing demand for relational insights in AI across various domains. Organizations regularly use graph data to solve complex problems involving relationships and connections. Causal learning is...
Understanding the Generalization of Bilevel Programming in Hyperparameter Optimization: A Tale of Bias-Variance Decomposition
arXiv:2602.17947v1 Announce Type: new Abstract: Gradient-based hyperparameter optimization (HPO) methods have emerged recently, leveraging bilevel programming techniques to optimize hyperparameters by estimating the hypergradient of the validation loss. Nevertheless, previous theoretical works mainly focus on reducing the gap between the estimation and ground-truth...
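The hypergradient estimator under study can be illustrated with a one-step unrolled inner SGD update: differentiate the validation loss through the inner step to obtain the gradient with respect to a regularization hyperparameter. Real bilevel pipelines unroll further or use implicit differentiation; this toy only shows the estimator's mechanics.

```python
import torch

torch.manual_seed(0)
w = torch.randn(5, requires_grad=True)          # inner (model) parameters
log_lam = torch.zeros((), requires_grad=True)   # outer hyperparameter: L2 strength
X_tr, y_tr = torch.randn(20, 5), torch.randn(20)
X_va, y_va = torch.randn(20, 5), torch.randn(20)

train_loss = ((X_tr @ w - y_tr) ** 2).mean() + log_lam.exp() * (w ** 2).sum()
g = torch.autograd.grad(train_loss, w, create_graph=True)[0]
w1 = w - 0.1 * g                                # one unrolled inner SGD step
val_loss = ((X_va @ w1 - y_va) ** 2).mean()
print(torch.autograd.grad(val_loss, log_lam)[0])  # the hypergradient estimate
```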
Court grapples with disputes over efforts to recover losses from Cuban confiscations
In a pair of oral arguments on Monday, the Supreme Court wrestled with disputes over whether U.S. companies can recover under U.S. law for losses resulting from the confiscation of […]
Birthright citizenship: under the flag
Brothers in Law is a recurring series by brothers Akhil and Vikram Amar, with special emphasis on measuring what the Supreme Court says against what the Constitution itself says. For more content from […]
Supreme Court agrees to hear case on Colorado dispute over climate change
Returning from its winter recess, the Supreme Court on Monday added just one new case to its oral argument docket. In a list of orders from the justices' private conference […]