LOW Academic International

Subspace Geometry Governs Catastrophic Forgetting in Low-Rank Adaptation

arXiv:2603.02224v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) has emerged as a parameter-efficient approach for adapting large pre-trained models, yet its behavior under continual learning remains poorly understood. We present a geometric theory characterizing catastrophic forgetting in LoRA through the...

1 min 1 month, 2 weeks ago

LOW Academic International

Scaling Reward Modeling without Human Supervision

arXiv:2603.02225v1 Announce Type: new Abstract: Learning from feedback is an instrumental process for advancing the capabilities and safety of frontier models, yet its effectiveness is often constrained by cost and scalability. We present a pilot study that explores scaling reward...

1 min 1 month, 2 weeks ago

LOW Academic International

Length Generalization Bounds for Transformers

arXiv:2603.02238v1 Announce Type: new Abstract: Length generalization is a key property of a learning algorithm that enables it to make correct predictions on inputs of any length, given finite training data. To provide such a guarantee, one needs to be...

1 min 1 month, 2 weeks ago

LOW Academic International

Temporal Imbalance of Positive and Negative Supervision in Class-Incremental Learning

arXiv:2603.02280v1 Announce Type: new Abstract: With the widespread adoption of deep learning in visual tasks, Class-Incremental Learning (CIL) has become an important paradigm for handling dynamically evolving data distributions. However, CIL faces the core challenge of catastrophic forgetting, often manifested...

1 min 1 month, 2 weeks ago

LOW Academic International

Preconditioned Score and Flow Matching

arXiv:2603.02337v1 Announce Type: new Abstract: Flow matching and score-based diffusion train vector fields under intermediate distributions $p_t$, whose geometry can strongly affect their optimization. We show that the covariance $\Sigma_t$ of $p_t$ governs optimization bias: when $\Sigma_t$ is ill-conditioned, and...

1 min 1 month, 2 weeks ago

LOW Academic International

Rigidity-Aware Geometric Pretraining for Protein Design and Conformational Ensembles

arXiv:2603.02406v1 Announce Type: new Abstract: Generative models have recently advanced $\textit{de novo}$ protein design by learning the statistical regularities of natural structures. However, current approaches face three key limitations: (1) Existing methods cannot jointly learn protein geometry and design tasks,...

1 min 1 month, 2 weeks ago

LOW Academic International

Personalized Multi-Agent Average Reward TD-Learning via Joint Linear Approximation

arXiv:2603.02426v1 Announce Type: new Abstract: We study personalized multi-agent average reward TD learning, in which a collection of agents interacts with different environments and jointly learns their respective value functions. We focus on the setting where there exists a shared...

1 min 1 month, 2 weeks ago

LOW Academic International

Dimension-Independent Convergence of Underdamped Langevin Monte Carlo in KL Divergence

arXiv:2603.02429v1 Announce Type: new Abstract: Underdamped Langevin dynamics (ULD) is a widely-used sampler for Gibbs distributions $\pi\propto e^{-V}$, and is often empirically effective in high dimensions. However, existing non-asymptotic convergence guarantees for discretized ULD typically scale polynomially with the ambient...

1 min 1 month, 2 weeks ago

LOW Academic International

A Unified Revisit of Temperature in Classification-Based Knowledge Distillation

arXiv:2603.02430v1 Announce Type: new Abstract: A central idea of knowledge distillation is to expose relational structure embedded in the teacher's weights for the student to learn, which is often facilitated using a temperature parameter. Despite its widespread use, there remains...

1 min 1 month, 2 weeks ago

LOW Academic International

Spectral Regularization for Diffusion Models

arXiv:2603.02447v1 Announce Type: new Abstract: Diffusion models are typically trained using pointwise reconstruction objectives that are agnostic to the spectral and multi-scale structure of natural signals. We propose a loss-level spectral regularization framework that augments standard diffusion training with differentiable...

1 min 1 month, 2 weeks ago

LOW Academic International

Manifold Aware Denoising Score Matching (MAD)

arXiv:2603.02452v1 Announce Type: new Abstract: A major focus in designing methods for learning distributions defined on manifolds is to alleviate the need to implicitly learn the manifold so that learning can concentrate on the data distribution within the manifold. However,...

1 min 1 month, 2 weeks ago

LOW Academic International

EdgeFLow: Serverless Federated Learning via Sequential Model Migration in Edge Networks

arXiv:2603.02562v1 Announce Type: new Abstract: Federated Learning (FL) has emerged as a transformative distributed learning paradigm in the era of Internet of Things (IoT), reconceptualizing data processing methodologies. However, FL systems face significant communication bottlenecks due to inevitable client-server data...

1 min 1 month, 2 weeks ago

LOW Conference International

CVPR 2026 Media Center

1 min 1 month, 2 weeks ago

LOW Conference International

Get a CVPR 2026 Media Pass

2 min 1 month, 2 weeks ago

LOW Conference International

CVPR 2026 News and Resources for Press

1 min 1 month, 2 weeks ago

LOW Academic International

Distribution-Aware Companding Quantization of Large Language Models

arXiv:2603.00364v1 Announce Type: new Abstract: Large language models such as GPT and Llama are trained with a next-token prediction loss. In this work, we suggest that training language models to predict multiple future tokens at once results in higher sample...

1 min 1 month, 2 weeks ago

LOW Academic International

A Typologically Grounded Evaluation Framework for Word Order and Morphology Sensitivity in Multilingual Masked LMs

arXiv:2603.00432v1 Announce Type: new Abstract: We introduce a typology-aware diagnostic for multilingual masked language models that tests reliance on word order versus inflectional form. Using Universal Dependencies, we apply inference-time perturbations: full token scrambling, content-word scrambling with function words fixed,...

1 min 1 month, 2 weeks ago

LOW Academic International

CIRCUS: Circuit Consensus under Uncertainty via Stability Ensembles

arXiv:2603.00523v1 Announce Type: new Abstract: Mechanistic circuit discovery is notoriously sensitive to arbitrary analyst choices, especially pruning thresholds and feature dictionaries, often yielding brittle "one-shot" explanations with no principled notion of uncertainty. We reframe circuit discovery as an uncertainty-quantification problem...

1 min 1 month, 2 weeks ago

LOW Academic International

CoMoL: Efficient Mixture of LoRA Experts via Dynamic Core Space Merging

arXiv:2603.00573v1 Announce Type: new Abstract: Large language models (LLMs) achieve remarkable performance on diverse downstream and domain-specific tasks via parameter-efficient fine-tuning (PEFT). However, existing PEFT methods, particularly MoE-LoRA architectures, suffer from limited parameter efficiency and coarse-grained adaptation due to the...

1 min 1 month, 2 weeks ago

LOW Academic International

SSKG Hub: An Expert-Guided Platform for LLM-Empowered Sustainability Standards Knowledge Graphs

arXiv:2603.00669v1 Announce Type: new Abstract: Sustainability disclosure standards (e.g., GRI, SASB, TCFD, IFRS S2) are comprehensive yet lengthy, terminology-dense, and highly cross-referential, hindering structured analysis and downstream use. We present SSKG Hub (Sustainability Standards Knowledge Graph Hub), a research prototype...

1 min 1 month, 2 weeks ago

LOW Academic International

Polynomial Mixing for Efficient Self-supervised Speech Encoders

arXiv:2603.00683v1 Announce Type: new Abstract: State-of-the-art speech-to-text models typically employ Transformer-based encoders that model token dependencies via self-attention mechanisms. However, the quadratic complexity of self-attention in both memory and computation imposes significant constraints on scalability. In this work, we propose...

1 min 1 month, 2 weeks ago

LOW Academic International

RAVEL: Reasoning Agents for Validating and Evaluating LLM Text Synthesis

arXiv:2603.00686v1 Announce Type: new Abstract: Large Language Models have evolved from single-round generators into long-horizon agents, capable of complex text synthesis scenarios. However, current evaluation frameworks lack the ability to assess the actual synthesis operations, such as outlining, drafting, and...

1 min 1 month, 2 weeks ago

LOW Academic International

RLAR: An Agentic Reward System for Multi-task Reinforcement Learning on Large Language Models

arXiv:2603.00724v1 Announce Type: new Abstract: Large language model alignment via reinforcement learning depends critically on reward function quality. However, static, domain-specific reward models are often costly to train and exhibit poor generalization in out-of-distribution scenarios encountered during RL iterations. We...

1 min 1 month, 2 weeks ago

LOW Academic International

Constitutional Black-Box Monitoring for Scheming in LLM Agents

arXiv:2603.00829v1 Announce Type: new Abstract: Safe deployment of Large Language Model (LLM) agents in autonomous settings requires reliable oversight mechanisms. A central challenge is detecting scheming, where agents covertly pursue misaligned goals. One approach to mitigating such risks is LLM-based...

1 min 1 month, 2 weeks ago

LOW Academic International

KVSlimmer: Theoretical Insights and Practical Optimizations for Asymmetric KV Merging

arXiv:2603.00907v1 Announce Type: new Abstract: The growing computational and memory demands of the Key-Value (KV) cache significantly limit the ability of Large Language Models (LLMs). While KV merging has emerged as a promising solution, existing methods that rely on empirical...

1 min 1 month, 2 weeks ago

LOW Academic International

Thoth: Mid-Training Bridges LLMs to Time Series Understanding

arXiv:2603.01042v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated remarkable success in general-purpose reasoning. However, they still struggle to understand and reason about time series data, which limits their effectiveness in decision-making scenarios that depend on temporal dynamics....

1 min 1 month, 2 weeks ago

LOW Academic International

How RL Unlocks the Aha Moment in Geometric Interleaved Reasoning

arXiv:2603.01070v1 Announce Type: new Abstract: Solving complex geometric problems inherently requires interleaved reasoning: a tight alternation between constructing diagrams and performing logical deductions. Although recent Multimodal Large Language Models (MLLMs) have demonstrated strong capabilities in visual generation and plotting, we...

1 min 1 month, 2 weeks ago

LOW Academic International

StaTS: Spectral Trajectory Schedule Learning for Adaptive Time Series Forecasting with Frequency Guided Denoiser

arXiv:2603.00037v1 Announce Type: new Abstract: Diffusion models have been used for probabilistic time series forecasting and show strong potential. However, fixed noise schedules often produce intermediate states that are hard to invert and a terminal state that deviates from the...

1 min 1 month, 2 weeks ago

LOW Academic International

Maximizing the Spectral Energy Gain in Sub-1-Bit LLMs via Latent Geometry Alignment

arXiv:2603.00042v1 Announce Type: new Abstract: We identify the Spectral Energy Gain in extreme model compression, where low-rank binary approximations outperform tiny-rank floating-point baselines for heavy-tailed spectra. However, prior attempts fail to realize this potential, trailing state-of-the-art 1-bit methods. We attribute...

1 min 1 month, 2 weeks ago

LOW Academic International

REMIND: Rethinking Medical High-Modality Learning under Missingness--A Long-Tailed Distribution Perspective

arXiv:2603.00046v1 Announce Type: new Abstract: Medical multi-modal learning is critical for integrating information from a large set of diverse modalities. However, when leveraging a high number of modalities in real clinical applications, it is often impractical to obtain full-modality observations...

1 min 1 month, 2 weeks ago
