Arbitration

LOW Academic International

MaBERT: A Padding Safe Interleaved Transformer Mamba Hybrid Encoder for Efficient Extended Context Masked Language Modeling

arXiv:2603.03001v1 Announce Type: new Abstract: Self attention encoders such as Bidirectional Encoder Representations from Transformers (BERT) scale quadratically with sequence length, making long context modeling expensive. Linear time state space models, such as Mamba, are efficient; however, they show limitations in...

1 min 1 month, 1 week ago

adr

LOW Academic International

Routing Absorption in Sparse Attention: Why Random Gates Are Hard to Beat

arXiv:2603.02227v1 Announce Type: cross Abstract: Can a transformer learn which attention entries matter during training? In principle, yes: attention distributions are highly concentrated, and a small gate network can identify the important entries post-hoc with near-perfect accuracy. In practice, barely....

1 min 1 month, 1 week ago

bit

LOW Academic International

Concept Heterogeneity-aware Representation Steering

arXiv:2603.02237v1 Announce Type: new Abstract: Representation steering offers a lightweight mechanism for controlling the behavior of large language models (LLMs) by intervening on internal activations at inference time. Most existing methods rely on a single global steering direction, typically obtained...

1 min 1 month, 1 week ago

bit

LOW Academic International

A Comparative Study of UMAP and Other Dimensionality Reduction Methods

arXiv:2603.02275v1 Announce Type: new Abstract: Uniform Manifold Approximation and Projection (UMAP) is a widely used manifold learning technique for dimensionality reduction. This paper studies UMAP, supervised UMAP, and several competing dimensionality reduction methods, including Principal Component Analysis (PCA), Kernel PCA,...

1 min 1 month, 1 week ago

bit

LOW Conference International

CVPR 2026 Media Center

1 min 1 month, 1 week ago

bit

LOW Conference International

CVPR 2026 News and Resources for Press

1 min 1 month, 1 week ago

bit

LOW Academic International

CIRCUS: Circuit Consensus under Uncertainty via Stability Ensembles

arXiv:2603.00523v1 Announce Type: new Abstract: Mechanistic circuit discovery is notoriously sensitive to arbitrary analyst choices, especially pruning thresholds and feature dictionaries, often yielding brittle "one-shot" explanations with no principled notion of uncertainty. We reframe circuit discovery as an uncertainty-quantification problem...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

Polynomial Mixing for Efficient Self-supervised Speech Encoders

arXiv:2603.00683v1 Announce Type: new Abstract: State-of-the-art speech-to-text models typically employ Transformer-based encoders that model token dependencies via self-attention mechanisms. However, the quadratic complexity of self-attention in both memory and computation imposes significant constraints on scalability. In this work, we propose...

1 min 1 month, 2 weeks ago

adr

LOW Academic International

RLAR: An Agentic Reward System for Multi-task Reinforcement Learning on Large Language Models

arXiv:2603.00724v1 Announce Type: new Abstract: Large language model alignment via reinforcement learning depends critically on reward function quality. However, static, domain-specific reward models are often costly to train and exhibit poor generalization in out-of-distribution scenarios encountered during RL iterations. We...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

MedGPT-oss: Training a General-Purpose Vision-Language Model for Biomedicine

arXiv:2603.00842v1 Announce Type: new Abstract: Biomedical multimodal assistants have the potential to unify radiology, pathology, and clinical-text reasoning, yet a critical deployment gap remains: top-performing systems are either closed-source or computationally prohibitive, precluding the on-premises deployment required for patient privacy...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

CHIMERA: Compact Synthetic Data for Generalizable LLM Reasoning

arXiv:2603.00889v1 Announce Type: new Abstract: Large Language Models (LLMs) have recently exhibited remarkable reasoning capabilities, largely enabled by supervised fine-tuning (SFT)- and reinforcement learning (RL)-based post-training on high-quality reasoning data. However, reproducing and extending these capabilities in open and scalable...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

Prompt Sensitivity and Answer Consistency of Small Open-Source Large Language Models on Clinical Question Answering: Implications for Low-Resource Healthcare Deployment

arXiv:2603.00917v1 Announce Type: new Abstract: Small open-source language models are gaining attention for low-resource healthcare settings, but their reliability under different prompt phrasings remains poorly understood. We evaluated five open-source models (Gemma 2 2B, Phi-3 Mini 3.8B, Llama 3.2 3B,...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

Thoth: Mid-Training Bridges LLMs to Time Series Understanding

arXiv:2603.01042v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated remarkable success in general-purpose reasoning. However, they still struggle to understand and reason about time series data, which limits their effectiveness in decision-making scenarios that depend on temporal dynamics....

1 min 1 month, 2 weeks ago

bit

LOW Academic International

Maximizing the Spectral Energy Gain in Sub-1-Bit LLMs via Latent Geometry Alignment

arXiv:2603.00042v1 Announce Type: new Abstract: We identify the Spectral Energy Gain in extreme model compression, where low-rank binary approximations outperform tiny-rank floating-point baselines for heavy-tailed spectra. However, prior attempts fail to realize this potential, trailing state-of-the-art 1-bit methods. We attribute...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

Breaking the Factorization Barrier in Diffusion Language Models

arXiv:2603.00045v1 Announce Type: new Abstract: Diffusion language models theoretically allow for efficient parallel generation but are practically hindered by the "factorization barrier": the assumption that simultaneously predicted tokens are independent. This limitation forces a trade-off: models must either sacrifice speed...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

REMIND: Rethinking Medical High-Modality Learning under Missingness--A Long-Tailed Distribution Perspective

arXiv:2603.00046v1 Announce Type: new Abstract: Medical multi-modal learning is critical for integrating information from a large set of diverse modalities. However, when leveraging a high number of modalities in real clinical applications, it is often impractical to obtain full-modality observations...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

Mag-Mamba: Modeling Coupled spatiotemporal Asymmetry for POI Recommendation

arXiv:2603.00053v1 Announce Type: new Abstract: Next Point-of-Interest (POI) recommendation is a critical task in location-based services, yet it faces the fundamental challenge of coupled spatiotemporal asymmetry inherent in urban mobility. Specifically, transition intents between locations exhibit high asymmetry and are...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

Expert Divergence Learning for MoE-based Language Models

arXiv:2603.00054v1 Announce Type: new Abstract: The Mixture-of-Experts (MoE) architecture is a powerful technique for scaling language models, yet it often suffers from expert homogenization, where experts learn redundant functionalities, thereby limiting MoE's full potential. To address this, we introduce Expert...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

MAML-KT: Addressing Cold Start Problem in Knowledge Tracing for New Students via Few-Shot Model-Agnostic Meta Learning

arXiv:2603.00137v1 Announce Type: new Abstract: Knowledge tracing (KT) models are commonly evaluated by training on early interactions from all students and testing on later responses. While effective for measuring average predictive performance, this evaluation design obscures a cold start scenario...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

Physics-Aware Learnability: From Set-Theoretic Independence to Operational Constraints

arXiv:2603.00417v1 Announce Type: new Abstract: Beyond binary classification, learnability can become a logically fragile notion: in EMX, even the class of all finite subsets of $[0,1]$ is learnable in some models of ZFC and not in others. We argue the...

1 min 1 month, 2 weeks ago

bit

LOW Conference International

2026 Expo Schedule

1 min 1 month, 2 weeks ago

bit

LOW Academic International

LLM-Driven Multi-Turn Task-Oriented Dialogue Synthesis for Realistic Reasoning

arXiv:2602.23610v1 Announce Type: new Abstract: The reasoning capability of large language models (LLMs), defined as their ability to analyze, infer, and make decisions based on input information, is essential for building intelligent task-oriented dialogue systems. However, existing benchmarks do not...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

MT-PingEval: Evaluating Multi-Turn Collaboration with Private Information Games

arXiv:2602.24188v1 Announce Type: new Abstract: We present a scalable methodology for evaluating language models in multi-turn interactions, using a suite of collaborative games that require effective communication about private information. This enables an interactive scaling analysis, in which a fixed...

1 min 1 month, 2 weeks ago

adr

LOW Academic International

HiDrop: Hierarchical Vision Token Reduction in MLLMs via Late Injection, Concave Pyramid Pruning, and Early Exit

arXiv:2602.23699v1 Announce Type: cross Abstract: The quadratic computational cost of processing vision tokens in Multimodal Large Language Models (MLLMs) hinders their widespread adoption. While progressive vision token pruning offers a promising solution, current methods misinterpret shallow layer functions and use...

1 min 1 month, 2 weeks ago

adr

LOW Academic International

Dynamics of Learning under User Choice: Overspecialization and Peer-Model Probing

arXiv:2602.23565v1 Announce Type: new Abstract: In many economically relevant contexts where machine learning is deployed, multiple platforms obtain data from the same pool of users, each of whom selects the platform that best serves them. Prior work in this setting...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

FlexGuard: Continuous Risk Scoring for Strictness-Adaptive LLM Content Moderation

arXiv:2602.23636v1 Announce Type: new Abstract: Ensuring the safety of LLM-generated content is essential for real-world deployment. Most existing guardrail models formulate moderation as a fixed binary classification task, implicitly assuming a fixed definition of harmfulness. In practice, enforcement strictness -...

1 min 1 month, 2 weeks ago

enforcement

LOW Academic International

Optimizer-Induced Low-Dimensional Drift and Transverse Dynamics in Transformer Training

arXiv:2602.23696v1 Announce Type: new Abstract: We study the geometry of training trajectories in small transformer models and find that parameter updates organize into a dominant drift direction with transverse residual dynamics. Using uncentered, row-normalized trajectory PCA, we show that a...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

PreScience: A Benchmark for Forecasting Scientific Contributions

arXiv:2602.20459v1 Announce Type: new Abstract: Can AI systems trained on the scientific record up to a fixed point in time forecast the scientific advances that follow? Such a capability could help researchers identify collaborators and impactful research directions, and anticipate...

1 min 1 month, 2 weeks ago

adr

LOW Academic International

Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination

arXiv:2602.20517v1 Announce Type: new Abstract: Effective human-AI coordination requires artificial agents capable of exhibiting and responding to human-like behaviors while adapting to changing contexts. Imitation learning has emerged as one of the prominent approaches to build such agents by training...

1 min 1 month, 2 weeks ago

bit

LOW Academic International

Recursive Belief Vision Language Model

arXiv:2602.20659v1 Announce Type: new Abstract: Current vision-language-action (VLA) models struggle with long-horizon manipulation under partial observability. Most existing approaches remain observation-driven, relying on short context windows or repeated queries to vision-language models (VLMs). This leads to loss of task progress,...

1 min 1 month, 2 weeks ago

bit
