Arbitration

LOW Academic International

MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens

arXiv:2603.23516v1 Announce Type: new Abstract: Long-term memory is a cornerstone of human intelligence. Enabling AI to process lifetime-scale information remains a long-standing pursuit in the field. Due to the constraints of full-attention architectures, the effective context length of large language...

1 min 3 weeks, 1 day ago

LOW Academic International

Visuospatial Perspective Taking in Multimodal Language Models

arXiv:2603.23510v1 Announce Type: new Abstract: As multimodal language models (MLMs) are increasingly used in social and collaborative settings, it is crucial to evaluate their perspective-taking abilities. Existing benchmarks largely rely on text-based vignettes or static scene understanding, leaving visuospatial perspective-taking...

1 min 3 weeks, 1 day ago

LOW Academic International

The Compression Paradox in LLM Inference: Provider-Dependent Energy Effects of Prompt Compression

arXiv:2603.23528v1 Announce Type: new Abstract: The rapid proliferation of Large Language Models has created an environmental paradox: the very technology that could help solve climate challenges is itself becoming a significant contributor to global carbon emissions. We test whether prompt...

1 min 3 weeks, 1 day ago

LOW Academic United States

Compression Method Matters: Benchmark-Dependent Output Dynamics in LLM Prompt Compression

arXiv:2603.23527v1 Announce Type: new Abstract: Prompt compression is often evaluated by input-token reduction, but its real deployment impact depends on how compression changes output length and total inference cost. We present a controlled replication and extension study of benchmark-dependent output...

1 min 3 weeks, 1 day ago

LOW Academic International

The Diminishing Returns of Early-Exit Decoding in Modern LLMs

arXiv:2603.23701v1 Announce Type: new Abstract: In Large Language Model (LLM) inference, early-exit refers to stopping computation at an intermediate layer once the prediction is sufficiently confident, thereby reducing latency and cost. However, recent LLMs adopt improved pretraining recipes and architectures...

1 min 3 weeks, 2 days ago

LOW Academic International

PoliticsBench: Benchmarking Political Values in Large Language Models with Multi-Turn Roleplay

arXiv:2603.23841v1 Announce Type: new Abstract: While Large Language Models (LLMs) are increasingly used as primary sources of information, their potential for political bias may impact their objectivity. Existing benchmarks of LLM social bias primarily evaluate gender and racial stereotypes. When...

1 min 3 weeks, 2 days ago

LOW Academic International

Implicit Turn-Wise Policy Optimization for Proactive User-LLM Interaction

arXiv:2603.23550v1 Announce Type: new Abstract: Multi-turn human-AI collaboration is fundamental to deploying interactive services such as adaptive tutoring, conversational recommendation, and professional consultation. However, optimizing these interactions via reinforcement learning is hindered by the sparsity of verifiable intermediate rewards and...

1 min 3 weeks, 2 days ago

LOW Academic International

PoiCGAN: A Targeted Poisoning Based on Feature-Label Joint Perturbation in Federated Learning

arXiv:2603.23574v1 Announce Type: new Abstract: Federated Learning (FL), as a popular distributed learning paradigm, has shown outstanding performance in improving computational efficiency and protecting data privacy, and is widely applied in industrial image classification. However, due to its distributed nature,...

1 min 3 weeks, 2 days ago

LOW Academic International

BXRL: Behavior-Explainable Reinforcement Learning

arXiv:2603.23738v1 Announce Type: new Abstract: A major challenge of Reinforcement Learning is that agents often learn undesired behaviors that seem to defy the reward structure they were given. Explainable Reinforcement Learning (XRL) methods can answer queries such as "explain this...

1 min 3 weeks, 2 days ago

LOW Academic European Union

Deep Neural Regression Collapse

arXiv:2603.23805v1 Announce Type: new Abstract: Neural Collapse is a phenomenon that helps identify sparse and low rank structures in deep classifiers. Recent work has extended the definition of neural collapse to regression problems, albeit only measuring the phenomenon at the...

1 min 3 weeks, 2 days ago

LOW Academic United States

Why the Maximum Second Derivative of Activations Matters for Adversarial Robustness

arXiv:2603.23860v1 Announce Type: new Abstract: This work investigates the critical role of activation function curvature -- quantified by the maximum second derivative $\max|\sigma''|$ -- in adversarial robustness. Using the Recursive Curvature-Tunable Activation Family (RCT-AF), which enables precise control over curvature...

1 min 3 weeks, 2 days ago

LOW Academic European Union

Can VLMs Reason Robustly? A Neuro-Symbolic Investigation

arXiv:2603.23867v1 Announce Type: new Abstract: Vision-Language Models (VLMs) have been applied to a wide range of reasoning tasks, yet it remains unclear whether they can reason robustly under distribution shifts. In this paper, we study covariate shifts in which the...

1 min 3 weeks, 2 days ago

LOW Academic European Union

Transcending Classical Neural Network Boundaries: A Quantum-Classical Synergistic Paradigm for Seismic Data Processing

arXiv:2603.23984v1 Announce Type: new Abstract: In recent years, a number of neural-network (NN) methods have exhibited good performance in seismic data processing, such as denoising, interpolation, and frequency-band extension. However, these methods rely on stacked perceptrons and standard activation functions,...

1 min 3 weeks, 2 days ago

LOW Academic International

Can we generate portable representations for clinical time series data using LLMs?

arXiv:2603.23987v1 Announce Type: new Abstract: Deploying clinical ML is slow and brittle: models that work at one hospital often degrade under distribution shifts at the next. In this work, we study a simple question -- can large language models (LLMs)...

1 min 3 weeks, 2 days ago

LOW Academic European Union

Stochastic Dimension-Free Zeroth-Order Estimator for High-Dimensional and High-Order PINNs

arXiv:2603.24002v1 Announce Type: new Abstract: Physics-Informed Neural Networks (PINNs) for high-dimensional and high-order partial differential equations (PDEs) are primarily constrained by the $\mathcal{O}(d^k)$ spatial derivative complexity and the $\mathcal{O}(P)$ memory overhead of backpropagation (BP). While randomized spatial estimators successfully reduce...

1 min 3 weeks, 2 days ago

LOW Academic United States

Chain-of-Authorization: Internalizing Authorization into Large Language Models via Reasoning Trajectories

arXiv:2603.22869v1 Announce Type: new Abstract: Large Language Models (LLMs) have become core cognitive components in modern artificial intelligence (AI) systems, combining internal knowledge with external context to perform complex tasks. However, LLMs typically treat all accessible data indiscriminately, lacking inherent...

1 min 3 weeks, 2 days ago

LOW Academic International

LGSE: Lexically Grounded Subword Embedding Initialization for Low-Resource Language Adaptation

arXiv:2603.22629v1 Announce Type: new Abstract: Adapting pretrained language models to low-resource, morphologically rich languages remains a significant challenge. Existing vocabulary expansion methods typically rely on arbitrarily segmented subword units, resulting in fragmented lexical representations and loss of critical morphological information....

1 min 3 weeks, 2 days ago

LOW Academic International

Functional Component Ablation Reveals Specialization Patterns in Hybrid Language Model Architectures

arXiv:2603.22473v1 Announce Type: new Abstract: Hybrid language models combining attention with state space models (SSMs) or linear attention offer improved efficiency, but whether both components are genuinely utilized remains unclear. We present a functional component ablation framework applied to two...

1 min 3 weeks, 2 days ago

LOW Academic United States

Describe-Then-Act: Proactive Agent Steering via Distilled Language-Action World Models

arXiv:2603.23149v1 Announce Type: new Abstract: Deploying safety-critical agents requires anticipating the consequences of actions before they are executed. While world models offer a paradigm for this proactive foresight, current approaches relying on visual simulation incur prohibitive latencies, often exceeding several...

1 min 3 weeks, 2 days ago

LOW Conference International

ICLR 2026 Career Opportunities

1 min 3 weeks, 2 days ago

LOW Academic International

Sparse but Critical: A Token-Level Analysis of Distributional Shifts in RLVR Fine-Tuning of LLMs

arXiv:2603.22446v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR) has significantly improved reasoning in large language models (LLMs), yet the token-level mechanisms underlying these improvements remain unclear. We present a systematic empirical study of RLVR's distributional effects organized...

1 min 3 weeks, 2 days ago

LOW Academic United States

AI Mental Models: Learned Intuition and Deliberation in a Bounded Neural Architecture

arXiv:2603.22561v1 Announce Type: new Abstract: This paper asks whether a bounded neural architecture can exhibit a meaningful division of labor between intuition and deliberation on a classic 64-item syllogistic reasoning benchmark. More broadly, the benchmark is relevant to ongoing debates...

1 min 3 weeks, 2 days ago

LOW Academic European Union

Beyond Preset Identities: How Agents Form Stances and Boundaries in Generative Societies

arXiv:2603.23406v1 Announce Type: new Abstract: While large language models simulate social behaviors, their capacity for stable stance formation and identity negotiation during complex interventions remains unclear. To overcome the limitations of static evaluations, this paper proposes a novel mixed-methods framework...

1 min 3 weeks, 2 days ago

LOW Academic International

Lie to Me: How Faithful Is Chain-of-Thought Reasoning in Reasoning Models?

arXiv:2603.22582v1 Announce Type: new Abstract: Chain-of-thought (CoT) reasoning has been proposed as a transparency mechanism for large language models in safety-critical deployments, yet its effectiveness depends on faithfulness (whether models accurately verbalize the factors that actually influence their outputs), a...

1 min 3 weeks, 2 days ago

LOW Academic European Union

Computational Arbitrage in AI Model Markets

arXiv:2603.22404v1 Announce Type: new Abstract: Consider a market of competing model providers selling query access to models with varying costs and capabilities. Customers submit problem instances and are willing to pay up to a budget for a verifiable solution. An...

1 min 3 weeks, 2 days ago

LOW Academic United States

DALDALL: Data Augmentation for Lexical and Semantic Diverse in Legal Domain by leveraging LLM-Persona

arXiv:2603.22765v1 Announce Type: new Abstract: Data scarcity remains a persistent challenge in low-resource domains. While existing data augmentation methods leverage the generative capabilities of large language models (LLMs) to produce large volumes of synthetic data, these approaches often prioritize quantity...

1 min 3 weeks, 3 days ago

LOW Academic International

Analysing LLM Persona Generation and Fairness Interpretation in Polarised Geopolitical Contexts

arXiv:2603.22837v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly utilised for social simulation and persona generation, necessitating an understanding of how they represent geopolitical identities. In this paper, we analyse personas generated for Palestinian and Israeli identities by...

1 min 3 weeks, 3 days ago

LOW Academic United States

Beyond Hate: Differentiating Uncivil and Intolerant Speech in Multimodal Content Moderation

arXiv:2603.22985v1 Announce Type: new Abstract: Current multimodal toxicity benchmarks typically use a single binary hatefulness label. This coarse approach conflates two fundamentally different characteristics of expression: tone and content. Drawing on communication science theory, we introduce a fine-grained annotation scheme...

1 min 3 weeks, 3 days ago

LOW Academic European Union

Decoding AI Authorship: Can LLMs Truly Mimic Human Style Across Literature and Politics?

arXiv:2603.23219v1 Announce Type: new Abstract: Amidst the rising capabilities of generative AI to mimic specific human styles, this study investigates the ability of state-of-the-art large language models (LLMs), including GPT-4o, Gemini 1.5 Pro, and Claude Sonnet 3.5, to emulate the...

1 min 3 weeks, 3 days ago

LOW Academic International

I Came, I Saw, I Explained: Benchmarking Multimodal LLMs on Figurative Meaning in Memes

arXiv:2603.23229v1 Announce Type: new Abstract: Internet memes represent a popular form of multimodal online communication and often use figurative elements to convey layered meaning through the combination of text and images. However, it remains largely unclear how multimodal large language...

1 min 3 weeks, 3 days ago
