Arbitration

LOW Academic International

Context Shapes LLMs Retrieval-Augmented Fact-Checking Effectiveness

arXiv:2602.14044v1 Announce Type: new Abstract: Large language models (LLMs) show strong reasoning abilities across diverse tasks, yet their performance on extended contexts remains inconsistent. While prior research has emphasized mid-context degradation in question answering, this study examines the impact of...

1 min 1 month, 1 week ago

bit

LOW Academic International

Mind the (DH) Gap! A Contrast in Risky Choices Between Reasoning and Conversational LLMs

arXiv:2602.15173v1 Announce Type: new Abstract: The use of large language models either as decision support systems, or in agentic workflows, is rapidly transforming the digital ecosystem. However, the understanding of LLM decision-making under uncertainty remains limited. We initiate a comparative...

1 min 1 month, 1 week ago

bit

LOW Academic International

IRPAPERS: A Visual Document Benchmark for Scientific Retrieval and Question Answering

arXiv:2602.17687v1 Announce Type: cross Abstract: AI systems have achieved remarkable success in processing text and relational data, yet visual document processing remains relatively underexplored. Whereas traditional systems require OCR transcriptions to convert these visual documents into text and metadata, recent...

1 min 1 month, 1 week ago

bit

LOW Academic International

A Case Study of Selected PTQ Baselines for Reasoning LLMs on Ascend NPU

arXiv:2602.17693v1 Announce Type: cross Abstract: Post-Training Quantization (PTQ) is crucial for efficient model deployment, yet its effectiveness on Ascend NPU remains under-explored compared to GPU architectures. This paper presents a case study of representative PTQ baselines applied to reasoning-oriented models...

1 min 1 month, 1 week ago

bit

LOW Academic International

Can LLM Safety Be Ensured by Constraining Parameter Regions?

arXiv:2602.17696v1 Announce Type: cross Abstract: Large language models (LLMs) are often assumed to contain ``safety regions'' -- parameter subsets whose modification directly influences safety behaviors. We conduct a systematic evaluation of four safety region identification methods spanning different parameter granularities,...

1 min 1 month, 1 week ago

bit

LOW Academic International

TPRU: Advancing Temporal and Procedural Understanding in Large Multimodal Models

arXiv:2602.18884v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs), particularly smaller, deployable variants, exhibit a critical deficiency in understanding temporal and procedural visual data, a bottleneck hindering their application in real-world embodied AI. This gap is largely caused by...

1 min 1 month, 1 week ago

bit

LOW Academic International

How Far Can We Go with Pixels Alone? A Pilot Study on Screen-Only Navigation in Commercial 3D ARPGs

arXiv:2602.18981v1 Announce Type: new Abstract: Modern 3D game levels rely heavily on visual guidance, yet the navigability of level layouts remains difficult to quantify. Prior work either simulates play in simplified environments or analyzes static screenshots for visual affordances, but...

1 min 1 month, 1 week ago

bit

LOW Academic International

Robust Exploration in Directed Controller Synthesis via Reinforcement Learning with Soft Mixture-of-Experts

arXiv:2602.19244v1 Announce Type: new Abstract: On-the-fly Directed Controller Synthesis (OTF-DCS) mitigates state-space explosion by incrementally exploring the system and relies critically on an exploration policy to guide search efficiently. Recent reinforcement learning (RL) approaches learn such policies and achieve promising...

1 min 1 month, 1 week ago

bit

LOW Academic International

Time Series, Vision, and Language: Exploring the Limits of Alignment in Contrastive Representation Spaces

arXiv:2602.19367v1 Announce Type: new Abstract: The Platonic Representation Hypothesis posits that learned representations from models trained on different modalities converge to a shared latent structure of the world. However, this hypothesis has largely been examined in vision and language, and...

1 min 1 month, 1 week ago

bit

LOW Academic International

The Million-Label NER: Breaking Scale Barriers with GLiNER bi-encoder

arXiv:2602.18487v1 Announce Type: new Abstract: This paper introduces GLiNER-bi-Encoder, a novel architecture for Named Entity Recognition (NER) that harmonizes zero-shot flexibility with industrial-scale efficiency. While the original GLiNER framework offers strong generalization, its joint-encoding approach suffers from quadratic complexity as...

1 min 1 month, 1 week ago

adr

LOW Academic International

Causal Identification from Counterfactual Data: Completeness and Bounding Results

arXiv:2602.23541v1 Announce Type: new Abstract: Previous work establishing completeness results for $\textit{counterfactual identification}$ has been circumscribed to the setting where the input data belongs to observational or interventional distributions (Layers 1 and 2 of Pearl's Causal Hierarchy), since it was...

1 min 1 month, 1 week ago

bit

LOW Academic International

SleepLM: Natural-Language Intelligence for Human Sleep

arXiv:2602.23605v1 Announce Type: new Abstract: We present SleepLM, a family of sleep-language foundation models that enable human sleep alignment, interpretation, and interaction with natural language. Despite the critical role of sleep, learning-based sleep analysis systems operate in closed label spaces...

1 min 1 month, 1 week ago

bit

LOW Academic International

Pessimistic Auxiliary Policy for Offline Reinforcement Learning

arXiv:2602.23974v1 Announce Type: new Abstract: Offline reinforcement learning aims to learn an agent from pre-collected datasets, avoiding unsafe and inefficient real-time interaction. However, inevitable access to out-ofdistribution actions during the learning process introduces approximation errors, causing the error accumulation and...

1 min 1 month, 1 week ago

bit

LOW Academic International

Recycling Failures: Salvaging Exploration in RLVR via Fine-Grained Off-Policy Guidance

arXiv:2602.24110v1 Announce Type: new Abstract: Reinforcement Learning from Verifiable Rewards (RLVR) has emerged as a powerful paradigm for enhancing the complex reasoning capabilities of Large Reasoning Models. However, standard outcome-based supervision suffers from a critical limitation that penalizes trajectories that...

1 min 1 month, 1 week ago

bit

LOW Academic International

Long Range Frequency Tuning for QML

arXiv:2602.23409v1 Announce Type: cross Abstract: Quantum machine learning models using angle encoding naturally represent truncated Fourier series, providing universal function approximation capabilities with sufficient circuit depth. For unary fixed-frequency encodings, circuit depth scales as O(omega_max * (omega_max + epsilon^{-2})) with...

1 min 1 month, 1 week ago

bit

LOW Academic International

Human Supervision as an Information Bottleneck: A Unified Theory of Error Floors in Human-Guided Learning

arXiv:2602.23446v1 Announce Type: cross Abstract: Large language models are trained primarily on human-generated data and feedback, yet they exhibit persistent errors arising from annotation noise, subjective preferences, and the limited expressive bandwidth of natural language. We argue that these limitations...

1 min 1 month, 1 week ago

bit

LOW Academic International

AI Runtime Infrastructure

arXiv:2603.00495v1 Announce Type: new Abstract: We introduce AI Runtime Infrastructure, a distinct execution-time layer that operates above the model and below the application, actively observing, reasoning over, and intervening in agent behavior to optimize task success, latency, token efficiency, reliability,...

1 min 1 month, 1 week ago

enforcement

LOW Academic International

Fair in Mind, Fair in Action? A Synchronous Benchmark for Understanding and Generation in UMLLMs

arXiv:2603.00590v1 Announce Type: new Abstract: As artificial intelligence (AI) is increasingly deployed across domains, ensuring fairness has become a core challenge. However, the field faces a "Tower of Babel'' dilemma: fairness metrics abound, yet their underlying philosophical assumptions often conflict,...

1 min 1 month, 1 week ago

bit

LOW Academic International

Embracing Anisotropy: Turning Massive Activations into Interpretable Control Knobs for Large Language Models

arXiv:2603.00029v1 Announce Type: new Abstract: Large Language Models (LLMs) exhibit highly anisotropic internal representations, often characterized by massive activations, a phenomenon where a small subset of feature dimensions possesses magnitudes significantly larger than the rest. While prior works view these...

1 min 1 month, 1 week ago

bit

LOW Academic International

SimpleTool: Parallel Decoding for Real-Time LLM Function Calling

arXiv:2603.00030v1 Announce Type: new Abstract: LLM-based function calling enables intelligent agents to interact with external tools and environments, yet autoregressive decoding imposes a fundamental latency bottleneck that limits real-time applications such as embodied intelligence, game AI, and interactive avatars (e.g.,...

1 min 1 month, 1 week ago

bit

LOW Academic International

Engineering Reasoning and Instruction (ERI) Benchmark: A Large Taxonomy-driven Dataset for Foundation Models and Agents

arXiv:2603.02239v1 Announce Type: new Abstract: The Engineering Reasoning and Instruction (ERI) benchmark is a taxonomy-driven instruction dataset designed to train and evaluate engineering-capable large language models (LLMs) and agents. This dataset spans nine engineering fields (namely: civil, mechanical, electrical, chemical,...

1 min 1 month, 1 week ago

bit

LOW Academic International

SUN: Shared Use of Next-token Prediction for Efficient Multi-LLM Disaggregated Serving

arXiv:2603.02599v1 Announce Type: new Abstract: In multi-model LLM serving, decode execution remains inefficient due to model-specific resource partitioning: since cross-model batching is not possible, memory-bound decoding often suffers from severe GPU underutilization, especially under skewed workloads. We propose Shared Use...

1 min 1 month, 1 week ago

bit

LOW Academic International

LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Optimization

arXiv:2603.02680v1 Announce Type: new Abstract: While Large Language Models (LLMs) form the cornerstone of sequential decision-making agent development, they have inherent limitations in high-frequency decision tasks. Existing research mainly focuses on discrete embodied decision scenarios with low-frequency and significant semantic...

1 min 1 month, 1 week ago

bit

LOW Academic International

Characterizing Memorization in Diffusion Language Models: Generalized Extraction and Sampling Effects

arXiv:2603.02333v1 Announce Type: new Abstract: Autoregressive language models (ARMs) have been shown to memorize and occasionally reproduce training data verbatim, raising concerns about privacy and copyright liability. Diffusion language models (DLMs) have recently emerged as a competitive alternative, yet their...

1 min 1 month, 1 week ago

bit

LOW Academic International

MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation

arXiv:2603.03680v1 Announce Type: new Abstract: Large Language Model (LLM) agents have demonstrated remarkable proficiency in learned tasks, yet they often struggle to adapt to non-stationary environments with feedback. While In-Context Learning and external memory offer some flexibility, they fail to...

1 min 1 month, 1 week ago

bit

LOW Academic International

In-Context Environments Induce Evaluation-Awareness in Language Models

arXiv:2603.03824v1 Announce Type: new Abstract: Humans often become more self-aware under threat, yet can lose self-awareness when absorbed in a task; we hypothesize that language models exhibit environment-dependent \textit{evaluation awareness}. This raises concerns that models could strategically underperform, or \textit{sandbag},...

1 min 1 month, 1 week ago

bit

LOW Academic International

Capability Thresholds and Manufacturing Topology: How Embodied Intelligence Triggers Phase Transitions in Economic Geography

arXiv:2603.04457v1 Announce Type: new Abstract: The fundamental topology of manufacturing has not undergone a paradigm-level transformation since Henry Ford's moving assembly line in 1913. Every major innovation of the past century, from the Toyota Production System to Industry 4.0, has...

1 min 1 month, 1 week ago

bit

LOW Academic International

When Agents Persuade: Propaganda Generation and Mitigation in LLMs

arXiv:2603.04636v1 Announce Type: new Abstract: Despite their wide-ranging benefits, LLM-based agents deployed in open environments can be exploited to produce manipulative material. In this study, we task LLMs with propaganda objectives and analyze their outputs using two domain-specific models: one...

1 min 1 month, 1 week ago

bit

LOW Academic International

Breaking Contextual Inertia: Reinforcement Learning with Single-Turn Anchors for Stable Multi-Turn Interaction

arXiv:2603.04783v1 Announce Type: new Abstract: While LLMs demonstrate strong reasoning capabilities when provided with full information in a single turn, they exhibit substantial vulnerability in multi-turn interactions. Specifically, when information is revealed incrementally or requires updates, models frequently fail to...

1 min 1 month, 1 week ago

bit

LOW Academic International

SalamahBench: Toward Standardized Safety Evaluation for Arabic Language Models

arXiv:2603.04410v1 Announce Type: new Abstract: Safety alignment in Language Models (LMs) is fundamental for trustworthy AI. However, while different stakeholders are trying to leverage Arabic Language Models (ALMs), systematic safety evaluation of ALMs remains largely underexplored, limiting their mainstream uptake....

1 min 1 month, 1 week ago

bit

Context Shapes LLMs Retrieval-Augmented Fact-Checking Effectiveness

Mind the (DH) Gap! A Contrast in Risky Choices Between Reasoning and Conversational LLMs

IRPAPERS: A Visual Document Benchmark for Scientific Retrieval and Question Answering

A Case Study of Selected PTQ Baselines for Reasoning LLMs on Ascend NPU

Can LLM Safety Be Ensured by Constraining Parameter Regions?

TPRU: Advancing Temporal and Procedural Understanding in Large Multimodal Models

How Far Can We Go with Pixels Alone? A Pilot Study on Screen-Only Navigation in Commercial 3D ARPGs

Robust Exploration in Directed Controller Synthesis via Reinforcement Learning with Soft Mixture-of-Experts

Time Series, Vision, and Language: Exploring the Limits of Alignment in Contrastive Representation Spaces

The Million-Label NER: Breaking Scale Barriers with GLiNER bi-encoder

Causal Identification from Counterfactual Data: Completeness and Bounding Results

SleepLM: Natural-Language Intelligence for Human Sleep

Pessimistic Auxiliary Policy for Offline Reinforcement Learning

Recycling Failures: Salvaging Exploration in RLVR via Fine-Grained Off-Policy Guidance

Long Range Frequency Tuning for QML

Human Supervision as an Information Bottleneck: A Unified Theory of Error Floors in Human-Guided Learning

AI Runtime Infrastructure

Fair in Mind, Fair in Action? A Synchronous Benchmark for Understanding and Generation in UMLLMs

Embracing Anisotropy: Turning Massive Activations into Interpretable Control Knobs for Large Language Models

SimpleTool: Parallel Decoding for Real-Time LLM Function Calling

Engineering Reasoning and Instruction (ERI) Benchmark: A Large Taxonomy-driven Dataset for Foundation Models and Agents

SUN: Shared Use of Next-token Prediction for Efficient Multi-LLM Disaggregated Serving

LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Optimization

Characterizing Memorization in Diffusion Language Models: Generalized Extraction and Sampling Effects

MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation

In-Context Environments Induce Evaluation-Awareness in Language Models

Capability Thresholds and Manufacturing Topology: How Embodied Intelligence Triggers Phase Transitions in Economic Geography

When Agents Persuade: Propaganda Generation and Mitigation in LLMs

Breaking Contextual Inertia: Reinforcement Learning with Single-Turn Anchors for Stable Multi-Turn Interaction

SalamahBench: Toward Standardized Safety Evaluation for Arabic Language Models

Impact Distribution

Related Practice Areas

JCG, PC

HSOLLC Co., Ltd.