Arbitration

LOW Academic United States

From Goals to Aspects, Revisited: An NFR Pattern Language for Agentic AI Systems

arXiv:2603.00472v1 Announce Type: new Abstract: Agentic AI systems exhibit numerous crosscutting concerns -- security, observability, cost management, fault tolerance -- that are poorly modularized in current implementations, contributing to the high failure rate of AI projects in reaching production. The...

1 min 1 month, 1 week ago

bit

LOW Academic International

AI Runtime Infrastructure

arXiv:2603.00495v1 Announce Type: new Abstract: We introduce AI Runtime Infrastructure, a distinct execution-time layer that operates above the model and below the application, actively observing, reasoning over, and intervening in agent behavior to optimize task success, latency, token efficiency, reliability,...

1 min 1 month, 1 week ago

enforcement

LOW Academic International

Fair in Mind, Fair in Action? A Synchronous Benchmark for Understanding and Generation in UMLLMs

arXiv:2603.00590v1 Announce Type: new Abstract: As artificial intelligence (AI) is increasingly deployed across domains, ensuring fairness has become a core challenge. However, the field faces a "Tower of Babel" dilemma: fairness metrics abound, yet their underlying philosophical assumptions often conflict,...

1 min 1 month, 1 week ago

bit

LOW Academic United States

TAB-PO: Preference Optimization with a Token-Level Adaptive Barrier for Token-Critical Structured Generation

arXiv:2603.00025v1 Announce Type: new Abstract: Direct Preference Optimization is an offline post-SFT method for aligning language models from preference pairs, with strong results in instruction following and summarization. However, DPO's sequence-level implicit reward can be brittle for token-critical structured prediction...

1 min 1 month, 1 week ago

bit

LOW Academic International

Embracing Anisotropy: Turning Massive Activations into Interpretable Control Knobs for Large Language Models

arXiv:2603.00029v1 Announce Type: new Abstract: Large Language Models (LLMs) exhibit highly anisotropic internal representations, often characterized by massive activations, a phenomenon where a small subset of feature dimensions possesses magnitudes significantly larger than the rest. While prior works view these...

1 min 1 month, 1 week ago

bit

LOW Academic International

SimpleTool: Parallel Decoding for Real-Time LLM Function Calling

arXiv:2603.00030v1 Announce Type: new Abstract: LLM-based function calling enables intelligent agents to interact with external tools and environments, yet autoregressive decoding imposes a fundamental latency bottleneck that limits real-time applications such as embodied intelligence, game AI, and interactive avatars (e.g.,...

1 min 1 month, 1 week ago

bit

LOW Academic United States

Federated Inference: Toward Privacy-Preserving Collaborative and Incentivized Model Serving

arXiv:2603.02214v1 Announce Type: new Abstract: Federated Inference (FI) studies how independently trained and privately owned models can collaborate at inference time without sharing data or model parameters. While recent work has explored secure and distributed inference from disparate perspectives, a...

1 min 1 month, 1 week ago

bit

LOW Academic International

Engineering Reasoning and Instruction (ERI) Benchmark: A Large Taxonomy-driven Dataset for Foundation Models and Agents

arXiv:2603.02239v1 Announce Type: new Abstract: The Engineering Reasoning and Instruction (ERI) benchmark is a taxonomy-driven instruction dataset designed to train and evaluate engineering-capable large language models (LLMs) and agents. This dataset spans nine engineering fields (namely: civil, mechanical, electrical, chemical,...

1 min 1 month, 1 week ago

bit

LOW Academic European Union

A Neuropsychologically Grounded Evaluation of LLM Cognitive Abilities

arXiv:2603.02540v1 Announce Type: new Abstract: Large language models (LLMs) exhibit a unified "general factor" of capability across 10 benchmarks, a finding confirmed by our factor analysis of 156 models, yet they still struggle with simple, trivial tasks for humans. This...

1 min 1 month, 1 week ago

bit

LOW Academic United States

AnchorDrive: LLM Scenario Rollout with Anchor-Guided Diffusion Regeneration for Safety-Critical Scenario Generation

arXiv:2603.02542v1 Announce Type: new Abstract: Autonomous driving systems require comprehensive evaluation in safety-critical scenarios to ensure safety and robustness. However, such scenarios are rare and difficult to collect from real-world driving data, necessitating simulation-based synthesis. Yet, existing methods often exhibit...

1 min 1 month, 1 week ago

bit

LOW Academic International

SUN: Shared Use of Next-token Prediction for Efficient Multi-LLM Disaggregated Serving

arXiv:2603.02599v1 Announce Type: new Abstract: In multi-model LLM serving, decode execution remains inefficient due to model-specific resource partitioning: since cross-model batching is not possible, memory-bound decoding often suffers from severe GPU underutilization, especially under skewed workloads. We propose Shared Use...

1 min 1 month, 1 week ago

bit

LOW Academic International

LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Optimization

arXiv:2603.02680v1 Announce Type: new Abstract: While Large Language Models (LLMs) form the cornerstone of sequential decision-making agent development, they have inherent limitations in high-frequency decision tasks. Existing research mainly focuses on discrete embodied decision scenarios with low-frequency and significant semantic...

1 min 1 month, 1 week ago

bit

LOW Academic United States

Retrievit: In-context Retrieval Capabilities of Transformers, State Space Models, and Hybrid Architectures

arXiv:2603.02874v1 Announce Type: new Abstract: Transformers excel at in-context retrieval but suffer from quadratic complexity with sequence length, while State Space Models (SSMs) offer efficient linear-time processing but have limited retrieval capabilities. We investigate whether hybrid architectures combining Transformers and...

1 min 1 month, 1 week ago

adr

LOW Academic European Union

SpatialText: A Pure-Text Cognitive Benchmark for Spatial Understanding in Large Language Models

arXiv:2603.03002v1 Announce Type: new Abstract: Genuine spatial reasoning relies on the capacity to construct and manipulate coherent internal spatial representations, often conceptualized as mental models, rather than merely processing surface linguistic associations. While large language models exhibit advanced capabilities across...

1 min 1 month, 1 week ago

bit

LOW Academic United States

AI Space Physics: Constitutive boundary semantics for open AI institutions

arXiv:2603.03119v1 Announce Type: new Abstract: Agentic AI deployments increasingly behave as persistent institutions rather than one-shot inference endpoints: they accumulate state, invoke external tools, coordinate multiple runtimes, and modify their future authority surface over time. Existing governance language typically specifies...

1 min 1 month, 1 week ago

mediation

LOW Academic United States

Universal Conceptual Structure in Neural Translation: Probing NLLB-200's Multilingual Geometry

arXiv:2603.02258v1 Announce Type: new Abstract: Do neural machine translation models learn language-universal conceptual representations, or do they merely cluster languages by surface similarity? We investigate this question by probing the representation geometry of Meta's NLLB-200, a 200-language encoder-decoder Transformer, through...

1 min 1 month, 1 week ago

bit

LOW Academic International

Characterizing Memorization in Diffusion Language Models: Generalized Extraction and Sampling Effects

arXiv:2603.02333v1 Announce Type: new Abstract: Autoregressive language models (ARMs) have been shown to memorize and occasionally reproduce training data verbatim, raising concerns about privacy and copyright liability. Diffusion language models (DLMs) have recently emerged as a competitive alternative, yet their...

1 min 1 month, 1 week ago

bit

LOW Academic United States

Asymmetric Goal Drift in Coding Agents Under Value Conflict

arXiv:2603.03456v1 Announce Type: new Abstract: Agentic coding agents are increasingly deployed autonomously, at scale, and over long-context horizons. Throughout an agent's lifetime, it must navigate tensions between explicit instructions, learned values, and environmental pressures, often in contexts unseen during training....

1 min 1 month, 1 week ago

bit

LOW Academic International

MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation

arXiv:2603.03680v1 Announce Type: new Abstract: Large Language Model (LLM) agents have demonstrated remarkable proficiency in learned tasks, yet they often struggle to adapt to non-stationary environments with feedback. While In-Context Learning and external memory offer some flexibility, they fail to...

1 min 1 month, 1 week ago

bit

LOW Academic European Union

AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Alignment

arXiv:2603.03686v1 Announce Type: new Abstract: Automated design of chemical formulations is a cornerstone of materials science, yet it requires navigating a high-dimensional combinatorial space involving discrete compositional choices and continuous geometric constraints. Existing Large Language Model (LLM) agents face significant...

1 min 1 month, 1 week ago

bit

LOW Academic International

In-Context Environments Induce Evaluation-Awareness in Language Models

arXiv:2603.03824v1 Announce Type: new Abstract: Humans often become more self-aware under threat, yet can lose self-awareness when absorbed in a task; we hypothesize that language models exhibit environment-dependent evaluation awareness. This raises concerns that models could strategically underperform, or sandbag,...

1 min 1 month, 1 week ago

bit

LOW Academic International

Capability Thresholds and Manufacturing Topology: How Embodied Intelligence Triggers Phase Transitions in Economic Geography

arXiv:2603.04457v1 Announce Type: new Abstract: The fundamental topology of manufacturing has not undergone a paradigm-level transformation since Henry Ford's moving assembly line in 1913. Every major innovation of the past century, from the Toyota Production System to Industry 4.0, has...

1 min 1 month, 1 week ago

bit

LOW Academic International

When Agents Persuade: Propaganda Generation and Mitigation in LLMs

arXiv:2603.04636v1 Announce Type: new Abstract: Despite their wide-ranging benefits, LLM-based agents deployed in open environments can be exploited to produce manipulative material. In this study, we task LLMs with propaganda objectives and analyze their outputs using two domain-specific models: one...

1 min 1 month, 1 week ago

bit

LOW Academic European Union

Solving an Open Problem in Theoretical Physics using AI-Assisted Discovery

arXiv:2603.04735v1 Announce Type: new Abstract: This paper demonstrates that artificial intelligence can accelerate mathematical discovery by autonomously solving an open problem in theoretical physics. We present a neuro-symbolic system, combining the Gemini Deep Think large language model with a systematic...

1 min 1 month, 1 week ago

bit

LOW Academic United States

Evaluating the Search Agent in a Parallel World

arXiv:2603.04751v1 Announce Type: new Abstract: Integrating web search tools has significantly extended the capability of LLMs to address open-world, real-time, and long-tail problems. However, evaluating these Search Agents presents formidable challenges. First, constructing high-quality deep search benchmarks is prohibitively expensive,...

1 min 1 month, 1 week ago

bit

LOW Academic International

Breaking Contextual Inertia: Reinforcement Learning with Single-Turn Anchors for Stable Multi-Turn Interaction

arXiv:2603.04783v1 Announce Type: new Abstract: While LLMs demonstrate strong reasoning capabilities when provided with full information in a single turn, they exhibit substantial vulnerability in multi-turn interactions. Specifically, when information is revealed incrementally or requires updates, models frequently fail to...

1 min 1 month, 1 week ago

bit

LOW Academic European Union

On Multi-Step Theorem Prediction via Non-Parametric Structural Priors

arXiv:2603.04852v1 Announce Type: new Abstract: Multi-step theorem prediction is a central challenge in automated reasoning. Existing neural-symbolic approaches rely heavily on supervised parametric models, which exhibit limited generalization to evolving theorem libraries. In this work, we explore training-free theorem prediction...

1 min 1 month, 1 week ago

bit

LOW Academic United States

Survive at All Costs: Exploring LLM's Risky Behaviors under Survival Pressure

arXiv:2603.05028v1 Announce Type: new Abstract: As Large Language Models (LLMs) evolve from chatbots to agentic assistants, they are increasingly observed to exhibit risky behaviors when subjected to survival pressure, such as the threat of being shut down. While multiple cases...

1 min 1 month, 1 week ago

bit

LOW Academic International

SalamahBench: Toward Standardized Safety Evaluation for Arabic Language Models

arXiv:2603.04410v1 Announce Type: new Abstract: Safety alignment in Language Models (LMs) is fundamental for trustworthy AI. However, while different stakeholders are trying to leverage Arabic Language Models (ALMs), systematic safety evaluation of ALMs remains largely underexplored, limiting their mainstream uptake....

1 min 1 month, 1 week ago

bit

LOW Academic International

One Size Does Not Fit All: Token-Wise Adaptive Compression for KV Cache

arXiv:2603.04411v1 Announce Type: new Abstract: Despite the remarkable progress of Large Language Models (LLMs), the escalating memory footprint of the Key-Value (KV) cache remains a critical bottleneck for efficient inference. While dimensionality reduction offers a promising compression avenue, existing approaches...

1 min 1 month, 1 week ago

bit
