International Law

LOW Academic International

Expert Personas Improve LLM Alignment but Damage Accuracy: Bootstrapping Intent-Based Persona Routing with PRISM

arXiv:2603.18507v1 Announce Type: new Abstract: Persona prompting can steer LLM generation towards a domain-specific tone and pattern. This behavior enables use cases in multi-agent systems where diverse interactions are crucial and human-centered tasks require high-level human alignment. Prior works provide...

1 min 1 month ago

LOW Academic International

FaithSteer-BENCH: A Deployment-Aligned Stress-Testing Benchmark for Inference-Time Steering

arXiv:2603.18329v1 Announce Type: new Abstract: Inference-time steering is widely regarded as a lightweight and parameter-free mechanism for controlling large language model (LLM) behavior, and prior work has often suggested that simple activation-level interventions can reliably induce targeted behavioral changes. However,...

1 min 1 month ago

LOW Academic European Union

Understanding the Theoretical Foundations of Deep Neural Networks through Differential Equations

arXiv:2603.18331v1 Announce Type: new Abstract: Deep neural networks (DNNs) have achieved remarkable empirical success, yet the absence of a principled theoretical foundation continues to hinder their systematic development. In this survey, we present differential equations as a theoretical foundation for...

1 min 1 month ago

LOW Academic International

DEAF: A Benchmark for Diagnostic Evaluation of Acoustic Faithfulness in Audio Language Models

arXiv:2603.18048v1 Announce Type: new Abstract: Recent Audio Multimodal Large Language Models (Audio MLLMs) demonstrate impressive performance on speech benchmarks, yet it remains unclear whether these models genuinely process acoustic signals or rely on text-based semantic inference. To systematically study this...

1 min 1 month ago

LOW Academic International

Agentic Flow Steering and Parallel Rollout Search for Spatially Grounded Text-to-Image Generation

arXiv:2603.18627v1 Announce Type: new Abstract: Precise Text-to-Image (T2I) generation has achieved great success but is hindered by the limited relational reasoning of static text encoders and the error accumulation in open-loop sampling. Without real-time feedback, initial semantic ambiguities during the...

1 min 1 month ago

LOW Academic International

An Onto-Relational-Sophic Framework for Governing Synthetic Minds

arXiv:2603.18633v1 Announce Type: new Abstract: The rapid evolution of artificial intelligence, from task-specific systems to foundation models exhibiting broad, flexible competence across reasoning, creative synthesis, and social interaction, has outpaced the conceptual and governance frameworks designed to manage it. Current...

1 min 1 month ago

LOW Academic United States

Consumer-to-Clinical Language Shifts in Ambient AI Draft Notes and Clinician-Finalized Documentation: A Multi-level Analysis

arXiv:2603.18327v1 Announce Type: new Abstract: Ambient AI generates draft clinical notes from patient-clinician conversations, often using lay or consumer-oriented phrasing to support patient understanding instead of standardized clinical terminology. How clinicians revise these drafts for professional documentation conventions remains unclear....

1 min 1 month ago

LOW Academic International

ZEBRAARENA: A Diagnostic Simulation Environment for Studying Reasoning-Action Coupling in Tool-Augmented LLMs

arXiv:2603.18614v1 Announce Type: new Abstract: Tool-augmented large language models (LLMs) must tightly couple multi-step reasoning with external actions, yet existing benchmarks often confound this interplay with complex environment dynamics, memorized knowledge or dataset contamination. In this paper, we introduce ZebraArena,...

1 min 1 month ago

LOW Academic United States

Balanced Thinking: Improving Chain of Thought Training in Vision Language Models

arXiv:2603.18656v1 Announce Type: new Abstract: Multimodal reasoning in vision-language models (VLMs) typically relies on a two-stage process: supervised fine-tuning (SFT) and reinforcement learning (RL). In standard SFT, all tokens contribute equally to the loss, even though reasoning data are inherently...

1 min 1 month ago

LOW Academic United States

Retrieval-Augmented LLM Agents: Learning to Learn from Experience

arXiv:2603.18272v1 Announce Type: new Abstract: While large language models (LLMs) have advanced the development of general-purpose agents, achieving robust generalization to unseen tasks remains a significant challenge. Current approaches typically rely on either fine-tuning or training-free memory-augmented generation using retrieved...

1 min 1 month ago

LOW Academic United States

Analysis Of Linguistic Stereotypes in Single and Multi-Agent Generative AI Architectures

arXiv:2603.18729v1 Announce Type: new Abstract: Many works in the literature show that LLM outputs exhibit discriminatory behaviour, triggering stereotype-based inferences based on the dialect in which the inputs are written. This bias has been shown to be particularly pronounced when...

1 min 1 month ago

LOW Academic International

Reasonably reasoning AI agents can avoid game-theoretic failures in zero-shot, provably

arXiv:2603.18563v1 Announce Type: new Abstract: AI agents are increasingly deployed in interactive economic environments characterized by repeated AI-AI interactions. Despite AI agents' advanced capabilities, empirical studies reveal that such interactions often fail to stably induce a strategic equilibrium, such as...

1 min 1 month ago

LOW Academic International

A Computationally Efficient Learning of Artificial Intelligence System Reliability Considering Error Propagation

arXiv:2603.18201v1 Announce Type: new Abstract: Artificial Intelligence (AI) systems are increasingly prominent in emerging smart cities, yet their reliability remains a critical concern. These systems typically operate through a sequence of interconnected functional stages, where upstream errors may propagate to...

1 min 1 month ago

LOW Academic United States

The Validity Gap in Health AI Evaluation: A Cross-Sectional Analysis of Benchmark Composition

arXiv:2603.18294v1 Announce Type: new Abstract: Background: Clinical trials rely on transparent inclusion criteria to ensure generalizability. In contrast, benchmarks validating health-related large language models (LLMs) rarely characterize the "patient" or "query" populations they contain. Without defined composition, aggregate performance metrics...

1 min 1 month ago

LOW Academic International

TeachingCoach: A Fine-Tuned Scaffolding Chatbot for Instructional Guidance to Instructors

arXiv:2603.18189v1 Announce Type: new Abstract: Higher education instructors often lack timely and pedagogically grounded support, as scalable instructional guidance remains limited and existing tools rely on generic chatbot advice or non-scalable teaching center human-human consultations. We present TeachingCoach, a pedagogically...

1 min 1 month ago

LOW Academic International

Thinking with Constructions: A Benchmark and Policy Optimization for Visual-Text Interleaved Geometric Reasoning

arXiv:2603.18662v1 Announce Type: new Abstract: Geometric reasoning inherently requires "thinking with constructions" -- the dynamic manipulation of visual aids to bridge the gap between problem conditions and solutions. However, existing Multimodal Large Language Models (MLLMs) are largely confined to passive...

1 min 1 month ago

LOW Academic United States

Continually self-improving AI

arXiv:2603.18073v1 Announce Type: new Abstract: Modern language model-based AI systems are remarkably powerful, yet their capabilities remain fundamentally capped by their human creators in three key ways. First, although a model's weights can be updated via fine-tuning, acquiring new knowledge...

1 min 1 month ago

LOW Academic European Union

From Weak Cues to Real Identities: Evaluating Inference-Driven De-Anonymization in LLM Agents

arXiv:2603.18382v1 Announce Type: new Abstract: Anonymization is widely treated as a practical safeguard because re-identifying anonymous records was historically costly, requiring domain expertise, tailored algorithms, and manual corroboration. We study a growing privacy risk that may weaken this barrier: LLM-based...

1 min 1 month ago

LOW Academic United States

Agentic Framework for Political Biography Extraction

arXiv:2603.18010v1 Announce Type: new Abstract: The production of large-scale political datasets typically demands extracting structured facts from vast piles of unstructured documents or web sources, a task that traditionally relies on expensive human experts and remains prohibitively difficult to automate...

1 min 1 month ago

LOW Academic International

MANAR: Memory-augmented Attention with Navigational Abstract Conceptual Representation

arXiv:2603.18676v1 Announce Type: new Abstract: The MANAR (Memory-augmented Attention with Navigational Abstract Conceptual Representation) contextualization layer generalizes standard multi-head attention (MHA) by instantiating the principles of Global Workspace Theory (GWT). While MHA enables unconstrained all-to-all communication, it lacks the functional bottleneck...

1 min 1 month ago

LOW Academic European Union

NeuroGame Transformer: Gibbs-Inspired Attention Driven by Game Theory and Statistical Physics

arXiv:2603.18761v1 Announce Type: new Abstract: Standard attention mechanisms in transformers are limited by their pairwise formulation, which hinders the modeling of higher-order dependencies among tokens. We introduce the NeuroGame Transformer (NGT) to overcome this by reconceptualizing attention through a dual...

1 min 1 month ago

LOW Academic International

Interplay: Training Independent Simulators for Reference-Free Conversational Recommendation

arXiv:2603.18573v1 Announce Type: new Abstract: Training conversational recommender systems (CRS) requires extensive dialogue data, which is challenging to collect at scale. To address this, researchers have used simulated user-recommender conversations. Traditional simulation approaches often utilize a single large language model...

1 min 1 month ago

LOW Academic European Union

Beyond Accuracy: An Explainability-Driven Analysis of Harmful Content Detection

arXiv:2603.18015v1 Announce Type: new Abstract: Although automated harmful content detection systems are frequently used to monitor online platforms, moderators and end users frequently cannot understand the logic underlying their predictions. While recent studies have focused on increasing classification accuracy, little...

1 min 1 month ago

LOW Academic United States

Can LLM generate interesting mathematical research problems?

arXiv:2603.18813v1 Announce Type: new Abstract: This paper is the second one in a series of work on the mathematical creativity of LLM. In the first paper, the authors proposed three criteria for evaluating the mathematical creativity of LLM and constructed...

1 min 1 month ago

LOW Academic International

Do Large Language Models Possess a Theory of Mind? A Comparative Evaluation Using the Strange Stories Paradigm

arXiv:2603.18007v1 Announce Type: new Abstract: The study explores whether current Large Language Models (LLMs) exhibit Theory of Mind (ToM) capabilities -- specifically, the ability to infer others' beliefs, intentions, and emotions from text. Given that LLMs are trained on language...

1 min 1 month ago

LOW Academic International

Controllable Evidence Selection in Retrieval-Augmented Question Answering via Deterministic Utility Gating

arXiv:2603.18011v1 Announce Type: new Abstract: Many modern AI question-answering systems convert text into vectors and retrieve the closest matches to a user question. While effective for topical similarity, similarity scores alone do not explain why some retrieved text can serve...

1 min 1 month ago

LOW Academic International

An Agentic System for Schema Aware NL2SQL Generation

arXiv:2603.18018v1 Announce Type: new Abstract: The natural language to SQL (NL2SQL) task plays a pivotal role in democratizing data access by enabling non-expert users to interact with relational databases through intuitive language. While recent frameworks have enhanced translation accuracy via...

1 min 1 month ago

LOW Academic United States

Interpretability without actionability: mechanistic methods cannot correct language model errors despite near-perfect internal representations

arXiv:2603.18353v1 Announce Type: new Abstract: Language models encode task-relevant knowledge in internal representations that far exceeds their output performance, but whether mechanistic interpretability methods can bridge this knowledge-action gap has not been systematically tested. We compared four mechanistic interpretability methods...

1 min 1 month ago

LOW Academic International

Learned but Not Expressed: Capability-Expression Dissociation in Large Language Models

arXiv:2603.18013v1 Announce Type: new Abstract: Large language models (LLMs) demonstrate the capacity to reconstruct and trace learned content from their training data under specific elicitation conditions, yet this capability does not manifest in standard generation contexts. This empirical observational study...

1 min 1 month ago

LOW Academic International

Evaluating FrameNet-Based Semantic Modeling for Gender-Based Violence Detection in Clinical Records

arXiv:2603.18124v1 Announce Type: new Abstract: Gender-based violence (GBV) is a major public health issue, with the World Health Organization estimating that one in three women experiences physical or sexual violence by an intimate partner during her lifetime. In Brazil, although...

1 min 1 month ago
