RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models
arXiv:2603.21341v1 Announce Type: new Abstract: Improving embodied reasoning in multimodal large language models (MLLMs) is essential for building vision-language-action models (VLAs) on top of them that readily translate multimodal understanding into low-level actions. Accordingly, recent work has explored enhancing embodied reasoning in...
ConsRoute: Consistency-Aware Adaptive Query Routing for Cloud-Edge-Device Large Language Models
arXiv:2603.21237v1 Announce Type: new Abstract: Large language models (LLMs) deliver impressive capabilities but incur substantial inference latency and cost, which hinders their deployment in latency-sensitive and resource-constrained scenarios. Cloud-edge-device collaborative inference has emerged as a promising paradigm by dynamically routing...
Improving Coherence and Persistence in Agentic AI for System Optimization
arXiv:2603.21321v1 Announce Type: new Abstract: Designing high-performance system heuristics is a creative, iterative process requiring experts to form hypotheses and execute multi-step conceptual shifts. While Large Language Models (LLMs) show promise in automating this loop, they struggle with complex system...
Introducing the Evaluations & Datasets Track at NeurIPS 2026
Knowledge Boundary Discovery for Large Language Models
arXiv:2603.21022v1 Announce Type: new Abstract: We propose Knowledge Boundary Discovery (KBD), a reinforcement-learning-based framework to explore the knowledge boundaries of large language models (LLMs). We define the knowledge boundary by automatically generating two types of questions: (i)...
NeurIPS 2026 Evaluations & Datasets Track Call for Papers
RedacBench: Can AI Erase Your Secrets?
arXiv:2603.20208v1 Announce Type: new Abstract: Modern language models can readily extract sensitive information from unstructured text, making redaction -- the selective removal of such information -- critical for data security. However, existing benchmarks for redaction typically focus on predefined categories...
Deep reflective reasoning in interdependence constrained structured data extraction from clinical notes for digital health
arXiv:2603.20435v1 Announce Type: new Abstract: Extracting structured information from clinical notes requires navigating a dense web of interdependent variables where the value of one attribute logically constrains others. Existing Large Language Model (LLM)-based extraction pipelines often struggle to capture these...
Do LLM-Driven Agents Exhibit Engagement Mechanisms? Controlled Tests of Information Load, Descriptive Norms, and Popularity Cues
arXiv:2603.20911v1 Announce Type: new Abstract: Large language models make agent-based simulation more behaviorally expressive, but they also sharpen a basic methodological tension: fluent, human-like output is not, by itself, evidence for theory. We evaluate what an LLM-driven simulation can credibly...
ReLaMix: Residual Latency-Aware Mixing for Delay-Robust Financial Time-Series Forecasting
arXiv:2603.20869v1 Announce Type: new Abstract: Financial time-series forecasting in real-world high-frequency markets is often hindered by delayed or partially stale observations caused by asynchronous data acquisition and transmission latency. To better reflect such practical conditions, we investigate a simulated delay...
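The delay setting this abstract describes can be made concrete with a small simulation: each observation arrives with some latency, so at decision time t a model only sees the most recent value that has already been delivered. The sketch below is illustrative only; the function name, the per-sample delay input, and the fallback to the first sample are assumptions, not details from the paper.

```python
import numpy as np

def stale_view(series: np.ndarray, delays: np.ndarray) -> np.ndarray:
    """At each step t, return the latest observation already delivered.

    delays[i] is how many steps after index i the sample becomes visible.
    Before anything arrives, we fall back to the first sample (an assumption).
    """
    n = len(series)
    arrival = np.arange(n) + delays  # step at which each sample becomes visible
    view = np.empty(n)
    last = series[0]
    for t in range(n):
        delivered = np.where(arrival <= t)[0]
        if delivered.size:
            last = series[delivered.max()]
        view[t] = last
    return view
```

Feeding `stale_view(series, delays)` to a forecaster instead of `series` reproduces the partially stale inputs the abstract studies: with zero delays the view equals the series, and larger delays hold older values for longer.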
Position: Multi-Agent Algorithmic Care Systems Demand Contestability for Trustworthy AI
arXiv:2603.20595v1 Announce Type: new Abstract: Multi-agent systems (MAS) are increasingly used in healthcare to support complex decision-making through collaboration among specialized agents. Because these systems act as collective decision-makers, they raise challenges for trust, accountability, and human oversight. Existing approaches...
Compression is all you need: Modeling Mathematics
arXiv:2603.20396v1 Announce Type: new Abstract: Human mathematics (HM), the mathematics humans discover and value, is a vanishingly small subset of formal mathematics (FM), the totality of all valid deductions. We argue that HM is distinguished by its compressibility through hierarchically...
Me, Myself, and $\pi$: Evaluating and Explaining LLM Introspection
arXiv:2603.20276v1 Announce Type: new Abstract: A hallmark of human intelligence is Introspection, the ability to assess and reason about one's own cognitive processes. Introspection has emerged as a promising but contested capability in large language models (LLMs). However, current evaluations often...
SciNav: A General Agent Framework for Scientific Coding Tasks
arXiv:2603.20256v1 Announce Type: new Abstract: Autonomous science agents built on large language models (LLMs) are increasingly used to generate hypotheses, design experiments, and produce reports. However, prior work mainly targets open-ended scientific problems with subjective outputs that are difficult to...
The production of meaning in the processing of natural language
arXiv:2603.20381v1 Announce Type: new Abstract: Understanding the fundamental mechanisms governing the production of meaning in the processing of natural language is critical for designing safe, thoughtful, engaging, and empowering human-agent interactions. Experiments in cognitive science and social psychology have demonstrated...
Coding Agents are Effective Long-Context Processors
arXiv:2603.20432v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated remarkable progress in scaling to access massive contexts. However, this access relies on latent, uninterpretable attention mechanisms, and LLMs fail to process long contexts effectively, exhibiting significant...
JUBAKU: An Adversarial Benchmark for Exposing Culturally Grounded Stereotypes in Japanese LLMs
arXiv:2603.20581v1 Announce Type: new Abstract: Social biases reflected in language are inherently shaped by cultural norms, which vary significantly across regions and lead to diverse manifestations of stereotypes. Existing evaluations of social bias in large language models (LLMs) for non-English...
Hear Both Sides: Efficient Multi-Agent Debate via Diversity-Aware Message Retention
arXiv:2603.20640v1 Announce Type: new Abstract: Multi-Agent Debate has emerged as a promising framework for improving the reasoning quality of large language models through iterative inter-agent communication. However, broadcasting all agent messages at every round introduces noise and redundancy that can...
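The abstract's core idea, keeping only a diverse subset of agent messages rather than broadcasting everything, can be sketched with a simple greedy max-min selection over message embeddings. This is a generic diversity heuristic chosen for illustration; the function name, the centroid-based seed, and the greedy rule are assumptions, not the paper's actual retention mechanism.

```python
import numpy as np

def retain_diverse(embeddings: np.ndarray, k: int) -> list[int]:
    """Greedily pick k message indices that are maximally spread out.

    Seeds with the message farthest from the centroid, then repeatedly adds
    the message farthest from everything already retained (max-min distance).
    """
    n = len(embeddings)
    if k >= n:
        return list(range(n))
    centroid = embeddings.mean(axis=0)
    chosen = [int(np.argmax(np.linalg.norm(embeddings - centroid, axis=1)))]
    while len(chosen) < k:
        # Distance from each message to its nearest already-chosen message.
        dists = np.min(
            [np.linalg.norm(embeddings - embeddings[c], axis=1) for c in chosen],
            axis=0,
        )
        dists[chosen] = -1.0  # never re-pick a retained message
        chosen.append(int(np.argmax(dists)))
    return chosen
```

Applied per debate round, the retained indices select which messages are forwarded to the next round, so near-duplicate arguments are dropped while distinct positions survive.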
Weber's Law in Transformer Magnitude Representations: Efficient Coding, Representational Geometry, and Psychophysical Laws in Language Models
arXiv:2603.20642v1 Announce Type: new Abstract: How do transformer language models represent magnitude? Recent work disagrees: some find logarithmic spacing, others linear encoding, others per-digit circular representations. We apply the formal tools of psychophysics to resolve this. Using four converging paradigms...
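The logarithmic-versus-linear question the abstract raises can be phrased as a simple probing comparison: given per-number representations, does a linear readout predict the number better on a linear scale or a log scale? The sketch below uses synthetic embeddings as a stand-in for model activations; the function name and the R^2 comparison are illustrative assumptions, not the paper's psychophysical paradigms.

```python
import numpy as np

def scale_r2(reps: np.ndarray, values: np.ndarray) -> dict[str, float]:
    """R^2 of the best linear readout predicting values vs. log(values)."""
    X = np.hstack([reps, np.ones((len(reps), 1))])  # add a bias column
    out = {}
    for name, y in [("linear", values), ("log", np.log(values))]:
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        out[name] = 1.0 - resid.var() / y.var()
    return out
```

With real model activations substituted for the synthetic `reps`, a markedly higher R^2 on one scale would be evidence for that encoding, which is the kind of disagreement between prior findings the abstract sets out to resolve.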
Can I guess where you are from? Modeling dialectal morphosyntactic similarities in Brazilian Portuguese
arXiv:2603.20695v1 Announce Type: new Abstract: This paper investigates morphosyntactic covariation in Brazilian Portuguese (BP) to assess whether dialectal origin can be inferred from the combined behavior of linguistic variables. Focusing on four grammatical phenomena related to pronouns, correlation and clustering...
Reasoning Topology Matters: Network-of-Thought for Complex Reasoning Tasks
arXiv:2603.20730v1 Announce Type: new Abstract: Existing prompting paradigms structure LLM reasoning in limited topologies: Chain-of-Thought (CoT) produces linear traces, while Tree-of-Thought (ToT) performs branching search. Yet complex reasoning often requires merging intermediate results, revisiting hypotheses, and integrating evidence from multiple...
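The topological contrast the abstract draws, a network that can merge intermediate results rather than a chain or tree, can be sketched as evaluating a reasoning DAG in which a node may have multiple parents. The representation below is a toy illustration under my own assumptions: node functions stand in for LLM calls, and the edge/merge structure is generic rather than the paper's method.

```python
from graphlib import TopologicalSorter

def run_network(nodes, edges, inputs):
    """Evaluate a reasoning DAG.

    nodes:  name -> fn(parent_outputs) standing in for an LLM reasoning step
    edges:  name -> list of parent names (multiple parents = a merge node)
    inputs: name -> value for leaf nodes
    """
    results = dict(inputs)
    # TopologicalSorter yields every node after all of its predecessors.
    for name in TopologicalSorter(edges).static_order():
        if name in results:  # input leaf, already resolved
            continue
        parents = edges.get(name, [])
        results[name] = nodes[name]([results[p] for p in parents])
    return results
```

A chain is the special case where every node has one parent, and a tree the case where no node has more than one; the merge node below is what neither CoT nor ToT expresses.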
MzansiText and MzansiLM: An Open Corpus and Decoder-Only Language Model for South African Languages
arXiv:2603.20732v1 Announce Type: new Abstract: Decoder-only language models can be adapted to diverse tasks through instruction finetuning, but the extent to which this generalizes at small scale for low-resource languages remains unclear. We focus on the languages of South Africa,...
Code-MIE: A Code-style Model for Multimodal Information Extraction with Scene Graph and Entity Attribute Knowledge Enhancement
arXiv:2603.20781v1 Announce Type: new Abstract: With the rapid development of large language models (LLMs), more and more researchers have paid attention to LLM-based information extraction. However, there is still room for improvement in existing methods...
The Anatomy of an Edit: Mechanism-Guided Activation Steering for Knowledge Editing
arXiv:2603.20795v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used as knowledge bases, but keeping them up to date requires targeted knowledge editing (KE). However, it remains unclear how edits are implemented inside the model once applied. In...
RLVR Training of LLMs Does Not Improve Thinking Ability for General QA: Evaluation Method and a Simple Solution
arXiv:2603.20799v1 Announce Type: new Abstract: Reinforcement learning from verifiable rewards (RLVR) stimulates the thinking processes of large language models (LLMs), substantially enhancing their reasoning abilities on verifiable tasks. It is often assumed that similar gains should transfer to general question...
BenchBench: Benchmarking Automated Benchmark Generation
arXiv:2603.20807v1 Announce Type: new Abstract: Benchmarks are the de facto standard for tracking progress in large language models (LLMs), yet static test sets can rapidly saturate, become vulnerable to contamination, and are costly to refresh. Scalable evaluation of open-ended items...
Can ChatGPT Really Understand Modern Chinese Poetry?
arXiv:2603.20851v1 Announce Type: new Abstract: ChatGPT has demonstrated remarkable capabilities in both poetry generation and translation, yet its ability to truly understand poetry remains unexplored. Previous poetry-related work merely analyzed experimental outcomes without addressing fundamental issues of comprehension. This paper...
NoveltyAgent: Autonomous Novelty Reporting Agent with Point-wise Novelty Analysis and Self-Validation
arXiv:2603.20884v1 Announce Type: new Abstract: The exponential growth of academic publications has led to a surge in papers of varying quality, increasing the cost of paper screening. Current approaches either use novelty assessment within general AI Reviewers or repurpose DeepResearch,...
User Preference Modeling for Conversational LLM Agents: Weak Rewards from Retrieval-Augmented Interaction
arXiv:2603.20939v1 Announce Type: new Abstract: Large language models are increasingly used as personal assistants, yet most lack a persistent user model, forcing users to repeatedly restate preferences across sessions. We propose Vector-Adapted Retrieval Scoring (VARS), a pipeline-agnostic, frozen-backbone framework that...
Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models
arXiv:2603.20957v1 Announce Type: new Abstract: Frontier LLM companies have repeatedly assured courts and regulators that their models do not store copies of training data. They further rely on safety alignment strategies via RLHF, system prompts, and output filters to block...