International Law

LOW Academic International

Children's Intelligence Tests Pose Challenges for MLLMs? KidGym: A 2D Grid-Based Reasoning Benchmark for MLLMs

arXiv:2603.20209v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) combine the linguistic strengths of LLMs with the ability to process multimodal data, enabling them to address a broader range of visual tasks. Because MLLMs aim at more general, human-like...

1 min 4 weeks ago

LOW Conference United States

NeurIPS 2026 Evaluations & Datasets FAQ

7 min 4 weeks ago

LOW Academic European Union

Revisiting Tree Search for LLMs: Gumbel and Sequential Halving for Budget-Scalable Reasoning

arXiv:2603.21162v1 Announce Type: new Abstract: Neural tree search is a powerful decision-making algorithm widely used in complex domains such as game playing and model-based reinforcement learning. Recent work has applied AlphaZero-style tree search to enhance the reasoning capabilities of Large...

1 min 4 weeks ago

LOW Academic International

Expected Reward Prediction, with Applications to Model Routing

arXiv:2603.20217v1 Announce Type: new Abstract: Reward models are a standard tool to score responses from LLMs. Reward models are built to rank responses to a fixed prompt sampled from a single model, for example to choose the best of n...

1 min 4 weeks ago

LOW Academic International

Beyond Test-Time Compute Strategies: Advocating Energy-per-Token in LLM Inference

arXiv:2603.20224v1 Announce Type: new Abstract: Large Language Models (LLMs) demonstrate exceptional performance across diverse tasks but come with substantial energy and computational costs, particularly in request-heavy scenarios. In many real-world applications, the full scale and capabilities of LLMs are often...

1 min 4 weeks ago

LOW Academic United Kingdom

Can we automatize scientific discovery in the cognitive sciences?

arXiv:2603.20988v1 Announce Type: new Abstract: The cognitive sciences aim to understand intelligence by formalizing underlying operations as computational models. Traditionally, this follows a cycle of discovery where researchers develop paradigms, collect data, and test predefined model classes. However, this manual...

1 min 4 weeks ago

LOW Academic European Union

AgenticGEO: A Self-Evolving Agentic System for Generative Engine Optimization

arXiv:2603.20213v1 Announce Type: new Abstract: Generative search engines represent a transition from traditional ranking-based retrieval to Large Language Model (LLM)-based synthesis, transforming optimization goals from ranking prominence towards content inclusion. Generative Engine Optimization (GEO), specifically, aims to maximize visibility and...

1 min 4 weeks ago

LOW Academic United States

ARYA: A Physics-Constrained Composable & Deterministic World Model Architecture

arXiv:2603.21340v1 Announce Type: new Abstract: This paper presents ARYA, a composable, physics-constrained, deterministic world model architecture built on five foundational principles: nano models, composability, causal reasoning, determinism, and architectural AI safety. We demonstrate that ARYA satisfies all canonical world model...

1 min 4 weeks ago

LOW Academic International

Deep reflective reasoning in interdependence constrained structured data extraction from clinical notes for digital health

arXiv:2603.20435v1 Announce Type: new Abstract: Extracting structured information from clinical notes requires navigating a dense web of interdependent variables where the value of one attribute logically constrains others. Existing Large Language Model (LLM)-based extraction pipelines often struggle to capture these...

1 min 4 weeks ago

LOW Conference European Union

Call For Papers 2026

1 min 4 weeks ago

LOW Academic United States

Leveraging Natural Language Processing and Machine Learning for Evidence-Based Food Security Policy Decision-Making in Data-Scarce Making

arXiv:2603.20425v1 Announce Type: new Abstract: Food security policy formulation in data-scarce regions remains a critical challenge due to limited structured datasets, fragmented textual reports, and demographic bias in decision-making systems. This study proposes ZeroHungerAI, an integrated Natural Language Processing (NLP)...

1 min 4 weeks ago

LOW Academic International

RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models

arXiv:2603.21341v1 Announce Type: new Abstract: Improving embodied reasoning in multimodal-large-language models (MLLMs) is essential for building vision-language-action models (VLAs) on top of them to readily translate multimodal understanding into low-level actions. Accordingly, recent work has explored enhancing embodied reasoning in...

1 min 4 weeks ago

LOW Academic International

Context Cartography: Toward Structured Governance of Contextual Space in Large Language Model Systems

arXiv:2603.20578v1 Announce Type: new Abstract: The prevailing approach to improving large language model (LLM) reasoning has centered on expanding context windows, implicitly assuming that more tokens yield better performance. However, empirical evidence - including the "lost in the middle" effect...

1 min 4 weeks ago

LOW Academic International

The Intelligent Disobedience Game: Formulating Disobedience in Stackelberg Games and Markov Decision Processes

arXiv:2603.20994v1 Announce Type: new Abstract: In shared autonomy, a critical tension arises when an automated assistant must choose between obeying a human's instruction and deliberately overriding it to prevent harm. This safety-critical behavior is known as intelligent disobedience. To formalize...

1 min 4 weeks ago

LOW Academic International

Seed1.8 Model Card: Towards Generalized Real-World Agency

arXiv:2603.20633v1 Announce Type: new Abstract: We present Seed1.8, a foundation model aimed at generalized real-world agency: going beyond single-turn prediction to multi-turn interaction, tool use, and multi-step execution. Seed1.8 keeps strong LLM and vision-language performance while supporting a unified agentic...

1 min 4 weeks ago

LOW Conference European Union

NeurIPS Blog – NeurIPS conference blog

1 min 4 weeks ago

LOW Academic International

ProMAS: Proactive Error Forecasting for Multi-Agent Systems Using Markov Transition Dynamics

arXiv:2603.20260v1 Announce Type: new Abstract: The integration of Large Language Models into Multi-Agent Systems (MAS) has enabled the solution of complex, long-horizon tasks through collaborative reasoning. However, this collective intelligence is inherently fragile, as a single logical fallacy can rapidly...

1 min 4 weeks ago

LOW Academic United States

RedacBench: Can AI Erase Your Secrets?

arXiv:2603.20208v1 Announce Type: new Abstract: Modern language models can readily extract sensitive information from unstructured text, making redaction -- the selective removal of such information -- critical for data security. However, existing benchmarks for redaction typically focus on predefined categories...

1 min 4 weeks ago

LOW Academic United States

Profit is the Red Team: Stress-Testing Agents in Strategic Economic Interactions

arXiv:2603.20925v1 Announce Type: new Abstract: As agentic systems move into real-world deployments, their decisions increasingly depend on external inputs such as retrieved content, tool outputs, and information provided by other actors. When these inputs can be strategically shaped by adversaries,...

1 min 4 weeks ago

LOW Academic International

Fast-Slow Thinking RM: Efficient Integration of Scalar and Generative Reward Models

arXiv:2603.20212v1 Announce Type: new Abstract: Reward models (RMs) are critical for aligning Large Language Models via Reinforcement Learning from Human Feedback (RLHF). While Generative Reward Models (GRMs) achieve superior accuracy through chain-of-thought (CoT) reasoning, they incur substantial computational costs. Conversely,...

1 min 4 weeks ago

LOW Academic European Union

Graph of States: Solving Abductive Tasks with Large Language Models

arXiv:2603.21250v1 Announce Type: new Abstract: Logical reasoning encompasses deduction, induction, and abduction. However, while Large Language Models (LLMs) have effectively mastered the former two, abductive reasoning remains significantly underexplored. Existing frameworks, predominantly designed for static deductive tasks, fail to generalize...

1 min 4 weeks ago

LOW Academic International

Abjad-Kids: An Arabic Speech Classification Dataset for Primary Education

arXiv:2603.20255v1 Announce Type: new Abstract: Speech-based AI educational applications have gained significant interest in recent years, particularly for children. However, children's speech research remains limited due to the lack of publicly available datasets, especially for low-resource languages such as Arabic. This...

1 min 4 weeks ago

LOW Academic International

Compression is all you need: Modeling Mathematics

arXiv:2603.20396v1 Announce Type: new Abstract: Human mathematics (HM), the mathematics humans discover and value, is a vanishingly small subset of formal mathematics (FM), the totality of all valid deductions. We argue that HM is distinguished by its compressibility through hierarchically...

1 min 4 weeks ago

LOW Academic International

Do LLM-Driven Agents Exhibit Engagement Mechanisms? Controlled Tests of Information Load, Descriptive Norms, and Popularity Cues

arXiv:2603.20911v1 Announce Type: new Abstract: Large language models make agent-based simulation more behaviorally expressive, but they also sharpen a basic methodological tension: fluent, human-like output is not, by itself, evidence for theory. We evaluate what an LLM-driven simulation can credibly...

1 min 4 weeks ago

LOW Conference European Union

NeurIPS 2026 Evaluations & Datasets Track Call for Papers

6 min 4 weeks ago

LOW Conference United States

Supporting Our Community’s Infrastructure: NeurIPS Foundation’s Donation to OpenReview

2 min 4 weeks ago

LOW Academic International

The AI Scientific Community: Agentic Virtual Lab Swarms

arXiv:2603.21344v1 Announce Type: new Abstract: In this short note we propose using agentic swarms of virtual labs as a model of an AI Science Community. In this paradigm, each particle in the swarm represents a complete virtual laboratory instance, enabling...

1 min 4 weeks ago

LOW Conference European Union

NeurIPS 2026 Call for Organizer Nominations

1 min 4 weeks ago

LOW Academic International

AgentComm-Bench: Stress-Testing Cooperative Embodied AI Under Latency, Packet Loss, and Bandwidth Collapse

arXiv:2603.20285v1 Announce Type: new Abstract: Cooperative multi-agent methods for embodied AI are almost universally evaluated under idealized communication: zero latency, no packet loss, and unlimited bandwidth. Real-world deployment on robots with wireless links, autonomous vehicles on congested networks, or drone...

1 min 4 weeks ago

LOW Academic International

Position: Multi-Agent Algorithmic Care Systems Demand Contestability for Trustworthy AI

arXiv:2603.20595v1 Announce Type: new Abstract: Multi-agent systems (MAS) are increasingly used in healthcare to support complex decision-making through collaboration among specialized agents. Because these systems act as collective decision-makers, they raise challenges for trust, accountability, and human oversight. Existing approaches...

1 min 4 weeks ago
