International Law

LOW Academic International

Optimizing In-Context Demonstrations for LLM-based Automated Grading

arXiv:2603.00465v1 Announce Type: new Abstract: Automated assessment of open-ended student responses is a critical capability for scaling personalized feedback in education. While large language models (LLMs) have shown promise in grading tasks via in-context learning (ICL), their reliability is heavily...

1 min 1 month, 2 weeks ago


LOW Academic International

Why Not? Solver-Grounded Certificates for Explainable Mission Planning

arXiv:2603.00469v1 Announce Type: new Abstract: Operators of Earth observation satellites need justifications for scheduling decisions: why a request was selected, rejected, or what changes would make it schedulable. Existing approaches construct post-hoc reasoning layers independent of the optimizer, risking non-causal...

1 min 1 month, 2 weeks ago


LOW Academic International

LOGIGEN: Logic-Driven Generation of Verifiable Agentic Tasks

arXiv:2603.00540v1 Announce Type: new Abstract: The evolution of Large Language Models (LLMs) from static instruction-followers to autonomous agents necessitates operating within complex, stateful environments to achieve precise state-transition objectives. However, this paradigm is bottlenecked by data scarcity, as existing tool-centric...

1 min 1 month, 2 weeks ago


LOW Academic International

Advancing Multimodal Judge Models through a Capability-Oriented Benchmark and MCTS-Driven Data Generation

arXiv:2603.00546v1 Announce Type: new Abstract: Using Multimodal Large Language Models (MLLMs) as judges to achieve precise and consistent evaluations has gradually become an emerging paradigm across various domains. Evaluating the capability and reliability of MLLM-as-a-judge systems is therefore essential for...

1 min 1 month, 2 weeks ago


LOW Academic International

Draft-Thinking: Learning Efficient Reasoning in Long Chain-of-Thought LLMs

arXiv:2603.00578v1 Announce Type: new Abstract: Long chain-of-thought (CoT) has become a dominant paradigm for enhancing the reasoning capability of large reasoning models (LRMs); however, the performance gains often come with a substantial increase in reasoning budget. Recent studies show that existing CoT...

1 min 1 month, 2 weeks ago


LOW Academic International

TraceSIR: A Multi-Agent Framework for Structured Analysis and Reporting of Agentic Execution Traces

arXiv:2603.00623v1 Announce Type: new Abstract: Agentic systems augment large language models with external tools and iterative decision making, enabling complex tasks such as deep research, function calling, and coding. However, their long and intricate execution traces make failure diagnosis and...

1 min 1 month, 2 weeks ago


LOW Academic International

InfoPO: Information-Driven Policy Optimization for User-Centric Agents

arXiv:2603.00656v1 Announce Type: new Abstract: Real-world user requests to LLM agents are often underspecified. Agents must interact to acquire missing information and make correct downstream decisions. However, current multi-turn GRPO-based methods often rely on trajectory-level reward computation, which leads to...

1 min 1 month, 2 weeks ago


LOW Academic International

The Synthetic Web: Adversarially-Curated Mini-Internets for Diagnosing Epistemic Weaknesses of Language Agents

arXiv:2603.00801v1 Announce Type: new Abstract: Language agents increasingly act as web-enabled systems that search, browse, and synthesize information from diverse sources. However, these sources can include unreliable or adversarial content, and the robustness of agents to adversarial ranking - where...

1 min 1 month, 2 weeks ago


LOW Academic International

MetaMind: General and Cognitive World Models in Multi-Agent Systems by Meta-Theory of Mind

arXiv:2603.00808v1 Announce Type: new Abstract: A major challenge for world models in multi-agent systems is to understand interdependent agent dynamics, predict interactive multi-agent trajectories, and plan over long horizons with collective awareness, without centralized supervision or explicit communication. In this...

1 min 1 month, 2 weeks ago


LOW Academic International

MC-Search: Evaluating and Enhancing Multimodal Agentic Search with Structured Long Reasoning Chains

arXiv:2603.00873v1 Announce Type: new Abstract: With the increasing demand for step-wise, cross-modal, and knowledge-grounded reasoning, multimodal large language models (MLLMs) are evolving beyond the traditional fixed retrieve-then-generate paradigm toward more sophisticated agentic multimodal retrieval-augmented generation (MM-RAG). Existing benchmarks, however, mainly...

1 min 1 month, 2 weeks ago


LOW Academic International

HiMAC: Hierarchical Macro-Micro Learning for Long-Horizon LLM Agents

arXiv:2603.00977v1 Announce Type: new Abstract: Large language model (LLM) agents have recently demonstrated strong capabilities in interactive decision-making, yet they remain fundamentally limited in long-horizon tasks that require structured planning and reliable execution. Existing approaches predominantly rely on flat autoregressive...

1 min 1 month, 2 weeks ago


LOW Academic International

DIVA-GRPO: Enhancing Multimodal Reasoning through Difficulty-Adaptive Variant Advantage

arXiv:2603.01106v1 Announce Type: new Abstract: Reinforcement learning (RL) with group relative policy optimization (GRPO) has become a widely adopted approach for enhancing the reasoning capabilities of multimodal large language models (MLLMs). While GRPO enables long-chain reasoning without a critic, it...

1 min 1 month, 2 weeks ago


LOW Academic International

HVR-Met: A Hypothesis-Verification-Replaning Agentic System for Extreme Weather Diagnosis

arXiv:2603.01121v1 Announce Type: new Abstract: While deep learning-based weather forecasting paradigms have made significant strides, addressing extreme weather diagnostics remains a formidable challenge. This gap exists primarily because the diagnostic process demands sophisticated multi-step logical reasoning, dynamic tool invocation, and...

1 min 1 month, 2 weeks ago


LOW Journal International

Delaware Journal of Corporate Law

Delaware Journal of Corporate Law | 604 followers on LinkedIn. The Delaware Journal of Corporate Law continues to operate as a nationally recognized student-edited publication | The Delaware Journal of Corporate Law is a student-edited publication established in 1975 at...

1 min 1 month, 2 weeks ago


LOW Academic International

DeepResearch-9K: A Challenging Benchmark Dataset of Deep-Research Agent

arXiv:2603.01152v1 Announce Type: new Abstract: Deep-research agents are capable of executing multi-step web exploration, targeted retrieval, and sophisticated question answering. Despite their powerful capabilities, deep-research agents face two critical bottlenecks: (1) the lack of large-scale, challenging datasets with real-world difficulty,...

1 min 1 month, 2 weeks ago


LOW Academic International

Agents Learn Their Runtime: Interpreter Persistence as Training-Time Semantics

arXiv:2603.01209v1 Announce Type: new Abstract: Tool-augmented LLMs are increasingly deployed as agents that interleave natural-language reasoning with executable Python actions, as in CodeAct-style frameworks. In deployment, these agents rely on runtime state that persists across steps. By contrast, common training...

1 min 1 month, 2 weeks ago


LOW Academic International

From Global to Local: Learning Context-Aware Graph Representations for Document Classification and Summarization

arXiv:2603.00021v1 Announce Type: new Abstract: This paper proposes a data-driven method to automatically construct graph-based document representations. Building upon the recent work of Bugueño and de Melo (2025), we leverage the dynamic sliding-window attention module to effectively capture local and...

1 min 1 month, 2 weeks ago


LOW Academic International

Autorubric: A Unified Framework for Rubric-Based LLM Evaluation

arXiv:2603.00077v1 Announce Type: new Abstract: Rubric-based evaluation with large language models (LLMs) has become standard practice for assessing text generation at scale, yet the underlying techniques are scattered across papers with inconsistent terminology and partial solutions. We present a unified...

1 min 1 month, 2 weeks ago


LOW Academic International

Stepwise Penalization for Length-Efficient Chain-of-Thought Reasoning

arXiv:2603.00296v1 Announce Type: new Abstract: Large reasoning models improve with more test-time computation, but often overthink, producing unnecessarily long chains-of-thought that raise cost without improving accuracy. Prior reinforcement learning approaches typically rely on a single outcome reward with trajectory-level length...

1 min 1 month, 2 weeks ago


LOW Academic International

When Metrics Disagree: Automatic Similarity vs. LLM-as-a-Judge for Clinical Dialogue Evaluation

arXiv:2603.00314v1 Announce Type: new Abstract: This paper details the baseline model selection, fine-tuning process, evaluation methods, and the implications of deploying more accurate LLMs in healthcare settings. As large language models (LLMs) are increasingly employed to address diverse problems, including...

1 min 1 month, 2 weeks ago


LOW Academic International

Estimating Visual Attribute Effects in Advertising from Observational Data: A Deepfake-Informed Double Machine Learning Approach

arXiv:2603.02359v1 Announce Type: new Abstract: Digital advertising increasingly relies on visual content, yet marketers lack rigorous methods for understanding how specific visual attributes causally affect consumer engagement. This paper addresses a fundamental methodological challenge: estimating causal effects when the treatment,...

1 min 1 month, 2 weeks ago


LOW Academic International

VL-KGE: Vision-Language Models Meet Knowledge Graph Embeddings

arXiv:2603.02435v1 Announce Type: new Abstract: Real-world multimodal knowledge graphs (MKGs) are inherently heterogeneous, modeling entities that are associated with diverse modalities. Traditional knowledge graph embedding (KGE) methods excel at learning continuous representations of entities and relations, yet they are typically...

1 min 1 month, 2 weeks ago


LOW Academic International

Diagnosing Retrieval vs. Utilization Bottlenecks in LLM Agent Memory

arXiv:2603.02473v1 Announce Type: new Abstract: Memory-augmented LLM agents store and retrieve information from prior interactions, yet the relative importance of how memories are written versus how they are retrieved remains unclear. We introduce a diagnostic framework that analyzes how performance...

1 min 1 month, 2 weeks ago


LOW Academic International

LLM-MLFFN: Multi-Level Autonomous Driving Behavior Feature Fusion via Large Language Model

arXiv:2603.02528v1 Announce Type: new Abstract: Accurate classification of autonomous vehicle (AV) driving behaviors is critical for safety validation, performance diagnosis, and traffic integration analysis. However, existing approaches primarily rely on numerical time-series modeling and often lack semantic abstraction, limiting interpretability...

1 min 1 month, 2 weeks ago


LOW Academic International

LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Optimization

arXiv:2603.02680v1 Announce Type: new Abstract: While Large Language Models (LLMs) form the cornerstone of sequential decision-making agent development, they have inherent limitations in high-frequency decision tasks. Existing research mainly focuses on discrete embodied decision scenarios with low-frequency and significant semantic...

1 min 1 month, 2 weeks ago


LOW Academic International

Retrieval-Augmented Robots via Retrieve-Reason-Act

arXiv:2603.02688v1 Announce Type: new Abstract: To achieve general-purpose utility, we argue that robots must evolve from passive executors into active Information Retrieval users. In strictly zero-shot settings where no prior demonstrations exist, robots face a critical information gap, such as...

1 min 1 month, 2 weeks ago


LOW Academic International

A Natural Language Agentic Approach to Study Affective Polarization

arXiv:2603.02711v1 Announce Type: new Abstract: Affective polarization has been central to political and social studies, with growing focus on social media, where partisan divisions are often exacerbated. Real-world studies tend to have limited scope, while simulated studies suffer from insufficient...

1 min 1 month, 2 weeks ago


LOW Academic International

Rethinking Code Similarity for Automated Algorithm Design with LLMs

arXiv:2603.02787v1 Announce Type: new Abstract: The rise of Large Language Model-based Automated Algorithm Design (LLM-AAD) has transformed algorithm development by autonomously generating code implementations of expert-level algorithms. Unlike traditional expert-driven algorithm development, in the LLM-AAD paradigm, the main design principle...

1 min 1 month, 2 weeks ago


LOW Academic International

LLM-based Argument Mining meets Argumentation and Description Logics: a Unified Framework for Reasoning about Debates

arXiv:2603.02858v1 Announce Type: new Abstract: Large Language Models (LLMs) achieve strong performance in analyzing and generating text, yet they struggle with explicit, transparent, and verifiable reasoning over complex texts such as those containing debates. In particular, they lack structured representations...

1 min 1 month, 2 weeks ago


LOW Academic International

SAE as a Crystal Ball: Interpretable Features Predict Cross-domain Transferability of LLMs without Training

arXiv:2603.02908v1 Announce Type: new Abstract: In recent years, pre-trained large language models have achieved remarkable success across diverse tasks. Besides the pivotal role of self-supervised pre-training, their effectiveness in downstream applications also depends critically on the post-training process, which adapts...

1 min 1 month, 2 weeks ago


