Real Estate Law

LOW Academic United States

Why Agent Caching Fails and How to Fix It: Structured Intent Canonicalization with Few-Shot Learning

arXiv:2602.18922v1 Announce Type: new Abstract: Personal AI agents incur substantial cost via repeated LLM calls. We show existing caching methods fail: GPTCache achieves 37.9% accuracy on real benchmarks; APC achieves 0-12%. The root cause is optimizing for the wrong property...

1 min 1 month, 1 week ago

property

LOW Academic International

Capable but Unreliable: Canonical Path Deviation as a Causal Mechanism of Agent Failure in Long-Horizon Tasks

arXiv:2602.19008v1 Announce Type: new Abstract: Why do language agents fail on tasks they are capable of solving? We argue that many such failures are reliability failures caused by stochastic drift from a task's latent solution structure, not capability failures. Every...

1 min 1 month, 1 week ago

construction

LOW Academic International

Construct, Merge, Solve & Adapt with Reinforcement Learning for the min-max Multiple Traveling Salesman Problem

arXiv:2602.23579v1 Announce Type: new Abstract: The Multiple Traveling Salesman Problem (mTSP) extends the Traveling Salesman Problem to m tours that start and end at a common depot and jointly visit all customers exactly once. In the min-max variant, the objective...

1 min 1 month, 1 week ago

construction

LOW Academic International

SleepLM: Natural-Language Intelligence for Human Sleep

arXiv:2602.23605v1 Announce Type: new Abstract: We present SleepLM, a family of sleep-language foundation models that enable human sleep alignment, interpretation, and interaction with natural language. Despite the critical role of sleep, learning-based sleep analysis systems operate in closed label spaces...

1 min 1 month, 1 week ago

construction

LOW Academic International

MMKG-RDS: Reasoning Data Synthesis via Deep Mining of Multimodal Knowledge Graphs

arXiv:2602.23632v1 Announce Type: new Abstract: Synthesizing high-quality training data is crucial for enhancing domain models' reasoning abilities. Existing methods face limitations in long-tail knowledge coverage, effectiveness verification, and interpretability. Knowledge-graph-based approaches still fall short in functionality, granularity, customizability, and evaluation....

1 min 1 month, 1 week ago

construction

LOW Academic International

Reasoning-Driven Multimodal LLM for Domain Generalization

arXiv:2602.23777v1 Announce Type: new Abstract: This paper addresses the domain generalization (DG) problem in deep learning. While most DG methods focus on enforcing visual feature invariance, we leverage the reasoning capability of multimodal large language models (MLLMs) and explore the...

1 min 1 month, 1 week ago

variance

LOW Academic United States

Portfolio Reinforcement Learning with Scenario-Context Rollout

arXiv:2602.24037v1 Announce Type: new Abstract: Market regime shifts induce distribution shifts that can degrade the performance of portfolio rebalancing policies. We propose macro-conditioned scenario-context rollout (SCR) that generates plausible next-day multivariate return scenarios under stress events. However, doing so faces...

1 min 1 month, 1 week ago

variance

LOW Academic International

A Minimal Agent for Automated Theorem Proving

arXiv:2602.24273v1 Announce Type: new Abstract: We propose a minimal agentic baseline that enables systematic comparison across different AI-based theorem prover architectures. This design implements the core features shared among state-of-the-art systems: iterative proof refinement, library search and context management. We...

1 min 1 month, 1 week ago

lease

LOW Academic International

Toward General Semantic Chunking: A Discriminative Framework for Ultra-Long Documents

arXiv:2602.23370v1 Announce Type: cross Abstract: Long-document topic segmentation plays an important role in information retrieval and document understanding, yet existing methods still show clear shortcomings in ultra-long text settings. Traditional discriminative models are constrained by fixed windows and cannot model...

1 min 1 month, 1 week ago

lease

LOW Academic International

Democratizing GraphRAG: Linear, CPU-Only Graph Retrieval for Multi-Hop QA

arXiv:2602.23372v1 Announce Type: cross Abstract: GraphRAG systems improve multi-hop retrieval by modeling structure, but many approaches rely on expensive LLM-based graph construction and GPU-heavy inference. We present SPRIG (Seeded Propagation for Retrieval In Graphs), a CPU-only, linear-time, token-free GraphRAG pipeline...

1 min 1 month, 1 week ago

construction

LOW Academic United States

SALIENT: Frequency-Aware Paired Diffusion for Controllable Long-Tail CT Detection

arXiv:2602.23447v1 Announce Type: cross Abstract: Detection of rare lesions in whole-body CT is fundamentally limited by extreme class imbalance and low target-to-volume ratios, producing precision collapse despite high AUROC. Synthetic augmentation with diffusion models offers promise, yet pixel-space diffusion is...

1 min 1 month, 1 week ago

lien

LOW Academic European Union

Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning

arXiv:2602.23440v1 Announce Type: new Abstract: Training large language models to reason with search engines via reinforcement learning is hindered by a fundamental credit assignment problem: existing methods such as Search-R1 provide only a sparse outcome reward after an entire multi-step...

1 min 1 month, 1 week ago

variance

LOW Academic International

TraderBench: How Robust Are AI Agents in Adversarial Capital Markets?

arXiv:2603.00285v1 Announce Type: new Abstract: Evaluating AI agents in finance faces two key challenges: static benchmarks require costly expert annotation yet miss the dynamic decision-making central to real-world trading, while LLM-based judges introduce uncontrolled variance on domain-specific tasks. We introduce...

1 min 1 month, 1 week ago

variance

LOW Academic International

Optimizing In-Context Demonstrations for LLM-based Automated Grading

arXiv:2603.00465v1 Announce Type: new Abstract: Automated assessment of open-ended student responses is a critical capability for scaling personalized feedback in education. While large language models (LLMs) have shown promise in grading tasks via in-context learning (ICL), their reliability is heavily...

1 min 1 month, 1 week ago

construction

LOW Academic International

Advancing Multimodal Judge Models through a Capability-Oriented Benchmark and MCTS-Driven Data Generation

arXiv:2603.00546v1 Announce Type: new Abstract: Using Multimodal Large Language Models (MLLMs) as judges to achieve precise and consistent evaluations has gradually become an emerging paradigm across various domains. Evaluating the capability and reliability of MLLM-as-a-judge systems is therefore essential for...

1 min 1 month, 1 week ago

construction

LOW Academic European Union

LiTS: A Modular Framework for LLM Tree Search

arXiv:2603.00631v1 Announce Type: new Abstract: LiTS is a modular Python framework for LLM reasoning via tree search. It decomposes tree search into three reusable components (Policy, Transition, and RewardModel) that plug into algorithms like MCTS and BFS. A decorator-based registry...

1 min 1 month, 1 week ago

lease

LOW Academic International

InfoPO: Information-Driven Policy Optimization for User-Centric Agents

arXiv:2603.00656v1 Announce Type: new Abstract: Real-world user requests to LLM agents are often underspecified. Agents must interact to acquire missing information and make correct downstream decisions. However, current multi-turn GRPO-based methods often rely on trajectory-level reward computation, which leads to...

1 min 1 month, 1 week ago

variance

LOW Academic International

DIVA-GRPO: Enhancing Multimodal Reasoning through Difficulty-Adaptive Variant Advantage

arXiv:2603.01106v1 Announce Type: new Abstract: Reinforcement learning (RL) with group relative policy optimization (GRPO) has become a widely adopted approach for enhancing the reasoning capabilities of multimodal large language models (MLLMs). While GRPO enables long-chain reasoning without a critic, it...

1 min 1 month, 1 week ago

variance

LOW Journal International

Delaware Journal of Corporate Law

Delaware Journal of Corporate Law | 604 followers on LinkedIn. The Delaware Journal of Corporate Law continues to operate as a nationally recognized student-edited publication | The Delaware Journal of Corporate Law is a student-edited publication established in 1975 at...

1 min 1 month, 1 week ago

lease

LOW Academic International

DeepResearch-9K: A Challenging Benchmark Dataset of Deep-Research Agent

arXiv:2603.01152v1 Announce Type: new Abstract: Deep-research agents are capable of executing multi-step web exploration, targeted retrieval, and sophisticated question answering. Despite their powerful capabilities, deep-research agents face two critical bottlenecks: (1) the lack of large-scale, challenging datasets with real-world difficulty,...

1 min 1 month, 1 week ago

lease

LOW Academic International

From Global to Local: Learning Context-Aware Graph Representations for Document Classification and Summarization

arXiv:2603.00021v1 Announce Type: new Abstract: This paper proposes a data-driven method to automatically construct graph-based document representations. Building upon the recent work of Bugue\~no and de Melo (2025), we leverage the dynamic sliding-window attention module to effectively capture local and...

1 min 1 month, 1 week ago

construction

LOW Academic United States

EPPCMinerBen: A Novel Benchmark for Evaluating Large Language Models on Electronic Patient-Provider Communication via the Patient Portal

arXiv:2603.00028v1 Announce Type: new Abstract: Effective communication in health care is critical for treatment outcomes and adherence. With patient-provider exchanges shifting to secure messaging, analyzing electronic patient-communication (EPPC) data is both essential and challenging. We introduce EPPCMinerBen, a benchmark for...

1 min 1 month, 1 week ago

lease

LOW Academic International

Stepwise Penalization for Length-Efficient Chain-of-Thought Reasoning

arXiv:2603.00296v1 Announce Type: new Abstract: Large reasoning models improve with more test-time computation, but often overthink, producing unnecessarily long chains-of-thought that raise cost without improving accuracy. Prior reinforcement learning approaches typically rely on a single outcome reward with trajectory-level length...

1 min 1 month, 1 week ago

construction

LOW Academic International

Engineering Reasoning and Instruction (ERI) Benchmark: A Large Taxonomy-driven Dataset for Foundation Models and Agents

arXiv:2603.02239v1 Announce Type: new Abstract: The Engineering Reasoning and Instruction (ERI) benchmark is a taxonomy-driven instruction dataset designed to train and evaluate engineering-capable large language models (LLMs) and agents. This dataset spans nine engineering fields (namely: civil, mechanical, electrical, chemical,...

1 min 1 month, 1 week ago

lease

LOW Academic International

LiveAgentBench: Comprehensive Benchmarking of Agentic Systems Across 104 Real-World Challenges

arXiv:2603.02586v1 Announce Type: new Abstract: As large language models grow more capable, general AI agents have become increasingly prevalent in practical applications. However, existing benchmarks face significant limitations, failing to represent real-world user tasks accurately. To address this gap, we...

1 min 1 month, 1 week ago

lease

LOW Academic International

AgentAssay: Token-Efficient Regression Testing for Non-Deterministic AI Agent Workflows

arXiv:2603.02601v1 Announce Type: new Abstract: Autonomous AI agents are deployed at unprecedented scale, yet no principled methodology exists for verifying that an agent has not regressed after changes to its prompts, tools, models, or orchestration logic. We present AgentAssay, the...

1 min 1 month, 1 week ago

variance

LOW Academic International

See and Remember: A Multimodal Agent for Web Traversal

arXiv:2603.02626v1 Announce Type: new Abstract: Autonomous web navigation requires agents to perceive complex visual environments and maintain long-term context, yet current Large Language Model (LLM) based agents often struggle with spatial disorientation and navigation loops. In this paper, we propose...

1 min 1 month, 1 week ago

lien

LOW Academic International

Retrieval-Augmented Robots via Retrieve-Reason-Act

arXiv:2603.02688v1 Announce Type: new Abstract: To achieve general-purpose utility, we argue that robots must evolve from passive executors into active Information Retrieval users. In strictly zero-shot settings where no prior demonstrations exist, robots face a critical information gap, such as...

1 min 1 month, 1 week ago

construction

LOW Academic United States

Retrievit: In-context Retrieval Capabilities of Transformers, State Space Models, and Hybrid Architectures

arXiv:2603.02874v1 Announce Type: new Abstract: Transformers excel at in-context retrieval but suffer from quadratic complexity with sequence length, while State Space Models (SSMs) offer efficient linear-time processing but have limited retrieval capabilities. We investigate whether hybrid architectures combining Transformers and...

1 min 1 month, 1 week ago

property

LOW Academic International

Architecting Trust in Artificial Epistemic Agents

arXiv:2603.02960v1 Announce Type: new Abstract: Large language models increasingly function as epistemic agents -- entities that can 1) autonomously pursue epistemic goals and 2) actively shape our shared knowledge environment. They curate the information we receive, often supplanting traditional search-based...

1 min 1 month, 1 week ago

lien

Why Agent Caching Fails and How to Fix It: Structured Intent Canonicalization with Few-Shot Learning

Capable but Unreliable: Canonical Path Deviation as a Causal Mechanism of Agent Failure in Long-Horizon Tasks

Construct, Merge, Solve & Adapt with Reinforcement Learning for the min-max Multiple Traveling Salesman Problem

SleepLM: Natural-Language Intelligence for Human Sleep

MMKG-RDS: Reasoning Data Synthesis via Deep Mining of Multimodal Knowledge Graphs

Reasoning-Driven Multimodal LLM for Domain Generalization

Portfolio Reinforcement Learning with Scenario-Context Rollout

A Minimal Agent for Automated Theorem Proving

Toward General Semantic Chunking: A Discriminative Framework for Ultra-Long Documents

Democratizing GraphRAG: Linear, CPU-Only Graph Retrieval for Multi-Hop QA

SALIENT: Frequency-Aware Paired Diffusion for Controllable Long-Tail CT Detection

Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning

TraderBench: How Robust Are AI Agents in Adversarial Capital Markets?

Optimizing In-Context Demonstrations for LLM-based Automated Grading

Advancing Multimodal Judge Models through a Capability-Oriented Benchmark and MCTS-Driven Data Generation

LiTS: A Modular Framework for LLM Tree Search

InfoPO: Information-Driven Policy Optimization for User-Centric Agents

DIVA-GRPO: Enhancing Multimodal Reasoning through Difficulty-Adaptive Variant Advantage

Delaware Journal of Corporate Law

DeepResearch-9K: A Challenging Benchmark Dataset of Deep-Research Agent

From Global to Local: Learning Context-Aware Graph Representations for Document Classification and Summarization

EPPCMinerBen: A Novel Benchmark for Evaluating Large Language Models on Electronic Patient-Provider Communication via the Patient Portal

Stepwise Penalization for Length-Efficient Chain-of-Thought Reasoning

Engineering Reasoning and Instruction (ERI) Benchmark: A Large Taxonomy-driven Dataset for Foundation Models and Agents

LiveAgentBench: Comprehensive Benchmarking of Agentic Systems Across 104 Real-World Challenges

AgentAssay: Token-Efficient Regression Testing for Non-Deterministic AI Agent Workflows

See and Remember: A Multimodal Agent for Web Traversal

Retrieval-Augmented Robots via Retrieve-Reason-Act

Retrievit: In-context Retrieval Capabilities of Transformers, State Space Models, and Hybrid Architectures

Architecting Trust in Artificial Epistemic Agents

Impact Distribution

Related Practice Areas

JCG, PC

HSOLLC Co., Ltd.