Litigation

LOW Academic International

Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification

arXiv:2603.02798v1 Announce Type: new Abstract: As LLM-powered agents have been used for high-stakes decision-making, such as clinical diagnosis, it becomes critical to develop reliable verification of their decisions to facilitate trustworthy deployment. Yet, existing verifiers usually underperform owing to a...

1 min 1 month, 1 week ago

evidence

LOW Academic International

RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization

arXiv:2603.03078v1 Announce Type: new Abstract: Agentic Reinforcement Learning (Agentic RL) has shown remarkable potential in large language model-based (LLM) agents. These works can empower LLM agents to tackle complex tasks via multi-step, tool-integrated reasoning. However, an inherent limitation of existing...

1 min 1 month, 1 week ago

discovery

LOW Academic International

Beyond Factual Correctness: Mitigating Preference-Inconsistent Explanations in Explainable Recommendation

arXiv:2603.03080v1 Announce Type: new Abstract: LLM-based explainable recommenders can produce fluent explanations that are factually correct, yet still justify items using attributes that conflict with a user's historical preferences. Such preference-inconsistent explanations yield logically valid but unconvincing reasoning and are...

1 min 1 month, 1 week ago

evidence

LOW Academic European Union

Odin: Multi-Signal Graph Intelligence for Autonomous Discovery in Knowledge Graphs

arXiv:2603.03097v1 Announce Type: new Abstract: We present Odin, the first production-deployed graph intelligence engine for autonomous discovery of meaningful patterns in knowledge graphs without prior specification. Unlike retrieval-based systems that answer predefined queries, Odin guides exploration through the COMPASS (Composite...

1 min 1 month, 1 week ago

discovery

LOW Academic United States

Universal Conceptual Structure in Neural Translation: Probing NLLB-200's Multilingual Geometry

arXiv:2603.02258v1 Announce Type: new Abstract: Do neural machine translation models learn language-universal conceptual representations, or do they merely cluster languages by surface similarity? We investigate this question by probing the representation geometry of Meta's NLLB-200, a 200-language encoder-decoder Transformer, through...

1 min 1 month, 1 week ago

evidence

LOW Academic International

CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think

arXiv:2603.02547v1 Announce Type: new Abstract: We study why continuous diffusion language models (DLMs) have lagged behind discrete diffusion approaches despite their appealing continuous generative dynamics. Under a controlled token-recovery study, we identify token rounding, the final projection from denoised embeddings...

1 min 1 month, 1 week ago

appeal

LOW Academic European Union

Mozi: Governed Autonomy for Drug Discovery LLM Agents

arXiv:2603.03655v1 Announce Type: new Abstract: Tool-augmented large language model (LLM) agents promise to unify scientific reasoning with computation, yet their deployment in high-stakes domains like drug discovery is bottlenecked by two critical barriers: unconstrained tool-use governance and poor long-horizon reliability....

1 min 1 month, 1 week ago

discovery

LOW Academic European Union

AI4S-SDS: A Neuro-Symbolic Solvent Design System via Sparse MCTS and Differentiable Physics Alignment

arXiv:2603.03686v1 Announce Type: new Abstract: Automated design of chemical formulations is a cornerstone of materials science, yet it requires navigating a high-dimensional combinatorial space involving discrete compositional choices and continuous geometric constraints. Existing Large Language Model (LLM) agents face significant...

1 min 1 month, 1 week ago

discovery

LOW Academic International

Phi-4-reasoning-vision-15B Technical Report

arXiv:2603.03975v1 Announce Type: new Abstract: We present Phi-4-reasoning-vision-15B, a compact open-weight multimodal reasoning model, and share the motivations, design choices, experiments, and learnings that informed its development. Our goal is to contribute practical insight to the research community on building...

1 min 1 month, 1 week ago

standing

LOW Academic International

Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions

arXiv:2603.04191v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly serving as personal assistants, where users share complex and diverse preferences over extended interactions. However, assessing how well LLMs can follow these preferences in realistic, long-term situations remains underexplored....

1 min 1 month, 1 week ago

standing

LOW Academic International

$\tau$-Knowledge: Evaluating Conversational Agents over Unstructured Knowledge

arXiv:2603.04370v1 Announce Type: new Abstract: Conversational agents are increasingly deployed in knowledge-intensive settings, where correct behavior depends on retrieving and applying domain-specific knowledge from large, proprietary, and unstructured corpora during live interactions with users. Yet most existing benchmarks evaluate retrieval...

1 min 1 month, 1 week ago

trial

LOW Academic European Union

Discovering mathematical concepts through a multi-agent system

arXiv:2603.04528v1 Announce Type: new Abstract: Mathematical concepts emerge through an interplay of processes, including experimentation, efforts at proof, and counterexamples. In this paper, we present a new multi-agent model for computational mathematical discovery based on this observation. Our system, conceived...

1 min 1 month, 1 week ago

discovery

LOW Academic International

When Agents Persuade: Propaganda Generation and Mitigation in LLMs

arXiv:2603.04636v1 Announce Type: new Abstract: Despite their wide-ranging benefits, LLM-based agents deployed in open environments can be exploited to produce manipulative material. In this study, we task LLMs with propaganda objectives and analyze their outputs using two domain-specific models: one...

1 min 1 month, 1 week ago

appeal

LOW Academic European Union

Model Medicine: A Clinical Framework for Understanding, Diagnosing, and Treating AI Models

arXiv:2603.04722v1 Announce Type: new Abstract: Model Medicine is the science of understanding, diagnosing, treating, and preventing disorders in AI models, grounded in the principle that AI models -- like biological organisms -- have internal structures, dynamic processes, heritable traits, observable...

1 min 1 month, 1 week ago

standing

LOW Academic European Union

Solving an Open Problem in Theoretical Physics using AI-Assisted Discovery

arXiv:2603.04735v1 Announce Type: new Abstract: This paper demonstrates that artificial intelligence can accelerate mathematical discovery by autonomously solving an open problem in theoretical physics. We present a neuro-symbolic system, combining the Gemini Deep Think large language model with a systematic...

1 min 1 month, 1 week ago

discovery

LOW Academic United States

Evaluating the Search Agent in a Parallel World

arXiv:2603.04751v1 Announce Type: new Abstract: Integrating web search tools has significantly extended the capability of LLMs to address open-world, real-time, and long-tail problems. However, evaluating these Search Agents presents formidable challenges. First, constructing high-quality deep search benchmarks is prohibitively expensive,...

1 min 1 month, 1 week ago

evidence

LOW Academic International

LLM-Grounded Explainability for Port Congestion Prediction via Temporal Graph Attention Networks

arXiv:2603.04818v1 Announce Type: new Abstract: Port congestion at major maritime hubs disrupts global supply chains, yet existing prediction systems typically prioritize forecasting accuracy without providing operationally interpretable explanations. This paper proposes AIS-TGNN, an evidence-grounded framework that jointly performs congestion-escalation prediction...

1 min 1 month, 1 week ago

evidence

LOW Academic European Union

Design Behaviour Codes (DBCs): A Taxonomy-Driven Layered Governance Benchmark for Large Language Models

arXiv:2603.04837v1 Announce Type: new Abstract: We introduce the Dynamic Behavioral Constraint (DBC) benchmark, the first empirical framework for evaluating the efficacy of a structured, 150-control behavioral governance layer, the MDBC (Madan DBC) system, applied at inference time to large language...

1 min 1 month, 1 week ago

jurisdiction

LOW Academic International

Retrieval-Augmented Generation with Covariate Time Series

arXiv:2603.04951v1 Announce Type: new Abstract: While RAG has greatly enhanced LLMs, extending this paradigm to Time-Series Foundation Models (TSFMs) remains a challenge. This is exemplified in the Predictive Maintenance of the Pressure Regulating and Shut-Off Valve (PRSOV), a high-stakes industrial...

1 min 1 month, 1 week ago

trial

LOW Academic United States

Enhancing Zero-shot Commonsense Reasoning by Integrating Visual Knowledge via Machine Imagination

arXiv:2603.05040v1 Announce Type: new Abstract: Recent advancements in zero-shot commonsense reasoning have empowered Pre-trained Language Models (PLMs) to acquire extensive commonsense knowledge without requiring task-specific fine-tuning. Despite this progress, these models frequently suffer from limitations caused by human reporting biases...

1 min 1 month, 1 week ago

standing

LOW Academic United States

Jagarin: A Three-Layer Architecture for Hibernating Personal Duty Agents on Mobile

arXiv:2603.05069v1 Announce Type: new Abstract: Personal AI agents face a fundamental deployment paradox on mobile: persistent background execution drains battery and violates platform sandboxing policies, yet purely reactive agents miss time-sensitive obligations until the user remembers to ask. We present...

1 min 1 month, 1 week ago

motion

LOW Academic United States

MedCoRAG: Interpretable Hepatology Diagnosis via Hybrid Evidence Retrieval and Multispecialty Consensus

arXiv:2603.05129v1 Announce Type: new Abstract: Diagnosing hepatic diseases accurately and interpretably is critical, yet it remains challenging in real-world clinical settings. Existing AI approaches for clinical diagnosis often lack transparency, structured reasoning, and deployability. Recent efforts have leveraged large language...

1 min 1 month, 1 week ago

evidence

LOW Academic International

CTRL-RAG: Contrastive Likelihood Reward Based Reinforcement Learning for Context-Faithful RAG Models

arXiv:2603.04406v1 Announce Type: new Abstract: With the growing use of Retrieval-Augmented Generation (RAG), training large language models (LLMs) for context-sensitive reasoning and faithfulness is increasingly important. Existing RAG-oriented reinforcement learning (RL) methods rely on external rewards that often fail to...

1 min 1 month, 1 week ago

evidence

LOW Academic European Union

Multiclass Hate Speech Detection with RoBERTa-OTA: Integrating Transformer Attention and Graph Convolutional Networks

arXiv:2603.04414v1 Announce Type: new Abstract: Multiclass hate speech detection across demographic categories remains computationally challenging due to implicit targeting strategies and linguistic variability in social media content. Existing approaches rely solely on learned representations from training data, without explicitly incorporating...

1 min 1 month, 1 week ago

standing

LOW Academic International

Coordinated Semantic Alignment and Evidence Constraints for Retrieval-Augmented Generation with Large Language Models

arXiv:2603.04647v1 Announce Type: new Abstract: Retrieval augmented generation mitigates limitations of large language models in factual consistency and knowledge updating by introducing external knowledge. However, practical applications still suffer from semantic misalignment between retrieved results and generation objectives, as well...

1 min 1 month, 1 week ago

evidence

LOW Academic International

iAgentBench: Benchmarking Sensemaking Capabilities of Information-Seeking Agents on High-Traffic Topics

arXiv:2603.04656v1 Announce Type: new Abstract: With the emergence of search-enabled generative QA systems, users are increasingly turning to tools that browse, aggregate, and reconcile evidence across multiple sources on their behalf. Yet many widely used QA benchmarks remain answerable by...

1 min 1 month, 1 week ago

evidence

LOW Academic International

Stacked from One: Multi-Scale Self-Injection for Context Window Extension

arXiv:2603.04759v1 Announce Type: new Abstract: The limited context window of contemporary large language models (LLMs) remains a primary bottleneck for their broader application across diverse domains. Although continual pre-training on long-context data offers a straightforward solution, it incurs prohibitive data...

1 min 1 month, 1 week ago

standing

LOW Academic International

TSEmbed: Unlocking Task Scaling in Universal Multimodal Embeddings

arXiv:2603.04772v1 Announce Type: new Abstract: Despite the exceptional reasoning capabilities of Multimodal Large Language Models (MLLMs), their adaptation into universal embedding models is significantly impeded by task conflict. To address this, we propose TSEmbed, a universal multimodal embedding framework that...

1 min 1 month, 1 week ago

trial

LOW Academic International

Autoscoring Anticlimax: A Meta-analytic Understanding of AI's Short-answer Shortcomings and Wording Weaknesses

arXiv:2603.04820v1 Announce Type: new Abstract: Automated short-answer scoring lags other LLM applications. We meta-analyze 890 culminating results across a systematic review of LLM short-answer scoring studies, modeling the traditional effect size of Quadratic Weighted Kappa (QWK) with mixed effects metaregression....

1 min 1 month, 1 week ago

standing

LOW Academic European Union

Machine Learning for Complex Systems Dynamics: Detecting Bifurcations in Dynamical Systems with Deep Neural Networks

arXiv:2603.04420v1 Announce Type: new Abstract: Critical transitions are the abrupt shifts between qualitatively different states of a system, and they are crucial to understanding tipping points in complex dynamical systems across ecology, climate science, and biology. Detecting these shifts typically...

1 min 1 month, 1 week ago

standing

