
AI & Technology Law


LOW Academic International

Single-Agent LLMs Outperform Multi-Agent Systems on Multi-Hop Reasoning Under Equal Thinking Token Budgets

arXiv:2604.02460v1 Announce Type: new Abstract: Recent work reports strong performance from multi-agent LLM systems (MAS), but these gains are often confounded by increased test-time computation. When computation is normalized, single-agent systems (SAS) can match or outperform MAS, yet the theoretical...

1 min 1 week, 5 days ago
ai llm
LOW Academic International

Too Polite to Disagree: Understanding Sycophancy Propagation in Multi-Agent Systems

arXiv:2604.02668v1 Announce Type: new Abstract: Large language models (LLMs) often exhibit sycophancy: agreement with user stance even when it conflicts with the model's opinion. While prior work has mostly studied this in single-agent settings, it remains underexplored in collaborative multi-agent...

1 min 1 week, 5 days ago
ai llm
LOW Academic International

Reinforcement Learning-based Knowledge Distillation with LLM-as-a-Judge

arXiv:2604.02621v1 Announce Type: new Abstract: Reinforcement Learning (RL) has been shown to substantially improve the reasoning capability of small and large language models (LLMs), but existing approaches typically rely on verifiable rewards, hence ground truth labels. We propose an RL...

1 min 1 week, 5 days ago
ai llm
LOW Academic International

GRADE: Probing Knowledge Gaps in LLMs through Gradient Subspace Dynamics

arXiv:2604.02830v1 Announce Type: new Abstract: Detecting whether a model's internal knowledge is sufficient to correctly answer a given question is a fundamental challenge in deploying responsible LLMs. In addition to verbalising the confidence by LLM self-report, more recent methods explore...

1 min 1 week, 5 days ago
ai llm
LOW Academic International

Valence-Arousal Subspace in LLMs: Circular Emotion Geometry and Multi-Behavioral Control

arXiv:2604.03147v1 Announce Type: new Abstract: We present a method to identify a valence-arousal (VA) subspace within large language model representations. From 211k emotion-labeled texts, we derive emotion steering vectors, then learn VA axes as linear combinations of their top PCA...
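One way to picture the pipeline this abstract describes is a PCA over emotion-steering directions. The sketch below is illustrative only: real steering vectors come from model activations, and the paper learns VA axes as linear combinations of top PCA components rather than taking the components directly; all function names here are hypothetical.

```python
import numpy as np

def candidate_axes(steering_vectors, n_axes=2):
    # PCA over mean-centred emotion steering vectors; the leading
    # principal directions serve as candidate valence/arousal axes
    X = steering_vectors - steering_vectors.mean(axis=0)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:n_axes]

def steer(hidden_state, axis, strength):
    # shift a hidden state along an emotion axis to control behaviour
    return hidden_state + strength * axis
```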

1 min 1 week, 5 days ago
ai llm
LOW Academic International

Improving Role Consistency in Multi-Agent Collaboration via Quantitative Role Clarity

arXiv:2604.02770v1 Announce Type: new Abstract: In large language model (LLM)-driven multi-agent systems, disobeying role specification (failing to adhere to the defined responsibilities and constraints of an assigned role, potentially leading to an agent behaving like another) is a major failure...

1 min 1 week, 5 days ago
ai llm
LOW Academic International

Pragmatics Meets Culture: Culturally-adapted Artwork Description Generation and Evaluation

arXiv:2604.02557v1 Announce Type: new Abstract: Language models are known to exhibit various forms of cultural bias in decision-making tasks, yet much less is known about their degree of cultural familiarity in open-ended text generation tasks. In this paper, we introduce...

1 min 1 week, 5 days ago
ai bias
LOW Academic International

Interpretable Deep Reinforcement Learning for Element-level Bridge Life-cycle Optimization

arXiv:2604.02528v1 Announce Type: new Abstract: The new Specifications for the National Bridge Inventory (SNBI), in effect from 2022, emphasize the use of element-level condition states (CS) for risk-based bridge management. Instead of a general component rating, element-level condition data use...

1 min 1 week, 5 days ago
ai algorithm
LOW Academic International

Principled and Scalable Diversity-Aware Retrieval via Cardinality-Constrained Binary Quadratic Programming

arXiv:2604.02554v1 Announce Type: new Abstract: Diversity-aware retrieval is essential for Retrieval-Augmented Generation (RAG), yet existing methods lack theoretical guarantees and face scalability issues as the number of retrieved passages $k$ increases. We propose a principled formulation of diversity retrieval as...
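The paper's BQP formulation is not reproduced here, but the trade-off it optimises, relevance versus redundancy among the $k$ retrieved passages, can be illustrated with the standard greedy (MMR-style) baseline. Names and the λ weighting below are illustrative assumptions, not the paper's method.

```python
import numpy as np

def greedy_diverse_select(scores, sim, k, lam=0.5):
    # greedily trade off relevance (scores) against redundancy
    # (max similarity to anything already chosen)
    chosen = [int(np.argmax(scores))]
    while len(chosen) < k:
        best, best_val = None, -np.inf
        for i in range(len(scores)):
            if i in chosen:
                continue
            val = lam * scores[i] - (1 - lam) * max(sim[i][j] for j in chosen)
            if val > best_val:
                best, best_val = i, val
        chosen.append(best)
    return chosen
```

With two near-duplicate top documents, the second pick jumps to a less relevant but novel passage.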

1 min 1 week, 5 days ago
ai algorithm
LOW Academic International

AdaHOP: Fast and Accurate Low-Precision Training via Outlier-Pattern-Aware Rotation

arXiv:2604.02525v1 Announce Type: new Abstract: Low-precision training (LPT) commonly employs Hadamard transforms to suppress outliers and mitigate quantization error in large language models (LLMs). However, prior methods apply a fixed transform uniformly, despite substantial variation in outlier structures across tensors....
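For intuition, the fixed Hadamard rotation that prior LPT methods apply uniformly (the baseline AdaHOP refines) can be sketched as follows: an orthonormal Hadamard transform spreads a single large outlier across all coordinates, shrinking the dynamic range the quantizer sees. This is a generic illustration, not AdaHOP's outlier-pattern-aware variant.

```python
import numpy as np

def hadamard(n):
    # Sylvester construction; n must be a power of two
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)  # orthonormal rows/columns

def rotate_then_quantize(x, bits=4):
    H = hadamard(len(x))
    y = H @ x                                    # outlier energy is spread out
    scale = np.abs(y).max() / (2 ** (bits - 1) - 1)
    q = np.round(y / scale) * scale              # symmetric uniform quantization
    return H.T @ q                               # rotate back after dequantization
```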

1 min 1 week, 5 days ago
ai llm
LOW News International

Anthropic ramps up its political activities with a new PAC

With the midterms right around the corner, the new group is positioned to back candidates who support the AI company's policy agenda.

1 min 2 weeks, 1 day ago
ai artificial intelligence
LOW Academic International

Locally Confident, Globally Stuck: The Quality-Exploration Dilemma in Diffusion Language Models

arXiv:2604.00375v1 Announce Type: new Abstract: Diffusion large language models (dLLMs) theoretically permit token decoding in arbitrary order, a flexibility that could enable richer exploration of reasoning paths than autoregressive (AR) LLMs. In practice, however, random-order decoding often hurts generation quality....

1 min 2 weeks, 1 day ago
ai llm
LOW Academic International

Adaptive Parallel Monte Carlo Tree Search for Efficient Test-time Compute Scaling

arXiv:2604.00510v1 Announce Type: new Abstract: Monte Carlo Tree Search (MCTS) is an effective test-time compute scaling (TTCS) method for improving the reasoning performance of large language models, but its highly variable execution time leads to severe long-tail latency in practice....

1 min 2 weeks, 1 day ago
ai llm
LOW Academic International

Adapting Text LLMs to Speech via Multimodal Depth Up-Scaling

arXiv:2604.00489v1 Announce Type: new Abstract: Adapting pre-trained text Large Language Models (LLMs) into Speech Language Models (Speech LMs) via continual pretraining on speech data is promising, but often degrades the original text capabilities. We propose Multimodal Depth Upscaling, an extension...

1 min 2 weeks, 1 day ago
ai llm
LOW Academic International

MSA-Thinker: Discrimination-Calibration Reasoning with Hint-Guided Reinforcement Learning for Multimodal Sentiment Analysis

arXiv:2604.00013v1 Announce Type: cross Abstract: Multimodal sentiment analysis aims to understand human emotions by integrating textual, auditory, and visual modalities. Although Multimodal Large Language Models (MLLMs) have achieved state-of-the-art performance via supervised fine-tuning (SFT), their end-to-end "black-box" nature limits interpretability....

1 min 2 weeks, 1 day ago
ai llm
LOW Academic International

RefineRL: Advancing Competitive Programming with Self-Refinement Reinforcement Learning

arXiv:2604.00790v1 Announce Type: new Abstract: While large language models (LLMs) have demonstrated strong performance on complex reasoning tasks such as competitive programming (CP), existing methods predominantly focus on single-attempt settings, overlooking their capacity for iterative refinement. In this paper, we...

1 min 2 weeks, 1 day ago
ai llm
LOW Academic International

Asymmetric Actor-Critic for Multi-turn LLM Agents

arXiv:2604.00304v1 Announce Type: new Abstract: Large language models (LLMs) exhibit strong reasoning and conversational abilities, but ensuring reliable behavior in multi-turn interactions remains challenging. In many real-world applications, agents must succeed in one-shot settings where retries are impossible. Existing approaches...

1 min 2 weeks, 1 day ago
ai llm
LOW News International

Microsoft takes on AI rivals with three new foundational models

Six months after the group's formation, MAI released models that can transcribe voice to text as well as generate audio and images.

1 min 2 weeks, 1 day ago
ai artificial intelligence
LOW Academic International

Criterion Validity of LLM-as-Judge for Business Outcomes in Conversational Commerce

arXiv:2604.00022v1 Announce Type: cross Abstract: Multi-dimensional rubric-based dialogue evaluation is widely used to assess conversational AI, yet its criterion validity -- whether quality scores are associated with the downstream outcomes they are meant to serve -- remains largely untested. We...

1 min 2 weeks, 1 day ago
ai llm
LOW Academic International

Can Large Language Models Self-Correct in Medical Question Answering? An Exploratory Study

arXiv:2604.00261v2 Announce Type: new Abstract: Large language models (LLMs) have achieved strong performance on medical question answering (medical QA), and chain-of-thought (CoT) prompting has further improved results by eliciting explicit intermediate reasoning; meanwhile, self-reflective (self-corrective) prompting has been widely claimed...

1 min 2 weeks, 1 day ago
ai llm
LOW Academic International

Matching Accuracy, Different Geometry: Evolution Strategies vs GRPO in LLM Post-Training

arXiv:2604.01499v1 Announce Type: new Abstract: Evolution Strategies (ES) have emerged as a scalable gradient-free alternative to reinforcement learning based LLM fine-tuning, but it remains unclear whether comparable task performance implies comparable solutions in parameter space. We compare ES and Group...

1 min 2 weeks, 1 day ago
ai llm
LOW Academic International

A Taxonomy of Programming Languages for Code Generation

arXiv:2604.00239v1 Announce Type: new Abstract: The world's 7,000+ languages vary widely in the availability of resources for NLP, motivating efforts to systematically categorize them by their degree of resourcefulness (Joshi et al., 2020). A similar disparity exists among programming languages...

1 min 2 weeks, 1 day ago
ai llm
LOW Academic International

Label Shift Estimation With Incremental Prior Update

arXiv:2604.01651v1 Announce Type: new Abstract: An assumption often made in supervised learning is that the training and testing sets have the same label distribution. However, in real-life scenarios, this assumption rarely holds. For example, medical diagnosis result distributions change over...

News Monitor (1_14_4)

**Key Developments and Relevance to AI & Technology Law Practice Area**

The article discusses a new approach to post-hoc label shift estimation, relevant to the AI & Technology Law practice area because it addresses the challenge of adapting machine learning models to changing data distributions, a common issue in real-life scenarios such as medical diagnosis, fraud detection, and social media analysis. The proposed method incrementally updates the prior on each sample, adjusting each posterior for more accurate label shift estimation, which can have implications for liability and accountability in high-stakes applications of AI.

**Key Research Findings and Policy Signals**

1. **Label Shift Estimation**: The article proposes a new approach to post-hoc label shift estimation that can be applied to any black-box probabilistic classifier.
2. **Incremental Prior Update**: The proposed method incrementally updates the prior on each sample, adjusting each posterior for more accurate label shift estimation.
3. **Implications for Liability and Accountability**: The research highlights the need for more robust and adaptive AI systems that can handle changing data distributions, which may inform policy and regulatory developments in AI & Technology Law.

**Relevance to Current Legal Practice**

The article's findings and proposed method have implications for several areas of AI & Technology Law, including:

1. **Liability and Accountability**:
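As a rough illustration of the mechanism described above, an incremental prior update over a stream of black-box posteriors might look like the sketch below. The function name, learning-rate update, and prior-ratio reweighting are assumptions for illustration; the paper's exact estimator is not reproduced here.

```python
import numpy as np

def incremental_label_shift(posteriors, train_prior, lr=0.05):
    # q: running estimate of the test-time label prior,
    # initialised to the training prior
    q = train_prior.copy()
    for p in posteriors:
        # adjust the classifier's posterior by the prior ratio and renormalise
        adjusted = p * (q / train_prior)
        adjusted /= adjusted.sum()
        # incrementally pull the prior estimate toward the adjusted posterior
        q = (1 - lr) * q + lr * adjusted
    return q
```

On a stream dominated by one class, the estimated prior drifts away from the balanced training prior toward that class.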

Commentary Writer (1_14_6)

### **Jurisdictional Comparison & Analytical Commentary on "Label Shift Estimation With Incremental Prior Update" in AI & Technology Law**

This paper's focus on **label shift estimation**, a critical challenge in AI model reliability, has significant implications for **AI governance, liability, and compliance** across jurisdictions. The **U.S.** (via sectoral regulations such as the FDA's AI/ML guidance and FTC enforcement actions) would likely emphasize **transparency in model drift detection** as part of AI risk management, while **South Korea's AI Act** (aligned with the EU AI Act but with stricter accountability provisions) would require **documented post-hoc adjustments** to ensure fairness and safety. Internationally, the **OECD AI Principles** and the **UNESCO Recommendation on AI Ethics** would frame this as a **human rights and accountability issue**, pushing for **auditable AI systems** that can justify label shift corrections under regulatory scrutiny.

The **incremental prior update method** proposed here could influence **AI liability frameworks**, particularly in cases where undetected label shift leads to discriminatory outcomes (e.g., in hiring or lending AI). The **U.S. approach** (case-by-case enforcement) may treat this as an **FTC Act or state-level AI bias concern**, while **Korea's AI Act** would mandate **pre-market conformity assessments** for such adaptive models. At the **international level**, this work reinforces the

AI Liability Expert (1_14_9)

This paper’s focus on incremental prior update for label shift estimation addresses a critical gap in supervised learning assumptions, particularly relevant to practitioners in high-stakes domains like medical diagnostics and fraud detection, where label distributions evolve over time. Practitioners should note that this method’s reliance on a weaker calibration notion aligns with evolving regulatory expectations under frameworks like the EU AI Act, which emphasize adaptability and transparency in AI systems’ decision-making—particularly in Article 13 (Transparency Obligations) and Recital 32 (Risk Assessment). Moreover, the paper’s compatibility with black-box classifiers mirrors precedents in *Google v. Oracle* (2021), where the Court affirmed the viability of interoperability and post-hoc analysis in complex systems, supporting the legal permissibility of adapting AI models without full retraining. This approach offers a pragmatic bridge between technical innovation and legal compliance.

Statutes: Article 13, EU AI Act
Cases: Google v. Oracle
1 min 2 weeks, 1 day ago
ai algorithm
LOW Academic International

Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants

arXiv:2604.00842v1 Announce Type: new Abstract: Proactive agents that anticipate user needs and autonomously execute tasks hold great promise as digital assistants, yet the lack of realistic user simulation frameworks hinders their development. Existing approaches model apps as flat tool-calling APIs,...

1 min 2 weeks, 1 day ago
ai autonomous
LOW Academic International

Execution-Verified Reinforcement Learning for Optimization Modeling

arXiv:2604.00442v1 Announce Type: new Abstract: Automating optimization modeling with LLMs is a promising path toward scalable decision intelligence, but existing approaches either rely on agentic pipelines built on closed-source LLMs with high inference latency, or fine-tune smaller LLMs using costly...

1 min 2 weeks, 1 day ago
ai llm
LOW Academic International

WHBench: Evaluating Frontier LLMs with Expert-in-the-Loop Validation on Women's Health Topics

arXiv:2604.00024v1 Announce Type: new Abstract: Large language models are increasingly used for medical guidance, but women's health remains under-evaluated in benchmark design. We present the Women's Health Benchmark (WHBench), a targeted evaluation suite of 47 expert-crafted scenarios across 10 women's...

1 min 2 weeks, 1 day ago
ai llm
LOW Academic International

Massively Parallel Exact Inference for Hawkes Processes

arXiv:2604.01342v1 Announce Type: new Abstract: Multivariate Hawkes processes are a widely used class of self-exciting point processes, but maximum likelihood estimation naively scales as $O(N^2)$ in the number of events. The canonical linear exponential Hawkes process admits a faster $O(N)$...
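The $O(N)$ speedup for the exponential kernel that the abstract alludes to is the classic intensity recursion: each event's excitation can be carried forward with a single decayed accumulator instead of summing over all past events. A minimal univariate sketch:

```python
import math

def exp_hawkes_intensities(times, mu, alpha, beta):
    # O(N) recursion for a univariate exponential Hawkes process:
    # lambda(t_i) = mu + alpha * R_i,  R_i = exp(-beta * dt) * (R_{i-1} + 1)
    lams, R, prev = [], 0.0, None
    for t in times:
        if prev is not None:
            R = math.exp(-beta * (t - prev)) * (R + 1.0)
        lams.append(mu + alpha * R)
        prev = t
    return lams
```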

1 min 2 weeks, 1 day ago
ai algorithm
LOW Academic International

Open, Reliable, and Collective: A Community-Driven Framework for Tool-Using AI Agents

arXiv:2604.00137v1 Announce Type: new Abstract: Tool-integrated LLMs can retrieve, compute, and take real-world actions via external tools, but reliability remains a key bottleneck. We argue that failures stem from both tool-use accuracy (how well an agent invokes a tool) and...

1 min 2 weeks, 1 day ago
ai llm
LOW Academic International

Preference Guided Iterated Pareto Referent Optimisation for Accessible Route Planning

arXiv:2604.00795v1 Announce Type: new Abstract: We propose the Preference Guided Iterated Pareto Referent Optimisation (PG-IPRO) for urban route planning for people with different accessibility requirements and preferences. With this algorithm the user can interact with the system by giving feedback...

News Monitor (1_14_4)

This article highlights the development of AI systems like PG-IPRO that personalize route planning for individuals with diverse accessibility needs. For AI & Technology Law, this signals increasing legal focus on **AI explainability and transparency** in decision-making (how user preferences are weighted), **data privacy and bias** in collecting and utilizing sensitive accessibility data, and potential **regulatory requirements for algorithmic fairness and non-discrimination** in AI-powered services affecting public access and mobility. The interactive nature and efficiency claims also touch upon user experience and potential liability for system failures or suboptimal recommendations.

Commentary Writer (1_14_6)

## Analytical Commentary: Preference Guided Iterated Pareto Referent Optimisation and its Impact on AI & Technology Law

The "Preference Guided Iterated Pareto Referent Optimisation (PG-IPRO)" algorithm, as described in arXiv:2604.00795v1, presents a compelling advancement in human-AI interaction for complex, multi-objective decision-making, particularly in accessible urban route planning. Its core innovation lies in an intuitive, iterative feedback mechanism that lets users guide optimization without requiring full Pareto front computation. This has significant implications across several facets of AI & Technology Law, primarily user rights, algorithmic accountability, and data governance.

From a legal perspective, PG-IPRO's user-centric design, which allows individuals to directly influence the optimization process, inherently strengthens arguments around user autonomy and control over algorithmic outcomes. This is particularly salient in the context of accessibility, where personalized solutions are paramount. The algorithm's efficiency, achieved by avoiding full Pareto front computation, also mitigates potential legal challenges related to computational burden or "black box" decision-making, since the user actively participates in shaping the output.

However, the iterative feedback loop also introduces new considerations. The nature and scope of "feedback" and its impact on subsequent iterations could become a point of legal scrutiny, particularly if the system's responsiveness to user preferences is perceived as inadequate or discriminatory. Furthermore, while the algorithm avoids full Pareto front computation, the underlying objective
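To make the referent mechanism concrete, here is a toy sketch of selecting a non-dominated route closest to a user-supplied referent point over per-route objective costs. All names are hypothetical, and PG-IPRO's point is precisely that it avoids enumerating the full front, which this naive version does compute.

```python
import numpy as np

def pareto_front(costs):
    # indices of non-dominated points (all objectives minimised)
    keep = []
    for i, c in enumerate(costs):
        dominated = any(
            np.all(costs[j] <= c) and np.any(costs[j] < c)
            for j in range(len(costs)) if j != i
        )
        if not dominated:
            keep.append(i)
    return keep

def pick_by_referent(costs, referent):
    # choose the non-dominated route closest to the user's referent;
    # iterating on the referent stands in for the user's feedback loop
    front = pareto_front(costs)
    dists = [np.linalg.norm(costs[i] - referent) for i in front]
    return front[int(np.argmin(dists))]
```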

AI Liability Expert (1_14_9)

This article introduces PG-IPRO, an AI-driven route planning system for accessible urban navigation, which presents significant implications for practitioners in AI liability. The system's iterative, user-feedback-driven optimization of "accessible" routes introduces a complex interplay of user preferences and algorithmic decision-making.

**Expert Analysis & Implications for Practitioners:**

The PG-IPRO system, while designed to enhance accessibility, introduces several layers of potential liability. The core issue lies in the system's reliance on *user-guided feedback* to refine "optimal" routes, and its *avoidance of computing the full Pareto front*.

1. **Product Liability for Defective Design/Warning (Restatement (Third) of Torts: Products Liability § 2):**
   * **Implication:** If a PG-IPRO-generated route, refined by user feedback, leads to an injury (e.g., directing a user with specific mobility needs down an unexpectedly hazardous path), the manufacturer/developer could face claims of defective design. The "user preference" input, while intended to personalize, could be argued to offload critical safety considerations onto the end user without adequate safeguards or warnings.
   * **Connection:** This directly relates to the duty to design a reasonably safe product. Because the system *never computes the full set of alternative optimal policies*, it might miss a truly safer, albeit less user-preferred, route.

Statutes: § 2
1 min 2 weeks, 1 day ago
ai algorithm
LOW Academic International

Logarithmic Scores, Power-Law Discoveries: Disentangling Measurement from Coverage in Agent-Based Evaluation

arXiv:2604.00477v1 Announce Type: new Abstract: LLM-based agent judges are an emerging approach to evaluating conversational AI, yet a fundamental uncertainty remains: can we trust their assessments, and if so, how many are needed? Through 960 sessions with two model pairs...

1 min 2 weeks, 1 day ago
ai llm
Page 22 of 118

Impact Distribution

Critical 0
High 57
Medium 938
Low 4987