Litigation

LOW Academic International

OPRIDE: Offline Preference-based Reinforcement Learning via In-Dataset Exploration

arXiv:2604.02349v1 Announce Type: cross Abstract: Preference-based reinforcement learning (PbRL) can help avoid sophisticated reward designs and align better with human intentions, showing great promise in various real-world applications. However, obtaining human feedback for preferences can be expensive and time-consuming, which...

1 min 1 week, 4 days ago

motion

LOW Academic International

VoxelCodeBench: Benchmarking 3D World Modeling Through Code Generation

arXiv:2604.02580v1 Announce Type: new Abstract: Evaluating code generation models for 3D spatial reasoning requires executing generated code in realistic environments and assessing outputs beyond surface-level correctness. We introduce a platform, VoxelCode, for analyzing code generation capabilities for 3D understanding and...

1 min 1 week, 4 days ago

standing

LOW Academic International

Not All Denoising Steps Are Equal: Model Scheduling for Faster Masked Diffusion Language Models

arXiv:2604.02340v1 Announce Type: new Abstract: Recent advances in masked diffusion language models (MDLMs) narrow the quality gap to autoregressive LMs, but their sampling remains expensive because generation requires many full-sequence denoising passes with a large Transformer and, unlike autoregressive decoding,...

1 min 1 week, 4 days ago

mdl

LOW Academic International

ESL-Bench: An Event-Driven Synthetic Longitudinal Benchmark for Health Agents

arXiv:2604.02834v1 Announce Type: new Abstract: Longitudinal health agents must reason across multi-source trajectories that combine continuous device streams, sparse clinical exams, and episodic life events - yet evaluating them is hard: real-world data cannot be released at scale, and temporally...

1 min 1 week, 4 days ago

evidence

LOW Academic International

Too Polite to Disagree: Understanding Sycophancy Propagation in Multi-Agent Systems

arXiv:2604.02668v1 Announce Type: new Abstract: Large language models (LLMs) often exhibit sycophancy: agreement with user stance even when it conflicts with the model's opinion. While prior work has mostly studied this in single-agent settings, it remains underexplored in collaborative multi-agent...

1 min 1 week, 4 days ago

standing

LOW Academic International

Breakdowns in Conversational AI: Interactional Failures in Emotionally and Ethically Sensitive Contexts

arXiv:2604.02713v1 Announce Type: new Abstract: Conversational AI is increasingly deployed in emotionally charged and ethically sensitive interactions. Previous research has primarily concentrated on emotional benchmarks or static safety checks, overlooking how alignment unfolds in evolving conversation. We explore the research...

1 min 1 week, 4 days ago

motion

LOW Academic International

SWAY: A Counterfactual Computational Linguistic Approach to Measuring and Mitigating Sycophancy

arXiv:2604.02423v1 Announce Type: new Abstract: Large language models exhibit sycophancy: the tendency to shift outputs toward user-expressed stances, regardless of correctness or consistency. While prior work has studied this issue and its impacts, rigorous computational linguistic metrics are needed to...

1 min 1 week, 4 days ago

evidence

LOW Academic International

TRIMS: Trajectory-Ranked Instruction Masked Supervision for Diffusion Language Models

arXiv:2604.00666v1 Announce Type: new Abstract: Diffusion language models (DLMs) offer a promising path toward low-latency generation through parallel decoding, but their practical efficiency depends heavily on the decoding trajectory. In practice, this advantage often fails to fully materialize because standard...

1 min 2 weeks, 1 day ago

mdl

LOW Academic International

Therefore I am. I Think

arXiv:2604.01202v2 Announce Type: new Abstract: We consider the question: when a large language reasoning model makes a choice, did it think first and then decide to, or decide first and then think? In this paper, we present evidence that detectable,...

1 min 2 weeks, 1 day ago

evidence

LOW Academic International

Agent psychometrics: Task-level performance prediction in agentic coding benchmarks

arXiv:2604.00594v1 Announce Type: new Abstract: As the focus in LLM-based coding shifts from static single-step code generation to multi-step agentic interaction with tools and environments, understanding which tasks will challenge agents and why becomes increasingly difficult. This is compounded by...

1 min 2 weeks, 1 day ago

standing

LOW Academic International

Agentic AI -- Physicist Collaboration in Experimental Particle Physics: A Proof-of-Concept Measurement with LEP Open Data

arXiv:2603.05735v2 Announce Type: cross Abstract: We present an AI agentic measurement of the thrust distribution in $e^{+}e^{-}$ collisions at $\sqrt{s}=91.2$ GeV using archived ALEPH data. The analysis and all note writing are carried out entirely by AI agents (OpenAI Codex and...

1 min 2 weeks, 1 day ago

discovery

LOW Academic International

DISCO-TAB: A Hierarchical Reinforcement Learning Framework for Privacy-Preserving Synthesis of Complex Clinical Data

arXiv:2604.01481v1 Announce Type: new Abstract: The development of robust clinical decision support systems is frequently impeded by the scarcity of high-fidelity, privacy-preserving biomedical data. While Generative Large Language Models (LLMs) offer a promising avenue for synthetic data generation, they often...

1 min 2 weeks, 1 day ago

discovery

LOW Academic International

Detecting Multi-Agent Collusion Through Multi-Agent Interpretability

arXiv:2604.01151v1 Announce Type: new Abstract: As LLM agents are increasingly deployed in multi-agent systems, they introduce risks of covert coordination that may evade standard forms of human oversight. While linear probes on model activations have shown promise for detecting deception...

1 min 2 weeks, 1 day ago

evidence

LOW Academic International

Improvisational Games as a Benchmark for Social Intelligence of AI Agents: The Case of Connections

arXiv:2604.00284v1 Announce Type: new Abstract: We formally introduce an improvisational wordplay game called Connections to explore reasoning capabilities of AI agents. Playing Connections combines skills in knowledge retrieval, summarization and awareness of cognitive states of other agents. We show how...

1 min 2 weeks, 1 day ago

standing

LOW Academic International

Towards Intrinsically Calibrated Uncertainty Quantification in Industrial Data-Driven Models via Diffusion Sampler

arXiv:2604.01870v1 Announce Type: new Abstract: In modern process industries, data-driven models are important tools for real-time monitoring when key performance indicators are difficult to measure directly. While accurate predictions are essential, reliable uncertainty quantification (UQ) is equally critical for safety,...

1 min 2 weeks, 1 day ago

trial

LOW Academic International

When Reward Hacking Rebounds: Understanding and Mitigating It with Representation-Level Signals

arXiv:2604.01476v1 Announce Type: new Abstract: Reinforcement learning for LLMs is vulnerable to reward hacking, where models exploit shortcuts to maximize reward without solving the intended task. We systematically study this phenomenon in coding tasks using an environment-manipulation setting, where models...

1 min 2 weeks, 1 day ago

standing

LOW Academic International

An Online Machine Learning Multi-resolution Optimization Framework for Energy System Design Limit of Performance Analysis

arXiv:2604.01308v1 Announce Type: new Abstract: Designing reliable integrated energy systems for industrial processes requires optimization and verification models across multiple fidelities, from architecture-level sizing to high-fidelity dynamic operation. However, model mismatch across fidelities obscures the sources of performance loss and...

1 min 2 weeks, 1 day ago

trial

LOW Academic International

MSA-Thinker: Discrimination-Calibration Reasoning with Hint-Guided Reinforcement Learning for Multimodal Sentiment Analysis

arXiv:2604.00013v1 Announce Type: cross Abstract: Multimodal sentiment analysis aims to understand human emotions by integrating textual, auditory, and visual modalities. Although Multimodal Large Language Models (MLLMs) have achieved state-of-the-art performance via supervised fine-tuning (SFT), their end-to-end "black-box" nature limits interpretability....

1 min 2 weeks, 1 day ago

motion

LOW Academic International

CRIT: Graph-Based Automatic Data Synthesis to Enhance Cross-Modal Multi-Hop Reasoning

arXiv:2604.01634v1 Announce Type: new Abstract: Real-world reasoning often requires combining information across modalities, connecting textual context with visual cues in a multi-hop process. Yet, most multimodal benchmarks fail to capture this ability: they typically rely on single images or set...

1 min 2 weeks, 1 day ago

evidence

LOW Academic International

Logarithmic Scores, Power-Law Discoveries: Disentangling Measurement from Coverage in Agent-Based Evaluation

arXiv:2604.00477v1 Announce Type: new Abstract: LLM-based agent judges are an emerging approach to evaluating conversational AI, yet a fundamental uncertainty remains: can we trust their assessments, and if so, how many are needed? Through 960 sessions with two model pairs...

1 min 2 weeks, 1 day ago

discovery

LOW Academic International

Does Unification Come at a Cost? Uni-SafeBench: A Safety Benchmark for Unified Multimodal Large Models

arXiv:2604.00547v1 Announce Type: new Abstract: Unified Multimodal Large Models (UMLMs) integrate understanding and generation capabilities within a single architecture. While this architectural unification, driven by the deep fusion of multimodal features, enhances model performance, it also introduces important yet underexplored...

1 min 2 weeks, 1 day ago

standing

LOW Academic International

Dynin-Omni: Omnimodal Unified Large Diffusion Language Model

arXiv:2604.00007v1 Announce Type: cross Abstract: We present Dynin-Omni, the first masked-diffusion-based omnimodal foundation model that unifies text, image, and speech understanding and generation, together with video understanding, within a single architecture. Unlike autoregressive unified models that serialize heterogeneous modalities, or...

1 min 2 weeks, 1 day ago

standing

LOW Academic International

Can Large Language Models Self-Correct in Medical Question Answering? An Exploratory Study

arXiv:2604.00261v2 Announce Type: new Abstract: Large language models (LLMs) have achieved strong performance on medical question answering (medical QA), and chain-of-thought (CoT) prompting has further improved results by eliciting explicit intermediate reasoning; meanwhile, self-reflective (self-corrective) prompting has been widely claimed...

1 min 2 weeks, 1 day ago

standing

LOW Academic International

How Emotion Shapes the Behavior of LLMs and Agents: A Mechanistic Study

arXiv:2604.00005v1 Announce Type: new Abstract: Emotion plays an important role in human cognition and performance. Motivated by this, we investigate whether analogous emotional signals can shape the behavior of large language models (LLMs) and agents. Existing emotion-aware studies mainly treat...

1 min 2 weeks, 1 day ago

motion

LOW News International

Perplexity's "Incognito Mode" is a "sham," lawsuit says

Google, Meta, and Perplexity accused of sharing millions of chats to increase ad revenue.

1 min 2 weeks, 1 day ago

lawsuit

LOW Academic International

Speech LLMs are Contextual Reasoning Transcribers

arXiv:2604.00610v1 Announce Type: new Abstract: Despite extensions to speech inputs, effectively leveraging the rich knowledge and contextual understanding of large language models (LLMs) in automatic speech recognition (ASR) remains non-trivial, as the task primarily involves direct speech-to-text mapping. To address...

1 min 2 weeks, 1 day ago

standing

LOW Academic International

HippoCamp: Benchmarking Contextual Agents on Personal Computers

arXiv:2604.01221v1 Announce Type: new Abstract: We present HippoCamp, a new benchmark designed to evaluate agents' capabilities on multimodal file management. Unlike existing agent benchmarks that focus on tasks like web interaction, tool use, or software automation in generic settings, HippoCamp...

1 min 2 weeks, 1 day ago

evidence

LOW Academic International

BloClaw: An Omniscient, Multi-Modal Agentic Workspace for Next-Generation Scientific Discovery

arXiv:2604.00550v1 Announce Type: new Abstract: The integration of Large Language Models (LLMs) into life sciences has catalyzed the development of "AI Scientists." However, translating these theoretical capabilities into deployment-ready research environments exposes profound infrastructural vulnerabilities. Current frameworks are bottlenecked by...

1 min 2 weeks, 1 day ago

discovery

LOW News International

AV1’s open, royalty-free promise in question as Dolby sues Snapchat over codec

Big Tech declaring AV1 royalty-free "doesn't mean that it is."

1 min 2 weeks, 6 days ago

lawsuit

LOW News International

Elon Musk loses big in court; X boycott perfectly legal

X admonished for "fishing expedition" as judge dismisses ad boycott lawsuit.

1 min 2 weeks, 6 days ago

lawsuit
