When Reward Hacking Rebounds: Understanding and Mitigating It with Representation-Level Signals
arXiv:2604.01476v1 Announce Type: new Abstract: Reinforcement learning for LLMs is vulnerable to reward hacking, where models exploit shortcuts to maximize reward without solving the intended task. We systematically study this phenomenon in coding tasks using an environment-manipulation setting, where models...
Efficient and Principled Scientific Discovery through Bayesian Optimization: A Tutorial
arXiv:2604.01328v1 Announce Type: new Abstract: Traditional scientific discovery relies on an iterative hypothesise-experiment-refine cycle that has driven progress for centuries, but its intuitive, ad-hoc implementation often wastes resources, yields inefficient designs, and misses critical insights. This tutorial presents Bayesian Optimisation...
Asymmetric Actor-Critic for Multi-turn LLM Agents
arXiv:2604.00304v1 Announce Type: new Abstract: Large language models (LLMs) exhibit strong reasoning and conversational abilities, but ensuring reliable behavior in multi-turn interactions remains challenging. In many real-world applications, agents must succeed in one-shot settings where retries are impossible. Existing approaches...
OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models
arXiv:2604.00688v2 Announce Type: new Abstract: We present OmniVoice, a massive multilingual zero-shot text-to-speech (TTS) model that scales to over 600 languages. At its core is a novel diffusion language model-style discrete non-autoregressive (NAR) architecture. Unlike conventional discrete NAR models that...
The Chronicles of RiDiC: Generating Datasets with Controlled Popularity Distribution for Long-form Factuality Evaluation
arXiv:2604.00019v1 Announce Type: cross Abstract: We present a configurable pipeline for generating multilingual sets of entities with specified characteristics, such as domain, geographical location and popularity, using data from Wikipedia and Wikidata. These datasets are intended for evaluating the factuality...
Logarithmic Scores, Power-Law Discoveries: Disentangling Measurement from Coverage in Agent-Based Evaluation
arXiv:2604.00477v1 Announce Type: new Abstract: LLM-based agent judges are an emerging approach to evaluating conversational AI, yet a fundamental uncertainty remains: can we trust their assessments, and if so, how many are needed? Through 960 sessions with two model pairs...
Benchmark for Assessing Olfactory Perception of Large Language Models
arXiv:2604.00002v1 Announce Type: cross Abstract: Here we introduce the Olfactory Perception (OP) benchmark, designed to assess the capability of large language models (LLMs) to reason about smell. The benchmark contains 1,010 questions across eight task categories spanning odor classification, odor...
Variational LSTM with Augmented Inputs: Nonlinear Response History Metamodeling with Aleatoric and Epistemic Uncertainty
arXiv:2604.01587v1 Announce Type: new Abstract: Uncertainty propagation in high-dimensional nonlinear dynamic structural systems is pivotal in state-of-the-art performance-based design and risk assessment, where uncertainties from both excitations and structures, i.e., the aleatoric uncertainty, must be considered. This poses a significant...
Dynin-Omni: Omnimodal Unified Large Diffusion Language Model
arXiv:2604.00007v1 Announce Type: cross Abstract: We present Dynin-Omni, the first masked-diffusion-based omnimodal foundation model that unifies text, image, and speech understanding and generation, together with video understanding, within a single architecture. Unlike autoregressive unified models that serialize heterogeneous modalities, or...
MSA-Thinker: Discrimination-Calibration Reasoning with Hint-Guided Reinforcement Learning for Multimodal Sentiment Analysis
arXiv:2604.00013v1 Announce Type: cross Abstract: Multimodal sentiment analysis aims to understand human emotions by integrating textual, auditory, and visual modalities. Although Multimodal Large Language Models (MLLMs) have achieved state-of-the-art performance via supervised fine-tuning (SFT), their end-to-end "black-box" nature limits interpretability....
Brevity Constraints Reverse Performance Hierarchies in Language Models
arXiv:2604.00025v1 Announce Type: new Abstract: Standard evaluation protocols reveal a counterintuitive phenomenon: on 7.7% of benchmark problems spanning five datasets, larger language models underperform smaller ones by 28.4 percentage points despite 10-100x more parameters. Through systematic evaluation of 31 models...
Therefore I am. I Think
arXiv:2604.01202v2 Announce Type: new Abstract: We consider the question: when a large language reasoning model makes a choice, did it think first and then decide, or decide first and then think? In this paper, we present evidence that detectable,...
Microsoft takes on AI rivals with three new foundational models
MAI released models that can transcribe voice into text as well as generate audio and images after the group's formation six months ago.
A Taxonomy of Programming Languages for Code Generation
arXiv:2604.00239v1 Announce Type: new Abstract: The world's 7,000+ languages vary widely in the availability of resources for NLP, motivating efforts to systematically categorize them by their degree of resourcefulness (Joshi et al., 2020). A similar disparity exists among programming languages...
Criterion Validity of LLM-as-Judge for Business Outcomes in Conversational Commerce
arXiv:2604.00022v1 Announce Type: cross Abstract: Multi-dimensional rubric-based dialogue evaluation is widely used to assess conversational AI, yet its criterion validity -- whether quality scores are associated with the downstream outcomes they are meant to serve -- remains largely untested. We...
OpenAI, not yet public, raises $3B from retail investors in monster $122B fund raise
OpenAI's latest funding round, led by Amazon, Nvidia, and SoftBank, values the AI lab at $852 billion as it nears an IPO.
Locally Confident, Globally Stuck: The Quality-Exploration Dilemma in Diffusion Language Models
arXiv:2604.00375v1 Announce Type: new Abstract: Diffusion large language models (dLLMs) theoretically permit token decoding in arbitrary order, a flexibility that could enable richer exploration of reasoning paths than autoregressive (AR) LLMs. In practice, however, random-order decoding often hurts generation quality....
Do Language Models Know When They'll Refuse? Probing Introspective Awareness of Safety Boundaries
arXiv:2604.00228v1 Announce Type: new Abstract: Large language models are trained to refuse harmful requests, but can they accurately predict when they will refuse before responding? We investigate this question through a systematic study where models first predict their refusal behavior,...
Malliavin Calculus for Counterfactual Gradient Estimation in Adaptive Inverse Reinforcement Learning
arXiv:2604.01345v1 Announce Type: new Abstract: Inverse reinforcement learning (IRL) recovers the loss function of a forward learner from its observed responses; adaptive IRL aims to reconstruct the loss function of a forward learner by passively observing its gradients as it...
Cognichip wants AI to design the chips that power AI, and just raised $60M to try
The firm says it can reduce the cost of chip development by more than 75% and cut the timeline by more than half.
AI Company Safety Practices Fall Short of Public Commitments and Show Structural Weaknesses, as Top Performers Widen the Gap
But in a win for transparency, five leading companies participated in the scorecard's survey for the first time, providing critical new information to the public.
Judge irate as defendant joins by Zoom while driving—then lies about it
"Let me see the driver!"
AV1’s open, royalty-free promise in question as Dolby sues Snapchat over codec
Big Tech declaring AV1 royalty-free “doesn't mean that it is.”
Senators want US energy information agency to monitor data center electricity usage
In a letter, senators press for mandated annual electricity disclosure for data centers.
Anthropic’s Claude popularity with paying consumers is skyrocketing
Estimates for total Claude consumer users are all over the map (we've seen figures ranging from 18 million to 30 million). Anthropic hasn't disclosed this data, but a spokesperson did tell TechCrunch that Claude paid subscriptions have more than doubled...
VCs are betting billions on AI’s next wave, so why is OpenAI killing Sora?
When an 82-year-old Kentucky woman was offered $26 million from an AI company that wanted to build a data center on her land, she said no. Sure, that same company can try to rezone 2,000 acres nearby anyway, but as...
Wikipedia cracks down on the use of AI in article writing
The site, whose policies are subject to change, has struggled with the issue of AI-generated writing.
Cohere launches an open source voice model specifically for transcription
Relatively light at just 2 billion parameters, the model is meant for use with consumer-grade GPUs for those who want to self-host it. It currently supports 14 languages.
Corrigendum to “Generating dynamic lip-syncing using target audio in a multimedia environment” [Natural Language Processing Journal, Volume 8, 2024]
Do 3D Large Language Models Really Understand 3D Spatial Relationships?
arXiv:2603.23523v1 Announce Type: new Abstract: Recent 3D Large-Language Models (3D-LLMs) claim to understand 3D worlds, especially spatial relationships among objects. Yet, we find that simply fine-tuning a language model on text-only question-answer pairs can perform comparably or even surpass these...