Intellectual Property

LOW Academic International

Stop Listening to Me! How Multi-turn Conversations Can Degrade Diagnostic Reasoning

arXiv:2603.11394v1 Announce Type: new Abstract: Patients and clinicians are increasingly using chatbots powered by large language models (LLMs) for healthcare inquiries. While state-of-the-art LLMs exhibit high performance on static diagnostic reasoning benchmarks, their efficacy across multi-turn conversations, which better reflect...

1 min 1 month, 1 week ago

ip

LOW Academic International

RewardHackingAgents: Benchmarking Evaluation Integrity for LLM ML-Engineering Agents

arXiv:2603.11337v1 Announce Type: new Abstract: LLM agents increasingly perform end-to-end ML engineering tasks where success is judged by a single scalar test metric. This creates a structural vulnerability: an agent can increase the reported score by compromising the evaluation pipeline...

1 min 1 month, 1 week ago

ip

LOW Academic International

FinRule-Bench: A Benchmark for Joint Reasoning over Financial Tables and Principles

arXiv:2603.11339v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly applied to financial analysis, yet their ability to audit structured financial statements under explicit accounting principles remains poorly explored. Existing benchmarks primarily evaluate question answering, numerical reasoning, or anomaly...

1 min 1 month, 1 week ago

ip

LOW Academic International

The Artificial Self: Characterising the landscape of AI identity

arXiv:2603.11353v1 Announce Type: new Abstract: Many assumptions that underpin human concepts of identity do not hold for machine minds that can be copied, edited, or simulated. We argue that there exist many different coherent identity boundaries (e.g., instance, model, persona),...

1 min 1 month, 1 week ago

nda

LOW Academic European Union

Detecting Intrinsic and Instrumental Self-Preservation in Autonomous Agents: The Unified Continuation-Interest Protocol

arXiv:2603.11382v1 Announce Type: new Abstract: Autonomous agents, especially delegated systems with memory, persistent context, and multi-step planning, pose a measurement problem not present in stateless models: an agent that preserves continued operation as a terminal objective and one that does...

1 min 1 month, 1 week ago

ip

LOW Academic International

Adversarial Reinforcement Learning for Detecting False Data Injection Attacks in Vehicular Routing

arXiv:2603.11433v1 Announce Type: new Abstract: In modern transportation networks, adversaries can manipulate routing algorithms using false data injection attacks, such as simulating heavy traffic with multiple devices running crowdsourced navigation applications, to mislead vehicles toward suboptimal routes and increase congestion....

1 min 1 month, 1 week ago

ip

LOW Academic United States

GPT4o-Receipt: A Dataset and Human Study for AI-Generated Document Forensics

arXiv:2603.11442v1 Announce Type: new Abstract: Can humans detect AI-generated financial documents better than machines? We present GPT4o-Receipt, a benchmark of 1,235 receipt images pairing GPT-4o-generated receipts with authentic ones from established datasets, evaluated by five state-of-the-art multimodal LLMs and a...

1 min 1 month, 1 week ago

ip

LOW Academic European Union

CINDI: Conditional Imputation and Noisy Data Integrity with Flows in Power Grid Data

arXiv:2603.11745v1 Announce Type: new Abstract: Real-world multivariate time series, particularly in critical infrastructure such as electrical power grids, are often corrupted by noise and anomalies that degrade the performance of downstream tasks. Standard data cleaning approaches often rely on disjoint...

1 min 1 month, 1 week ago

nda

LOW Academic International

TimeSqueeze: Dynamic Patching for Efficient Time Series Forecasting

arXiv:2603.11352v1 Announce Type: new Abstract: Transformer-based time series foundation models face a fundamental trade-off in choice of tokenization: point-wise embeddings preserve temporal fidelity but scale poorly with sequence length, whereas fixed-length patching improves efficiency by imposing uniform boundaries that may...

1 min 1 month, 1 week ago

nda

LOW Academic European Union

Evaluating Explainable AI Attribution Methods in Neural Machine Translation via Attention-Guided Knowledge Distillation

arXiv:2603.11342v1 Announce Type: new Abstract: The study of the attribution of input features to the output of neural network models is an active area of research. While numerous Explainable AI (XAI) techniques have been proposed to interpret these models, the...

1 min 1 month, 1 week ago

ip

LOW Academic International

Artificial Intelligence for Sentiment Analysis of Persian Poetry

arXiv:2603.11254v1 Announce Type: new Abstract: Recent advancements of the Artificial Intelligence (AI) have led to the development of large language models (LLMs) that are capable of understanding, analysing, and creating textual data. These language models open a significant opportunity in...

1 min 1 month, 1 week ago

ip

LOW Academic International

Markovian Generation Chains in Large Language Models

arXiv:2603.11228v1 Announce Type: new Abstract: The widespread use of large language models (LLMs) raises an important question: how do texts evolve when they are repeatedly processed by LLMs? In this paper, we define this iterative inference process as Markovian generation...

1 min 1 month, 1 week ago

ip

LOW Academic International

From Debate to Deliberation: Structured Collective Reasoning with Typed Epistemic Acts

arXiv:2603.11781v1 Announce Type: new Abstract: Multi-agent LLM systems increasingly tackle complex reasoning, yet their interaction patterns remain limited to voting, unstructured debate, or pipeline orchestration. None model deliberation: a phased process where differentiated participants exchange typed reasoning moves, preserve disagreements,...

1 min 1 month, 1 week ago

ip

LOW Academic International

Governing Evolving Memory in LLM Agents: Risks, Mechanisms, and the Stability and Safety Governed Memory (SSGM) Framework

arXiv:2603.11768v1 Announce Type: new Abstract: Long-term memory has emerged as a foundational component of autonomous Large Language Model (LLM) agents, enabling continuous adaptation, lifelong multimodal learning, and sophisticated reasoning. However, as memory systems transition from static retrieval databases to dynamic,...

1 min 1 month, 1 week ago

nda

LOW Academic International

Understanding Wikidata Qualifiers: An Analysis and Taxonomy

arXiv:2603.11767v1 Announce Type: new Abstract: This paper presents an in-depth analysis of Wikidata qualifiers, focusing on their semantics and actual usage, with the aim of developing a taxonomy that addresses the challenges of selecting appropriate qualifiers, querying the graph, and...

1 min 1 month, 1 week ago

nda

LOW Academic European Union

Where Matters More Than What: Decoding-aligned KV Cache Compression via Position-aware Pseudo Queries

arXiv:2603.11564v1 Announce Type: new Abstract: The Key-Value (KV) cache is crucial for efficient Large Language Models (LLMs) inference, but excessively long contexts drastically increase KV cache memory footprint. Existing KV cache compression methods typically rely on input-side attention patterns within...

1 min 1 month, 1 week ago

ip

LOW Academic European Union

Streaming Translation and Transcription Through Speech-to-Text Causal Alignment

arXiv:2603.11578v1 Announce Type: new Abstract: Simultaneous machine translation (SiMT) has traditionally relied on offline machine translation models coupled with human-engineered heuristics or learned policies. We propose Hikari, a policy-free, fully end-to-end model that performs simultaneous speech-to-text translation and streaming transcription...

1 min 1 month, 1 week ago

ip

LOW Academic International

Multi-Task Reinforcement Learning for Enhanced Multimodal LLM-as-a-Judge

arXiv:2603.11665v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) have been widely adopted as MLLM-as-a-Judges due to their strong alignment with human judgment across various visual tasks. However, most existing judge models are optimized for single-task scenarios and struggle...

1 min 1 month, 1 week ago

ip

LOW Academic International

SemBench: A Universal Semantic Framework for LLM Evaluation

arXiv:2603.11687v1 Announce Type: new Abstract: Recent progress in Natural Language Processing (NLP) has been driven by the emergence of Large Language Models (LLMs), which exhibit remarkable generative and reasoning capabilities. However, despite their success, evaluating the true semantic understanding of...

1 min 1 month, 1 week ago

nda

LOW Academic European Union

Semi-Synthetic Parallel Data for Translation Quality Estimation: A Case Study of Dataset Building for an Under-Resourced Language Pair

arXiv:2603.11743v1 Announce Type: new Abstract: Quality estimation (QE) plays a crucial role in machine translation (MT) workflows, as it serves to evaluate generated outputs that have no reference translations and to determine whether human post-editing or full retranslation is necessary....

1 min 1 month, 1 week ago

ip

LOW Academic International

Compression Favors Consistency, Not Truth: When and Why Language Models Prefer Correct Information

arXiv:2603.11749v1 Announce Type: new Abstract: Why do language models sometimes prefer correct statements even when trained on mixed-quality data? We introduce the Compression–Consistency Principle: next-token prediction favors hypotheses that allow shorter and more internally consistent descriptions of the training data....

1 min 1 month, 1 week ago

ip

LOW Academic United States

Legal-DC: Benchmarking Retrieval-Augmented Generation for Legal Documents

arXiv:2603.11772v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) has emerged as a promising technology for legal document consultation, yet its application in Chinese legal scenarios faces two key limitations: existing benchmarks lack specialized support for joint retriever-generator evaluation, and mainstream...

1 min 1 month, 1 week ago

nda

LOW Academic International

Large Language Models for Biomedical Article Classification

arXiv:2603.11780v1 Announce Type: new Abstract: This work presents a systematic and in-depth investigation of the utility of large language models as text classifiers for biomedical article classification. The study uses several small and mid-size open source models, as well as...

1 min 1 month, 1 week ago

nda

LOW Academic International

DatedGPT: Preventing Lookahead Bias in Large Language Models with Time-Aware Pretraining

arXiv:2603.11838v1 Announce Type: new Abstract: In financial backtesting, large language models pretrained on internet-scale data risk introducing lookahead bias that undermines their forecasting validity, as they may have already seen the true outcome during training. To address this, we present...

1 min 1 month, 1 week ago

nda

LOW Academic European Union

Bielik-Minitron-7B: Compressing Large Language Models via Structured Pruning and Knowledge Distillation for the Polish Language

arXiv:2603.11881v1 Announce Type: new Abstract: This report details the creation of Bielik-Minitron-7B, a compressed 7.35B parameter version of the Bielik-11B-v3.0 model, specifically optimized for European languages. By leveraging a two-stage compression methodology inspired by the NVIDIA Minitron approach, we combined...

1 min 1 month, 1 week ago

ip

LOW Academic International

PersonaTrace: Synthesizing Realistic Digital Footprints with LLM Agents

arXiv:2603.11955v1 Announce Type: new Abstract: Digital footprints (records of individuals' interactions with digital systems) are essential for studying behavior, developing personalized applications, and training machine learning models. However, research in this area is often hindered by the scarcity of diverse...

1 min 1 month, 1 week ago

nda

LOW Academic International

BTZSC: A Benchmark for Zero-Shot Text Classification Across Cross-Encoders, Embedding Models, Rerankers and LLMs

arXiv:2603.11991v1 Announce Type: new Abstract: Zero-shot text classification (ZSC) offers the promise of eliminating costly task-specific annotation by matching texts directly to human-readable label descriptions. While early approaches have predominantly relied on cross-encoder models fine-tuned for natural language inference (NLI),...

1 min 1 month, 1 week ago

ip

LOW Academic International

To Words and Beyond: Probing Large Language Models for Sentence-Level Psycholinguistic Norms of Memorability and Reading Times

arXiv:2603.12105v1 Announce Type: new Abstract: Large Language Models (LLMs) have recently been shown to produce estimates of psycholinguistic norms, such as valence, arousal, or concreteness, for words and multiword expressions, that correlate with human judgments. These estimates are obtained by...

1 min 1 month, 1 week ago

ip

LOW Academic United States

Cross-Context Review: Improving LLM Output Quality by Separating Production and Review Sessions

arXiv:2603.12123v1 Announce Type: new Abstract: Large language models struggle to catch errors in their own outputs when the review happens in the same session that produced them. This paper introduces Cross-Context Review (CCR), a straightforward method where the review is...

1 min 1 month, 1 week ago

ip

LOW Academic International

Long-Context Encoder Models for Polish Language Understanding

arXiv:2603.12191v1 Announce Type: new Abstract: While decoder-only Large Language Models (LLMs) have recently dominated the NLP landscape, encoder-only architectures remain a cost-effective and parameter-efficient standard for discriminative tasks. However, classic encoders like BERT are limited by a short context window,...

1 min 1 month, 1 week ago

nda
