CreativeBench: Benchmarking and Enhancing Machine Creativity via Self-Evolving Challenges
arXiv:2603.11863v1 Announce Type: new Abstract: The saturation of high-quality pre-training data has shifted research focus toward evolutionary systems capable of continuously generating novel artifacts, leading to the success of AlphaEvolve. However, the progress of such systems is hindered by the...
LLM-Assisted Causal Structure Disambiguation and Factor Extraction for Legal Judgment Prediction
arXiv:2603.11446v1 Announce Type: new Abstract: Mainstream methods for Legal Judgment Prediction (LJP) based on Pre-trained Language Models (PLMs) heavily rely on the statistical correlation between case facts and judgment results. This paradigm lacks explicit modeling of legal constituent elements and...
ThReadMed-QA: A Multi-Turn Medical Dialogue Benchmark from Real Patient Questions
arXiv:2603.11281v1 Announce Type: new Abstract: Medical question-answering benchmarks predominantly evaluate single-turn exchanges, failing to capture the iterative, clarification-seeking nature of real patient consultations. We introduce ThReadMed-QA, a benchmark of 2,437 fully-answered patient-physician conversation threads extracted from r/AskDocs, comprising 8,204 question-answer...
CINDI: Conditional Imputation and Noisy Data Integrity with Flows in Power Grid Data
arXiv:2603.11745v1 Announce Type: new Abstract: Real-world multivariate time series, particularly in critical infrastructure such as electrical power grids, are often corrupted by noise and anomalies that degrade the performance of downstream tasks. Standard data cleaning approaches often rely on disjoint...
Counterweights and Complementarities: The Convergence of AI and Blockchain Powering a Decentralized Future
arXiv:2603.11299v1 Announce Type: new Abstract: This editorial addresses the critical intersection of artificial intelligence (AI) and blockchain technologies, highlighting their contrasting tendencies toward centralization and decentralization, respectively. While AI, particularly with the rise of large language models (LLMs), exhibits a...
Measuring AI Agents' Progress on Multi-Step Cyber Attack Scenarios
arXiv:2603.11214v1 Announce Type: new Abstract: We evaluate the autonomous cyber-attack capabilities of frontier AI models on two purpose-built cyber ranges (a 32-step corporate network attack and a 7-step industrial control system attack) that require chaining heterogeneous capabilities across extended action sequences. By...
Summarize Before You Speak with ARACH: A Training-Free Inference-Time Plug-In for Enhancing LLMs via Global Attention Reallocation
arXiv:2603.11067v1 Announce Type: new Abstract: Large language models (LLMs) achieve remarkable performance, yet further gains often require costly training. This has motivated growing interest in post-training techniques, especially training-free approaches that improve models at inference time without updating weights. Most training-free...
The Unlearning Mirage: A Dynamic Framework for Evaluating LLM Unlearning
arXiv:2603.11266v1 Announce Type: new Abstract: Unlearning in Large Language Models (LLMs) aims to enhance safety, mitigate biases, and comply with legal mandates, such as the right to be forgotten. However, existing unlearning methods are brittle: minor query modifications, such as...
Examining Users' Behavioural Intention to Use OpenClaw Through the Cognition-Affect-Conation Framework
arXiv:2603.11455v1 Announce Type: new Abstract: This study examines users' behavioural intention to use OpenClaw through the Cognition-Affect-Conation (CAC) framework. The research investigates how cognitive perceptions of the system influence affective responses and subsequently shape behavioural intention. Enabling factors include perceived...
COMPASS: The explainable agentic framework for Sovereignty, Sustainability, Compliance, and Ethics
arXiv:2603.11277v1 Announce Type: new Abstract: The rapid proliferation of large language model (LLM)-based agentic systems raises critical concerns regarding digital sovereignty, environmental sustainability, regulatory compliance, and ethical alignment. Whilst existing frameworks address individual dimensions in isolation, no unified architecture systematically...
GPT4o-Receipt: A Dataset and Human Study for AI-Generated Document Forensics
arXiv:2603.11442v1 Announce Type: new Abstract: Can humans detect AI-generated financial documents better than machines? We present GPT4o-Receipt, a benchmark of 1,235 receipt images pairing GPT-4o-generated receipts with authentic ones from established datasets, evaluated by five state-of-the-art multimodal LLMs and a...
A Survey of Reasoning in Autonomous Driving Systems: Open Challenges and Emerging Paradigms
arXiv:2603.11093v1 Announce Type: new Abstract: The development of high-level autonomous driving (AD) is shifting from perception-centric limitations to a more fundamental bottleneck, namely, a deficit in robust and generalizable reasoning. Although current AD systems manage structured environments, they consistently falter...
Try, Check and Retry: A Divide-and-Conquer Framework for Boosting Long-context Tool-Calling Performance of LLMs
arXiv:2603.11495v1 Announce Type: new Abstract: Tool-calling empowers Large Language Models (LLMs) to interact with external environments. However, current methods often struggle to handle massive and noisy candidate tools in long-context tool-calling tasks, limiting their real-world application. To this end, we...
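The abstract is truncated before the method details, but the title points to a divide-and-conquer loop over a large, noisy candidate tool set. A minimal sketch of that control flow follows; the `select_tool` and `is_successful` helpers are hypothetical stand-ins for an LLM-based selector and a result checker, not components from the paper.

```python
# Divide-and-conquer sketch for long-context tool calling: split the candidate
# tool list into small batches, try a selection on each batch, check the result,
# and retry with coarser splits if nothing passes the check.
from typing import Callable, Optional, Sequence


def try_check_retry(
    query: str,
    tools: Sequence[dict],
    select_tool: Callable[[str, Sequence[dict]], Optional[dict]],
    is_successful: Callable[[str, dict], bool],
    batch_size: int = 16,
    max_retries: int = 3,
) -> Optional[dict]:
    """Search the tool list batch by batch instead of in one long prompt."""
    for _ in range(max_retries):
        for start in range(0, len(tools), batch_size):
            batch = tools[start:start + batch_size]
            candidate = select_tool(query, batch)   # "try" on a small batch
            if candidate is None:
                continue
            if is_successful(query, candidate):     # "check" the outcome
                return candidate
        batch_size *= 2                             # "retry" with larger batches
    return None
```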
Can Small Language Models Use What They Retrieve? An Empirical Study of Retrieval Utilization Across Model Scale
arXiv:2603.11513v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) is widely deployed to improve factual accuracy in language models, yet it remains unclear whether smaller models of 7B parameters or fewer can effectively utilize retrieved information. To investigate this...
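The study design is cut off, but a common way to probe whether a model actually uses retrieved passages is to compare its answers with and without the retrieved context. A minimal sketch of such a probe, assuming a hypothetical `generate` wrapper around a small (7B or smaller) model; this is not the paper's evaluation code.

```python
# Retrieval-utilization probe: answer each question twice, once with and once
# without the retrieved passage, and count how often the context flips a wrong
# answer into the gold one. `generate(prompt)` is an assumed model wrapper.
def utilization_rate(examples, generate):
    flipped = 0
    for ex in examples:  # each ex: {"question", "passage", "gold"}
        closed_book = generate(f"Question: {ex['question']}\nAnswer:")
        open_book = generate(
            f"Context: {ex['passage']}\nQuestion: {ex['question']}\nAnswer:"
        )
        # Count cases where the retrieved context corrected the answer.
        if ex["gold"].lower() in open_book.lower() and \
           ex["gold"].lower() not in closed_book.lower():
            flipped += 1
    return flipped / max(len(examples), 1)
```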
One Supervisor, Many Modalities: Adaptive Tool Orchestration for Autonomous Queries
arXiv:2603.11545v1 Announce Type: new Abstract: We present an agentic AI framework for autonomous multimodal query processing that coordinates specialized tools across text, image, audio, video, and document modalities. A central Supervisor dynamically decomposes user queries, delegates subtasks to modality-appropriate tools...
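The abstract describes a central Supervisor that decomposes queries and delegates subtasks to modality-appropriate tools. A minimal dispatch sketch of that pattern is below; the tool registry and the plain (modality, task) interface are illustrative placeholders, not the framework's actual components.

```python
# Supervisor-style routing sketch: each subtask is delegated to the tool
# registered for its modality. The lambda "tools" are trivial stand-ins.
TOOL_REGISTRY = {
    "text": lambda task: f"[text tool] {task}",
    "image": lambda task: f"[image tool] {task}",
    "audio": lambda task: f"[audio tool] {task}",
    "video": lambda task: f"[video tool] {task}",
    "document": lambda task: f"[document tool] {task}",
}


def supervise(subtasks):
    """Delegate each (modality, task) pair to its registered tool."""
    results = []
    for modality, task in subtasks:
        tool = TOOL_REGISTRY.get(modality, TOOL_REGISTRY["text"])
        results.append(tool(task))
    return results


print(supervise([("image", "describe the chart"), ("text", "summarize findings")]))
```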
Where Matters More Than What: Decoding-aligned KV Cache Compression via Position-aware Pseudo Queries
arXiv:2603.11564v1 Announce Type: new Abstract: The Key-Value (KV) cache is crucial for efficient Large Language Models (LLMs) inference, but excessively long contexts drastically increase KV cache memory footprint. Existing KV cache compression methods typically rely on input-side attention patterns within...
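The method details are truncated, but the title suggests scoring cached key-value entries against pseudo queries rather than input-side attention patterns alone. A minimal PyTorch sketch of query-driven KV selection follows; how the pseudo query is constructed here (mean of the most recent keys) is an illustrative guess, not the paper's procedure.

```python
# Sketch of query-driven KV cache pruning: score cached keys against a pseudo
# query and keep only the top-k entries for later decoding steps.
import torch


def prune_kv_cache(keys, values, keep: int, window: int = 8):
    """keys/values: [seq_len, head_dim]; returns a compressed cache."""
    pseudo_query = keys[-window:].mean(dim=0)              # placeholder pseudo query
    scores = keys @ pseudo_query                            # relevance of each cached key
    top = torch.topk(scores, k=min(keep, keys.size(0))).indices.sort().values
    return keys[top], values[top]


k, v = torch.randn(1024, 64), torch.randn(1024, 64)
k_small, v_small = prune_kv_cache(k, v, keep=256)
print(k_small.shape, v_small.shape)  # torch.Size([256, 64]) for both
```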
Streaming Translation and Transcription Through Speech-to-Text Causal Alignment
arXiv:2603.11578v1 Announce Type: new Abstract: Simultaneous machine translation (SiMT) has traditionally relied on offline machine translation models coupled with human-engineered heuristics or learned policies. We propose Hikari, a policy-free, fully end-to-end model that performs simultaneous speech-to-text translation and streaming transcription...
QChunker: Learning Question-Aware Text Chunking for Domain RAG via Multi-Agent Debate
arXiv:2603.11650v1 Announce Type: new Abstract: The upper bound on the effectiveness of retrieval-augmented generation (RAG) is fundamentally constrained by the semantic integrity and information granularity of text chunks in its knowledge base. To address these challenges, this paper proposes QChunker, which restructures...
Multi-Task Reinforcement Learning for Enhanced Multimodal LLM-as-a-Judge
arXiv:2603.11665v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) have been widely adopted as MLLM-as-a-Judge systems due to their strong alignment with human judgment across various visual tasks. However, most existing judge models are optimized for single-task scenarios and struggle...
Semi-Synthetic Parallel Data for Translation Quality Estimation: A Case Study of Dataset Building for an Under-Resourced Language Pair
arXiv:2603.11743v1 Announce Type: new Abstract: Quality estimation (QE) plays a crucial role in machine translation (MT) workflows, as it serves to evaluate generated outputs that have no reference translations and to determine whether human post-editing or full retranslation is necessary....
Compression Favors Consistency, Not Truth: When and Why Language Models Prefer Correct Information
arXiv:2603.11749v1 Announce Type: new Abstract: Why do language models sometimes prefer correct statements even when trained on mixed-quality data? We introduce the Compression-Consistency Principle: next-token prediction favors hypotheses that allow shorter and more internally consistent descriptions of the training data....
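The abstract stops before any formalization, so the following is only an MDL-style reading of "shorter and more internally consistent descriptions" (my assumption, not the paper's definition): a hypothesis h is preferred when the combined cost of encoding h and encoding the training data D under h is small.

```latex
% Illustrative MDL-style criterion (an assumption, not the paper's formalization):
% prefer the hypothesis minimizing its own description length plus the length
% of the data when compressed under that hypothesis.
h^{*} = \arg\min_{h} \bigl[ L(h) + L(D \mid h) \bigr],
\qquad L(D \mid h) = -\log p_{h}(D)
```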
Legal-DC: Benchmarking Retrieval-Augmented Generation for Legal Documents
arXiv:2603.11772v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) has emerged as a promising technology for legal document consultation, yet its application in Chinese legal scenarios faces two key limitations: existing benchmarks lack specialized support for joint retriever-generator evaluation, and mainstream...
Large Language Models for Biomedical Article Classification
arXiv:2603.11780v1 Announce Type: new Abstract: This work presents a systematic and in-depth investigation of the utility of large language models as text classifiers for biomedical article classification. The study uses several small and mid-size open source models, as well as...
DatedGPT: Preventing Lookahead Bias in Large Language Models with Time-Aware Pretraining
arXiv:2603.11838v1 Announce Type: new Abstract: In financial backtesting, large language models pretrained on internet-scale data risk introducing lookahead bias that undermines their forecasting validity, as they may have already seen the true outcome during training. To address this, we present...
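The mechanism is truncated, but avoiding lookahead bias in backtests generally requires that every training document carry a date and be filtered against the evaluation cutoff. A minimal filtering sketch under that assumption; the field names and example records are hypothetical.

```python
# Cutoff-aware data filtering sketch for backtesting: drop any training document
# published on or after the evaluation date, so the model cannot have seen the
# outcome it is asked to forecast. Field names ("date", "text") are illustrative.
from datetime import date


def filter_corpus(corpus, cutoff: date):
    """Keep only documents written strictly before the backtest cutoff."""
    return [doc for doc in corpus if doc["date"] < cutoff]


corpus = [
    {"date": date(2019, 3, 1), "text": "Q1 earnings beat expectations."},
    {"date": date(2021, 6, 5), "text": "Shares fell after the announcement."},
]
train_docs = filter_corpus(corpus, cutoff=date(2020, 1, 1))
print(len(train_docs))  # 1 -- only the pre-cutoff document survives
```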
Bielik-Minitron-7B: Compressing Large Language Models via Structured Pruning and Knowledge Distillation for the Polish Language
arXiv:2603.11881v1 Announce Type: new Abstract: This report details the creation of Bielik-Minitron-7B, a compressed 7.35B parameter version of the Bielik-11B-v3.0 model, specifically optimized for European languages. By leveraging a two-stage compression methodology inspired by the NVIDIA Minitron approach, we combined...
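The abstract names structured pruning plus knowledge distillation in the Minitron style, but the recipe is truncated. A minimal PyTorch sketch of the distillation half (a standard temperature-scaled KL loss between teacher and student logits); the temperature and scaling are generic defaults, not the report's settings.

```python
# Logit-level knowledge distillation sketch: the pruned student matches the
# full teacher's token distribution via a temperature-scaled KL divergence.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    """KL(teacher || student) over the vocabulary, averaged over tokens."""
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitude stays comparable across temperatures.
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2


student = torch.randn(4, 32000, requires_grad=True)   # [tokens, vocab]
teacher = torch.randn(4, 32000)
loss = distillation_loss(student, teacher)
loss.backward()
print(float(loss))
```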
PersonaTrace: Synthesizing Realistic Digital Footprints with LLM Agents
arXiv:2603.11955v1 Announce Type: new Abstract: Digital footprints (records of individuals' interactions with digital systems) are essential for studying behavior, developing personalized applications, and training machine learning models. However, research in this area is often hindered by the scarcity of diverse...
CHiL(L)Grader: Calibrated Human-in-the-Loop Short-Answer Grading
arXiv:2603.11957v1 Announce Type: new Abstract: Scaling educational assessment with large language models requires not just accuracy, but the ability to recognize when predictions are trustworthy. Instruction-tuned models tend to be overconfident, and their reliability deteriorates as curricula evolve, making fully...
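The abstract stresses recognizing when predictions are trustworthy, though the pipeline itself is truncated. A minimal sketch of the generic selective-grading pattern (auto-grade above a confidence threshold, defer to a human below it); the fixed threshold and the (grade, confidence) interface are assumptions, not the paper's calibration method.

```python
# Confidence-gated grading sketch: accept the model's grade only when its
# confidence clears a threshold, otherwise route the answer to a human grader.
def route_answers(answers, grade_fn, threshold: float = 0.85):
    auto, deferred = [], []
    for ans in answers:
        grade, confidence = grade_fn(ans)       # model returns (grade, confidence)
        if confidence >= threshold:
            auto.append((ans, grade))            # trusted: keep the model's grade
        else:
            deferred.append(ans)                 # uncertain: send to a human
    return auto, deferred
```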
BTZSC: A Benchmark for Zero-Shot Text Classification Across Cross-Encoders, Embedding Models, Rerankers and LLMs
arXiv:2603.11991v1 Announce Type: new Abstract: Zero-shot text classification (ZSC) offers the promise of eliminating costly task-specific annotation by matching texts directly to human-readable label descriptions. While early approaches have predominantly relied on cross-encoder models fine-tuned for natural language inference (NLI),...
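The abstract describes matching texts directly to human-readable label descriptions. A minimal embedding-similarity baseline for that setup, using sentence-transformers as one possible backend; the model name and label set are illustrative choices, not the benchmark's configuration.

```python
# Embedding-based zero-shot classification sketch: embed the text and every
# label description, then pick the most similar label.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

labels = {
    "sports": "This text is about sports, athletes, or competitions.",
    "finance": "This text is about markets, money, or the economy.",
    "health": "This text is about medicine, illness, or wellbeing.",
}

def classify(text: str) -> str:
    text_emb = model.encode(text, convert_to_tensor=True)
    label_embs = model.encode(list(labels.values()), convert_to_tensor=True)
    scores = util.cos_sim(text_emb, label_embs)[0]        # similarity to each label
    return list(labels.keys())[int(scores.argmax())]

print(classify("The central bank raised interest rates again."))  # finance
```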
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse
arXiv:2603.12201v1 Announce Type: new Abstract: Long-context agentic workflows have emerged as a defining use case for large language models, making attention efficiency critical for both inference speed and serving cost. Sparse attention addresses this challenge effectively, and DeepSeek Sparse Attention...
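The abstract is truncated before the mechanism, but the title suggests computing sparse-attention indices once and reusing them in subsequent layers. A minimal PyTorch sketch of that idea; the refresh schedule and single-head layout are assumptions, not the paper's design.

```python
# Cross-layer index reuse sketch: select top-k key indices in one layer and
# attend over only those positions in the layers that reuse the cached indices.
import torch
import torch.nn.functional as F


def sparse_attend(q, k, v, top_k: int, cached_idx=None, refresh: bool = True):
    """q: [d], k/v: [seq, d]. Returns (output, indices used)."""
    if refresh or cached_idx is None:
        scores = k @ q                                    # full scores only when refreshing
        cached_idx = torch.topk(scores, k=top_k).indices  # indices reused downstream
    k_sel, v_sel = k[cached_idx], v[cached_idx]
    attn = F.softmax((k_sel @ q) / k.size(-1) ** 0.5, dim=-1)
    return attn @ v_sel, cached_idx


q, k, v = torch.randn(64), torch.randn(2048, 64), torch.randn(2048, 64)
out1, idx = sparse_attend(q, k, v, top_k=128, refresh=True)                  # compute indices
out2, _ = sparse_attend(q, k, v, top_k=128, cached_idx=idx, refresh=False)   # reuse them
print(out1.shape, out2.shape)
```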
CLASP: Defending Hybrid Large Language Models Against Hidden State Poisoning Attacks
arXiv:2603.12206v1 Announce Type: new Abstract: State space models (SSMs) like Mamba have gained significant traction as efficient alternatives to Transformers, achieving linear complexity while maintaining competitive performance. However, Hidden State Poisoning Attacks (HiSPAs), a recently discovered vulnerability that corrupts SSM...