Immigration Law

LOW Academic International

Self-Attribution Bias: When AI Monitors Go Easy on Themselves

arXiv:2603.04582v1 Announce Type: new Abstract: Agentic systems increasingly rely on language models to monitor their own behavior. For example, coding agents may self critique generated code for pull request approval or assess the safety of tool-use actions. We show that...

1 min 1 month, 1 week ago

ead

LOW Academic International

Interactive Benchmarks

arXiv:2603.04737v1 Announce Type: new Abstract: Standard benchmarks have become increasingly unreliable due to saturation, subjectivity, and poor generalization. We argue that evaluating model's ability to acquire information actively is important to assess model's intelligence. We propose Interactive Benchmarks, a unified...

1 min 1 month, 1 week ago

tps

LOW Academic International

Breaking Contextual Inertia: Reinforcement Learning with Single-Turn Anchors for Stable Multi-Turn Interaction

arXiv:2603.04783v1 Announce Type: new Abstract: While LLMs demonstrate strong reasoning capabilities when provided with full information in a single turn, they exhibit substantial vulnerability in multi-turn interactions. Specifically, when information is revealed incrementally or requires updates, models frequently fail to...

1 min 1 month, 1 week ago

ead

LOW Academic International

Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling

arXiv:2603.04791v1 Announce Type: new Abstract: We introduce Timer-S1, a strong Mixture-of-Experts (MoE) time series foundation model with 8.3B total parameters, 0.75B activated parameters for each token, and a context length of 11.5K. To overcome the scalability bottleneck in existing pre-trained...

1 min 1 month, 1 week ago

ead

LOW Academic International

VISA: Value Injection via Shielded Adaptation for Personalized LLM Alignment

arXiv:2603.04822v1 Announce Type: new Abstract: Aligning Large Language Models (LLMs) with nuanced human values remains a critical challenge, as existing methods like Reinforcement Learning from Human Feedback (RLHF) often handle only coarse-grained attributes. In practice, fine-tuning LLMs on task-specific datasets...

1 min 1 month, 1 week ago

visa

LOW Academic International

K-Gen: A Multimodal Language-Conditioned Approach for Interpretable Keypoint-Guided Trajectory Generation

arXiv:2603.04868v1 Announce Type: new Abstract: Generating realistic and diverse trajectories is a critical challenge in autonomous driving simulation. While Large Language Models (LLMs) show promise, existing methods often rely on structured data like vectorized maps, which fail to capture the...

1 min 1 month, 1 week ago

ead

LOW Academic International

SEA-TS: Self-Evolving Agent for Autonomous Code Generation of Time Series Forecasting Algorithms

arXiv:2603.04873v1 Announce Type: new Abstract: Accurate time series forecasting underpins decision-making across domains, yet conventional ML development suffers from data scarcity in new deployments, poor adaptability under distribution shift, and diminishing returns from manual iteration. We propose Self-Evolving Agent for...

1 min 1 month, 1 week ago

ead

LOW Academic International

Bounded State in an Infinite Horizon: Proactive Hierarchical Memory for Ad-Hoc Recall over Streaming Dialogues

arXiv:2603.04885v1 Announce Type: new Abstract: Real-world dialogue usually unfolds as an infinite stream. It thus requires bounded-state memory mechanisms to operate within an infinite horizon. However, existing read-then-think memory is fundamentally misaligned with this setting, as it cannot support ad-hoc...

1 min 1 month, 1 week ago

ead

LOW Academic International

The Thinking Boundary: Quantifying Reasoning Suitability of Multimodal Tasks via Dual Tuning

arXiv:2603.04415v1 Announce Type: new Abstract: While reasoning-enhanced Large Language Models (LLMs) have demonstrated remarkable advances in complex tasks such as mathematics and coding, their effectiveness across universal multimodal scenarios remains uncertain. The trend of releasing parallel "Instruct" and "Thinking" models...

1 min 1 month, 1 week ago

ead

LOW Academic International

Same Input, Different Scores: A Multi Model Study on the Inconsistency of LLM Judge

arXiv:2603.04417v1 Announce Type: new Abstract: Large language models are increasingly used as automated evaluators in research and enterprise settings, a practice known as LLM-as-a-judge. While prior work has examined accuracy, bias, and alignment with human preferences, far less attention has...

1 min 1 month, 1 week ago

ead

LOW Academic International

A unified foundational framework for knowledge injection and evaluation of Large Language Models in Combustion Science

arXiv:2603.04452v1 Announce Type: new Abstract: To advance foundation Large Language Models (LLMs) for combustion science, this study presents the first end-to-end framework for developing domain-specialized models for the combustion community. The framework comprises an AI-ready multimodal knowledge base at the...

1 min 1 month, 1 week ago

ead

LOW Academic International

Induced Numerical Instability: Hidden Costs in Multimodal Large Language Models

arXiv:2603.04453v1 Announce Type: new Abstract: The use of multimodal large language models has become widespread, and as such the study of these models and their failure points has become of utmost importance. We study a novel mode of failure that...

1 min 1 month, 1 week ago

ead

LOW Academic International

Query Disambiguation via Answer-Free Context: Doubling Performance on Humanity's Last Exam

arXiv:2603.04454v1 Announce Type: new Abstract: How carefully and unambiguously a question is phrased has a profound impact on the quality of the response, for Language Models (LMs) as well as people. While model capabilities continue to advance, the interplay between...

1 min 1 month, 1 week ago

tps

LOW Academic International

From Static Inference to Dynamic Interaction: Navigating the Landscape of Streaming Large Language Models

arXiv:2603.04592v1 Announce Type: new Abstract: Standard Large Language Models (LLMs) are predominantly designed for static inference with pre-defined inputs, which limits their applicability in dynamic, real-time scenarios. To address this gap, the streaming LLM paradigm has emerged. However, existing definitions...

1 min 1 month, 1 week ago

tps

LOW Academic International

Non-Zipfian Distribution of Stopwords and Subset Selection Models

arXiv:2603.04691v1 Announce Type: new Abstract: Stopwords are words that are not very informative to the content or the meaning of a language text. Most stopwords are function words but can also be common verbs, adjectives and adverbs. In contrast to...

1 min 1 month, 1 week ago

ead

LOW Academic International

IF-RewardBench: Benchmarking Judge Models for Instruction-Following Evaluation

arXiv:2603.04738v1 Announce Type: new Abstract: Instruction-following is a foundational capability of large language models (LLMs), with its improvement hinging on scalable and accurate feedback from judge models. However, the reliability of current judge models in instruction-following remains underexplored due to...

1 min 1 month, 1 week ago

tps

LOW Academic International

Beyond the Context Window: A Cost-Performance Analysis of Fact-Based Memory vs. Long-Context LLMs for Persistent Agents

arXiv:2603.04814v1 Announce Type: new Abstract: Persistent conversational AI systems face a choice between passing full conversation histories to a long-context large language model (LLM) and maintaining a dedicated memory system that extracts and retrieves structured facts. We compare a fact-based...

1 min 1 month, 1 week ago

ead

LOW Academic International

SinhaLegal: A Benchmark Corpus for Information Extraction and Analysis in Sinhala Legislative Texts

arXiv:2603.04854v1 Announce Type: new Abstract: SinhaLegal introduces a Sinhala legislative text corpus containing approximately 2 million words across 1,206 legal documents. The dataset includes two types of legal documents: 1,065 Acts dated from 1981 to 2014 and 141 Bills from...

1 min 1 month, 1 week ago

ead

LOW Academic International

AILS-NTUA at SemEval-2026 Task 10: Agentic LLMs for Psycholinguistic Marker Extraction and Conspiracy Endorsement Detection

arXiv:2603.04921v1 Announce Type: new Abstract: This paper presents a novel agentic LLM pipeline for SemEval-2026 Task 10 that jointly extracts psycholinguistic conspiracy markers and detects conspiracy endorsement. Unlike traditional classifiers that conflate semantic reasoning with structural localization, our decoupled design...

1 min 1 month, 1 week ago

ead

LOW Academic International

When Weak LLMs Speak with Confidence, Preference Alignment Gets Stronger

arXiv:2603.04968v1 Announce Type: new Abstract: Preference alignment is an essential step in adapting large language models (LLMs) to human values, but existing approaches typically depend on costly human annotations or large-scale API-based models. We explore whether a weak LLM can...

1 min 1 month, 1 week ago

ead

LOW Academic International

MPCEval: A Benchmark for Multi-Party Conversation Generation

arXiv:2603.04969v1 Announce Type: new Abstract: Multi-party conversation generation, such as smart reply and collaborative assistants, is an increasingly important capability of generative AI, yet its evaluation remains a critical bottleneck. Compared to two-party dialogue, multi-party settings introduce distinct challenges, including...

1 min 1 month, 1 week ago

tps

LOW Academic International

Thin Keys, Full Values: Reducing KV Cache via Low-Dimensional Attention Selection

arXiv:2603.04427v1 Announce Type: new Abstract: Standard transformer attention uses identical dimensionality for queries, keys, and values ($d_q = d_k = d_v = \dmodel$). Our insight is that these components serve fundamentally different roles, and this symmetry is unnecessary. Queries and...

1 min 1 month, 1 week ago

ead

LOW Academic International

VSPrefill: Vertical-Slash Sparse Attention with Lightweight Indexing for Long-Context Prefilling

arXiv:2603.04460v1 Announce Type: new Abstract: The quadratic complexity of self-attention during the prefill phase impedes long-context inference in large language models. Existing sparse attention methods face a trade-off among context adaptivity, sampling overhead, and fine-tuning costs. We propose VSPrefill, a...

1 min 1 month, 1 week ago

ead

LOW Academic International

Understanding the Dynamics of Demonstration Conflict in In-Context Learning

arXiv:2603.04464v1 Announce Type: new Abstract: In-context learning enables large language models to perform novel tasks through few-shot demonstrations. However, demonstrations per se can naturally contain noise and conflicting examples, making this capability vulnerable. To understand how models process such conflicts,...

1 min 1 month, 1 week ago

ead

LOW Academic International

PDE foundation model-accelerated inverse estimation of system parameters in inertial confinement fusion

arXiv:2603.04606v1 Announce Type: new Abstract: PDE foundation models are typically pretrained on large, diverse corpora of PDE datasets and can be adapted to new settings with limited task-specific data. However, most downstream evaluations focus on forward problems, such as autoregressive...

1 min 1 month, 1 week ago

ead

LOW Academic International

When Priors Backfire: On the Vulnerability of Unlearnable Examples to Pretraining

arXiv:2603.04731v1 Announce Type: new Abstract: Unlearnable Examples (UEs) serve as a data protection strategy that generates imperceptible perturbations to mislead models into learning spurious correlations instead of underlying semantics. In this paper, we uncover a fundamental vulnerability of UEs that...

1 min 1 month, 1 week ago

ead

LOW Academic International

KindSleep: Knowledge-Informed Diagnosis of Obstructive Sleep Apnea from Oximetry

arXiv:2603.04755v1 Announce Type: new Abstract: Obstructive sleep apnea (OSA) is a sleep disorder that affects nearly one billion people globally and significantly elevates cardiovascular risk. Traditional diagnosis through polysomnography is resource-intensive and limits widespread access, creating a critical need for...

1 min 1 month, 1 week ago

ead

LOW Academic International

Distributional Equivalence in Linear Non-Gaussian Latent-Variable Cyclic Causal Models: Characterization and Learning

arXiv:2603.04780v1 Announce Type: new Abstract: Causal discovery with latent variables is a fundamental task. Yet most existing methods rely on strong structural assumptions, such as enforcing specific indicator patterns for latents or restricting how they can interact with others. We...

1 min 1 month, 1 week ago

tps

LOW Academic International

Missingness Bias Calibration in Feature Attribution Explanations

arXiv:2603.04831v1 Announce Type: new Abstract: Popular explanation methods often produce unreliable feature importance scores due to missingness bias, a systematic distortion that arises when models are probed with ablated, out-of-distribution inputs. Existing solutions treat this as a deep representational flaw...

1 min 1 month, 1 week ago

ead

LOW Academic International

Why Is RLHF Alignment Shallow? A Gradient Analysis

arXiv:2603.04851v1 Announce Type: new Abstract: Why is safety alignment in LLMs shallow? We prove that gradient-based alignment inherently concentrates on positions where harm is decided and vanishes beyond. Using a martingale decomposition of sequence-level harm, we derive an exact characterization...

1 min 1 month, 1 week ago

ead

Self-Attribution Bias: When AI Monitors Go Easy on Themselves

Interactive Benchmarks

Breaking Contextual Inertia: Reinforcement Learning with Single-Turn Anchors for Stable Multi-Turn Interaction

Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling

VISA: Value Injection via Shielded Adaptation for Personalized LLM Alignment

K-Gen: A Multimodal Language-Conditioned Approach for Interpretable Keypoint-Guided Trajectory Generation

SEA-TS: Self-Evolving Agent for Autonomous Code Generation of Time Series Forecasting Algorithms

Bounded State in an Infinite Horizon: Proactive Hierarchical Memory for Ad-Hoc Recall over Streaming Dialogues

The Thinking Boundary: Quantifying Reasoning Suitability of Multimodal Tasks via Dual Tuning

Same Input, Different Scores: A Multi Model Study on the Inconsistency of LLM Judge

A unified foundational framework for knowledge injection and evaluation of Large Language Models in Combustion Science

Induced Numerical Instability: Hidden Costs in Multimodal Large Language Models

Query Disambiguation via Answer-Free Context: Doubling Performance on Humanity's Last Exam

From Static Inference to Dynamic Interaction: Navigating the Landscape of Streaming Large Language Models

Non-Zipfian Distribution of Stopwords and Subset Selection Models

IF-RewardBench: Benchmarking Judge Models for Instruction-Following Evaluation

Beyond the Context Window: A Cost-Performance Analysis of Fact-Based Memory vs. Long-Context LLMs for Persistent Agents

SinhaLegal: A Benchmark Corpus for Information Extraction and Analysis in Sinhala Legislative Texts

AILS-NTUA at SemEval-2026 Task 10: Agentic LLMs for Psycholinguistic Marker Extraction and Conspiracy Endorsement Detection

When Weak LLMs Speak with Confidence, Preference Alignment Gets Stronger

MPCEval: A Benchmark for Multi-Party Conversation Generation

Thin Keys, Full Values: Reducing KV Cache via Low-Dimensional Attention Selection

VSPrefill: Vertical-Slash Sparse Attention with Lightweight Indexing for Long-Context Prefilling

Understanding the Dynamics of Demonstration Conflict in In-Context Learning

PDE foundation model-accelerated inverse estimation of system parameters in inertial confinement fusion

When Priors Backfire: On the Vulnerability of Unlearnable Examples to Pretraining

KindSleep: Knowledge-Informed Diagnosis of Obstructive Sleep Apnea from Oximetry

Distributional Equivalence in Linear Non-Gaussian Latent-Variable Cyclic Causal Models: Characterization and Learning

Missingness Bias Calibration in Feature Attribution Explanations

Why Is RLHF Alignment Shallow? A Gradient Analysis

Impact Distribution

Related Practice Areas

JCG, PC

HSOLLC Co., Ltd.