International Law

LOW Academic International

Fast and Faithful: Real-Time Verification for Long-Document Retrieval-Augmented Generation Systems

arXiv:2603.23508v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) is increasingly deployed in enterprise search and document-centric assistants, where responses must be grounded in long and complex source materials. In practice, verifying that generated answers faithfully reflect retrieved documents is difficult:...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Leveraging Computerized Adaptive Testing for Cost-effective Evaluation of Large Language Models in Medical Benchmarking

arXiv:2603.23506v1 Announce Type: new Abstract: The rapid proliferation of large language models (LLMs) in healthcare creates an urgent need for scalable and psychometrically sound evaluation methods. Conventional static benchmarks are costly to administer repeatedly, vulnerable to data contamination, and lack...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Chitrakshara: A Large Multilingual Multimodal Dataset for Indian languages

arXiv:2603.23521v1 Announce Type: new Abstract: Multimodal research has predominantly focused on single-image reasoning, with limited exploration of multi-image scenarios. Recent models have sought to enhance multi-image understanding through large-scale pretraining on interleaved image-text datasets. However, most Vision-Language Models (VLMs) are...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

MedMT-Bench: Can LLMs Memorize and Understand Long Multi-Turn Conversations in Medical Scenarios?

arXiv:2603.23519v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities across various specialist domains and have been integrated into high-stakes areas such as medicine. However, as existing medical-related benchmarks rarely stress-test the long-context memory, interference robustness, and...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens

arXiv:2603.23516v1 Announce Type: new Abstract: Long-term memory is a cornerstone of human intelligence. Enabling AI to process lifetime-scale information remains a long-standing pursuit in the field. Due to the constraints of full-attention architectures, the effective context length of large language...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

DepthCharge: A Domain-Agnostic Framework for Measuring Depth-Dependent Knowledge in Large Language Models

arXiv:2603.23514v1 Announce Type: new Abstract: Large Language Models appear competent when answering general questions but often fail when pushed into domain-specific details. No existing methodology provides an out-of-the-box solution for measuring how deeply LLMs can sustain accurate responses under adaptive...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Internal Safety Collapse in Frontier Large Language Models

arXiv:2603.23509v1 Announce Type: new Abstract: This work identifies a critical failure mode in frontier large language models (LLMs), which we term Internal Safety Collapse (ISC): under certain task conditions, models enter a state in which they continuously generate harmful content...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Probing Ethical Framework Representations in Large Language Models: Structure, Entanglement, and Methodological Challenges

arXiv:2603.23659v1 Announce Type: new Abstract: When large language models make ethical judgments, do their internal representations distinguish between normative frameworks, or collapse ethics into a single acceptability dimension? We probe hidden representations across five ethical frameworks (deontology, utilitarianism, virtue, justice,...

1 min 3 weeks, 2 days ago

itar

LOW Academic International

The Diminishing Returns of Early-Exit Decoding in Modern LLMs

arXiv:2603.23701v1 Announce Type: new Abstract: In Large Language Model (LLM) inference, early-exit refers to stopping computation at an intermediate layer once the prediction is sufficiently confident, thereby reducing latency and cost. However, recent LLMs adopt improved pretraining recipes and architectures...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Language Model Planners do not Scale, but do Formalizers?

arXiv:2603.23844v1 Announce Type: new Abstract: Recent work shows overwhelming evidence that LLMs, even those trained to scale their reasoning trace, perform unsatisfactorily when solving planning problems too complex. Whether the same conclusion holds for LLM formalizers that generate solver-oriented programs...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

BeliefShift: Benchmarking Temporal Belief Consistency and Opinion Drift in LLM Agents

arXiv:2603.23848v1 Announce Type: new Abstract: LLMs are increasingly used as long-running conversational agents, yet every major benchmark evaluating their memory treats user information as static facts to be stored and retrieved. That's the wrong model. People change their minds, and...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

OmniACBench: A Benchmark for Evaluating Context-Grounded Acoustic Control in Omni-Modal Models

arXiv:2603.23938v1 Announce Type: new Abstract: Most testbeds for omni-modal models assess multimodal understanding via textual outputs, leaving it unclear whether these models can properly speak their answers. To study this, we introduce OmniACBench, a benchmark for evaluating context-grounded acoustic control...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Argument Mining as a Text-to-Text Generation Task

arXiv:2603.23949v1 Announce Type: new Abstract: Argument Mining(AM) aims to uncover the argumentative structures within a text. Previous methods require several subtasks, such as span identification, component classification, and relation classification. Consequently, these methods need rule-based postprocessing to derive argumentative structures...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Implicit Turn-Wise Policy Optimization for Proactive User-LLM Interaction

arXiv:2603.23550v1 Announce Type: new Abstract: Multi-turn human-AI collaboration is fundamental to deploying interactive services such as adaptive tutoring, conversational recommendation, and professional consultation. However, optimizing these interactions via reinforcement learning is hindered by the sparsity of verifiable intermediate rewards and...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Upper Entropy for 2-Monotone Lower Probabilities

arXiv:2603.23558v1 Announce Type: new Abstract: Uncertainty quantification is a key aspect in many tasks such as model selection/regularization, or quantifying prediction uncertainties to perform active learning or OOD detection. Within credal approaches that consider modeling uncertainty as probability sets, upper...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Synthetic Mixed Training: Scaling Parametric Knowledge Acquisition Beyond RAG

arXiv:2603.23562v1 Announce Type: new Abstract: Synthetic data augmentation helps language models learn new knowledge in data-constrained domains. However, naively scaling existing synthetic data methods by training on more synthetic tokens or using stronger generators yields diminishing returns below the performance...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Safe Reinforcement Learning with Preference-based Constraint Inference

arXiv:2603.23565v1 Announce Type: new Abstract: Safe reinforcement learning (RL) is a standard paradigm for safety-critical decision making. However, real-world safety constraints can be complex, subjective, and even hard to explicitly specify. Existing works on constraint inference rely on restrictive assumptions...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

PoiCGAN: A Targeted Poisoning Based on Feature-Label Joint Perturbation in Federated Learning

arXiv:2603.23574v1 Announce Type: new Abstract: Federated Learning (FL), as a popular distributed learning paradigm, has shown outstanding performance in improving computational efficiency and protecting data privacy, and is widely applied in industrial image classification. However, due to its distributed nature,...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

The Geometric Price of Discrete Logic: Context-driven Manifold Dynamics of Number Representations

arXiv:2603.23577v1 Announce Type: new Abstract: Large language models (LLMs) generalize smoothly across continuous semantic spaces, yet strict logical reasoning demands the formation of discrete decision boundaries. Prevailing theories relying on linear isometric projections fail to resolve this fundamental tension. In...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

MetaKube: An Experience-Aware LLM Framework for Kubernetes Failure Diagnosis

arXiv:2603.23580v1 Announce Type: new Abstract: Existing LLM-based Kubernetes diagnostic systems cannot learn from operational experience, operating on static knowledge bases without improving from past resolutions. We present MetaKube, an experience-aware LLM framework through three synergistic innovations: (1) an Episodic Pattern...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

AI Generalisation Gap In Comorbid Sleep Disorder Staging

arXiv:2603.23582v1 Announce Type: new Abstract: Accurate sleep staging is essential for diagnosing OSA and hypopnea in stroke patients. Although PSG is reliable, it is costly, labor-intensive, and manually scored. While deep learning enables automated EEG-based sleep staging in healthy subjects,...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

CDMT-EHR: A Continuous-Time Diffusion Framework for Generating Mixed-Type Time-Series Electronic Health Records

arXiv:2603.23719v1 Announce Type: new Abstract: Electronic health records (EHRs) are invaluable for clinical research, yet privacy concerns severely restrict data sharing. Synthetic data generation offers a promising solution, but EHRs present unique challenges: they contain both numerical and categorical features...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

BXRL: Behavior-Explainable Reinforcement Learning

arXiv:2603.23738v1 Announce Type: new Abstract: A major challenge of Reinforcement Learning is that agents often learn undesired behaviors that seem to defy the reward structure they were given. Explainable Reinforcement Learning (XRL) methods can answer queries such as "explain this...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Self Paced Gaussian Contextual Reinforcement Learning

arXiv:2603.23755v1 Announce Type: new Abstract: Curriculum learning improves reinforcement learning (RL) efficiency by sequencing tasks from simple to complex. However, many self-paced curriculum methods rely on computationally expensive inner-loop optimizations, limiting their scalability in high-dimensional context spaces. In this paper,...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Manifold Generalization Provably Proceeds Memorization in Diffusion Models

arXiv:2603.23792v1 Announce Type: new Abstract: Diffusion models often generate novel samples even when the learned score is only \emph{coarse} -- a phenomenon not accounted for by the standard view of diffusion training as density estimation. In this paper, we show...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

HDPO: Hybrid Distillation Policy Optimization via Privileged Self-Distillation

arXiv:2603.23871v1 Announce Type: new Abstract: Large language models trained with reinforcement learning (RL) for mathematical reasoning face a fundamental challenge: on problems the model cannot solve at all - "cliff" prompts - the RL gradient vanishes entirely, preventing any learning...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Optimal Variance-Dependent Regret Bounds for Infinite-Horizon MDPs

arXiv:2603.23926v1 Announce Type: new Abstract: Online reinforcement learning in infinite-horizon Markov decision processes (MDPs) remains less theoretically and algorithmically developed than its episodic counterpart, with many algorithms suffering from high ``burn-in'' costs and failing to adapt to benign instance-specific complexity....

1 min 3 weeks, 2 days ago

ear

LOW Academic International

GRMLR: Knowledge-Enhanced Small-Data Learning for Deep-Sea Cold Seep Stage Inference

arXiv:2603.23961v1 Announce Type: new Abstract: Deep-sea cold seep stage assessment has traditionally relied on costly, high-risk manned submersible operations and visual surveys of macrofauna. Although microbial communities provide a promising and more cost-effective alternative, reliable inference remains challenging because the...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Diet Your LLM: Dimension-wise Global Pruning of LLMs via Merging Task-specific Importance Score

arXiv:2603.23985v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated remarkable capabilities, but their massive scale poses significant challenges for practical deployment. Structured pruning offers a promising solution by removing entire dimensions or layers, yet existing methods face critical...

1 min 3 weeks, 2 days ago

ear

LOW Academic International

Can we generate portable representations for clinical time series data using LLMs?

arXiv:2603.23987v1 Announce Type: new Abstract: Deploying clinical ML is slow and brittle: models that work at one hospital often degrade under distribution shifts at the next. In this work, we study a simple question -- can large language models (LLMs)...

1 min 3 weeks, 2 days ago

ear

Fast and Faithful: Real-Time Verification for Long-Document Retrieval-Augmented Generation Systems

Leveraging Computerized Adaptive Testing for Cost-effective Evaluation of Large Language Models in Medical Benchmarking

Chitrakshara: A Large Multilingual Multimodal Dataset for Indian languages

MedMT-Bench: Can LLMs Memorize and Understand Long Multi-Turn Conversations in Medical Scenarios?

MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens

DepthCharge: A Domain-Agnostic Framework for Measuring Depth-Dependent Knowledge in Large Language Models

Internal Safety Collapse in Frontier Large Language Models

Probing Ethical Framework Representations in Large Language Models: Structure, Entanglement, and Methodological Challenges

The Diminishing Returns of Early-Exit Decoding in Modern LLMs

Language Model Planners do not Scale, but do Formalizers?

BeliefShift: Benchmarking Temporal Belief Consistency and Opinion Drift in LLM Agents

OmniACBench: A Benchmark for Evaluating Context-Grounded Acoustic Control in Omni-Modal Models

Argument Mining as a Text-to-Text Generation Task

Implicit Turn-Wise Policy Optimization for Proactive User-LLM Interaction

Upper Entropy for 2-Monotone Lower Probabilities

Synthetic Mixed Training: Scaling Parametric Knowledge Acquisition Beyond RAG

Safe Reinforcement Learning with Preference-based Constraint Inference

PoiCGAN: A Targeted Poisoning Based on Feature-Label Joint Perturbation in Federated Learning

The Geometric Price of Discrete Logic: Context-driven Manifold Dynamics of Number Representations

MetaKube: An Experience-Aware LLM Framework for Kubernetes Failure Diagnosis

AI Generalisation Gap In Comorbid Sleep Disorder Staging

CDMT-EHR: A Continuous-Time Diffusion Framework for Generating Mixed-Type Time-Series Electronic Health Records

BXRL: Behavior-Explainable Reinforcement Learning

Self Paced Gaussian Contextual Reinforcement Learning

Manifold Generalization Provably Proceeds Memorization in Diffusion Models

HDPO: Hybrid Distillation Policy Optimization via Privileged Self-Distillation

Optimal Variance-Dependent Regret Bounds for Infinite-Horizon MDPs

GRMLR: Knowledge-Enhanced Small-Data Learning for Deep-Sea Cold Seep Stage Inference

Diet Your LLM: Dimension-wise Global Pruning of LLMs via Merging Task-specific Importance Score

Can we generate portable representations for clinical time series data using LLMs?

Impact Distribution

Related Practice Areas

JCG, PC

HSOLLC Co., Ltd.