International Law

LOW Academic International

MedMT-Bench: Can LLMs Memorize and Understand Long Multi-Turn Conversations in Medical Scenarios?

arXiv:2603.23519v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated impressive capabilities across various specialist domains and have been integrated into high-stakes areas such as medicine. However, as existing medical-related benchmarks rarely stress-test the long-context memory, interference robustness, and...

1 min 3 weeks, 5 days ago

LOW Academic International

Internal Safety Collapse in Frontier Large Language Models

arXiv:2603.23509v1 Announce Type: new Abstract: This work identifies a critical failure mode in frontier large language models (LLMs), which we term Internal Safety Collapse (ISC): under certain task conditions, models enter a state in which they continuously generate harmful content...

1 min 3 weeks, 5 days ago

LOW Academic International

Navigating the Concept Space of Language Models

arXiv:2603.23524v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) trained on large language model activations output thousands of features that enable mapping to human-interpretable concepts. The current practice for analyzing these features primarily relies on inspecting top-activating examples, manually browsing individual...

1 min 3 weeks, 5 days ago

LOW Academic International

Fast and Faithful: Real-Time Verification for Long-Document Retrieval-Augmented Generation Systems

arXiv:2603.23508v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) is increasingly deployed in enterprise search and document-centric assistants, where responses must be grounded in long and complex source materials. In practice, verifying that generated answers faithfully reflect retrieved documents is difficult:...

1 min 3 weeks, 5 days ago

LOW Academic International

Leveraging Computerized Adaptive Testing for Cost-effective Evaluation of Large Language Models in Medical Benchmarking

arXiv:2603.23506v1 Announce Type: new Abstract: The rapid proliferation of large language models (LLMs) in healthcare creates an urgent need for scalable and psychometrically sound evaluation methods. Conventional static benchmarks are costly to administer repeatedly, vulnerable to data contamination, and lack...

1 min 3 weeks, 5 days ago

LOW Academic European Union

Revisiting Real-Time Digging-In Effects: No Evidence from NP/Z Garden-Paths

arXiv:2603.23624v1 Announce Type: new Abstract: Digging-in effects, where disambiguation difficulty increases with longer ambiguous regions, have been cited as evidence for self-organized sentence processing, in which structural commitments strengthen over time. In contrast, surprisal theory predicts no such effect unless...

1 min 3 weeks, 5 days ago

LOW Academic International

Probing Ethical Framework Representations in Large Language Models: Structure, Entanglement, and Methodological Challenges

arXiv:2603.23659v1 Announce Type: new Abstract: When large language models make ethical judgments, do their internal representations distinguish between normative frameworks, or collapse ethics into a single acceptability dimension? We probe hidden representations across five ethical frameworks (deontology, utilitarianism, virtue, justice,...

1 min 3 weeks, 5 days ago

LOW Academic International

The Diminishing Returns of Early-Exit Decoding in Modern LLMs

arXiv:2603.23701v1 Announce Type: new Abstract: In Large Language Model (LLM) inference, early-exit refers to stopping computation at an intermediate layer once the prediction is sufficiently confident, thereby reducing latency and cost. However, recent LLMs adopt improved pretraining recipes and architectures...

1 min 3 weeks, 5 days ago

LOW Academic United States

Infrequent Child-Directed Speech Is Bursty and May Draw Infant Vocalizations

arXiv:2603.23797v1 Announce Type: new Abstract: Children in many parts of the world hear relatively little speech directed to them, yet still reach major language development milestones. What differs about the speech input that infants learn from when directed input is...

1 min 3 weeks, 5 days ago

LOW Academic European Union

Perturbation: A simple and efficient adversarial tracer for representation learning in language models

arXiv:2603.23821v1 Announce Type: new Abstract: Linguistic representation learning in deep neural language models (LMs) has been studied for decades, for both practical and theoretical reasons. However, finding representations in LMs remains an unsolved problem, in part due to a dilemma...

1 min 3 weeks, 5 days ago

LOW Academic International

Language Model Planners do not Scale, but do Formalizers?

arXiv:2603.23844v1 Announce Type: new Abstract: Recent work shows overwhelming evidence that LLMs, even those trained to scale their reasoning trace, perform unsatisfactorily when solving planning problems too complex. Whether the same conclusion holds for LLM formalizers that generate solver-oriented programs...

1 min 3 weeks, 5 days ago

LOW Academic International

BeliefShift: Benchmarking Temporal Belief Consistency and Opinion Drift in LLM Agents

arXiv:2603.23848v1 Announce Type: new Abstract: LLMs are increasingly used as long-running conversational agents, yet every major benchmark evaluating their memory treats user information as static facts to be stored and retrieved. That's the wrong model. People change their minds, and...

1 min 3 weeks, 5 days ago

LOW Academic International

OmniACBench: A Benchmark for Evaluating Context-Grounded Acoustic Control in Omni-Modal Models

arXiv:2603.23938v1 Announce Type: new Abstract: Most testbeds for omni-modal models assess multimodal understanding via textual outputs, leaving it unclear whether these models can properly speak their answers. To study this, we introduce OmniACBench, a benchmark for evaluating context-grounded acoustic control...

1 min 3 weeks, 5 days ago

LOW Academic International

Argument Mining as a Text-to-Text Generation Task

arXiv:2603.23949v1 Announce Type: new Abstract: Argument Mining (AM) aims to uncover the argumentative structures within a text. Previous methods require several subtasks, such as span identification, component classification, and relation classification. Consequently, these methods need rule-based postprocessing to derive argumentative structures...

1 min 3 weeks, 5 days ago

LOW Academic European Union

From AI Assistant to AI Scientist: Autonomous Discovery of LLM-RL Algorithms with LLM Agents

arXiv:2603.23951v1 Announce Type: new Abstract: Discovering improved policy optimization algorithms for language models remains a costly manual process requiring repeated mechanism-level modification and validation. Unlike simple combinatorial code search, this problem requires searching over algorithmic mechanisms tightly coupled with training...

1 min 3 weeks, 5 days ago

LOW Academic European Union

Thinking with Tables: Enhancing Multi-Modal Tabular Understanding via Neuro-Symbolic Reasoning

arXiv:2603.24004v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) have demonstrated remarkable reasoning capabilities across modalities such as images and text. However, tabular data, despite being a critical real-world modality, remains relatively underexplored in multimodal learning. In this paper,...

1 min 3 weeks, 5 days ago

LOW Academic International

Implicit Turn-Wise Policy Optimization for Proactive User-LLM Interaction

arXiv:2603.23550v1 Announce Type: new Abstract: Multi-turn human-AI collaboration is fundamental to deploying interactive services such as adaptive tutoring, conversational recommendation, and professional consultation. However, optimizing these interactions via reinforcement learning is hindered by the sparsity of verifiable intermediate rewards and...

1 min 3 weeks, 5 days ago

LOW Academic International

Upper Entropy for 2-Monotone Lower Probabilities

arXiv:2603.23558v1 Announce Type: new Abstract: Uncertainty quantification is a key aspect in many tasks such as model selection/regularization, or quantifying prediction uncertainties to perform active learning or OOD detection. Within credal approaches that consider modeling uncertainty as probability sets, upper...

1 min 3 weeks, 5 days ago

LOW Academic International

Synthetic Mixed Training: Scaling Parametric Knowledge Acquisition Beyond RAG

arXiv:2603.23562v1 Announce Type: new Abstract: Synthetic data augmentation helps language models learn new knowledge in data-constrained domains. However, naively scaling existing synthetic data methods by training on more synthetic tokens or using stronger generators yields diminishing returns below the performance...

1 min 3 weeks, 5 days ago

LOW Academic International

Safe Reinforcement Learning with Preference-based Constraint Inference

arXiv:2603.23565v1 Announce Type: new Abstract: Safe reinforcement learning (RL) is a standard paradigm for safety-critical decision making. However, real-world safety constraints can be complex, subjective, and even hard to explicitly specify. Existing works on constraint inference rely on restrictive assumptions...

1 min 3 weeks, 5 days ago

LOW Academic European Union

AscendOptimizer: Episodic Agent for Ascend NPU Operator Optimization

arXiv:2603.23566v1 Announce Type: new Abstract: AscendC (Ascend C) operator optimization on Huawei Ascend neural processing units (NPUs) faces a two-fold knowledge bottleneck: unlike the CUDA ecosystem, there are few public reference implementations to learn from, and performance hinges on a...

1 min 3 weeks, 5 days ago

LOW Academic United States

StateLinFormer: Stateful Training Enhancing Long-term Memory in Navigation

arXiv:2603.23571v1 Announce Type: new Abstract: Effective navigation intelligence relies on long-term memory to support both immediate generalization and sustained adaptation. However, existing approaches face a dilemma: modular systems rely on explicit mapping but lack flexibility, while Transformer-based end-to-end models are...

1 min 3 weeks, 5 days ago

LOW Academic European Union

Dual-Criterion Curriculum Learning: Application to Temporal Data

arXiv:2603.23573v1 Announce Type: new Abstract: Curriculum Learning (CL) is a meta-learning paradigm that trains a model by feeding the data instances incrementally according to a schedule, which is based on difficulty progression. Defining meaningful difficulty assessment measures is crucial and...

1 min 3 weeks, 5 days ago

LOW Academic International

PoiCGAN: A Targeted Poisoning Based on Feature-Label Joint Perturbation in Federated Learning

arXiv:2603.23574v1 Announce Type: new Abstract: Federated Learning (FL), as a popular distributed learning paradigm, has shown outstanding performance in improving computational efficiency and protecting data privacy, and is widely applied in industrial image classification. However, due to its distributed nature,...

1 min 3 weeks, 5 days ago

LOW Academic International

The Geometric Price of Discrete Logic: Context-driven Manifold Dynamics of Number Representations

arXiv:2603.23577v1 Announce Type: new Abstract: Large language models (LLMs) generalize smoothly across continuous semantic spaces, yet strict logical reasoning demands the formation of discrete decision boundaries. Prevailing theories relying on linear isometric projections fail to resolve this fundamental tension. In...

1 min 3 weeks, 5 days ago

LOW Academic European Union

Residual Attention Physics-Informed Neural Networks for Robust Multiphysics Simulation of Steady-State Electrothermal Energy Systems

arXiv:2603.23578v1 Announce Type: new Abstract: Efficient thermal management and precise field prediction are critical for the design of advanced energy systems, including electrohydrodynamic transport, microfluidic energy harvesters, and electrically driven thermal regulators. However, the steady-state simulation of these electrothermal coupled...

1 min 3 weeks, 5 days ago

LOW Academic International

MetaKube: An Experience-Aware LLM Framework for Kubernetes Failure Diagnosis

arXiv:2603.23580v1 Announce Type: new Abstract: Existing LLM-based Kubernetes diagnostic systems cannot learn from operational experience, operating on static knowledge bases without improving from past resolutions. We present MetaKube, an experience-aware LLM framework through three synergistic innovations: (1) an Episodic Pattern...

1 min 3 weeks, 5 days ago

LOW Academic International

AI Generalisation Gap In Comorbid Sleep Disorder Staging

arXiv:2603.23582v1 Announce Type: new Abstract: Accurate sleep staging is essential for diagnosing OSA and hypopnea in stroke patients. Although PSG is reliable, it is costly, labor-intensive, and manually scored. While deep learning enables automated EEG-based sleep staging in healthy subjects,...

1 min 3 weeks, 5 days ago

LOW Academic European Union

LineMVGNN: Anti-Money Laundering with Line-Graph-Assisted Multi-View Graph Neural Networks

arXiv:2603.23584v1 Announce Type: new Abstract: Anti-money laundering (AML) systems are important for protecting the global economy. However, conventional rule-based methods rely on domain knowledge, leading to suboptimal accuracy and a lack of scalability. Graph neural networks (GNNs) for digraphs (directed...

1 min 3 weeks, 5 days ago

LOW Academic European Union

Steering Code LLMs with Activation Directions for Language and Library Control

arXiv:2603.23629v1 Announce Type: new Abstract: Code LLMs often default to particular programming languages and libraries under neutral prompts. We investigate whether these preferences are encoded as approximately linear directions in activation space that can be manipulated at inference time. Using...

1 min 3 weeks, 5 days ago
