Confidence Before Answering: A Paradigm Shift for Efficient LLM Uncertainty Estimation
arXiv:2603.05881v1 Announce Type: new Abstract: Reliable deployment of large language models (LLMs) requires accurate uncertainty estimation. Existing methods are predominantly answer-first, producing confidence only after generating an answer, which measures the correctness of a specific response and limits practical usability....
VerChol -- Grammar-First Tokenization for Agglutinative Languages
arXiv:2603.05883v1 Announce Type: new Abstract: Tokenization is the foundational step in all large language model (LLM) pipelines, yet the dominant approach, Byte Pair Encoding (BPE) and its variants, is inherently script-agnostic and optimized for English-like morphology. For agglutinative...
Who We Are, Where We Are: Mental Health at the Intersection of Person, Situation, and Large Language Models
arXiv:2603.05953v1 Announce Type: new Abstract: Mental health is not a fixed trait but a dynamic process shaped by the interplay between individual dispositions and situational contexts. Building on interactionist and constructionist psychological theories, we develop interpretable models to predict well-being...
MASFactory: A Graph-centric Framework for Orchestrating LLM-Based Multi-Agent Systems with Vibe Graphing
arXiv:2603.06007v1 Announce Type: new Abstract: Large language model-based (LLM-based) multi-agent systems (MAS) are increasingly used to extend agentic problem solving via role specialization and collaboration. MAS workflows can be naturally modeled as directed computation graphs, where nodes execute agents/sub-workflows and...
Experiences Build Characters: The Linguistic Origins and Functional Impact of LLM Personality
arXiv:2603.06088v1 Announce Type: new Abstract: Human problem-solving is enriched by a diversity of styles and personality traits, yet the development of Large Language Models (LLMs) has largely prioritized uniform performance benchmarks that favour specific behavioural tendencies such as assertiveness. To...
Diffusion Language Models Are Natively Length-Aware
arXiv:2603.06123v1 Announce Type: new Abstract: Unlike autoregressive language models, which terminate variable-length generation upon predicting an End-of-Sequence (EoS) token, Diffusion Language Models (DLMs) operate over a fixed maximum-length context window for a predetermined number of denoising steps. However, this process...
CRIMSON: A Clinically-Grounded LLM-Based Metric for Generative Radiology Report Evaluation
arXiv:2603.06183v1 Announce Type: new Abstract: We introduce CRIMSON, a clinically grounded evaluation framework for chest X-ray report generation that assesses reports based on diagnostic correctness, contextual relevance, and patient safety. Unlike prior metrics, CRIMSON incorporates full clinical context, including patient...
MAPO: Mixed Advantage Policy Optimization for Long-Horizon Multi-Turn Dialogue
arXiv:2603.06194v1 Announce Type: new Abstract: Subjective multi-turn dialogue tasks, such as emotional support, require conversational policies that adapt to evolving user states and optimize long-horizon interaction quality. However, reinforcement learning (RL) for such settings remains challenging due to the absence...
FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling
arXiv:2603.06199v1 Announce Type: new Abstract: Long-context modeling is a pivotal capability for Large Language Models, yet the quadratic complexity of attention remains a critical bottleneck, particularly during the compute-intensive prefilling phase. While various sparse attention mechanisms have been explored, they...
SPOT: Span-level Pause-of-Thought for Efficient and Interpretable Latent Reasoning in Large Language Models
arXiv:2603.06222v1 Announce Type: new Abstract: Explicit Chain-of-Thought improves the reasoning performance of large language models but often incurs high inference cost due to verbose token-level traces. While recent approaches reduce this overhead via concise prompting or step pruning, they largely...
PONTE: Personalized Orchestration for Natural Language Trustworthy Explanations
arXiv:2603.06485v1 Announce Type: new Abstract: Explainable Artificial Intelligence (XAI) seeks to enhance the transparency and accountability of machine learning systems, yet most methods follow a one-size-fits-all paradigm that neglects user differences in expertise, goals, and cognitive needs. Although Large Language...
Aligning the True Semantics: Constrained Decoupling and Distribution Sampling for Cross-Modal Alignment
arXiv:2603.05566v1 Announce Type: new Abstract: Cross-modal alignment is a crucial task in multimodal learning aimed at achieving semantic consistency between vision and language. This requires that image-text pairs exhibit similar semantics. Traditional algorithms pursue embedding consistency to achieve semantic consistency,...
FuseDiff: Symmetry-Preserving Joint Diffusion for Dual-Target Structure-Based Drug Design
arXiv:2603.05567v1 Announce Type: new Abstract: Dual-target structure-based drug design aims to generate a single ligand together with two pocket-specific binding poses, each compatible with a corresponding target pocket, enabling polypharmacological therapies with improved efficacy and reduced resistance. Existing approaches typically...
Bias In, Bias Out? Finding Unbiased Subnetworks in Vanilla Models
arXiv:2603.05582v1 Announce Type: new Abstract: The issue of algorithmic biases in deep learning has led to the development of various debiasing techniques, many of which perform complex training procedures or dataset manipulation. However, an intriguing question arises: is it possible...
Warm Starting State-Space Models with Automata Learning
arXiv:2603.05694v1 Announce Type: new Abstract: We prove that Moore machines can be exactly realized as state-space models (SSMs), establishing a formal correspondence between symbolic automata and these continuous machine learning architectures. These Moore-SSMs preserve both the complete symbolic structure and...
Unsupervised domain adaptation for radioisotope identification in gamma spectroscopy
arXiv:2603.05719v1 Announce Type: new Abstract: Training machine learning models for radioisotope identification using gamma spectroscopy remains an elusive challenge for many practical applications, largely stemming from the difficulty of acquiring and labeling large, diverse experimental datasets. Simulations can mitigate this...
MIRACL: A Diverse Meta-Reinforcement Learning for Multi-Objective Multi-Echelon Combinatorial Supply Chain Optimisation
arXiv:2603.05760v1 Announce Type: new Abstract: Multi-objective reinforcement learning (MORL) is effective for multi-echelon combinatorial supply chain optimisation, where tasks involve high dimensionality, uncertainty, and competing objectives. However, its deployment in dynamic environments is hindered by the need for task-specific retraining...
Self-Auditing Parameter-Efficient Fine-Tuning for Few-Shot 3D Medical Image Segmentation
arXiv:2603.05822v1 Announce Type: new Abstract: Adapting foundation models to new clinical sites remains challenging in practice. Domain shift and scarce annotations must be handled by experts, yet many clinical groups do not have ready access to skilled AI engineers to...
Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls
arXiv:2603.05829v1 Announce Type: new Abstract: Test-time adaptation enables large language models (LLMs) to modify their behavior at inference without updating model parameters. A common approach is many-shot prompting, where large numbers of in-context learning (ICL) examples are injected as an...
Preventing Learning Stagnation in PPO by Scaling to 1 Million Parallel Environments
arXiv:2603.06009v1 Announce Type: new Abstract: Plateaus, where an agent's performance stagnates at a suboptimal level, are a common problem in deep on-policy RL. Focusing on PPO due to its widespread adoption, we show that plateaus in certain regimes arise not...
Latent Diffusion-Based 3D Molecular Recovery from Vibrational Spectra
arXiv:2603.06113v1 Announce Type: new Abstract: Infrared (IR) spectroscopy, a type of vibrational spectroscopy, is widely used for molecular structure determination and provides critical structural information for chemists. However, existing approaches for recovering molecular structures from IR spectra typically rely on...
Ensemble Graph Neural Networks for Probabilistic Sea Surface Temperature Forecasting via Input Perturbations
arXiv:2603.06153v1 Announce Type: new Abstract: Accurate regional ocean forecasting requires models that are both computationally efficient and capable of representing predictive uncertainty. This work investigates ensemble learning strategies for sea surface temperature (SST) forecasting using Graph Neural Networks (GNNs), with...
FedSCS-XGB -- Federated Server-centric surrogate XGBoost for continual health monitoring
arXiv:2603.06224v1 Announce Type: new Abstract: Wearable sensors with local data processing can detect health threats early, enhance documentation, and support personalized therapy. In the context of spinal cord injury (SCI), which involves risks such as pressure injuries and blood pressure...
DC-Merge: Improving Model Merging with Directional Consistency
arXiv:2603.06242v1 Announce Type: new Abstract: Model merging aims to integrate multiple task-adapted models into a unified model that preserves the knowledge of each task. In this paper, we identify that the key to this knowledge retention lies in maintaining the...
An Adaptive Conceptualisation of Artificial Intelligence and the Law, Regulation and Ethics
The description of a combination of technologies as ‘artificial intelligence’ (AI) is misleading. To ascribe intelligence to a statistical model without human attribution points towards an attempt at shifting legal, social, and ethical responsibilities to machines. This paper exposes the...
Current Issue - Minnesota Law Review
Rethinking copyright exceptions in the era of generative AI: Balancing innovation and intellectual property protection
Abstract: Generative artificial intelligence (AI) systems, together with text and data mining (TDM), introduce complex challenges at the junction of data utilization and copyright laws. The inherent reliance of AI on large quantities of data, often encompassing copyrighted materials, results in...
Generative artificial intelligence empowers educational reform: current status, issues, and prospects
The emergence of ChatGPT has once again sparked a wave of information revolution in generative artificial intelligence. This article provides a detailed overview of the development and technical support of generative artificial intelligence. It conducts an in-depth analysis of...
Fall 2025 Book Symposium – Serena Mayeri’s Marital Privilege: Marriage, Inequality, and the Transformation of American Law | Law Review
Critical perspectives on AI in education: political economy, discrimination, commercialization, governance and ethics
AI in education is not only a challenging area of technical development and educational innovation, but increasingly the focus of critical analysis informed by the social sciences, philosophy and theory. This chapter provides an overview of critical perspectives on AI...