Apparent Age Estimation: Challenges and Outcomes
arXiv:2604.03335v1 Announce Type: new Abstract: Apparent age estimation is a valuable tool for business personalization, yet current models frequently exhibit demographic biases. We review prior works on the DEX method by applying distribution learning techniques such as Mean-Variance Loss (MVL)...
Automated Conjecture Resolution with Formal Verification
arXiv:2604.03789v1 Announce Type: new Abstract: Recent advances in large language models have significantly improved their ability to perform mathematical reasoning, extending from elementary problem solving to increasingly capable performance on research-level problems. However, reliably solving and verifying such problems remains...
SoLA: Leveraging Soft Activation Sparsity and Low-Rank Decomposition for Large Language Model Compression
arXiv:2604.03258v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated impressive capabilities across various tasks, but the billion-scale parameters pose deployment challenges. Although existing methods attempt to reduce the scale of LLMs, they require either special hardware support or...
Researchers waste 80% of LLM annotation costs by classifying one text at a time
arXiv:2604.03684v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly being used for text classification across the social sciences, yet researchers overwhelmingly classify one text per variable per prompt. Coding 100,000 texts on four variables requires 400,000 API calls....
Structured Multi-Criteria Evaluation of Large Language Models with Fuzzy Analytic Hierarchy Process and DualJudge
arXiv:2604.03742v1 Announce Type: new Abstract: Effective evaluation of large language models (LLMs) remains a critical bottleneck, as conventional direct scoring often yields inconsistent and opaque judgments. In this work, we adapt the Analytic Hierarchy Process (AHP) to LLM-based evaluation and,...
CoALFake: Collaborative Active Learning with Human-LLM Co-Annotation for Cross-Domain Fake News Detection
arXiv:2604.04174v1 Announce Type: new Abstract: The proliferation of fake news across diverse domains highlights critical limitations in current detection systems, which often exhibit narrow domain specificity and poor generalization. Existing cross-domain approaches face two key challenges: (1) reliance on labelled...
Why Attend to Everything? Focus is the Key
arXiv:2604.03260v1 Announce Type: new Abstract: We introduce Focus, a method that learns which token pairs matter rather than approximating all of them. Learnable centroids assign tokens to groups; distant attention is restricted to same-group pairs while local attention operates at...
Profile-Then-Reason: Bounded Semantic Complexity for Tool-Augmented Language Agents
arXiv:2604.04131v1 Announce Type: new Abstract: Large language model agents that use external tools are often implemented through reactive execution, in which reasoning is repeatedly recomputed after each observation, increasing latency and sensitivity to error propagation. This work introduces Profile--Then--Reason (PTR),...
Solar-VLM: Multimodal Vision-Language Models for Augmented Solar Power Forecasting
arXiv:2604.04145v1 Announce Type: new Abstract: Photovoltaic (PV) power forecasting plays a critical role in power system dispatch and market participation. Because PV generation is highly sensitive to weather conditions and cloud motion, accurate forecasting requires effective modeling of complex spatiotemporal...
Provable Multi-Task Reinforcement Learning: A Representation Learning Framework with Low Rank Rewards
arXiv:2604.03891v1 Announce Type: new Abstract: Multi-task representation learning (MTRL) is an approach that learns shared latent representations across related tasks, facilitating collaborative learning that improves the overall learning efficiency. This paper studies MTRL for multi-task reinforcement learning (RL), where multiple...
Diagonal-Tiled Mixed-Precision Attention for Efficient Low-Bit MXFP Inference
arXiv:2604.03950v1 Announce Type: new Abstract: Transformer-based large language models (LLMs) have demonstrated remarkable performance across a wide range of real-world tasks, but their inference cost remains prohibitively high due to the quadratic complexity of attention and the memory bandwidth limitations...
Shorter, but Still Trustworthy? An Empirical Study of Chain-of-Thought Compression
arXiv:2604.04120v1 Announce Type: new Abstract: Long chain-of-thought (Long-CoT) reasoning models have motivated a growing body of work on compressing reasoning traces to reduce inference cost, yet existing evaluations focus almost exclusively on task accuracy and token savings. Trustworthiness properties, whether...
Automated Attention Pattern Discovery at Scale in Large Language Models
arXiv:2604.03764v1 Announce Type: new Abstract: Large language models have found success by scaling up capabilities to work in general settings. The same can unfortunately not be said for interpretability methods. The current trend in mechanistic interpretability is to provide precise...
TABQAWORLD: Optimizing Multimodal Reasoning for Multi-Turn Table Question Answering
arXiv:2604.03393v1 Announce Type: new Abstract: Multimodal reasoning has emerged as a powerful framework for enhancing reasoning capabilities of reasoning models. While multi-turn table reasoning methods have improved reasoning accuracy through tool use and reward modeling, they rely on fixed text...
Comparative reversal learning reveals rigid adaptation in LLMs under non-stationary uncertainty
arXiv:2604.04182v1 Announce Type: new Abstract: Non-stationary environments require agents to revise previously learned action values when contingencies change. We treat large language models (LLMs) as sequential decision policies in a two-option probabilistic reversal-learning task with three latent states and switch...
Adaptive Threshold-Driven Continuous Greedy Method for Scalable Submodular Optimization
arXiv:2604.03419v1 Announce Type: new Abstract: Submodular maximization under matroid constraints is a fundamental problem in combinatorial optimization with applications in sensing, data summarization, active learning, and resource allocation. While the Sequential Greedy (SG) algorithm achieves only a $\frac{1}{2}$-approximation due to...
MultiPress: A Multi-Agent Framework for Interpretable Multimodal News Classification
arXiv:2604.03586v1 Announce Type: new Abstract: With the growing prevalence of multimodal news content, effective news topic classification demands models capable of jointly understanding and reasoning over heterogeneous data such as text and images. Existing methods often process modalities independently or...
Where to Steer: Input-Dependent Layer Selection for Steering Improves LLM Alignment
arXiv:2604.03867v1 Announce Type: new Abstract: Steering vectors have emerged as a lightweight and effective approach for aligning large language models (LLMs) at inference time, enabling modulation over model behaviors by shifting LLM representations towards a target behavior. However, existing methods...
Multirate Stein Variational Gradient Descent for Efficient Bayesian Sampling
arXiv:2604.03981v1 Announce Type: new Abstract: Many particle-based Bayesian inference methods use a single global step size for all parts of the update. In Stein variational gradient descent (SVGD), however, each update combines two qualitatively different effects: attraction toward high-posterior regions...
Embedding Enhancement via Fine-Tuned Language Models for Learner-Item Cognitive Modeling
arXiv:2604.04088v1 Announce Type: new Abstract: Learner-item cognitive modeling plays a central role in the web-based online intelligent education system by enabling cognitive diagnosis (CD) across diverse online educational scenarios. Although ID embedding remains the mainstream approach in cognitive modeling due...
AdaptFuse: Training-Free Sequential Preference Learning via Externalized Bayesian Inference
arXiv:2604.03925v1 Announce Type: new Abstract: Large language models struggle to accumulate evidence across multiple rounds of user interaction, failing to update their beliefs in a manner consistent with Bayesian inference. Existing solutions require fine-tuning on sensitive user interaction data, limiting...
RUQuant: Towards Refining Uniform Quantization for Large Language Models
arXiv:2604.04013v1 Announce Type: new Abstract: The increasing size and complexity of large language models (LLMs) have raised significant challenges in deployment efficiency, particularly under resource constraints. Post-training quantization (PTQ) has emerged as a practical solution by compressing models without requiring...
Extracting and Steering Emotion Representations in Small Language Models: A Methodological Comparison
arXiv:2604.04064v1 Announce Type: new Abstract: Small language models (SLMs) in the 100M-10B parameter range increasingly power production systems, yet whether they possess the internal emotion representations recently discovered in frontier models remains unknown. We present the first comparative analysis of...
AdaHOP: Fast and Accurate Low-Precision Training via Outlier-Pattern-Aware Rotation
arXiv:2604.02525v1 Announce Type: new Abstract: Low-precision training (LPT) commonly employs Hadamard transforms to suppress outliers and mitigate quantization error in large language models (LLMs). However, prior methods apply a fixed transform uniformly, despite substantial variation in outlier structures across tensors....
DIGITAL DIPLOMACY AND ARTIFICIAL INTELLIGENCE: REGULATION ASPECTS IN INTERNATIONAL LAW
The article examines the legal aspects of regulating artificial intelligence in the context of digital diplomacy. The author examines the process of transformation of traditional diplomatic institutions under the influence of digitalization and the introduction of artificial intelligence technologies, analyzes...
Analytic Drift Resister for Non-Exemplar Continual Graph Learning
arXiv:2604.02633v1 Announce Type: new Abstract: Non-Exemplar Continual Graph Learning (NECGL) seeks to eliminate the privacy risks intrinsic to rehearsal-based paradigms by retaining solely class-level prototype representations rather than raw graph examples for mitigating catastrophic forgetting. However, this design choice inevitably...
Failing to Falsify: Evaluating and Mitigating Confirmation Bias in Language Models
arXiv:2604.02485v1 Announce Type: new Abstract: Confirmation bias, the tendency to seek evidence that supports rather than challenges one's belief, hinders one's reasoning ability. We examine whether large language models (LLMs) exhibit confirmation bias by adapting the rule-discovery study from human...
Chart-RL: Policy Optimization Reinforcement Learning for Enhanced Visual Reasoning in Chart Question Answering with Vision Language Models
arXiv:2604.03157v1 Announce Type: new Abstract: The recent advancements in Vision Language Models (VLMs) have demonstrated progress toward true intelligence requiring robust reasoning capabilities. Beyond pattern recognition, linguistic reasoning must integrate with visual comprehension, particularly for Chart Question Answering (CQA) tasks...
Automatic Textbook Formalization
arXiv:2604.03071v1 Announce Type: new Abstract: We present a case study where an automatic AI system formalizes a textbook with more than 500 pages of graduate-level algebraic combinatorics to Lean. The resulting formalization represents a new milestone in textbook formalization scale...
Haiku to Opus in Just 10 bits: LLMs Unlock Massive Compression Gains
arXiv:2604.02343v1 Announce Type: cross Abstract: We study the compression of LLM-generated text across lossless and lossy regimes, characterizing a compression-compute frontier where more compression is possible at the cost of more compute. For lossless compression, domain-adapted LoRA adapters can improve...