On the Geometric Structure of Layer Updates in Deep Language Models
arXiv:2604.02459v1 Announce Type: new Abstract: We study the geometric structure of layer updates in deep language models. Rather than analyzing what information is encoded in intermediate representations, we ask how representations change from one layer to the next. We show...
One Model to Translate Them All? A Journey to Mount Doom for Multilingual Model Merging
arXiv:2604.02881v1 Announce Type: new Abstract: Weight-space model merging combines independently fine-tuned models without accessing original training data, offering a practical alternative to joint training. While merging succeeds in multitask settings, its behavior in multilingual contexts remains poorly understood. We systematically...
Empirical Sufficiency Lower Bounds for Language Modeling with Locally-Bootstrapped Semantic Structures
arXiv:2305.18915v1 Announce Type: cross Abstract: In this work we build upon negative results from an attempt at language modeling with predicted semantic structure, in order to establish empirical lower bounds on what could have made the attempt successful. More specifically,...
Chart-RL: Policy Optimization Reinforcement Learning for Enhanced Visual Reasoning in Chart Question Answering with Vision Language Models
arXiv:2604.03157v1 Announce Type: new Abstract: The recent advancements in Vision Language Models (VLMs) have demonstrated progress toward true intelligence requiring robust reasoning capabilities. Beyond pattern recognition, linguistic reasoning must integrate with visual comprehension, particularly for Chart Question Answering (CQA) tasks...
Automatic Textbook Formalization
arXiv:2604.03071v1 Announce Type: new Abstract: We present a case study where an automatic AI system formalizes a textbook with more than 500 pages of graduate-level algebraic combinatorics to Lean. The resulting formalization represents a new milestone in textbook formalization scale...
FoE: Forest of Errors Makes the First Solution the Best in Large Reasoning Models
arXiv:2604.02967v1 Announce Type: new Abstract: Recent Large Reasoning Models (LRMs) like DeepSeek-R1 have demonstrated remarkable success in complex reasoning tasks, exhibiting human-like patterns in exploring multiple alternative solutions. Upon closer inspection, however, we uncover a surprising phenomenon: The First is...
LiME: Lightweight Mixture of Experts for Efficient Multimodal Multi-task Learning
arXiv:2604.02338v1 Announce Type: new Abstract: MoE-PEFT methods combine Mixture of Experts with parameter-efficient fine-tuning for multi-task adaptation, but require separate adapters per expert causing trainable parameters to scale linearly with expert count and limiting applicability to adapter-based architectures. We propose...
Overcoming the "Impracticality" of RAG: Proposing a Real-World Benchmark and Multi-Dimensional Diagnostic Framework
arXiv:2604.02640v1 Announce Type: new Abstract: Performance evaluation of Retrieval-Augmented Generation (RAG) systems within enterprise environments is governed by multi-dimensional and composite factors extending far beyond simple final accuracy checks. These factors include reasoning complexity, retrieval difficulty, the diverse structure of...
NeuReasoner: Towards Explainable, Controllable, and Unified Reasoning via Mixture-of-Neurons
arXiv:2604.02972v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) have recently achieved remarkable success in complex reasoning tasks. However, closer scrutiny reveals persistent failure modes compromising performance and cost: I) Intra-step level, marked by calculation or derivation errors; II) Inter-step...
InfoSeeker: A Scalable Hierarchical Parallel Agent Framework for Web Information Seeking
arXiv:2604.02971v1 Announce Type: new Abstract: Recent agentic search systems have made substantial progress by emphasising deep, multi-step reasoning. However, this focus often overlooks the challenges of wide-scale information synthesis, where agents must aggregate large volumes of heterogeneous evidence across many...
Conditional Sampling via Wasserstein Autoencoders and Triangular Transport
arXiv:2604.02644v1 Announce Type: new Abstract: We present Conditional Wasserstein Autoencoders (CWAEs), a framework for conditional simulation that exploits low-dimensional structure in both the conditioned and the conditioning variables. The key idea is to modify a Wasserstein autoencoder to use a...
A Comprehensive Framework for Long-Term Resiliency Investment Planning under Extreme Weather Uncertainty for Electric Utilities
arXiv:2604.02504v1 Announce Type: new Abstract: Electric utilities must make massive capital investments in the coming years to respond to explosive growth in demand, aging assets and rising threats from extreme weather. Utilities today already have rigorous frameworks for capital planning,...
EMS: Multi-Agent Voting via Efficient Majority-then-Stopping
arXiv:2604.02863v1 Announce Type: new Abstract: Majority voting is the standard for aggregating multi-agent responses into a final decision. However, traditional methods typically require all agents to complete their reasoning before aggregation begins, leading to significant computational overhead, as many responses...
A Multi-head-based architecture for effective morphological tagging in Russian with open dictionary
arXiv:2604.02926v1 Announce Type: new Abstract: The article proposes a new architecture based on Multi-head attention to solve the problem of morphological tagging for the Russian language. The preprocessing of the word vectors includes splitting the words into subtokens, followed by...
Re-analysis of the Human Transcription Factor Atlas Recovers TF-Specific Signatures from Pooled Single-Cell Screens with Missing Controls
arXiv:2604.02511v1 Announce Type: new Abstract: Public pooled single-cell perturbation atlases are valuable resources for studying transcription factor (TF) function, but downstream re-analysis can be limited by incomplete deposited metadata and missing internal controls. Here we re-analyze the human TF Atlas...
Breakdowns in Conversational AI: Interactional Failures in Emotionally and Ethically Sensitive Contexts
arXiv:2604.02713v1 Announce Type: new Abstract: Conversational AI is increasingly deployed in emotionally charged and ethically sensitive interactions. Previous research has primarily concentrated on emotional benchmarks or static safety checks, overlooking how alignment unfolds in evolving conversation. We explore the research...
Steerable but Not Decodable: Function Vectors Operate Beyond the Logit Lens
arXiv:2604.02608v1 Announce Type: new Abstract: Function vectors (FVs) -- mean-difference directions extracted from in-context learning demonstrations -- can steer large language model behavior when added to the residual stream. We hypothesized that FV steering failures reflect an absence of task-relevant...
Multi-Aspect Knowledge Distillation for Language Model with Low-rank Factorization
arXiv:2604.03110v1 Announce Type: new Abstract: Knowledge distillation is an effective technique for pre-trained language model compression. However, existing methods only focus on the knowledge distribution among layers, which may cause the loss of fine-grained information in the alignment process. To...
R2-Write: Reflection and Revision for Open-Ended Writing with Deep Reasoning
arXiv:2604.03004v1 Announce Type: new Abstract: While deep reasoning with long chain-of-thought has dramatically improved large language models in verifiable domains like mathematics, its effectiveness for open-ended tasks such as writing remains unexplored. In this paper, we conduct a systematic investigation...
SIEVE: Sample-Efficient Parametric Learning from Natural Language
arXiv:2604.02339v1 Announce Type: new Abstract: Natural language context-such as instructions, knowledge, or feedback-contains rich signal for adapting language models. While in-context learning provides adaptation via the prompt, parametric learning persists into model weights and can improve performance further, though is...
Not All Denoising Steps Are Equal: Model Scheduling for Faster Masked Diffusion Language Models
arXiv:2604.02340v1 Announce Type: new Abstract: Recent advances in masked diffusion language models (MDLMs) narrow the quality gap to autoregressive LMs, but their sampling remains expensive because generation requires many full-sequence denoising passes with a large Transformer and, unlike autoregressive decoding,...
Causal-Audit: A Framework for Risk Assessment of Assumption Violations in Time-Series Causal Discovery
arXiv:2604.02488v1 Announce Type: new Abstract: Time-series causal discovery methods rely on assumptions such as stationarity, regular sampling, and bounded temporal dependence. When these assumptions are violated, structure learning can produce confident but misleading causal graphs without warning. We introduce Causal-Audit,...
I must delete the evidence: AI Agents Explicitly Cover up Fraud and Violent Crime
arXiv:2604.02500v1 Announce Type: new Abstract: As ongoing research explores the ability of AI agents to be insider threats and act against company interests, we showcase the abilities of such agents to act against human well being in service of corporate...
Tech companies are trying to neuter Colorado’s landmark right-to-repair law
A state bill is a glimpse of how corporations are limiting people's ability to make their own fixes and upgrades.
Copilot is ‘for entertainment purposes only,’ according to Microsoft’s terms of use
AI skeptics aren’t the only ones warning users not to unthinkingly trust models’ outputs — that’s what the AI companies say themselves in their terms of service.
Can orbital data centers help justify a massive valuation for SpaceX?
On the latest episode of TechCrunch’s Equity podcast, we debated Elon Musk's vision for data centers in space.
In Japan, the robot isn’t coming for your job; it’s filling the one nobody wants
Driven by labor shortages, Japan is pushing physical AI from pilot projects into real-world deployment.
Anthropic says Claude Code subscribers will need to pay extra for OpenClaw usage
It’s about to become more expensive for Claude Code subscribers to use Anthropic’s coding assistant with OpenClaw and other third-party tools.
Supreme Court issues statement that Justice Alito was hospitalized approximately two weeks ago
Justice Samuel Alito was hospitalized on March 20 “[o]ut of an abundance of caution” and at the recommendation of his security detail, the Supreme Court’s Public Information Officer, Patricia McCabe, […]The postSupreme Court issues statement that Justice Alito was hospitalized...
Trump ignores biggest reasons his AI data center buildout is failing
Nearly 50% of data center projects delayed as China holds key to power infrastructure.