Spain’s Xoople raises $130 million Series B to map the Earth for AI
The company is also announcing a deal with L3Harris to build the sensors for Xoople's spacecraft.
Embedding Enhancement via Fine-Tuned Language Models for Learner-Item Cognitive Modeling
arXiv:2604.04088v1 Announce Type: new Abstract: Learner-item cognitive modeling plays a central role in the web-based online intelligent education system by enabling cognitive diagnosis (CD) across diverse online educational scenarios. Although ID embedding remains the mainstream approach in cognitive modeling due...
Iran threatens ‘Stargate’ AI data centers
Iran said it will target U.S.-linked data centers with new missile strikes, as the war between the U.S. and Iran escalates.
Knowledge Packs: Zero-Token Knowledge Delivery via KV Cache Injection
arXiv:2604.03270v1 Announce Type: new Abstract: RAG wastes tokens. We propose Knowledge Packs: pre-computed KV caches that deliver the same knowledge at zero token cost. For causal transformers, the KV cache from a forward pass on text F is identical to...
Google quietly launched an AI dictation app that works offline
Google's new offline-first dictation app uses Gemma AI models to take on the apps like Wispr Flow.
Why Attend to Everything? Focus is the Key
arXiv:2604.03260v1 Announce Type: new Abstract: We introduce Focus, a method that learns which token pairs matter rather than approximating all of them. Learnable centroids assign tokens to groups; distant attention is restricted to same-group pairs while local attention operates at...
A Bayesian Information-Theoretic Approach to Data Attribution
arXiv:2604.03858v1 Announce Type: new Abstract: Training Data Attribution (TDA) seeks to trace model predictions back to influential training examples, enhancing interpretability and safety. We formulate TDA as a Bayesian information-theoretic problem: subsets are scored by the information loss they induce...
OpenAI alums have been quietly investing from a new, potentially $100M fund
Zero Shot, a new venture capital fund with deep ties to OpenAI, is aiming to raise $100 million for its first fund. It has already written some checks.
Entropy and Attention Dynamics in Small Language Models: A Trace-Level Structural Analysis on the TruthfulQA Benchmark
arXiv:2604.03589v1 Announce Type: new Abstract: Small language models (SLMs) have been increasingly deployed in edge devices and other resource-constrained settings. However, these models make confident mispredictions and produce unstable output, making them risky for factual and decision-critical tasks. Current evaluation...
The Tool Illusion: Rethinking Tool Use in Web Agents
arXiv:2604.03465v1 Announce Type: new Abstract: As web agents rapidly evolve, an increasing body of work has moved beyond conventional atomic browser interactions and explored tool use as a higher-level action paradigm. Although prior studies have shown the promise of tools,...
MetaSAEs: Joint Training with a Decomposability Penalty Produces More Atomic Sparse Autoencoder Latents
arXiv:2604.03436v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) are increasingly used for safety-relevant applications including alignment detection and model steering. These use cases require SAE latents to be as atomic as possible. Each latent should represent a single coherent concept...
Noise Steering for Controlled Text Generation: Improving Diversity and Reading-Level Fidelity in Arabic Educational Story Generation
arXiv:2604.03380v1 Announce Type: new Abstract: Generating diverse, pedagogically valid stories for Arabic early-grade reading assessments requires balancing tight constraints on vocabulary, reading level, and narrative structure against the need to avoid repetitive plots that undermine assessment validity. We investigate noise...
OpenAI’s vision for the AI economy: public wealth funds, robot taxes, and a four-day workweek
OpenAI proposes taxes on AI profits, public wealth funds, and expanded safety nets to address job loss and inequality, blending redistribution with capitalism as policymakers debate AI’s economic impact.
Extracting and Steering Emotion Representations in Small Language Models: A Methodological Comparison
arXiv:2604.04064v1 Announce Type: new Abstract: Small language models (SLMs) in the 100M-10B parameter range increasingly power production systems, yet whether they possess the internal emotion representations recently discovered in frontier models remains unknown. We present the first comparative analysis of...
Towards the AI Historian: Agentic Information Extraction from Primary Sources
arXiv:2604.03553v1 Announce Type: new Abstract: AI is supporting, accelerating, and automating scientific discovery across a diverse set of fields. However, AI adoption in historical research remains limited due to the lack of solutions designed for historians. In this technical progress...
Earth Embeddings Reveal Diverse Urban Signals from Space
arXiv:2604.03456v1 Announce Type: new Abstract: Conventional urban indicators derived from censuses, surveys, and administrative records are often costly, spatially inconsistent, and slow to update. Recent geospatial foundation models enable Earth embeddings, compact satellite image representations transferable across downstream tasks, but...
Neural Operators for Multi-Task Control and Adaptation
arXiv:2604.03449v1 Announce Type: new Abstract: Neural operator methods have emerged as powerful tools for learning mappings between infinite-dimensional function spaces, yet their potential in optimal control remains largely unexplored. We focus on multi-task control problems, whose solution is a mapping...
LPC-SM: Local Predictive Coding and Sparse Memory for Long-Context Language Modeling
arXiv:2604.03263v1 Announce Type: new Abstract: Most current long-context language models still rely on attention to handle both local interaction and long-range state, which leaves relatively little room to test alternative decompositions of sequence modeling. We propose LPC-SM, a hybrid autoregressive...
SODA: Semi On-Policy Black-Box Distillation for Large Language Models
arXiv:2604.03873v1 Announce Type: new Abstract: Black-box knowledge distillation for large language models presents a strict trade-off. Simple off-policy methods (e.g., sequence-level knowledge distillation) struggle to correct the student's inherent errors. Fully on-policy methods (e.g., Generative Adversarial Distillation) solve this via...
Automated Attention Pattern Discovery at Scale in Large Language Models
arXiv:2604.03764v1 Announce Type: new Abstract: Large language models have found success by scaling up capabilities to work in general settings. The same can unfortunately not be said for interpretability methods. The current trend in mechanistic interpretability is to provide precise...
Text Summarization With Graph Attention Networks
arXiv:2604.03583v1 Announce Type: new Abstract: This study aimed to leverage graph information, particularly Rhetorical Structure Theory (RST) and Co-reference (Coref) graphs, to enhance the performance of our baseline summarization models. Specifically, we experimented with a Graph Attention Network architecture to...
Simple yet Effective: Low-Rank Spatial Attention for Neural Operators
arXiv:2604.03582v1 Announce Type: new Abstract: Neural operators have emerged as data-driven surrogates for solving partial differential equations (PDEs), and their success hinges on efficiently modeling the long-range, global coupling among spatial points induced by the underlying physics. In many PDE...
GeoBrowse: A Geolocation Benchmark for Agentic Tool Use with Expert-Annotated Reasoning Traces
arXiv:2604.04017v1 Announce Type: new Abstract: Deep research agents integrate fragmented evidence through multi-step tool use. BrowseComp offers a text-only testbed for such agents, but existing multimodal benchmarks rarely require both weak visual cues composition and BrowseComp-style multi-hop verification. Geolocation is...
RL-Driven Sustainable Land-Use Allocation for the Lake Malawi Basin
arXiv:2604.03768v1 Announce Type: new Abstract: Unsustainable land-use practices in ecologically sensitive regions threaten biodiversity, water resources, and the livelihoods of millions. This paper presents a deep reinforcement learning (RL) framework for optimizing land-use allocation in the Lake Malawi Basin to...
Autoencoder-Based Parameter Estimation for Superposed Multi-Component Damped Sinusoidal Signals
arXiv:2604.03985v1 Announce Type: new Abstract: Damped sinusoidal oscillations are widely observed in many physical systems, and their analysis provides access to underlying physical properties. However, parameter estimation becomes difficult when the signal decays rapidly, multiple components are superposed, and observational...
InsTraj: Instructing Diffusion Models with Travel Intentions to Generate Real-world Trajectories
arXiv:2604.04106v1 Announce Type: new Abstract: The generation of realistic and controllable GPS trajectories is a fundamental task for applications in urban planning, mobility simulation, and privacy-preserving data sharing. However, existing methods face a two-fold challenge: they lack the deep semantic...
Profile-Then-Reason: Bounded Semantic Complexity for Tool-Augmented Language Agents
arXiv:2604.04131v1 Announce Type: new Abstract: Large language model agents that use external tools are often implemented through reactive execution, in which reasoning is repeatedly recomputed after each observation, increasing latency and sensitivity to error propagation. This work introduces Profile--Then--Reason (PTR),...
AI startup Rocket offers vibe McKinsey-style reports at a fraction of the cost
Rocket's new AI platform combines strategy, product building, and competitive intelligence, aiming to move beyond code generation.
Solar-VLM: Multimodal Vision-Language Models for Augmented Solar Power Forecasting
arXiv:2604.04145v1 Announce Type: new Abstract: Photovoltaic (PV) power forecasting plays a critical role in power system dispatch and market participation. Because PV generation is highly sensitive to weather conditions and cloud motion, accurate forecasting requires effective modeling of complex spatiotemporal...
Shorter, but Still Trustworthy? An Empirical Study of Chain-of-Thought Compression
arXiv:2604.04120v1 Announce Type: new Abstract: Long chain-of-thought (Long-CoT) reasoning models have motivated a growing body of work on compressing reasoning traces to reduce inference cost, yet existing evaluations focus almost exclusively on task accuracy and token savings. Trustworthiness properties, whether...