Gumloop lands $50M from Benchmark to turn every employee into an AI agent builder
As companies race to adopt AI, Benchmark general partner Everett Randle believes the key to success lies in empowering every worker with AI superpowers, and Gumloop’s intuitive agent builder is an example of the kind of tool that will unlock...
Alexa+ gets a new ‘adults only’ personality option that curses but won’t do NSFW content
The new Sassy style can curse and roast you, but the fun ends there.
Wonderful raises $150M Series B at $2B valuation
The funding round, led by Insight Partners, comes just four months after Wonderful raised a $100 million Series A.
Google Maps is getting an AI ‘Ask Maps’ feature and upgraded ‘immersive’ navigation
The tech giant says the "Immersive Navigation" launch is the biggest update to Maps in over a decade.
Empathy Is Not What Changed: Clinical Assessment of Psychological Safety Across GPT Model Generations
arXiv:2603.09997v1 Announce Type: cross Abstract: When OpenAI deprecated GPT-4o in early 2026, thousands of users protested under #keep4o, claiming newer models had "lost their empathy." No published study has tested this claim. We conducted the first clinical measurement, evaluating three...
AraModernBERT: Transtokenized Initialization and Long-Context Encoder Modeling for Arabic
arXiv:2603.09982v1 Announce Type: cross Abstract: Encoder-only transformer models remain widely used for discriminative NLP tasks, yet recent architectural advances have largely focused on English. In this work, we present AraModernBERT, an adaptation of the ModernBERT encoder architecture to Arabic, and...
Agentic Control Center for Data Product Optimization
arXiv:2603.10133v1 Announce Type: new Abstract: Data products enable end users to gain greater insights about their data by providing supporting assets, such as example question-SQL pairs which can be answered using the data or views over the database tables. However,...
SENS-ASR: Semantic Embedding injection in Neural-transducer for Streaming Automatic Speech Recognition
arXiv:2603.10005v1 Announce Type: cross Abstract: Many Automatic Speech Recognition (ASR) applications require streaming processing of the audio data. In streaming mode, ASR systems need to start transcribing the input stream before it is complete, i.e., the systems have to process...
Hybrid Self-evolving Structured Memory for GUI Agents
arXiv:2603.10291v1 Announce Type: new Abstract: The remarkable progress of vision-language models (VLMs) has enabled GUI agents to interact with computers in a human-like manner. Yet real-world computer-use tasks remain difficult due to long-horizon workflows, diverse interfaces, and frequent intermediate errors....
HEAL: Hindsight Entropy-Assisted Learning for Reasoning Distillation
arXiv:2603.10359v1 Announce Type: new Abstract: Distilling reasoning capabilities from Large Reasoning Models (LRMs) into smaller models is typically constrained by the limitation of rejection sampling. Standard methods treat the teacher as a static filter, discarding complex "corner-case" problems where the...
Quantifying Hallucinations in Language Language Models on Medical Textbooks
arXiv:2603.09986v1 Announce Type: cross Abstract: Hallucinations, the tendency for large language models to provide responses with factually incorrect and unsupported claims, is a serious problem within natural language processing for which we do not yet have an effective solution to...
FERRET: Framework for Expansion Reliant Red Teaming
arXiv:2603.10010v1 Announce Type: cross Abstract: We introduce a multi-faceted automated red teaming framework in which the goal is to generate multi-modal adversarial conversations that would break a target model and introduce various expansions that would result in more effective and...
A Governance and Evaluation Framework for Deterministic, Rule-Based Clinical Decision Support in Empiric Antibiotic Prescribing
arXiv:2603.10027v1 Announce Type: cross Abstract: Empiric antibiotic prescribing in high-risk clinical contexts often requires decision making under conditions of incomplete information, where inappropriate coverage or unjustified escalation may compromise safety and antimicrobial stewardship. While clinical decision-support systems have been proposed...
The DMA Streaming Framework: Kernel-Level Buffer Orchestration for High-Performance AI Data Paths
arXiv:2603.10030v1 Announce Type: cross Abstract: AI transport libraries move bytes efficiently, but they commonly assume that buffers are already correctly allocated, placed, shared, registered, and safe under completion and teardown pressure. This paper presents dmaplane, a Linux kernel module that...
GATech at AbjadGenEval Shared Task: Multilingual Embeddings for Arabic Machine-Generated Text Classification
arXiv:2603.10007v1 Announce Type: new Abstract: We present our approach to the AbjadGenEval shared task on detecting AI-generated Arabic text. We fine-tuned the multilingual E5-large encoder for binary classification, and we explored several pooling strategies to pool token representations, including weighted...
Evaluating Progress in Graph Foundation Models: A Comprehensive Benchmark and New Insights
arXiv:2603.10033v1 Announce Type: new Abstract: Graph foundation models (GFM) aim to acquire transferable knowledge by pre-training on diverse graphs, which can be adapted to various downstream tasks. However, domain shift in graphs is inherently two-dimensional: graphs differ not only in...
TriageSim: A Conversational Emergency Triage Simulation Framework from Structured Electronic Health Records
arXiv:2603.10035v1 Announce Type: new Abstract: Research in emergency triage is restricted to structured electronic health records (EHR) due to regulatory constraints on nurse-patient interactions. We introduce TriageSim, a simulation framework for generating persona-conditioned triage conversations from structured records. TriageSim enables...
The Prediction-Measurement Gap: Toward Meaning Representations as Scientific Instruments
arXiv:2603.10130v1 Announce Type: new Abstract: Text embeddings have become central to computational social science and psychology, enabling scalable measurement of meaning and mixed-method inference. Yet most representation learning is optimized and evaluated for prediction and retrieval, yielding a prediction-measurement gap:...
The Generation-Recognition Asymmetry: Six Dimensions of a Fundamental Divide in Formal Language Theory
arXiv:2603.10139v1 Announce Type: new Abstract: Every formal grammar defines a language and can in principle be used in three ways: to generate strings (production), to recognize them (parsing), or -- given only examples -- to infer the grammar itself (grammar...
OpenClaw-RL: Train Any Agent Simply by Talking
arXiv:2603.10165v1 Announce Type: new Abstract: Every agent interaction generates a next-state signal, namely the user reply, tool output, terminal or GUI state change that follows each action, yet no existing agentic RL system recovers it as a live, online learning...
Adaptive Activation Cancellation for Hallucination Mitigation in Large Language Models
arXiv:2603.10195v1 Announce Type: new Abstract: Large Language Models frequently generate fluent but factually incorrect text. We propose Adaptive Activation Cancellation (AAC), a real-time inference-time framework that treats hallucination-associated neural activations as structured interference within the transformer residual stream, drawing an...
ViDia2Std: A Parallel Corpus and Methods for Low-Resource Vietnamese Dialect-to-Standard Translation
arXiv:2603.10211v1 Announce Type: new Abstract: Vietnamese exhibits extensive dialectal variation, posing challenges for NLP systems trained predominantly on standard Vietnamese. Such systems often underperform on dialectal inputs, especially from underrepresented Central and Southern regions. Previous work on dialect normalization has...
Sabi\'a-4 Technical Report
arXiv:2603.10213v1 Announce Type: new Abstract: This technical report presents Sabi\'a-4 and Sabiazinho-4, a new generation of Portuguese language models with a focus on Brazilian Portuguese language. The models were developed through a four-stage training pipeline: continued pre-training on Portuguese and...
Large language models can disambiguate opioid slang on social media
arXiv:2603.10313v1 Announce Type: new Abstract: Social media text shows promise for monitoring trends in the opioid overdose crisis; however, the overwhelming majority of social media text is unrelated to opioids. When leveraging social media text to monitor trends in the...
Dynamic Knowledge Fusion for Multi-Domain Dialogue State Tracking
arXiv:2603.10367v1 Announce Type: new Abstract: The performance of task-oriented dialogue models is strongly tied to how well they track dialogue states, which records and updates user information across multi-turn interactions. However, current multi-domain DST encounters two key challenges: the difficulty...
LWM-Temporal: Sparse Spatio-Temporal Attention for Wireless Channel Representation Learning
arXiv:2603.10024v1 Announce Type: new Abstract: LWM-Temporal is a new member of the Large Wireless Models (LWM) family that targets the spatiotemporal nature of wireless channels. Designed as a task-agnostic foundation model, LWM-Temporal learns universal channel embeddings that capture mobility-induced evolution...
Gated Adaptation for Continual Learning in Human Activity Recognition
arXiv:2603.10046v1 Announce Type: new Abstract: Wearable sensors in Internet of Things (IoT) ecosystems increasingly support applications such as remote health monitoring, elderly care, and smart home automation, all of which rely on robust human activity recognition (HAR). Continual learning systems...
Revisiting Sharpness-Aware Minimization: A More Faithful and Effective Implementation
arXiv:2603.10048v1 Announce Type: new Abstract: Sharpness-Aware Minimization (SAM) enhances generalization by minimizing the maximum training loss within a predefined neighborhood around the parameters. However, its practical implementation approximates this as gradient ascent(s) followed by applying the gradient at the ascent...
Dissecting Chronos: Sparse Autoencoders Reveal Causal Feature Hierarchies in Time Series Foundation Models
arXiv:2603.10071v1 Announce Type: new Abstract: Time series foundation models (TSFMs) are increasingly deployed in high-stakes domains, yet their internal representations remain opaque. We present the first application of sparse autoencoders (SAEs) to a TSFM, training TopK SAEs on activations of...
Large Spikes in Stochastic Gradient Descent: A Large-Deviations View
arXiv:2603.10079v1 Announce Type: new Abstract: We analyse SGD training of a shallow, fully connected network in the NTK scaling and provide a quantitative theory of the catapult phase. We identify an explicit criterion separating two behaviours: When an explicit function...