AV1’s open, royalty-free promise in question as Dolby sues Snapchat over codec
Big Tech declaring AV1 royalty-free “doesn’t mean that it is.”
Senators want US energy information agency to monitor data center electricity usage
In a letter, senators press for mandated annual electricity disclosure for data centers.
Anthropic’s Claude popularity with paying consumers is skyrocketing
Estimates for total Claude consumer users are all over the map (we've seen figures ranging from 18 million to 30 million). Anthropic hasn't disclosed this data, but a spokesperson did tell TechCrunch that Claude paid subscriptions have more than doubled...
Why SoftBank’s new $40B loan points to a 2026 OpenAI IPO
Wall Street giants JPMorgan and Goldman Sachs are extending a 12-month, unsecured loan to the Japanese conglomerate.
Memory chip giant SK hynix could help end ‘RAMmageddon’ with blockbuster US IPO
SK hynix’s potential U.S. listing could raise $10-$14 billion to help it build more capacity, encourage others to follow, and end the ‘RAMmageddon’ memory shortage.
VCs are betting billions on AI’s next wave, so why is OpenAI killing Sora?
When an 82-year-old Kentucky woman was offered $26 million from an AI company that wanted to build a data center on her land, she said no. Sure, that same company can try to rezone 2,000 acres nearby anyway, but as...
Wikipedia cracks down on the use of AI in article writing
The site, whose policies are subject to change, has struggled with the issue of AI-generated writing.
Cohere launches an open source voice model specifically for transcription
Relatively light at just 2 billion parameters, the model is meant for use with consumer-grade GPUs for those who want to self-host it. It currently supports 14 languages.
Corrigendum to “Generating dynamic lip-syncing using target audio in a multimedia environment” [Natural Language Processing Journal, Volume 8, 2024]
Chitrakshara: A Large Multilingual Multimodal Dataset for Indian languages
arXiv:2603.23521v1 Announce Type: new Abstract: Multimodal research has predominantly focused on single-image reasoning, with limited exploration of multi-image scenarios. Recent models have sought to enhance multi-image understanding through large-scale pretraining on interleaved image-text datasets. However, most Vision-Language Models (VLMs) are...
Navigating the Concept Space of Language Models
arXiv:2603.23524v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) trained on large language model activations output thousands of features that enable mapping to human-interpretable concepts. The current practice for analyzing these features primarily relies on inspecting top-activating examples, manually browsing individual...
Cluster-R1: Large Reasoning Models Are Instruction-following Clustering Agents
arXiv:2603.23518v1 Announce Type: new Abstract: General-purpose embedding models excel at recognizing semantic similarities but fail to capture the characteristics of texts specified by user instructions. In contrast, instruction-tuned embedders can align embeddings with textual instructions yet cannot autonomously infer latent...
DISCO: Document Intelligence Suite for COmparative Evaluation
arXiv:2603.23511v1 Announce Type: new Abstract: Document intelligence requires accurate text extraction and reliable reasoning over document content. We introduce DISCO, a Document Intelligence Suite for COmparative Evaluation, that evaluates optical character recognition (OCR) pipelines and vision-language models (VLMs) separately on...
DepthCharge: A Domain-Agnostic Framework for Measuring Depth-Dependent Knowledge in Large Language Models
arXiv:2603.23514v1 Announce Type: new Abstract: Large Language Models appear competent when answering general questions but often fail when pushed into domain-specific details. No existing methodology provides an out-of-the-box solution for measuring how deeply LLMs can sustain accurate responses under adaptive...
S-Path-RAG: Semantic-Aware Shortest-Path Retrieval Augmented Generation for Multi-Hop Knowledge Graph Question Answering
arXiv:2603.23512v1 Announce Type: new Abstract: We present S-Path-RAG, a semantic-aware shortest-path Retrieval-Augmented Generation framework designed to improve multi-hop question answering over large knowledge graphs. S-Path-RAG departs from one-shot, text-heavy retrieval by enumerating bounded-length, semantically weighted candidate paths using a hybrid...
Compression Method Matters: Benchmark-Dependent Output Dynamics in LLM Prompt Compression
arXiv:2603.23527v1 Announce Type: new Abstract: Prompt compression is often evaluated by input-token reduction, but its real deployment impact depends on how compression changes output length and total inference cost. We present a controlled replication and extension study of benchmark-dependent output...
Training a Large Language Model for Medical Coding Using Privacy-Preserving Synthetic Clinical Data
arXiv:2603.23515v1 Announce Type: new Abstract: Improving the accuracy and reliability of medical coding reduces clinician burnout and supports revenue cycle processes, freeing providers to focus more on patient care. However, automating the assignment of ICD-10-CM and CPT codes from clinical...
Fast and Faithful: Real-Time Verification for Long-Document Retrieval-Augmented Generation Systems
arXiv:2603.23508v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) is increasingly deployed in enterprise search and document-centric assistants, where responses must be grounded in long and complex source materials. In practice, verifying that generated answers faithfully reflect retrieved documents is difficult:...
Qworld: Question-Specific Evaluation Criteria for LLMs
arXiv:2603.23522v1 Announce Type: new Abstract: Evaluating large language models (LLMs) on open-ended questions is difficult because response quality depends on the question's context. Binary scores and static rubrics fail to capture these context-dependent requirements. Existing methods define criteria at the...
Do 3D Large Language Models Really Understand 3D Spatial Relationships?
arXiv:2603.23523v1 Announce Type: new Abstract: Recent 3D Large-Language Models (3D-LLMs) claim to understand 3D worlds, especially spatial relationships among objects. Yet, we find that simply fine-tuning a language model on text-only question-answer pairs can perform comparably to or even surpass these...
Konkani LLM: Multi-Script Instruction Tuning and Evaluation for a Low-Resource Indian Language
arXiv:2603.23529v1 Announce Type: new Abstract: Large Language Models (LLMs) consistently underperform in low-resource linguistic contexts such as Konkani. This performance deficit stems from acute training data scarcity compounded by high script diversity across Devanagari, Romi and Kannada orthographies. To...
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens
arXiv:2603.23516v1 Announce Type: new Abstract: Long-term memory is a cornerstone of human intelligence. Enabling AI to process lifetime-scale information remains a long-standing pursuit in the field. Due to the constraints of full-attention architectures, the effective context length of large language...
Leveraging Computerized Adaptive Testing for Cost-effective Evaluation of Large Language Models in Medical Benchmarking
arXiv:2603.23506v1 Announce Type: new Abstract: The rapid proliferation of large language models (LLMs) in healthcare creates an urgent need for scalable and psychometrically sound evaluation methods. Conventional static benchmarks are costly to administer repeatedly, vulnerable to data contamination, and lack...
Did You Forget What I Asked? Prospective Memory Failures in Large Language Models
arXiv:2603.23530v1 Announce Type: new Abstract: Large language models often fail to satisfy formatting instructions when they must simultaneously perform demanding tasks. We study this behaviour through a prospective-memory-inspired lens from cognitive psychology, using a controlled paradigm that combines...
Swiss-Bench SBP-002: A Frontier Model Comparison on Swiss Legal and Regulatory Tasks
arXiv:2603.23646v1 Announce Type: new Abstract: While recent work has benchmarked large language models on Swiss legal translation (Niklaus et al., 2025) and academic legal reasoning from university exams (Fan et al., 2025), no existing benchmark evaluates frontier model performance on...
Probing Ethical Framework Representations in Large Language Models: Structure, Entanglement, and Methodological Challenges
arXiv:2603.23659v1 Announce Type: new Abstract: When large language models make ethical judgments, do their internal representations distinguish between normative frameworks, or collapse ethics into a single acceptability dimension? We probe hidden representations across five ethical frameworks (deontology, utilitarianism, virtue, justice,...
PLACID: Privacy-preserving Large language models for Acronym Clinical Inference and Disambiguation
arXiv:2603.23678v1 Announce Type: new Abstract: Large Language Models (LLMs) offer transformative solutions across many domains, but healthcare integration is hindered by strict data privacy constraints. Clinical narratives are dense with ambiguous acronyms; misinterpreting these abbreviations can precipitate severe outcomes like...
IslamicMMLU: A Benchmark for Evaluating LLMs on Islamic Knowledge
arXiv:2603.23750v1 Announce Type: new Abstract: Large language models are increasingly consulted for Islamic knowledge, yet no comprehensive benchmark evaluates their performance across core Islamic disciplines. We introduce IslamicMMLU, a benchmark of 10,013 multiple-choice questions spanning three tracks: Quran (2,013 questions),...
Perturbation: A simple and efficient adversarial tracer for representation learning in language models
arXiv:2603.23821v1 Announce Type: new Abstract: Linguistic representation learning in deep neural language models (LMs) has been studied for decades, for both practical and theoretical reasons. However, finding representations in LMs remains an unsolved problem, in part due to a dilemma...
PoliticsBench: Benchmarking Political Values in Large Language Models with Multi-Turn Roleplay
arXiv:2603.23841v1 Announce Type: new Abstract: While Large Language Models (LLMs) are increasingly used as primary sources of information, their potential for political bias may impact their objectivity. Existing benchmarks of LLM social bias primarily evaluate gender and racial stereotypes. When...