A Supreme Court status report
In early January, as the country eagerly awaited a tariffs ruling that – as it turned out – was still more than a month away, …
Quality follows upgrading
All Articles
In early January, as the country eagerly awaited a tariffs ruling that – as it turned out – was still more than a month away, …
arXiv:2604.06505v1 Announce Type: new Abstract: Large language models (LLMs) are widely explored for reasoning-intensive research tasks, yet resources for testing whether they can infer scientific …
arXiv:2604.06650v1 Announce Type: new Abstract: Existing prompt-based fine-tuning methods typically learn task-specific prompts independently, imposing significant computing and storage overhead at scale when deploying multiple …
Poke brings AI agents to everyday users via text message by handling tasks and automations without complex setup, apps, or technical know-how.
arXiv:2604.06422v1 Announce Type: new Abstract: Understanding when Vision-Language Models (VLMs) will behave unexpectedly, whether models can reliably predict their own behavior, and if models adhere …
arXiv:2604.06552v1 Announce Type: new Abstract: Misinformation is on the rise, and the strong writing capabilities of LLMs lower the barrier for malicious actors to produce …
arXiv:2604.06507v1 Announce Type: new Abstract: Pashto is absent from Whisper's pre-training corpus despite being one of CommonVoice's largest language collections, leaving off-the-shelf models unusable: all …
arXiv:2604.06267v1 Announce Type: new Abstract: Multimodal variational autoencoders (VAEs) have emerged as a powerful framework for survival risk modeling in multiple myeloma by integrating heterogeneous …
arXiv:2604.06256v1 Announce Type: new Abstract: Training dynamics during grokking concentrate along a small number of dominant update directions -- the spectral edge -- which reliably …
arXiv:2604.06196v1 Announce Type: new Abstract: Three-way logical question answering (QA) assigns $True/False/Unknown$ to a hypothesis $H$ given a premise set $S$. While modern large language …
arXiv:2604.06208v1 Announce Type: new Abstract: A significant amount of data held in Oncology Electronic Medical Records (EMRs) is contained in unstructured provider notes -- including …
arXiv:2604.06202v1 Announce Type: new Abstract: Large language models (LLMs) have transformed natural language processing, yet their capabilities remain uneven across languages. Most multilingual models are …