LLM-Augmented Knowledge Base Construction For Root Cause Analysis
arXiv:2604.06171v1 Announce Type: new Abstract: Communications networks now form the backbone of our digital world, with fast and reliable connectivity. However, even with appropriate redundancy and failover mechanisms, it is difficult to guarantee "five 9s" (99.999 %) reliability, requiring rapid...
DataSTORM: Deep Research on Large-Scale Databases using Exploratory Data Analysis and Data Storytelling
arXiv:2604.06474v1 Announce Type: new Abstract: Deep research with Large Language Model (LLM) agents is emerging as a powerful paradigm for multi-step information discovery, synthesis, and analysis. However, existing approaches primarily focus on unstructured web data, while the challenges of conducting...
VLMShield: Efficient and Robust Defense of Vision-Language Models against Malicious Prompts
arXiv:2604.06502v1 Announce Type: new Abstract: Vision-Language Models (VLMs) face significant safety vulnerabilities from malicious prompt attacks due to weakened alignment during visual integration. Existing defenses suffer from efficiency and robustness. To address these challenges, we first propose the Multimodal Aggregated...
LLM-based Schema-Guided Extraction and Validation of Missing-Person Intelligence from Heterogeneous Data Sources
arXiv:2604.06571v1 Announce Type: new Abstract: Missing-person and child-safety investigations rely on heterogeneous case documents, including structured forms, bulletin-style posters, and narrative web profiles. Variations in layout, terminology, and data quality impede rapid triage, large-scale analysis, and search-planning workflows. This paper...
Fine-tuning Whisper for Pashto ASR: strategies and scale
arXiv:2604.06507v1 Announce Type: new Abstract: Pashto is absent from Whisper's pre-training corpus despite being one of CommonVoice's largest language collections, leaving off-the-shelf models unusable: all Whisper sizes output Arabic, Dari, or Urdu script on Pashto audio, achieving word error rates...
Asymptotic-Preserving Neural Networks for Viscoelastic Parameter Identification in Multiscale Blood Flow Modeling
arXiv:2604.06287v1 Announce Type: new Abstract: Mathematical models and numerical simulations offer a non-invasive way to explore cardiovascular phenomena, providing access to quantities that cannot be measured directly. In this study, we start with a one-dimensional multiscale blood flow model that...
ART: Attention Replacement Technique to Improve Factuality in LLMs
arXiv:2604.06393v1 Announce Type: new Abstract: Hallucination in large language models (LLMs) continues to be a significant issue, particularly in tasks like question answering, where models often generate plausible yet incorrect or irrelevant information. Although various methods have been proposed to...
Severity-Aware Weighted Loss for Arabic Medical Text Generation
arXiv:2604.06346v1 Announce Type: new Abstract: Large language models have shown strong potential for Arabic medical text generation; however, traditional fine-tuning objectives treat all medical cases uniformly, ignoring differences in clinical severity. This limitation is particularly critical in healthcare settings, where...
AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent
arXiv:2604.06296v1 Announce Type: new Abstract: AI agents are increasingly deployed in real-world applications, including systems such as Manus, OpenClaw, and coding agents. Existing research has primarily focused on \emph{server-side} efficiency, proposing methods such as caching, speculative execution, traffic scheduling, and...
Distributed Interpretability and Control for Large Language Models
arXiv:2604.06483v1 Announce Type: new Abstract: Large language models that require multiple GPU cards to host are usually the most capable models. It is necessary to understand and steer these models, but the current technologies do not support the interpretability and...
Astropad’s Workbench reimagines remote desktop for AI agents, not IT support
Astropad’s Workbench lets users remotely monitor and control AI agents on Mac Minis from iPhone or iPad, with low-latency streaming and mobile access.
Weighted Bayesian Conformal Prediction
arXiv:2604.06464v1 Announce Type: new Abstract: Conformal prediction provides distribution-free prediction intervals with finite-sample coverage guarantees, and recent work by Snell \& Griffiths reframes it as Bayesian Quadrature (BQ-CP), yielding powerful data-conditional guarantees via Dirichlet posteriors over thresholds. However, BQ-CP fundamentally...
Bi-level Heterogeneous Learning for Time Series Foundation Models: A Federated Learning Approach
arXiv:2604.06727v1 Announce Type: new Abstract: Heterogeneity in time series data is more pronounced than in vision or language, as temporal dynamics vary substantially across domains and tasks. Existing efforts on training time series foundation models (TSFMs) from scratch are often...
Optimal Rates for Pure {\varepsilon}-Differentially Private Stochastic Convex Optimization with Heavy Tails
arXiv:2604.06492v1 Announce Type: new Abstract: We study stochastic convex optimization (SCO) with heavy-tailed gradients under pure epsilon-differential privacy (DP). Instead of assuming a bound on the worst-case Lipschitz parameter of the loss, we assume only a bounded k-th moment. This...
Bi-Level Optimization for Single Domain Generalization
arXiv:2604.06349v1 Announce Type: new Abstract: Generalizing from a single labeled source domain to unseen target domains, without access to any target data during training, remains a fundamental challenge in robust machine learning. We address this underexplored setting, known as Single...
FLeX: Fourier-based Low-rank EXpansion for multilingual transfer
arXiv:2604.06253v1 Announce Type: new Abstract: Cross-lingual code generation is critical in enterprise environments where multiple programming languages coexist. However, fine-tuning large language models (LLMs) individually for each language is computationally prohibitive. This paper investigates whether parameter-efficient fine-tuning methods and optimizer...
Unsupervised Neural Network for Automated Classification of Surgical Urgency Levels in Medical Transcriptions
arXiv:2604.06214v1 Announce Type: new Abstract: Efficient classification of surgical procedures by urgency is paramount to optimize patient care and resource allocation within healthcare systems. This study introduces an unsupervised neural network approach to automatically categorize surgical transcriptions into three urgency...
Conformal Margin Risk Minimization: An Envelope Framework for Robust Learning under Label Noise
arXiv:2604.06468v1 Announce Type: new Abstract: Most methods for learning with noisy labels require privileged knowledge such as noise transition matrices, clean subsets or pretrained feature extractors, resources typically unavailable when robustness is most needed. We propose Conformal Margin Risk Minimization...
Temporally Phenotyping GLP-1RA Case Reports with Large Language Models: A Textual Time Series Corpus and Risk Modeling
arXiv:2604.06197v1 Announce Type: new Abstract: Type 2 diabetes case reports describe complex clinical courses, but their timelines are often expressed in language that is difficult to reuse in longitudinal modeling. To address this gap, we developed a textual time-series corpus...
FMI@SU ToxHabits: Evaluating LLMs Performance on Toxic Habit Extraction in Spanish Clinical Texts
arXiv:2604.06403v1 Announce Type: new Abstract: The paper presents an approach for the recognition of toxic habits named entities in Spanish clinical texts. The approach was developed for the ToxHabits Shared Task. Our team participated in subtask 1, which aims to...
Bi-Lipschitz Autoencoder With Injectivity Guarantee
arXiv:2604.06701v1 Announce Type: new Abstract: Autoencoders are widely used for dimensionality reduction, based on the assumption that high-dimensional data lies on low-dimensional manifolds. Regularized autoencoders aim to preserve manifold geometry during dimensionality reduction, but existing approaches often suffer from non-injective...
SMT-AD: a scalable quantum-inspired anomaly detection approach
arXiv:2604.06265v1 Announce Type: new Abstract: Quantum-inspired tensor networks algorithms have shown to be effective and efficient models for machine learning tasks, including anomaly detection. Here, we propose a highly parallelizable quantum-inspired approach which we call SMT-AD from Superposition of Multiresolution...
The Illusion of Superposition? A Principled Analysis of Latent Thinking in Language Models
arXiv:2604.06374v1 Announce Type: new Abstract: Latent reasoning via continuous chain-of-thoughts (Latent CoT) has emerged as a promising alternative to discrete CoT reasoning. Operating in continuous space increases expressivity and has been hypothesized to enable superposition: the ability to maintain multiple...
Stop Fixating on Prompts: Reasoning Hijacking and Constraint Tightening for Red-Teaming LLM Agents
arXiv:2604.05549v1 Announce Type: new Abstract: With the widespread application of LLM-based agents across various domains, their complexity has introduced new security threats. Existing red-team methods mostly rely on modifying user prompts, which lack adaptability to new data and may impact...
Bivariate Causal Discovery Using Rate-Distortion MDL: An Information Dimension Approach
arXiv:2604.05829v1 Announce Type: new Abstract: Approaches to bivariate causal discovery based on the minimum description length (MDL) principle approximate the (uncomputable) Kolmogorov complexity of the models in each causal direction, selecting the one with the lower total complexity. The premise...
Memory Dial: A Training Framework for Controllable Memorization in Language Models
arXiv:2604.05074v1 Announce Type: new Abstract: Memorization in language models is widely studied but remains difficult to isolate and control. Understanding when and what models memorize is essential for explaining their predictions, yet existing approaches are post-hoc: they can detect memorization...
Human Values Matter: Investigating How Misalignment Shapes Collective Behaviors in LLM Agent Communities
arXiv:2604.05339v1 Announce Type: new Abstract: As LLMs become increasingly integrated into human society, evaluating their orientations on human values from social science has drawn growing attention. Nevertheless, it is still unclear why human values matter for LLMs, especially in LLM-based...
DIA-HARM: Dialectal Disparities in Harmful Content Detection Across 50 English Dialects
arXiv:2604.05318v1 Announce Type: new Abstract: Harmful content detectors-particularly disinformation classifiers-are predominantly developed and evaluated on Standard American English (SAE), leaving their robustness to dialectal variation unexplored. We present DIA-HARM, the first benchmark for evaluating disinformation detection robustness across 50 English...
ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces
arXiv:2604.05172v1 Announce Type: new Abstract: Large language model (LLM) agents are increasingly deployed to automate productivity tasks (e.g., email, scheduling, document management), but evaluating them on live services is risky due to potentially irreversible changes. Existing benchmarks rely on simplified...
Controllable Image Generation with Composed Parallel Token Prediction
arXiv:2604.05730v1 Announce Type: new Abstract: Conditional discrete generative models struggle to faithfully compose multiple input conditions. To address this, we derive a theoretically-grounded formulation for composing discrete probabilistic generative processes, with masked generation (absorbing diffusion) as a special case. Our...