Automated Thematic Analysis for Clinical Qualitative Data: Iterative Codebook Refinement with Full Provenance
arXiv:2603.08989v1 Announce Type: new Abstract: Thematic analysis (TA) is widely used in health research to extract patterns from patient interviews, yet manual TA faces challenges in scalability and reproducibility. LLM-based automation can help, but existing approaches produce codebooks with limited...
AI Act Evaluation Benchmark: An Open, Transparent, and Reproducible Evaluation Dataset for NLP and RAG Systems
arXiv:2603.09435v1 Announce Type: new Abstract: The rapid rollout of AI in heterogeneous public and societal sectors has subsequently escalated the need for compliance with regulatory standards and frameworks. The EU AI Act has emerged as a landmark in the regulatory...
Cognitively Layered Data Synthesis for Domain Adaptation of LLMs to Space Situational Awareness
arXiv:2603.09231v1 Announce Type: new Abstract: Large language models (LLMs) demonstrate exceptional performance on general-purpose tasks. however, transferring them to complex engineering domains such as space situational awareness (SSA) remains challenging owing to insufficient structural alignment with mission chains, the absence...
Curveball Steering: The Right Direction To Steer Isn't Always Linear
arXiv:2603.09313v1 Announce Type: new Abstract: Activation steering is a widely used approach for controlling large language model (LLM) behavior by intervening on internal representations. Existing methods largely rely on the Linear Representation Hypothesis, assuming behavioral attributes can be manipulated using...
Are Expressive Encoders Necessary for Discrete Graph Generation?
arXiv:2603.08825v1 Announce Type: new Abstract: Discrete graph generation has emerged as a powerful paradigm for modeling graph data, often relying on highly expressive neural backbones such as transformers or higher-order architectures. We revisit this design choice by introducing GenGNN, a...
MAcPNN: Mutual Assisted Learning on Data Streams with Temporal Dependence
arXiv:2603.08972v1 Announce Type: new Abstract: Internet of Things (IoT) Analytics often involves applying machine learning (ML) models on data streams. In such scenarios, traditional ML paradigms face obstacles related to continuous learning while dealing with concept drifts, temporal dependence, and...
MAPLE: Elevating Medical Reasoning from Statistical Consensus to Process-Led Alignment
arXiv:2603.08987v1 Announce Type: new Abstract: Recent advances in medical large language models have explored Test-Time Reinforcement Learning (TTRL) to enhance reasoning. However, standard TTRL often relies on majority voting (MV) as a heuristic supervision signal, which can be unreliable in...
An accurate flatness measure to estimate the generalization performance of CNN models
arXiv:2603.09016v1 Announce Type: new Abstract: Flatness measures based on the spectrum or the trace of the Hessian of the loss are widely used as proxies for the generalization ability of deep networks. However, most existing definitions are either tailored to...
$P^2$GNN: Two Prototype Sets to boost GNN Performance
arXiv:2603.09195v1 Announce Type: new Abstract: Message Passing Graph Neural Networks (MP-GNNs) have garnered attention for addressing various industry challenges, such as user recommendation and fraud detection. However, they face two major hurdles: (1) heavy reliance on local context, often lacking...
Strategically Robust Multi-Agent Reinforcement Learning with Linear Function Approximation
arXiv:2603.09208v1 Announce Type: new Abstract: Provably efficient and robust equilibrium computation in general-sum Markov games remains a core challenge in multi-agent reinforcement learning. Nash equilibrium is computationally intractable in general and brittle due to equilibrium multiplicity and sensitivity to approximation...
Lying to Win: Assessing LLM Deception through Human-AI Games and Parallel-World Probing
arXiv:2603.07202v1 Announce Type: new Abstract: As Large Language Models (LLMs) transition into autonomous agentic roles, the risk of deception-defined behaviorally as the systematic provision of false information to satisfy external incentives-poses a significant challenge to AI safety. Existing benchmarks often...
RILEC: Detection and Generation of L1 Russian Interference Errors in English Learner Texts
arXiv:2603.07366v1 Announce Type: new Abstract: Many errors in student essays can be explained by influence from the native language (L1). L1 interference refers to errors influenced by a speaker's first language, such as using stadion instead of stadium, reflecting lexical...
A Joint Neural Baseline for Concept, Assertion, and Relation Extraction from Clinical Text
arXiv:2603.07487v1 Announce Type: new Abstract: Clinical information extraction (e.g., 2010 i2b2/VA challenge) usually presents tasks of concept recognition, assertion classification, and relation extraction. Jointly modeling the multi-stage tasks in the clinical domain is an underexplored topic. The existing independent task...
Bolbosh: Script-Aware Flow Matching for Kashmiri Text-to-Speech
arXiv:2603.07513v1 Announce Type: new Abstract: Kashmiri is spoken by around 7 million people but remains critically underserved in speech technology, despite its official status and rich linguistic heritage. The lack of robust Text-to-Speech (TTS) systems limits digital accessibility and inclusive...
HEARTS: Benchmarking LLM Reasoning on Health Time Series
arXiv:2603.06638v1 Announce Type: new Abstract: The rise of large language models (LLMs) has shifted time series analysis from narrow analytics to general-purpose reasoning. Yet, existing benchmarks cover only a small set of health time series modalities and tasks, failing to...
Orion: Characterizing and Programming Apple's Neural Engine for LLM Training and Inference
arXiv:2603.06728v1 Announce Type: new Abstract: Over two billion Apple devices ship with a Neural Processing Unit (NPU) - the Apple Neural Engine (ANE) - yet this accelerator remains largely unused for large language model workloads. CoreML, Apple's public ML framework,...
Qualcomm’s partnership with Neura Robotics is just the beginning
Neura Robotics is going to build new robots on top of Qualcomm's new IQ10 processors that were released at CES.
Boosting deep Reinforcement Learning using pretraining with Logical Options
arXiv:2603.06565v1 Announce Type: new Abstract: Deep reinforcement learning agents are often misaligned, as they over-exploit early reward signals. Recently, several symbolic approaches have addressed these challenges by encoding sparse objectives along with aligned plans. However, purely symbolic architectures are complex...
PRISM: Personalized Refinement of Imitation Skills for Manipulation via Human Instructions
arXiv:2603.05574v1 Announce Type: cross Abstract: This paper presents PRISM: an instruction-conditioned refinement method for imitation policies in robotic manipulation. This approach bridges Imitation Learning (IL) and Reinforcement Learning (RL) frameworks into a seamless pipeline, such that an imitation policy on...
Relational Semantic Reasoning on 3D Scene Graphs for Open World Interactive Object Search
arXiv:2603.05642v1 Announce Type: cross Abstract: Open-world interactive object search in household environments requires understanding semantic relationships between objects and their surrounding context to guide exploration efficiently. Prior methods either rely on vision-language embeddings similarity, which does not reliably capture task-relevant...
CBR-to-SQL: Rethinking Retrieval-based Text-to-SQL using Case-based Reasoning in the Healthcare Domain
arXiv:2603.05569v1 Announce Type: cross Abstract: Extracting insights from Electronic Health Record (EHR) databases often requires SQL expertise, creating a barrier for healthcare decision-making and research. While a promising approach is to use Large Language Models (LLMs) to translate natural language...
The Fragility Of Moral Judgment In Large Language Models
arXiv:2603.05651v1 Announce Type: cross Abstract: People increasingly use large language models (LLMs) for everyday moral and interpersonal guidance, yet these systems cannot interrogate missing context and judge dilemmas as presented. We introduce a perturbation framework for testing the stability and...
Offline Materials Optimization with CliqueFlowmer
arXiv:2603.06082v1 Announce Type: new Abstract: Recent advances in deep learning inspired neural network-based approaches to computational materials discovery (CMD). A plethora of problems in this field involve finding materials that optimize a target property. Nevertheless, the increasingly popular generative modeling...
The EpisTwin: A Knowledge Graph-Grounded Neuro-Symbolic Architecture for Personal AI
arXiv:2603.06290v1 Announce Type: new Abstract: Personal Artificial Intelligence is currently hindered by the fragmentation of user data across isolated silos. While Retrieval-Augmented Generation offers a partial remedy, its reliance on unstructured vector similarity fails to capture the latent semantic topology...
JAWS: Enhancing Long-term Rollout of Neural Operators via Spatially-Adaptive Jacobian Regularization
arXiv:2603.05538v1 Announce Type: cross Abstract: Data-driven surrogate models improve the efficiency of simulating continuous dynamical systems, yet their autoregressive rollouts are often limited by instability and spectral blow-up. While global regularization techniques can enforce contractive dynamics, they uniformly damp high-frequency...
InfoGatherer: Principled Information Seeking via Evidence Retrieval and Strategic Questioning
arXiv:2603.05909v1 Announce Type: new Abstract: LLMs are increasingly deployed in high-stakes domains such as medical triage and legal assistance, often as document-grounded QA systems in which a user provides a description, relevant sources are retrieved, and an LLM generates a...
Making Implicit Premises Explicit in Logical Understanding of Enthymemes
arXiv:2603.06114v1 Announce Type: new Abstract: Real-world arguments in text and dialogues are normally enthymemes (i.e. some of their premises and/or claims are implicit). Natural language processing (NLP) methods for handling enthymemes can potentially identify enthymemes in text but they do...
IntSeqBERT: Learning Arithmetic Structure in OEIS via Modulo-Spectrum Embeddings
arXiv:2603.05556v1 Announce Type: new Abstract: Integer sequences in the OEIS span values from single-digit constants to astronomical factorials and exponentials, making prediction challenging for standard tokenised models that cannot handle out-of-vocabulary values or exploit periodic arithmetic structure. We present IntSeqBERT,...
A Novel Hybrid Heuristic-Reinforcement Learning Optimization Approach for a Class of Railcar Shunting Problems
arXiv:2603.05579v1 Announce Type: new Abstract: Railcar shunting is a core planning task in freight railyards, where yard planners need to disassemble and reassemble groups of railcars to form outbound trains. Classification tracks with access from one side only can be...
Warm Starting State-Space Models with Automata Learning
arXiv:2603.05694v1 Announce Type: new Abstract: We prove that Moore machines can be exactly realized as state-space models (SSMs), establishing a formal correspondence between symbolic automata and these continuous machine learning architectures. These Moore-SSMs preserve both the complete symbolic structure and...