ProtAlign: Contrastive learning paradigm for Sequence and structure alignment
arXiv:2603.06722v1 Announce Type: new Abstract: Protein language models often take into consideration the alignment between a protein sequence and its textual description. However, they do not take structural information into consideration. Traditional methods treat sequence and structure separately, limiting the...
Property-driven Protein Inverse Folding With Multi-Objective Preference Alignment
arXiv:2603.06748v1 Announce Type: new Abstract: Protein sequence design must balance designability, defined as the ability to recover a target backbone, with multiple, often competing, developability properties such as solubility, thermostability, and expression. Existing approaches address these properties through post hoc...
US blindsides states with surprise settlement in Live Nation/Ticketmaster trial
States seek mistrial, saying "sudden disappearance" of US will influence jury.
OpenAI and Google employees rush to Anthropic’s defense in DOD lawsuit
More than 30 OpenAI and Google DeepMind employees signed onto a statement supporting Anthropic's lawsuit against the Defense Department after the agency labeled the AI firm a supply-chain risk, according to court filings.
Anthropic sues Defense Department over supply-chain risk designation
Anthropic filed suit against the Department of Defense on Monday after the agency labeled it a supply-chain risk. The complaint calls the DOD's actions "unprecedented and unlawful."
An Interactive Multi-Agent System for Evaluation of New Product Concepts
arXiv:2603.05980v1 Announce Type: new Abstract: Product concept evaluation is a critical stage that determines strategic resource allocation and project success in enterprises. However, traditional expert-led approaches face limitations such as subjective bias and high time and cost requirements. To support...
Relational Semantic Reasoning on 3D Scene Graphs for Open World Interactive Object Search
arXiv:2603.05642v1 Announce Type: cross Abstract: Open-world interactive object search in household environments requires understanding semantic relationships between objects and their surrounding context to guide exploration efficiently. Prior methods either rely on vision-language embeddings similarity, which does not reliably capture task-relevant...
Offline Materials Optimization with CliqueFlowmer
arXiv:2603.06082v1 Announce Type: new Abstract: Recent advances in deep learning inspired neural network-based approaches to computational materials discovery (CMD). A plethora of problems in this field involve finding materials that optimize a target property. Nevertheless, the increasingly popular generative modeling...
Talk Freely, Execute Strictly: Schema-Gated Agentic AI for Flexible and Reproducible Scientific Workflows
arXiv:2603.06394v1 Announce Type: new Abstract: Large language models (LLMs) can now translate a researcher's plain-language goal into executable computation, yet scientific workflows demand determinism, provenance, and governance that are difficult to guarantee when an LLM decides what runs. Semi-structured interviews...
DeepFact: Co-Evolving Benchmarks and Agents for Deep Research Factuality
arXiv:2603.05912v1 Announce Type: new Abstract: Search-augmented LLM agents can produce deep research reports (DRRs), but verifying claim-level factuality remains challenging. Existing fact-checkers are primarily designed for general-domain, factoid-style atomic claims, and there is no benchmark to test whether such verifiers...
VDCook:DIY video data cook your MLLMs
arXiv:2603.05539v1 Announce Type: cross Abstract: We introduce VDCook: a self-evolving video data operating system, a configurable video data construction platform for researchers and vertical domain teams. Users initiate data requests via natural language queries and adjustable parameters (scale, retrieval-synthesis ratio,...
Reasoning Models Struggle to Control their Chains of Thought
arXiv:2603.05706v1 Announce Type: new Abstract: Chain-of-thought (CoT) monitoring is a promising tool for detecting misbehaviors and understanding the motivations of modern reasoning models. However, if models can control what they verbalize in their CoT, it could undermine CoT monitorability. To...
Spatiotemporal Heterogeneity of AI-Driven Traffic Flow Patterns and Land Use Interaction: A GeoAI-Based Analysis of Multimodal Urban Mobility
arXiv:2603.05581v1 Announce Type: cross Abstract: Urban traffic flow is governed by the complex, nonlinear interaction between land use configuration and spatiotemporally heterogeneous mobility demand. Conventional global regression and time-series models cannot simultaneously capture these multi-scale dynamics across multiple travel modes....
On the Reliability of AI Methods in Drug Discovery: Evaluation of Boltz-2 for Structure and Binding Affinity Prediction
arXiv:2603.05532v1 Announce Type: cross Abstract: Despite continuing hype about the role of AI in drug discovery, no "AI-discovered drugs" have so far received regulatory approval. Here we assess one of the latest AI based tools in this domain. The ability...
Autonomous Algorithm Discovery for Ptychography via Evolutionary LLM Reasoning
arXiv:2603.05696v1 Announce Type: cross Abstract: Ptychography is a computational imaging technique widely used for high-resolution materials characterization, but high-quality reconstructions often require the use of regularization functions that largely remain manually designed. We introduce Ptychi-Evolve, an autonomous framework that uses...
Verify as You Go: An LLM-Powered Browser Extension for Fake News Detection
arXiv:2603.05519v1 Announce Type: new Abstract: The rampant spread of fake news in the digital age poses serious risks to public trust and democratic institutions, underscoring the need for effective, transparent, and user-centered detection tools. Existing browser extensions often fall short...
NOTAI.AI: Explainable Detection of Machine-Generated Text via Curvature and Feature Attribution
arXiv:2603.05617v1 Announce Type: new Abstract: We present NOTAI.AI, an explainable framework for machine-generated text detection that extends Fast-DetectGPT by integrating curvature-based signals with neural and stylometric features in a supervised setting. The system combines 17 interpretable features, including Conditional Probability...
Let's Talk, Not Type: An Oral-First Multi-Agent Architecture for Guaran\'i
arXiv:2603.05743v1 Announce Type: new Abstract: Although artificial intelligence (AI) and Human-Computer Interaction (HCI) systems are often presented as universal solutions, their design remains predominantly text-first, underserving primarily oral languages and indigenous communities. This position paper uses Guaran\'i, an official and...
NERdME: a Named Entity Recognition Dataset for Indexing Research Artifacts in Code Repositories
arXiv:2603.05750v1 Announce Type: new Abstract: Existing scholarly information extraction (SIE) datasets focus on scientific papers and overlook implementation-level details in code repositories. README files describe datasets, source code, and other implementation-level artifacts, however, their free-form Markdown offers little semantic structure,...
HART: Data-Driven Hallucination Attribution and Evidence-Based Tracing for Large Language Models
arXiv:2603.05828v1 Announce Type: new Abstract: Large language models (LLMs) have demonstrated remarkable performance in text generation and knowledge-intensive question answering. Nevertheless, they are prone to producing hallucinated content, which severely undermines their reliability in high-stakes application domains. Existing hallucination attribution...
Lost in Stories: Consistency Bugs in Long Story Generation by LLMs
arXiv:2603.05890v1 Announce Type: new Abstract: What happens when a storyteller forgets its own story? Large Language Models (LLMs) can now generate narratives spanning tens of thousands of words, but they often fail to maintain consistency throughout. When generating long-form narratives,...
InfoGatherer: Principled Information Seeking via Evidence Retrieval and Strategic Questioning
arXiv:2603.05909v1 Announce Type: new Abstract: LLMs are increasingly deployed in high-stakes domains such as medical triage and legal assistance, often as document-grounded QA systems in which a user provides a description, relevant sources are retrieved, and an LLM generates a...
Making Implicit Premises Explicit in Logical Understanding of Enthymemes
arXiv:2603.06114v1 Announce Type: new Abstract: Real-world arguments in text and dialogues are normally enthymemes (i.e. some of their premises and/or claims are implicit). Natural language processing (NLP) methods for handling enthymemes can potentially identify enthymemes in text but they do...
Diffusion Language Models Are Natively Length-Aware
arXiv:2603.06123v1 Announce Type: new Abstract: Unlike autoregressive language models, which terminate variable-length generation upon predicting an End-of-Sequence (EoS) token, Diffusion Language Models (DLMs) operate over a fixed maximum-length context window for a predetermined number of denoising steps. However, this process...
MAPO: Mixed Advantage Policy Optimization for Long-Horizon Multi-Turn Dialogue
arXiv:2603.06194v1 Announce Type: new Abstract: Subjective multi-turn dialogue tasks, such as emotional support, require conversational policies that adapt to evolving user states and optimize long-horizon interaction quality. However, reinforcement learning (RL) for such settings remains challenging due to the absence...
LIT-RAGBench: Benchmarking Generator Capabilities of Large Language Models in Retrieval-Augmented Generation
arXiv:2603.06198v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) is a framework in which a Generator, such as a Large Language Model (LLM), produces answers by retrieving documents from an external collection using a Retriever. In practice, Generators must integrate evidence...
FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling
arXiv:2603.06199v1 Announce Type: new Abstract: Long-context modeling is a pivotal capability for Large Language Models, yet the quadratic complexity of attention remains a critical bottleneck, particularly during the compute-intensive prefilling phase. While various sparse attention mechanisms have been explored, they...
The Art That Poses Back: Assessing AI Pastiches after Contemporary Artworks
arXiv:2603.06324v1 Announce Type: new Abstract: This study explores artificial visual creativity, focusing on ChatGPT's ability to generate new images intentionally pastiching original artworks such as paintings, drawings, sculptures and installations. The process involved twelve artists from Romania, Bulgaria, France, Austria,...
Transparent AI for Mathematics: Transformer-Based Large Language Models for Mathematical Entity Relationship Extraction with XAI
arXiv:2603.06348v1 Announce Type: new Abstract: Mathematical text understanding is a challenging task due to the presence of specialized entities and complex relationships between them. This study formulates mathematical problem interpretation as a Mathematical Entity Relation Extraction (MERE) task, where operands...
Evaluation of Deontic Conditional Reasoning in Large Language Models: The Case of Wason's Selection Task
arXiv:2603.06416v1 Announce Type: new Abstract: As large language models (LLMs) advance in linguistic competence, their reasoning abilities are gaining increasing attention. In humans, reasoning often performs well in domain specific settings, particularly in normative rather than purely formal contexts. Although...