Benchmarking GNN Models on Molecular Regression Tasks with CKA-Based Representation Analysis
arXiv:2602.20573v1 Announce Type: new Abstract: Molecules are commonly represented as SMILES strings, which can be readily converted to fixed-size molecular fingerprints. These fingerprints serve as feature vectors to train ML/DL models for molecular property prediction tasks in the field of...
Upper-Linearizability of Online Non-Monotone DR-Submodular Maximization over Down-Closed Convex Sets
arXiv:2602.20578v1 Announce Type: new Abstract: We study online maximization of non-monotone Diminishing-Return(DR)-submodular functions over down-closed convex sets, a regime where existing projection-free online methods suffer from suboptimal regret and limited feedback guarantees. Our main contribution is a new structural result...
Justices send litigation about tainted baby food back to state court
Yesterday’s decision in The Hain Celestial Group v. Palmquist resolves a technical problem about what to do when district courts make a mistaken ruling about their own jurisdiction. The final […]
Justices reveal little about whether the deadline for removing cases to federal court can be excused
When a plaintiff files a lawsuit in state court asserting a claim that could be brought in federal court, federal law gives the defendant 30 days to remove the case […]
SCOTUStoday for Wednesday, February 25: SCOTUS and the State of the Union
Another day, another live blog. Join us to discuss the possible announcement of opinions this morning beginning at 9:30 a.m. EST.
Judge doesn't trust DOJ with search of devices seized from Wash. Post reporter
Court to search devices itself instead of letting government have full access.
Salesforce CEO Marc Benioff: This isn’t our first SaaSpocalypse
Salesforce reported solid year-end earnings and then pulled out all the stops to ward off more talk of the death of its business to AI.
Gushwork bets on AI search for customer leads — and early results are emerging
Gushwork has raised $9 million in a seed round led by SIG and Lightspeed. The startup has seen early customer traction from AI search tools like ChatGPT.
Nvidia has another record quarter amid record capex spends
"The demand for tokens in the world has gone completely exponential," Nvidia CEO Jensen Huang said about the company's earnings.
Alphabet-owned robotics software company Intrinsic joins Google
Nearly five years after graduating into an independent Alphabet company, Intrinsic is moving under Google's domain.
Wearable startup CUDIS launches a new health ring line with an AI-fueled ‘coach’
The wearable incentivizes healthy behavior with points that can be redeemed for health products.
OpenClaw creator’s advice to AI builders is to be more playful and allow yourself time to improve
Peter Steinberger talks about the creation of his viral AI agent OpenClaw and how being more "playful" makes for a better way to learn AI coding.
About 12% of US teens turn to AI for emotional support or advice
General-purpose tools like ChatGPT, Claude, and Grok are not designed for this use, making mental health professionals wary.
US tells diplomats to lobby against foreign data sovereignty laws
The Trump administration has ordered U.S. diplomats to lobby against countries' attempts to regulate how American tech companies handle foreigners' data.
TriTopic: Tri-Modal Graph-Based Topic Modeling with Iterative Refinement and Archetypes
arXiv:2602.19079v1 Announce Type: new Abstract: Topic modeling extracts latent themes from large text collections, but leading approaches like BERTopic face critical limitations: stochastic instability, loss of lexical precision ("Embedding Blur"), and reliance on a single data perspective. We present TriTopic,...
How Do LLMs Encode Scientific Quality? An Empirical Study Using Monosemantic Features from Sparse Autoencoders
arXiv:2602.19115v1 Announce Type: new Abstract: In recent years, there has been a growing use of generative AI, and large language models (LLMs) in particular, to support both the assessment and generation of scientific work. Although some studies have shown that...
AgenticRAGTracer: A Hop-Aware Benchmark for Diagnosing Multi-Step Retrieval Reasoning in Agentic RAG
arXiv:2602.19127v1 Announce Type: new Abstract: With the rapid advancement of agent-based methods in recent years, Agentic RAG has undoubtedly become an important research direction. Multi-hop reasoning, which requires models to engage in deliberate thinking and multi-step interaction, serves as a...
Facet-Level Persona Control by Trait-Activated Routing with Contrastive SAE for Role-Playing LLMs
arXiv:2602.19157v1 Announce Type: new Abstract: Personality control in Role-Playing Agents (RPAs) is commonly achieved via training-free methods that inject persona descriptions and memory through prompts or retrieval-augmented generation, or via supervised fine-tuning (SFT) on persona-specific corpora. While SFT can be...
Next Reply Prediction X Dataset: Linguistic Discrepancies in Naively Generated Content
arXiv:2602.19177v1 Announce Type: new Abstract: The increasing use of Large Language Models (LLMs) as proxies for human participants in social science research presents a promising, yet methodologically risky, paradigm shift. While LLMs offer scalability and cost-efficiency, their "naive" application, where...
Retrieval Augmented Enhanced Dual Co-Attention Framework for Target Aware Multimodal Bengali Hateful Meme Detection
arXiv:2602.19212v1 Announce Type: new Abstract: Hateful content on social media increasingly appears as multimodal memes that combine images and text to convey harmful narratives. In low-resource languages such as Bengali, automated detection remains challenging due to limited annotated data, class...
Learning to Reason for Multi-Step Retrieval of Personal Context in Personalized Question Answering
arXiv:2602.19317v1 Announce Type: new Abstract: Personalization in Question Answering (QA) requires answers that are both accurate and aligned with users' background, preferences, and historical context. Existing state-of-the-art methods primarily rely on retrieval-augmented generation (RAG) solutions that construct personal context by...
PerSoMed: A Large-Scale Balanced Dataset for Persian Social Media Text Classification
arXiv:2602.19333v1 Announce Type: new Abstract: This research introduces the first large-scale, well-balanced Persian social media text classification dataset, specifically designed to address the lack of comprehensive resources in this domain. The dataset comprises 36,000 posts across nine categories (Economic, Artistic,...
How to Train Your Deep Research Agent? Prompt, Reward, and Policy Optimization in Search-R1
arXiv:2602.19526v1 Announce Type: new Abstract: Deep Research agents tackle knowledge-intensive tasks through multi-round retrieval and decision-oriented generation. While reinforcement learning (RL) has been shown to improve performance in this paradigm, its contributions remain underexplored. To fully understand the role of...
Sculpting the Vector Space: Towards Efficient Multi-Vector Visual Document Retrieval via Prune-then-Merge Framework
arXiv:2602.19549v1 Announce Type: new Abstract: Visual Document Retrieval (VDR), which aims to retrieve relevant pages within vast corpora of visually-rich documents, is of significance in current multimodal retrieval applications. The state-of-the-art multi-vector paradigm excels in performance but suffers from prohibitive...
DEEP: Docker-based Execution and Evaluation Platform
arXiv:2602.19583v1 Announce Type: new Abstract: Comparative evaluation of several systems is a recurrent task in researching. It is a key step before deciding which system to use for our work, or, once our research has been conducted, to demonstrate the...
Eye-Tracking-while-Reading: A Living Survey of Datasets with Open Library Support
arXiv:2602.19598v1 Announce Type: new Abstract: Eye-tracking-while-reading corpora are a valuable resource for many different disciplines and use cases. Use cases range from studying the cognitive processes underlying reading to machine-learning-based applications, such as gaze-based assessments of reading comprehension. The past...
Anatomy of Unlearning: The Dual Impact of Fact Salience and Model Fine-Tuning
arXiv:2602.19612v1 Announce Type: new Abstract: Machine Unlearning (MU) enables Large Language Models (LLMs) to remove unsafe or outdated information. However, existing work assumes that all facts are equally forgettable and largely ignores whether the forgotten knowledge originates from pretraining or...
Revisiting the Seasonal Trend Decomposition for Enhanced Time Series Forecasting
arXiv:2602.18465v1 Announce Type: new Abstract: Time series forecasting presents significant challenges in real-world applications across various domains. Building upon the decomposition of the time series, we enhance the architecture of machine learning models for better multivariate time series forecasting. To...
Physiologically Informed Deep Learning: A Multi-Scale Framework for Next-Generation PBPK Modeling
arXiv:2602.18472v1 Announce Type: new Abstract: Physiologically Based Pharmacokinetic (PBPK) modeling is a cornerstone of model-informed drug development (MIDD), providing a mechanistic framework to predict drug absorption, distribution, metabolism, and excretion (ADME). Despite its utility, adoption is hindered by high computational...
Decentralized Attention Fails Centralized Signals: Rethinking Transformers for Medical Time Series
arXiv:2602.18473v1 Announce Type: new Abstract: Accurate analysis of medical time series (MedTS) data, such as electroencephalography (EEG) and electrocardiography (ECG), plays a pivotal role in healthcare applications, including the diagnosis of brain and heart diseases. MedTS data typically exhibit two...