CARE: An Explainable Computational Framework for Assessing Client-Perceived Therapeutic Alliance Using Large Language Models
arXiv:2602.20648v1 Announce Type: new Abstract: Client perceptions of the therapeutic alliance are critical for counseling effectiveness. Accurately capturing these perceptions remains challenging, as traditional post-session questionnaires are burdensome and often delayed, while existing computational approaches produce coarse scores, lack interpretable...
CAMEL: Confidence-Gated Reflection for Reward Modeling
arXiv:2602.20670v1 Announce Type: new Abstract: Reward models play a fundamental role in aligning large language models with human preferences. Existing methods predominantly follow two paradigms: scalar discriminative preference models, which are efficient but lack interpretability, and generative judging models, which...
ID-LoRA: Efficient Low-Rank Adaptation Inspired by Matrix Interpolative Decomposition
arXiv:2602.20727v1 Announce Type: new Abstract: LoRA has become a universal Parameter-Efficient Fine-Tuning (PEFT) technique that equips Large Language Models (LLMs) to adapt quickly to new tasks. However, when these models are scaled up, even the latest LoRA variants still introduce...
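The abstract is truncated before the method is described, so as background only: here is a minimal sketch of the standard LoRA update that ID-LoRA builds on (a frozen weight plus a trainable low-rank product, with the up-projection zero-initialized so training starts from the base model). This illustrates plain LoRA, not the paper's interpolative-decomposition variant; all dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 16, 32, 4                  # rank r << min(d_out, d_in)

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def lora_forward(x, scale=1.0):
    """y = (W + scale * B A) x — only A and B receive gradients during fine-tuning."""
    return W @ x + scale * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Because B starts at zero, the adapted model is initially identical to the base model.
assert np.allclose(lora_forward(x), W @ x)
```

The appeal of this parameterization is that only `r * (d_in + d_out)` parameters are trained instead of `d_in * d_out`, and the product `B @ A` can be merged back into `W` after training at no inference cost.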
The Art of Efficient Reasoning: Data, Reward, and Optimization
arXiv:2602.20945v1 Announce Type: new Abstract: Large Language Models (LLMs) consistently benefit from scaled Chain-of-Thought (CoT) reasoning, but also suffer from heavy computational overhead. To address this issue, efficient reasoning aims to incentivize short yet accurate thinking trajectories, typically through reward...
Linear Reasoning vs. Proof by Cases: Obstacles for Large Language Models in FOL Problem Solving
arXiv:2602.20973v1 Announce Type: new Abstract: To comprehensively evaluate the mathematical reasoning capabilities of Large Language Models (LLMs), researchers have introduced abundant mathematical reasoning datasets. However, most existing datasets primarily focus on linear reasoning, neglecting other reasoning forms such as proof by...
Prompt-Level Distillation: A Non-Parametric Alternative to Model Fine-Tuning for Efficient Reasoning
arXiv:2602.21103v1 Announce Type: new Abstract: Advanced reasoning typically requires Chain-of-Thought prompting, which is accurate but incurs prohibitive latency and substantial test-time inference costs. The standard alternative, fine-tuning smaller models, often sacrifices interpretability while introducing significant resource and operational overhead. To...
Protein Language Models Diverge from Natural Language: Comparative Analysis and Improved Inference
arXiv:2602.20449v1 Announce Type: cross Abstract: Modern Protein Language Models (PLMs) apply transformer-based model architectures from natural language processing to biological sequences, predicting a variety of protein functions and properties. However, protein language has key differences from natural language, such as...
HiSAC: Hierarchical Sparse Activation Compression for Ultra-long Sequence Modeling in Recommenders
arXiv:2602.21009v1 Announce Type: cross Abstract: Modern recommender systems leverage ultra-long user behavior sequences to capture dynamic preferences, but end-to-end modeling is infeasible in production due to latency and memory constraints. While summarizing history via interest centers offers a practical alternative,...
IMOVNO+: A Regional Partitioning and Meta-Heuristic Ensemble Framework for Imbalanced Multi-Class Learning
arXiv:2602.20199v1 Announce Type: new Abstract: Class imbalance, overlap, and noise degrade data quality, reduce model reliability, and limit generalization. Although widely studied in binary classification, these issues remain underexplored in multi-class settings, where complex inter-class relationships make minority-majority structures unclear...
KnapSpec: Self-Speculative Decoding via Adaptive Layer Selection as a Knapsack Problem
arXiv:2602.20217v1 Announce Type: new Abstract: Self-speculative decoding (SSD) accelerates LLM inference by skipping layers to create an efficient draft model, yet existing methods often rely on static heuristics that ignore the dynamic computational overhead of attention in long-context scenarios. We...
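The abstract frames layer selection as a knapsack problem; as a general illustration of that framing (not KnapSpec's actual objective or numbers, which the truncated abstract does not give), here is a 0/1 knapsack solved by dynamic programming: each layer has a hypothetical compute saving if skipped and a hypothetical draft-quality cost, and we maximize total saving under a cost budget.

```python
def select_layers_to_skip(savings, costs, budget):
    """0/1 knapsack DP: maximize compute saved by skipping layers, subject to a
    total draft-quality cost budget. All inputs here are hypothetical integers."""
    n = len(savings)
    dp = [0] * (budget + 1)                        # dp[c] = best saving with cost <= c
    choice = [[False] * (budget + 1) for _ in range(n)]
    for i in range(n):
        for c in range(budget, costs[i] - 1, -1):  # reverse order: each layer used once
            if dp[c - costs[i]] + savings[i] > dp[c]:
                dp[c] = dp[c - costs[i]] + savings[i]
                choice[i][c] = True
    chosen, c = [], budget                         # backtrack to recover chosen layers
    for i in range(n - 1, -1, -1):
        if choice[i][c]:
            chosen.append(i)
            c -= costs[i]
    return dp[budget], sorted(chosen)

saving, layers = select_layers_to_skip(savings=[4, 3, 5, 6], costs=[2, 1, 3, 4], budget=5)
```

A static heuristic would fix `layers` once; the dynamic-overhead point in the abstract suggests re-solving as the costs change with context length, which this DP (being cheap for tens of layers) permits.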
The Truthfulness Spectrum Hypothesis
arXiv:2602.20273v1 Announce Type: new Abstract: Large language models (LLMs) have been reported to linearly encode truthfulness, yet recent work questions this finding's generality. We reconcile these views with the truthfulness spectrum hypothesis: the representational space contains directions ranging from broadly...
QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models
arXiv:2602.20309v1 Announce Type: new Abstract: Vision-language-action (VLA) models unify perception, language, and control for embodied agents but face significant challenges in practical deployment due to rapidly increasing compute and memory demands, especially as models scale to longer horizons and larger...
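The truncated abstract does not show QuantVLA's calibration scheme, so as generic background: a minimal sketch of symmetric per-output-channel post-training quantization, the baseline setting that scale-calibration methods refine. Each weight row gets its own scale derived from its maximum magnitude.

```python
import numpy as np

def quantize_per_channel(W, num_bits=8):
    """Symmetric per-row post-training quantization (a generic sketch, not the
    paper's method). Returns int8 weights and per-row float scales."""
    qmax = 2 ** (num_bits - 1) - 1                    # 127 for int8
    scales = np.abs(W).max(axis=1, keepdims=True) / qmax
    scales = np.where(scales == 0, 1.0, scales)       # guard all-zero rows
    Wq = np.clip(np.round(W / scales), -qmax - 1, qmax).astype(np.int8)
    return Wq, scales

def dequantize(Wq, scales):
    return Wq.astype(np.float32) * scales

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8)).astype(np.float32)
Wq, s = quantize_per_channel(W)
err = np.abs(dequantize(Wq, s) - W).max()             # bounded by half a quantization step
```

Per-channel scales keep the rounding error of each row proportional to that row's own dynamic range, which is why they are the usual starting point before any learned or calibrated correction.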
cc-Shapley: Measuring Multivariate Feature Importance Needs Causal Context
arXiv:2602.20396v1 Announce Type: new Abstract: Explainable artificial intelligence promises to yield insights into relevant features, thereby enabling humans to examine and scrutinize machine learning models or even facilitating scientific discovery. Considering the widespread technique of Shapley values, we find that...
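For readers unfamiliar with the technique the paper critiques: a minimal exact Shapley-value computation over all coalitions (exponential in the number of features, so only viable for toy games). This is the standard definition, not the paper's causal-context variant.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values: phi_f = sum over coalitions S of
    |S|!(n-|S|-1)!/n! * (v(S ∪ {f}) - v(S)). `value` maps frozensets to payoffs."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                S = frozenset(S)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(S | {f}) - value(S))
        phi[f] = total
    return phi

# Toy additive game: each feature's Shapley value equals its own weight.
weights = {"a": 1.0, "b": 2.0, "c": 3.0}
v = lambda S: sum(weights[f] for f in S)
phi = shapley_values(list(weights), v)
```

The additive game makes the sanity check easy: the values sum to `v` of the full coalition (efficiency), which is exactly the kind of axiomatic guarantee the paper's multivariate-importance argument interrogates.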
CREDIT: Certified Ownership Verification of Deep Neural Networks Against Model Extraction Attacks
arXiv:2602.20419v1 Announce Type: new Abstract: Machine Learning as a Service (MLaaS) has emerged as a widely adopted paradigm for providing access to deep neural network (DNN) models, enabling users to conveniently leverage these models through standardized APIs. However, such services...
Imputation of Unknown Missingness in Sparse Electronic Health Records
arXiv:2602.20442v1 Announce Type: new Abstract: Machine learning holds great promise for advancing the field of medicine, with electronic health records (EHRs) serving as a primary data source. However, EHRs are often sparse and contain missing data due to various challenges...
Nonparametric Teaching of Attention Learners
arXiv:2602.20461v1 Announce Type: new Abstract: Attention learners, neural networks built on the attention mechanism, e.g., transformers, excel at learning the implicit relationships that relate sequences to their corresponding properties, e.g., mapping a given sequence of tokens to the probability of...
Elimination-compensation pruning for fully-connected neural networks
arXiv:2602.20467v1 Announce Type: new Abstract: The unmatched ability of Deep Neural Networks in capturing complex patterns in large and noisy datasets is often associated with their large hypothesis space, and consequently with the vast number of parameters that characterize model...
Benchmarking GNN Models on Molecular Regression Tasks with CKA-Based Representation Analysis
arXiv:2602.20573v1 Announce Type: new Abstract: Molecules are commonly represented as SMILES strings, which can be readily converted to fixed-size molecular fingerprints. These fingerprints serve as feature vectors to train ML/DL models for molecular property prediction tasks in the field of...
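As background on the analysis tool named in the title: a minimal implementation of linear CKA between two representation matrices (samples x features), which is invariant to orthogonal transformations and isotropic scaling of either representation. This is the standard similarity measure, not the paper's benchmark setup.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F),
    computed after column-centering both representation matrices."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 32))
Q, _ = np.linalg.qr(rng.standard_normal((32, 32)))  # random orthogonal matrix
# CKA(X, 2 * X @ Q) == 1: rotating and rescaling the features leaves similarity intact.
```

These invariances are why CKA is preferred over raw correlation when comparing layers of different GNN architectures: two networks can encode the same information in rotated coordinate systems.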
Justices reveal little about whether the deadline for removing cases to federal court can be excused
When a plaintiff files a lawsuit in state court asserting a claim that could be brought in federal court, federal law gives the defendant 30 days to remove the case […]
Musk has no proof OpenAI stole xAI trade secrets, judge rules, tossing lawsuit
Even twisting an ex-employee's text to favor xAI's reading fails to sway judge.
Judge doesn't trust DOJ with search of devices seized from Wash. Post reporter
Court to search devices itself instead of letting government have full access.
Gushwork bets on AI search for customer leads — and early results are emerging
Gushwork has raised $9 million in a seed round led by SIG and Lightspeed. The startup has seen early customer traction from AI search tools like ChatGPT.
The White House wants AI companies to cover rate hikes. Most have already said they would.
Many hyperscalers have already made public commitments to cover electricity cost increases.
The public opposition to AI infrastructure is heating up
Public backlash over the data center boom is leading to a variety of draconian policies — including bans on new construction.
Have hard-won scaling lessons to share? Take the stage at TechCrunch Founder Summit 2026
Apply to speak at TechCrunch Founder Summit 2026 by April 17 for a chance to lead a roundtable or breakout session for 1,000 founders and investors. If you’ve built, backed, or operated inside high-growth startups, your experience could shape how...
3 days left: Save up to $680 on your TechCrunch Disrupt 2026 ticket
Just 3 days left to save up to $680 on your TechCrunch Disrupt 2026 ticket. Offer ends on Friday, February 27 at 11:59 p.m. PT. Don't miss unparalleled, curated networking and valuable insights from 250+ tech leaders, and discover 300+...
IAPO: Information-Aware Policy Optimization for Token-Efficient Reasoning
arXiv:2602.19049v1 Announce Type: new Abstract: Large language models increasingly rely on long chains of thought to improve accuracy, yet such gains come with substantial inference-time costs. We revisit token-efficient post-training and argue that existing sequence-level reward-shaping methods offer limited control...
Do LLMs and VLMs Share Neurons for Inference? Evidence and Mechanisms of Cross-Modal Transfer
arXiv:2602.19058v1 Announce Type: new Abstract: Large vision-language models (LVLMs) have rapidly advanced across various domains, yet they still lag behind strong text-only large language models (LLMs) on tasks that require multi-step inference and compositional decision-making. Motivated by their shared transformer...
TriTopic: Tri-Modal Graph-Based Topic Modeling with Iterative Refinement and Archetypes
arXiv:2602.19079v1 Announce Type: new Abstract: Topic modeling extracts latent themes from large text collections, but leading approaches like BERTopic face critical limitations: stochastic instability, loss of lexical precision ("Embedding Blur"), and reliance on a single data perspective. We present TriTopic,...
Astra: Activation-Space Tail-Eigenvector Low-Rank Adaptation of Large Language Models
arXiv:2602.19111v1 Announce Type: new Abstract: Parameter-Efficient Fine-Tuning (PEFT) methods, especially LoRA, are widely used for adapting pre-trained models to downstream tasks due to their computational and storage efficiency. However, in the context of LoRA and its variants, the potential of...
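The abstract is cut off before the method, so purely as an illustration of the ingredient the title names: extracting tail eigenvectors of an activation covariance with an eigendecomposition. How Astra actually uses such directions is not shown here; the calibration data, dimensions, and the choice to keep the smallest-eigenvalue directions are all assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((1000, 64))   # hypothetical calibration activations (samples x hidden)

# Eigendecompose the (centered) activation covariance. np.linalg.eigh returns
# eigenvalues in ascending order, so the leading columns of V span the "tail"
# (least-excited) directions of the activation space.
Hc = H - H.mean(axis=0)
cov = Hc.T @ Hc / len(H)
eigvals, V = np.linalg.eigh(cov)

r = 8
tail = V[:, :r]   # hypothetical use: an orthonormal basis for a rank-r adapter subspace
```

Because `eigh` returns orthonormal eigenvectors, `tail.T @ tail` is the identity, so projections onto this subspace need no extra normalization.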