Welcome to ICWSM 2026
ICWSM 2026: International AAAI Conference on Web and Social Media
AAAI Fall Symposia - AAAI
The AAAI Fall Symposium series affords participants a setting where they can learn from each other’s artificial intelligence research.
The 40th Annual AAAI Conference on Artificial Intelligence
The Fortieth AAAI Conference on Artificial Intelligence will be held in Singapore in 2026.
Artificial Intelligence, Ethics, and Society - AAAI
The AAAI/ACM Conference on AI, Ethics, and Society (AIES) is a multi-disciplinary effort to promote discussion and intellectual interchange about AI and its impact on society, ethical concerns, and challenges regarding issues.
Contribute to AAAI
The AAAI divisions responsible for publications are AI Magazine and AAAI Press. Learn about how to contribute to AAAI publications.
Request to Reproduce Copyrighted Materials - AAAI
Materials published by AAAI Press, AAAI, and AI Magazine are subject to copyright both individually and as compilations.
Upcoming Submission Deadlines
Databases and Information Systems Integration, Artificial Intelligence and Decision Support Systems, Information Systems Analysis and Specification, Software Agents and Internet Computing, Human-Computer Interaction, Enterprise Architecture
AAAI Publication Policies & Guidelines - AAAI
AAAI publication policies provide instructions and forms for authors with an accepted paper that will be published by AAAI Press.
GT-HarmBench: Benchmarking AI Safety Risks Through the Lens of Game Theory
arXiv:2602.12316v1 Announce Type: new Abstract: Frontier AI systems are increasingly capable and deployed in high-stakes multi-agent environments. However, existing AI safety benchmarks largely evaluate single agents, leaving multi-agent risks such as coordination failure and conflict poorly understood. We introduce GT-HarmBench,...
Evolving Beyond Snapshots: Harmonizing Structure and Sequence via Entity State Tuning for Temporal Knowledge Graph Forecasting
arXiv:2602.12389v1 Announce Type: new Abstract: Temporal knowledge graph (TKG) forecasting requires predicting future facts by jointly modeling structural dependencies within each snapshot and temporal evolution across snapshots. However, most existing methods are stateless: they recompute entity representations at each timestamp...
Intent-Driven Smart Manufacturing Integrating Knowledge Graphs and Large Language Models
arXiv:2602.12419v1 Announce Type: new Abstract: The increasing complexity of smart manufacturing environments demands interfaces that can translate high-level human intents into machine-executable actions. This paper presents a unified framework that integrates instruction-tuned Large Language Models (LLMs) with ontology-aligned Knowledge Graphs...
Scaling Web Agent Training through Automatic Data Generation and Fine-grained Evaluation
arXiv:2602.12544v1 Announce Type: new Abstract: We present a scalable pipeline for automatically generating high-quality training data for web agents. In particular, a major challenge in identifying high-quality training instances is trajectory evaluation - quantifying how much progress was made towards...
GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics
arXiv:2602.12617v1 Announce Type: new Abstract: This paper presents GeoAgent, a model capable of reasoning closely with humans and deriving fine-grained address conclusions. Previous RL-based methods have achieved breakthroughs in performance and interpretability, but concerns remain because of their reliance...
Consistency of Large Reasoning Models Under Multi-Turn Attacks
arXiv:2602.13093v2 Announce Type: new Abstract: Large reasoning models with reasoning capabilities achieve state-of-the-art performance on complex tasks, but their robustness under multi-turn adversarial pressure remains underexplored. We evaluate nine frontier reasoning models under adversarial attacks. Our findings reveal that reasoning...
A Lightweight LLM Framework for Disaster Humanitarian Information Classification
arXiv:2602.12284v1 Announce Type: cross Abstract: Timely classification of humanitarian information from social media is critical for effective disaster response. However, deploying large language models (LLMs) for this task faces challenges in resource-constrained emergency settings. This paper develops a lightweight, cost-effective...
Energy-Aware Reinforcement Learning for Robotic Manipulation of Articulated Components in Infrastructure Operation and Maintenance
arXiv:2602.12288v1 Announce Type: cross Abstract: With the growth of intelligent civil infrastructure and smart cities, operation and maintenance (O&M) increasingly requires safe, efficient, and energy-conscious robotic manipulation of articulated components, including access doors, service drawers, and pipeline valves. However, existing...
Perceptual Self-Reflection in Agentic Physics Simulation Code Generation
arXiv:2602.12311v1 Announce Type: cross Abstract: We present a multi-agent framework for generating physics simulation code from natural language descriptions, featuring a novel perceptual self-reflection mechanism for validation. The system employs four specialized agents: a natural language interpreter that converts user...
Visible and Hyperspectral Imaging for Quality Assessment of Milk: Property Characterisation and Identification
arXiv:2602.12313v1 Announce Type: cross Abstract: Rapid and non-destructive assessment of milk quality is crucial to ensuring both nutritional value and food safety. In this study, we investigated the potential of visible and hyperspectral imaging as cost-effective and quick-response alternatives to...
Free Lunch in Medical Image Foundation Model Pre-training via Randomized Synthesis and Disentanglement
arXiv:2602.12317v1 Announce Type: cross Abstract: Medical image foundation models (MIFMs) have demonstrated remarkable potential for a wide range of clinical tasks, yet their development is constrained by the scarcity, heterogeneity, and high cost of large-scale annotated datasets. Here, we propose...
ForeAct: Steering Your VLA with Efficient Visual Foresight Planning
arXiv:2602.12322v1 Announce Type: cross Abstract: Vision-Language-Action (VLA) models convert high-level language instructions into concrete, executable actions, a task that is especially challenging in open-world environments. We present Visual Foresight Planning (ForeAct), a general and efficient planner that guides a VLA...
Reproducing DragDiffusion: Interactive Point-Based Editing with Diffusion Models
arXiv:2602.12393v1 Announce Type: cross Abstract: DragDiffusion is a diffusion-based method for interactive point-based image editing that enables users to manipulate images by directly dragging selected points. The method claims that accurate spatial control can be achieved by optimizing a single...
Correctness, Artificial Intelligence, and the Epistemic Value of Mathematical Proof
arXiv:2602.12463v1 Announce Type: cross Abstract: We argue that it is neither necessary nor sufficient for a mathematical proof to have epistemic value that it be "correct", in the sense of formalizable in a formal proof system. We then present a...
propella-1: Multi-Property Document Annotation for LLM Data Curation at Scale
arXiv:2602.12414v1 Announce Type: new Abstract: Since FineWeb-Edu, data curation for LLM pretraining has predominantly relied on single scalar quality scores produced by small classifiers. A single score conflates multiple quality dimensions, prevents flexible filtering, and offers no interpretability. We introduce...
RBCorr: Response Bias Correction in Language Models
arXiv:2602.12445v1 Announce Type: new Abstract: Language models (LMs) are known to be prone to response biases, which present as option preference biases in fixed-response questions. It is therefore imperative to develop low-cost and effective response bias correction methods to improve...
Discovering Semantic Latent Structures in Psychological Scales: A Response-Free Pathway to Efficient Simplification
arXiv:2602.12575v1 Announce Type: new Abstract: Psychological scale refinement traditionally relies on response-based methods such as factor analysis, item response theory, and network psychometrics to optimize item composition. Although rigorous, these approaches require large samples and may be constrained by data...
ReFilter: Improving Robustness of Retrieval-Augmented Generation via Gated Filter
arXiv:2602.12709v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) has become a dominant paradigm for grounding large language models (LLMs) with external evidence in knowledge-intensive question answering. A core design choice is how to fuse retrieved samples into the LLMs, where...
Towards a Diagnostic and Predictive Evaluation Methodology for Sequence Labeling Tasks
arXiv:2602.12759v1 Announce Type: new Abstract: Standard evaluation in NLP typically indicates that system A is better on average than system B, but it provides little info on how to improve performance and, what is worse, it should not come as...
Left-right asymmetry in predicting brain activity from LLMs' representations emerges with their formal linguistic competence
arXiv:2602.12811v1 Announce Type: new Abstract: When humans and large language models (LLMs) process the same text, activations in the LLMs correlate with brain activity measured, e.g., with functional magnetic resonance imaging (fMRI). Moreover, it has been shown that, as the...
MentalBench: A Benchmark for Evaluating Psychiatric Diagnostic Capability of Large Language Models
arXiv:2602.12871v1 Announce Type: new Abstract: We introduce MentalBench, a benchmark for evaluating psychiatric diagnostic decision-making in large language models (LLMs). Existing mental health benchmarks largely rely on social media data, limiting their ability to assess DSM-grounded diagnostic judgments. At the...
When Words Don't Mean What They Say: Figurative Understanding in Bengali Idioms
arXiv:2602.12921v1 Announce Type: new Abstract: Figurative language understanding remains a significant challenge for Large Language Models (LLMs), especially for low-resource languages. To address this, we introduce a new idiom dataset, a large-scale, culturally-grounded corpus of 10,361 Bengali idioms. Each idiom...