SensorPersona: An LLM-Empowered System for Continual Persona Extraction from Longitudinal Mobile Sensor Streams
arXiv:2604.06204v1
Abstract: Personalization is essential for Large Language Model (LLM)-based agents to adapt to users' preferences and improve response quality and task performance. However, most existing approaches infer personas from chat histories, which capture only self-disclosed information rather than users' everyday behaviors in the physical world, limiting the ability to infer comprehensive user personas. In this work, we introduce SensorPersona, an LLM-empowered system that continuously infers stable user personas from multimodal longitudinal sensor streams unobtrusively collected from users' mobile devices. SensorPersona first performs person-oriented context encoding on continuous sensor streams to enrich the semantics of sensor contexts. It then employs hierarchical persona reasoning that integrates intra- and inter-episode reasoning to infer personas spanning physical patterns, psychosocial traits, and life experiences. Finally, it employs clustering-aware incremental verification and temporal evidence-aware updating to adapt to evolving personas. We evaluate SensorPersona on a self-collected dataset containing 1,580 hours of sensor data from 20 participants, collected over up to 3 months across 17 cities on 3 continents. Results show that SensorPersona achieves up to 31.4% higher recall in persona extraction, an 85.7% win rate in persona-aware agent responses, and notable improvements in user satisfaction compared to state-of-the-art baselines.
Executive Summary
SensorPersona is an LLM-empowered system for continuous, unobtrusive persona extraction from longitudinal mobile sensor data. Rather than relying on self-disclosed chat histories, it uses physical-world behavior to infer user personas covering physical patterns, psychosocial traits, and life experiences. The system combines person-oriented context encoding, hierarchical persona reasoning (intra- and inter-episode), and adaptive updating through clustering-aware incremental verification and temporal evidence-aware updating. On a self-collected dataset of 1,580 hours of sensor data from 20 participants, the authors report up to 31.4% higher recall in persona extraction, an 85.7% win rate in persona-aware agent responses, and improved user satisfaction relative to state-of-the-art baselines. The main contribution is grounding personalization for AI agents in real-world behavioral data.
Key Points
- ▸ SensorPersona extracts comprehensive user personas from longitudinal mobile sensor data, not just chat histories.
- ▸ It employs person-oriented context encoding and hierarchical reasoning (intra- and inter-episode) for robust persona inference.
- ▸ The system continuously adapts to evolving personas through incremental verification and temporal evidence-aware updating.
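- ▸ Evaluated on a self-collected dataset (1,580 hours of sensor data from 20 participants across 17 cities on 3 continents), it achieves up to 31.4% higher recall in persona extraction and an 85.7% win rate in persona-aware agent responses compared to state-of-the-art baselines.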
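The abstract does not spell out how temporal evidence-aware updating works, so the following is only a minimal sketch of one plausible reading: each persona statement keeps the days on which supporting evidence was seen, older evidence counts for less, and statements whose decayed support falls below a threshold are dropped. The class `PersonaCandidate`, the function `update_personas`, the 30-day half-life, and the threshold are all hypothetical choices for illustration, not details from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from math import exp, log


@dataclass
class PersonaCandidate:
    """Hypothetical persona statement plus the days on which evidence was observed."""
    statement: str
    evidence_days: list[float] = field(default_factory=list)

    def support(self, today: float, half_life_days: float = 30.0) -> float:
        """Sum of evidence weights; each weight halves every `half_life_days`."""
        decay = log(2) / half_life_days
        return sum(exp(-decay * (today - d)) for d in self.evidence_days if d <= today)


def update_personas(candidates, new_evidence, today, keep_threshold=1.0):
    """Merge new (statement, day) evidence and keep only well-supported personas.

    Assumption: a persona stays active while its decayed support is at least
    `keep_threshold`. Existing candidates are updated in place.
    """
    by_statement = {c.statement: c for c in candidates}
    for statement, day in new_evidence:
        candidate = by_statement.setdefault(statement, PersonaCandidate(statement))
        candidate.evidence_days.append(day)
    return [c for c in by_statement.values() if c.support(today) >= keep_threshold]


# Illustrative data only: an old habit with no recent evidence fades out,
# while a recently observed one is kept.
old = [PersonaCandidate("runs in the morning", [0.0, 1.0])]
active = update_personas(old, [("works late", 89.0), ("works late", 90.0)], today=90.0)
print([c.statement for c in active])
```

Exponential decay is used here because it needs only one parameter and lets stale personas fade gradually instead of being deleted on a fixed schedule; the actual system may weigh evidence quite differently.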
Merits
Novelty in Data Source
Shifts persona inference from self-disclosed chat data to unobtrusive, longitudinal mobile sensor streams, capturing richer, real-world behavioral insights.
Comprehensive Persona Scope
Infers personas spanning physical patterns, psychosocial traits, and life experiences, offering a more holistic user profile than current methods.
Robust Methodological Design
Integrates person-oriented context encoding, hierarchical reasoning, and adaptive updating, addressing the complexities of continuous, evolving persona extraction.
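As a rough illustration of how intra- and inter-episode reasoning could fit together, the sketch below first derives candidate traits from each episode on its own, then keeps only traits that recur across several episodes. This is an assumption-laden toy: in SensorPersona the per-episode step is performed by an LLM, whereas `intra_episode` here is a hand-written rule standing in for that call, and the episode fields (`place`, `activity`, `hour`) and the `min_episodes` threshold are hypothetical.

```python
from __future__ import annotations

from collections import Counter
from typing import Callable, Iterable


def intra_episode(episode: dict) -> set[str]:
    """Stand-in for the per-episode LLM call: one encoded episode -> candidate traits."""
    traits = set()
    if episode["activity"] == "running" and episode["hour"] < 9:
        traits.add("exercises in the morning")
    if episode["place"] == "library":
        traits.add("studies at the library")
    return traits


def inter_episode(
    episodes: Iterable[dict],
    infer: Callable[[dict], set[str]] = intra_episode,
    min_episodes: int = 2,
) -> list[str]:
    """Keep traits supported by at least `min_episodes` separate episodes."""
    counts: Counter[str] = Counter()
    for episode in episodes:
        counts.update(infer(episode))  # a set, so each trait counts once per episode
    return sorted(trait for trait, n in counts.items() if n >= min_episodes)


# Illustrative episodes only.
episodes = [
    {"place": "park", "activity": "running", "hour": 7},
    {"place": "park", "activity": "running", "hour": 6},
    {"place": "library", "activity": "reading", "hour": 14},
    {"place": "gym", "activity": "running", "hour": 19},
]
print(inter_episode(episodes))
```

Requiring recurrence across episodes is one simple way to favor stable personas over one-off events, which is the stated goal of the hierarchical design.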
Strong Empirical Validation
Evaluated on a self-collected dataset of 1,580 hours of sensor data from 20 participants across 17 cities on 3 continents, with reported gains of up to 31.4% in persona-extraction recall, an 85.7% win rate for persona-aware agent responses, and improved user satisfaction.
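For readers unfamiliar with the two headline metrics, the sketch below shows how recall and win rate are commonly computed. It is a simplified illustration, not the paper's evaluation protocol: matching personas by normalized exact string is an assumption (a real evaluation would likely need semantic matching), and counting ties as non-wins is likewise an assumption. The function names and sample data are hypothetical.

```python
def persona_recall(extracted, ground_truth):
    """Fraction of ground-truth personas that appear among the extracted ones.

    Assumption: two personas match if they are equal after lowercasing and
    collapsing whitespace.
    """
    def norm(s):
        return " ".join(s.lower().split())

    truth = {norm(s) for s in ground_truth}
    if not truth:
        raise ValueError("ground_truth must not be empty")
    found = {norm(s) for s in extracted}
    return len(truth & found) / len(truth)


def win_rate(judgments):
    """Share of pairwise comparisons judged a 'win'; ties and losses count as non-wins."""
    if not judgments:
        raise ValueError("judgments must not be empty")
    return sum(j == "win" for j in judgments) / len(judgments)


# Illustrative data only, unrelated to the paper's results.
truth = ["runs daily", "commutes by train", "likes jazz", "night owl"]
extracted = ["Runs  daily", "likes Jazz"]
print(persona_recall(extracted, truth))
print(win_rate(["win", "win", "loss", "tie"]))
```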
Demerits
Ethical and Privacy Concerns
The unobtrusive collection of longitudinal sensor data raises significant privacy implications, necessitating robust ethical safeguards and transparency.
Generalizability of Psychosocial Traits
Inferring complex psychosocial traits solely from sensor data, while innovative, may lack the nuance and accuracy of self-reported or clinically assessed measures.
Interpretability and Explainability
The LLM-empowered, hierarchical reasoning process may obscure how specific sensor inputs map to inferred persona traits, making the system harder to interpret and potential biases harder to detect.
Data Collection Burden and Scalability
Self-collecting 1,580 hours of diverse sensor data is a substantial effort, but scaling such collection to broader deployment poses significant logistical and consent challenges.
Expert Commentary
SensorPersona marks a notable shift in personalization, moving from declarative user data to behavioral profiles inferred from ambient sensor streams. This could give LLM agents far more contextual awareness than chat histories alone provide. However, the strengths of its methodology, particularly the hierarchical reasoning and adaptive updating, must be weighed against profound ethical and legal challenges. The step from raw sensor data to 'psychosocial traits' involves significant inferential leaps, raising questions about accuracy, interpretability, and the potential for algorithmic bias. While the technical achievements are noteworthy, the societal implications, particularly for privacy, autonomy, and discriminatory profiling, demand prompt and comprehensive legal and ethical scrutiny. This work highlights the need for robust regulatory frameworks and a societal dialogue on the acceptable limits of AI-driven behavioral profiling.
Recommendations
- ✓ Conduct a thorough legal and ethical impact assessment, prioritizing user privacy, data security, and the potential for misuse of inferred persona data.
- ✓ Develop robust interpretability and explainability mechanisms for the persona inference process, allowing users and regulators to understand how traits are derived from sensor data.
- ✓ Implement 'privacy-by-design' principles throughout the system, including anonymization, differential privacy techniques, and granular user consent controls for data sharing and persona usage.
- ✓ Explore methods for 'persona auditing' to detect and mitigate algorithmic biases, ensuring fair and equitable treatment across diverse user groups.
- ✓ Engage with policymakers and legal scholars to develop appropriate governance models and regulatory standards for AI systems that leverage inferred behavioral profiles.
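To make the differential privacy recommendation concrete, here is a minimal sketch of the standard Laplace mechanism applied to an aggregate statistic, for example the number of users for whom a given persona trait was inferred. It is an illustration of the general technique, not something the SensorPersona paper describes; the function names, the example count, and the choice of `epsilon` are hypothetical, and a sensitivity of 1 assumes each user contributes at most one to the count.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Release a count with epsilon-differential privacy via the Laplace mechanism.

    The noise scale is sensitivity / epsilon, so a smaller epsilon (stronger
    privacy) means more noise.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Illustrative use: report how many users were tagged with some trait.
noisy = private_count(true_count=10, epsilon=0.5)
print(round(noisy, 2))
```

Noise of this kind protects aggregate releases; it does not by itself address the sensitivity of an individual's inferred persona, which is why the consent and anonymization measures above remain necessary.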
Sources
Original: arXiv - cs.CL