Projective Psychological Assessment of Large Multimodal Models Using Thematic Apperception Tests

arXiv:2602.17108v1 (announce type: new)

Abstract: The Thematic Apperception Test (TAT) is a psychometrically grounded, multidimensional assessment framework that systematically differentiates between cognitive-representational and affective-relational components of personality-like functioning. This test is a projective psychological framework designed to uncover unconscious aspects of personality. This study examines whether the personality traits of Large Multimodal Models (LMMs) can be assessed through non-language-based modalities, using the Social Cognition and Object Relations Scale - Global (SCORS-G). LMMs are employed in two distinct roles: as subject models (SMs), which generate stories in response to TAT images, and as evaluator models (EMs), which assess these narratives using the SCORS-G framework. Evaluators demonstrated an excellent ability to understand and analyze TAT responses. Their interpretations are highly consistent with those of human experts. Assessment results highlight that all models understand interpersonal dynamics very well and have a good grasp of the concept of self. However, they consistently fail to perceive and regulate aggression. Performance varied systematically across model families, with larger and more recent models consistently outperforming smaller and earlier ones across SCORS-G dimensions.

Executive Summary

The study 'Projective Psychological Assessment of Large Multimodal Models Using Thematic Apperception Tests' applies the Thematic Apperception Test (TAT), scored with the Social Cognition and Object Relations Scale - Global (SCORS-G), to assess personality-like traits of Large Multimodal Models (LMMs). LMMs serve in two roles: as subject models (SMs) that generate stories from TAT images, and as evaluator models (EMs) that score those narratives. The findings indicate that LMMs show a strong understanding of interpersonal dynamics and self-concept but struggle to perceive and regulate aggression. Performance varies across model families, with larger and more recent models scoring higher. The study points to the potential of projective psychological frameworks for evaluating cognitive and affective traits of AI models.
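The two-role setup described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the stub models, and the image identifiers are all hypothetical, and a real subject or evaluator model would be an LMM call rather than a fixed function. The sketch assumes the standard eight SCORS-G dimensions, each rated from 1 to 7.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List

# The eight SCORS-G dimensions (common abbreviations), each rated 1-7.
SCORS_G_DIMENSIONS = ["COM", "AFF", "EIR", "EIM", "SC", "AGG", "SE", "ICS"]

# Hypothetical interfaces: a subject model maps an image id to a story,
# an evaluator model maps a story to one rating per dimension.
SubjectModel = Callable[[str], str]
EvaluatorModel = Callable[[str], Dict[str, int]]


@dataclass
class Assessment:
    image_id: str
    story: str
    ratings: Dict[str, int]


def validate_ratings(ratings: Dict[str, int]) -> Dict[str, int]:
    """Check that every dimension is present and within the 1-7 range."""
    for dim in SCORS_G_DIMENSIONS:
        if dim not in ratings:
            raise ValueError(f"missing rating for {dim}")
        if not 1 <= ratings[dim] <= 7:
            raise ValueError(f"rating for {dim} out of range: {ratings[dim]}")
    return ratings


def assess(subject: SubjectModel, evaluator: EvaluatorModel,
           image_ids: List[str]) -> List[Assessment]:
    """Have the subject tell a story per image, then score each story."""
    results = []
    for image_id in image_ids:
        story = subject(image_id)
        ratings = validate_ratings(evaluator(story))
        results.append(Assessment(image_id, story, ratings))
    return results


def dimension_means(assessments: List[Assessment]) -> Dict[str, float]:
    """Average each dimension across all assessed stories."""
    return {dim: mean(a.ratings[dim] for a in assessments)
            for dim in SCORS_G_DIMENSIONS}


# Stub models with made-up outputs, for illustration only.
def stub_subject(image_id: str) -> str:
    return f"A story about {image_id}."


def stub_evaluator(story: str) -> Dict[str, int]:
    ratings = {dim: 4 for dim in SCORS_G_DIMENSIONS}
    ratings["AGG"] = 2  # arbitrary placeholder, not a reported result
    return ratings
```

Calling `dimension_means(assess(stub_subject, stub_evaluator, ["card_1", "card_2"]))` returns one average per dimension, which is the kind of per-dimension profile the summary refers to when it contrasts interpersonal understanding with aggression.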

Key Points

  • LMMs demonstrate proficiency in understanding interpersonal dynamics and self-concept.
  • LMMs consistently fail to perceive and regulate aggression.
  • Performance varies systematically across model families, with larger and more recent models outperforming smaller and earlier ones.

Merits

Innovative Application of Psychological Frameworks

The study applies the TAT and SCORS-G, instruments traditionally used in human psychology, to AI models, offering a new way to characterize personality-like traits in LMMs.

High Consistency with Human Expert Interpretations

The evaluator models' interpretations of TAT responses are reported to be highly consistent with those of human experts, which supports the use of LMMs as SCORS-G evaluators.
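One simple way to quantify this kind of consistency is to compare evaluator-model ratings with human ratings of the same stories. The sketch below is an assumption about how such a check could look, not the agreement measure used in the paper, which this summary does not specify. The function name, the tolerance parameter, and the example numbers are invented for illustration.

```python
from typing import Dict, List

Ratings = Dict[str, int]  # dimension abbreviation -> rating on the 1-7 scale


def agreement(model: List[Ratings], human: List[Ratings],
              tolerance: int = 1) -> Dict[str, float]:
    """Compare paired ratings story by story and dimension by dimension.

    Returns the mean absolute difference and the share of ratings that
    differ by at most `tolerance` points.
    """
    if not model or len(model) != len(human):
        raise ValueError("need two non-empty rating lists of equal length")
    diffs = []
    for m, h in zip(model, human):
        if m.keys() != h.keys():
            raise ValueError("each pair must rate the same dimensions")
        diffs.extend(abs(m[dim] - h[dim]) for dim in m)
    return {
        "mean_abs_diff": sum(diffs) / len(diffs),
        "within_tolerance": sum(d <= tolerance for d in diffs) / len(diffs),
    }


# Made-up ratings for two stories on two dimensions (illustrative only).
model_ratings = [{"SC": 5, "AGG": 3}, {"SC": 6, "AGG": 2}]
human_ratings = [{"SC": 5, "AGG": 4}, {"SC": 4, "AGG": 2}]
result = agreement(model_ratings, human_ratings)
```

A lower mean absolute difference and a higher within-tolerance share both indicate closer agreement. A published study would more likely report a chance-corrected or correlation-based statistic, so this is best read as a starting point.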

Demerits

Limited Scope of Aggression Perception

The finding that LMMs struggle to perceive and regulate aggression points to a limitation in the models' emotional and behavioral understanding.

Potential Bias in Model Evaluation

The evaluation relies on LMMs acting as evaluators of narratives that other LMMs generated, which may introduce biases or inaccuracies in the results.

Expert Commentary

The study is a notable step at the intersection of psychology and AI research. Projective frameworks such as the TAT give the researchers a structured way to assess cognitive and affective traits of LMMs, and the reported consistency between evaluator models and human experts supports the reliability of the approach.

The consistent failure to perceive and regulate aggression raises questions about the emotional and behavioral limits of current models. LMMs appear strong in some cognitive domains, yet they may need further development before they handle human-like emotional responses comprehensively. The systematic performance differences across model families, with larger and more recent models scoring higher, indicate that continued model development matters for these traits. The implications also extend beyond technical progress to ethical and policy considerations for the responsible deployment of AI technologies.

Recommendations

  • Further research should explore methods to improve LMMs' capabilities in perceiving and regulating aggression, potentially through enhanced training datasets or advanced algorithms.
  • Developers and policymakers should collaborate to establish ethical guidelines and regulatory frameworks that address the assessment and management of AI personality traits, ensuring their responsible and beneficial integration into society.