Extracting Consumer Insight from Text: A Large Language Model Approach to Emotion and Evaluation Measurement

arXiv:2602.15312v1

Abstract: Accurately measuring consumer emotions and evaluations from unstructured text remains a core challenge for marketing research and practice. This study introduces the Linguistic eXtractor (LX), a fine-tuned, large language model trained on consumer-authored text that also has been labeled with consumers' self-reported ratings of 16 consumption-related emotions and four evaluation constructs: trust, commitment, recommendation, and sentiment. LX consistently outperforms leading models, including GPT-4 Turbo, RoBERTa, and DeepSeek, achieving 81% macro-F1 accuracy on open-ended survey responses and greater than 95% accuracy on third-party-annotated Amazon and Yelp reviews. An application of LX to online retail data, using seemingly unrelated regression, affirms that review-expressed emotions predict product ratings, which in turn predict purchase behavior. Most emotional effects are mediated by product ratings, though some emotions, such as discontent and peacefulness, influence purchase directly, indicating that emotional tone provides meaningful signals beyond star ratings. To support its use, a no-code, cost-free, LX web application is available, enabling scalable analyses of consumer-authored text. In establishing a new methodological foundation for consumer perception measurement, this research demonstrates new methods for leveraging large language models to advance marketing research and practice, thereby achieving validated detection of marketing constructs from consumer data.

Executive Summary

This study introduces the Linguistic eXtractor (LX), a fine-tuned large language model that measures 16 consumption-related emotions and four evaluation constructs (trust, commitment, recommendation, and sentiment) from unstructured consumer text. The authors report that LX outperforms GPT-4 Turbo, RoBERTa, and DeepSeek, reaching a macro-F1 of 81% on open-ended survey responses and greater than 95% accuracy on third-party-annotated Amazon and Yelp reviews. Applied to online retail data with seemingly unrelated regression, LX indicates that review-expressed emotions predict product ratings, which in turn predict purchase behavior. Some emotions, such as discontent and peacefulness, also influence purchase directly, so emotional tone carries signal beyond star ratings. A free, no-code LX web application supports scalable analysis of consumer-authored text.

Key Points

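  • Introduction of the Linguistic eXtractor (LX), a large language model fine-tuned on consumer-authored text labeled with consumers' self-reported ratings
  • LX outperforms GPT-4 Turbo, RoBERTa, and DeepSeek: 81% macro-F1 on open-ended survey responses and greater than 95% accuracy on third-party-annotated Amazon and Yelp reviews
  • Applied to online retail data, review-expressed emotions predict product ratings, which in turn predict purchase behavior
  • A free, no-code LX web application is available for scalable analyses of consumer-authored text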

Merits

Methodological innovation

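LX is trained on consumer-authored text paired with the same consumers' self-reported ratings, so it learns to recover the emotions and evaluations the writers themselves reported. This addresses a long-standing challenge in marketing research: measuring these constructs from unstructured text.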

High accuracy and scalability

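LX reaches 81% macro-F1 on open-ended survey responses and greater than 95% accuracy on third-party-annotated Amazon and Yelp reviews, and the accompanying web application is designed for scalable analyses of consumer-authored text. Together these make it a practical tool for marketing research and practice.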
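The survey-response figure is a macro-F1 score, which is not the same quantity as plain accuracy: macro-F1 averages the per-class F1 scores with equal weight, so a model cannot score well by favoring the most common label. The sketch below illustrates the difference on a tiny made-up example. The labels and predictions are hypothetical (only the two emotion names come from the abstract), and this is not the paper's evaluation code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of items whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical data: a classifier that always predicts the majority emotion.
y_true = ["peacefulness", "peacefulness", "peacefulness", "discontent"]
y_pred = ["peacefulness", "peacefulness", "peacefulness", "peacefulness"]

print(accuracy(y_true, y_pred))            # 0.75
print(round(macro_f1(y_true, y_pred), 3))  # 0.429
```

The always-majority classifier looks respectable on accuracy but poor on macro-F1, which is why the two headline numbers in the abstract should not be compared directly.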

Empirical validation

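Applying LX to online retail data with seemingly unrelated regression, the study finds that review-expressed emotions predict product ratings, which in turn predict purchase behavior. Most emotional effects are mediated by ratings, while some emotions, such as discontent and peacefulness, affect purchase directly.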
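The mediation claim can be read as a two-equation system. The block below is a generic textbook form, not the paper's specification: the symbols $e_{ik}$ (LX score for emotion $k$ in review $i$), $a_k$, $b$, and $c'_k$ are assumptions introduced here for illustration, and the actual model may include controls and other terms not described in the abstract.

```latex
\begin{aligned}
\text{rating}_i   &= \alpha_0 + \sum_{k} a_k \, e_{ik} + u_i, \\
\text{purchase}_i &= \beta_0 + b \, \text{rating}_i + \sum_{k} c'_k \, e_{ik} + v_i, \\
\text{indirect effect of emotion } k &= a_k \, b, \qquad
\text{direct effect of emotion } k = c'_k .
\end{aligned}
```

Under this reading, "mediated by product ratings" corresponds to a nonzero $a_k b$ with $c'_k$ near zero, while a direct influence on purchase corresponds to a nonzero $c'_k$. Seemingly unrelated regression estimates the equations jointly and allows the error terms $u_i$ and $v_i$ to be correlated.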

Demerits

Limited generalizability

The findings are based on open-ended survey responses, Amazon and Yelp reviews, and online retail data, so they may not generalize to other consumer populations or contexts. Further research is needed to test this.

Dependence on labeled data

LX was trained on text labeled with consumers' self-reported ratings. Comparable labeled data may not be readily available for every marketing research application, which could limit retraining or extension to other constructs.

Expert Commentary

This study advances marketing research by using a fine-tuned large language model to measure consumer emotions and evaluations from text, and its retail application shows that the resulting measures relate to ratings and purchase behavior. Two limitations deserve attention: the results may not generalize beyond the populations and platforms studied, and the approach depends on labeled training data. Future research should address both and explore further applications of LX in marketing research and practice.

Recommendations

  • Future research should investigate the generalizability of LX across different consumer populations and contexts.
  • Developers should keep the no-code LX web application user-friendly to facilitate widespread adoption.
