Beyond the Star Rating: A Scalable Framework for Aspect-Based Sentiment Analysis Using LLMs and Text Classification
arXiv:2602.21082v1. Abstract: Customer-provided reviews have become an important source of information for business owners and other customers alike. However, effectively analyzing millions of unstructured reviews remains challenging. While large language models (LLMs) show promise for natural language understanding, their application to large-scale review analysis has been limited by computational costs and scalability concerns. This study proposes a hybrid approach that uses LLMs for aspect identification while employing classic machine-learning methods for sentiment classification at scale. Using ChatGPT to analyze sampled restaurant reviews, we identified key aspects of dining experiences and developed sentiment classifiers using human-labeled reviews, which we subsequently applied to 4.7 million reviews collected over 17 years from a major online platform. Regression analysis reveals that our machine-labeled aspects significantly explain variance in overall restaurant ratings across different aspects of dining experiences, cuisines, and geographical regions. Our findings demonstrate that combining LLMs with traditional machine learning approaches can effectively automate aspect-based sentiment analysis of large-scale customer feedback, suggesting a practical framework for both researchers and practitioners in the hospitality industry and potentially, other service sectors.
Executive Summary
This study proposes a scalable framework for aspect-based sentiment analysis of customer reviews using large language models (LLMs) and traditional machine learning approaches. The authors demonstrate the effectiveness of their hybrid approach by analyzing 4.7 million restaurant reviews and identifying key aspects of dining experiences. The results show that machine-labeled aspects significantly explain variance in overall restaurant ratings, offering a practical framework for both researchers and practitioners in the hospitality industry and beyond.
Key Points
- ▸ Hybrid approach: LLMs identify the aspects, and traditional machine-learning classifiers label sentiment at scale
- ▸ Analysis of 4.7 million restaurant reviews collected over 17 years from a major online platform
- ▸ Regression analysis shows that the machine-labeled aspects significantly explain variance in overall restaurant ratings across aspects, cuisines, and geographical regions
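To make the regression step concrete, here is a minimal sketch of an ordinary least squares fit of star ratings on per-aspect sentiment scores. Everything in it is an assumption for illustration: the two aspects (food, service), the -1/0/+1 score coding, the toy data, and the helper names `solve` and `ols` are hypothetical, and the abstract does not state the authors' actual regression specification. The toy ratings are constructed from a known linear rule so that the fit can be checked.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(features, targets):
    """Fit targets ~ intercept + features; return (coefficients, R^2)."""
    rows = [[1.0] + list(r) for r in features]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    preds = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(targets) / len(targets)
    ss_res = sum((y - p) ** 2 for y, p in zip(targets, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in targets)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical data: (food, service) sentiment per review, coded -1, 0, or +1.
aspect_scores = [(1, 1), (1, -1), (-1, 1), (-1, -1), (0, 0), (1, 0)]
# Toy ratings built from the rule 3 + 1.0 * food + 0.5 * service.
ratings = [3 + 1.0 * f + 0.5 * s for f, s in aspect_scores]

coefs, r_squared = ols(aspect_scores, ratings)
print("intercept, food, service:", [round(c, 3) for c in coefs])
print("R^2:", round(r_squared, 3))
```

Because the toy ratings follow the linear rule exactly, the fit recovers it and R^2 is approximately 1. On real reviews the coefficients would indicate how strongly each aspect's sentiment is associated with the overall rating, and R^2 how much rating variance the aspects explain together.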
Merits
Scalability
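The LLM is applied only to a sample of reviews to identify aspects, while classic machine-learning classifiers label sentiment across the full corpus. This lets the framework analyze large-scale customer feedback without incurring LLM computational costs on every review.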
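As an illustration of the kind of lightweight classifier that can be run over millions of reviews, the sketch below trains a multinomial Naive Bayes model for a single aspect. The choice of Naive Bayes, the aspect ("service"), the `tokenize` helper, and the six training snippets are all assumptions made for this example; the abstract says only that "classic machine-learning methods" were trained on human-labeled reviews.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # skip words never seen in training
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical human-labeled snippets for the "service" aspect.
train_texts = [
    "the waiter was friendly and attentive",
    "quick and helpful service",
    "staff were friendly and welcoming",
    "the waiter was rude and slow",
    "slow service and we waited forever",
    "staff ignored us and were rude",
]
train_labels = ["positive"] * 3 + ["negative"] * 3

clf = NaiveBayes().fit(train_texts, train_labels)
print(clf.predict("friendly staff and quick service"))
print(clf.predict("rude waiter and slow service"))
```

Once trained, a model like this scores each review with a few dictionary lookups, which is why the expensive LLM step can be confined to a small sample. In practice one such classifier would be trained per aspect, on far more labeled data than shown here.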
Demerits
Dependence on Human-Labeled Data
The study relies on human-labeled reviews for training sentiment classifiers, which may limit the applicability of the framework to domains with scarce labeled data.
Expert Commentary
The study's hybrid approach offers a promising solution for large-scale aspect-based sentiment analysis because it sidesteps the computational cost of running an LLM over every review. Using traditional machine-learning methods for sentiment classification is the framework's key strength, since it keeps the analysis of millions of reviews efficient. However, the classifiers are trained on human-labeled reviews, which may limit the framework's applicability to domains where labeled data is scarce. Further research is needed to determine whether unsupervised or semi-supervised learning approaches can mitigate this limitation.
Recommendations
- ✓ Future studies should investigate the application of the proposed framework to other service sectors and domains, exploring its potential for improving customer experience and informing business decisions.
- ✓ Researchers should also explore the development of unsupervised or semi-supervised learning approaches to reduce the reliance on human-labeled data and improve the framework's applicability to domains with scarce labeled data.