Aspect-Based Sentiment Analysis for Future Tourism Experiences: A BERT-MoE Framework for Persian User Reviews

Hamidreza Kazemi Taskooh, Taha Zare Harofte

arXiv:2602.12778v1

Abstract: This study advances aspect-based sentiment analysis (ABSA) for Persian-language user reviews in the tourism domain, addressing challenges of low-resource languages. We propose a hybrid BERT-based model with Top-K routing and auxiliary losses to mitigate routing collapse and improve efficiency. The pipeline includes: (1) overall sentiment classification using BERT on 9,558 labeled reviews, (2) multi-label aspect extraction for six tourism-related aspects (host, price, location, amenities, cleanliness, connectivity), and (3) integrated ABSA with dynamic routing. The dataset consists of 58,473 preprocessed reviews from the Iranian accommodation platform Jabama, manually annotated for aspects and sentiments. The proposed model achieves a weighted F1-score of 90.6% for ABSA, outperforming baseline BERT (89.25%) and a standard hybrid approach (85.7%). Key efficiency gains include a 39% reduction in GPU power consumption compared to dense BERT, supporting sustainable AI deployment in alignment with UN SDGs 9 and 12. Analysis reveals high mention rates for cleanliness and amenities as critical aspects. This is the first ABSA study focused on Persian tourism reviews, and we release the annotated dataset to facilitate future multilingual NLP research in tourism.

Executive Summary

This study introduces a novel BERT-MoE (Mixture of Experts) framework for aspect-based sentiment analysis (ABSA) in Persian-language tourism reviews, addressing the challenges posed by low-resource languages. The research employs a hybrid model incorporating Top-K routing and auxiliary losses to enhance efficiency and performance. The dataset, comprising 58,473 reviews from the Iranian accommodation platform Jabama, is manually annotated for six tourism-related aspects. The proposed model achieves a weighted F1-score of 90.6%, outperforming baseline BERT and standard hybrid approaches. The study also highlights significant reductions in GPU power consumption, aligning with sustainable AI deployment goals. This work represents the first ABSA study focused on Persian tourism reviews and contributes an annotated dataset to advance multilingual NLP research in tourism.
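To make the pipeline's second stage concrete, here is a minimal sketch of multi-label aspect extraction over the six tourism aspects. It is an illustration only: a real system would obtain the logits from the fine-tuned BERT encoder, whereas this sketch uses fixed example values, and the 0.5 threshold is an assumption, not the authors' setting.

```python
import numpy as np

# The six aspects annotated in the Jabama dataset.
ASPECTS = ["host", "price", "location", "amenities", "cleanliness", "connectivity"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def extract_aspects(logits, threshold=0.5):
    """Return the aspects whose sigmoid probability exceeds the threshold.

    Unlike softmax classification, each aspect is decided independently,
    so a single review can mention several aspects at once (multi-label).
    """
    probs = sigmoid(np.asarray(logits, dtype=float))
    return [a for a, p in zip(ASPECTS, probs) if p > threshold]

# Hypothetical logits for a review praising the host and cleanliness,
# with a weak positive signal for amenities.
logits = [2.1, -1.3, -0.4, 0.2, 3.0, -2.0]
print(extract_aspects(logits))  # ['host', 'amenities', 'cleanliness']
```

The key design point is independent per-aspect sigmoids rather than a single softmax, since tourism reviews routinely cover multiple aspects in one comment.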

Key Points

  • Introduction of a BERT-MoE framework for ABSA in Persian tourism reviews.
  • Achievement of a weighted F1-score of 90.6%, outperforming baseline models.
  • Significant reduction in GPU power consumption by 39%.
  • First ABSA study focused on Persian tourism reviews, with an annotated dataset released.
  • Alignment with UN SDGs 9 and 12 for sustainable AI deployment.

Merits

Innovative Model Architecture

The hybrid BERT-based model with Top-K routing and auxiliary losses represents a significant advancement in addressing the challenges of low-resource languages, particularly in the context of Persian tourism reviews.
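The routing mechanism described above can be sketched as follows. This is a simplified NumPy illustration of Top-K expert routing with a Switch-Transformer-style load-balancing auxiliary loss (one common way to mitigate routing collapse); the layer sizes, expert count, and loss form are assumptions for illustration, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_model, n_experts, top_k = 8, 16, 4, 2  # illustrative sizes

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

hidden = rng.standard_normal((n_tokens, d_model))   # encoder outputs
w_gate = rng.standard_normal((d_model, n_experts))  # router weights

# Router: each token gets a probability over experts, then is dispatched
# to its top_k experts only (sparse activation -> lower compute/power).
gate_probs = softmax(hidden @ w_gate)                 # (tokens, experts)
top_idx = np.argsort(-gate_probs, axis=1)[:, :top_k]  # K best experts/token

# Load-balancing auxiliary loss: the dot product of (fraction of tokens
# routed to each expert) and (mean gate probability per expert). It is
# minimized when routing is uniform, discouraging collapse onto one expert.
frac_routed = np.bincount(top_idx.ravel(), minlength=n_experts) / (n_tokens * top_k)
mean_prob = gate_probs.mean(axis=0)
aux_loss = n_experts * float(frac_routed @ mean_prob)

print(top_idx.shape)  # each token dispatched to exactly top_k experts
print(aux_loss)
```

In training, the auxiliary loss would be added (with a small coefficient) to the task loss, so the router learns to spread tokens across experts while each token still activates only K of them.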

High Performance Metrics

The model achieves a weighted F1-score of 90.6%, demonstrating superior performance compared to baseline BERT (89.25%) and standard hybrid approaches (85.7%).
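For readers unfamiliar with the metric, the following sketch shows how a weighted F1-score is computed: each class's F1 is weighted by its support (number of true instances), so frequent sentiment classes dominate the aggregate. The labels below are made up for illustration and have no relation to the paper's data.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)
    total = 0.0
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        total += support[c] * f1  # weight per-class F1 by class frequency
    return total / len(y_true)

y_true = ["pos", "pos", "pos", "neg", "neg", "neu"]
y_pred = ["pos", "pos", "neg", "neg", "neu", "neu"]
print(round(weighted_f1(y_true, y_pred), 3))  # 0.678
```

This matches scikit-learn's `f1_score(..., average='weighted')`, the convention the paper's 90.6% figure presumably follows.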

Efficiency Gains

The study highlights a 39% reduction in GPU power consumption, making the model more sustainable and efficient, which is crucial for large-scale deployments.

First ABSA Study in Persian Tourism

This research is the first to focus on ABSA for Persian tourism reviews, providing valuable insights and an annotated dataset for future research.

Demerits

Limited Generalizability

The study is focused on Persian tourism reviews, which may limit the generalizability of the findings to other languages or domains.

Dataset Specificity

The dataset is specific to the Iranian accommodation platform Jabama, which may not fully represent the broader tourism sector or other regions.

Model Complexity

The hybrid model, while efficient at inference time, may be more complex to implement than a dense baseline and may require specialized knowledge of MoE routing for effective deployment.

Expert Commentary

This study represents a significant contribution to the field of aspect-based sentiment analysis, particularly in the context of low-resource languages like Persian. The innovative use of a BERT-MoE framework with Top-K routing and auxiliary losses addresses critical challenges in NLP for languages with limited resources. The high performance metrics, coupled with substantial efficiency gains, make this research particularly noteworthy. The focus on Persian tourism reviews fills a gap in the literature and provides valuable data for future research. However, the study's specificity to Persian and the tourism domain may limit its immediate applicability to other contexts. The release of the annotated dataset is a commendable step towards fostering multilingual NLP research. Overall, this work sets a strong foundation for further advancements in ABSA and sustainable AI deployment.

Recommendations

  • Future research should explore the applicability of this framework to other low-resource languages and domains to enhance generalizability.
  • The model's efficiency and performance should be tested on larger and more diverse datasets to validate its robustness and scalability.
