
TopicENA: Enabling Epistemic Network Analysis at Scale through Automated Topic-Based Coding


Owen H. T. Lu, Tiffany T. Y. Hsu

arXiv:2603.03307v1

Abstract: Epistemic Network Analysis (ENA) is a method for investigating the relational structure of concepts in text by representing co-occurring concepts as networks. Traditional ENA, however, relies heavily on manual expert coding, which limits its scalability and real-world applicability to large text corpora. Topic modeling provides an automated approach to extracting concept-level representations from text and can serve as an alternative to manual coding. To tackle this limitation, the present study merges BERTopic with ENA and introduces TopicENA, a topic-based epistemic network analysis framework. TopicENA substitutes manual concept coding with automatically generated topics while maintaining ENA's capacity for modeling structural associations among concepts. To explain the impact of modeling choices on TopicENA outcomes, three analysis cases are presented. The first case assesses the effect of topic granularity, indicating that coarse-grained topics are preferable for large datasets, whereas fine-grained topics are more effective for smaller datasets. The second case examines topic inclusion thresholds and finds that threshold values should be adjusted according to topic quality indicators to balance network consistency and interpretability. The third case tests TopicENA's scalability by applying it to a substantially larger dataset than those used in previous ENA studies. Collectively, these cases illustrate that TopicENA facilitates practical and interpretable ENA analysis at scale and offers concrete guidance for configuring topic-based ENA pipelines in large-scale text analysis.
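The core idea in the abstract, treating automatically assigned topics as codes and linking codes that co-occur, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cooccurrence_network` and `normalize`, the moving-window rule, and the hand-written topic labels are all assumptions made for illustration, and no topic model is run here. The sketch only shows how per-unit code sets could become a weighted, normalized network.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_network(coded_units, window=2):
    """Count undirected co-occurrences between codes (hypothetical helper).

    coded_units: list of sets, one per text unit (e.g. an utterance),
                 holding the topic labels assigned to that unit.
    window:      number of consecutive units treated as one context.
                 Counting every pair inside the window is a simplification
                 assumed for this sketch.
    """
    edges = Counter()
    for end in range(len(coded_units)):
        start = max(0, end - window + 1)
        # Collect every code present anywhere in the current window.
        present = set().union(*coded_units[start:end + 1])
        # Sort so each undirected pair is always stored under the same key.
        for a, b in combinations(sorted(present), 2):
            edges[(a, b)] += 1
    return edges


def normalize(edges):
    """Scale edge weights to unit Euclidean length (hypothetical helper)."""
    norm = math.sqrt(sum(v * v for v in edges.values()))
    if norm == 0:
        return {}
    return {pair: v / norm for pair, v in edges.items()}


if __name__ == "__main__":
    # Stand-in topic labels; a topic model would normally supply these.
    units = [{"t0"}, {"t1"}, {"t0", "t2"}, set()]
    print(normalize(cooccurrence_network(units, window=2)))
```

Normalizing the edge weights makes networks built from different amounts of text comparable, which matters once the coded corpus becomes large.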

Executive Summary

This study proposes TopicENA, a framework for Epistemic Network Analysis (ENA) that replaces manual expert coding with topics generated automatically by BERTopic. Because coding no longer depends on human annotators, ENA can be applied to text corpora larger than those used in earlier ENA studies while still modeling structural associations among concepts. Three analysis cases examine how modeling choices affect the results: topic granularity, topic inclusion thresholds, and scalability to a substantially larger dataset. The findings indicate that coarse-grained topics suit large datasets and fine-grained topics suit smaller ones, and that inclusion thresholds should be tuned against topic quality indicators to balance network consistency and interpretability. TopicENA's usefulness therefore depends on configuring these two settings carefully. Potential application areas include information retrieval, natural language processing, and knowledge discovery.

Key Points

  • TopicENA automates concept coding in ENA by using BERTopic topics in place of manual expert codes
  • This makes ENA practical and interpretable on large text corpora
  • Three analysis cases show how topic granularity, inclusion thresholds, and dataset size affect TopicENA outcomes

Merits

Strength in Scalability

By removing the manual coding step, TopicENA can be applied to corpora substantially larger than those used in previous ENA studies.

Improved Interpretability

The study offers concrete guidance on choosing topic granularity and inclusion thresholds, which helps keep the resulting networks interpretable.

Demerits

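Sensitivity to Topic Granularity

TopicENA's results depend on the chosen topic granularity. The study finds coarse-grained topics preferable for large datasets and fine-grained topics more effective for smaller ones, so the setting has to be matched to corpus size rather than fixed in advance.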

Threshold Adjustments Required

Topic inclusion thresholds must be adjusted according to topic quality indicators, which adds a tuning step to the analysis process.
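To make the threshold trade-off concrete, here is a small sketch under a stated assumption: that the inclusion threshold is applied to per-unit topic probabilities. The abstract does not say how the threshold is defined, so this interpretation, along with the names `apply_inclusion_threshold` and `coded_fraction` and the example probabilities, is hypothetical.

```python
def apply_inclusion_threshold(topic_probs, threshold):
    """Convert per-unit topic probabilities into sets of included topics.

    topic_probs: list of dicts mapping topic label -> probability, one dict
                 per text unit (assumed input format for this sketch).
    threshold:   minimum probability for a topic to count as a code.
    """
    return [
        {topic for topic, p in probs.items() if p >= threshold}
        for probs in topic_probs
    ]


def coded_fraction(coded_units):
    """Share of units that received at least one code."""
    if not coded_units:
        return 0.0
    return sum(1 for codes in coded_units if codes) / len(coded_units)


if __name__ == "__main__":
    # Made-up probabilities, for illustration only.
    probs = [
        {"t0": 0.90, "t1": 0.05},
        {"t0": 0.40, "t1": 0.35},
        {"t0": 0.10, "t1": 0.20},
    ]
    for threshold in (0.3, 0.5):
        coded = apply_inclusion_threshold(probs, threshold)
        print(threshold, coded, coded_fraction(coded))
```

In this toy data, raising the threshold from 0.3 to 0.5 drops the share of coded units from two thirds to one third: a stricter threshold keeps only confident assignments but leaves fewer units to build a network from.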

Expert Commentary

The study's findings are significant because they show that ENA can be run without manual expert coding, which has been the main barrier to applying it at scale. The results, however, depend on configuration choices, chiefly topic granularity and inclusion thresholds. To apply TopicENA well, researchers and practitioners need a clearer understanding of how these settings shape ENA outcomes. The results also point to a need for further research on applying TopicENA in other domains, including policy and decision-making.

Recommendations

  • Researchers and practitioners should examine how topic granularity and inclusion thresholds affect ENA outcomes before settling on a configuration.
  • Further research is needed to explore the application of TopicENA in various domains, including policy and decision-making.

Sources