Archetypal Graph Generative Models: Explainable and Identifiable Communities via Anchor-Dominant Convex Hulls

arXiv:2602.21342v1 Announce Type: new Abstract: Representation learning has been essential for graph machine learning tasks such as link prediction, community detection, and network visualization. Despite recent advances in achieving high performance on these downstream tasks, little progress has been made toward self-explainable models. Understanding the patterns behind predictions is equally important, motivating recent interest in explainable machine learning. In this paper, we present GraphHull, an explainable generative model that represents networks using two levels of convex hulls. At the global level, the vertices of a convex hull are treated as archetypes, each corresponding to a pure community in the network. At the local level, each community is refined by a prototypical hull whose vertices act as representative profiles, capturing community-specific variation. This two-level construction yields clear multi-scale explanations: a node's position relative to global archetypes and its local prototypes directly accounts for its edges. The geometry is well-behaved by design, while local hulls are kept disjoint by construction. To further encourage diversity and stability, we place principled priors, including determinantal point processes, and fit the model under MAP estimation with scalable subsampling. Experiments on real networks demonstrate the ability of GraphHull to recover multi-level community structure and to achieve competitive or superior performance in link prediction and community detection, while naturally providing interpretable predictions.

Executive Summary

This paper proposes GraphHull, an explainable generative model for graphs. GraphHull represents a network with two levels of convex hulls: the vertices of a global hull act as archetypes, each corresponding to a pure community, and each community is refined by a local hull whose vertices act as prototypes capturing community-specific variation. A node's position relative to the global archetypes and its local prototypes directly accounts for its edges. According to the abstract, the geometry is well-behaved by design and the local hulls are disjoint by construction. Priors, including determinantal point processes, encourage diversity and stability, and the model is fit by MAP estimation with scalable subsampling. On real networks, the authors report that GraphHull recovers multi-level community structure and achieves competitive or superior performance in link prediction and community detection while providing interpretable predictions.

Key Points

  • GraphHull represents networks using two levels of convex hulls: global archetypes and local prototypes.
  • The model's geometry is well-behaved, and local hulls are disjoint by construction.
  • Principled priors, including determinantal point processes, are used to encourage diversity and stability.
  • GraphHull achieves competitive or superior performance in link prediction and community detection.
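
The diversity prior listed among the key points can be illustrated with a small sketch. A determinantal point process assigns higher probability to sets of items whose similarity kernel has a larger determinant, which favors spread-out configurations. The snippet below assumes an RBF kernel over archetype coordinates and uses its log-determinant as an unnormalized log-prior. The kernel choice, the `length_scale` and `jitter` parameters, and the function name `dpp_log_prior` are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def dpp_log_prior(points, length_scale=1.0, jitter=1e-9):
    """Unnormalized DPP-style log-prior: log det of an RBF kernel matrix.

    Illustrative assumption only; the paper's actual kernel is not given
    in the abstract.
    """
    # Pairwise squared Euclidean distances between the candidate archetypes.
    sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-sq_dists / (2.0 * length_scale ** 2))
    # A small diagonal jitter keeps the determinant numerically stable.
    kernel = kernel + jitter * np.eye(len(points))
    _, logdet = np.linalg.slogdet(kernel)
    return logdet


# Three well-separated archetypes versus three nearly coincident ones.
spread = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
clustered = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]])

print("spread:   ", dpp_log_prior(spread))
print("clustered:", dpp_log_prior(clustered))
```

The spread-out configuration receives the larger log-prior, so adding such a term to a MAP objective penalizes archetypes that collapse onto one another.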

Merits

Strength in Explainability

GraphHull yields multi-scale explanations: a node's position relative to the global archetypes and its local prototypes directly accounts for its edges, addressing the need for self-explainable graph models.
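
To make the two-level reading concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that a node's embedding is a convex combination of community-level positions, each of which is itself a convex combination of that community's local prototypes. The function name `two_level_embedding`, the softmax parameterization, and all shapes are illustrative assumptions. How the paper ties local hulls to the global archetypes is not specified in the abstract, so the sketch represents each community only by its local prototypes.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax; outputs lie on the probability simplex."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def two_level_embedding(global_logits, local_logits, prototypes):
    """Hypothetical two-level convex-hull embedding (assumed, not the paper's code).

    global_logits: (n_nodes, K)     unnormalized community memberships
    local_logits:  (n_nodes, K, M)  unnormalized weights over each community's prototypes
    prototypes:    (K, M, d)        local hull vertices for each community
    """
    global_w = softmax(global_logits, axis=-1)
    local_w = softmax(local_logits, axis=-1)
    # Position of each node inside each community's local hull: (n_nodes, K, d).
    local_pos = np.einsum("nkm,kmd->nkd", local_w, prototypes)
    # Mix the local positions using the global memberships: (n_nodes, d).
    embedding = np.einsum("nk,nkd->nd", global_w, local_pos)
    return embedding, global_w, local_w


rng = np.random.default_rng(0)
n_nodes, K, M, d = 5, 3, 4, 2
prototypes = rng.normal(size=(K, M, d))
embedding, global_w, local_w = two_level_embedding(
    rng.normal(size=(n_nodes, K)),
    rng.normal(size=(n_nodes, K, M)),
    prototypes,
)

# A two-level "explanation" for node 0: its dominant community, then the
# prototype it most resembles within that community.
dominant_community = int(global_w[0].argmax())
dominant_prototype = int(local_w[0, dominant_community].argmax())
print(dominant_community, dominant_prototype)
```

Because both sets of weights are non-negative and sum to one, every embedding stays inside the convex hull of the prototypes, and the weights themselves can be read as the explanation.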

Competitive Performance

GraphHull achieves competitive or superior performance in link prediction and community detection, making it a viable option for graph machine learning tasks.

Demerits

Computational Complexity

Even with scalable subsampling, GraphHull's MAP estimation may remain computationally expensive on very large networks. The abstract reports no runtime or scaling figures, so the practical cost is unclear.
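
The abstract does not describe the subsampling scheme. As a generic illustration of why subsampling lowers the per-step cost, the sketch below estimates a Bernoulli log-likelihood over all node pairs from a uniform random subset of pairs, rescaled so that the estimate is unbiased. The latent-distance link (`bias` minus Euclidean distance), the uniform sampling, and every function name here are assumptions, not the paper's method.

```python
import numpy as np


def pair_log_lik(z, bias, i, j, y):
    """Bernoulli log-likelihood of pairs (i, j) under an assumed distance link."""
    logits = bias - np.linalg.norm(z[i] - z[j], axis=-1)
    # y * logit - log(1 + exp(logit)), computed stably with logaddexp.
    return y * logits - np.logaddexp(0.0, logits)


def full_log_lik(z, bias, adj):
    """Exact log-likelihood over all n(n-1)/2 unordered pairs."""
    i, j = np.triu_indices(adj.shape[0], k=1)
    return pair_log_lik(z, bias, i, j, adj[i, j]).sum()


def subsampled_log_lik(z, bias, adj, n_pairs, rng):
    """Unbiased estimate of the full log-likelihood from n_pairs random pairs."""
    # For clarity this enumerates all pairs first; a scalable version would
    # draw pair indices directly instead.
    i, j = np.triu_indices(adj.shape[0], k=1)
    idx = rng.choice(i.size, size=n_pairs, replace=False)
    scale = i.size / n_pairs
    return scale * pair_log_lik(z, bias, i[idx], j[idx], adj[i[idx], j[idx]]).sum()


rng = np.random.default_rng(0)
n, bias = 30, 1.0
z = rng.normal(size=(n, 2))
upper = np.triu(rng.random((n, n)) < 0.2, k=1)
adj = (upper | upper.T).astype(float)  # symmetric 0/1 adjacency matrix

print("full:     ", full_log_lik(z, bias, adj))
print("estimate: ", subsampled_log_lik(z, bias, adj, 100, rng))
```

Each evaluation touches only `n_pairs` pairs rather than all of them, at the price of a noisier objective. Whether this trade-off is favorable for GraphHull would need to be checked against the paper's own experiments.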

Limited Generalizability

GraphHull's performance and explainability may be specific to the types of networks and tasks used in the experiments, limiting its generalizability to other domains.

Expert Commentary

GraphHull is a notable contribution to graph machine learning: it addresses the need for self-explainable models while reporting competitive performance in link prediction and community detection. Its potential computational cost and its untested generalizability beyond the evaluated networks remain open concerns. The combination of principled priors, such as determinantal point processes, with scalable subsampling may carry over to other generative models. Further research is needed to assess the model's generalizability and scalability.

Recommendations

  • Future research should evaluate GraphHull on larger networks and on a more diverse range of network types.
  • More efficient algorithms for MAP estimation and subsampling would extend GraphHull's applicability to large-scale networks.