Beauty in the Eye of AI: Aligning LLMs and Vision Models with Human Aesthetics in Network Visualization
arXiv:2604.03417v1 Announce Type: new Abstract: Network visualization has traditionally relied on heuristic metrics, such as stress, under the assumption that optimizing them leads to aesthetic and informative layouts. However, no single metric consistently produces the most effective results. A data-driven alternative is to learn from human preferences, where annotators select their favored visualization among multiple layouts of the same graphs. These human-preference labels can then be used to train a generative model that approximates human aesthetic preferences. However, obtaining human labels at scale is costly and time-consuming. As a result, this generative approach has so far been tested only with machine-labeled data. In this paper, we explore the use of large language models (LLMs) and vision models (VMs) as proxies for human judgment. Through a carefully designed user study involving 27 participants, we curated a large set of human preference labels. We used this data both to better understand human preferences and to bootstrap LLM/VM labelers. We show that prompt engineering that combines few-shot examples and diverse input formats, such as image embeddings, significantly improves LLM-human alignment, and additional filtering by the confidence score of the LLM pushes the alignment to human-human levels. Furthermore, we demonstrate that carefully trained VMs can achieve VM-human alignment at a level comparable to that between human annotators. Our results suggest that AI can feasibly serve as a scalable proxy for human labelers.
Executive Summary
This study explores large language models (LLMs) and vision models (VMs) as proxies for human judgment in network visualization. The authors bootstrap LLM and VM labelers using human-preference labels collected in a carefully designed user study with 27 participants. The results show that prompt engineering, combining few-shot examples with diverse input formats such as image embeddings, together with filtering by the LLM's confidence score, raises LLM-human alignment to human-human levels. Carefully trained VMs likewise reach alignment with humans comparable to the agreement between human annotators. The study suggests that AI can feasibly serve as a scalable proxy for human labelers, a finding with significant implications for network visualization and for AI-powered graph layout tools.
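To make the few-shot labeling setup concrete, the sketch below assembles a pairwise-preference prompt of the kind described. The message structure, field names, and wording are illustrative assumptions, not the paper's actual prompt; a real pipeline would attach rendered layout images or their embeddings rather than short text descriptions.

```python
# Hedged sketch: composing a few-shot prompt that asks an LLM which of two
# network-graph layouts it prefers, with a self-reported confidence.
# All names and phrasing here are hypothetical.

def build_preference_prompt(examples, query):
    """Compose a few-shot prompt asking which of two layouts looks better."""
    lines = ["You compare two network-graph layouts, A and B, and answer "
             "with the more readable one plus a confidence in [0, 1]."]
    # Each few-shot example pairs a layout description with a labeled answer.
    for ex in examples:
        lines.append(f"Layouts: A={ex['a']} B={ex['b']}")
        lines.append(f"Answer: {ex['choice']} (confidence {ex['confidence']})")
    # The query is left unanswered for the model to complete.
    lines.append(f"Layouts: A={query['a']} B={query['b']}")
    lines.append("Answer:")
    return "\n".join(lines)

few_shot = [
    {"a": "low edge crossings", "b": "many node overlaps",
     "choice": "A", "confidence": 0.9},
    {"a": "cluttered hub", "b": "clearly separated clusters",
     "choice": "B", "confidence": 0.8},
]
prompt = build_preference_prompt(few_shot, {"a": "layout_17a", "b": "layout_17b"})
print(prompt)
```

The same scaffold extends naturally to richer input formats: the text placeholders would be swapped for image attachments or precomputed embeddings, which is the variation the study reports as improving alignment.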
Key Points
- ▸ The study uses human-preference labels to train LLMs and VMs as proxies for human judgment in network visualization.
- ▸ Prompt engineering and confidence score filtering improve LLM-human alignment to human-human levels.
- ▸ VMs demonstrate comparable alignment to human annotators, suggesting their potential as scalable proxies for human labelers.
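The confidence-filtering idea in the points above can be sketched in a few lines: measure LLM-human alignment as the fraction of layout pairs where both pick the same layout, then recompute it over only the labels whose self-reported confidence clears a threshold. The data, threshold, and function names are illustrative assumptions, not the paper's pipeline.

```python
# Hedged sketch: confidence-score filtering of machine preference labels.
# `llm_labels` pairs a predicted choice ("A" or "B") with a self-reported
# confidence in [0, 1]; values below are toy data for illustration.

def alignment(machine, human):
    """Fraction of layout pairs where machine and human pick the same layout."""
    assert len(machine) == len(human) and machine
    return sum(m == h for m, h in zip(machine, human)) / len(machine)

def filter_by_confidence(labels, threshold):
    """Keep (index, choice) for labels whose confidence meets the threshold."""
    return [(i, choice) for i, (choice, conf) in enumerate(labels)
            if conf >= threshold]

llm_labels = [("A", 0.9), ("B", 0.4), ("A", 0.8),
              ("B", 0.95), ("A", 0.3), ("B", 0.7)]
human_choices = ["A", "A", "A", "B", "B", "B"]

# Alignment over all six pairs, no filtering.
all_llm = [choice for choice, _ in llm_labels]
print(f"unfiltered alignment: {alignment(all_llm, human_choices):.2f}")

# Alignment restricted to confident labels only.
kept = filter_by_confidence(llm_labels, threshold=0.6)
filtered = alignment([c for _, c in kept],
                     [human_choices[i] for i, _ in kept])
print(f"filtered alignment ({len(kept)} of {len(llm_labels)} kept): {filtered:.2f}")
```

On this toy data the low-confidence labels are exactly the disagreements, so filtering raises alignment from 0.67 to 1.00 at the cost of covering fewer pairs; the study reports an analogous trade-off pushing LLM-human alignment toward human-human levels.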
Merits
Strength in Methodology
A carefully designed user study with 27 participants yields a robust human-preference dataset for bootstrapping and evaluating the LLM and VM labelers.
Transferability of Findings
The results suggest that bootstrapping AI labelers from a modest pool of human preferences could transfer to labeling tasks beyond network visualization.
Demerits
Limited Generalizability
The study's results may not generalize to other scenarios or datasets, requiring further validation.
Dependence on Human Annotators
The approach still requires an initial pool of human-annotated preference labels for bootstrapping and evaluation, which is time-consuming and costly to collect.
Expert Commentary
This study makes a significant contribution to network visualization by demonstrating that LLMs and VMs can serve as scalable proxies for human labelers, suggesting that AI can augment human aesthetic judgment in tasks such as graph layout. However, its limitations, namely the residual dependence on human annotators for bootstrapping and the untested generalizability of the findings, should be addressed in future research. The methodology nonetheless has implications for building AI-powered tools that assist human decision-makers in a range of domains.
Recommendations
- ✓ Future studies should investigate the transferability of the approach to other domains and scenarios.
- ✓ The development of LLMs and VMs as scalable proxies for human labelers should be further explored in the context of graph layout and visualization.
Sources
Original: arXiv - cs.LG