Author: Denis Avetisyan
A new approach uses artificial intelligence to automatically refine data visualizations, making complex datasets easier to understand and interpret.

This work presents an agentic AI pipeline leveraging large language models for automated hyperparameter optimization in dimensionality reduction, improving data visualization and interpretability, particularly in single-cell RNA sequencing.
Effectively visualizing high-dimensional data relies on dimensionality reduction, yet achieving optimal algorithm configuration remains a significant challenge for pattern discovery. This paper, ‘Explainable Iterative Data Visualisation Refinement via an LLM Agent’, introduces an agentic AI pipeline that leverages large language models to automate hyperparameter tuning and provide explainable refinements to data visualizations. By framing visualization evaluation as a semantic task, the system generates contextualized reports and actionable recommendations, rapidly producing high-quality plots. Could this approach unlock new levels of insight in complex datasets and accelerate scientific discovery through more intuitive data exploration?
The Erosion of Intuition in High-Dimensional Space
The advent of modern biological technologies, notably Single-Cell RNA Sequencing (scRNA-seq), has ushered in an era of remarkably high-dimensional data. Each cell profiled yields measurements for tens of thousands of genes, creating datasets with dimensions far exceeding intuitive human visualization capabilities. This presents a fundamental challenge: each data point represents a single cell, yet it lives in a computational space with so many variables that patterns become obscured and meaningful relationships are difficult to discern. Effectively representing and interpreting this complexity necessitates novel approaches to dimensionality reduction and visualization, lest crucial biological insights remain hidden within the sheer volume of data. The difficulty isn’t simply displaying a large number of features; it is preserving the essential structure and relationships within the data while reducing it to a form humans can readily grasp.
Conventional dimensionality reduction techniques, while designed to simplify complex datasets, frequently fail to accurately represent the intrinsic relationships within high-dimensional data. Methods like Principal Component Analysis (PCA), for instance, assume linearity and may distort non-linear manifolds inherent in datasets generated by modern technologies such as scRNA-seq. This simplification can lead to the artificial separation of truly similar data points or, conversely, the clustering of disparate ones, ultimately obscuring biological signals and driving misleading conclusions. The loss of nuanced structure impacts downstream analyses, from clustering and classification to the identification of novel biomarkers, necessitating the development of more sophisticated techniques capable of faithfully preserving the data’s underlying geometry and avoiding spurious interpretations.
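To make this failure mode concrete, here is a minimal sketch (synthetic data, not the paper’s code) of linearly projecting a rolled-up two-dimensional manifold: scikit-learn’s `trustworthiness` checks whether neighbors in the projection were really neighbors in the original space, and drops below 1 where the flattened sheets overlap.

```python
# Sketch: a linear projection (PCA) distorts a non-linear manifold.
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import trustworthiness

# A classic non-linear manifold: a 2-D sheet rolled up in 3-D.
X, _ = make_swiss_roll(n_samples=500, random_state=0)

X2 = PCA(n_components=2).fit_transform(X)       # linear projection flattens the roll
trust = trustworthiness(X, X2, n_neighbors=10)  # < 1 where distinct sheets overlap

print(f"trustworthiness of the linear projection: {trust:.3f}")
```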
![An LLM agent (GPT 5.2) successfully analyzed single-cell RNA sequencing data from healthy human kidneys, generating dendrograms using both high-dimensional Principal Component Analysis and low-dimensional t-distributed Stochastic Neighbor Embedding (t-SNE) with initial hyperparameter settings.](https://arxiv.org/html/2604.15319v1/figures/tsne_plot_marked.png)
Balancing Preservation and Reduction: The Geometry of Data
Dimensionality reduction techniques – including Principal Component Analysis (PCA), t-distributed Stochastic Neighbor Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP), and PaCMAP – differ in their ability to preserve data relationships. PCA excels at maintaining global structure by maximizing variance, but may not accurately represent local neighborhoods. t-SNE and UMAP prioritize preserving local structure, effectively clustering similar data points but potentially distorting global distances. PaCMAP aims to balance local and global preservation through a different optimization strategy, often representing both kinds of structure more faithfully than t-SNE or UMAP, particularly for complex datasets. The optimal choice depends on the specific application and the relative importance of accurately representing local versus global data characteristics.
The selection of a dimensionality reduction (DR) technique directly influences the fidelity of the resulting visualization. Datasets with strongly linear relationships are generally well suited to PCA, which maximizes variance preservation but struggles with non-linear manifolds. t-SNE and UMAP excel at preserving local structure yet can distort global distances, particularly in high-dimensional data or regions of differing density, while PaCMAP offers a more balanced compromise between the two. Appropriate method selection therefore requires assessing the dataset’s intrinsic dimensionality, the presence of non-linear relationships, and the relative importance of preserving local versus global neighborhood structure.
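The trade-off can be illustrated with a short sketch on synthetic clustered data (UMAP and PaCMAP live in the third-party `umap-learn` and `pacmap` packages and are omitted so the example needs only scikit-learn):

```python
# Sketch: a linear and a non-linear reducer on the same data.
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Synthetic high-dimensional data with known cluster structure.
X, y = make_blobs(n_samples=200, n_features=50, centers=4, random_state=0)

# PCA: linear, maximizes retained global variance.
pca = PCA(n_components=2).fit(X)
X_pca = pca.transform(X)

# t-SNE: non-linear, prioritizes local neighborhood preservation.
X_tsne = TSNE(n_components=2, perplexity=30, random_state=0,
              init="pca").fit_transform(X)

print("variance retained by two components:",
      round(float(pca.explained_variance_ratio_.sum()), 3))
```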
Optimizing hyperparameters is crucial for effective dimensionality reduction, as these settings directly control the algorithm’s behavior and the fidelity of the reduced representation. Parameters such as the number of components, neighborhood size, and learning rate govern how the algorithm balances preserving local versus global structure within the data. Incorrectly configured hyperparameters can cause substantial information loss, manifesting as distorted distances between data points in the lower-dimensional space or a failure to capture meaningful clusters. Optimization techniques include grid search, random search, and Bayesian optimization, each aiming to identify the parameter combination that minimizes a defined cost function, typically a measure of reconstruction error or of inter-point distance preservation, and thereby reduces distortion during the reduction process.
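As a baseline, the grid-search variant can be sketched as follows; the parameter grid, the synthetic data, and the rescaling step inside `stress1` are illustrative assumptions, not the paper’s configuration:

```python
# Sketch: grid search over t-SNE's perplexity, scored by Kruskal's Stress-1.
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE

X, _ = make_blobs(n_samples=150, n_features=30, centers=3, random_state=0)
d_hi = pdist(X)  # pairwise distances in the original space

def stress1(d_hi, d_lo):
    """Kruskal's Stress-1 after optimally rescaling the embedding distances."""
    b = np.dot(d_hi, d_lo) / np.dot(d_lo, d_lo)  # t-SNE's scale is arbitrary
    return float(np.sqrt(np.sum((d_hi - b * d_lo) ** 2) / np.sum(d_hi ** 2)))

# Exhaustively score each candidate perplexity; lower stress = less distortion.
results = {}
for perplexity in (5, 15, 30):
    emb = TSNE(n_components=2, perplexity=perplexity, random_state=0,
               init="pca").fit_transform(X)
    results[perplexity] = stress1(d_hi, pdist(emb))

best_perplexity = min(results, key=results.get)
```

The exhaustive loop is what makes grid search expensive: its cost grows multiplicatively with each added hyperparameter, which is the inefficiency the agentic approach below targets.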

Automated Refinement: Guiding Intelligence Through Data Landscapes
Agentic AI Pipelines employ Large Language Model (LLM) Agents to automate the process of hyperparameter optimization within Dimensionality Reduction (DR) techniques. This automated exploration significantly improves both the efficiency and effectiveness of DR by systematically testing various parameter combinations. Unlike traditional grid or random search methods, LLM Agents utilize reasoning and iterative refinement to intelligently navigate the hyperparameter space, focusing on configurations most likely to yield optimal results. This approach reduces the computational cost associated with exhaustive searches and accelerates the identification of high-performing DR models tailored to specific datasets, ultimately delivering visualizations with improved clarity and interpretability.
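The loop itself can be sketched as below. The real pipeline queries an LLM for the next hyperparameter setting and a rationale; here `mock_llm_suggest` is a hypothetical rule-based stand-in so the skeleton runs without an API key, and its perplexity heuristic is invented purely for illustration:

```python
# Sketch of the iterative refinement loop with a stubbed "agent".
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE, trustworthiness

X, _ = make_blobs(n_samples=120, n_features=30, centers=4, random_state=0)

def mock_llm_suggest(history):
    """Hypothetical stand-in for the LLM agent: adjust perplexity by score trend."""
    if not history:
        return {"perplexity": 40.0}
    if len(history) == 1 or history[-1]["score"] >= history[-2]["score"]:
        step = 0.7  # last change helped (or first step): keep shrinking
    else:
        step = 1.2  # last change hurt: back off
    new_p = history[-1]["params"]["perplexity"] * step
    return {"perplexity": float(np.clip(new_p, 5.0, 50.0))}

history = []
for _ in range(5):  # the paper reports refined plots within five iterations
    params = mock_llm_suggest(history)
    emb = TSNE(n_components=2, perplexity=params["perplexity"], random_state=0,
               init="pca").fit_transform(X)
    history.append({"params": params,
                    "score": trustworthiness(X, emb, n_neighbors=10)})

best = max(history, key=lambda h: h["score"])
```

Replacing `mock_llm_suggest` with an actual LLM call is where the reasoning and explainability enter: the model sees the score history and returns both the next configuration and a natural-language justification.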
Agentic AI pipelines employ a dual scoring system to evaluate the quality of generated visualizations. Explicit scoring relies on predefined quantitative metrics – such as Trustworthiness and Stress – to provide objective assessments of the visualization’s characteristics. Complementing this, implicit scoring utilizes the Large Language Model (LLM) to perform a holistic evaluation, considering factors beyond the scope of the defined metrics. This LLM-driven assessment provides a subjective, yet informed, judgment of overall visualization quality, allowing the pipeline to balance objective data with nuanced considerations during refinement.
The pipeline combines these quantitative metrics – Trustworthiness, which assesses how faithfully local neighborhoods survive the embedding, and Stress, which quantifies distortion of inter-point distances – with the LLM’s contextual judgment of qualities such as clarity and interpretability. This combined signal lets the pipeline converge rapidly on effective visualization parameters, consistently achieving refined results within five iterations in the reported tests.
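The explicit half of the score can be computed directly. The sketch below uses scikit-learn’s `trustworthiness` and a Stress-1 estimate on a PCA embedding of synthetic data; the equal weighting in the final line is an illustrative assumption, not the paper’s formula:

```python
# Sketch: explicit scoring of an embedding (Trustworthiness + Stress-1).
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import trustworthiness

X, _ = make_blobs(n_samples=200, n_features=40, centers=3, random_state=0)
emb = PCA(n_components=2).fit_transform(X)

trust = trustworthiness(X, emb, n_neighbors=10)  # in [0, 1], higher is better

d_hi, d_lo = pdist(X), pdist(emb)
b = np.dot(d_hi, d_lo) / np.dot(d_lo, d_lo)      # rescale the embedding distances
stress = float(np.sqrt(np.sum((d_hi - b * d_lo) ** 2) / np.sum(d_hi ** 2)))

explicit_score = 0.5 * trust + 0.5 * (1.0 - stress)  # illustrative weighting only
```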

Evaluating the Echo: Quantitative and Qualitative Assessments of Fidelity
Quantitative metrics such as the Silhouette Score and Spearman Correlation are employed to evaluate the quality of dimensionality reduction and clustering techniques by providing numerical assessments of data characteristics. The Silhouette Score, ranging from -1 to 1, measures how well each data point fits within its assigned cluster, with higher values indicating better cluster separation; a score near 1 suggests strong cluster cohesion and separation. Spearman Correlation, a non-parametric measure, assesses the monotonic relationship between the original high-dimensional data and its reduced representation, indicating the degree to which the data’s rank order is preserved during dimensionality reduction – values closer to 1 represent strong preservation. Other metrics include Stress, which quantifies the distortion introduced by reducing dimensionality, and Kruskal’s Stress, a specific formulation often used in non-metric multidimensional scaling. These metrics allow for objective comparison of different dimensionality reduction or clustering approaches, though they do not always fully capture perceptual quality.
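Both headline metrics are available off the shelf; a brief sketch on synthetic data (not the paper’s evaluation) computes the Silhouette Score on the 2-D embedding against known labels and the Spearman correlation between pairwise distances before and after reduction:

```python
# Sketch: Silhouette Score and Spearman distance correlation for an embedding.
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

X, labels = make_blobs(n_samples=200, n_features=40, centers=3, random_state=1)
X2 = PCA(n_components=2).fit_transform(X)

sil = silhouette_score(X2, labels)       # cluster separation in the embedding
rho, _ = spearmanr(pdist(X), pdist(X2))  # rank-order preservation of distances

print(f"silhouette={sil:.3f}  spearman={rho:.3f}")
```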
Traditional quantitative metrics, such as Silhouette Score and correlation coefficients, provide valuable, but incomplete, evaluations of visualization quality. These metrics typically assess specific aspects like cluster separation or dimensionality reduction preservation, but struggle to capture perceptual qualities or higher-level understandings of the visualized data. Consequently, a reliance on these measures alone can fail to identify visualizations that, while numerically “good,” are difficult to interpret or do not effectively communicate underlying patterns. LLM-driven Implicit Scoring addresses this limitation by leveraging large language models to assess visual quality based on a more holistic understanding of the data representation, effectively complementing quantitative analysis and providing a more comprehensive evaluation.
Evaluation of the visualization pipeline yielded a Large Language Model (LLM) Quality Score of 8.00, indicating a high degree of perceived visual quality. This corresponded with a minimized Stress-1 value of 0.323, a metric quantifying distortion in preserving inter-point distances during dimensionality reduction, and a Spearman Correlation of 0.773, indicating strong preservation of the data’s rank-order distances in the reduced space. Together, these results suggest that lower distortion in the embedding coincides with higher perceived visual quality as judged by the LLM.
Beyond Automation: The Trajectory of Insight Discovery
Agentic AI pipelines represent a significant advancement in addressing the complexities of high-dimensional data visualization. These pipelines, built on the principles of artificial intelligence, automate the typically iterative process of data exploration, feature selection, and visualization generation. Unlike traditional methods requiring substantial manual intervention, agentic systems can independently assess data characteristics, intelligently choose appropriate visualization techniques, and refine these visualizations based on defined objectives – all while maintaining a fully reproducible workflow. This scalability is crucial for handling the ever-increasing volume and complexity of modern datasets, particularly in fields where patterns are obscured by numerous variables. By systematically exploring the data space, these pipelines not only accelerate the identification of meaningful insights but also reduce the risk of human bias, offering a more objective and comprehensive understanding of complex phenomena.
The advent of agentic AI pipelines promises a significant leap forward for scientific exploration, particularly within the data-rich realms of genomics, immunology, and materials science. Researchers increasingly confront datasets of immense scale and complexity, often hindering the pace of discovery. This technology addresses this challenge by automating the traditionally manual process of data visualization and insight extraction. In genomics, automated pipelines can rapidly identify patterns in gene expression data, accelerating the understanding of disease mechanisms. Similarly, in immunology, these tools can decipher the intricate interactions within the immune system, potentially leading to new therapeutic strategies. Materials science benefits through accelerated discovery of novel materials with desired properties, as AI can efficiently analyze vast datasets of material compositions and characteristics. Ultimately, this intelligent automation doesn’t replace scientists, but rather empowers them to focus on higher-level interpretation and hypothesis generation, driving innovation at an unprecedented rate.
The increasing volume and complexity of modern datasets present a significant challenge to researchers across numerous fields. Intelligent automation offers a pathway to overcome this hurdle by handling the tedious and time-consuming aspects of data visualization, allowing scientists to focus on interpretation and discovery. This technology doesn’t simply generate charts; it actively explores data, identifies meaningful patterns, and suggests compelling visualizations – effectively acting as a collaborative partner in the research process. Consequently, innovation is accelerated not through increased manual effort, but through the synergistic combination of human expertise and automated insight generation, promising breakthroughs in areas ranging from personalized medicine and climate modeling to advanced materials design and fundamental physics.
The pursuit of optimal data visualization, as detailed in this work, echoes a fundamental truth about complex systems. This pipeline, employing an agentic AI to refine dimensionality reduction hyperparameters, isn’t about achieving a static ‘best’ view, but rather about establishing a process for graceful adaptation. As Alan Kay observed, “The best way to predict the future is to invent it.” The iterative refinement, driven by the LLM agent, embodies this inventive spirit, continually reshaping the visualization to better reveal underlying patterns within single-cell RNA sequencing data. It acknowledges that data, like all systems, is subject to change, and the value lies not in a perfect snapshot, but in the capacity to evolve understanding over time.
The Long View
This work, automating the refinement of data visualization, addresses a symptom, not the disease. Every architecture lives a life, and this one will inevitably succumb to the increasing complexity of the data it attempts to tame. The agentic approach, while promising, merely shifts the burden of understanding – from the algorithm’s parameters to the language model’s reasoning. The true challenge lies not in optimizing for a singular ‘best’ visualization, but in acknowledging the inherent subjectivity and temporal nature of interpretation.
Improvements age faster than one can understand them. Future iterations will likely focus on meta-optimization – agents designing agents – but this accelerates the cycle of obsolescence. A more fruitful, if less glamorous, path might involve explicitly modeling uncertainty and provenance within the visualization itself – a record not just of what is seen, but how it came to be seen, and with what assumptions.
The field must confront the fact that clarity is not a destination, but a fleeting moment. Each reduction in dimensionality is a loss of information, a simplification of reality. The agent, in its quest for elegance, risks obscuring the very nuances it seeks to reveal. The longevity of this approach will depend not on its ability to solve the problem of data interpretation, but on its capacity to gracefully acknowledge its own limitations.
Original article: https://arxiv.org/pdf/2604.15319.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/