When Text or Speech Is Superior for LLM-Visualization Interaction

Determine the tasks, contexts, and user characteristics under which textual or spoken natural language constitutes a superior input modality for interacting with data visualizations via large language models, compared to direct manipulation or other modalities.

Background

The survey discusses how LLMs reintroduce textual and spoken modalities for interacting with visualizations, but it notes uncertainty about when these modalities outperform alternatives. This uncertainty persists even though LLMs address many earlier shortcomings of language-based interaction.

Clarifying when text and speech outperform direct manipulation and other modalities for LLM-mediated visualization interaction is foundational for designing effective multimodal systems and for matching interaction techniques to user needs, tasks, and environments.

References

Some of the previous limitations of these modalities are now overcome by LLMs, but we do not yet know in what situations text and speech constitute superior modalities for interacting with visualization through LLMs.

State of the Art of LLM-Enabled Interaction with Visualization (2601.14943 - Brossier et al., 21 Jan 2026), Subsection: Interaction modalities (within Classification)