Chart What I Say: Exploring Cross-Modality Prompt Alignment in AI-Assisted Chart Authoring

Published 7 Apr 2024 in cs.HC (arXiv:2404.05103v1)

Abstract: Recent chart-authoring systems, such as Amazon Q in QuickSight and Copilot for Power BI, demonstrate an emergent focus on supporting natural language input for creating charts that share meaningful insights from data. Currently, chart-authoring systems tend to integrate voice input by relying on speech-to-text transcription, processing spoken and typed input identically. However, cross-modality input comparisons in other interaction domains suggest that spoken and typed interactions can differ notably in structure, reflecting variations in user expectations based on interface affordances. In this work, we therefore compare spoken and typed instructions for chart creation. Our findings suggest that while both text and voice instructions cover chart elements and element organization, voice descriptions exhibit a greater variety of command formats and element characteristics, as well as more complex linguistic features. Based on these findings, we developed guidelines for designing voice-based chart-authoring systems, along with additional features that existing text-based systems can incorporate to support the speech modality.
