Let the Chart Spark: Embedding Semantic Context into Chart with Text-to-Image Generative Model
Abstract: Pictorial visualization seamlessly integrates data and semantic context into a visual representation, conveying complex information in a manner that is both engaging and informative. Extensive studies have been devoted to developing authoring tools that simplify the creation of pictorial visualizations. However, mainstream works mostly follow a retrieving-and-editing pipeline that relies heavily on visual elements retrieved from a dedicated corpus, which often compromises data integrity. Text-guided generation methods are emerging, but their applicability may be limited by their reliance on predefined recognized entities. In this work, we propose ChartSpark, a novel system that embeds semantic context into charts based on a text-to-image generative model. ChartSpark generates pictorial visualizations conditioned on both the semantic context conveyed in textual inputs and the data information embedded in plain charts. The method is generic for both foreground and background pictorial generation, satisfying the design practices identified from an empirical study of existing pictorial visualizations. We further develop an interactive visual interface that integrates a text analyzer, an editing module, and an evaluation module to enable users to generate, modify, and assess pictorial visualizations. We experimentally demonstrate the usability of our tool, and conclude with a discussion of the potential of combining text-to-image generative models with an interactive interface for visualization design.