
Unstructured and structured data: Can we have the best of both worlds with large language models?

Published 25 Apr 2023 in cs.DB and cs.CL (arXiv:2304.13010v2)

Abstract: This paper presents an opinion on the potential of using LLMs to query both unstructured and structured data. It also outlines research challenges in building question-answering systems that span both types of data.
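To make the paper's premise concrete, the following is a minimal, hypothetical sketch of a question-answering pipeline that spans both kinds of data: structured facts live in a relational table queried via SQL, while unstructured facts live in free-text documents served by retrieval. The LLM steps are stubbed out (the hard-coded SQL stands in for text-to-SQL generation, and keyword overlap stands in for learned retrieval plus answer synthesis); all table names, documents, and the routing heuristic are invented for illustration and do not come from the paper.

```python
# Hypothetical hybrid QA sketch: route a question either to a SQL query
# over structured data or to text retrieval over unstructured data.
# LLM calls are stubbed; a real system would use a model for text-to-SQL,
# for routing, and for reading the retrieved passage.
import sqlite3

# --- Structured side: a tiny in-memory relational table ---
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ada", 120000), ("Grace", 130000)],
)

# --- Unstructured side: a tiny document store ---
documents = [
    "Ada leads the compilers team and joined in 2019.",
    "Grace works on distributed systems and mentors interns.",
]

def answer_structured(question: str) -> str:
    # A text-to-SQL model would emit this query from the question;
    # here it is hard-coded for the salary question this toy handles.
    sql = "SELECT name, MAX(salary) FROM employees"
    name, salary = conn.execute(sql).fetchone()
    return f"{name} has the highest salary ({salary})."

def answer_unstructured(question: str) -> str:
    # Retrieval stand-in: pick the document with the largest keyword
    # overlap; a real system would rank with embeddings and then have
    # the LLM compose an answer from the retrieved text.
    q_words = set(question.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def route(question: str) -> str:
    # Crude routing heuristic standing in for an LLM's decision about
    # which data source the question belongs to.
    if "salary" in question.lower():
        return answer_structured(question)
    return answer_unstructured(question)

print(route("Who has the highest salary?"))
print(route("What does Grace work on?"))
```

The sketch makes the paper's central tension visible: the structured path gives exact, aggregable answers but needs correct query generation, while the unstructured path degrades gracefully but returns passages rather than precise values.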
