Modelling the semantics of text in complex document layouts using graph transformer networks

Published 18 Feb 2022 in cs.CL, cs.AI, and cs.LG | (2202.09144v1)

Abstract: Representing structured text from complex documents typically calls for different machine learning techniques, such as LLMs for paragraphs and convolutional neural networks (CNNs) for table extraction, which prevents drawing links between text spans from different content types. In this article we propose a model that approximates the human reading pattern of a document and outputs a unique semantic representation for every text span, irrespective of the content type it is found in. We base our architecture on a graph representation of the structured text, and we demonstrate not only that we can retrieve semantically similar information across documents but also that the embedding space we generate captures useful semantic information, similar to LLMs that work only on text sequences.
