Limits of Learning from Text-Only Corpora

Determine which competencies large language models can and cannot learn solely from large text corpora, without multi-modal data or embodied interaction, in light of their surprising pattern of successes and failures.

Background

In the Day Two general discussion, participants asked Andrew Lampinen what can and cannot be learned from large text corpora alone. Lampinen was unsure where the boundaries of text-only learning lie, but noted the surprising pattern of successes and failures in LLMs: they often fail at seemingly easy tasks while succeeding at seemingly difficult ones.

Clarifying these limits would help determine whether additional modalities, embodiment, or structured feedback are necessary to achieve specific forms of generalization and understanding.

References

The open discussion began with a request for Lampinen's own opinion about what can and can't be learned from large language corpora. Lampinen was unsure, but noted the often surprising pattern of success and failure in LLMs: these systems often fail in some seemingly easy tasks while succeeding in some seemingly difficult tasks.

Embodied, Situated, and Grounded Intelligence: Implications for AI (2210.13589 - Millhouse et al., 2022) in General Discussion — Day Two