Papers
Topics
Authors
Recent
Search
2000 character limit reached

Cornstarch: Distributed Multimodal Training Must Be Multimodality-Aware

Published 14 Mar 2025 in cs.DC | (2503.11367v2)

Abstract: Multimodal LLMs (MLLMs) extend the capabilities of LLMs by combining heterogeneous model architectures to handle diverse modalities like images and audio. However, this inherent heterogeneity in MLLM model structure and data types makes makeshift extensions to existing LLM training frameworks unsuitable for efficient MLLM training. In this paper, we present Cornstarch, the first general-purpose distributed MLLM training framework. Cornstarch facilitates modular MLLM construction, enables composable parallelization of constituent models, and introduces MLLM-specific optimizations to pipeline and context parallelism for efficient distributed MLLM training. Our evaluation shows that Cornstarch outperforms state-of-the-art solutions by up to $1.57\times$ in terms of training throughput. Cornstarch is an open-source project available at https://github.com/cornstarch-org/Cornstarch.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.

Tweets

Sign up for free to view the 1 tweet with 0 likes about this paper.