Towards Multi-Modal Mastery: A 4.5B Parameter Truly Multi-Modal Small Language Model

Published 8 Nov 2024 in cs.LG, cs.AI, cs.CL, cs.CV, cs.SD, and eess.AS | (2411.05903v1)

Abstract: We present a novel 4.5B-parameter small language model that can handle multiple input and output modalities, including text, images, videos, and audio. Despite its small size, the model achieves near state-of-the-art performance on a variety of tasks, demonstrating the potential of multi-modal models to tackle complex real-world problems. Our approach leverages recent advancements in language modeling and multi-task learning to create a versatile and high-performing model that can even be deployed for edge inference. Experimental results show the model's strong performance across multiple benchmarks, paving the way for further progress in multi-modal artificial intelligence.

Authors (2)
