Small Singular Values Matter: A Random Matrix Analysis of Transformer Models

Published 23 Oct 2024 in cs.LG and cond-mat.dis-nn | (2410.17770v2)

Abstract: As large language models (LLMs) become increasingly central to AI applications, understanding their inner workings is essential. In this work, we analyze the spectra of weight matrices in pretrained transformer models through the lens of random matrix theory (RMT) to uncover learned structures. We find that certain regions of the weight matrix spectra deviate markedly from RMT predictions, indicating richer feature encoding. By comparing the corresponding singular vectors to eigenvectors of activation covariance matrices, we observe substantial overlap precisely where the spectra deviate from RMT expectations. Our analysis further reveals the important role of small singular values in LLMs: these values carry significant information, as shown by the increase in perplexity when they are removed from the model. Although these small values may appear unimportant prior to task-specific fine-tuning, removing them afterward significantly degrades performance, revealing that fine-tuning refines the model primarily in these spectral regions. These results emphasize the critical role of small singular values and suggest that removing them in an already aligned transformer can be detrimental, as it may compromise model alignment.
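
The two operations the abstract relies on, comparing a weight matrix's singular values against an RMT baseline and removing its smallest singular values, can be illustrated with a minimal NumPy sketch. This is not the authors' code. It assumes the simplest null model (i.i.d. entries, for which the Marchenko-Pastur law gives the asymptotic edges of the singular-value bulk), and the function names `marchenko_pastur_edges`, `fraction_outside_bulk`, and `drop_smallest_singular_values` are hypothetical.

```python
import numpy as np


def marchenko_pastur_edges(n_rows, n_cols, sigma=1.0):
    """Asymptotic edges of the singular-value bulk for an n_rows x n_cols
    matrix with i.i.d. entries of variance sigma**2 (assumed null model)."""
    lower = sigma * abs(np.sqrt(n_rows) - np.sqrt(n_cols))
    upper = sigma * (np.sqrt(n_rows) + np.sqrt(n_cols))
    return lower, upper


def fraction_outside_bulk(W, tol=0.1):
    """Fraction of singular values of W lying outside the Marchenko-Pastur
    bulk, with a relative tolerance for finite-size edge fluctuations."""
    s = np.linalg.svd(W, compute_uv=False)
    # Assumption: the entry scale is estimated from W itself.
    lower, upper = marchenko_pastur_edges(*W.shape, sigma=W.std())
    outside = (s < lower * (1 - tol)) | (s > upper * (1 + tol))
    return float(outside.mean())


def drop_smallest_singular_values(W, k):
    """Return a copy of W with its k smallest singular values set to zero."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s = s.copy()
    if k > 0:  # guard: s[-0:] would select the whole array
        s[-k:] = 0.0
    return (U * s) @ Vt  # scale the columns of U by s, then recombine
```

On a purely random matrix `fraction_outside_bulk` should be close to zero, while a matrix with learned (or planted) structure shows singular values outside the bulk. To reproduce the kind of ablation the abstract describes, one would apply something like `drop_smallest_singular_values` to a model's weight matrices and measure perplexity before and after; that evaluation step is not shown here.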
