
Foundation Models and Fair Use

Updated 9 February 2026
  • Foundation models are large, pretrained systems whose training can be viewed as a lossy compression of extensive datasets, a framing that raises complex legal and ethical questions under fair use.
  • Empirical studies reveal that these models can inadvertently memorize and reproduce copyrighted content, blurring the line between transformation and infringement.
  • Robust technical measures—such as data and output filtering, instance attribution, and differential privacy—are critical for mitigating legal risks and ensuring fair use compliance.

Foundation models—very large pretrained deep learning systems (e.g., LLMs, diffusion models)—are typically trained by ingesting vast datasets, much of whose content is copyrighted. The intersection of these models with fair use doctrine encompasses complex legal, technical, and policy questions. Recent literature frames the weights of foundation models as compressed representations of their training corpora, analyzes the spectrum between reproduction and derivation under copyright law, catalogs empirical memorization risks, explores technical mitigation strategies, and proposes doctrinal evolutions to address recursive model pipelines and evidentiary burdens. Foundational court precedents and empirical studies demonstrate that both the construction and deployment of these models invoke nuanced applications of the fair use doctrine.

1. Foundation Model Training as Data Compression

The training process of foundation models can be modeled as a form of lossy compression: the parameters $W$ are optimized to minimize the reconstruction loss over a dataset $X$, for instance cross-entropy for language modeling,

$$L(X;W) = -\sum_{(x,t)\in X}\sum_i \log P(x_i \mid x_{i-k:i-1}, t; W)$$

or mean-squared error in autoencoder frameworks. From the information-bottleneck perspective, one interprets $W$ as a bottleneck representation $\hat{H}$, balancing $I(X;W)$ (amount of input information retained) against $I(Y;W)$ (predictivity, where $Y \approx X$ in self-supervised settings). Empirical analyses show that models like LLaMA3-70B compress on the order of $225\times10^{12}$ bits of token data into $1.1\times10^{12}$ bits of weights, a compression factor near 200:1, implying substantial but imperfect retention of original content (Franceschelli et al., 2024).
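The quoted compression factor can be sanity-checked with a few lines of arithmetic. The corpus and weight figures come from the text above; the 16-bit (bf16) parameter storage is an assumption introduced here, not stated in the source.

```python
# Sanity check of the ~200:1 compression-factor claim for LLaMA3-70B.
corpus_bits = 225e12      # information content of the training tokens (from text)
n_params = 70e9           # LLaMA3-70B parameter count
bits_per_param = 16       # assumption: bf16 storage, not stated in the source

weight_bits = n_params * bits_per_param   # ~1.12e12 bits, matching the text's 1.1e12
ratio = corpus_bits / weight_bits
print(f"weights: {weight_bits:.2e} bits, compression factor ~ {ratio:.0f}:1")
```

Under the bf16 assumption the weight budget and the roughly 200:1 ratio reported in the text are mutually consistent.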

This lossy process results in $W$ not being a token-for-token record of $X$ but a transformed statistical abstraction. Nonetheless, if $W$ encodes $X$ so faithfully that certain "decoding keys" (prompts) can recover long verbatim passages, it approaches the functional category of reproduction under copyright law.

2. Reproduction, Derivative Works, and Fair Use

The core legal distinction lies between reproduction and derivative work. Under U.S. and EU law, if model weights $W$ enable, through direct or indirect means, the regeneration of copyrighted content with substantial similarity to $X$, this constitutes reproduction, invoking the copyright holder's exclusive right to copy. More commonly, the compressed, statistically filtered copy embodied in $W$ transforms the content of $X$, resulting in derivative works: transformations or adaptations that nonetheless trigger a separate copyright right (the "adaptation right") (Franceschelli et al., 2024).

Fair use remains a four-factor inquiry under 17 U.S.C. §107:

  • Purpose and character: Training typically constitutes a non-expressive, transformative, and often scientific use, weighing in favor of fair use, especially when outputs are not direct substitutes for the original. However, commercial generation of market-substituting outputs from the same model tips this factor against fair use.
  • Nature of the work: Factual or non-creative content has weaker protection; expressive works (e.g., novels, artworks) increase risk.
  • Amount and substantiality: Occasional short verbatim snippets (<90 characters, or single sentences) may be de minimis; reproduction of entire creative segments or substantial portions counts against fair use.
  • Market effect: Outputs that substitute for or depreciate the market for the original work (e.g., full lyric generation, code copying) strongly disfavor fair use.

Courts have analogized ML training (absent regurgitative generation) to "intermediate copying" (Sega v. Accolade; Sony v. Connectix), often finding it transformative. Where model outputs recall the "heart" of a copyrighted work (Harper & Row v. Nation), market harm is presumed, and fair use is unlikely (Henderson et al., 2023).

3. Empirical Memorization and Regurgitation

Empirical experiments with foundation models (GPT-Neo, GPT-3, OPT-175B, ChatGPT, GPT-4) demonstrate nonzero risks of memorization and regurgitation:

  • Models reproduce long contiguous text spans from popular books under low-temperature ($T=0.2$) decoding and targeted prompts; the risk scales with model size and context window.
  • With code generation (Codex models), functions from the Linux kernel (GPL-2.0) were reproduced with nontrivial overlap scores (MOSSPlus > 20%, mean overlap 45-60% on large matches).
  • In image generation, style transfer requests frequently invoke the names of living artists and franchises, raising derivative-work issues (Henderson et al., 2023).

These behaviors indicate both a technical and legal vulnerability: even if training is primarily lossy, aggressive overfitting or insufficient regularization can enable functional reproduction, breaching the boundaries of fair use.
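A minimal way to quantify the verbatim regurgitation risk described above is to measure the longest contiguous span shared between a model output and a candidate source document. The sketch below uses Python's standard difflib; the two passages are hypothetical stand-ins, and the ~90-character cutoff echoes the de minimis threshold mentioned in the fair-use factors.

```python
from difflib import SequenceMatcher

def longest_verbatim_span(output: str, source: str) -> str:
    """Return the longest contiguous character span shared by the two texts."""
    m = SequenceMatcher(None, output, source, autojunk=False)
    match = m.find_longest_match(0, len(output), 0, len(source))
    return output[match.a : match.a + match.size]

# Hypothetical example: a generation that copies part of a known passage.
source = "It was the best of times, it was the worst of times, it was the age of wisdom."
output = "The novel opens: it was the best of times, it was the worst of times, indeed."

span = longest_verbatim_span(output, source)
print(len(span), repr(span))
# Spans well past ~90 characters would undercut a de-minimis defense.
```

In practice such checks run against an indexed corpus rather than a single document, but the per-pair measurement is the same.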

4. Technical and Policy Mitigations

A diverse suite of technical measures is now deployed to minimize legal exposure:

| Mitigation Strategy | Primary Target (Fair Use Factor) | Limitation/Constraint |
|---|---|---|
| Data filtering | Amount, Nature | May reduce utility; imperfect license scope |
| Output filtering | Amount, Market Effect | Vulnerable to evasion/rephrasing; over-filtering harms utility |
| Instance attribution | Amount, Purpose | High computational overhead; not always accurate |
| Differential privacy / NAF | Substantiality | Reduces memorization but can degrade utility |
| RLHF | Purpose | Requires calibrated human feedback; may not fully prevent regurgitation |

  • Data filtering: Restricting training to licensed/public-domain corpora, duplicate removal (MinHash, LSH), and good-faith observance of robots.txt reduces raw infringement risk.
  • Output filtering: At generation time, models can be configured to block outputs with high $n$-gram or longest-common-subsequence (LCS) overlap with any training data.
  • Instance attribution: Use influence functions, leave-one-out retraining, and retrieval-augmented logs to associate outputs with potential training examples.
  • Differential privacy (DP): Algorithms such as DP-SGD enforce $(\varepsilon, \delta)$-DP, bounding the impact of any single training instance, but may impair downstream performance at strong privacy levels.
  • Learning from human feedback (RLHF): Annotators judge whether outputs are sufficiently transformative and penalize near-verbatim cases, shifting the model policy toward transformativeness.
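The output-filtering idea above can be sketched as a simple word $n$-gram check: block a candidate generation whose overlap with a reference corpus exceeds a threshold. The 8-gram window, 20% threshold, and example texts are illustrative assumptions, not values from the cited papers.

```python
def ngrams(text: str, n: int = 8):
    """Set of word n-grams in a text (lowercased, whitespace-tokenized)."""
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def blocked(candidate: str, corpus_docs: list[str], n: int = 8,
            threshold: float = 0.2) -> bool:
    """Block the candidate if too many of its n-grams appear verbatim in the corpus."""
    cand = ngrams(candidate, n)
    if not cand:
        return False  # too short to contain any n-gram
    corpus = set().union(*(ngrams(d, n) for d in corpus_docs))
    overlap = len(cand & corpus) / len(cand)
    return overlap > threshold

# Hypothetical usage: a near-verbatim generation is blocked, a paraphrase passes.
doc = "the quick brown fox jumps over the lazy dog while the cat sleeps by the fire"
print(blocked("the quick brown fox jumps over the lazy dog while the cat naps", [doc]))  # True
print(blocked("a speedy auburn fox leaps above a sleepy hound", [doc]))                  # False
```

As the table notes, such filters are vulnerable to light rephrasing, which drops the exact $n$-gram overlap to zero; LCS- or embedding-based variants trade recall for cost.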

There is ongoing discussion on integrating such mitigations into legal safe harbors, positing that rigorous technical risk-reduction should weigh favorably in fair use analysis or statutory immunity (Henderson et al., 2023, Franceschelli et al., 2024).
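The MinHash-based duplicate removal mentioned under data filtering can be sketched as follows. The 5-word shingles and 64 seeded hash functions are illustrative parameter choices, and md5 here merely stands in for a family of independent hash functions; the example documents are hypothetical.

```python
import hashlib

def shingles(text: str, k: int = 5):
    """Set of k-word shingles from a document."""
    toks = text.lower().split()
    return {" ".join(toks[i:i + k]) for i in range(len(toks) - k + 1)}

def minhash_signature(text: str, num_hashes: int = 64) -> list[int]:
    """One minimum per seeded hash function over the document's shingles."""
    return [
        min(int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)
            for s in shingles(text))
        for seed in range(num_hashes)
    ]

def estimated_jaccard(a: str, b: str) -> float:
    """Fraction of matching signature slots estimates shingle-set Jaccard similarity."""
    sa, sb = minhash_signature(a), minhash_signature(b)
    return sum(x == y for x, y in zip(sa, sb)) / len(sa)

# Near-duplicate passages yield similar signatures; unrelated ones do not.
d1 = "foundation models are trained on vast corpora of text scraped from the public web"
d2 = "foundation models are trained on vast corpora of text collected from the public web"
d3 = "minhash sketches let a pipeline drop near duplicate documents before training begins"
print(estimated_jaccard(d1, d2), estimated_jaccard(d1, d3))
```

Production pipelines bucket these signatures with locality-sensitive hashing (LSH) so candidate duplicate pairs are found without comparing every pair of documents.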

5. Recursive Pipelines and the AI-FOPT Doctrine

Recursive AI pipelines—where foundation models are fine-tuned or distilled using synthetic data generated by ancestor models—create an evidentiary "laundering" challenge. Copyrighted material, once present in a foundational model, may propagate in attenuated form through several generations, evading traditional substantial similarity or access tests.

The AI-FOPT (Fruit of the Poisonous Tree) doctrine adapts criminal-law principles to assert a rebuttable presumption of taint for any model principally derived from an adjudicated infringing model:

  • Trigger: A finding of infringement against $AI_1$ renders it a "poisonous tree."
  • Presumption: Downstream models ($AI_2$, $AI_3$, …) that inherit weights, synthetic data, or outputs carry a presumption of taint.
  • Rebuttal paths: Developers must produce provenance packets (cryptographically hashed manifests, licenses, lineage graphs), or demonstrate robust machine unlearning/heavy retraining to remove prior influence.

Formally, for generation $k$, the residual copyrighted signal can be modeled as $C_k = \alpha^k C_0 + \sum_{i=1}^k \alpha^{k-i}\varepsilon_i$ (with $0<\alpha<1$, $C_0$ the original fraction, and $\varepsilon_i$ the clean/noise injection at step $i$), implying long-lived but statistically diluted persistence (Mukherjee et al., 6 Jan 2026). This standard seeks to shift enforcement from impossible forensic content-matching to administrable lineage-focused evidence.
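The persistence claim can be checked numerically by iterating the recursion behind the closed form, C_k = alpha * C_{k-1} + eps_k: the original component decays geometrically while injected signal accumulates. The values alpha = 0.7, C_0 = 1.0, and a constant eps = 0.05 are illustrative assumptions, not parameters from the cited work.

```python
# Iterate C_k = alpha * C_{k-1} + eps, the recursion whose closed form is
# C_k = alpha^k * C_0 + sum_{i=1..k} alpha^(k-i) * eps_i.
alpha, c0, eps = 0.7, 1.0, 0.05   # illustrative values (assumptions)

c = c0
for k in range(1, 9):
    c = alpha * c + eps
    original_part = alpha**k * c0     # residual copyrighted signal at generation k
    print(f"gen {k}: C_k = {c:.3f}, original component = {original_part:.4f}")
```

After eight generations the original component has fallen below 6% of its starting value yet is still strictly positive, which is the "long-lived but statistically diluted" persistence the doctrine targets.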

6. Best Practices and Future Directions

Emerging consensus best practices include:

  • Rigorous data documentation, foregrounding provenance and licensing.
  • Regularization, privacy noise, and memorization suppression during training.
  • Prompt- and deployment-time output filtering, especially for high-risk domains (poetry, lyrics, code).
  • Documentation of transformative intentions, market non-substitution, and adversarial mitigation protocols.
  • Active pursuit of new technical methods for semantic similarity assessment, quantifying "transformative enough," fact/expression boundary recognition, and scalable instance attribution (Franceschelli et al., 2024, Henderson et al., 2023).

Policy mechanisms (statutory licensing, safe harbors for strong mitigation, rights-holder compensation) are proposed as complements. There is a broad call for co-evolution of law and technology to calibrate incentives for innovation, respect for rights-holders, and pragmatic risk management.

Although distinct from copyright, fairness concerns are increasingly intertwined with foundation model deployment. In the recommendation domain, prompt-based adversarial "Counterfactually-Fair-Prompt" (CFP) methods have been developed to achieve user-side counterfactual fairness—ensuring outputs do not encode or reveal protected user attributes (e.g., gender, age) (Hua et al., 2023). Such methods harness prompt engineering, adversarial objectives on internal representations, and lightweight prompt-mixture modules, and achieve substantial fairness gains without extensive backbone retraining. The interplay between algorithmic fairness and copyright issues reflects the expanding ethical and legal terrain for large foundation models.

