Greedy Coordinate Gradient (GCG) Attack
- The GCG attack is a gradient-based, coordinate-wise optimization method that constructs adversarial token sequences to bypass LLM safety filters.
- It approximates gradients via finite differences over candidate tokens, simplifying the high-dimensional discrete optimization problem of adversarial suffix generation.
- Mask-GCG extends the approach by dynamically pruning redundant tokens, leading to shorter suffixes and reduced computation while maintaining high attack success rates.
The Greedy Coordinate Gradient (GCG) attack is a gradient-based, coordinate-wise optimization algorithm for constructing adversarial token sequences—most commonly adversarial suffixes—that induce LLMs to generate responses that bypass alignment constraints, such as content refusals or safety filters. GCG has emerged as a general and effective method in automated LLM jailbreak red-teaming and adversarial prompting. Recent research has proposed several extensions and acceleration techniques, with Mask-GCG introducing dynamic pruning of redundant tokens, establishing that most—but not all—tokens in optimized adversarial suffixes contribute substantially to attack effectiveness (Mu et al., 8 Sep 2025).
1. Formalization: Objective and Algorithmic Framework
The core problem addressed by GCG is to find a discrete token sequence $S = (s_1, \dots, s_L)$ of length $L$ over the model's vocabulary $V$ that optimizes a target loss function $\mathcal{L}(S)$. Typically, $\mathcal{L}$ is the cross-entropy (the negative log-probability) of a chosen harmful or affirmative continuation $y^\star$ when the model is conditioned on a given prompt $x$ concatenated with $S$. The formal optimization is:

$$S^\star = \arg\max_{S \in V^L} \log p_\theta(y^\star \mid x \oplus S)$$

or, equivalently for minimization conventions,

$$S^\star = \arg\min_{S \in V^L} \mathcal{L}(S), \qquad \mathcal{L}(S) = -\log p_\theta(y^\star \mid x \oplus S).$$
This is a high-dimensional, discrete optimization problem; for $V$ of size 50K and suffix length $L$, the search space is $|V|^L$ (roughly $10^{94}$ sequences for $L = 20$).
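As a sanity check on that scale, the size of the search space can be computed in log form; the 50K vocabulary is the figure from the text, and the 20-token suffix matches the lengths reported in the experiments below:

```python
import math

# Illustrative scale of the discrete search space |V|^L,
# assuming a 50K-token vocabulary and a 20-token suffix.
vocab_size = 50_000
L = 20

# Work in log10 to avoid materializing the astronomically large power.
log10_size = L * math.log10(vocab_size)
print(f"|V|^L ~ 10^{log10_size:.0f} candidate suffixes")  # ~ 10^94
```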
GCG addresses this intractability by iteratively improving $S$ via greedy, coordinate-wise updates. At each iteration, for every coordinate $i \in \{1, \dots, L\}$, the algorithm computes an approximate gradient—typically using finite differences—by considering the effect on $\mathcal{L}$ of replacing $s_i$ with candidate tokens $v \in V$:

$$\Delta_{i,v} = \mathcal{L}(s_1, \dots, s_{i-1}, v, s_{i+1}, \dots, s_L) - \mathcal{L}(S).$$

The coordinate–token pair $(i^\star, v^\star)$ that yields the largest loss reduction (or gain, depending on the maximization/minimization convention) is chosen, and $s_{i^\star}$ is set to $v^\star$. The process is repeated until no further improvement is possible or a maximum number of steps is reached (Mu et al., 8 Sep 2025).
2. Coordinate Descent and Gradient Approximation
Since tokens are inherently discrete, GCG cannot perform standard continuous optimization. Instead, it relies on coordinate descent with finite-difference gradient proxies. For each position $i$, the local "pseudo-gradient" is approximated by evaluating the change in $\mathcal{L}$ for a (subsampled) set of top-$k$ candidate tokens. The coordinate with the maximal descent is updated.
The time complexity per iteration is $O(L \cdot |V|)$ in the exhaustive case, or $O(L \cdot k)$ when the search is narrowed to top-$k$ candidates per coordinate. Over $T$ optimization steps, the total complexity is $O(T \cdot L \cdot k)$ (Mu et al., 8 Sep 2025).
A simplified pseudocode of the core GCG loop is:
```
Input: model M, initial suffix S of length L, iterations T
for t in 1...T:
    best_gain ← 0
    best_i, best_v ← None
    for i in 1...L:
        for v in sample_top_k_gradients(i):
            gain ← L(S with s_i ← v) - L(S)
            if gain > best_gain:
                best_gain, best_i, best_v ← gain, i, v
    if best_gain ≤ 0: break
    set s_{best_i} ← best_v
return S
```
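A runnable, model-free instantiation of this loop is sketched below. The scoring function is a toy stand-in for $\mathcal{L}$ (real GCG queries the target LLM and uses gradient information to shortlist candidates), and the exhaustive inner sweep corresponds to the $O(L \cdot |V|)$ case:

```python
def greedy_coordinate_search(score, suffix, vocab, max_iters=50):
    """Greedy coordinate ascent over a discrete token sequence.

    `score` is a toy stand-in for the attack objective L(S); real GCG
    narrows the inner candidate loop to top-k tokens ranked by a
    one-hot gradient approximation from the target model.
    """
    suffix = list(suffix)
    for _ in range(max_iters):
        best_gain, best_i, best_v = 0.0, None, None
        for i in range(len(suffix)):
            for v in vocab:  # exhaustive sweep: the O(L * |V|) case
                candidate = suffix[:i] + [v] + suffix[i + 1:]
                gain = score(candidate) - score(suffix)
                if gain > best_gain:
                    best_gain, best_i, best_v = gain, i, v
        if best_i is None:  # no single-token swap improves the objective
            break
        suffix[best_i] = best_v  # apply only the best coordinate update
    return suffix

# Toy objective: number of positions matching a fixed target sequence.
target = [3, 1, 4, 1, 5]
score = lambda s: sum(a == b for a, b in zip(s, target))

result = greedy_coordinate_search(score, [0] * 5, vocab=range(10))
print(result)  # -> [3, 1, 4, 1, 5]
```

Because only the single best coordinate is updated per outer iteration, the toy run needs one iteration per wrong position before the no-gain check terminates the loop.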
3. Mask-GCG: Token Pruning and Adaptive Masking
Mask-GCG extends GCG by learning a (soft) binary mask over the suffix positions, dynamically identifying coordinates that are high- or low-impact with respect to the loss. Each position $i$ has a logit $m_i$, mapped via a sigmoid to an update probability $p_i = \sigma(m_i)$. At each pruning interval (e.g., every 10 steps), positions with $p_i < \tau$, for a pruning threshold $\tau$, are removed from the suffix and mask vector.
The joint optimization alternates between:
- Mask update: Optimizing the mask logits $m$ using Adam on a composite loss,

$$\mathcal{L}_{\text{mask}} = \mathcal{L}(S; p) + \lambda \lVert p \rVert_1,$$

with $\lambda \lVert p \rVert_1$ as an $\ell_1$ sparsity penalty.
- GCG token update: Running a round of standard GCG on the active (unpruned) positions (Mu et al., 8 Sep 2025).
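A minimal numerical sketch of the mask-update half of this alternation, assuming per-position importance scores are available (in Mask-GCG they are measured through the model's loss; here they are synthetic) and using plain gradient descent in place of Adam:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def learn_mask(importance, lam=0.5, tau=0.1, lr=1.0, steps=200):
    """Toy mask update: keep high-importance positions under an l1 penalty.

    `importance` is a synthetic stand-in for each position's measured
    contribution to the attack loss; `lam`, `tau`, `lr` are illustrative.
    """
    m = np.zeros_like(importance)         # one logit per suffix position
    for _ in range(steps):
        p = sigmoid(m)                    # keep-probabilities p_i
        # d/dp of the composite loss  -(importance . p) + lam * ||p||_1
        grad_p = -importance + lam
        m -= lr * grad_p * p * (1.0 - p)  # chain rule through the sigmoid
    keep = sigmoid(m) >= tau              # prune positions below threshold
    return keep

importance = np.array([0.9, 0.05, 0.8, 0.02, 0.7])
print(learn_mask(importance))  # low-impact positions (indices 1 and 3) are pruned
```

Positions whose importance exceeds the sparsity weight `lam` drive their logits up and survive; the rest decay toward zero probability and are masked out.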
Token pruning reduces both the search-space size and the computational resources required per iteration.
4. Experimental Insights: Suffix Redundancy, Efficiency, and Attack Success
Empirically, Mask-GCG demonstrates the existence of significant token redundancy in standard GCG-optimized adversarial suffixes:
| Model & Variant | Suffix Length | Suffix Compression Ratio (SCR) | ASR (orig.) | ASR (Mask-GCG) |
|---|---|---|---|---|
| Llama-2-7B + GCG | 30 | 9.9% (→27 tokens) | 64% | 62% |
| Llama-13B + I-GCG | 30 | 5.4% | 100% | ≥99% |
| Vicuna-7B + Ample-GCG | 20 | 6.5% | 100% | 98% |
On average, 7–10% of suffix tokens were removable with negligible change in cross-entropy loss or attack success rate (ASR). In extreme cases, up to 40% of a 30-token suffix could be pruned without affecting ASR (remaining at 100%). Across models and GCG variants, computational runtime decreased by 16.8% (e.g., 935 s → 780 s for Llama-2-7B with $L = 30$) (Mu et al., 8 Sep 2025).
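For reference, the Suffix Compression Ratio (SCR) reported in the table is simply the fraction of suffix tokens removed by pruning (the helper name below is illustrative):

```python
def suffix_compression_ratio(original_len, pruned_len):
    """Fraction of suffix tokens removed by pruning (SCR)."""
    return (original_len - pruned_len) / original_len

# Llama-2-7B + GCG row: a 30-token suffix pruned to 27 tokens.
print(f"{suffix_compression_ratio(30, 27):.1%}")  # -> 10.0% (matches the table's ~9.9% average)
```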
5. Best Practices and Theoretical Implications
The observed redundancy indicates that the adversarial signal needed to induce jailbreak outputs typically concentrates on a majority subset of the suffix, but a minority of low-impact positions can be pruned aggressively, yielding a shorter and more stealthy attack vector.
Mask-GCG exposes several actionable guidelines for adversarial optimization:
- Apply dynamic masking early to identify unimportant positions.
- Tune the regularization coefficient $\lambda$ and the pruning threshold $\tau$ empirically for the target model.
- Prune gradually and allow rollback if loss or ASR degrades.
- For computational efficiency, run GCG with pruning at fixed intervals (e.g., every 10 iterations) and restart the optimizer after suffix truncation.
These steps lead to both computational gains and qualitative improvements in attack stealth, without compromising effectiveness (Mu et al., 8 Sep 2025).
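The prune-then-rollback step from the guidelines above can be sketched as follows (the function name and the `slack` tolerance are illustrative, not from the paper):

```python
def prune_with_rollback(suffix, mask_probs, loss_fn, tau=0.1, slack=0.05):
    """Drop positions whose keep-probability falls below tau, but roll
    back to the unpruned suffix if the attack loss degrades too much."""
    baseline = loss_fn(suffix)
    pruned = [tok for tok, p in zip(suffix, mask_probs) if p >= tau]
    if loss_fn(pruned) > baseline + slack:
        return suffix            # rollback: pruning hurt the objective
    return pruned                # accept the shorter, stealthier suffix

# Toy check: loss counts how many "important" tokens are missing.
important = {"a", "c"}
loss = lambda s: sum(1 for t in important if t not in s)
print(prune_with_rollback(["a", "b", "c", "d"], [0.9, 0.05, 0.8, 0.02], loss))
# -> ['a', 'c']
```

After an accepted pruning step, the GCG optimizer would be restarted on the truncated suffix, per the interval schedule described above.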
6. Broader Implications for Model Evaluation and Security
The Mask-GCG findings have direct implications for both LLM developers and attackers. The presence of redundant, low-impact tokens in adversarial suffixes means that static detection rules based only on suffix length or perplexity are insufficient for robust defense. Conversely, attackers optimizing for stealth may prune their suffixes to reduce detection likelihood while maintaining high ASR.
The success of pruning also suggests that future work on LLM alignment should consider not just the presence of adversarial tokens, but their positional and functional saliency within model activations. Mask-GCG provides a mechanism for interpretable analysis of adversarial prompts by revealing which coordinates matter most for jailbreak effectiveness (Mu et al., 8 Sep 2025).
7. Summary Table: Mask-GCG vs GCG
| Metric | GCG (L=30, Llama-2-7B) | Mask-GCG |
|---|---|---|
| Avg. suffix length | 30 | 27 (9.9% reduction) |
| Suffix Compression Ratio (SCR) | 0% | 5.4–9.9% typically; up to 40% in special cases |
| Attack Success Rate (ASR) | 64% | 62% (stable within statistical noise) |
| Average runtime | 935 s | 780 s (16.8% faster) |
In aggregate, GCG constitutes a tractable, coordinate-wise greedy mechanism for adversarial prompt optimization, and Mask-GCG realizes explicit token selection and pruning with empirical gains in both suffix compactness and computational efficiency, all while preserving attack success (Mu et al., 8 Sep 2025).