Non-monotone DR-Submodular Maximization

Updated 11 November 2025

Non-monotone DR-submodular maximization is defined for continuous functions exhibiting diminishing returns and non-monotonic behavior over convex sets.
The paper establishes a novel non-monotone Frank-Wolfe algorithm achieving a (1/4)(1-m) approximation, proven optimal through tight complexity arguments.
Empirical evaluations in revenue maximization, location summarization, and quadratic programming highlight the method’s efficiency and practical superiority.

Non-monotone DR-submodular maximization concerns the optimization of functions that generalize discrete submodularity (diminishing returns) to the continuous domain, encompassing non-monotonic behavior and non-down-closed convex constraints. This class unifies and extends classical set-function submodular maximization and covers a diversity of problems in machine learning, economics, and network optimization. The area is notable for a sequence of impossibility results, breakthroughs on tight polynomial-time approximability, and the interplay between constraint geometry and achievable guarantees (Mualem et al., 2022).

1. DR-Submodularity and Problem Formulation

Let $F: [0,1]^n \to \mathbb{R}_{\geq 0}$ be a continuously differentiable function, with feasible set $K \subseteq [0,1]^n$ convex (not necessarily down-closed). $F$ is called DR-submodular if, for all $x \leq y$ (coordinate-wise), every $i \in [n]$ , and all $\delta \geq 0$ with $y + \delta e_i \in [0,1]^n$ ,

$F(x + \delta e_i) - F(x) \geq F(y + \delta e_i) - F(y).$

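Equivalently, the gradient is coordinate-wise antitone: $\nabla F(x) \geq \nabla F(y)$ when $x \leq y$. When $F$ is twice differentiable, this means every entry of the Hessian is non-positive: $\frac{\partial^2 F}{\partial x_i \partial x_j} \leq 0$ for all $i, j \in [n]$.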

A function is non-monotone DR-submodular if the above holds but monotonicity ($\nabla F \geq 0$ everywhere) is not assumed. Maximization of such functions over convex sets is NP-hard even in simple cases (Mualem et al., 2022).

Illustrative Example. The function $F(x) = \sum_{i=1}^{n} x_i (1 - x_i)$ on $[0,1]^n$ is DR-submodular but non-monotone: $F$ is initially increasing in each $x_i$, then decreasing for $x_i > 1/2$.
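The diminishing-returns inequality can be spot-checked numerically. The sketch below is only a sanity test, not a proof: it samples random pairs $x \leq y$ and counts violations. The objective `F` is a separable concave quadratic chosen here as an assumption for illustration, and `dr_violations` is a hypothetical helper name.

```python
import random

def F(x):
    """Illustrative objective (an assumption): sum_i x_i * (1 - x_i)."""
    return sum(xi * (1.0 - xi) for xi in x)

def grad_F(x):
    """Gradient of the illustrative objective: 1 - 2 * x_i per coordinate."""
    return [1.0 - 2.0 * xi for xi in x]

def dr_violations(func, n, trials=2000, seed=0, tol=1e-12):
    """Count sampled violations of
    func(x + d*e_i) - func(x) >= func(y + d*e_i) - func(y) for x <= y."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        x = [rng.random() for _ in range(n)]
        # y dominates x coordinate-wise and stays inside the unit cube.
        y = [xi + rng.random() * (1.0 - xi) for xi in x]
        i = rng.randrange(n)
        delta = rng.random() * (1.0 - y[i])  # keeps y + delta*e_i in [0,1]^n
        x_up, y_up = x[:], y[:]
        x_up[i] += delta
        y_up[i] += delta
        if func(x_up) - func(x) < func(y_up) - func(y) - tol:
            bad += 1
    return bad
```

Calling `dr_violations(F, 4)` should report no violations for this objective, while a convex function such as `lambda x: sum(v * v for v in x)` should trigger many. The sign change of `grad_F` around $x_i = 1/2$ reflects the non-monotonicity.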

2. Approximability Barriers and the Minimum-Norm Parameter

A central negative result (Vondrák 2013) establishes that for non-monotone DR-submodular maximization over a general convex set $K$, no algorithm running in sub-exponential time can achieve a constant-factor approximation in the worst case. The hardness comes from a symmetry-gap construction, in which the feasible region and the objective are chosen adversarially to be symmetric.

A key technique to bypass this barrier is to parameterize the approximation ratio by the minimum $\ell_\infty$-norm of a feasible point,

$m = \min_{x \in K} \|x\|_\infty.$

When $m < 1$, the feasible set $K$ contains a point that stays away from the upper faces of the cube, which breaks the full symmetry and permits nontrivial bounds. Sub-exponential-time methods achieve approximation ratios that scale as $(1 - m)$, degrading gracefully as $K$ approaches the cube's boundary, i.e. as $m \to 1$ (Mualem et al., 2022).
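For simple constraint sets the minimum $\ell_\infty$-norm can be computed directly. The sketch below assumes a specific, hypothetical family $K = \{x : 0 \leq x \leq u,\ \sum_i x_i \geq b\}$ (a box with a covering constraint, which is not down-closed when $b > 0$) and finds the minimum norm by bisection. For a general polytope one would instead solve a small linear program.

```python
def min_linf_norm(upper, b, iters=60):
    """Bisection for min ||x||_inf over K = {0 <= x <= upper, sum(x) >= b}.

    The set K and this helper are illustrative assumptions, not taken
    from the source.
    """
    if sum(upper) < b:
        raise ValueError("K is empty")
    lo, hi = 0.0, max(upper)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        # K has a point with ||x||_inf <= mid exactly when capping every
        # coordinate at mid still covers the budget b.
        if sum(min(u, mid) for u in upper) >= b:
            hi = mid
        else:
            lo = mid
    return hi

def ratio(m):
    """The (1/4)(1 - m) guarantee as a function of the minimum norm."""
    return 0.25 * (1.0 - m)
```

For example, with `upper = [1.0] * 4` and `b = 1.0` the minimum norm is attained by spreading the budget evenly, so the bisection converges to $1/4$ and `ratio` evaluates to $3/16$.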

3. Polynomial-time Algorithms: The $\frac{1}{4}(1-m)$ Guarantee

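Du (2022) gave the first polynomial-time algorithm for non-monotone DR-submodular maximization over general convex constraints whose guarantee is information-theoretically optimal:

$F(\hat{x}) \geq \frac{1}{4}(1 - m)\, F(x^*) - O(\varepsilon),$

where $x^*$ is an optimal solution, $\hat{x}$ is the output after $O(1/\varepsilon)$ iterations for small $\varepsilon > 0$, and the $O(\varepsilon)$ term hides problem-dependent constants (Mualem et al., 2022).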

Algorithm—Non-monotone Frank-Wolfe:

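Start from $x^{(0)} = \arg\min_{x \in K} \|x\|_\infty$.
For $t = 1, \dots, T$ with $T = \lceil \ln 2 / \varepsilon \rceil$:
1. $s^{(t)} = \arg\max_{s \in K} \langle \nabla F(x^{(t-1)}), s \rangle$.
2. $x^{(t)} = (1 - \varepsilon)\, x^{(t-1)} + \varepsilon\, s^{(t)}$.
Output the best iterate, $\arg\max_{0 \leq t \leq T} F(x^{(t)})$.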
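A minimal sketch of this Frank-Wolfe variant follows, under stated assumptions: the step is $x \leftarrow (1-\varepsilon)x + \varepsilon s$ for $\lceil \ln 2/\varepsilon \rceil$ iterations, the feasible set is the hypothetical $K = \{s \in [0,1]^n : \sum_i s_i \geq b\}$ (with $b \leq n$) so that the linear maximization oracle has a greedy closed form, and the objective is a small illustrative function. None of the names below come from the source.

```python
import math

def F(x):
    # Illustrative objective (an assumption): sum_i x_i - sum_{i<j} x_i x_j.
    # It is DR-submodular and non-negative on [0,1]^n for n <= 3.
    total = sum(x)
    return total - (total * total - sum(xi * xi for xi in x)) / 2.0

def grad_F(x):
    total = sum(x)
    return [1.0 - (total - xi) for xi in x]

def lmo_budget(g, b):
    """argmax of <g, s> over K = {s in [0,1]^n : sum(s) >= b}, assuming b <= n."""
    s = [1.0 if gi > 0 else 0.0 for gi in g]
    need = b - sum(s)
    # If the budget is not covered yet, raise the least harmful coordinates first.
    for i in sorted(range(len(g)), key=lambda j: -g[j]):
        if need <= 0:
            break
        if s[i] == 0.0:
            s[i] = min(1.0, need)
            need -= s[i]
    return s

def nonmonotone_frank_wolfe(func, grad, lmo, x0, eps=0.01):
    """Run ceil(ln 2 / eps) steps x <- (1 - eps) x + eps s; return the best iterate."""
    x, best = list(x0), list(x0)
    for _ in range(math.ceil(math.log(2.0) / eps)):
        s = lmo(grad(x))
        x = [(1.0 - eps) * xi + eps * si for xi, si in zip(x, s)]
        if func(x) > func(best):
            best = x
    return best
```

With $n = 3$ and $b = 1.5$ the minimum-norm starting point is `[0.5, 0.5, 0.5]`, and a run looks like `nonmonotone_frank_wolfe(F, grad_F, lambda g: lmo_budget(g, 1.5), [0.5, 0.5, 0.5])`. Every iterate is a convex combination of feasible points, so feasibility is preserved without any projection step.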

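Analysis: By DR-submodularity, the Frank-Wolfe direction guarantees a lower bound on the directional derivative in terms of the global optimum $x^*$, via the inequality $F(x \vee x^*) \geq (1 - \|x\|_\infty)\, F(x^*)$. Because each step moves only a small fraction of the way toward a feasible point, the iterates $x$ keep $1 - \|x\|_\infty$ bounded below by a constant multiple of $1 - m$. The approximation factor therefore degrades linearly as $m \to 1$ (when $K$ lies almost entirely on the upper boundary of the cube, e.g. when it is a vertex).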
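The constant $1/4$ can be traced through a continuous-time heuristic version of the argument. The derivation below is a sketch under simplifying assumptions (infinitesimal steps, exact gradients), writing $x(t)$ for the trajectory, $s(t)$ for the Frank-Wolfe direction and $x^*$ for an optimal solution. The discrete algorithm incurs additional error terms that are omitted here.

```latex
\begin{align*}
&\text{Dynamics: } \dot{x}(t) = s(t) - x(t), \qquad
  s(t) \in \arg\max_{s \in K} \langle \nabla F(x(t)), s \rangle,
  \qquad \|x(0)\|_\infty = m. \\[4pt]
&\text{Norm control: } \dot{x}_i \le 1 - x_i
  \;\Longrightarrow\; 1 - \|x(t)\|_\infty \ge e^{-t}(1 - m). \\[4pt]
&\text{Progress: } \frac{d}{dt} F(x)
  = \langle \nabla F(x), s - x \rangle
  \ge \langle \nabla F(x), x^* - x \rangle \\
&\qquad \ge F(x \vee x^*) + F(x \wedge x^*) - 2F(x)
  \ge (1 - \|x\|_\infty)\, F(x^*) - 2F(x) \\
&\qquad \ge e^{-t}(1 - m)\, F(x^*) - 2F(x). \\[4pt]
&\text{Integration: } \frac{d}{dt}\bigl(e^{2t} F(x(t))\bigr)
  \ge e^{t}(1 - m)\, F(x^*)
  \;\Longrightarrow\;
  F(x(t)) \ge \bigl(e^{-t} - e^{-2t}\bigr)(1 - m)\, F(x^*). \\[4pt]
&\text{Optimal stopping time: } t = \ln 2
  \;\Longrightarrow\;
  F(x(\ln 2)) \ge \tfrac{1}{4}(1 - m)\, F(x^*).
\end{align*}
```

The second inequality in the progress step uses concavity of $F$ along non-negative and non-positive directions, and the third uses non-negativity of $F$. The stopping time $\ln 2$ is consistent with running on the order of $\ln 2 / \varepsilon$ steps of size $\varepsilon$.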

This is provably sharp in the information-theoretic sense: no sub-exponential-time (let alone polynomial-time) algorithm can beat $\frac{1}{4}(1-m)$ in the worst case (Mualem et al., 2022).

4. Online Maximization and Regret: Matching Tight Ratios

For the online version (sequentially revealed DR-submodular objectives $F_1, \dots, F_T$), a matching $\frac{1}{4}(1-m)$-approximation is obtained with roughly $O(\sqrt{T})$ regret.

Algorithm—Non-monotone Meta-Frank-Wolfe:

At each round $t$, initialize $x_t^{(0)}$ at a point of minimum $\ell_\infty$ norm in $K$.
Execute $L$ Frank-Wolfe steps with independent online linear-optimization subroutines $\mathcal{E}_1, \dots, \mathcal{E}_L$.
For each $\ell = 1, \dots, L$:
- Receive $s_t^{(\ell)}$ from $\mathcal{E}_\ell$.
- $x_t^{(\ell)} = (1 - \varepsilon)\, x_t^{(\ell-1)} + \varepsilon\, s_t^{(\ell)}$.
- Receive or estimate an unbiased $g_t^{(\ell)} \approx \nabla F_t(x_t^{(\ell-1)})$ and feed it to $\mathcal{E}_\ell$ as that round's linear objective.
Play $x_t = x_t^{(L)}$.
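The round structure above can be sketched as follows. This is an illustration under several assumptions, not the authors' implementation: the feasible set is the box $[a, 1]^n$ (so the minimum-norm point is $(a, \dots, a)$), each online linear optimizer is plain projected online gradient ascent, the step size is $\varepsilon = \ln 2 / L$, and gradients are exact rather than stochastic. The class and function names are hypothetical.

```python
import math

class ProjectedAscent:
    """Online linear optimizer over the box [a, 1]^n (projected gradient ascent)."""
    def __init__(self, n, a, eta):
        self.a, self.eta = a, eta
        self.s = [a] * n

    def predict(self):
        return list(self.s)

    def update(self, g):
        # Ascent step on the linear reward <g, s>, then clip back into the box.
        self.s = [min(1.0, max(self.a, si + self.eta * gi))
                  for si, gi in zip(self.s, g)]

def meta_frank_wolfe(grad_oracles, n, a, L=20, eta=0.1):
    """Play one point per round; grad_oracles[t](x) returns a gradient
    (or an unbiased estimate) of the round-t objective at x."""
    eps = math.log(2.0) / L
    optimizers = [ProjectedAscent(n, a, eta) for _ in range(L)]
    played = []
    for grad_t in grad_oracles:
        x = [a] * n              # minimum l_inf-norm point of [a, 1]^n
        visited = []             # query points, one per Frank-Wolfe step
        for opt in optimizers:
            s = opt.predict()
            visited.append(x)
            x = [(1.0 - eps) * xi + eps * si for xi, si in zip(x, s)]
        played.append(x)
        # The objective is revealed only after playing: each optimizer
        # receives the gradient at its own query point.
        for opt, x_prev in zip(optimizers, visited):
            opt.update(grad_t(x_prev))
    return played
```

A usage example with a stationary objective $\sum_i x_i(1 - x_i)$ would pass `[lambda x: [1.0 - 2.0 * xi for xi in x]] * T` as `grad_oracles`. The played points remain in $[a, 1]^n$ because each is a convex combination of box points.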

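The expected average reward over $T$ rounds, with $x_t$ denoting the point played in round $t$, satisfies

$\frac{1}{T} \sum_{t=1}^{T} \mathbb{E}[F_t(x_t)] \geq \frac{1}{4}(1 - m) \cdot \max_{x \in K} \frac{1}{T} \sum_{t=1}^{T} F_t(x) - O(1/\sqrt{T}).$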

This guarantee, both offline and online, is proven optimal (Mualem et al., 2022).

5. Information-theoretic Hardness

A symmetry-gap argument demonstrates that for any $m \in [0, 1)$ and $\varepsilon > 0$, there is no sub-exponential-time algorithm that achieves a

$\frac{1}{4}(1 - m) + \varepsilon$

approximation for maximizing a non-negative, $\beta$-smooth DR-submodular $F$ over a polytope $K$ with $\min_{x \in K} \|x\|_\infty = m$. The construction involves adversarial, high-dimensional instances where distinguishing optimal from near-optimal regions is exponentially hard due to function symmetry.

This implies that the $\frac{1}{4}(1-m)$ factor achieved by Du (2022) and by the presented online method cannot be improved short of exponential time, for general $K$.

6. Extensions: Comparison to Other Settings and Interpolated Guarantees

The $\frac{1}{4}(1-m)$ bound specializes as follows:

For $m = 0$ (e.g., $\mathbf{0} \in K$, as in the "fully down-closed" case), the approximation is tight at $1/4$.
For $m \to 1$ (e.g., $K$ shrinks to a singleton or a low-dimensional facet), the guarantee vanishes, as expected. This characterizes a smooth transition between the easy (down-closed) and hard (general) cases.

Intermediate approximation ratios such as $1/e$, $0.385$, and $1/2$ arise in the down-closed, box, or other special settings, addressed in the literature by continuous greedy, measured continuous greedy, double-greedy, and hybrid approaches (Chen et al., 2023, Bian et al., 2017, Niazadeh et al., 2018).

7. Empirical Performance across Applications

The Du (2022) and matching online algorithms were tested in several domains:

Revenue Maximization (Social Networks): On datasets such as Facebook (64K nodes) and Advogato (6.5K nodes) with box+budget constraints, the method converges substantially faster and reaches higher rewards than competing algorithms (e.g., [Thắng & Srivastav 2021]).
Location Summarization: For tasks on the Yelp Charlotte dataset, the method outperforms others in longitudinal objective improvement.
Quadratic Programming with DR-negative-definite matrices: Varied constraint sets $K$ (down-closed and non-down-closed) were used, and the polynomial-time Non-monotone Frank-Wolfe outperforms previous sub-exponential algorithms even in down-closed cases when all methods are run under the same time budget.
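For readers who want to reproduce the flavor of the quadratic-programming benchmark, the sketch below builds a synthetic non-monotone DR-submodular quadratic $F(x) = \frac{1}{2}x^\top H x + h^\top x + c$ with an entrywise non-positive symmetric $H$. The instance generator is an assumption made for illustration: the scaling $h = -0.2\,H\mathbf{1}$ and the constant shift $c$ are choices made here so that $F$ is non-negative on the cube, and they are not taken from the source's experimental setup.

```python
import random

def make_dr_quadratic(n, seed=0):
    """Random F(x) = 0.5 x^T H x + h^T x + c with H symmetric, entrywise <= 0."""
    rng = random.Random(seed)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            H[i][j] = H[j][i] = -rng.random()
    row_sums = [sum(row) for row in H]
    h = [-0.2 * r for r in row_sums]   # h >= 0, so F increases near the origin
    c = -0.5 * sum(row_sums)           # shift that keeps F >= 0 on [0,1]^n

    def func(x):
        Hx = [sum(H[i][j] * x[j] for j in range(n)) for i in range(n)]
        quad = 0.5 * sum(xi * v for xi, v in zip(x, Hx))
        return quad + sum(hi * xi for hi, xi in zip(h, x)) + c

    def grad(x):
        return [sum(H[i][j] * x[j] for j in range(n)) + h[i] for i in range(n)]

    return H, func, grad
```

The gradient is non-negative at the origin and non-positive at the all-ones point, so the objective is non-monotone, and the non-positive Hessian entries make it DR-submodular.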

These results validate both the tightness and practical strength of the $\frac{1}{4}(1-m)$ guarantee for both offline and online settings (Mualem et al., 2022).

Summary Table: Offline Approximability by Constraint Type

Constraint Type ($K$)	Best Achievable Ratio	Achieved by	Complexity
Down-closed (e.g. box)	$1/e$	[Bian et al.], [Dürr et al.]	poly-time
General, $m < 1$	$\frac{1}{4}(1-m)$	Du (2022) offline; (Mualem et al., 2022) online	poly-time
General, $m = 1$	$0$ (hard)	none (no constant approximation ratio)	n/a (hardness)

8. Concluding Remarks

Non-monotone DR-submodular maximization over general convex sets is now fully characterized with respect to worst-case polynomial-time and sub-exponential-time approximability, with the $\frac{1}{4}(1-m)$ bound being sharp. The algorithmic frameworks (Frank-Wolfe variants, online meta-Frank-Wolfe) are efficient, general, and empirically strong, making the area a canonical example of a tight match between complexity-theoretic limits and practical algorithms in non-convex continuous optimization. Advances in constraint-specific interpolation (e.g., via convex body decomposition) and specialized oracles further expand the landscape, but the inapproximability barrier sets a final limit without additional structure (Mualem et al., 2022, Mualem et al., 2024).