Papers
Topics
Authors
Recent
Search
2000 character limit reached

Convergence Rates Analysis of The Quadratic Penalty Method and Its Applications to Decentralized Distributed Optimization

Published 29 Nov 2017 in math.NA | (1711.10802v1)

Abstract: In this paper, we study a variant of the quadratic penalty method for linearly constrained convex problems, which has already been widely used but actually lacks theoretical justification. Namely, the penalty parameter steadily increases and the penalized objective function is minimized inexactly rather than exactly, e.g., with only one step of the proximal gradient descent. For such a variant of the quadratic penalty method, we give counterexamples to show that it may not give a solution to the original constrained problem. By choosing special penalty parameters, we ensure the convergence and further establish the convergence rates of $O\left(\frac{1}{\sqrt{K}}\right)$ for the generally convex problems and $O\left(\frac{1}{K}\right)$ for strongly convex ones, where $K$ is the number of iterations. Furthermore, by adopting Nesterov's extrapolation we show that the convergence rates can be improved to $O\left(\frac{1}{K}\right)$ for the generally convex problems and $O\left(\frac{1}{K2}\right)$ for strongly convex ones. When applied to the decentralized distributed optimization, the penalty methods studied in this paper become the widely used distributed gradient method and the fast distributed gradient method. However, due to the totally different analysis framework, we can improve their $O\left(\frac{\log K}{\sqrt{K}}\right)$ and $O\left(\frac{\log K}{K}\right)$ convergence rates to $O\left(\frac{1}{\sqrt{K}}\right)$ and $O\left(\frac{1}{K}\right)$ with fewer assumptions on the network topology for general convex problems. Using our analysis framework, we also extend the fast distributed gradient method to a communication efficient version, i.e., finding an $\varepsilon$ solution in $O\left(\frac{1}{\varepsilon}\right)$ communications and $O\left(\frac{1}{\varepsilon{2+\delta}}\right)$ computations for the non-smooth problems, where $\delta$ is a small constant.

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (3)

Collections

Sign up for free to add this paper to one or more collections.