Projected gradient descent

Projected gradient descent (PGD) is a simple modification of gradient descent for solving constrained problems of the form $\min_{x \in \mathcal{S}} f(x)$. Instead of setting the next iterate to

\begin{align*} x^{(t+1)} = x^{(t)} - \eta \nabla f(x^{(t)}), \end{align*}

we set the next iterate to

\begin{align*} x^{(t+1)} = P_{\mathcal{S}}\left( x^{(t)} - \eta \nabla f(x^{(t)}) \right), \end{align*}

where $P_{\mathcal{S}}$ denotes Euclidean projection onto the constraint set $\mathcal{S}$.
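As a concrete illustration (not from the references), here is a minimal sketch in Python of one PGD update, assuming $\mathcal{S}$ is the Euclidean ball of radius $r$, for which the projection has a simple closed form. The function names and the choice of constraint set are assumptions made for this example.

```python
import numpy as np

def project_onto_ball(y, r=1.0):
    """Euclidean projection of y onto the l2 ball {x : ||x||_2 <= r} (illustrative choice of S)."""
    norm = np.linalg.norm(y)
    return y if norm <= r else (r / norm) * y

def pgd_step(x, grad_f, eta, r=1.0):
    """One projected gradient descent update: gradient step, then projection back onto S."""
    y = x - eta * grad_f(x)           # plain gradient step
    return project_onto_ball(y, r)    # x^{(t+1)} = P_S(y)
```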

Projected gradient descent convergence bound

(PGD convergence bound)

If $f$ and $\mathcal{S}$ are convex, $f$ is $G$-Lipschitz, $\|x^{(1)} - \mathbf{x}^*\|_2 \leq R$, the step size is $\eta = \frac{R}{G\sqrt{T}}$, and $T \geq \frac{R^2 G^2}{\epsilon^2}$, then $f(\hat{\mathbf{x}}) \leq f(\mathbf{x}^*) + \epsilon$, where $\mathbf{x}^*$ minimizes $f$ over $\mathcal{S}$ and $\hat{\mathbf{x}} = \frac{1}{T}\sum_{t=1}^{T} x^{(t)}$ is the average iterate.
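For a rough sense of scale, the bound can be evaluated numerically. The values of $R$, $G$, and $\epsilon$ below are arbitrary illustrations, and $\eta = R/(G\sqrt{T})$ is the standard step size used to obtain this guarantee (see the references).

```python
import math

R, G, eps = 1.0, 10.0, 1e-2             # example values, chosen only for illustration
T = math.ceil((R**2 * G**2) / eps**2)   # iteration bound from the theorem: T >= R^2 G^2 / eps^2
eta = R / (G * math.sqrt(T))            # standard step size for this guarantee
print(T, eta)                           # prints 1000000 0.0001
```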

Analysis

The key additional fact beyond the unconstrained GD analysis is that projection onto a convex set is non-expansive toward points of the set: for any $\mathbf{y}$ and any $\mathbf{x}^* \in \mathcal{S}$, $\|P_{\mathcal{S}}(\mathbf{y}) - \mathbf{x}^*\|_2 \leq \|\mathbf{y} - \mathbf{x}^*\|_2$. So the projection step never increases the distance to the optimum, and the standard GD convergence argument goes through unchanged.

Compare PGD to GD

In GD (a minimal code sketch follows the list):

  1. Pick an initial point $\mathbf{x}_0 \in \mathbb{R}^n$
  2. Loop until stopping condition is met:
    1. descent direction: compute $-\nabla f(\mathbf{x}_k)$
    2. stepsize: pick a stepsize $\alpha_k$
    3. update: $\mathbf{x}_{k+1} = \mathbf{x}_{k} - \alpha_k \nabla f(\mathbf{x}_k)$
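A minimal sketch of this loop in Python; the constant stepsize, the gradient-norm stopping rule, and the function names are illustrative assumptions rather than part of the notes.

```python
import numpy as np

def gradient_descent(grad_f, x0, alpha=0.1, max_iters=1000, tol=1e-8):
    """Plain gradient descent with a constant stepsize and a gradient-norm stopping rule."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        g = grad_f(x)                    # descent direction is -g
        if np.linalg.norm(g) < tol:      # stopping condition
            break
        x = x - alpha * g                # update: x_{k+1} = x_k - alpha_k * grad f(x_k)
    return x

# Example: minimize f(x) = ||x - 1||^2, whose gradient is 2(x - 1)
x_min = gradient_descent(lambda x: 2 * (x - np.ones(2)), x0=np.zeros(2))
```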

In PGD (a minimal code sketch follows the list):

  1. Pick an initial point $\mathbf{x}_0 \in \mathcal{Q}$, where $\mathcal{Q}$ is the convex set constraining the solution (denoted $\mathcal{S}$ above)
  2. Loop until stopping condition is met:
    1. descent direction: compute $-\nabla f(\mathbf{x}_k)$
    2. stepsize: pick a stepsize $\alpha_k$
    3. update: $\mathbf{y}_{k+1} = \mathbf{x}_{k} - \alpha_k \nabla f(\mathbf{x}_k)$
    4. projection: $\mathbf{x}_{k+1} = {\arg\min}_{\mathbf{x} \in \mathcal{Q}} \frac{1}{2}\|\mathbf{x}-\mathbf{y}_{k+1}\|_2^2$
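The same loop with the extra projection step, sketched for the illustrative choice $\mathcal{Q} = [0, 1]^n$, where the argmin in step 4 reduces to coordinate-wise clipping; as before, names, parameters, and the fixed iteration count are assumptions for this example.

```python
import numpy as np

def projected_gradient_descent(grad_f, x0, project, alpha=0.1, max_iters=1000):
    """PGD: after each gradient step, project the iterate back onto Q."""
    x = project(np.asarray(x0, dtype=float))   # start from a feasible point in Q
    for _ in range(max_iters):
        y = x - alpha * grad_f(x)              # update: y_{k+1} = x_k - alpha_k * grad f(x_k)
        x = project(y)                         # projection: argmin_{x in Q} ||x - y_{k+1}||^2
    return x

# Example: Q = [0, 1]^n, for which the projection is coordinate-wise clipping
project_box = lambda y: np.clip(y, 0.0, 1.0)
x_min = projected_gradient_descent(lambda x: 2 * (x - 2.0), x0=np.zeros(3), project=project_box)
# The unconstrained minimizer of ||x - 2||^2 lies outside Q, so PGD converges to the boundary point [1, 1, 1].
```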

References:

  1. https://angms.science/doc/CVX/CVX_PGD.pdf
  2. https://www.chrismusco.com/amlds2023/notes/lecture06.html#Projected_Gradient_Descent