Online gradient descent regret bound

(OGD Regret Bound)

After $T$ steps, $\epsilon = \left[\sum_{i=1}^T f_i(\mathbf{x}^{(i)})\right] - \left[\sum_{i=1}^T f_i(\mathbf{x}^{*})\right] \leq RG\sqrt{T}$, where $R$ bounds the diameter of the feasible set and $G$ bounds the gradient norms $\|\nabla f_i\|$.

The average regret over time is therefore bounded by $\frac{\epsilon}{T} \leq \frac{RG}{\sqrt{T}}$, which goes to $0$ as $T \rightarrow \infty$.

Note: no assumptions are made on how $f_1, \dots, f_T$ relate to each other, allowing even for these to be chosen adversarially, e.g. with $f_i$ depending on our choice of $\mathbf{x}^{(i)}$ and all previous choices.
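A minimal sketch of OGD checking this bound empirically, under assumed illustrative losses $f_i(x) = (x - c_i)^2$ on the interval $[-1, 1]$ (so $R = 2$ and $G = 4$) and the standard step size $\eta_t = R/(G\sqrt{t})$; the comparator $\mathbf{x}^{*}$ is the best fixed point in hindsight:

```python
import numpy as np

rng = np.random.default_rng(0)

# Feasible set: the interval [-1, 1], so its diameter is R = 2.
R = 2.0

def project(x):
    # Projection back onto the feasible set after each gradient step.
    return np.clip(x, -1.0, 1.0)

# Illustrative (not from the note) loss sequence: f_i(x) = (x - c_i)^2
# with c_i drawn from [-1, 1]; gradients 2(x - c_i) are bounded by G = 4.
T = 10_000
c = rng.uniform(-1.0, 1.0, size=T)
G = 4.0

x = 0.0
losses = []
for t in range(1, T + 1):
    losses.append((x - c[t - 1]) ** 2)   # suffer loss f_t(x^{(t)})
    grad = 2.0 * (x - c[t - 1])          # gradient of f_t at x^{(t)}
    eta = R / (G * np.sqrt(t))           # standard OGD step size
    x = project(x - eta * grad)

# Best fixed comparator in hindsight: sum_i (x - c_i)^2 is minimized at mean(c).
x_star = np.mean(c)
best = np.sum((x_star - c) ** 2)
regret = np.sum(losses) - best
print(f"regret = {regret:.3f}, bound RG*sqrt(T) = {R * G * np.sqrt(T):.1f}")
```

The printed regret should fall well below the $RG\sqrt{T}$ bound; for these quadratic (strongly convex) losses the realized regret is in fact much smaller than the worst-case guarantee.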


See: Regret bound, Online regret bound