Summary

Suppose, our target function is the sum of functions.

This problem usually arises in Deep Learning, where the gradient of the loss function is calculating over the huge number of data points, which could be very expensive in terms of the iteration cost.

Bounds

Conditions $\Vert \mathbb{E} [f(x_k)] - f(x^*)\Vert \leq$ Type of convergence $\Vert \mathbb{E}[x_k] - x^* \Vert \leq$
Convex $ \mathcal{O}\left(\dfrac{1}{\sqrt{k}} \right) $ Sublinear  
$ \mu$-Strongly convex $\mathcal{O}\left(\dfrac{1}{k} \right) $ Sublinear  

Materials