# Intuition

For the classic task of unconditional optimization $f(x) \to \min\limits_{x \in \mathbb{R}^n}$ the general scheme of iteration method is written as:

In the Newton method, the $s_k$ direction (Newton’s direction) is set by the linear system solution at each step:

i.e. at each iteration it is necessary to compensate hessian and gradient and resolve linear system.

Note here that if we take a single matrix of $B_k = I_n$ as $B_k$ at each step, we will exactly get the gradient descent method.

The general scheme of quasi-Newton methods is based on the selection of the $B_k$ matrix so that it tends in some sense at $k \to \infty$ to the true value of inverted Hessian in the local optimum $f_{xx}^{-1}(x_*)$. Let’s consider several schemes using iterative updating of $B_k$ matrix in the following way:

Then if we use Taylor’s approximation for the first order gradient, we get it:

Now let’s formulate our method as:

in case you set the task of finding an update $\Delta B_k$:

# Broyden method

The simplest option is when the amendment $\Delta B_k$ has a rank equal to one. Then you can look for an amendment in the form

where $\mu_k$ is a scalar and $q_k$ is a non-zero vector. Then mark the right side of the equation to find $\Delta B_k$ for $\Delta z_k$:

We get it:

A possible solution is: $q_k = \Delta z_k$, $\mu_k = \left(q_k^\top \Delta y_k\right)^{-1}$.

Then an iterative amendment to Hessian’s evaluation at each iteration:

# Davidon–Fletcher–Powell method

$\Delta B_k = \mu_1 \Delta x_k (\Delta x_k)^\top + \mu_2 B_k \Delta y_k (B_k \Delta y_k)^\top.$