It is well known that the antigradient $-\nabla f(x_0)$ is the direction of steepest descent of the function $f(x)$ at the point $x_0$. However, we can introduce another notion of the best direction in which the function decreases.

Suppose we are given $f(x)$ and a point $x_0$. Define $B_\varepsilon(x_0) = \{x \in \mathbb{R}^n : d(x, x_0) = \varepsilon^2 \}$ as the set of points at distance $\varepsilon$ from $x_0$. Here we presume the existence of a squared distance function $d(x, x_0)$.

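Then we can define another steepest descent direction in terms of the minimizer of the function on this sphere:

$$
s = \lim_{\varepsilon \to 0} \frac{x^* - x_0}{\varepsilon}, \quad \text{where } x^* = \arg\min_{x \in B_\varepsilon(x_0)} f(x)
$$

Let us assume that the distance is defined locally by some metric $A$, a symmetric positive definite matrix:

$$
d(x, x_0) = (x - x_0)^\top A (x - x_0)
$$

Let us also consider the first-order Taylor approximation of the function $f(x)$ near the point $x_0$:

$$
f(x_0 + \delta x) \approx f(x_0) + \nabla f(x_0)^\top \delta x \tag{A1}
$$

Now we can explicitly pose the problem of finding $s$, as stated above:

$$
\min_{\delta x \in \mathbb{R}^n} f(x_0 + \delta x) \quad \text{s.t. } \delta x^\top A \delta x = \varepsilon^2
$$

Using $\text{(A1)}$, it can be written as:

$$
\min_{\delta x \in \mathbb{R}^n} \nabla f(x_0)^\top \delta x \quad \text{s.t. } \delta x^\top A \delta x = \varepsilon^2
$$

Using the method of Lagrange multipliers, we conclude that the answer is:

$$
\delta x = -\frac{\varepsilon}{\sqrt{\nabla f(x_0)^\top A^{-1} \nabla f(x_0)}} A^{-1} \nabla f(x_0)
$$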
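
For reference, the Lagrange multiplier step can be sketched as follows. This assumes that the squared distance is the quadratic form $\delta x^\top A \delta x$ with a symmetric positive definite $A$, where $\delta x = x - x_0$. The symbols $g = \nabla f(x_0)$, $\lambda$ and $L$ are introduced here only for this sketch.

$$
\begin{aligned}
L(\delta x, \lambda) &= g^\top \delta x + \lambda \left( \delta x^\top A \delta x - \varepsilon^2 \right) \\
\nabla_{\delta x} L &= g + 2 \lambda A \delta x = 0 \quad \Rightarrow \quad \delta x = -\frac{1}{2\lambda} A^{-1} g \\
\delta x^\top A \delta x &= \frac{1}{4 \lambda^2} g^\top A^{-1} g = \varepsilon^2 \quad \Rightarrow \quad \lambda = \frac{\sqrt{g^\top A^{-1} g}}{2 \varepsilon} \\
\delta x &= -\frac{\varepsilon}{\sqrt{g^\top A^{-1} g}} A^{-1} g
\end{aligned}
$$

The positive root for $\lambda$ is taken because it gives the minimizer. The negative root gives the maximizer of the linear model on the same sphere.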

This means that the new direction of steepest descent is nothing but $-A^{-1} \nabla f(x_0)$, up to a positive scaling factor.

Indeed, if the space is isotropic and $A = I$, we immediately recover the gradient descent formula, while Newton's method uses the local Hessian as the metric matrix.
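
The two special cases can be checked numerically. The snippet below is a minimal sketch in plain Python for the two-dimensional case. It assumes the metric is a symmetric positive definite $2 \times 2$ matrix and that the squared distance is the quadratic form $\delta x^\top A \delta x$. The test function $f(x) = x_1^2 + 5 x_2^2$, the point $x_0 = (1, 1)$, and all helper names (`solve_2x2`, `steepest_direction`, `quad_form`) are illustrative choices, not part of the original text.

```python
import math


def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def solve_2x2(A, b):
    """Solve A y = b for a 2x2 matrix A (nested lists) by Cramer's rule."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    return [(a22 * b[0] - a12 * b[1]) / det,
            (a11 * b[1] - a21 * b[0]) / det]


def quad_form(A, v):
    """Return v^T A v."""
    return dot(v, [dot(row, v) for row in A])


def steepest_direction(A, grad):
    """Return s = -A^{-1} g / sqrt(g^T A^{-1} g) for an SPD metric A.

    The result satisfies s^T A s = 1, i.e. it has unit length in the metric A.
    """
    y = solve_2x2(A, grad)            # y = A^{-1} g
    scale = math.sqrt(dot(grad, y))   # sqrt(g^T A^{-1} g)
    return [-yi / scale for yi in y]


# Assumed example: f(x) = x1^2 + 5 * x2^2 evaluated at x0 = (1, 1).
grad = [2.0, 10.0]                     # gradient of f at x0
identity = [[1.0, 0.0], [0.0, 1.0]]    # isotropic metric, A = I
hessian = [[2.0, 0.0], [0.0, 10.0]]    # Hessian of f used as the metric

s_euclid = steepest_direction(identity, grad)  # parallel to -grad
s_newton = steepest_direction(hessian, grad)   # parallel to -(1, 1)

print("A = I       :", s_euclid)
print("A = Hessian :", s_newton)
```

With $A = I$ the direction is the normalized antigradient. With the Hessian as the metric, the direction is parallel to $-(1, 1)$, which for this quadratic points from $x_0$ straight at the minimizer at the origin, as the Newton step does.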