Duality
Duality lets us associate to any constrained optimization problem, a concave maximization problem whose solutions lower bound the optimal value of the original problem. What is interesting is there are cases, when one can solve the primal problem by first solving the dual one. Now, consider general constrained optimization problem:
We’ll build $g(y)$, that preserves uniform bound:
As a consequence:
We’ll consider one (of the many) possible way to construct $g(y)$ in case, when we have general mathematical programming problem with functional constraints:
And the Lagrangian, associated with this problem:
Now, we notice, that the original problem is equivalent to the following unconstraint problem:
Because $\lambda$ don’t affect the equality entries (since they all are equal to zero), while $\mu$ is positive and decreases Lagrangian (since all inequality entries are non  positive). Moreover, it can be seen, that:
Now, we denote the inner optimization as follows:
Summary
Primal  Dual  

Function  
Variables  
Constraints  
Problem  
Optimal 
Useful features

Construction of lower bound on solution of the direct problem.
It could be very complicated to solve the initial problem. But if we have the dual problem, we can take an arbitrary $y \in \Omega$ and substitute it in $g(y)$  we’ll immediately obtain some lower bound.

Checking for the problem’s solvability and attainability of the solution.
From the inequality $\max\limits_{y \in \Omega} g(y) \leq \min\limits_{x \in S} f(x) $ follows: if $\min\limits_{x \in S} f(x) = \infty$, then $\Omega = \varnothing$ and vice versa.

Sometimes it is easier to solve a dual problem than a primal one.
In this case, if the strong duality holds: $g(y^∗) = f(x^∗)$ we lose nothing.

Obtaining a lower bound on the function’s residual.
$f(x)  f^∗ \leq f(x)  g(y)$ for an arbitrary $y \in \Omega$

Dual function is always concave
As a pointwise minimum of affine functions.
Weak duality
It is common to name this relation between optimals of primal and dual problems as weak duality. For problem, we have
While the difference between them is often called duality gap:
Note, that we always have weak duality, if we’ve formulated primal and dual problem. It means, that if we were managed to solve dual problem (which is always convex, no matter the initial problem was or not), than we have some lower bound. Surprisingly, there are some notable cases, when these solutions are equal.
Strong duality
Strong duality if duality gap is zero:
Notice: both and may be .
 Several sufficient conditions known!
 “Easy” necessary and sufficient conditions: unknown.
Slater’s sufficient condition
Theorem Let the primal problem be the Convex optimization problem. If there is a feasible point such that is strictly feasible for the nonaffine constraints (and merely feasible for affine, linear ones), then strong duality holds. Moreover, in this case, the dual optimal is attained (i.e., ).
Slater’s condition: There exists s.t. , i.e. there is a strictly feasible point.
Counterexample
Only feasible line is , therefore we know optimal value of the primal problem . Let’s construct the Lagrangian:
Then, dual function is:
Dual problem is formulating the following way:
Here, we clearly see unit dual gap: and lack of strictly feasible solutions.
Examples
Least squares
Problem
Lagrangian
Dual function
Setting gradient to zero to find minimum 🤔:
Here we see lower bound property:
Let’s solve the dual problem:
Calculate the primal optimal and check whether this problem has strong duality or not.
Fenchel  Rockafellar problem
Problem
Let and — function, defined on the sets and in Euclidian Spaces and respectively. Let be the conjugate functions to the and respectively. Let — linear mapping. We call Fenchel  Rockafellar problem the following minimization task:
where — preimage of . We’ll build the dual problem using variable separation. Let’s introduce new variable . The the problem could be rewritten:
Lagrangian
Dual function
Now, we need to remember the definition of the Conjugate function:
So, we have
which allows us to formulate one of the most important theorems, that connects dual problems and conjugate functions:
Fenchel  Rockafellar theorem Let and — function, defined on the sets and in Euclidian Spaces and respectively. Let be the conjugate functions to the and respectively. Let — linear mapping. Let  optimal values of primal and dual problems:
Then we have weak duality: . Furthermore, if the functions and are convex and , then we have strong duality: . While points and are optimal values for primal and dual problem if and only if
Convex case is especially important since if we have Fenchel  Rockafellar problem with parameters , than the dual problem has the form
Logistic regression
Problem
where . Let and be the following functions
And let be some linear map:
Then, we can introduce Fenchel  Rockafellar problem: