# Problem

We need to estimate the probability density of a random variable from observed values.

# Approach

We will use the idea of parametric distribution estimation, which involves choosing *the best* parameter $\theta$ of a chosen family of densities $p_\theta(x)$, indexed by the parameter $\theta$. The idea is very natural: we choose the parameter that maximizes the probability (or the logarithm of the probability) of the observed values.
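As a quick illustration of this idea (a sketch with synthetic data, not part of the original notes): for the Gaussian family $p_\theta$ with $\theta = (\mu, \sigma^2)$, maximizing $\sum_i \log p_\theta(x_i)$ has a closed-form solution, namely the sample mean and the (biased) sample variance.

```python
import numpy as np

# Draw synthetic data from a Gaussian with known parameters.
rng = np.random.default_rng(42)
data = rng.normal(loc=2.0, scale=1.5, size=10_000)

# Closed-form maximizers of the Gaussian log-likelihood:
mu_hat = data.mean()                      # argmax over mu
var_hat = ((data - mu_hat) ** 2).mean()   # argmax over sigma^2 (biased estimator)
```

With enough samples, `mu_hat` and `var_hat` recover the true $\mu = 2.0$ and $\sigma^2 = 2.25$ closely.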

## Linear measurements with i.i.d. noise

Suppose we are given a set of observations

$$
y_i = a_i^\top x + \varepsilon_i, \quad i = 1, \ldots, m,
$$

where

- $x \in \mathbb{R}^n$ — unknown vector of parameters,
- $\varepsilon_i$ — i.i.d. noise with density $p(z)$,
- $y_i \in \mathbb{R}$ — measurements.

This implies the following optimization problem:

$$
\hat{x} = \arg\max_x L(x) = \arg\max_x \sum_{i=1}^m \log p\left(y_i - a_i^\top x\right).
$$

The sum comes from the fact that all observations are independent, so the joint density factorizes: $p(\varepsilon_1, \ldots, \varepsilon_m) = \prod_{i=1}^m p(\varepsilon_i)$. The target function $L(x) = \sum_{i=1}^m \log p\left(y_i - a_i^\top x\right)$ is called the log-likelihood function.

### Gaussian noise

The noise density is

$$
p(z) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{z^2}{2\sigma^2}},
$$

so the log-likelihood is

$$
L(x) = -\frac{m}{2}\log\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{i=1}^m \left(a_i^\top x - y_i\right)^2.
$$

This means the maximum likelihood estimate in the case of Gaussian noise is the least squares solution:

$$
\hat{x} = \arg\min_x \|Ax - y\|_2^2.
$$
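This equivalence can be checked numerically (a minimal sketch with synthetic data, assuming NumPy and SciPy): directly minimizing the negative Gaussian log-likelihood should land on the least-squares solution.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic linear measurements with Gaussian noise.
rng = np.random.default_rng(0)
m, n = 50, 3
A = rng.normal(size=(m, n))
x_true = np.array([1.0, -2.0, 0.5])
sigma = 0.1
y = A @ x_true + rng.normal(scale=sigma, size=m)

def nll(x):
    """Negative Gaussian log-likelihood, constants dropped."""
    r = A @ x - y
    return 0.5 / sigma**2 * (r @ r)

x_mle = minimize(nll, np.zeros(n)).x          # numerical MLE
x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares solution
```

The two vectors `x_mle` and `x_ls` agree up to solver tolerance.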

### Laplacian noise

The noise density is

$$
p(z) = \frac{1}{2a} e^{-|z|/a}, \quad a > 0,
$$

so, up to constants, maximizing $L(x)$ amounts to minimizing $\sum_{i=1}^m \left|a_i^\top x - y_i\right|$. This means the maximum likelihood estimate in the case of Laplacian noise is the $\ell_1$-norm solution:

$$
\hat{x} = \arg\min_x \|Ax - y\|_1.
$$
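The $\ell_1$ problem is not smooth, but it can be rewritten as a linear program: minimize $\sum_i t_i$ subject to $-t \le Ax - y \le t$. A sketch with synthetic data, assuming SciPy's `linprog`:

```python
import numpy as np
from scipy.optimize import linprog

# Synthetic linear measurements with Laplacian noise.
rng = np.random.default_rng(1)
m, n = 40, 2
A = rng.normal(size=(m, n))
x_true = np.array([2.0, -1.0])
y = A @ x_true + rng.laplace(scale=0.2, size=m)

# Variables are [x; t]; objective is sum(t).
c = np.concatenate([np.zeros(n), np.ones(m)])
A_ub = np.block([[A, -np.eye(m)],    #  Ax - y <= t
                 [-A, -np.eye(m)]])  # -(Ax - y) <= t
b_ub = np.concatenate([y, -y])
bounds = [(None, None)] * n + [(0, None)] * m  # x free, t >= 0
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
x_l1 = res.x[:n]
```

At the optimum, the objective `res.fun` equals $\|Ax_{\ell_1} - y\|_1$, and `x_l1` is close to the vector that generated the data.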

### Uniform noise

The noise density is

$$
p(z) = \begin{cases} \dfrac{1}{2a}, & |z| \le a, \\ 0, & \text{otherwise}, \end{cases}
$$

so the likelihood is constant on the feasible set and zero outside it. This means the maximum likelihood estimate in the case of uniform noise is any vector $x$ that satisfies $\left|a_i^\top x - y_i\right| \le a$ for all $i = 1, \ldots, m$.
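One way to exhibit such a vector (a sketch with synthetic data, assuming SciPy's `linprog`) is to minimize the largest residual, i.e. solve the Chebyshev problem $\min_x \max_i |a_i^\top x - y_i|$ as a linear program, and check that the optimal value does not exceed $a$:

```python
import numpy as np
from scipy.optimize import linprog

# Synthetic linear measurements with uniform noise on [-a, a].
rng = np.random.default_rng(2)
m, n = 30, 2
A = rng.normal(size=(m, n))
x_true = np.array([0.5, 1.5])
a = 0.3  # half-width of the uniform noise
y = A @ x_true + rng.uniform(-a, a, size=m)

# Variables are [x; t]; minimize t subject to -t <= Ax - y <= t.
c = np.concatenate([np.zeros(n), [1.0]])
ones = np.ones((m, 1))
A_ub = np.block([[A, -ones], [-A, -ones]])
b_ub = np.concatenate([y, -y])
bounds = [(None, None)] * n + [(0, None)]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
x_hat, t = res.x[:n], res.x[-1]
```

Since the true `x_true` already keeps every residual within $[-a, a]$, the optimal `t` is at most $a$, so `x_hat` lies in the maximum-likelihood set (as does any other point of that set).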

## Binary logistic regression

Suppose we are given a set of binary random variables $y_i \in \{0, 1\}$ with inputs $u_i \in \mathbb{R}^n$. Let us parametrize the distribution as a sigmoid, using a linear transformation of the input as the argument of the sigmoid:

$$
p = \mathbb{P}(y = 1) = \frac{e^{u^\top x + v}}{1 + e^{u^\top x + v}}.
$$

Let’s assume that the first $k$ observations are ones: $y_1 = \cdots = y_k = 1$, $y_{k+1} = \cdots = y_m = 0$. Then the log-likelihood function is written as follows:

$$
L(x, v) = \log\left(\prod_{i=1}^{k} p_i \prod_{i=k+1}^{m} (1 - p_i)\right) = \sum_{i=1}^{k} \left(u_i^\top x + v\right) - \sum_{i=1}^{m} \log\left(1 + e^{u_i^\top x + v}\right),
$$

where $p_i = \mathbb{P}(y_i = 1 \mid u_i)$.
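This log-likelihood is concave in $(x, v)$ and can be maximized numerically. A sketch with synthetic data, assuming NumPy and SciPy; note that for an arbitrary ordering of labels the sum over the "one" observations is simply $\sum_i y_i (u_i^\top x + v)$, which is what the code uses:

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic binary data from a sigmoid model.
rng = np.random.default_rng(3)
m, n = 200, 2
U = rng.normal(size=(m, n))
x_true, v_true = np.array([1.5, -1.0]), 0.5
prob = 1.0 / (1.0 + np.exp(-(U @ x_true + v_true)))
y = (rng.uniform(size=m) < prob).astype(float)

def nll(theta):
    """Negative log-likelihood; logaddexp(0, z) = log(1 + e^z) is stable."""
    x, v = theta[:n], theta[n]
    z = U @ x + v
    return -(y @ z - np.logaddexp(0.0, z).sum())

theta_hat = minimize(nll, np.zeros(n + 1)).x  # [x_hat; v_hat]
```

Because the problem is convex, the solver's `theta_hat` attains a log-likelihood at least as high as the data-generating parameters, and with $m = 200$ samples it recovers them reasonably well.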

# References

- Convex Optimization @ UCLA by Prof. L. Vandenberghe
- Numerical explanation