Problem

We need to estimate the probability density $p(y)$ of a random variable from observed values.

Approach

We use parametric distribution estimation: we choose the best parameter within a chosen family of densities $p_x(y)$, indexed by a parameter $x$. The idea is natural: we choose the parameter that maximizes the probability (or, equivalently, the logarithm of the probability) of the observed values.

Linear measurements with i.i.d. noise

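Suppose we are given a set of observations:

$$
y_i = a_i^\top x + v_i, \quad i = 1, \ldots, m,
$$

where

  • $x \in \mathbb{R}^n$ - unknown vector of parameters
  • $v_i$ are IID noise with density $p(z)$
  • $y_i$ - measurements, $a_i \in \mathbb{R}^n$ - known vectors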

This leads to the following optimization problem:

$$
\max_x \log p_x(y) = \max_x \sum_{i=1}^m \log p(y_i - a_i^\top x)
$$

The sum arises because all observations are independent, so that $p_x(y) = \prod\limits_{i=1}^m p(y_i - a_i^\top x)$. The objective is called the log-likelihood function $L(x)$.
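To make the definition concrete, here is a minimal sketch of evaluating the log-likelihood for a given noise density. The function names, the toy data, and the choice of a unit-variance Gaussian as the example density are assumptions made for illustration only.

```python
import math

def log_likelihood(x, A, y, log_density):
    """L(x) = sum_i log p(y_i - a_i^T x), where A is a list of rows a_i."""
    total = 0.0
    for a_i, y_i in zip(A, y):
        residual = y_i - sum(a_ij * x_j for a_ij, x_j in zip(a_i, x))
        total += log_density(residual)
    return total

def gaussian_log_density(z, sigma=1.0):
    # Log of the N(0, sigma^2) density evaluated at z (example density only).
    return -0.5 * math.log(2.0 * math.pi * sigma**2) - z**2 / (2.0 * sigma**2)

# Hypothetical toy data: three measurements of a two-dimensional parameter.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.1]

L_good = log_likelihood([1.0, 2.0], A, y, gaussian_log_density)
L_bad = log_likelihood([0.0, 0.0], A, y, gaussian_log_density)
print(L_good, L_bad)
```

A parameter vector that explains the measurements well (small residuals) gets a larger log-likelihood than one that does not.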

Gaussian noise

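For zero-mean Gaussian noise with variance $\sigma^2$:

$$
p(z) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{z^2}{2\sigma^2}}
$$

$$
L(x) = -\frac{m}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^m (y_i - a_i^\top x)^2
$$

This means that the maximum likelihood estimate in the case of Gaussian noise is the least squares solution.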
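The equivalence with least squares can be checked numerically. The sketch below assumes synthetic data (the dimensions, `x_true`, and `sigma` are made up) and uses `numpy.linalg.lstsq` to compute the least squares solution, then evaluates the Gaussian log-likelihood at that point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic problem: m measurements of an n-dimensional parameter.
m, n = 50, 3
A = rng.normal(size=(m, n))          # rows are the vectors a_i^T
x_true = np.array([1.0, -2.0, 0.5])
sigma = 0.1
y = A @ x_true + sigma * rng.normal(size=m)

# Least squares solution: minimizes sum_i (y_i - a_i^T x)^2.
x_ml, *_ = np.linalg.lstsq(A, y, rcond=None)

def gaussian_log_likelihood(x):
    r = y - A @ x
    return -0.5 * m * np.log(2 * np.pi * sigma**2) - r @ r / (2 * sigma**2)

print(x_ml)
print(gaussian_log_likelihood(x_ml), gaussian_log_likelihood(x_true))
```

Since the log-likelihood depends on $x$ only through the sum of squared residuals, no other vector, including the true parameter, can score higher than the least squares solution on this data.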

Laplacian noise

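For Laplacian noise with scale parameter $a > 0$ (a scalar, not to be confused with the vectors $a_i$):

$$
p(z) = \frac{1}{2a} e^{-\frac{|z|}{a}}
$$

$$
L(x) = -m\log(2a) - \frac{1}{a}\sum_{i=1}^m |y_i - a_i^\top x|
$$

This means that the maximum likelihood estimate in the case of Laplacian noise is the $\ell_1$-norm solution.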
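A minimal sketch of the $\ell_1$ solution, assuming the simplest special case of a scalar parameter with every $a_i = 1$, so that the objective is $\sum_i |y_i - x|$. In this case a minimizer is the sample median. The data below are hypothetical and include one outlier to show how the estimate differs from the mean, which is the least squares answer.

```python
import numpy as np

# Hypothetical measurements of a scalar; the last value is an outlier.
y = np.array([0.9, 1.1, 1.0, 1.2, 8.0])

def l1_objective(x):
    # Sum of absolute residuals; maximizing the Laplacian log-likelihood
    # is the same as minimizing this quantity.
    return np.sum(np.abs(y - x))

x_l1 = np.median(y)   # l1 (Laplacian ML) estimate in the scalar case
x_l2 = np.mean(y)     # least squares (Gaussian ML) estimate, for comparison

# Brute-force check on a grid that no point does better than the median.
grid = np.linspace(y.min(), y.max(), 2001)
grid_best = min(l1_objective(x) for x in grid)

print(x_l1, x_l2, l1_objective(x_l1), grid_best)
```

For a general vector parameter the $\ell_1$ problem has no closed form, but it can be rewritten as a linear program and passed to an LP solver.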

Uniform noise

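For noise uniformly distributed on $[-a, a]$, with scalar $a > 0$:

$$
p(z) = \begin{cases} \frac{1}{2a}, & |z| \leq a \\ 0, & \text{otherwise} \end{cases}
$$

$$
L(x) = \begin{cases} -m\log(2a), & |y_i - a_i^\top x| \leq a, \; i = 1, \ldots, m \\ -\infty, & \text{otherwise} \end{cases}
$$

This means that the maximum likelihood estimate in the case of uniform noise is any vector $x$ that satisfies $|y_i - a_i^\top x| \leq a$ for all $i = 1, \ldots, m$.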
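The following sketch illustrates that, for uniform noise, the log-likelihood takes only two values, so every feasible vector is equally good. The matrix `A` (rows $a_i^\top$), the data, and the half-width `a` are hypothetical.

```python
import numpy as np

def uniform_log_likelihood(x, A, y, a):
    """Log-likelihood for noise uniform on [-a, a]."""
    residuals = np.abs(y - A @ x)
    if np.all(residuals <= a):
        # Every residual is inside the support: constant value.
        return -len(y) * np.log(2 * a)
    # At least one residual has zero density.
    return -np.inf

# Hypothetical toy data.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.2])
a = 0.25

L_in_1 = uniform_log_likelihood(np.array([1.0, 2.0]), A, y, a)
L_in_2 = uniform_log_likelihood(np.array([1.2, 2.1]), A, y, a)
L_out = uniform_log_likelihood(np.array([0.0, 0.0]), A, y, a)

print(L_in_1, L_in_2, L_out)
```

Both feasible points give the same value, while the infeasible one gives $-\infty$, so maximum likelihood estimation reduces to a feasibility problem.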

Binary logistic regression

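Suppose we are given a set of binary random variables $y_i \in \{0, 1\}$, $i = 1, \ldots, m$. Let us parametrize the distribution as a sigmoid whose argument is a linear transformation of the input. Writing $u_i \in \mathbb{R}^n$ for the input associated with $y_i$ and $a \in \mathbb{R}^n$, $b \in \mathbb{R}$ for the parameters:

$$
p(y_i = 1) = \frac{\exp(a^\top u_i + b)}{1 + \exp(a^\top u_i + b)}
$$

Picture from Wikipedia

Let's assume that the first $k$ observations are ones: $y_1, \ldots, y_k = 1$, $y_{k+1}, \ldots, y_m = 0$. Then the log-likelihood function is written as follows:

$$
L(a, b) = \sum_{i=1}^k (a^\top u_i + b) - \sum_{i=1}^m \log\left(1 + \exp(a^\top u_i + b)\right)
$$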
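As an illustration, the logistic log-likelihood can be evaluated and maximized numerically. The sketch below is one possible approach, not a prescribed method: it assumes inputs $u_i$ stacked as rows of a matrix `U`, parameters `a` and `b` for the linear transformation, synthetic data, and plain gradient ascent with a hand-picked step size. Instead of reordering observations so that the ones come first, it selects them with the mask `y == 1`.

```python
import numpy as np

def logistic_log_likelihood(a, b, U, y):
    """Sum of (a^T u_i + b) over observations equal to one,
    minus sum over all i of log(1 + exp(a^T u_i + b))."""
    t = U @ a + b
    # np.logaddexp(0, t) computes log(1 + exp(t)) without overflow.
    return np.sum(t[y == 1]) - np.sum(np.logaddexp(0.0, t))

def fit_logistic(U, y, step=0.1, iters=2000):
    """Gradient ascent on the averaged log-likelihood (illustrative only)."""
    m, n = U.shape
    a, b = np.zeros(n), 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(U @ a + b)))   # sigmoid of the linear term
        a += step * (U.T @ (y - p)) / m          # gradient with respect to a
        b += step * np.sum(y - p) / m            # gradient with respect to b
    return a, b

# Hypothetical synthetic data generated from a known sigmoid model.
rng = np.random.default_rng(0)
U = rng.normal(size=(200, 2))
a_true, b_true = np.array([2.0, -1.0]), 0.5
p_true = 1.0 / (1.0 + np.exp(-(U @ a_true + b_true)))
y = (rng.uniform(size=200) < p_true).astype(float)

a_hat, b_hat = fit_logistic(U, y)
L_start = logistic_log_likelihood(np.zeros(2), 0.0, U, y)
L_hat = logistic_log_likelihood(a_hat, b_hat, U, y)
print(a_hat, b_hat, L_start, L_hat)
```

The log-likelihood is concave in $(a, b)$, so simple first-order ascent is enough for a demonstration; in practice a second-order method usually converges in far fewer iterations.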
