Monotone likelihood ratio

Figure: the ratio of the two density functions shown is increasing in the argument $x$, so $f(x)/g(x)$ satisfies the monotone likelihood ratio property.
In statistics, the monotone likelihood ratio property is a property of the ratio of two probability density functions (PDFs). Formally, distributions $f(x)$ and $g(x)$ bear the property if
$$\text{for every } x_1 > x_0, \quad \frac{f(x_1)}{g(x_1)} \geq \frac{f(x_0)}{g(x_0)}$$
that is, if the ratio is nondecreasing in the argument $x$.
If the functions are differentiable, the property may sometimes be stated as
$$\frac{\partial}{\partial x}\left(\frac{f(x)}{g(x)}\right) \geq 0$$
For two distributions that satisfy the definition with respect to some argument x, we say they "have the MLRP in x." For a family of distributions that all satisfy the definition with respect to some statistic T(X), we say they "have the MLR in T(X)."
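The definition can be checked numerically on a grid. The following is a minimal sketch, not a proof: the helper names (`normal_pdf`, `cauchy_pdf`, `ratio_is_nondecreasing`), the grid, and the choice of distributions are assumptions made for illustration. It compares two unit-variance normal densities, whose ratio is nondecreasing, with two unit-scale Cauchy densities, whose ratio is not.

```python
import math

def normal_pdf(x, mu, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def cauchy_pdf(x, loc):
    """Density of a Cauchy distribution with location loc and unit scale."""
    return 1.0 / (math.pi * (1.0 + (x - loc) ** 2))

def ratio_is_nondecreasing(f, g, xs, tol=1e-12):
    """Return True if f(x)/g(x) never decreases along the increasing grid xs."""
    ratios = [f(x) / g(x) for x in xs]
    return all(b >= a - tol for a, b in zip(ratios, ratios[1:]))

# Illustrative grid on [-5, 5]; a finite grid can only suggest, not prove, the property.
grid = [i / 10.0 for i in range(-50, 51)]

# Normal densities with means 1 and 0: the ratio is exp(x - 1/2), increasing in x.
normal_ok = ratio_is_nondecreasing(
    lambda x: normal_pdf(x, 1.0), lambda x: normal_pdf(x, 0.0), grid
)

# Cauchy densities with locations 1 and 0: the ratio tends to 1 in both tails,
# so it cannot be monotone.
cauchy_ok = ratio_is_nondecreasing(
    lambda x: cauchy_pdf(x, 1.0), lambda x: cauchy_pdf(x, 0.0), grid
)

print("normal pair has MLRP on grid:", normal_ok)
print("Cauchy pair has MLRP on grid:", cauchy_ok)
```

The Cauchy case illustrates that shifting a location parameter does not by itself guarantee the property.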
Intuition
The MLRP is used to represent a data-generating process in which the magnitude of an observed variable relates straightforwardly to the distribution it was drawn from. If $f(x)$ satisfies the MLRP with respect to $g(x)$, then the higher the observed value $x$, the more likely it was drawn from distribution $f$ rather than $g$. As usual for monotonic relationships, the likelihood ratio's monotonicity is convenient in statistics, particularly when using maximum-likelihood estimation. Distribution families with MLR also have a number of well-behaved stochastic properties, such as first-order stochastic dominance and monotone hazard rates. As is also usual, the strength of this assumption comes at the price of realism: many processes in the world do not exhibit a monotonic correspondence between input and output.
Example: Working hard or slacking off
Suppose you are working on a project, and you can either work hard or slack off. Call your choice of effort $e$ and the quality of the resulting project $q$. If the MLRP holds for the distribution of $q$ conditional on your effort $e$, the higher the quality, the more likely you worked hard. Conversely, the lower the quality, the more likely you slacked off.
- Choose effort $e \in \{H, L\}$, where $H$ means high and $L$ means low.
- Observe $q$ drawn from $f(q \mid e)$. By Bayes' law with a uniform prior,
  $$\Pr[e = H \mid q] = \frac{f(q \mid H)}{f(q \mid H) + f(q \mid L)}$$
- Suppose $f(q \mid e)$ satisfies the MLRP. Rearranging, the probability that the worker worked hard is
  $$\frac{1}{1 + f(q \mid L)/f(q \mid H)}$$
  which, thanks to the MLRP, is monotonically increasing in $q$, because $f(q \mid L)/f(q \mid H)$ is decreasing in $q$. Hence an employer conducting a "performance review" can infer an employee's behavior from the merits of the work.
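As a rough illustration of this Bayesian update, the sketch below assumes (purely hypothetically) that quality is normally distributed with mean 1 under high effort and mean 0 under low effort, both with unit variance. The names `high`, `low`, and `posterior_high` are invented for the example. Under these assumptions the posterior probability of high effort rises with observed quality.

```python
from statistics import NormalDist

# Hypothetical output distributions (assumptions, not from the article):
# quality ~ N(1, 1) under high effort, quality ~ N(0, 1) under low effort.
high = NormalDist(mu=1.0, sigma=1.0)
low = NormalDist(mu=0.0, sigma=1.0)

def posterior_high(q):
    """Pr[e = H | q] under a uniform prior over {H, L}."""
    return 1.0 / (1.0 + low.pdf(q) / high.pdf(q))

qualities = [-2.0, -1.0, 0.0, 0.5, 1.0, 2.0, 3.0]
posteriors = [posterior_high(q) for q in qualities]

for q, p in zip(qualities, posteriors):
    print(f"q = {q:5.1f}  Pr[e = H | q] = {p:.4f}")
```

With these two densities the likelihood ratio is $\exp(q - 1/2)$, so the posterior is a logistic function of $q$ that equals one half at $q = 0.5$, the midpoint between the two means.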
Families of distributions satisfying MLR
Statistical models often assume that data are generated by a distribution from some family of distributions and seek to determine that distribution. This task is simplified if the family has the monotone likelihood ratio property (MLRP).
A family of density functions $\{f_\theta(x)\}_{\theta \in \Theta}$ indexed by a parameter $\theta$ taking values in an ordered set $\Theta$ is said to have a monotone likelihood ratio (MLR) in the statistic $T(X)$ if, for any $\theta_1 < \theta_2$,
$$\frac{f_{\theta_2}(X = x_1, x_2, x_3, \dots)}{f_{\theta_1}(X = x_1, x_2, x_3, \dots)}$$
is a non-decreasing function of $T(X)$.
Then we say the family of distributions "has MLR in $T(X)$".
List of families
| Family | $T(X)$ in which $f_\theta(X)$ has the MLR |
|---|---|
| Exponential$[\lambda]$ | $\sum x_i$ (sum of the observations) |
| Binomial$[n,p]$ | $\sum x_i$ (sum of the observations) |
| Poisson$[\lambda]$ | $\sum x_i$ (sum of the observations) |
| Normal$[\mu,\sigma]$ | $\sum x_i$ (sum of the observations), if $\sigma$ is known |
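To make one row of the table concrete, the following sketch looks at the Poisson family. It assumes i.i.d. observations; the rates `lam_low` and `lam_high` and the sample values are made up for illustration. It checks that the joint log-likelihood ratio depends on the sample only through $\sum x_i$ and grows with that sum.

```python
import math

def poisson_log_likelihood(sample, lam):
    """Joint log-likelihood of i.i.d. Poisson(lam) observations."""
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in sample)

def log_likelihood_ratio(sample, lam_low, lam_high):
    """log of f_{lam_high}(sample) / f_{lam_low}(sample)."""
    return poisson_log_likelihood(sample, lam_high) - poisson_log_likelihood(sample, lam_low)

# Illustrative rates with lam_low < lam_high.
lam_low, lam_high = 2.0, 3.5

# Two different samples of size 3 with the same sum (9) give the same ratio.
sample_a = [1, 3, 5]
sample_b = [3, 3, 3]
same_sum_gap = abs(
    log_likelihood_ratio(sample_a, lam_low, lam_high)
    - log_likelihood_ratio(sample_b, lam_low, lam_high)
)

# Samples of size 3 ordered by increasing sum: 0, 2, 5, 9, 15.
samples_by_sum = [[0, 0, 0], [1, 0, 1], [2, 1, 2], [3, 3, 3], [4, 6, 5]]
ratios = [log_likelihood_ratio(s, lam_low, lam_high) for s in samples_by_sum]

print("gap between samples with equal sum:", same_sum_gap)
print("log-likelihood ratios by increasing sum:", ratios)
```

In closed form the log ratio is $\left(\sum x_i\right)\log(\lambda_2/\lambda_1) - n(\lambda_2 - \lambda_1)$, which is linear and increasing in $\sum x_i$ when $\lambda_2 > \lambda_1$.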
Hypothesis testing
If the family of random variables has the MLRP in $T(X)$, a uniformly most powerful test can easily be determined for the hypotheses $H_0: \theta \leq \theta_0$ versus $H_1: \theta > \theta_0$.
Example: Effort and output
Let $e$ be an input into a stochastic technology (a worker's effort, for instance) and $y$ its output, the likelihood of which is described by a probability density function $f(y; e)$. Then the monotone likelihood ratio property (MLRP) of the family $f$ is expressed as follows: for any $e_1, e_2$, the fact that $e_2 > e_1$ implies that the ratio $f(y; e_2)/f(y; e_1)$ is increasing in $y$.
Relation to other statistical properties
Monotone likelihood ratios are used in several areas of statistical theory, including point estimation and hypothesis testing, as well as in probability models.
Exponential families
One-parameter exponential families have monotone likelihood ratios. In particular, the one-dimensional exponential family of probability density functions or probability mass functions with
$$f_\theta(x) = c(\theta)\, h(x) \exp\bigl(\pi(\theta)\, T(x)\bigr)$$
has a monotone non-decreasing likelihood ratio in the sufficient statistic $T(x)$, provided that $\pi(\theta)$ is non-decreasing.
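The reason can be seen by writing out the ratio directly. The short derivation below is a sketch that assumes $\theta_1 < \theta_2$ and restricts attention to points where $h(x) > 0$, so that $h(x)$ cancels.

```latex
\frac{f_{\theta_2}(x)}{f_{\theta_1}(x)}
  = \frac{c(\theta_2)\, h(x) \exp\bigl(\pi(\theta_2)\, T(x)\bigr)}
         {c(\theta_1)\, h(x) \exp\bigl(\pi(\theta_1)\, T(x)\bigr)}
  = \frac{c(\theta_2)}{c(\theta_1)}
    \exp\Bigl(\bigl[\pi(\theta_2) - \pi(\theta_1)\bigr]\, T(x)\Bigr)
```

Since $c(\theta_2)/c(\theta_1)$ is a positive constant in $x$ and $\pi(\theta_2) - \pi(\theta_1) \geq 0$ when $\pi$ is non-decreasing, the ratio depends on $x$ only through $T(x)$ and is non-decreasing in $T(x)$.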
Most powerful tests: The Karlin–Rubin theorem
Monotone likelihood ratios are used to construct uniformly most powerful tests, according to the Karlin–Rubin theorem.[1] Consider a scalar measurement having a probability density function parameterized by a scalar parameter $\theta$, and define the likelihood ratio $\ell(x) = f_{\theta_1}(x)/f_{\theta_0}(x)$.
If $\ell(x)$ is monotone non-decreasing in $x$ for any pair $\theta_1 \geq \theta_0$ (meaning that the greater $x$ is, the more likely $H_1$ is), then the threshold test
$$\varphi(x) = \begin{cases} 1 & \text{if } x > x_0 \\ 0 & \text{if } x < x_0 \end{cases}$$
where $x_0$ is chosen so that $\operatorname{E}_{\theta_0} \varphi(X) = \alpha$,
is the UMP test of size $\alpha$ for testing $H_0: \theta \leq \theta_0$ vs. $H_1: \theta > \theta_0$.
Note that exactly the same test is also UMP for testing $H_0: \theta = \theta_0$ vs. $H_1: \theta > \theta_0$.
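A threshold test of this kind can be sketched for a single normal observation with known standard deviation, a family that has a monotone likelihood ratio in $x$. The function names `ump_threshold` and `power` and the numerical values of `theta0`, `sigma`, and `alpha` are assumptions chosen for illustration; the code uses only Python's standard-library `statistics.NormalDist`.

```python
from statistics import NormalDist

def ump_threshold(theta0, sigma, alpha):
    """Threshold x0 with P_{theta0}(X > x0) = alpha for X ~ N(theta0, sigma^2)."""
    return NormalDist(mu=theta0, sigma=sigma).inv_cdf(1.0 - alpha)

def power(theta, sigma, x0):
    """Probability that the test rejects (X > x0) when the true mean is theta."""
    return 1.0 - NormalDist(mu=theta, sigma=sigma).cdf(x0)

# Illustrative settings: test H0: theta <= 0 against H1: theta > 0 at level 5%.
theta0, sigma, alpha = 0.0, 1.0, 0.05

x0 = ump_threshold(theta0, sigma, alpha)
size = power(theta0, sigma, x0)

# Rejection probability at several true means, in increasing order of theta.
thetas = (-1.0, -0.5, 0.0, 0.5, 1.0, 2.0)
powers = [power(t, sigma, x0) for t in thetas]

print("threshold x0:", x0)
print("size at theta0:", size)
for t, p in zip(thetas, powers):
    print(f"theta = {t:4.1f}  rejection probability = {p:.4f}")
```

The rejection probability increases with $\theta$, so it stays at or below $\alpha$ over the whole null region $\theta \leq \theta_0$ and reaches $\alpha$ exactly at the boundary $\theta_0$.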
Median unbiased estimation
Monotone likelihood-functions are used to construct median-unbiased estimators, using methods specified by Johann Pfanzagl and others.[2][3] One such procedure is an analogue of the Rao–Blackwell procedure for mean-unbiased estimators: The procedure holds for a smaller class of probability distributions than does the Rao–Blackwell procedure for mean-unbiased estimation but for a larger class of loss functions.[4]
Lifetime analysis: Survival analysis and reliability
If a family of distributions $f_\theta(x)$ has the monotone likelihood ratio property in $T(X)$,
- the family has monotone decreasing hazard rates in $\theta$ (but not necessarily in $T(X)$)
- the family exhibits first-order (and hence second-order) stochastic dominance in $x$, and the best Bayesian update of $\theta$ is increasing in $T(X)$.
But not conversely: neither monotone hazard rates nor stochastic dominance imply the MLRP.
Proofs
Let the distribution family $f_\theta$ satisfy MLR in $x$, so that for $\theta_1 > \theta_0$ and $x_1 > x_0$:
$$\frac{f_{\theta_1}(x_1)}{f_{\theta_0}(x_1)} \geq \frac{f_{\theta_1}(x_0)}{f_{\theta_0}(x_0)},$$
or equivalently:
$$f_{\theta_1}(x_1)\, f_{\theta_0}(x_0) \geq f_{\theta_1}(x_0)\, f_{\theta_0}(x_1).$$
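Integrating this expression twice, we obtain the following, where $F_\theta$ denotes the cumulative distribution function of $f_\theta$:
1. To $x_1$ with respect to $x_0$:
$$\int_{-\infty}^{x_1} f_{\theta_1}(x_1)\, f_{\theta_0}(x_0)\, dx_0 \geq \int_{-\infty}^{x_1} f_{\theta_1}(x_0)\, f_{\theta_0}(x_1)\, dx_0$$
integrate and rearrange to obtain
$$\frac{f_{\theta_1}(x_1)}{f_{\theta_0}(x_1)} \geq \frac{F_{\theta_1}(x_1)}{F_{\theta_0}(x_1)}$$
2. From $x_0$ with respect to $x_1$:
$$\int_{x_0}^{\infty} f_{\theta_1}(x_1)\, f_{\theta_0}(x_0)\, dx_1 \geq \int_{x_0}^{\infty} f_{\theta_1}(x_0)\, f_{\theta_0}(x_1)\, dx_1$$
integrate and rearrange to obtain
$$\frac{1 - F_{\theta_1}(x_0)}{1 - F_{\theta_0}(x_0)} \geq \frac{f_{\theta_1}(x_0)}{f_{\theta_0}(x_0)}$$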
First-order stochastic dominance
Combine the two inequalities above to get first-order dominance:
$$F_{\theta_1}(x) \leq F_{\theta_0}(x) \quad \forall x$$
Monotone hazard rate
Use only the second inequality above to get a monotone hazard rate:
$$\frac{f_{\theta_1}(x)}{1 - F_{\theta_1}(x)} \leq \frac{f_{\theta_0}(x)}{1 - F_{\theta_0}(x)} \quad \forall x$$
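Both consequences can be spot-checked numerically for a family known to have a monotone likelihood ratio. The sketch below assumes two unit-variance normal distributions with means 0 (playing the role of $\theta_0$) and 1 (playing the role of $\theta_1$). The names `low`, `high`, and `hazard` and the grid are illustrative choices, and a finite grid only suggests, rather than proves, the inequalities.

```python
from statistics import NormalDist

# Illustrative family members with theta_0 < theta_1 (assumed values).
low = NormalDist(mu=0.0, sigma=1.0)   # theta_0
high = NormalDist(mu=1.0, sigma=1.0)  # theta_1

def hazard(dist, x):
    """Hazard rate f(x) / (1 - F(x)) of the given distribution at x."""
    return dist.pdf(x) / (1.0 - dist.cdf(x))

# Grid on [-3, 3] in steps of 0.25.
grid = [i / 4.0 for i in range(-12, 13)]

# First-order stochastic dominance: F_{theta_1}(x) <= F_{theta_0}(x) for all x.
fosd_holds = all(high.cdf(x) <= low.cdf(x) for x in grid)

# Hazard rate ordering: the hazard under theta_1 is no larger than under theta_0.
hazard_ordered = all(hazard(high, x) <= hazard(low, x) for x in grid)

print("first-order stochastic dominance on grid:", fosd_holds)
print("hazard rate ordering on grid:", hazard_ordered)
```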
Uses
Economics
The MLR is an important condition on the type distribution of agents in mechanism design. Most solutions to mechanism design models assume that the type distribution satisfies the MLR in order to take advantage of a common solution method.
References
1. Casella, G.; Berger, R. L. (2008), Statistical Inference, Brooks/Cole. ISBN 0-495-39187-5 (Theorem 8.3.17)
2. Pfanzagl, Johann. "On optimal median unbiased estimators in the presence of nuisance parameters." The Annals of Statistics (1979): 187–193.
3. Brown, L. D.; Cohen, Arthur; Strawderman, W. E. A Complete Class Theorem for Strict Monotone Likelihood Ratio With Applications. Ann. Statist. 4 (1976), no. 4, 712–722. doi:10.1214/aos/1176343543. http://projecteuclid.org/euclid.aos/1176343543.
4. Page 713: Brown, L. D.; Cohen, Arthur; Strawderman, W. E. A Complete Class Theorem for Strict Monotone Likelihood Ratio With Applications. Ann. Statist. 4 (1976), no. 4, 712–722. doi:10.1214/aos/1176343543. http://projecteuclid.org/euclid.aos/1176343543.