Friday 21 June 2019

Probability Tail Bounds / Concentration Bounds

Author: Pooja Yadav

A variable with good concentration is one that is close to its mean with good probability. A "concentration inequality" is a theorem proving that a random variable has good concentration. Such theorems are also known as "tail bounds".
In a variety of settings, it is of interest to obtain bounds on the tails of a random variable or two-sided inequalities that guarantee that a random variable is close to its mean or median.

The role of such inequalities is that they give bounds on the probabilities of certain events. An inequality is a definite statement about a random quantity: for instance, it may allow us to say that the probability of an event happening is at most a specific number.

In this blog, we explore a few elementary techniques for obtaining concentration inequalities.


1. Markov's Inequality

The most elementary tail bound, and the simplest concentration inequality, is Markov's Inequality.
It is often useful in scenarios where not much concentration is needed, or where the random variable is too complicated to be analyzed by more powerful techniques. The downside is that it only gives very weak bounds, but the upside is that it needs almost no assumptions about the random variable.

Theorem: Let X be a random variable that takes only non-negative values, then for any real number a>0, 
Pr[X ≥ a] ≤ E[X]/a
Proof: Let I be the indicator variable such that
                                                                  I = 1      if X ≥ a
                                                                  I = 0      otherwise
           If X ≥ a, then I = 1 ≤ X/a.
           If X < a, then I = 0 ≤ X/a, since X takes only non-negative values.
           So, in every case, I ≤ X/a.
          Taking expectation on both sides (which preserves the inequality), we get
          E[I] ≤ E[X/a] = E[X].(1/a)
          By the definition of expectation,
          E[I] = Pr[X ≥ a].1 + Pr[X < a].0 = Pr[X ≥ a]
          Therefore, we get
          Pr[X ≥ a]   ≤   E[X]/a
          Hence, proved.
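
As a quick numerical sanity check, here is a minimal Python sketch (assuming NumPy is available; the exponential distribution with mean 1 and the threshold a = 3 are arbitrary choices for illustration) that estimates Pr[X ≥ a] by simulation and compares it with the Markov bound E[X]/a:

import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)  # non-negative samples with E[X] = 1
a = 3.0
empirical_tail = (x >= a).mean()                # Monte Carlo estimate of Pr[X >= a]
markov_bound = x.mean() / a                     # E[X]/a, using the sample mean
print(empirical_tail, markov_bound)             # the bound should be the larger of the two

For this particular choice the bound (about 0.33) is much larger than the true tail (about 0.05), which illustrates the remark above that Markov's inequality trades precision for generality.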
  
                       
2. Chebyshev's Inequality
Chebyshev's Inequality states that the probability that X deviates from E[X] by a given amount is controlled by Var(X). This is intuitively expected, as the variance measures how far X is from its mean on average.

Theorem: If X is a random variable with finite mean μ and variance σ², then for any k>0,

Pr(|X-μ| ≥ k) ≤ σ²/k²

Proof: Let Y = (X-μ)² ≥ 0 and a = k²

            Using Markov's inequality gives
                             Pr((X-μ)² ≥ k²)  ≤  E[(X-μ)²]/k²

            But (X-μ)² ≥ k² holds iff |X-μ| ≥ k holds, since k>0
            Therefore, since E[(X-μ)²] = σ²,
                             Pr(|X-μ| ≥ k) ≤  σ²/k²
            Often, we are asked to obtain a bound for Pr(|X-μ| > k).
            Since Pr(|X-μ| > k) ≤ Pr(|X-μ| ≥ k), Chebyshev's bound of σ²/k² is valid for Pr(|X-μ| > k) also.
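
As an illustration, the following Python sketch (the normal distribution with μ = 5 and σ² = 4, and the choice k = 3, are arbitrary assumptions for the demo) compares an empirical estimate of Pr(|X-μ| ≥ k) with the Chebyshev bound σ²/k²:

import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 5.0, 2.0                            # mean 5, variance sigma^2 = 4
x = rng.normal(loc=mu, scale=sigma, size=1_000_000)
k = 3.0
empirical = (np.abs(x - mu) >= k).mean()        # Monte Carlo estimate of Pr(|X - mu| >= k)
chebyshev_bound = sigma**2 / k**2               # sigma^2 / k^2
print(empirical, chebyshev_bound)               # the bound should dominate the estimate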

Sample Problems

Que 1. Let X ~ Exp(λ). Using Markov's inequality, find an upper bound for P(X ≥ a), where a > 0. Compare the upper bound with the actual value of P(X ≥ a).

Solution: If X ~ Exp(λ), then E[X] = 1/λ.
                 Using Markov's Inequality, we get
                 P(X ≥ a) ≤ E[X]/a = 1/(λa)
                 The actual value of P(X ≥ a) is e^(-λa), and we always have 1/(λa) ≥ e^(-λa) for a > 0.
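
This comparison can be checked numerically. Below is a minimal Python sketch (the rate λ = 2 is a hypothetical choice) that evaluates the exact tail e^(-λa) and the Markov bound 1/(λa) over a range of thresholds:

import numpy as np

lam = 2.0                        # hypothetical rate parameter for the demo
a = np.linspace(0.5, 5.0, 10)    # a grid of thresholds a > 0
exact = np.exp(-lam * a)         # exact tail: P(X >= a) = e^(-lambda*a)
markov = 1.0 / (lam * a)         # Markov's bound: E[X]/a = 1/(lambda*a)
assert np.all(markov >= exact)   # the bound is never smaller than the true tail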

                 
Que 2. Let X ~ Exp(λ). Using Chebyshev's inequality, find an upper bound for P(|X-μ| ≥ b).
Solution: We have E[X] = μ = 1/λ and Var(X) = 1/λ².
                     Using Chebyshev's Inequality, we get
                                                   P(|X-μ| ≥ b) ≤ Var(X)/b² = 1/(λ²b²)
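
To see how tight this bound is, here is a small Python check (again with the hypothetical choices λ = 2 and b = 1) that estimates P(|X-μ| ≥ b) by simulation and compares it with 1/(λ²b²):

import numpy as np

rng = np.random.default_rng(2)
lam, b = 2.0, 1.0                               # hypothetical rate and deviation for the demo
x = rng.exponential(scale=1.0/lam, size=1_000_000)
mu = 1.0/lam                                    # E[X] = 1/lambda
empirical = (np.abs(x - mu) >= b).mean()        # Monte Carlo estimate of P(|X - mu| >= b)
bound = 1.0 / (lam**2 * b**2)                   # Var(X)/b^2 = 1/(lambda^2 * b^2)
print(empirical, bound)                         # the bound should dominate the estimate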
                                        
