A surprising result about surprise index
John D. Cook » Probability & Statistics
by John
1M ago
Surprise index

Warren Weaver [1] introduced what he called the surprise index to quantify how surprising an event is. At first it might seem that the probability of an event is enough for this purpose: the lower the probability of an event, the more surprise when it occurs. But Weaver’s notion is more subtle than this. Let X be a discrete random variable taking non-negative integer values such that Prob(X = i) = pᵢ. Then the surprise index of the ith event is defined as

Sᵢ = (Σⱼ pⱼ²) / pᵢ,

the expected probability divided by the probability of the event that actually occurred. Note that if X takes on the values 0, 1, 2, …, N−1, all with equal probability 1/N, then Sᵢ = 1, independent of N. If N is very large, each outcome …
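A minimal sketch of this definition in Python, assuming the formula Sᵢ = (Σⱼ pⱼ²)/pᵢ reconstructed above (the function name and NumPy dependency are my own choices, not from the post):

    import numpy as np

    def surprise_index(p, i):
        # Surprise index of outcome i for a discrete distribution p,
        # using S_i = (sum of p_j^2) / p_i.
        p = np.asarray(p, dtype=float)
        return np.sum(p**2) / p[i]

    # Uniform distribution on N outcomes: S_i = 1 for every i, regardless of N.
    for N in (5, 50, 500):
        p = np.full(N, 1.0 / N)
        print(N, surprise_index(p, 0))   # prints 1.0 each time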
Visit website
Estimating an author’s vocabulary
John D. Cook » Probability & Statistics
by John
1M ago
How would you estimate the size of an author’s vocabulary? Suppose you have analyzed the author’s available works and found n words, x of which are unique. Then you know the author’s vocabulary was at least x, but it’s reasonable to assume that the author knew words he never used in writing, or at least not in the works you have access to. Brainerd [1] suggested the following estimator based on a Markov chain model of language. The estimated vocabulary is the number N satisfying an equation whose left side is a decreasing function of N, so you could solve the equation by finding …
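Brainerd’s equation itself is cut off in this excerpt, so the sketch below only covers the two pieces that are fully described: counting n and x from a text, and solving a decreasing-function equation by bisection. The tokenization rule and the placeholder lhs argument are my own assumptions, not Brainerd’s formula:

    import re

    def word_counts(text):
        # Return (n, x): total words and distinct words, using a crude
        # lowercase alphabetic tokenization (an assumption, not from the post).
        words = re.findall(r"[a-z']+", text.lower())
        return len(words), len(set(words))

    def solve_decreasing(lhs, target, lo, hi, tol=1e-6):
        # Bisection for lhs(N) = target when lhs is decreasing in N.
        # `lhs` stands in for the left side of Brainerd's equation,
        # which this excerpt does not show.
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if lhs(mid) > target:
                lo = mid   # value still too large, so N must be larger
            else:
                hi = mid
        return (lo + hi) / 2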
Visit website
How likely is a random variable to be far from its center?
John D. Cook » Probability & Statistics
by John
2M ago
There are many answers to the question in the title: How likely is a random variable to be far from its center? The answers depend on how much you’re willing to assume about your random variable. The more you can assume, the stronger your conclusion. The answers also depend on what you mean by “center,” such as whether you have in mind the mean or the mode. Chebyshev’s inequality says that the probability of a random variable X taking on a value more than k standard deviations from its mean is at most 1/k². This of course assumes that X has a mean and a standard deviation. If we assume …
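A quick numerical check of Chebyshev’s inequality, using an exponential distribution as an arbitrary example (the choice of distribution and the sample size are mine, not from the post):

    import numpy as np

    rng = np.random.default_rng(0)

    # Exponential(1) has mean 1 and standard deviation 1.
    x = rng.exponential(scale=1.0, size=1_000_000)
    mu, sigma = x.mean(), x.std()

    for k in (2, 3, 4):
        empirical = np.mean(np.abs(x - mu) > k * sigma)
        print(k, empirical, 1 / k**2)   # empirical tail probability vs. Chebyshev bound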
Visit website
Beta inequality symmetries
John D. Cook » Probability & Statistics
by John
3M ago
I was thinking about the work I did in biostatistics at MD Anderson. This work was practical rather than mathematically elegant, useful in its time but not of long-term interest. However, one result came out of this work that I would call elegant, and that was a symmetry I found. Let X be a beta(a, b) random variable and let Y be a beta(c, d) random variable. Let g(a, b, c, d) be the probability that a sample from X is larger than a sample from Y:

g(a, b, c, d) = Prob(X > Y)

This function often appeared in the inner loop of a simulation, and so we spent thousands of CPU-hours …
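A Monte Carlo sketch of g, along with a check of one symmetry that follows from the fact that 1 − X ~ beta(b, a); whether this is the symmetry the post has in mind isn’t shown in the excerpt:

    import numpy as np

    rng = np.random.default_rng(0)

    def g(a, b, c, d, n=1_000_000):
        # Monte Carlo estimate of Prob(X > Y) for X ~ beta(a, b), Y ~ beta(c, d).
        x = rng.beta(a, b, size=n)
        y = rng.beta(c, d, size=n)
        return np.mean(x > y)

    # Since 1 - X ~ beta(b, a) and 1 - Y ~ beta(d, c),
    # Prob(X > Y) = Prob(1 - Y > 1 - X), i.e. g(a, b, c, d) = g(d, c, b, a).
    print(g(3, 5, 4, 6), g(6, 4, 5, 3))   # the two estimates should agree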
Visit website
Zero-Concentrated Differential Privacy
John D. Cook » Probability & Statistics
by John
6M ago
Differential privacy can be rigid and overly conservative in practice, and so finding ways to relax pure differential privacy while retaining its benefits is an active area of research. Two approaches to doing this are concentrated differential privacy [1] and Rényi differential privacy [3]. Differential privacy quantifies the potential impact of an individual’s participation or lack of participation in a database and seeks to bound the difference. The original proposal for differential privacy and the approaches discussed here differ in how they measure the difference an individual can make …
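Both relaxations named above are defined in terms of a divergence between a mechanism’s output distributions on neighboring databases; Rényi differential privacy, for example, is built on the Rényi divergence of order α. A small sketch of that divergence for discrete distributions (the example probabilities are made up, not taken from the post):

    import numpy as np

    def renyi_divergence(p, q, alpha):
        # Rényi divergence D_alpha(p || q) of order alpha > 1 between
        # two discrete distributions given as probability vectors.
        p, q = np.asarray(p, float), np.asarray(q, float)
        return np.log(np.sum(p**alpha * q**(1 - alpha))) / (alpha - 1)

    # Hypothetical output distributions of a mechanism on two neighboring databases.
    p = [0.60, 0.30, 0.10]
    q = [0.50, 0.35, 0.15]
    print(renyi_divergence(p, q, alpha=2))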
Visit website
Using dimensional analysis to check probability calculations
John D. Cook » Probability & Statistics
by John
6M ago
Probability density functions are independent of physical units. The normal distribution, for example, works just as well when describing weights or times. But sticking in units anyway is useful.

Normal distribution example

Suppose you’re trying to remember the probability density function for the normal distribution. Is the correct form

exp(−(x − μ)²/(2σ²)) / (σ √(2π)),

or some variation with the σ or the 2 moved around? Suppose the distribution represents heights. The argument to an exponential function must be dimensionless, so the numerator and denominator in the exp() argument must have the same units …
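One way to carry out the dimensional check by machine is with a units library; here’s a sketch using pint and heights in inches (the library choice and the numbers are mine, not from the post):

    import math
    import pint

    ureg = pint.UnitRegistry()
    x, mu, sigma = 70 * ureg.inch, 68 * ureg.inch, 3 * ureg.inch

    # The exponent must be dimensionless: inch² / inch² cancels.
    z = (x - mu) ** 2 / (2 * sigma ** 2)
    print(z.to("dimensionless"))   # a pure number, as required

    # The density itself carries units of 1/inch, as a density over a length should.
    density = math.exp(-z.to("dimensionless").magnitude) / (sigma * math.sqrt(2 * math.pi))
    print(density)   # about 0.106 / inch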
Visit website
Randomized response and local differential privacy
John D. Cook » Probability & Statistics
by John
6M ago
Differential privacy protects user privacy by adding randomness as necessary to the results of queries to a database containing private data. Local differential privacy protects user privacy by adding randomness before the data is inserted into the database. Using the visualization from this post, differential privacy takes the left and bottom (blue) path through the diagram, whereas local differential privacy takes the top and right (green) path. The diagram does not commute. Results are more accurate along the blue path, but this requires a trusted party to hold the identifiable data. …
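The post’s title mentions randomized response, the classic local mechanism: each respondent flips a coin, answers truthfully on heads, and answers with a second coin flip on tails, so any individual answer is deniable while the population proportion can still be estimated. A minimal sketch (the coin probabilities and the true proportion are my own choices, not from the post):

    import numpy as np

    rng = np.random.default_rng(0)

    def randomized_response(truth):
        # Answer truthfully with probability 1/2; otherwise answer with
        # a fresh fair coin flip.
        if rng.random() < 0.5:
            return truth
        return rng.random() < 0.5

    # P(yes) = p/2 + 1/4, so the analyst recovers p as 2 * (yes rate) - 1/2.
    truths = rng.random(100_000) < 0.3            # true proportion is 0.3
    answers = np.array([randomized_response(t) for t in truths])
    print(2 * answers.mean() - 0.5)               # estimate close to 0.3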
Visit website
Earth mover’s distance
John D. Cook » Probability & Statistics
by John
6M ago
There are many ways to describe the distance between two probability distributions. The previous two posts looked at using the p-norm to measure the difference between the PDFs and using Kullback-Leibler divergence. Earth mover’s distance (EMD) is yet another approach. Imagine a probability distribution on ℝ² as a pile of dirt. Earth mover’s distance measures how different two distributions are by how much work it would take to reshape the pile of dirt representing one distribution into a pile of dirt representing the other distribution. Unlike KL divergence, earth mover’s distance is symmetric …
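For one-dimensional samples, SciPy exposes EMD as the Wasserstein distance; a quick sketch using two normal samples of my own choosing rather than the 2-D dirt piles described above:

    import numpy as np
    from scipy.stats import wasserstein_distance

    rng = np.random.default_rng(0)

    # How much "dirt" must be moved, and how far, to turn one empirical
    # distribution into the other.
    a = rng.normal(loc=0.0, scale=1.0, size=10_000)
    b = rng.normal(loc=2.0, scale=1.0, size=10_000)

    print(wasserstein_distance(a, b))   # roughly 2, the difference in means here
    print(wasserstein_distance(b, a))   # same value: EMD is symmetric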
Visit website
KL divergence from normal to normal
John D. Cook » Probability & Statistics
by John
6M ago
The previous post looked at the best approximation to a normal density by a normal density with a different mean. Dan Piponi suggested in the comments that it would be good to look at the Kullback-Leibler (KL) divergence. The previous post looked at the difference between two densities from an analytic perspective, solving the problem that an analyst would find natural. This post takes an information-theoretic perspective. Just as p-norms are natural in analysis, KL divergence is natural in information theory. The Kullback-Leibler divergence between two random variables X and Y is …
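The KL divergence between two normal densities has a simple closed form, which can be checked against direct numerical integration of f log(f/g). A sketch (the closed-form expression is the standard one for normals, not something quoted from the excerpt):

    import numpy as np
    from scipy.integrate import quad
    from scipy.stats import norm

    def kl_normal(mu1, sigma1, mu2, sigma2):
        # Closed form for KL( N(mu1, sigma1^2) || N(mu2, sigma2^2) ).
        return (np.log(sigma2 / sigma1)
                + (sigma1**2 + (mu1 - mu2)**2) / (2 * sigma2**2)
                - 0.5)

    mu1, sigma1, mu2, sigma2 = 0.0, 1.0, 1.5, 2.0
    f = norm(mu1, sigma1).pdf
    g = norm(mu2, sigma2).pdf

    # Integrate f(x) log(f(x)/g(x)) over a range wide enough to hold all the mass.
    integral, _ = quad(lambda x: f(x) * np.log(f(x) / g(x)), -20, 20)
    print(kl_normal(mu1, sigma1, mu2, sigma2), integral)   # the two should agree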
Visit website
Normal approximation to normal
John D. Cook » Probability & Statistics
by John
6M ago
In my previous post on approximating a logistic distribution with a normal distribution I accidentally said something about approximating a normal with a normal. Obviously the best approximation to a probability distribution is itself. As Norbert Wiener said, “The best material model of a cat is another, or preferably the same, cat.” But this made me think of the following problem. Let f be the density function of a standard normal random variable, i.e. one with mean zero and standard deviation 1. Let g be the density function of a normal random variable with mean μ > 0 and standard deviation …
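The excerpt cuts off before the problem is fully posed. One natural version, suggested by the p-norm framing in the KL post above, is: for a given μ, find the σ that minimizes the L² distance between the two densities. A sketch of that version, which may or may not match the post’s exact setup:

    import numpy as np
    from scipy.integrate import quad
    from scipy.optimize import minimize_scalar
    from scipy.stats import norm

    def l2_distance(sigma, mu):
        # L2 norm of the difference between the standard normal density
        # and a normal(mu, sigma) density.
        f = norm(0, 1).pdf
        g = norm(mu, sigma).pdf
        integral, _ = quad(lambda x: (f(x) - g(x))**2, -15, 15 + mu)
        return np.sqrt(integral)

    mu = 1.0
    best = minimize_scalar(lambda s: l2_distance(s, mu), bounds=(0.1, 10), method="bounded")
    print(best.x)   # sigma minimizing the L2 distance for this mu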
Visit website
