Standard Normal Distribution

Agent Smith

A standard normal distribution is basically the bell curve. Once we have a normal distribution for a particular statistical question, we can apply this formula: \(\displaystyle \text{z score}_x = \frac{x - \overline x}{\sigma}\). Let's say that we're statistically analyzing the heights of students in a particular high school and the distribution is normal; we compute the mean and the standard deviation. We then take a particular student (A) and measure his height to be 143 cm. Suppose also that \(\displaystyle \text{z score}_{143 \text{ cm}} = -2\). A is exactly 2 standard deviations below the mean, which means (this is where I'm confused and need a little help), by the empirical or 68-95-99.7 rule, that either ...
a) 2.5% of the students have heights \(\displaystyle < 143 \text{ cm}\)
or
b) 2.5% of the students have heights \(\displaystyle \leq 143 \text{ cm}\)

Mayday! Mayday!
 
It makes no difference.

The normal distribution is a continuous distribution, so P(x=143) = 0. That is, the probability of any single precise value is zero.
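
A quick numerical check of this, as a minimal sketch in Python using only the standard library. The only input taken from the thread is the z score of -2; the helper name `phi` is just an illustrative choice. It also shows that the empirical rule's 2.5% is a rounded figure: the tail area below z = -2 is closer to 2.28%.

```python
import math

def phi(z: float) -> float:
    """Standard normal CDF, P(Z <= z), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z_a = -2.0  # z score of student A (143 cm), as stated in the question

# For a continuous distribution, P(X < 143) and P(X <= 143) are the same area.
p_below = phi(z_a)

# P(X = 143) is the area over an interval of zero width.
p_exact = phi(z_a) - phi(z_a)

print(f"P(X < 143) = P(X <= 143) = {p_below:.5f}")
print(f"P(X = 143) = {p_exact}")
```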
 
@Dr.Peterson thank you.

In this particular example, then, I can't single out the students who are exactly as tall as A (143 cm); I can only find the proportion who are shorter or taller than A.

How would I find the proportion of people as tall as A (143 cm)? Do I need to create an interval, e.g. (142, 144), which lets me compute an interval of z scores and then a proportion? We could then find the proportion of students whose heights are within 1 cm of A's. Yes?
 
Is this really the reason why we can't compute P(x = 143)? Do you mean to say that if we had a discontinuous probability function, one that looked like (say) a histogram, we could compute P(x = 143)?
Finding a probability for a normal distribution is the same as finding an area under the curve. So what do you think you will get when you calculate [imath]\displaystyle \int_{143}^{143} ..... = \ ?[/imath]
 
@mario99, thanks for clarifying. The answer is \(\displaystyle 0\).

Am I right about what I said then? We fix an interval like (142, 144) and then compute the proportion of students who are within 1 cm of A's height, which is 143 cm.
 
Doing this is the same as calculating [imath]P(142 \leq x \leq 144)[/imath]. I think you are confusing continuous and discrete distributions. In a discrete distribution, [imath]P(x = 143)[/imath] can have a non-zero value.
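
As a rough illustration of that interval calculation, here is a minimal Python sketch. The thread never states the mean or the standard deviation, so `SD = 8.0` is a made-up value; the mean of 159 cm then follows from the stated z score of -2 at 143 cm. With a different standard deviation the result would change.

```python
import math

def normal_cdf(x: float, mean: float, sd: float) -> float:
    """P(X <= x) for a normal distribution with the given mean and sd."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

SD = 8.0                  # hypothetical: not given anywhere in the thread
MEAN = 143.0 + 2.0 * SD   # 159 cm, so that 143 cm has a z score of -2

# P(142 <= X <= 144) is the area between the two endpoints.
p_interval = normal_cdf(144.0, MEAN, SD) - normal_cdf(142.0, MEAN, SD)

print(f"P(142 <= X <= 144) = {p_interval:.4f}")
```

Under these assumed numbers, a bit more than 1% of students would be within 1 cm of A's height.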
 
@mario99 I think so too. With discrete distributions we can compute something like P(X = 143), but not with continuous distributions.

So I should go with < and > and forget about \(\displaystyle \leq\) and \(\displaystyle \geq\) and \(\displaystyle =\).
 
In a continuous distribution, all of these are equal:

[imath]\displaystyle P(142 < x < 144)[/imath]
[imath]\displaystyle P(142 \leq x \leq 144)[/imath]
[imath]\displaystyle P(142 \leq x < 144)[/imath]
[imath]\displaystyle P(142 < x \leq 144)[/imath]
 
Yes, I get that now.

It's sad that we can't find P(X = 143 cm). If we could, we could find the proportion of students who are the same height as A, whose height is 143 cm. As a workaround I proposed the interval (142, 144), i.e. within 1 cm of A's height. We always stipulate an interval instead of a specific value.
 
Who told you we cannot? We can. [imath]P(X = 143 \ \text{cm}) = 0[/imath].

😉
 
I would consider \(\displaystyle 142.5 \leq X < 143.5\), as these values would round to 143.
This is appropriate if heights are being rounded to the nearest whole centimeter, so that we consider two students as having the "same height" if they both round to the same number.
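
To make the rounding interval concrete, here is a minimal Python sketch. As before, the standard deviation is not given in the thread, so `SD = 8.0` (and the resulting mean of 159 cm) is purely an assumption for illustration, and `p_within` is a hypothetical helper name. The loop also shows how the probability shrinks toward zero as the interval around 143 cm narrows.

```python
import math

def normal_cdf(x: float, mean: float, sd: float) -> float:
    """P(X <= x) for a normal distribution with the given mean and sd."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

SD = 8.0      # hypothetical: not given anywhere in the thread
MEAN = 159.0  # chosen so that 143 cm has a z score of -2 with SD = 8

def p_within(center: float, half_width: float) -> float:
    """Probability that X falls within half_width of center."""
    upper = normal_cdf(center + half_width, MEAN, SD)
    lower = normal_cdf(center - half_width, MEAN, SD)
    return upper - lower

# half_width = 0.5 is the "rounds to 143" bin from 142.5 to 143.5.
for half_width in (0.5, 0.05, 0.005):
    p = p_within(143.0, half_width)
    print(f"half-width {half_width:<5} cm -> probability {p:.6f}")
```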

But I think there is also another issue here:
Let's say that we're statistically analyzing the heights of students in a particular high school and the distribution is normal; we compute the mean and the standard deviation. We then take a particular student (A) and measure his height to be 143 cm.
The distribution of heights of actual students in this school is not normal; it is a discrete distribution that presumably can be approximated by a normal distribution. So in fact, P(x = 143) is determined by counting the actual number of students with that height.

In fact, if the distribution were literally normal, then there would be a non-zero probability of a negative height!

We use the normal distribution (a) as a simplification (believe it or not) of the actual distribution, and (b) as a model of a theoretical distribution from which we imagine the students in the school to have been drawn. This distinction seems to be generally ignored in elementary statistics, unfortunately.
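
The remark about negative heights can be checked numerically. This is only a sketch under assumed parameters: the thread gives no mean or standard deviation, so `SD = 8.0` and `MEAN = 159.0` are made-up values consistent with 143 cm having a z score of -2. The helper name `normal_lower_tail` is likewise just illustrative; it uses `math.erfc` because that keeps precision far out in the tail.

```python
import math

SD = 8.0      # hypothetical: not given anywhere in the thread
MEAN = 159.0  # chosen so that 143 cm has a z score of -2 with SD = 8

def normal_lower_tail(x: float, mean: float, sd: float) -> float:
    """P(X <= x), written with erfc so tiny tail areas do not round to 0."""
    z = (x - mean) / sd
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Probability that the model assigns to a height below 0 cm.
p_negative = normal_lower_tail(0.0, MEAN, SD)

print(f"P(X < 0) = {p_negative:.3e}")
```

The value is astronomically small but strictly positive, which is the sense in which a literally normal model allows negative heights.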
 
@Dr.Peterson
The Wikipedia page on the normal distribution concurs with what you say. A normal distribution is assumed for data with discrete frequencies even though the variable itself might be continuous, as in my example. Intriguingly, the only example of a continuous frequency is from QM (quantum mechanics).

There doesn't seem to be a choice in the matter. The normal distribution appears to be the only key we have for statistical analysis, at least at my level.

@Harry_the_cat I suppose that would depend on whether the measurements were precise to the first decimal place. If we rounded to the nearest centimeter while collecting the data (as I assumed), it would be wrong to use a precision like \(\displaystyle 142.5 \leq x < 143.5\), one that was never there.
 