Standard Normal Distribution

Agent Smith · Aug 25, 2024

A standard normal distribution is basically the bell curve. Once we have a normal distribution for a particular statistical question, we can apply this formula: $\displaystyle \text{z score}_x = \frac{x - \overline x}{\sigma}$. Let's say that we're statistically analyzing the heights of students in a particular high school and the distribution is normal; we compute the mean and the standard deviation. We then take a particular student (A) and measure his height to be 143 cm. Let it also be that $\displaystyle \text{z score}_{143 \text{ cm}} = -2$. A is exactly 2 standard deviations below the mean, which means (here I'm confused, a little help), by the empirical or 68-95-99.7 rule ...
a) 2.5% of the students have heights $\displaystyle < 143 \text{ cm}$
or
b) 2.5% of the students have heights $\displaystyle \leq 143 \text{ cm}$

Mayday! Mayday!

Dr.Peterson · Aug 25, 2024

Agent Smith said:
A standard normal distribution is basically the bell curve. Once we have a normal distribution for a particular statistical question, we can apply this formula: $\displaystyle \text{z score}_x = \frac{x - \overline x}{\sigma}$. Let's say that we're statistically analyzing the heights of students in a particular high school and the distribution is normal; we compute the mean and the standard deviation. We then take a particular student (A) and measure his height to be 143 cm. Let it also be that $\displaystyle \text{z score}_{143 \text{ cm}} = -2$. A is exactly 2 standard deviations below the mean, which means (here I'm confused, a little help), by the empirical or 68-95-99.7 rule ...
a) 2.5% of the students have heights $\displaystyle < 143 \text{ cm}$
or
b) 2.5% of the students have heights $\displaystyle \leq 143 \text{ cm}$

Mayday! Mayday!

It makes no difference.

The normal distribution is a continuous distribution, so P(x=143) = 0. That is, the probability of any single precise value is zero.
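
To make this concrete, here is a minimal sketch using Python's standard-library `statistics.NormalDist`. The mean (157 cm) and standard deviation (7 cm) are hypothetical values, not from the thread, chosen only so that 143 cm lands at z = -2. Because the cumulative distribution function gives the area to the left of a point, the same number answers both (a) and (b). It also shows that the 2.5% from the empirical rule is a rounded figure; the exact tail area is closer to 2.28%.

```python
from statistics import NormalDist

# Hypothetical parameters, chosen so that 143 cm sits exactly at z = -2.
MEAN_CM = 157.0
SD_CM = 7.0

heights = NormalDist(mu=MEAN_CM, sigma=SD_CM)

z = (143.0 - MEAN_CM) / SD_CM   # z score of student A
p_below = heights.cdf(143.0)    # area left of 143: P(X <= 143) = P(X < 143)

print(f"z = {z:.1f}")
print(f"P(X <= 143) = {p_below:.4f}")  # about 0.0228, not exactly 0.025
```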

Agent Smith · Aug 25, 2024

@Dr.Peterson gracias.

In this particular example I can't find students who are either as short/tall as A or shorter/taller, but I can find people who are shorter/taller than A. A is 143 cm tall.

How would I find the proportion of people as tall as A (143 cm)? Do I need to create an interval e.g. (142, 144), which allows me to compute a z score interval and then a proportion interval? We can then find the proportion of students whose heights are within 1 cm of A. Oui?

Agent Smith · Aug 25, 2024

Dr.Peterson said:
The normal distribution is a continuous distribution, so P(x=143) = 0. That is, the probability of any single precise value is zero.

Is this really the reason why we can't compute P(x = 143)? Do you mean to say that if we had a discontinuous probability function, one that looked like (say) a histogram, we could compute P(x = 143)?

mario99 · Aug 25, 2024

Agent Smith said:
Is this really the reason why we can't compute P(x = 143)? Do you mean to say that if we had a discontinuous probability function, one that looked like (say) a histogram, we could compute P(x = 143)?

Finding a probability in a normal distribution is the same as finding the area under the curve. So what do you think you will get when you calculate $\displaystyle \int_{143}^{143} \ldots = \ ?$

Agent Smith · Aug 25, 2024

@mario99 , arigato for clarifying. The answer is $\displaystyle 0$.

Am I right about what I said then? We fix an interval like (142, 144) and then compute the proportion of students who are within 1 cm of A's height which is 143 cm.

mario99 · Aug 25, 2024

Agent Smith said:
@mario99 , arigato for clarifying. The answer is $\displaystyle 0$

Am I right about what I said then? We fix an interval like (142, 144) and then compute the proportion of students who are within 1 cm of A's height which is 143 cm.

Doing this is the same as calculating the probability $P(142 \leq x \leq 144)$. I think that you are confusing continuous and discrete distributions. In a discrete distribution, you can get a nonzero value for the probability $P(x = 143)$.
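
As a rough sketch of that interval calculation, the probability is the difference of two cumulative values. The mean of 157 cm and standard deviation of 7 cm are hypothetical, picked only so that 143 cm corresponds to z = -2; `prob_between` is an illustrative helper, not a library function.

```python
from statistics import NormalDist

# Hypothetical mean and SD (cm); with these, 143 cm is at z = -2.
heights = NormalDist(mu=157.0, sigma=7.0)

def prob_between(dist, low, high):
    """Area under the normal curve between low and high."""
    return dist.cdf(high) - dist.cdf(low)

# Proportion of students within 1 cm of A's height.
p_interval = prob_between(heights, 142.0, 144.0)
print(f"P(142 <= X <= 144) = {p_interval:.4f}")
```

Under these assumed parameters the result is a bit over 1.5%, so an interval gives a usable, nonzero proportion where a single point does not.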

Agent Smith · Aug 25, 2024

@mario99 I think so too. With discrete distributions we can compute something like P(X = 143), but not with continuous distributions.

So I should go with < and > and forget about $\displaystyle \leq$ and $\displaystyle \geq$ and $\displaystyle =$.

mario99 · Aug 25, 2024

Agent Smith said:
@mario99 I think so too. With discrete distributions we can compute something like P(X = 143), but not with continuous distributions.

So I should go with < and > and forget about $\displaystyle \leq$ and $\displaystyle \geq$ and $\displaystyle =$ .

In a continuous distribution, all of these notations are the same:

$\displaystyle P(142 < x < 144)$
$\displaystyle P(142 \leq x \leq 144)$
$\displaystyle P(142 \leq x < 144)$
$\displaystyle P(142 < x \leq 144)$

Agent Smith · Aug 25, 2024

mario99 said:
In continuous distribution, all of these notations are the same:

$\displaystyle P(142 < x < 144)$
$\displaystyle P(142 \leq x \leq 144)$
$\displaystyle P(142 \leq x < 144)$
$\displaystyle P(142 < x \leq 144)$

Si, I get that now.

It's sad that we can't find P(X = 143 cm). If we could, we could find the proportion of students who are the same height as A, whose height is 143 cm. As a workaround I proposed an interval of 1 cm either side, (142, 144). We always stipulate an interval instead of a specific value.

mario99 · Aug 25, 2024

Agent Smith said:
Si, I get that now.

It's sad that we can't find P(X = 143 cm). If we could, we could find the proportion of students who are the same height as A, whose height is 143 cm. As a workaround I proposed an interval of 1 cm either side, (142, 144). We always stipulate an interval instead of a specific value.

Who told you we cannot? We can. $P(X = 143 \ \text{cm}) = 0$.
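
One way to see why the value is 0 is to shrink an interval around 143 and watch the area shrink with it. The sketch below assumes a hypothetical mean of 157 cm and standard deviation of 7 cm (values not given in the thread).

```python
from statistics import NormalDist

# Hypothetical mean and SD (cm); with these, 143 cm is at z = -2.
heights = NormalDist(mu=157.0, sigma=7.0)

# Probability of landing within +/- eps cm of 143, for shrinking eps.
half_widths = [1.0, 0.1, 0.01, 0.001]
probs = [heights.cdf(143.0 + eps) - heights.cdf(143.0 - eps)
         for eps in half_widths]

for eps, p in zip(half_widths, probs):
    print(f"P(143 - {eps} <= X <= 143 + {eps}) = {p:.6f}")

# The degenerate interval [143, 143] has zero width, hence zero area.
p_point = heights.cdf(143.0) - heights.cdf(143.0)
print(f"P(X = 143) = {p_point}")
```

Each tenfold cut in the half-width cuts the probability roughly tenfold, and the zero-width case is exactly 0.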

Agent Smith · Aug 26, 2024

mario99 said:
Who told you we cannot? We can. $P(X = 143 \ \text{cm}) = 0$ .

, but we know it isn't 0. A's height is 143 cm. Am I missing something?

mario99 · Aug 26, 2024

Agent Smith said:
, but we know it isn't 0. A's height is 143 cm. Am I missing something?

Prove it.

Harry_the_cat · Aug 26, 2024

I would consider $\displaystyle 142.5 \leq X < 143.5$, as these values would round to 143.

Dr.Peterson · Aug 26, 2024

Agent Smith said:
It's sad that we can't find P(X = 143 cm). If we could, we could find the proportion of students who are the same height as A, whose height is 143 cm. As a workaround I proposed an interval of 1 cm either side, (142, 144). We always stipulate an interval instead of a specific value.

Harry_the_cat said:
I would consider $\displaystyle 142.5 \leq X < 143.5$, as these values would round to 143.

This is appropriate if heights are being rounded to the nearest whole centimeter, so that we consider two students as having the "same height" if they both round to the same number.
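
A minimal sketch of that rounding bin, again assuming a hypothetical mean of 157 cm and standard deviation of 7 cm. It also compares the bin's area with the simpler estimate "density at 143 times bin width", which is close because the curve changes little over 1 cm.

```python
from statistics import NormalDist

# Hypothetical mean and SD (cm); with these, 143 cm is at z = -2.
heights = NormalDist(mu=157.0, sigma=7.0)

# Proportion of heights that would be recorded as 143 after rounding
# to the nearest whole centimeter.
p_rounds_to_143 = heights.cdf(143.5) - heights.cdf(142.5)

# Rectangle approximation: density at the bin center times a 1 cm width.
density_estimate = heights.pdf(143.0) * 1.0

print(f"P(142.5 <= X < 143.5) = {p_rounds_to_143:.5f}")
print(f"pdf(143) * 1 cm       = {density_estimate:.5f}")
```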

But I think there is also another issue here:

Agent Smith said:
Let's say that we're statistically analyzing the heights of students in a particular high school and the distribution is normal; we compute the mean and the standard deviation. We then take a particular student (A) and measure his height to be 143 cm.

The distribution of heights of actual students in this school is not normal; it is a discrete distribution that presumably can be approximated by a normal distribution. So in fact, P(x = 143) is determined by counting the actual number of students with that height.
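
As an illustration of "counting the actual number of students", the sketch below uses a small made-up list of recorded heights (hypothetical data, already rounded to whole centimeters). With real data the proportion at 143 cm is simply a count divided by the total.

```python
from collections import Counter

# Hypothetical recorded heights in cm, rounded to the nearest centimeter.
recorded_heights_cm = [143, 150, 152, 155, 155, 157,
                       157, 158, 160, 162, 165, 171]

counts = Counter(recorded_heights_cm)

# Empirical proportion of students recorded as exactly 143 cm.
proportion_143 = counts[143] / len(recorded_heights_cm)
print(f"{counts[143]} of {len(recorded_heights_cm)} students, "
      f"proportion = {proportion_143:.3f}")
```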

In fact, if the distribution were literally normal, then there would be a non-zero probability of a negative height!
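
That probability is tiny but strictly positive. The sketch below, assuming a hypothetical mean of 157 cm and standard deviation of 7 cm, computes it with `math.erfc` rather than an ordinary CDF call, since the complementary error function keeps precision this far into the tail.

```python
import math

MEAN_CM = 157.0  # hypothetical mean
SD_CM = 7.0      # hypothetical standard deviation

# z score of a height of 0 cm under the assumed model.
z_zero = (0.0 - MEAN_CM) / SD_CM

# For a normal distribution, P(X < x) = 0.5 * erfc(-z / sqrt(2)).
p_negative = 0.5 * math.erfc(-z_zero / math.sqrt(2))

print(f"z for 0 cm = {z_zero:.2f}")
print(f"P(X < 0)   = {p_negative:.3e}")
```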

We use the normal distribution (a) as a simplification (believe it or not) of the actual distribution, and (b) as a model of a theoretical distribution from which we imagine the students in the school to have been drawn. This distinction seems to be generally ignored in elementary statistics, unfortunately.

Agent Smith · Aug 26, 2024

@Dr.Peterson
The Wikipedia page on the normal distribution concurs with what you say. A normal distribution is assumed for data with discrete frequencies even though the variable itself might be continuous, like in my example. Intriguingly, the only example of a continuous frequency is from QM.

There doesn't seem to be a choice in the matter. The normal distribution appears to be the only key we have for statistical analysis, at least at my level.

@Harry_the_cat I suppose that would depend on whether the measurement precision was to the 1st decimal place. If while collecting data we rounded to the nearest centimeter (like I assumed), it would be wrong to use a precision like $\displaystyle 142.5 \leq x < 143.5$, one that was never there.
