Consider the event "flipping a coin [imath]n[/imath] times and recording the number of heads."
Let [imath]\alpha[/imath] denote the binomial induced measure with
[imath]\qquad \qquad \alpha(\{j\})=\left(\begin{array}{c} n \\ j \end{array}\right) p^{j}(1-p)^{n-j} \quad \text { for } j=0,1,2, \ldots, n[/imath]
Note: The value [imath]p[/imath] appearing in (3) with [imath]0<p<1[/imath] is not the probability [imath]p[/imath] featured in the name "[imath]p[/imath]-values", but you should understand that probability by the end of this problem.
We take as a "null hypothesis" the statement
The coin used in the event above is a fair coin.
Complete parts (a)-(d) under the assumption that the null hypothesis holds, i.e., [imath]p=1 / 2[/imath]
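As a quick sanity check on the formula for [imath]\alpha(\{j\})[/imath], here is a minimal Python sketch that evaluates it exactly under the null hypothesis [imath]p=1/2[/imath]. The function name `binomial_pmf` and the choice of exact `Fraction` arithmetic are my own assumptions for illustration, not part of the problem statement.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, j, p=Fraction(1, 2)):
    """Return alpha({j}) = C(n, j) * p^j * (1 - p)^(n - j) as an exact fraction."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)


n = 3
# Masses of the singletons {0}, {1}, ..., {n} under the fair-coin assumption.
weights = [binomial_pmf(n, j) for j in range(n + 1)]
print(weights)       # [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]
print(sum(weights))  # 1, as required of a probability measure
```

Passing a different `p` (for example `Fraction(1, 3)`) gives the masses for a biased coin, which makes it easy to see that this [imath]p[/imath] is a parameter of the model and not a [imath]p[/imath]-value.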
(a) Calculate the expectation [imath]x_{*}[/imath] of the function [imath]x:\{0,1\}^{n} \rightarrow \mathbb{R}[/imath] defined by
[imath]\qquad \qquad x\left(\omega_{1}, \omega_{2}, \ldots, \omega_{n}\right)=\sum_{j=1}^{n} \omega_{j}[/imath]
with respect to the Binomial measure [imath]\beta: \wp\left(\{0,1\}^{n}\right) \rightarrow[0,1][/imath], i.e., calculate the integral
[imath]\qquad \qquad x_{*}=\int_{S} x[/imath]
with respect to [imath]\beta[/imath] where [imath]S=\{0,1\}^{n}[/imath]
(b) Calculate the expectation of the identity function on [imath]\mathbb{R}[/imath] with respect to the binomial induced measure [imath]\alpha: \wp(\mathbb{R}) \rightarrow[0,1][/imath].
(c) Taking [imath]n=3[/imath], consider the set
[imath]\qquad \qquad A=\left\{\left(\omega_{1}, \omega_{2}, \omega_{3}\right) \in S: x\left(\omega_{1}, \omega_{2}, \omega_{3}\right) \geq 2\right\}[/imath]
(i) What compound outcome does the set [imath]A[/imath] model and what is the probabilistic interpretation of [imath]\beta(A)[/imath] in terms of the event "flipping a coin 3 times and recording the number of heads"?
(ii) Rewrite the set [imath]A[/imath] in the form
[imath]\qquad \qquad A=\left\{\left(\omega_{1}, \omega_{2}, \omega_{3}\right) \in S: x\left(\omega_{1}, \omega_{2}, \omega_{3}\right)-x_{*} \geq \delta\right\}[/imath]
for some [imath]\delta>0[/imath].
(iii) What compound outcome does the set
[imath]\qquad \qquad B=\left\{\left(\omega_{1}, \omega_{2}, \omega_{3}\right) \in S:\left|x\left(\omega_{1}, \omega_{2}, \omega_{3}\right)-x_{*}\right| \geq \delta\right\}[/imath]
model?
(iv) Find [imath]\beta(B)[/imath].
(d) Generalize/repeat part (c) for [imath]n=4,5,6[/imath] replacing the relation
[imath]\qquad \qquad x\left(\omega_{1}, \omega_{2}, \omega_{3}\right) \geq 2[/imath]
in (4) with
[imath]\qquad \qquad x\left(\omega_{1}, \omega_{2}, \ldots, \omega_{n}\right) \geq n-1 [/imath]
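For anyone who wants to check hand calculations in parts (c) and (d), the sets can be built by brute-force enumeration of [imath]S=\{0,1\}^{n}[/imath]. The sketch below assumes the fair-coin measure, so every outcome in [imath]S[/imath] has mass [imath]2^{-n}[/imath]. The helper names `beta` and `two_sided_set` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction
from itertools import product


def beta(event, n):
    """Measure of a subset of S = {0,1}^n when each outcome has mass 2^-n (p = 1/2)."""
    return Fraction(len(event), 2**n)


def two_sided_set(n, threshold):
    """Return (x_star, delta, B) for the relation x(omega) >= threshold."""
    S = list(product((0, 1), repeat=n))
    # Expectation of x computed directly as the integral of x over S.
    x_star = Fraction(sum(sum(w) for w in S), 2**n)
    # Rewrite x >= threshold as x - x_star >= delta.
    delta = threshold - x_star
    # B collects outcomes at least delta away from x_star on either side.
    B = [w for w in S if abs(sum(w) - x_star) >= delta]
    return x_star, delta, B


for n in (3, 4, 5, 6):
    x_star, delta, B = two_sided_set(n, n - 1)
    print(f"n={n}: x_*={x_star}, delta={delta}, beta(B)={beta(B, n)}")
```

Enumeration costs [imath]2^{n}[/imath] steps, which is fine for the small [imath]n[/imath] in this problem but would not scale; for larger [imath]n[/imath] one would sum the values [imath]\alpha(\{j\})[/imath] instead.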
(e) The answers you got for [imath]\beta(B)[/imath] are not technically [imath]p[/imath]-values. Technically, a [imath]p[/imath]-value is both the value of a probability measure, i.e., a "probability", and a statistic. This means that technically you need a data set to get a [imath]p[/imath]-value. The idea is that the existence of a certain data set may cast doubt on (or justify the rejection of) the null hypothesis.
(i) Say you actually flip a coin three times with the result "heads", "tails", "heads" corresponding to the model outcome [imath](1,0,1)[/imath]. Then the value you computed in part (c)(iv) is the [imath]p[/imath]-value associated with the data from your event (or experiment). The fact that the value [imath]\beta(B)[/imath] depends on the data makes it a statistic.
How does [imath]\beta(B)[/imath] depend on the data? Why is it a statistic?
(ii) If the value of [imath]\beta(B)[/imath] is "high", then the idea is that the data gives you no reason to reject the null hypothesis: As far as this data goes, it may very well be the case that the coin is a fair coin. But if the value is "low," then perhaps the null hypothesis should be rejected.
Should the [imath]p[/imath]-value in this case be considered "high" or "low"?
(iii) Repeat part (e)(ii) for [imath]n=4,5,6[/imath]. For example, say you flip a coin four times and obtain an outcome involving [imath]4-1=3[/imath] heads. Find the [imath]p[/imath]-value. Do you think it is "high," "low," or somewhere in between?
(iv) Given data [imath]\left(\omega_{1}, \omega_{2}, \ldots, \omega_{n}\right) \in S[/imath] corresponding to actual coin flipping for some general [imath]n[/imath], formulate the associated [imath]p[/imath]-value determined by the data and the null hypothesis that the coin is fair. Hint: If [imath]x\left(\omega_{1}, \omega_{2}, \ldots, \omega_{n}\right)=k[/imath], then the [imath]p[/imath]-value is the probability, assuming the null hypothesis, that any data [imath]a \in S[/imath] collected in a similar manner, i.e., by flipping the coin [imath]n[/imath] times, has [imath]x(a)[/imath] at least as far from the expected value [imath]x_{*}[/imath] as the actual data [imath]\omega=\left(\omega_{1}, \omega_{2}, \ldots, \omega_{n}\right)[/imath].
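The hint in (e)(iv) can be turned into a short computation, which also makes it visible how the result depends on the data. This is only a sketch of one reading of the hint: it assumes the fair-coin null hypothesis, takes [imath]x_{*}=n/2[/imath] as the expected number of heads, and works with head counts (the induced measure [imath]\alpha[/imath]) rather than enumerating [imath]S[/imath]. The function name `p_value` is hypothetical.

```python
from fractions import Fraction
from math import comb


def p_value(data):
    """Two-sided p-value for a tuple of 0/1 flips, assuming a fair coin."""
    n = len(data)
    k = sum(data)                # observed number of heads, x(omega)
    x_star = Fraction(n, 2)      # expected number of heads when p = 1/2
    observed = abs(k - x_star)   # distance of the actual data from x_star
    # Count outcomes a in S with |x(a) - x_star| >= |x(omega) - x_star|,
    # grouped by head count j: there are C(n, j) outcomes with x(a) = j.
    count = sum(comb(n, j) for j in range(n + 1) if abs(j - x_star) >= observed)
    return Fraction(count, 2**n)


print(p_value((1, 0, 1)))     # data from (e)(i)
print(p_value((1, 1, 1, 0)))  # four flips with 4 - 1 = 3 heads
```

Note that the data enters only through [imath]k=x(\omega)[/imath], so two data sets with the same number of heads give the same value.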
(f) Here is perhaps the most interesting part of this problem: The description of the calculation of a [imath]p[/imath]-value defined/described in part (e) above, e.g.,
"... the probability, assuming the null hypothesis, ..."
strongly suggests the calculation of the value of some restriction probability measure.
(i) What is the domain of the measure in question which is being restricted?
(ii) What collection of abstract outcomes does that domain model?