Associating histograms and boxplots with respective data

Petcheco

Hello, everyone!

In this question we're asked to associate each sample with its histogram and boxplot. The problem is we are only given the mean, median and standard deviation of the samples. They are as follows:
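| Sample | Mean | Median | St. Dev. |
|---|---|---|---|
| 1 | 68.72 | 71 | 30.019 |
| 2 | 50.08 | 50 | 33.625 |
| 3 | 47.88 | 41 | 19.020 |
| 4 | 40.76 | 38 | 21.171 |
| 5 | 52.32 | 53 | 38.297 |
| 6 | 67.8 | 74 | 17.049 |
| 7 | 50.32 | 50 | 16.163 |
| 8 | 49.4 | 50 | 20.265 |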

The graphs are given below:

Histograms and boxplots

Using the fact that the median is given by the vertical line in the boxplots, I was able to deduce the following:

| Sample | Boxplot |
|---|---|
| 1 | C |
| 2 | E, F or G |
| 3 | A |
| 4 | B |
| 5 | H |
| 6 | D |
| 7 | E, F or G |
| 8 | E, F or G |

However, I cannot for the life of me figure out how to decide between E, F and G for samples 2, 7 and 8, and especially how to deduce which histogram corresponds to which sample.

Any help is appreciated.
 
You may need to actually calculate the means and standard deviations and quartiles for each of the histograms, and match the quartiles with the boxplots and the standard deviations with the data.

Have you tried bringing the histograms into the mix at all? That may be the key.
 
Hello, Dr.Peterson! Thanks for replying.

Indeed, that was my first attempt. However, two problems kept this approach from working:

i) The values on the x-axis are intervals and not single numbers, i.e. [0,10], not 0 and 10. Because of this, calculations only yield approximations or lower/upper bounds of the statistics;

ii) The set of samples' means given has some values which are very near each other, making it hard to distinguish between the samples using only approximations.
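
To make point (i) concrete, here is a minimal Python sketch of the usual grouped-data estimate: treat every observation as sitting at the midpoint of its interval, and bound the mean by pushing every observation to the left or right edge of its interval. The function name `grouped_stats` and the bar heights are hypothetical; the counts are NOT read from the actual figure.

```python
import math

def grouped_stats(edges, counts):
    """Estimate mean and SD from binned data, plus hard bounds on the mean.

    edges  -- bin edges, e.g. [0, 10, 20, ...] (len(counts) + 1 values)
    counts -- number of observations in each bin
    """
    n = sum(counts)
    mids = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]
    # Midpoint estimate: every observation is placed at its bin centre.
    mean = sum(m * c for m, c in zip(mids, counts)) / n
    var = sum(c * (m - mean) ** 2 for m, c in zip(mids, counts)) / (n - 1)
    # Bounds on the mean: every observation at the left / right bin edge.
    mean_lo = sum(lo * c for lo, c in zip(edges, counts)) / n
    mean_hi = sum(hi * c for hi, c in zip(edges[1:], counts)) / n
    return mean, math.sqrt(var), (mean_lo, mean_hi)

# Hypothetical bar heights for intervals [0,10], [10,20], ..., [90,100].
edges = list(range(0, 101, 10))
counts = [2, 3, 5, 8, 12, 12, 8, 5, 3, 2]

mean, sd, (lo, hi) = grouped_stats(edges, counts)
print(f"mean ~ {mean:.2f} (between {lo:.2f} and {hi:.2f}), sd ~ {sd:.2f}")
```

With intervals of width 10, the bounds sit 5 units either side of the midpoint estimate, which is why means as close as 50.08, 50.32 and 49.4 cannot be separated this way.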

But I was able to solve it (I think so at least), even if it still feels a bit like guesswork.

By using the fact that the vertical line in the boxes represents the samples' medians, I assembled the following table:

| Sample | Boxplot |
|---|---|
| 1 | C |
| 2 | E, F or G |
| 3 | A |
| 4 | B |
| 5 | H |
| 6 | D |
| 7 | E, F or G |
| 8 | E, F or G |

Now using the whiskers as a way to identify minimum and maximum values and the position of the median as a measure of skewness, I can go a little further:

| Boxplot | Minimum | Maximum | Asymmetry |
|---|---|---|---|
| A | (20,30) | (90,100) | Positive |
| B | (0,10) | (90,100) | Positive |
| C | (0,10) | (90,100) | Negative |
| D | (20,30) | (80,90) | Negative |
| E | (0,10) | (80,90) | Positive |
| F | (20,30) | (70,80) | Positive |
| G | (0,10) | (90,100) | Symmetric |
| H | (0,10) | (90,100) | Negative |

Now, using both tables, we can associate each boxplot with its corresponding histogram:

| Boxplot | Histogram |
|---|---|
| A | III |
| B | I |
| C | VIII |
| D | V |
| E | II |
| F | IV |
| G | VI |
| H | VII |

Finally, using the difference between the mean and the median as a measure of skewness, we can finalize the question and assign the remaining non-unique boxplots. The answer is:

| Sample | Boxplot | Histogram |
|---|---|---|
| 1 | C | VIII |
| 2 | G | VI |
| 3 | A | III |
| 4 | B | I |
| 5 | H | VII |
| 6 | D | V |
| 7 | E | II |
| 8 | F | IV |
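
The mean-minus-median check used in the last step can be tabulated with a short Python sketch. The (mean, median, st. dev.) values are as read from the table in the first post; the scaled version, Pearson's second skewness coefficient 3 * (mean - median) / sd, is an addition of mine and not something the question asks for.

```python
# (mean, median, standard deviation) per sample, as read from the first post.
samples = {
    1: (68.72, 71, 30.019),
    2: (50.08, 50, 33.625),
    3: (47.88, 41, 19.020),
    4: (40.76, 38, 21.171),
    5: (52.32, 53, 38.297),
    6: (67.8, 74, 17.049),
    7: (50.32, 50, 16.163),
    8: (49.4, 50, 20.265),
}

diffs = {}
for k, (mean, median, sd) in samples.items():
    diffs[k] = mean - median
    # Pearson's second skewness coefficient (scale-free version of the gap).
    pearson = 3 * (mean - median) / sd
    print(f"sample {k}: mean - median = {diffs[k]:+.2f}, Pearson = {pearson:+.3f}")
```

A positive difference suggests a longer right tail and a negative one a longer left tail. For samples 2, 7 and 8 the difference is smaller than 1 in absolute value, so this check alone separates them only weakly.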

Have a good one!
 
A testing question. Requires good logic and detective work.
I tried it myself and got pretty much the same as you, except that I swapped your samples 7 and 8 and your histograms VI and VII.
It's hard to distinguish between them, so who knows?!
A good question.

(The reasoning for my swaps:
Histogram VI has high bars at the extreme ends of the graph, so its lower quartile (LQ) should be lower and its upper quartile (UQ) higher than Histogram VII's, so I think, looking at the box plots, Histogram VI is (H, 5) and Histogram VII is (G, 2).

Histogram IV's bars are closer together, so it will have a smaller standard deviation than Histogram II, so I think (F, IV) is Sample 7 and (E, II) is Sample 8.)
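
The quartile argument can be illustrated with a small Python sketch that estimates quartiles from binned data by interpolating linearly inside the interval where the cumulative count crosses the target. The helper `grouped_quantile` and both sets of bar heights are hypothetical and are NOT read from the actual histograms; they only show the direction of the effect.

```python
def grouped_quantile(edges, counts, q):
    """Quantile q (between 0 and 1) from binned data, linear within a bin."""
    n = sum(counts)
    target = q * n
    cum = 0
    for lo, hi, c in zip(edges, edges[1:], counts):
        # The quantile lies in the first non-empty bin reaching the target.
        if c > 0 and cum + c >= target:
            return lo + (target - cum) / c * (hi - lo)
        cum += c
    return edges[-1]

edges = list(range(0, 101, 10))
# Hypothetical bar heights with the same total of 50 observations.
u_shaped = [10, 6, 4, 3, 2, 2, 3, 4, 6, 10]  # tall bars at both ends
flat = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]        # spread evenly

for name, counts in [("U-shaped", u_shaped), ("flat", flat)]:
    lq = grouped_quantile(edges, counts, 0.25)
    uq = grouped_quantile(edges, counts, 0.75)
    print(f"{name}: LQ ~ {lq:.1f}, UQ ~ {uq:.1f}, IQR ~ {uq - lq:.1f}")
```

In this made-up example the histogram with tall bars at the ends has a lower LQ, a higher UQ and therefore a wider box, which is the pattern the swap relies on.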
 