Average Hourly Rate/Wage Query

AlifSolo

Here is a general math/statistics question that I would like to learn more about. There are two areas/departments where people work and are paid hourly. I am trying to track the average hourly rate/wage (AHR) for each area and for the whole company to see how we are trending.

See the attached table for context: between June and July, the two areas' AHRs increased by $0.02 and $0.07 respectively. However, when totaling, the numbers don't tell the same story.

Question 1: When totaling the Wages Paid and Hours Worked and calculating AHR, why does the July total AHR decrease vs. June, even though each area's AHR increases from June to July?

Question 2: What is the mathematical/statistical principle that is causing this?

Thanks,

[attached image: table of wages paid, hours worked, and AHR by area, June vs. July]
 

I think this would be considered an example of Simpson's Paradox.

As it says there,

Simpson's paradox, which also goes by several other names, is a phenomenon in probability and statistics in which a trend appears in several groups of data but disappears or reverses when the groups are combined. ...

The lesson of Simpson's paradox isn't really to tell us which viewpoint to take but to insist that we keep both the parts and the whole in mind at once.
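To see the reversal with concrete numbers: since your actual table isn't reproduced in the thread, here is a small sketch with made-up figures that behave the same way. Each area's AHR ticks up from June to July, but the higher-paid area works far fewer hours in July, which drags the combined average down:

```python
# Hypothetical (wages_paid, hours_worked) per area -- not the OP's real figures.
june = {"area1": (10_000.00, 1000), "area2": (15_000.00, 1000)}
july = {"area1": (10_020.00, 1000), "area2": (7_535.00, 500)}

def ahr(wages, hours):
    """Average hourly rate = total wages paid / total hours worked."""
    return wages / hours

# Per-area AHR rises from June to July...
for area in ("area1", "area2"):
    print(area, f"{ahr(*june[area]):.2f} -> {ahr(*july[area]):.2f}")
# area1 10.00 -> 10.02
# area2 15.00 -> 15.07

# ...yet the company-wide AHR falls, because July's hours are
# concentrated in the cheaper area.
def total_ahr(month):
    return ahr(sum(w for w, _ in month.values()),
               sum(h for _, h in month.values()))

print(f"total {total_ahr(june):.2f} -> {total_ahr(july):.2f}")
# total 12.50 -> 11.70
```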

I tried making a diagram like that shown under Vector Interpretation, and it mostly just shows how small the effect is in your example:

[attached image: vector diagram of the two areas' wages vs. hours]

In general, I simply don't expect overall averages to be related to individual averages! But it's surprising when you haven't seen it before, isn't it?
 
I realized I'd mixed up all the numbers and axes, so I redid the picture:

[attached image: corrected vector diagram; slope = average hourly rate]

The idea is that the slope is the average hourly rate, and for each area it hardly changes from June to July; but the way they combine makes it change in the opposite direction (still only a little) when added together.
 
I shall try to give a non-mathematical explanation.

Do you see that the average for each area, considered in isolation, increased from June to July?

Do you also see that in each month the average for area 1 was lower than the average for area 2?

But the difference in hours worked between June and July was relatively modest in area 1 but relatively large in area 2?

So the monthly change in the average for the whole company is determined both by the difference in the average wages paid in the two areas AND by the monthly difference in the hours worked in the two areas. In this specific case, the greater reduction in hours worked in the more costly area more than offset the increase in wages in both areas. In other words, more than one thing is going on, and you cannot focus on just one.
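That offsetting can be made explicit: the company-wide AHR is just each area's AHR weighted by its share of the total hours. A sketch with hypothetical numbers (not your actual table):

```python
# Hypothetical per-area AHRs and hours -- illustrative only.
rates = {"june": [10.00, 15.00], "july": [10.02, 15.07]}
hours = {"june": [1000, 1000],   "july": [1000, 500]}

def company_ahr(rates, hours):
    """Hours-weighted average of the per-area rates."""
    total = sum(hours)
    return sum(r * h / total for r, h in zip(rates, hours))

# The hour weights shift from 50/50 in June to 67/33 in July,
# pulling the combined AHR down even though both rates went up.
for m in ("june", "july"):
    shares = [h / sum(hours[m]) for h in hours[m]]
    print(m, [f"{s:.0%}" for s in shares],
          f"{company_ahr(rates[m], hours[m]):.2f}")
```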

Does that make intuitive sense?

There is a way to eliminate the problem, and that is to take both into account, but the math can get a bit tricky to explain.
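One common way to "take both into account" is to hold the hour mix fixed: recompute July's combined AHR using June's hours as weights. With the mix held constant, the total moves the same way the individual areas do. A sketch, again with made-up figures rather than the real table:

```python
# Hypothetical per-area AHRs and hours -- illustrative only.
june_rates, june_hours = [10.00, 15.00], [1000, 1000]
july_rates, july_hours = [10.02, 15.07], [1000, 500]

def weighted_ahr(rates, hours):
    """Combine per-area rates using the given hours as weights."""
    return sum(r * h for r, h in zip(rates, hours)) / sum(hours)

actual_june = weighted_ahr(june_rates, june_hours)     # 12.50
actual_july = weighted_ahr(july_rates, july_hours)     # ~11.70: appears to fall
# Re-weight July's rates by June's hour mix instead:
fixed_mix_july = weighted_ahr(july_rates, june_hours)  # 12.545: now rises
print(actual_june, actual_july, fixed_mix_july)
```

This is the same idea as a fixed-weight price index: by freezing the weights, the comparison isolates the rate changes from the shift in hours.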
 
Thank you guys so much. I was able to figure this out from both of your explanations. Ultimately, both areas must have the same number of days of data in order for the total var to work.

Cheers!
 