OK, so I have a fairly complicated query to which the answer is possibly straightforward, but I can't see it, so any help would be gratefully received. I work with data a lot, but rates always give me problems. For some reason I just can't get my head around them in the same way as other mathematical concepts.
I have a set of data for which the rate per 100,000 appears to have been calculated incorrectly. I can see the problem and I can see what's been done, I just don't understand it. I can't include the data, or a sample of it, as it was obtained from behind a paywall, but I'll attempt to replicate it.
The data is for a single quarter (October to December) and looks a bit like this:
Place | No. of Identified Cases | Total Population | Rate per 100k |
Town 1 | 142 | 26,472 | 2,146 |
Town 2 | 203 | 42,254 | 1,922 |
Town 3 | 122 | 28,578 | 1,708 |
It actually reports on a lag, so October-December's data is being reported now and acts as the first quarter of this reporting period (it's a weird setup). The formula included with the dataset for working out the rate per 100k is this:
(No. of Identified Cases / Total Population) x 100,000
The problem is that this formula gives results that differ from the official figures. Town 1, for example, comes out at a rate of 536 per 100k and Town 2 at 480 per 100k. Having played with the data, I've discovered that the given 'Rate per 100k' is 4 times my calculation (taken before rounding), which suggests that the rate is being provisionally projected for an entire 12-month period based on a single quarter's figure.
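For anyone who wants to check the arithmetic, here is a minimal Python sketch of the comparison. The three rows are the replicated figures from the table; the names `rate_per_100k` and `check_towns` are just labels made up for this illustration, and the multiplication by 4 is my guess at what has been done, not something the dataset documents.

```python
# Replicated figures from the table: (cases, population, published rate per 100k).
towns = {
    "Town 1": (142, 26_472, 2_146),
    "Town 2": (203, 42_254, 1_922),
    "Town 3": (122, 28_578, 1_708),
}


def rate_per_100k(cases, population):
    """Apply the formula supplied with the dataset."""
    return cases / population * 100_000


def check_towns(data):
    """Return (rounded formula rate, rounded formula rate x 4, published rate) per place."""
    rows = {}
    for place, (cases, population, published) in data.items():
        quarterly = rate_per_100k(cases, population)
        # Assumption: the published figure is the single-quarter rate scaled to 12 months.
        rows[place] = (round(quarterly), round(quarterly * 4), published)
    return rows


for place, (quarterly, times_four, published) in check_towns(towns).items():
    print(f"{place}: formula = {quarterly}, formula x 4 = {times_four}, published = {published}")
```

The multiplication is applied before rounding: rounding first would give 536 x 4 = 2,144 for Town 1 rather than the published 2,146.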
Now, the data is reported cumulatively, i.e. next quarter's data will cover January to March and will be reported as the sum of two quarters' data. So if the January-March identified cases number 150, the reported figure will be 292 (142 + 150).
However, because the population stays the same while the cumulative number of identified cases grows, a rate per 100k calculated this way will rise as the quarters progress.
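To show what I mean, here is a rough Python sketch for Town 1 using the made-up second-quarter figure of 150 from the example above. It prints the plain cumulative rate alongside an annualised version, which scales the year-to-date rate by 4 divided by the number of quarters elapsed. That scaling is only my assumption about how a 12-month projection might be kept up to date; nothing in the dataset confirms it, and the function name `cumulative_and_annualised` is invented for this illustration.

```python
POPULATION = 26_472            # Town 1, from the table
QUARTERLY_CASES = [142, 150]   # Q1 from the table; Q2 is the made-up example figure


def rate_per_100k(cases, population):
    """Apply the formula supplied with the dataset."""
    return cases / population * 100_000


def cumulative_and_annualised(quarterly_cases, population):
    """Return (running total, cumulative rate, annualised rate) for each quarter."""
    results = []
    running_total = 0
    for quarters_elapsed, cases in enumerate(quarterly_cases, start=1):
        running_total += cases
        cumulative = rate_per_100k(running_total, population)
        # Assumption: project the year-to-date rate onto a full 12 months.
        annualised = cumulative * 4 / quarters_elapsed
        results.append((running_total, cumulative, annualised))
    return results


for quarter, (total, cumulative, annualised) in enumerate(
    cumulative_and_annualised(QUARTERLY_CASES, POPULATION), start=1
):
    print(f"Q{quarter}: cases to date = {total}, "
          f"cumulative rate = {cumulative:.0f}, annualised rate = {annualised:.0f}")
```

With these numbers the cumulative rate roughly doubles between the two quarters (536 to 1,103), while the annualised figure moves only from 2,146 to 2,206.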
My question is two-fold:
1. Is it usual practice for a rate per 100k to be worked out in advance based on an initial entry?
2. Is it correct to calculate a new rate per 100k using cumulative data?
As I said, I just don't 'get' rates. I understand the concept and I understand the formula, but for some reason the combination of the two just does not sit in my head correctly.