The Range Median function is used to calculate a median from a series of data giving a count within a range. For the purposes of this explanation Median Household income will be used and the example data will be AGS 2004 data for Boulder County, Colorado.
The data as provided for AGS for Boulder looks like:
Var |
Desc |
Households |
Running Total |
---|---|---|---|
HIXCYLT05 |
Income $0 to $5,000 |
4233 |
4233 |
HIXCY0510 |
Income $5,000 to $9,999 |
2991 |
7224 |
HIXCY1015 |
Income $10,000 to $14,999 |
4277 |
11501 |
HIXCY1520 |
Income $15,000 to $19,999 |
4364 |
15865 |
HIXCY2025 |
Income $20,000 to $24,999 |
4756 |
20621 |
HIXCY2530 |
Income $25,000 to $29,999 |
4991 |
25612 |
HIXCY3035 |
Income $30,000 to $34,999 |
5483 |
31095 |
HIXCY3540 |
Income $35,000 to $39,999 |
5109 |
36204 |
HIXCY4045 |
Income $40,000 to $44,999 |
5180 |
41384 |
HIXCY4550 |
Income $45,000 to $49,999 |
4709 |
46093 |
HIXCY5055 |
Income $50,000 to $54,999 |
4784 |
50877 |
HIXCY5560 |
Income $55,000 to $59,999 |
4701 |
55578 |
HIXCY6065 |
Income $60,000 to $64,999 |
4720 |
60298 |
HIXCY6570 |
Income $65,000 to $69,999 |
3637 |
63935 |
HIXCY7075 |
Income $70,000 to $74,999 |
3943 |
67878 |
HIXCY7580 |
Income $75,000 to $79,999 |
4110 |
71988 |
HIXCY8085 |
Income $80,000 to $84,999 |
3680 |
75668 |
HIXCY8590 |
Income $85,000 to $89,999 |
3057 |
78725 |
HIXCY9095 |
Income $90,000 to $94,999 |
2785 |
81510 |
HIXCY95100 |
Income $95,000 to $99,999 |
2523 |
84033 |
HIXCY10025 |
Income $100,0000 to $124,999 |
11234 |
95267 |
HIXCY12550 |
Income $125,000 to $149,999 |
6186 |
101453 |
HIXCY15020 |
Income $150,000 to $199,999 |
5015 |
106468 |
HIXCY20050 |
Income $200,000 to $249,999 |
1969 |
108437 |
HIXCY25050 |
Income $250,000 to $499,999 |
2463 |
110900 |
HIXCYGT500 |
Income $500,000 Plus |
1523 |
112423 |
This is giving counts of households in each income range. How do we calculate a median from the ranges? Going back to the definition of a median, "a value in an ordered set of values below and above which there is an equal number of values (from dictionary.com)", we first have to find the range where the median point must exist. There are 112,423 households in the above table. The median HH income is the household where 56,211 households are above and below that one. Looking at the running total above, we can see that the median must be within the $60,000 to $64,999 range.
So now that we know for certainty that the Median Income for Boulder County is between $60K and $65K, we would like to know more specifically what it is. In reality, we don't have any information about the households in that range. All 4720 households could be making $60,001 or they could all be making $64,999, so the median really could be anywhere in that range, but we can presume that most areas fit a bell curve and there is a fairly even distribution of households in that range. We know that the median is 633 households into the range (56211-55578=633) or 13.4 % (100*633/4720=633). So simplistically we can assume that the median is 13.4% of the way into the range or 670 HHs into the range (.134*5000+60000=60670).
Now, just because we don't want to be too simple and we want to produce the most accurate result possible, we can take our estimate one step further. If we look at the above chart of HH incomes in Boulder, we see that there is a major drop off from the $60K-65K group to the $65K-70K group. Above we presumed that there were an equal # of people making incomes within the entire $60-65K range, but we can assume that there is a falloff within the range the is similar to the falloff between the ranges and that HHs are more concentrated at the low end of the range. So the RangeMedian function makes a very simplistic model of the curve (it only looks at 1 range above and 1 range below the median range.) Using a little area under the curve calculus, we can more accurately place the median at 60653. The exact details of the math are a little too complicated to put here, but the result is always very similar to the above simplistic method, but slightly more accurate.
RANGEMEDIAN(0,[HIXCYLT05],5000,[HIXCY0510],10000,[HIXCY1015],15000,[HIXCY1520],20000,[HIXCY2025],25000,[HIXCY2530],30000,[HIXCY3035],35000,[HIXCY3540],40000,[HIXCY4045],45000,[HIXCY4550],50000,[HIXCY5055],55000,[HIXCY5560],60000,[HIXCY6065],65000,[HIXCY6570],70000,[HIXCY7075],75000,[HIXCY7580],80000,[HIXCY8085],85000,[HIXCY8590],90000,[HIXCY9095],95000,[HIXCY95100],100000,[HIXCY10025],125000,[HIXCY12550],150000,[HIXCY15020],200000,[HIXCY20050],250000,[HIXCY25050],500000,[HIXCYGT500])
©2018 Alteryx, Inc., all rights reserved. Allocate®, Alteryx®, Guzzler®, and Solocast® are registered trademarks of Alteryx, Inc.