Weighted median
In statistics, a weighted median of a sample is the 50% weighted percentile.[1][2][3] It was first proposed by F. Y. Edgeworth in 1888.[4][5] Like the median, it is useful as an estimator of central tendency, robust against outliers. It allows for non-uniform statistical weights related to, e.g., varying precision measurements in the sample.
Definition
For distinct ordered elements with positive weights such that , the weighted median is the element satisfying: and . If then the weighted median is Where is the next element with non-zero weight. This is a generalization of the standard median which is the weighted median of a set of elements with equal weight. [1]
Properties
Weighted median of a set partitions that set in two halves such that sum of weights in each partition is as equal as possible.
If weights of all numbers in the set are equal then median is same as weighted median.
When set has even number of elements, we will have lower weighted median and upper weighted median instead of a single element as median.
Example
Consider the set of numbers with each number having weights respectively. The median is 0.1 and the weighted median is 0.2.[1]
Algorithm
Weighted median can be computed by sorting the set of numbers and finding the smallest numbers which sums to half the weight of total weight. This algorithm takes time. There is a better approach to find weighted median using a modified selection algorithm.[1]
//Main call is WeightedMedian(a, 1, n)
//Returns lower median
WeightedMedian(a[1..n], p, r)
//base case for single element
if r = p
return a[p]
//base case for two elements
//make sure we return the average, in case the two candidates have equal weight
if r-p = 1
if a[p].w == a[r].w
return (a[p] + a[r])/2
if a[p].w > a[r].w
return a[p]
else
return a[r]
//partition around pivot r
q = partition(a, p, r)
wl, wg = sum weights of partitions (p, q-1), (q+1, r)
//if partitions are balanced then we are done
if wl and wg both < 1/2
return a[q]
else
//increase pivot weight by the amount of partition we eliminate
if wl > wg
a[q].w += wg
//recurse on pivot inclusively
WeightedMedian(a, p, q)
else
a[q].w += wl
WeightedMedian(a, q, r)
As the recursive call in inclusive of median, recurrence relation is,
Which yields runtime.
See also
References
- 1 2 3 4 Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "Introduction to Algorithms". ISBN 9780262032933.
- ↑ Horowitz, Ellis; Sahni, Sartaj; Rajasekaran, Sanguthevar (1996-12-15). "Computer Algorithms C++: C++ and Pseudocode Versions". ISBN 9780716783152.
- ↑ Bovik, Alan C (2010-07-21). "Handbook of Image and Video Processing". ISBN 9780080533612.
- ↑ Edgeworth, F. Y. (1888). "On a New Method of Reducing Observations Relating to Several Quantities". Philosophical Magazine. 25 (154): 184. doi:10.1080/14786448808628170.
- ↑ Edgeworth, F. Y. (1887). "On Observations Relating to Several Quantities". Hermathena. Trinity College Dublin. 6: 279–285. JSTOR 23036355.