Winsorized mean

A winsorized mean is a statistical measure of central tendency, much like the mean and median, and most closely related to the truncated mean. It is the mean calculated after winsorizing: replacing given parts of a probability distribution or sample at the high and low ends with the most extreme remaining values,^[1] typically by an equal amount at both extremes; often 10 to 25 percent of the ends are replaced. Equivalently, the winsorized mean can be expressed as a weighted average of the truncated mean and the quantiles at which the data are limited, which corresponds to replacing the extreme parts with those quantiles.

Advantages

The winsorized mean is a useful estimator because it is less sensitive to outliers than the mean, yet it still gives a reasonable estimate of central tendency, or of the mean, for almost all statistical models. In this sense it is referred to as a robust estimator.
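The reduced sensitivity to outliers can be illustrated with a minimal Python sketch. The function name `winsorized_mean`, the choice to round the number of replaced values down, and the sample data are all assumptions made for illustration, not part of any particular library.

```python
from statistics import mean


def winsorized_mean(values, fraction):
    """Mean after replacing the lowest and highest `fraction` of the
    sorted values with the nearest remaining value (illustrative sketch)."""
    data = sorted(values)
    n = len(data)
    # Number of values replaced at each end; rounding down is an assumption.
    k = int(n * fraction)
    if 2 * k >= n:
        raise ValueError("fraction is too large for this sample size")
    if k > 0:
        data[:k] = [data[k]] * k               # low end takes the smallest kept value
        data[n - k:] = [data[n - k - 1]] * k   # high end takes the largest kept value
    return mean(data)


# Hypothetical data: the same sample with and without one gross outlier.
clean = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
contaminated = clean[:-1] + [1000]

print(mean(clean), winsorized_mean(clean, 0.1))
print(mean(contaminated), winsorized_mean(contaminated, 0.1))
```

In this sketch the single outlier moves the ordinary mean from 6.5 to 105.4, while the 10% winsorized mean stays at 6.5 for both samples.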

Drawbacks

The winsorized mean uses more information from the distribution or sample than the median. However, unless the underlying distribution is symmetric, the winsorized mean of a sample is unlikely to be an unbiased estimator of either the mean or the median.

Example

For a sample of 10 numbers ordered from x₁, the smallest, to x₁₀, the largest, the 10% winsorized mean is

\[ \frac{\overbrace{x_{2}+x_{2}} + x_{3}+x_{4}+x_{5}+x_{6}+x_{7}+x_{8} + \overbrace{x_{9}+x_{9}}}{10}. \]

The key is the repetition of x₂ and x₉: the extra copies substitute for the original values x₁ and x₁₀, which have been discarded and replaced.
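The substitution for a sample of 10 can be written out index by index. The following is a small Python sketch on hypothetical data (the numbers are chosen only for illustration); note that Python lists are 0-indexed, so x₂ is `x[1]` and x₉ is `x[8]`.

```python
# Hypothetical sample of 10 numbers, in arbitrary order.
sample = [92, 19, 101, 58, 1053, 91, 26, 78, 10, 13]

x = sorted(sample)  # x[0] is x1 (smallest), x[9] is x10 (largest)

# Keep x2..x9, then add one extra copy of x2 and one of x9
# in place of the discarded x1 and x10.
winsorized = [x[1]] + x[1:9] + [x[8]]

winsorized_mean_10 = sum(winsorized) / 10

print(winsorized)
print(winsorized_mean_10)
```

Here the extreme values 10 and 1053 are replaced by 13 and 101 respectively before the average is taken.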

This is equivalent to a weighted average of 0.1 times the 5th percentile (x₂), 0.8 times the 10% trimmed mean, and 0.1 times the 95th percentile (x₉).
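The equivalence can be verified by a short derivation. Writing T for the 10% trimmed mean of the same sample (the symbol T is introduced here only for convenience):

\[ T = \frac{x_{2}+x_{3}+\cdots+x_{9}}{8} \]

\[ 0.1\,x_{2} + 0.8\,T + 0.1\,x_{9} = \frac{x_{2}}{10} + \frac{x_{2}+x_{3}+\cdots+x_{9}}{10} + \frac{x_{9}}{10} = \frac{2x_{2}+x_{3}+\cdots+x_{8}+2x_{9}}{10} \]

The right-hand side is the 10% winsorized mean, with x₂ and x₉ each counted twice.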

Notes

[1] Dodge, Y. (2003). The Oxford Dictionary of Statistical Terms. OUP. ISBN 0-19-920613-9 (entry for "winsorized estimation").

References

Wilcox, R.R.; Keselman, H.J. (2003). "Modern robust data analysis methods: Measures of central tendency". Psychological Methods. 8 (3): 254–274. doi:10.1037/1082-989X.8.3.254. PMID 14596490.
