w_iqr() calculates the interquartile range (IQR) using survey weights. The IQR
is the distance between the 25th and 75th percentiles: the width of the range
that contains the middle 50% of your population. Unlike the standard deviation,
the IQR is largely unaffected by outliers, making it a robust measure of spread.
Arguments
- data
Your survey data (a data frame or tibble)
- ...
The numeric variables you want to analyze. You can list multiple variables or use helpers like
starts_with("income")
- weights
Survey weights to make results representative of your population. Without weights, you get the simple sample IQR.
- na.rm
Remove missing values before calculating? (Default: TRUE)
Value
Population-weighted IQR(s) with sample size information, including the weighted IQR, effective sample size (effective N), and the number of valid observations used.
Details
Understanding the Results
- Weighted IQR: The width of the range that covers the middle 50% of the weighted population. A larger IQR means more spread in the central part of the data.
- Effective N: The number of equally weighted observations that would give roughly the same precision as your weighted sample. It falls below N when the weights vary widely.
- N: The actual number of observations used.
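To make the idea of an effective sample size concrete, here is a minimal sketch using Kish's approximation. Whether w_iqr() uses this exact formula is an assumption, and `kish_effective_n` is a hypothetical helper that is not part of the package.

```r
# Kish's approximation: (sum of weights)^2 / sum of squared weights.
# Assumption: the package may compute Effective N differently.
kish_effective_n <- function(w) {
  w <- w[!is.na(w)]          # ignore missing weights
  sum(w)^2 / sum(w^2)
}

# Equal weights: nothing is lost, effective N equals N
kish_effective_n(rep(1, 100))
#> [1] 100

# Unequal weights: effective N drops below the 100 actual observations
kish_effective_n(c(rep(1, 50), rep(3, 50)))
#> [1] 80
```

The more the weights differ from one another, the further effective N falls below N.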
The IQR is especially useful when your data is skewed. For example, with income data, the IQR gives a better sense of "typical spread" than the SD because extreme incomes do not distort it.
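The following unweighted sketch illustrates the point with base R only. The income values are made up for illustration and do not come from survey_data.

```r
# Hypothetical monthly incomes (illustrative values only)
incomes <- c(1800, 2100, 2400, 2600, 2900, 3200, 3500, 3900, 4300)

# The same incomes plus one extreme value
with_outlier <- c(incomes, 250000)

# The IQR shifts only slightly when the extreme value is added
IQR(incomes)
#> [1] 1100
IQR(with_outlier)
#> [1] 1350

# The standard deviation grows from under 1,000 to tens of thousands
sd(incomes)
sd(with_outlier)
```

A single extreme income moves the IQR by a few hundred units, while the SD is dominated by it.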
When to Use This
Use w_iqr() when:
- Your data has outliers or is skewed (e.g., income, response times)
- You want a robust measure of spread that is not influenced by extremes
- You need to describe the spread of the middle 50% of your population
- You need SPSS-compatible weighted IQR values
Formula
\(IQR_w = Q_{3,w} - Q_{1,w}\)
where \(Q_{1,w}\) and \(Q_{3,w}\) are the weighted 25th and 75th
percentiles, calculated using cumulative weights (see w_quantile).
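As a rough illustration of the cumulative-weight idea, the sketch below takes each weighted percentile as the smallest value whose cumulative share of the total weight reaches the target proportion. This is an assumption made for illustration: `weighted_quantile_sketch` and `weighted_iqr_sketch` are hypothetical helpers, and w_quantile() may interpolate or break ties differently, so results need not match the package or SPSS.

```r
# Hypothetical helper, not the package's w_quantile():
# smallest x whose cumulative weight share reaches p (no interpolation).
weighted_quantile_sketch <- function(x, w, p) {
  keep <- !is.na(x) & !is.na(w)      # drop incomplete pairs
  x <- x[keep]
  w <- w[keep]
  ord <- order(x)                    # sort values, carry weights along
  x <- x[ord]
  w <- w[ord]
  cum_share <- cumsum(w) / sum(w)    # cumulative share of total weight
  x[which(cum_share >= p)[1]]
}

# IQR_w = Q3_w - Q1_w
weighted_iqr_sketch <- function(x, w) {
  weighted_quantile_sketch(x, w, 0.75) - weighted_quantile_sketch(x, w, 0.25)
}

x <- c(10, 20, 30, 40, 50)

# Equal weights: Q1 = 20, Q3 = 40
weighted_iqr_sketch(x, rep(1, 5))
#> [1] 20

# Heavy weight on the largest value pulls Q3 up to 50
weighted_iqr_sketch(x, c(1, 1, 1, 1, 4))
#> [1] 30
```

Giving one observation a large weight changes where the cumulative weight crosses 75%, which is why weighted and unweighted IQRs can differ on the same data.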
See also
IQR for the base R IQR function.
w_quantile for arbitrary weighted percentiles.
w_sd for weighted standard deviation (another spread measure).
w_range for the full weighted range.
describe for comprehensive descriptive statistics including IQR.
Other weighted_statistics:
w_kurtosis(),
w_mean(),
w_median(),
w_modus(),
w_quantile(),
w_range(),
w_sd(),
w_se(),
w_skew(),
w_var()
Examples
# Load required packages and data
library(dplyr)
data(survey_data)
# Basic weighted IQR
survey_data %>% w_iqr(age, weights = sampling_weight)
#>
#> Weighted Interquartile Range Statistics
#> ---------------------------------------
#>
#> --- age ---
#> Variable weighted_iqr Effective_N
#> age 25 2468.8
#>
# Multiple variables
survey_data %>% w_iqr(age, income, weights = sampling_weight)
#>
#> Weighted Interquartile Range Statistics
#> ---------------------------------------
#>
#> --- age ---
#> Variable weighted_iqr Effective_N
#> age 25 2468.8
#>
#> --- income ---
#> Variable weighted_iqr Effective_N
#> income 1900 2158.9
#>
# Grouped data
survey_data %>% group_by(region) %>% w_iqr(age, weights = sampling_weight)
#>
#> Weighted Interquartile Range Statistics
#> ---------------------------------------
#>
#> Group: region = East
#> Warning: Unknown or uninitialised column: `Variable`.
#>
#> Group: region = West
#> Warning: Unknown or uninitialised column: `Variable`.
#>
# In summarise context
survey_data %>% summarise(iqr_age = w_iqr(age, weights = sampling_weight))
#> # A tibble: 1 × 1
#> iqr_age
#> <dbl>
#> 1 25
# Unweighted (for comparison)
survey_data %>% w_iqr(age)
#>
#> Interquartile Range Statistics
#> ------------------------------
#>
#> --- age ---
#> Variable iqr N
#> age 24 2500
#>
