w_mean() calculates survey-weighted means, so the averages represent your
population rather than just your sample. Weighting ensures that groups who were
over- or under-sampled contribute appropriately to the final average.
Arguments
- data
Your survey data (a data frame or tibble)
- ...
The numeric variables you want to average. You can list multiple variables or use helpers like
starts_with("income")
- weights
Survey weights to make the average representative of your population. Without weights, you get the simple sample average.
- na.rm
Remove missing values before calculating? (Default: TRUE)
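As a rough illustration of what the weights and na.rm arguments control, the sketch below uses base R's weighted.mean() on made-up vectors. It is not w_mean() itself; the values of x and w are hypothetical, and whether w_mean() treats missing values exactly like weighted.mean() is an assumption.

```r
# Hypothetical data: four responses, one of them missing
x <- c(10, 20, NA, 40)
w <- c(1, 2, 1, 1)       # hypothetical survey weights

# Without na.rm the missing value propagates to the result
with_na <- weighted.mean(x, w)                   # NA

# With na.rm = TRUE the missing value and its weight are dropped first:
# (10*1 + 20*2 + 40*1) / (1 + 2 + 1) = 22.5
without_na <- weighted.mean(x, w, na.rm = TRUE)  # 22.5
```

Because the documented default for w_mean() is na.rm = TRUE, the second call is the closer analogue to its default behavior.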
Details
When to Use This
Use w_mean() when your survey uses sampling weights and you need
population-representative averages. Weights correct for:
- Oversampling of certain groups (weights < 1)
- Undersampling of other groups (weights > 1)
- Non-response patterns
- Complex survey designs
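The effect of the corrections listed above can be seen on a toy sample. This is a minimal sketch using base R only; the data frame toy, its columns, and the 50/50 population split are invented for illustration. The effective sample size uses Kish's formula, which is a common definition; that w_mean() computes its Effective_N column this way is an assumption, not something stated here.

```r
# Hypothetical toy sample: the population is 50% group A and 50% group B,
# but the sample contains 3 respondents from A and only 1 from B.
toy <- data.frame(
  group  = c("A", "A", "A", "B"),
  age    = c(30, 32, 34, 60),
  # weight = population share / sample share
  # A is oversampled (weight < 1), B is undersampled (weight > 1)
  weight = c(0.5 / 0.75, 0.5 / 0.75, 0.5 / 0.75, 0.5 / 0.25)
)

# The simple mean is dominated by the oversampled group A
simple_mean <- mean(toy$age)                         # 39

# The weighted mean restores the 50/50 balance between the groups
weighted_mean <- weighted.mean(toy$age, toy$weight)  # 46

# Kish's effective sample size (assumed definition, see above)
effective_n <- sum(toy$weight)^2 / sum(toy$weight^2) # 3
```

The weighted result equals the average of the two group means (32 and 60), which is what a balanced sample would have produced. Unequal weights also reduce the effective sample size below the 4 actual respondents.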
See also
weighted.mean for the base R weighted mean function.
w_sd for weighted standard deviation.
w_median for weighted median.
describe for comprehensive descriptive statistics.
Other weighted_statistics:
w_iqr(),
w_kurtosis(),
w_median(),
w_modus(),
w_quantile(),
w_range(),
w_sd(),
w_se(),
w_skew(),
w_var()
Examples
# Load required packages and data
library(dplyr)
data(survey_data)
# Basic weighted usage
survey_data %>% w_mean(age, weights = sampling_weight)
#>
#> Weighted Mean Statistics
#> ------------------------
#>
#> --- age ---
#> Variable weighted_mean Effective_N
#> age 50.514 2468.8
#>
# Multiple variables
survey_data %>% w_mean(age, income, life_satisfaction, weights = sampling_weight)
#>
#> Weighted Mean Statistics
#> ------------------------
#>
#> --- age ---
#> Variable weighted_mean Effective_N
#> age 50.514 2468.8
#>
#> --- income ---
#> Variable weighted_mean Effective_N
#> income 3743.099 2158.9
#>
#> --- life_satisfaction ---
#> Variable weighted_mean Effective_N
#> life_satisfaction 3.625 2390.9
#>
# Grouped data
survey_data %>% group_by(region) %>% w_mean(age, weights = sampling_weight)
#>
#> Weighted Mean Statistics
#> ------------------------
#>
#> Group: region = East
#> Warning: Unknown or uninitialised column: `Variable`.
#>
#> Group: region = West
#> Warning: Unknown or uninitialised column: `Variable`.
#>
# In summarise context
survey_data %>% summarise(mean_age = w_mean(age, weights = sampling_weight))
#> # A tibble: 1 × 1
#> mean_age
#> <dbl>
#> 1 50.5
# Unweighted (for comparison)
survey_data %>% w_mean(age)
#>
#> Mean Statistics
#> ---------------
#>
#> --- age ---
#> Variable mean N
#> age 50.55 2500
#>
