frequency() helps you understand categorical data by showing how many people
chose each option. It's perfect for survey questions with fixed choices like
education level, yes/no questions, or rating scales.
Think of it as creating a summary table that shows:
- How many people chose each option
- What percentage that represents
- Running totals to see cumulative patterns
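The three ingredients listed above can be sketched in base R. This is only an illustration of the idea, not how frequency() is implemented; the `answers` vector and the `summary_table` column names are made up for the example.

```r
# Hypothetical responses to a fixed-choice question (illustrative data only)
answers <- c("Yes", "No", "Yes", "Yes", "No", "Undecided", "Yes", "No")

counts  <- table(answers)                         # how many people chose each option
percent <- 100 * as.vector(counts) / sum(counts)  # share of all responses

summary_table <- data.frame(
  value   = names(counts),
  n       = as.vector(counts),
  percent = percent,
  cum_pct = cumsum(percent)                       # running total of the percentages
)
print(summary_table)
```

The last row of `cum_pct` always reaches 100, which is a quick sanity check on any frequency table.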
Usage
frequency(
data,
...,
weights = NULL,
sort.frq = "none",
show.na = TRUE,
show.prc = TRUE,
show.valid = TRUE,
show.sum = TRUE,
show.labels = "auto",
show.unused = FALSE
)
fre(data, ..., weights = NULL, sort.frq = "none", show.na = TRUE,
show.prc = TRUE, show.valid = TRUE, show.sum = TRUE, show.labels = "auto",
show.unused = FALSE)
Arguments
- data
Your survey data (a data frame or tibble)
- ...
The categorical variables you want to analyze. You can list multiple variables separated by commas, or use helpers like starts_with("trust")
- weights
Optional survey weights for population-representative results. Without weights, you get sample frequencies. With weights, you get population estimates.
- sort.frq
How to order the results:
"none" (default): Keep original order
"asc": Sort from lowest to highest frequency
"desc": Sort from highest to lowest frequency
- show.na
Include missing values in the table? (Default: TRUE)
- show.prc
Show raw percentages including missing values? (Default: TRUE)
- show.valid
Show percentages excluding missing values? (Default: TRUE)
- show.sum
Show cumulative totals? (Default: TRUE)
- show.labels
Show category labels if available? (Default: "auto" - shows labels when they exist)
- show.unused
Show all defined value labels, even those with zero observations? (Default: FALSE). When TRUE, values that have labels defined (e.g., from statistical software files) but no cases in the data are included with frequency 0. This is useful for labelled datasets where unused categories should still appear in the output. Automatically enables label display.
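The idea behind show.unused has a close analogue in base R factors, sketched below. This is an analogy under the assumption that a defined-but-unobserved label behaves like an unused factor level; it does not call frequency(), and the `rating` data is hypothetical.

```r
# Hypothetical rating item: "Neutral" is a defined category nobody selected
rating <- factor(
  c("Satisfied", "Dissatisfied", "Satisfied"),
  levels = c("Dissatisfied", "Neutral", "Satisfied")
)

with_unused    <- table(rating)              # keeps "Neutral" with a count of 0
without_unused <- table(droplevels(rating))  # drops categories with no cases

print(with_unused)
print(without_unused)
```

Keeping the zero-count row is useful when tables from several groups or survey waves need identical rows for comparison.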
Details
Understanding the Results
The frequency table shows these columns, named as they appear in the printed output:
- N: Number of responses in each category
- Raw %: Percentage including missing values (use for "response rate")
- Valid %: Percentage excluding missing values (use for "among those who answered")
- Cum. %: Running total percentage (helps identify cutoff points)
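The difference between the raw and valid percentages comes down to the denominator. A minimal base R sketch with hypothetical data (not output from frequency(), and with the cumulative column computed on valid answers as an assumption):

```r
# Hypothetical yes/no item with two missing answers
x <- c("Yes", "No", "Yes", NA, "Yes", NA, "No", "Yes", "No", "Yes")

counts_all   <- table(x, useNA = "ifany")  # includes a separate NA category
counts_valid <- table(x)                   # table() drops NA by default

raw_pct   <- 100 * counts_all / length(x)            # denominator: all 10 cases
valid_pct <- 100 * counts_valid / sum(counts_valid)  # denominator: the 8 who answered
cum_pct   <- cumsum(valid_pct)                       # running total of valid shares

round(raw_pct, 1)
round(valid_pct, 1)
round(cum_pct, 1)
```

Here "Yes" is 50% of all cases but 62.5% of those who answered, so the two columns only agree when nothing is missing.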
When to Use This
Use frequency() when you have:
- Categorical variables (gender, region, education level)
- Yes/No questions
- Rating scales (satisfied/neutral/dissatisfied)
- Any question with a fixed set of options
Weights Make a Difference
Without weights, you're describing your sample. With weights, you're estimating population values. Always use weights for population inference.
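To see why weights change the picture, the sketch below compares respondent counts with summed weights in base R. The data frame `d` and its weight column `w` are invented for illustration, and summing weights per category is a common approach assumed here rather than a description of frequency() internals.

```r
# Hypothetical sample with made-up weights (not taken from survey_data)
d <- data.frame(
  gender = c("Male", "Female", "Female", "Male", "Female"),
  w      = c(1.2, 0.8, 1.0, 1.5, 0.5)
)

unweighted <- table(d$gender)             # counts respondents in the sample
weighted   <- tapply(d$w, d$gender, sum)  # sums the weights within each category

round(100 * unweighted / sum(unweighted), 1)  # sample shares
round(100 * weighted / sum(weighted), 1)      # weighted shares
```

In this toy sample women are 60% of respondents but 46% after weighting, because their weights are smaller on average.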
Tagged Missing Values
When data is imported with tagged NAs (e.g., via read_spss() with
tag.na = TRUE, or read_stata(), read_sas(), read_xpt() with the
tag.na parameter), frequency() automatically expands the missing value
section to show each missing type individually (with its original missing
value code and label), plus summary rows for Total Valid and Total
Missing.
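For readers unfamiliar with tagged NAs, the sketch below builds some by hand using the haven package. It assumes haven is installed, uses hypothetical tags "a" and "b", and only shows what a tagged NA is; it does not reproduce the expanded missing section that frequency() prints.

```r
library(haven)  # assumption: haven is installed; provides tagged_na() and na_tag()

# Hypothetical numeric item with two kinds of missing answer plus a plain NA
x <- c(1, 2, 1, tagged_na("a"), tagged_na("b"), tagged_na("a"), NA)

is.na(x)            # every tagged value still counts as missing
tags <- na_tag(x)   # "a" or "b" for tagged NAs, NA for everything else

# Count each missing type separately; the plain NA shows up as <NA>
table(tags[is.na(x)], useNA = "ifany")
```

Because tagged values remain NA for ordinary functions, valid percentages are unaffected; only the breakdown of the missing section changes.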
See also
table for base R frequency tables.
crosstab for cross-tabulation of two variables.
chi_square for testing relationships between categories.
describe for numeric variable summaries.
Other descriptive: crosstab(), describe()
Examples
# Load required packages and data
library(dplyr)
data(survey_data)
# Basic categorical analysis
survey_data %>% frequency(gender)
#>
#> Frequency Analysis Results
#> --------------------------
#>
#> gender (Gender)
#> # total N=2500 valid N=2500 mean=NA sd=NA skewness=NA
#>
#> +--------+--------+--------+--------+--------+--------+
#> | Value | Label | N | Raw % |Valid % | Cum. % |
#> +--------+--------+--------+--------+--------+--------+
#> | Male | Male | 1194 | 47.76 | 47.76 | 47.76 |
#> | Female | Female | 1306 | 52.24 | 52.24 | 100.00 |
#> +--------+--------+--------+--------+--------+--------+
#> | Total | | 2500 | 100.00 | 100.00 | |
#> +--------+--------+--------+--------+--------+--------+
#>
# Multiple variables with weights
survey_data %>% frequency(gender, region, weights = sampling_weight)
#>
#> Weighted Frequency Analysis Results
#> -----------------------------------
#>
#> gender (Gender)
#> # total N=2516 valid N=2516 mean=NA sd=NA skewness=NA
#>
#> +--------+--------+--------+--------+--------+--------+
#> | Value | Label | N | Raw % |Valid % | Cum. % |
#> +--------+--------+--------+--------+--------+--------+
#> | Male | Male | 1195 | 47.48 | 47.48 | 47.48 |
#> | Female | Female | 1321 | 52.52 | 52.52 | 100.00 |
#> +--------+--------+--------+--------+--------+--------+
#> | Total | | 2516 | 100.00 | 100.00 | |
#> +--------+--------+--------+--------+--------+--------+
#>
#>
#> region (Region (East/West))
#> # total N=2516 valid N=2516 mean=NA sd=NA skewness=NA
#>
#> +--------+--------+--------+--------+--------+--------+
#> | Value | Label | N | Raw % |Valid % | Cum. % |
#> +--------+--------+--------+--------+--------+--------+
#> | East | East | 509 | 20.23 | 20.23 | 20.23 |
#> | West | West | 2007 | 79.77 | 79.77 | 100.00 |
#> +--------+--------+--------+--------+--------+--------+
#> | Total | | 2516 | 100.00 | 100.00 | |
#> +--------+--------+--------+--------+--------+--------+
#>
# Grouped analysis by region
survey_data %>%
  group_by(region) %>%
  frequency(gender, weights = sampling_weight)
#>
#> Weighted Frequency Analysis Results
#> -----------------------------------
#>
#> gender (Gender)
#>
#> Group: region = East
#> --------------------
#> # total N=509 valid N=509 mean=NA sd=NA skewness=NA
#>
#> +--------+--------+--------+--------+--------+--------+
#> | Value | Label | N | Raw % |Valid % | Cum. % |
#> +--------+--------+--------+--------+--------+--------+
#> | Male | Male | 249 | 49.01 | 49.01 | 49.01 |
#> | Female | Female | 260 | 50.99 | 50.99 | 100.00 |
#> +--------+--------+--------+--------+--------+--------+
#> | Total | | 509 | 100.00 | 100.00 | |
#> +--------+--------+--------+--------+--------+--------+
#>
#>
#> Group: region = West
#> --------------------
#> # total N=2007 valid N=2007 mean=NA sd=NA skewness=NA
#>
#> +--------+--------+--------+--------+--------+--------+
#> | Value | Label | N | Raw % |Valid % | Cum. % |
#> +--------+--------+--------+--------+--------+--------+
#> | Male | Male | 945 | 47.09 | 47.09 | 47.09 |
#> | Female | Female | 1062 | 52.91 | 52.91 | 100.00 |
#> +--------+--------+--------+--------+--------+--------+
#> | Total | | 2007 | 100.00 | 100.00 | |
#> +--------+--------+--------+--------+--------+--------+
#>
# Education levels with sorting
survey_data %>% frequency(education, sort.frq = "desc")
#>
#> Frequency Analysis Results
#> --------------------------
#>
#> education (Highest educational attainment)
#> # total N=2500 valid N=2500 mean=NA sd=NA skewness=NA
#>
#> +------------------------+------------------------+--------+--------+--------+--------+
#> | Value | Label | N | Raw % |Valid % | Cum. % |
#> +------------------------+------------------------+--------+--------+--------+--------+
#> | University | University | 399 | 15.96 | 15.96 | 100.00 |
#> | Intermediate Secondary | Intermediate Secondary | 629 | 25.16 | 25.16 | 58.80 |
#> | Basic Secondary | Basic Secondary | 841 | 33.64 | 33.64 | 33.64 |
#> | Academic Secondary | Academic Secondary | 631 | 25.24 | 25.24 | 84.04 |
#> +------------------------+------------------------+--------+--------+--------+--------+
#> | Total | | 2500 | 100.00 | 100.00 | |
#> +------------------------+------------------------+--------+--------+--------+--------+
#>
# Employment status with custom display options
survey_data %>%
  frequency(employment, weights = sampling_weight,
            show.na = TRUE, show.sum = TRUE)
#>
#> Weighted Frequency Analysis Results
#> -----------------------------------
#>
#> employment (Employment status)
#> # total N=2516 valid N=2516 mean=NA sd=NA skewness=NA
#>
#> +------------+------------+--------+--------+--------+--------+
#> | Value | Label | N | Raw % |Valid % | Cum. % |
#> +------------+------------+--------+--------+--------+--------+
#> | Student | Student | 80 | 3.18 | 3.18 | 3.18 |
#> | Employed | Employed | 1603 | 63.71 | 63.71 | 66.89 |
#> | Unemployed | Unemployed | 184 | 7.32 | 7.32 | 74.21 |
#> | Retired | Retired | 534 | 21.21 | 21.21 | 95.41 |
#> | Other | Other | 115 | 4.59 | 4.59 | 100.00 |
#> +------------+------------+--------+--------+--------+--------+
#> | Total | | 2516 | 100.00 | 100.00 | |
#> +------------+------------+--------+--------+--------+--------+
#>
