Data Import

Import statistical data files with preserved missing value information

read_spss()
Read SPSS Data with Tagged Missing Values
read_por()
Read SPSS Portable Data with Tagged Missing Values
read_stata()
Read Stata Data with Tagged Missing Values
read_sas()
Read SAS Data with Tagged Missing Values
read_xpt()
Read SAS Transport File with Tagged Missing Values
read_xlsx()
Read Excel Data with Label Reconstruction

Data Export

Export data to statistical formats with preserved labels and missing values

write_spss()
Export Data to SPSS Format
write_stata()
Export Data to Stata Format
write_xpt()
Export Data to SAS Transport Format
write_xlsx()
Export Data to Excel with Label Support

Label Management

Inspect, modify, and convert labelled survey data

var_label()
Get or Set Variable Labels
val_labels()
Get or Set Value Labels
copy_labels()
Copy Labels from One Data Frame to Another
drop_labels()
Remove Unused Value Labels
to_label()
Convert Labelled Variables to Factors
to_character()
Convert Labelled Variables to Character
to_numeric()
Convert Factors or Labelled Variables to Numeric
to_labelled()
Convert Variables to Labelled Format
set_na()
Declare Values as Missing
unlabel()
Remove All Label Metadata

Data Transformation

Recode, dichotomize, dummy-code, standardize, and center variables

rec()
Recode Variables Using String Syntax
to_dummy()
Create Dummy Variables (One-Hot Encoding)
std()
Standardize Variables (Z-Scores)
center()
Center Variables (Mean Centering)
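
As a rough illustration of what mean centering and z-score standardization compute, the sketch below uses base R only. It does not call std() or center(), whose arguments are not documented on this page, and the vector x is made-up example data.

```r
# Hypothetical example values (not taken from the package's datasets)
x <- c(2, 4, 6, 8, 10)

# Mean centering: subtract the mean so the result averages zero
x_centered <- x - mean(x)

# Z-scores: centered values divided by the sample standard deviation
x_z <- (x - mean(x)) / sd(x)
```

After centering, the values keep their original units; after standardizing, they are expressed in standard deviations from the mean.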

Data Exploration

Search and inspect your data

find_var()
Find Variables by Name or Label

Descriptive Statistics

Summarize and explore your survey data

codebook()
Create a Codebook for Your Data
describe()
Get to Know Your Numeric Data
frequency() fre()
Count How Many People Chose Each Option
crosstab()
Compare Two Categories: See How They Relate

Hypothesis Testing

Compare groups and test for significant differences

t_test()
Test If Two Groups Differ
oneway_anova()
Compare Multiple Groups: Are Their Averages Different?
factorial_anova()
Compare Groups Across Multiple Factors: Factorial ANOVA
ancova()
Analysis of Covariance: ANCOVA
mann_whitney()
Compare Two Groups Without Assuming Normal Data
chi_square() phi() cramers_v() goodman_gamma()
Test If Two Categories Are Related
fisher_test()
Fisher's Exact Test for Small Samples
chisq_gof()
Chi-Square Goodness-of-Fit Test
mcnemar_test()
McNemar's Test for Paired Proportions

Non-Parametric Tests

Distribution-free tests for ordinal and nominal data

kruskal_wallis()
Compare Multiple Groups Without Assuming Normal Data
wilcoxon_test()
Compare Two Related Measurements Without Assuming Normality
friedman_test()
Compare Three or More Related Measurements Without Assuming Normality
binomial_test()
Test Whether a Proportion Matches an Expected Value

Correlation Analysis

Measure relationships between variables

pearson_cor()
Measure How Strongly Variables Are Related
spearman_rho()
Spearman's Rank Correlation Analysis
kendall_tau()
Kendall's Tau Correlation Analysis

Post-Hoc Analysis

Follow-up tests for detailed group comparisons

tukey_test()
Find Which Specific Groups Differ After ANOVA
scheffe_test()
Compare All Groups More Conservatively After ANOVA
levene_test()
Test If Groups Vary Similarly
dunn_test()
Find Which Specific Groups Differ After Kruskal-Wallis
pairwise_wilcoxon()
Find Which Specific Measurements Differ After Friedman Test

Scale Analysis

Factor analysis, reliability, and scale construction

reliability()
Check How Reliably Your Scale Measures a Concept
efa()
Explore the Structure Behind Your Survey Items
pomps()
Transform Scores to Percent of Maximum Possible (POMPS)
row_means()
Compute Row Means Across Items
row_sums()
Compute Row Sums Across Items
row_count()
Count Occurrences of a Value Across Columns
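
To make the row-mean and percent-of-maximum-possible ideas concrete, here is a minimal base R sketch. It assumes a hypothetical three-item scale answered on a 1 to 5 range and rescales against that theoretical range; whether pomps() uses the theoretical or the observed minimum and maximum is not stated on this page, and row_means() and pomps() are not called here.

```r
# Hypothetical item responses on a 1-5 scale (rows = respondents)
items <- data.frame(
  q1 = c(1, 3, 5),
  q2 = c(2, 3, 4),
  q3 = c(1, 4, 5)
)

# Scale score: mean across items for each respondent (base R rowMeans)
scale_score <- rowMeans(items)

# Assumed theoretical bounds of the response scale
scale_min <- 1
scale_max <- 5

# POMP: where the score sits between minimum and maximum, as a percentage
pomp <- (scale_score - scale_min) / (scale_max - scale_min) * 100
```

A POMP score of 0 corresponds to the lowest possible scale score and 100 to the highest, which makes scales with different response ranges comparable.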

Regression Analysis

Linear and logistic regression with SPSS-compatible output

linear_regression()
Run a Linear Regression
logistic_regression()
Run a Logistic Regression

Weighted Statistics

Individual weighted statistics for survey data

w_mean()
Calculate Population-Representative Averages
w_median()
Find the Population-Representative Middle Value
w_sd()
Calculate Population-Representative Standard Deviations
w_var()
Calculate Population-Representative Variance
w_se()
Calculate Population-Representative Standard Errors
w_iqr()
Measure Population-Representative Spread (IQR)
w_range()
Find the Range of Your Data
w_quantile()
Calculate Population-Representative Percentiles
w_modus()
Find the Most Common Value in Your Population
w_skew()
Measure Population-Representative Skewness
w_kurtosis()
Measure Population-Representative Kurtosis
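
The idea behind a weighted mean can be sketched in base R as follows. The values and weights are invented for illustration, and the example uses base R's weighted.mean() rather than w_mean(), whose interface is not shown on this page.

```r
# Hypothetical values and survey weights
x <- c(10, 20, 30)
w <- c(1, 1, 2)

# Weighted mean: sum of weight * value divided by the sum of weights
wm_manual <- sum(w * x) / sum(w)

# Base R's weighted.mean() performs the same calculation
wm_base <- weighted.mean(x, w)
```

Because the third observation carries twice the weight of the others, the weighted mean is pulled above the unweighted mean of 20.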

Datasets

Example datasets for learning and testing

survey_data
Social Survey Data (Synthetic)
longitudinal_data
Longitudinal Study Data (Synthetic)
longitudinal_data_wide
Longitudinal Study Data - Wide Format (Synthetic)