Reads an SPSS .sav file and preserves user-defined missing values as
tagged NAs instead of converting them to regular NA. This allows you to
distinguish between different types of missing data (e.g., "no answer",
"not applicable", "refused") while still treating them as NA in
standard R operations.
Arguments
- path
Path to an SPSS .sav file.
- tag.na
If TRUE (the default), user-defined missing values are converted to tagged NAs using haven::tagged_na(). If FALSE, the file is read with standard haven::read_sav() behavior (all missing values become regular NA).
- encoding
Character encoding for the file. If NULL, haven's default encoding detection is used.
- verbose
If TRUE, prints a message summarizing how many values were converted.
Value
A tibble with the SPSS data. When tag.na = TRUE:
- User-defined missing values are stored as tagged NAs
- is.na() returns TRUE for these values (standard R behavior)
- The original SPSS missing codes can be recovered via na_frequencies(), untag_na(), or haven::na_tag()
- Each tagged variable has an "na_tag_map" attribute mapping tag characters to original SPSS codes
Details
SPSS allows defining specific values as "user-defined missing values"
(e.g., -9 = "no answer", -8 = "don't know"). When reading .sav files
with haven::read_sav() under its default user_na = FALSE, these are
silently converted to NA, losing the information about why a value is
missing.
read_spss() preserves this information using haven's tagged NA system:
each missing value type gets a unique tag character (a-z, A-Z, 0-9) that
can be inspected with haven::na_tag(). The values still behave as NA
in all standard R operations (mean(), sum(), is.na(), etc.).
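The tagged NA behavior described above can be tried out with haven alone, without an SPSS file. The following is a minimal sketch, not read_spss() itself: the vector x and the meaning given to tags "a" and "b" are made up for illustration, and only haven::tagged_na(), haven::na_tag(), and haven::is_tagged_na() are used.

```r
# Minimal sketch using haven directly; read_spss() is not called here.
library(haven)

# Hypothetical responses with two distinct missing types:
# tag "a" standing in for "no answer", tag "b" for "refused".
x <- c(4, 2, tagged_na("a"), 5, tagged_na("b"))

# Tagged NAs are ordinary missing values to base R.
is.na(x)
mean(x, na.rm = TRUE)   # computed from 4, 2, and 5 only

# The tag survives and can be inspected per element
# (non-missing values have no tag and yield NA).
na_tag(x)

# Select one missing type at a time.
is_tagged_na(x, "a")
```

Because the tag is stored inside the NA itself, nothing in downstream code has to change unless it wants to tell the missing types apart.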
Use the companion functions to work with the tagged NAs:
- na_frequencies() - Frequency table of missing types
- untag_na() - Convert tagged NAs back to original SPSS codes
- strip_tags() - Convert tagged NAs to regular NAs (drop tags)
See also
na_frequencies(), untag_na(), strip_tags(), haven::read_sav(),
frequency(), read_por()
Other data-import:
na_frequencies(),
read_por(),
read_sas(),
read_stata(),
read_xlsx(),
read_xpt(),
strip_tags(),
untag_na()
Examples
if (FALSE) { # \dontrun{
# Read SPSS file with tagged missing values
data <- read_spss("survey.sav")
# Check what types of missing values exist
na_frequencies(data$satisfaction)
# Standard R operations work normally (NAs are excluded)
mean(data$satisfaction, na.rm = TRUE)
# frequency() shows each missing type separately
data %>% frequency(satisfaction)
# Recover original SPSS codes
original_codes <- untag_na(data$satisfaction)
# Convert to regular NAs (standard behavior)
data_clean <- strip_tags(data$satisfaction)
} # }
