Replaces specific numeric values with NA (or tagged NAs) across one or
more variables. This is essential for data cleaning workflows where missing
value codes (e.g., -9, -8, 99) are stored as regular values and need to be
declared as missing after import.
Arguments
- data
A data frame, tibble, or a single vector.
- ...
Values to set as missing. Can be:
Unnamed numeric values: Applied to all numeric columns (e.g., set_na(data, -9, -8))
Named pairs: Applied to specific variables (e.g., set_na(data, income = c(-9, -8), age = -1))
- tag
If TRUE (default), uses tagged NAs to preserve distinct missing types. The resulting tagged NAs integrate with na_frequencies(), frequency(), and codebook(). If FALSE, replaces with regular NA.
- verbose
If TRUE, prints a summary of conversions.
Details
Tagged vs. Regular NA
When tag = TRUE (default), each missing value code gets a unique tag
character, so you can distinguish between "No answer" (-9) and "Not
applicable" (-8) in downstream analysis. This is the same system used by
read_spss() with tag.na = TRUE.
When tag = FALSE, all specified values become regular NA and the
distinction between different missing types is lost.
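The difference can be illustrated outside of set_na() with the haven package, whose tagged_na(), na_tag(), and is.na() behaviour is used here only as a stand-in. This is a minimal sketch, assuming that set_na() produces haven-style tagged NAs; the tag letters "a" and "b" are chosen for illustration and are not necessarily the ones set_na() assigns.

```r
# Sketch only: tagged vs. regular NA, shown with haven.
# Assumption: set_na() builds on haven-style tagged NAs; this is not its code.
library(haven)

income <- c(2500, -9, 3100, -8, -9)

# Tagged: each missing value code gets its own tag (letters are illustrative)
tagged <- income
tagged[income == -9] <- tagged_na("a")  # e.g. "No answer"
tagged[income == -8] <- tagged_na("b")  # e.g. "Not applicable"

is.na(tagged)   # tagged values still behave as NA in ordinary R code
na_tag(tagged)  # the tags keep the two codes apart

# Regular: both codes collapse into plain NA
regular <- income
regular[income %in% c(-9, -8)] <- NA

na_tag(regular) # no tags, so the two codes can no longer be told apart
```

Because tagged values still satisfy is.na(), functions such as mean(x, na.rm = TRUE) treat them like any other missing value; only tag-aware functions see the difference.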
Interaction with Existing Labels
If a value being set to missing has an existing value label, that label
is preserved as a tagged NA label (when tag = TRUE), making it visible
in frequency() and codebook() output.
See also
na_frequencies() for inspecting missing types,
strip_tags() for converting tagged NAs to regular NA, and
untag_na() for recovering the original codes.
Other labels:
copy_labels(),
drop_labels(),
find_var(),
to_character(),
to_label(),
to_labelled(),
to_numeric(),
unlabel(),
val_labels(),
var_label()
Examples
if (FALSE) { # \dontrun{
# Set -9 and -8 as missing across all numeric variables
data <- set_na(survey_data, -9, -8)
# Set missing for specific variables only
data <- set_na(survey_data,
  income = c(-9, -8, -42),
  life_satisfaction = c(-9, -11)
)
# Use regular NA instead of tagged NA
data <- set_na(survey_data, -9, -8, tag = FALSE)
# Check the result
na_frequencies(data$income)
} # }
