The CPS publishes its data in a numeric format, with a separate
PDF codebook (not machine readable) describing factor values. This function
labels the raw numeric CPS data according to a supplied factor key. Codes
that appear in a given year but are not included in factors will be
recoded as NA.
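The labelling step described above can be illustrated with a minimal base R sketch. The toy data frame, the key, and its column names (code, value) are hypothetical and chosen only for illustration; they are not the actual structure of cpsvote::cps_factors, and this is not the package's implementation.

```r
# Hypothetical raw data: numeric codes, including one (9) missing from the key
raw <- data.frame(SEX = c(1, 2, 2, 9))

# Hypothetical factor key; the real cps_factors may be laid out differently
key <- data.frame(
  new_name = c("SEX", "SEX"),
  code     = c(1, 2),
  value    = c("MALE", "FEMALE")
)

# Look up the key rows for one column, then map codes to labels
sex_key  <- key[key$new_name == "SEX", ]
labelled <- factor(raw$SEX, levels = sex_key$code, labels = sex_key$value)

# Codes absent from the key (here, 9) become NA
labelled
```

Because factor() returns NA for any value not listed in levels, unmatched codes drop out without extra handling, which mirrors the behaviour described for cps_label().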
Usage
cps_label(
data,
factors = cpsvote::cps_factors,
names_col = "new_name",
na_vals = c("-1", "BLANK", "NOT IN UNIVERSE"),
expand_year = TRUE,
rescale_weight = TRUE,
toupper = TRUE
)
Arguments
- data
The raw CPS data that factors should be applied to
- factors
A data frame containing the label codes to be applied
- names_col
Which column of factors contains the column names of data
- na_vals
Which character values should be considered "missing" across the dataset and be set to NA after labelling
- expand_year
Whether to change the two-digit year listed in earlier surveys (94, 96) into a four-digit year (1994, 1996)
- rescale_weight
Whether to rescale the weight, dividing by 10,000. The CPS describes the given weight as having "four implied decimals", so this rescaling adjusts the weight to produce sensible population totals.
- toupper
Whether to convert all factor levels to uppercase
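To make the effect of expand_year, rescale_weight, toupper, and na_vals concrete, here is a rough base R sketch of each transformation on toy vectors. All input values are invented for illustration, and the order of operations (uppercasing before the missing-value check) is an assumption rather than a statement about how cps_label() works internally.

```r
# expand_year: two-digit years from earlier surveys become four-digit years
year  <- c(94, 96, 2016)
year4 <- ifelse(year < 100, year + 1900, year)

# rescale_weight: the raw weight has four implied decimals, so divide by 10,000
# (raw values below are hypothetical)
weight_raw <- c(13280000, 17930000)
weight     <- weight_raw / 10000

# toupper, then na_vals: uppercase the labels and blank out "missing" values
vote <- c("Yes", "No", "Not in Universe", "-1")
vote <- toupper(vote)
vote[vote %in% c("-1", "BLANK", "NOT IN UNIVERSE")] <- NA
vote <- factor(vote)
```

The default na_vals are written in uppercase, so a caller who sets toupper = FALSE may want to supply na_vals in whatever case the factor key uses.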
Examples
cps_label(cps_2016_10k)
#> # A tibble: 10,000 × 17
#> FILE YEAR STATE AGE SEX EDUCATION RACE HISPANIC WEIGHT VRS_VOTE
#> <fct> <int> <fct> <int> <fct> <fct> <fct> <fct> <dbl> <fct>
#> 1 cps_nov2016… 2016 AL 69 FEMA… HIGH SCH… WHIT… NON-HIS… 1328. YES
#> 2 cps_nov2016… 2016 AL 35 MALE BACHELOR… WHIT… NON-HIS… 1793. YES
#> 3 cps_nov2016… 2016 AL 54 FEMA… HIGH SCH… WHIT… NON-HIS… 1757. NO RESP…
#> 4 cps_nov2016… 2016 AL 47 MALE HIGH SCH… WHIT… NON-HIS… 1628. NO
#> 5 cps_nov2016… 2016 AL 60 FEMA… SOME COL… WHIT… NON-HIS… 1396. NO RESP…
#> 6 cps_nov2016… 2016 AL 12 FEMA… NA WHIT… NON-HIS… 1917. NA
#> 7 cps_nov2016… 2016 AL 65 MALE HIGH SCH… WHIT… NON-HIS… 1732. NO
#> 8 cps_nov2016… 2016 AL 43 MALE SOME COL… WHIT… NON-HIS… 2042. YES
#> 9 cps_nov2016… 2016 AL 46 MALE SOME COL… WHIT… HISPANIC 2068. YES
#> 10 cps_nov2016… 2016 AL 47 MALE HIGH SCH… WHIT… NON-HIS… 1694. NO
#> # ℹ 9,990 more rows
#> # ℹ 7 more variables: VRS_REG <fct>, VRS_REG_WHYNOT <fct>,
#> # VRS_VOTE_WHYNOT <fct>, VRS_VOTEMODE_2004toPRESENT <fct>,
#> # VRS_VOTEWHEN_2004toPRESENT <fct>, VRS_REG_METHOD <fct>, VRS_RESIDENCE <fct>