The response sets for certain CPS questions change between years. This function
consolidates several of these response sets across years (and fixes typos
from the CPS documentation), specifically race, Hispanic status, duration of
residency, reason for not voting, and method of registration. It also
creates a new column, VRS_VOTEMETHOD_CON, which consolidates the different
expressions of vote method used across years (By Mail, Early, and Election
Day) into one variable.
Details
Consolidating response sets across multiple surveys is fraught with peril,
but this function attempts to combine disparate levels for race and other
CPS variables across multiple years. Some of these changes are
straightforward typo fixes ("NON-HIPSANIC" should clearly match
"NON-HISPANIC"), while others involve more subjective judgment. Treat the
output with caution: the function depends on exact variable names that you
may or may not be using, so recode variables as needed for your own
purposes. To see exactly how a variable was recoded, run
table(data$RACE, cps_refactor(data)$RACE) in the console, substituting
your column of interest for RACE.
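
The audit pattern described above can be illustrated on toy data. The following is a minimal base R sketch, not the function's actual implementation: the vector `raw` and its values are hypothetical, and the recoding step only mimics the kind of typo fix mentioned in this section. It shows how cross-tabulating original against recoded values reveals which levels were merged.

```r
# Hypothetical raw responses, including the misspelled level from the text.
raw <- factor(c("NON-HISPANIC", "NON-HIPSANIC", "HISPANIC", "NON-HIPSANIC"))

# Copy the factor and rename the misspelled level; assigning a name that
# already exists merges the two levels into one.
recoded <- raw
levels(recoded)[levels(recoded) == "NON-HIPSANIC"] <- "NON-HISPANIC"

# Cross-tabulate original (rows) against recoded (columns) values.
# Each row shows where an original level ended up after recoding.
xtab <- table(original = raw, recoded = recoded)
print(xtab)
```

Any row with a nonzero count in a column whose name differs from the row name marks a level that was changed, which is the same information the table() call on real data would give for cps_refactor().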
Examples
cps_refactor(cps_label(cps_2016_10k))
#> # A tibble: 10,000 × 18
#> FILE YEAR STATE AGE SEX EDUCATION RACE HISPANIC WEIGHT VRS_VOTE
#> <fct> <int> <fct> <int> <fct> <fct> <fct> <fct> <dbl> <fct>
#> 1 cps_nov2016… 2016 AL 69 FEMA… HIGH SCH… WHITE NON-HIS… 1328. YES
#> 2 cps_nov2016… 2016 AL 35 MALE BACHELOR… WHITE NON-HIS… 1793. YES
#> 3 cps_nov2016… 2016 AL 54 FEMA… HIGH SCH… WHITE NON-HIS… 1757. NO RESP…
#> 4 cps_nov2016… 2016 AL 47 MALE HIGH SCH… WHITE NON-HIS… 1628. NO
#> 5 cps_nov2016… 2016 AL 60 FEMA… SOME COL… WHITE NON-HIS… 1396. NO RESP…
#> 6 cps_nov2016… 2016 AL 12 FEMA… NA WHITE NON-HIS… 1917. NA
#> 7 cps_nov2016… 2016 AL 65 MALE HIGH SCH… WHITE NON-HIS… 1732. NO
#> 8 cps_nov2016… 2016 AL 43 MALE SOME COL… WHITE NON-HIS… 2042. YES
#> 9 cps_nov2016… 2016 AL 46 MALE SOME COL… WHITE HISPANIC 2068. YES
#> 10 cps_nov2016… 2016 AL 47 MALE HIGH SCH… WHITE NON-HIS… 1694. NO
#> # ℹ 9,990 more rows
#> # ℹ 8 more variables: VRS_REG <fct>, VRS_REG_WHYNOT <fct>,
#> # VRS_VOTE_WHYNOT <fct>, VRS_VOTEMODE_2004toPRESENT <fct>,
#> # VRS_VOTEWHEN_2004toPRESENT <fct>, VRS_REG_METHOD <fct>,
#> # VRS_RESIDENCE <fct>, VRS_VOTEMETHOD_CON <fct>