How to change values in a dataset using a pipe (%>%).

  • How to change values in a dataset using a pipe (%>%).

    Posted by Frank on 20 June 2024 at 16:53

    This question came up in Duncan’s intermediate class: what are the alternatives to the operation below?

    hiv_data <- rio::import("on_ggplot_session_1/hivdata.xlsx")
    hiv_data$hiv.gender <- NA # creates an empty column called hiv.gender
    hiv_data$hiv.gender[hiv_data$gender==1] <- "Female" # hiv.gender is updated with "Female" if it is 1
    hiv_data$hiv.gender[hiv_data$gender==2] <- "Male" # hiv.gender is updated with "Male" if it is 2

  • Frank

    Organizer
    20 June 2024 at 16:55

    Here are two suggestions:

    # 1: You can use if_else and mutate to change your values
    # Let's do so by making a new variable called gender_cats
    library(dplyr)                      # provides %>%, mutate, if_else and case_when

    hiv_data <- hiv_data %>%            # assign the result back to keep the new column
      mutate(gender_cats =              # adding a new variable to our dataset
        if_else(gender == 1,            # assigning a condition
                'Female',               # all that match will be assigned the 'Female' value
                'Male'))                # all remaining non-missing values will be assigned 'Male'

    Alternatively,

    # 2: You can use case_when and mutate, the same procedure as above
    hiv_data <- hiv_data %>%
      mutate(
        gender_cats = case_when(        # adding a new variable to our dataset
          gender == 1 ~ 'Female',       # assigning the first condition
          gender == 2 ~ 'Male'          # assigning the second condition
        )
      )

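    Both options give the same result when gender contains only 1 and 2. They differ for other values: if_else labels every non-missing value other than 1 as 'Male', while case_when returns NA for anything that matches no condition. Either way, mutate lets you stay inside the %>% (pipe operator) chain while you transform your data.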
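To see how the two approaches treat unexpected or missing codes, here is a minimal sketch on a made-up tibble. The `toy` data, the extra code 3 and the column names `via_if_else` and `via_case_when` are assumptions for illustration only and are not part of hivdata.xlsx.

```r
library(dplyr)

# Hypothetical toy data: 1 = Female, 2 = Male, 3 = an unexpected code, NA = missing
toy <- tibble(gender = c(1, 2, 3, NA))

toy <- toy %>%
  mutate(
    # if_else: everything that is not 1 (and not missing) becomes "Male"
    via_if_else = if_else(gender == 1, "Female", "Male"),
    # case_when: values that match no condition become NA
    via_case_when = case_when(
      gender == 1 ~ "Female",
      gender == 2 ~ "Male"
    )
  )

print(toy)
```

In this sketch the code 3 ends up as "Male" under if_else but as NA under case_when, so case_when may be the safer choice when the data could contain codes you did not anticipate.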

    • Ibrahim

      Moderator
      21 June 2024 at 01:25

      I also prefer the mutate approach as I can modify many variables at the same time. I do have an additional strategy, though, if the variables are already numeric:

      data |> ## native pipe because I don't have access to the keyboard shortcut
        mutate(
          sex = factor(sex, levels = 1:2, labels = c("Male", "Female")), ## 1:2 == c(1,2)
          education = factor(education, levels = 0:3, labels = c("labels here")) ## replace with one label per level
        )
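As a small illustration of the factor() approach, the sketch below runs it on invented numeric codes. The `coded` tibble and the stray value 9 are assumptions added to show what happens to a code that is not listed in `levels`.

```r
library(dplyr)

# Hypothetical numeric codes; 9 is deliberately not one of the declared levels
coded <- tibble(sex = c(1, 2, 2, 1, 9))

labelled <- coded |>
  mutate(
    # labels are matched to levels by position: 1 -> "Male", 2 -> "Female"
    sex = factor(sex, levels = 1:2, labels = c("Male", "Female"))
  )

print(labelled)
```

Any value outside `levels` becomes NA, so it is worth checking the raw codes with something like count(coded, sex) before relabelling.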

      If the values are written inconsistently, such as female, Male, male, m, M, F, Female, f, I will use

      data |>
        mutate(
          sex = fct_collapse(sex, "Male" = c("m", "male", "M"), "Female" = c("f", "F", "female"))
        )

      ## Male and Female were intentionally left out. However, for good measure, add them in your own code 😊
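Here is a minimal sketch of the fct_collapse approach with "Male" and "Female" added to the lists. It assumes the forcats package is installed; the `messy` tibble is invented for illustration and simply reuses the spellings mentioned above.

```r
library(dplyr)
library(forcats)

# Hypothetical data containing the inconsistent spellings
messy <- tibble(sex = c("female", "Male", "male", "m", "M", "F", "Female", "f"))

clean <- messy |>
  mutate(
    # every listed spelling is merged into the level named on the left
    sex = fct_collapse(
      sex,
      "Male"   = c("m", "male", "M", "Male"),
      "Female" = c("f", "F", "female", "Female")
    )
  )

count(clean, sex)
```

Running count() afterwards is a quick way to confirm that only the two intended levels remain.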
