if not ifelse or if_else then what else?

June 16, 2019    hablar R dplyr

It is almost impossible to miss the ifelse() and if_else() functions in R. They are the go-to functions when you want to write an if-statement and apply its logic to every element of a vector. The hablar package introduces if_else_(), giving you a third way to do a vectorised if-else in R. Let’s take an example:

library(dplyr)
library(hablar)
library(lubridate)

df <- tibble(name = c("John", "Lisa", "Astrid"),
             birthday = ymd("1976-04-15", "1983-02-03", "1883-10-27"))
name birthday
John 1976-04-15
Lisa 1983-02-03
Astrid 1883-10-27

In the data frame above you can see that Astrid’s birthday falls in the year 1883, which we treat as an incorrect value. The task is to replace it with a missing value.

Why use if_else_ from hablar?

With if_else_() from hablar we can write it as follows.

df %>% 
  mutate(birthday = if_else_(year(birthday) < 1900, NA, birthday))
name birthday
John 1976-04-15
Lisa 1983-02-03
Astrid NA

It solved our problem by setting the faulty birthday to NA (missing value).
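To make the idea concrete, here is a minimal base R sketch of a type-flexible if-else. The helper name if_else_sketch is hypothetical and this is not hablar's actual implementation; it only illustrates how a bare NA can borrow the class of the other branch.

```r
# Hypothetical helper for illustration only, NOT hablar's implementation.
if_else_sketch <- function(condition, true, false) {
  stopifnot(is.logical(condition))
  n <- length(condition)

  # A bare NA is a logical vector that contains nothing but NA.
  is_bare_na <- function(x) is.logical(x) && all(is.na(x))

  # Use the branch that carries real type information as the template.
  template <- if (is_bare_na(true)) false else true

  # Start from an all-NA vector that keeps the template's class.
  out <- rep(template, length.out = n)
  out[seq_len(n)] <- NA

  # Recycle both branches to the length of the condition.
  true  <- rep(true,  length.out = n)
  false <- rep(false, length.out = n)

  # Fill in each branch; positions where condition is NA stay NA.
  pick_true  <- !is.na(condition) & condition
  pick_false <- !is.na(condition) & !condition
  out[pick_true]  <- true[pick_true]
  out[pick_false] <- false[pick_false]
  out
}

birthday <- as.Date(c("1976-04-15", "1983-02-03", "1883-10-27"))
yr <- as.integer(format(birthday, "%Y"))

fixed <- if_else_sketch(yr < 1900, NA, birthday)
class(fixed)  # "Date"
```

Because the output starts as a copy of the typed branch, assigning NA into it keeps the Date class instead of falling back to plain numbers.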

Why not use if_else from dplyr?

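You definitely could, but it forces you to be more specific than necessary. In the dplyr versions current when this was written (before 1.1.0), passing a bare NA as in the hablar example raises an error, because if_else() requires both branches to have the same type. You need to specify which type of NA you want. With if_else() you need to write: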

df %>% 
  mutate(birthday = if_else(year(birthday) < 1900, as.Date(NA), birthday))
name birthday
John 1976-04-15
Lisa 1983-02-03
Astrid NA

It does the same job, but is harder to read.
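If "which type of NA" sounds odd, the following base R sketch (no packages assumed) shows that a bare NA is logical, and that R has a typed missing value for each basic vector type. Variable names here are only for illustration.

```r
# A bare NA is a logical value.
na_classes <- c(
  bare      = class(NA),             # "logical"
  integer   = class(NA_integer_),    # "integer"
  double    = class(NA_real_),       # "numeric"
  character = class(NA_character_),  # "character"
  date      = class(as.Date(NA))     # "Date"
)
print(na_classes)
```

A strict if-else compares these classes between the two branches, which is why the Date column needs as.Date(NA) rather than a bare NA.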

Why not use ifelse() from base R?

You should avoid it because it can change your data type, which makes it unreliable. Additionally, it strips attributes. With ifelse(), the example above returns plain numbers representing the number of days since 1970-01-01 instead of dates. Simply confusing.

df %>% 
  mutate(birthday = ifelse(year(birthday) < 1900, NA, birthday))
name birthday
John 2296
Lisa 4781
Astrid NA
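The same behaviour can be reproduced without any packages. The sketch below uses only base R, shows that ifelse() drops the Date class, and gives two ways to keep or restore it; the variable names are only for illustration.

```r
birthday <- as.Date(c("1976-04-15", "1983-02-03", "1883-10-27"))
yr <- as.integer(format(birthday, "%Y"))

# ifelse() builds its result from the logical test, so the Date class is lost.
res <- ifelse(yr < 1900, NA, birthday)
class(res)  # "numeric"
res         # 2296 4781 NA

# Option 1: convert the day counts back into dates.
restored <- as.Date(res, origin = "1970-01-01")

# Option 2: assign NA into a copy, which never drops the class.
fixed <- birthday
fixed[yr < 1900] <- NA
class(fixed)  # "Date"
```

Option 2 is often the simplest base R alternative when only one branch needs to change.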

Read more

For more examples and information install hablar and type vignette("hablar").


