It is almost impossible to miss the ifelse()
and if_else()
functions in R. It’s the goto functions when you want to write an if-statement and apply the logic to every element of that vector. The hablar
package introduces if_else_()
, giving you a third way to do a vectorised if-else in R. Let’s take an example:
library(dplyr)
library(hablar)
library(lubridate)
df <- tibble(name = c("John", "Lisa", "Astrid"),
birthday = ymd("1976-04-15", "1983-02-03", "1883-10-27"))
name | birthday |
---|---|
John | 1976-04-15 |
Lisa | 1983-02-03 |
Astrid | 1883-10-27 |
In the data frame above you can see that the birthday of Astrid is in the year 1883, which we argue is an incorrect birthday. The task is to change it to missing.
if_else_
from hablar?With if_else_()
from hablar
we can write it as follows.
df %>%
mutate(birthday = if_else_(year(birthday) < 1900, NA, birthday))
name | birthday |
---|---|
John | 1976-04-15 |
Lisa | 1983-02-03 |
Astrid | NA |
It solved our problem by setting the fauly birthday to NA
(missing value).
if_else
from dplyr?You definetly could, but it forces you to be more specific than necessary. The above example would raise an error. You need to specify which type of NA you want. In if_else
you need to write:
df %>%
mutate(birthday = if_else(year(birthday) < 1900, as.Date(NA), birthday))
name | birthday |
---|---|
John | 1976-04-15 |
Lisa | 1983-02-03 |
Astrid | NA |
It does the same job, but is harder to read.
ifelse()
from base R?You should avoid it since it changes your data type, which makes it unreliable. Additionally, it removes attributes. The above cases with ifelse()
returns integers which represent the amount of days passed since 1970-01-01. Simply confusing.
df %>%
mutate(birthday = ifelse(year(birthday) < 1900, NA, birthday))
name | birthday |
---|---|
John | 2296 |
Lisa | 4781 |
Astrid | NA |
For more examples and information install hablar
and type vignette("hablar")
.