Felix Schönbrodt

PD Dr. Dipl.-Psych.

July 2012

Validating email addresses in R

I am currently programming an automated report generator in R: participants fill out a questionnaire and receive a nicely formatted PDF with their personality profile. I use knitr, LaTeX, and the sendmailR package.

Some participants did not provide valid email addresses, which caused the sendmail function to crash. So I wanted to validate the addresses before sending anything. Here’s the function:

# Returns TRUE for each element of x that contains something that looks like an email address
isValidEmail <- function(x) {
    grepl("\\<[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}\\>", as.character(x), ignore.case=TRUE)
}
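
For readers less familiar with regexps, the pattern can be assembled from labeled pieces. This is only a sketch for reading the expression; the variable names are illustrative labels and do not come from the original source of the regexp.

```r
# The same pattern, assembled from labeled pieces (the names are only illustrative)
word_start <- "\\<"                # start of a word
local_part <- "[A-Z0-9._%+-]+"     # one or more allowed characters before the @
domain     <- "[A-Z0-9.-]+"        # one or more allowed characters after the @
tld        <- "\\.[A-Z]{2,}"       # a literal dot followed by at least two letters
word_end   <- "\\>"                # end of a word

pattern <- paste0(word_start, local_part, "@", domain, tld, word_end)

# Check that the pieces reproduce the pattern string used in isValidEmail()
identical(pattern, "\\<[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}\\>")
```

The character classes only list upper-case letters; lower-case letters are accepted because grepl is called with ignore.case=TRUE.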

Let’s test some valid and invalid addresses:

# Valid addresses (each call returns TRUE)
isValidEmail("felix@nicebread.de")
isValidEmail("felix.123.honeyBunny@nicebread.lmu.de")
isValidEmail("felix@nicebread.de  ")
isValidEmail("    felix@nicebread.de")
isValidEmail("felix+batman@nicebread.de")
isValidEmail("felix@nicebread.office")

# Invalid addresses (each call returns FALSE)
isValidEmail("felix@nicebread")
isValidEmail("felix@nicebread@de")
isValidEmail("felixnicebread.de")
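
Because grepl is vectorized, the function can also check a whole column of addresses at once, for example to skip invalid entries before any mail is sent. A minimal sketch, assuming a made-up data frame `participants` (the column names and entries are hypothetical):

```r
isValidEmail <- function(x) {
    grepl("\\<[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}\\>", as.character(x), ignore.case=TRUE)
}

# Hypothetical participant data, for illustration only
participants <- data.frame(
    id    = 1:4,
    email = c("felix@nicebread.de", "felix@nicebread", "felixnicebread.de", NA),
    stringsAsFactors = FALSE
)

# grepl() returns one logical value per element; missing addresses count as invalid
participants$valid <- isValidEmail(participants$email)

# Only these rows would be passed on to the mail-sending step
to_send <- participants[participants$valid, ]
```

Here only the first row ends up in `to_send`; the other rows can be logged or followed up by hand instead of crashing the report run.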

The regexp is taken from www.regular-expressions.info and adapted to R’s regexp syntax. Please note the many discussions of the question “Is there a single regexp that matches all valid email addresses?” (the answer is no).
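
One limitation worth knowing: grepl searches for the pattern anywhere in the string, so a string that merely contains something address-like passes the check. If that is too lenient, a stricter variant could trim whitespace and anchor the pattern to the whole string. The following is only a sketch; `isValidEmailStrict` is a hypothetical name, not part of the function above, and it is still far from a complete validator.

```r
# Hypothetical stricter variant: trim whitespace, then require the whole string to match
isValidEmailStrict <- function(x) {
    x <- trimws(as.character(x))
    grepl("^[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}$", x, ignore.case=TRUE)
}

isValidEmailStrict("  felix@nicebread.de  ")     # TRUE: surrounding whitespace is trimmed first
isValidEmailStrict("felix@nicebread@foo.de")     # FALSE: a second @ cannot match anywhere in the anchored pattern
isValidEmailStrict("see felix@nicebread.de")     # FALSE: extra text before the address
```

If a variant like this is used, the address handed to the mail function should be trimmed in the same way, since the check passes only for the trimmed string.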
