Protect Your Code with Tests! Your First R Package Journey Continues
Following the first post I published about how to write your first R package using modern tools, we continue our adventures with R packages and touch upon the topic of writing tests. Tests are essential to ensure your package is robust and that any modification you make later does not break its key features.
Unit tests are like insurance. You hope you never have to use them, but you're glad you have them when you do.
Martin Fowler
Why Write Tests?
Writing tests is not the most exciting task. Most developers would rather spend time developing new features than new tests. But the more complex your code becomes, the more likely you are to see bugs pop up everywhere. You can't plan for every issue that may arise, and unit testing is the first thing that helps ensure that independent parts of your code (typically, functions) behave as they should, even in corner cases.
How to write tests in the context of an R package?
You've come here for that!
Let's go back to our packagedfun package. It has a single function that does not do much right now:
#' Saying Hello is polite
#'
#' @description
#' `hello_fun` says hello and uses the name of the person as an argument.
#'
#' @importFrom glue glue
#'
#' @details
#' This is a generic function that can be used to say hello.
#'
#' @param name the name of the person you want to send a Hello message to
#' @export
hello_fun <- function(name) {
  print(glue::glue("Hello {name}"))
}
First, we will make this function a little more elaborate, so that it does a few more things:
- use a default name value if the function is run as is, without any parameter
- refuse empty names by throwing an error
- accept multiple names if given in the form of a character vector
#' Saying Hello is polite
#'
#' @description
#' `hello_fun` says hello and uses the name of the person(s) as an argument.
#'
#' @importFrom glue glue
#'
#' @details
#' This is a generic function that can be used to say hello.
#'
#' @param name the name of the person(s) you want to send a Hello message to - you can enter a vector of names. By default, if no name is entered, it takes the value "Person with no name"
#' @export
hello_fun <- function(name = "Person with no name") {
  # A single name: refuse NA or empty strings, otherwise greet
  if (length(name) == 1) {
    if (is.na(name) || name == "") {
      stop("I'd like you to enter a name. not an empty string!")
    } else {
      return(glue::glue("Hello {name}"))
    }
  }
  # Several names: join them with "and" before greeting
  if (length(name) > 1) {
    names_string <- paste(name, collapse = " and ")
    return(glue::glue("Hello {names_string}!"))
  }
}
Now that we can do several things with this function, let's write some tests for it.
Go into the main folder of your package and create a folder called tests, then a folder called testthat inside it.
In there, you can create one or several R files that will contain your tests. We will create a first file called test-hello.R.
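If you prefer to see the layout spelled out, here is a minimal sketch that builds it by hand with base R. It works in a throwaway directory, and `pkg_root` is just a stand-in for your package's main folder. The `tests/testthat.R` file is not mentioned above; it is the conventional entry point that `R CMD check` uses to run testthat tests, so I am assuming you want it too.

```r
# Sketch: build the testthat folder layout by hand in a temporary directory.
# `pkg_root` is a stand-in for the main folder of your package.
pkg_root <- file.path(tempdir(), "packagedfun")

# tests/testthat/ holds the individual test files
test_dir_path <- file.path(pkg_root, "tests", "testthat")
dir.create(test_dir_path, recursive = TRUE, showWarnings = FALSE)

# tests/testthat.R is the conventional entry point run by R CMD check
writeLines(
  c(
    "library(testthat)",
    "library(packagedfun)",
    "",
    "test_check(\"packagedfun\")"
  ),
  file.path(pkg_root, "tests", "testthat.R")
)

# An empty test file, ready to be filled in
file.create(file.path(test_dir_path, "test-hello.R"))

# Show what was created
list.files(pkg_root, recursive = TRUE)
```

Inside a real package, `usethis::use_testthat()` followed by `usethis::use_test("hello")` should give you the same structure without the manual steps.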
Here's the first test we write for our function in test-hello.R:
test_that("Confirm that the function works with John", {
  expect_equal(hello_fun(name = "John"), "Hello John")
})
It's a positive test. We confirm that the function returns "Hello John" when we give "John" as a parameter. Makes sense, right? We use the expect_equal function from the testthat package, which evaluates the function call given as its first argument and compares the result with its second argument to confirm that they are equal.
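To get a feel for expect_equal outside of a package, here is a small sketch you can run in the console. It assumes testthat is installed and deliberately avoids hello_fun so that it stands on its own; the variable names are only for illustration.

```r
library(testthat)

# The first argument is evaluated, then compared with the expected value.
# A passing expectation is silent and returns the evaluated value invisibly.
greeting <- expect_equal(paste("Hello", "John"), "Hello John")

# Numbers are compared with a small tolerance, so floating-point noise
# does not fail the expectation even though 0.1 + 0.2 == 0.3 is FALSE in R.
total <- expect_equal(0.1 + 0.2, 0.3)
```

When you need a strict comparison with no tolerance, testthat also provides expect_identical.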
You can then go to RStudio and select the "Test Package" menu item:
You should see something showing up in the environment pane:
The key info to pick up here is that I am a coding rockstar (sorry: that the test was successful), and that there was only one test, coming from the "hello" test file.
Let's write a few more tests! This is now what I have in my tests/testthat/test-hello.R file:
test_that("Confirm that the function works with John", {
  expect_equal(hello_fun(name = "John"), "Hello John")
})
test_that("Confirm that the function works when no parameter is given", {
  expect_equal(hello_fun(), "Hello Person with no name")
})
test_that("Confirm that the function works when two names are given", {
  expect_equal(hello_fun(name = c("Joan", "Liz")), "Hello Joan and Liz!")
})
test_that("Confirm that the function returns an error when given an empty string", {
  expect_error(hello_fun(name = ""), "I'd like you to enter a name. not an empty string!")
})
The new thing that I introduced here is the expect_error function. It will not look for equality, but instead confirm that the execution was stopped, and that an error message was returned (and checks that it's the right one).
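One detail worth knowing: the second argument of expect_error is treated as a regular expression matched against the error message, not as a literal string. The sketch below illustrates this with a hypothetical stand-in function, `reject_empty`, which only reproduces the error path of hello_fun; it assumes testthat is installed.

```r
library(testthat)

# Hypothetical stand-in reproducing only the error path of hello_fun
reject_empty <- function(name) {
  if (is.na(name) || name == "") {
    stop("I'd like you to enter a name. not an empty string!")
  }
  name
}

# The pattern is a regular expression, so a distinctive fragment is enough
expect_error(reject_empty(""), "not an empty string")

# fixed = TRUE asks for a literal match, so "." only matches a real dot
expect_error(
  reject_empty(""),
  "I'd like you to enter a name. not an empty string!",
  fixed = TRUE
)

# Valid input goes through untouched
expect_equal(reject_empty("John"), "John")
```

Matching on a short fragment makes the test less brittle if you later reword the message, while `fixed = TRUE` is the safer choice when the message contains characters that have a special meaning in regular expressions.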
Let's hit the "Test Package" menu one more time.
Good! We now have 4 tests, and all of them passed. Are we done?
Let me add one more test!
test_that("Confirm that the function works when two names are given but one is empty", {
  expect_equal(hello_fun(name = c("Joan", "")), "Hello Joan")
})
I would expect that the function ignores names that are empty... let's test one more time!
Ouch! Looks like we've hit our first failed test! Looking at my original code, it was never designed to handle this case: it returns "Hello Joan and !" when the second name is an empty string.
This is to show you that writing tests is not just about writing obvious statements like checking that 1+1=2, but rather checking expected behavior that should occur in fairly typical scenarios.
You may not always think about how to deal with empty values, NA values, or other strange things like -Inf or +Inf when dealing with numeric values.
Writing tests is about thinking about what can go wrong. Considering "what if...?" stories. And checking if your code can handle them.
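As an illustration of such "what if...?" stories with numeric values, here is a sketch built around a hypothetical helper, `finite_mean`, that is not part of packagedfun. It assumes testthat is installed, and the choice to throw an error when nothing is left to average is mine, not a rule.

```r
library(testthat)

# Hypothetical helper: the mean of the finite values only
finite_mean <- function(x) {
  x <- x[is.finite(x)]  # drops NA, NaN, -Inf and +Inf
  if (length(x) == 0) {
    stop("No finite values to average")
  }
  mean(x)
}

test_that("finite_mean ignores NA and infinite values", {
  expect_equal(finite_mean(c(1, 2, 3)), 2)
  expect_equal(finite_mean(c(1, NA, 3)), 2)
  expect_equal(finite_mean(c(-Inf, 1, 3, Inf)), 2)
})

test_that("finite_mean refuses input with nothing left to average", {
  expect_error(finite_mean(c(NA, Inf)), "No finite values")
  expect_error(finite_mean(numeric(0)), "No finite values")
})
```

Each expectation encodes a decision about what the function should do with awkward input, which is exactly the kind of thinking the tests force on you.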
In the meantime, let's fix our function!
Whack-a-Mole
Fixing this issue? Just a minute's work, I thought. Here's my fix:
#' Saying Hello is polite
#'
#' @description
#' `hello_fun` says hello and uses the name of the person(s) as an argument.
#'
#' @importFrom glue glue
#'
#' @details
#' This is a generic function that can be used to say hello.
#'
#' @param name the name of the person(s) you want to send a Hello message to - you can enter a vector of names. By default, if no name is entered, it takes the value "Person with no name"
#' @export
hello_fun <- function(name = "Person with no name") {
  # Drop empty strings first...
  is_empty_string_vector <- name == ""
  name <- name[!is_empty_string_vector]
  # ...then drop NA values
  is_na_vector <- is.na(name)
  name <- name[!is_na_vector]
  if (length(name) == 1) {
    if (is.na(name) || name == "") {
      stop("I'd like you to enter a name. not an empty string!")
    } else {
      return(glue::glue("Hello {name}"))
    }
  }
  if (length(name) > 1) {
    names_string <- paste(name, collapse = " and ")
    return(glue::glue("Hello {names_string}!"))
  }
}
The new bits are there to sanitize and remove the empty strings or NA values from the name parameter, if given, and it works no matter how long the vector is!
Let's enjoy the final, successful test, while I go and grab a cup of tea:
The tea will have to wait. What's happening here?
Oh, I fixed one issue, but I created a new one!? Now, when someone enters an empty string as the only argument, the function no longer throws an error: my new additions remove the empty string before the check ever sees it.
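To see why, it can help to look at the clean-up step in isolation. The sketch below uses a hypothetical helper, `drop_blank_names` (not part of packagedfun), that applies the same filtering in a single expression and shows what is left of the input in a few cases.

```r
# Hypothetical helper isolating the clean-up step of hello_fun
drop_blank_names <- function(name) {
  # Keep only the entries that are neither NA nor an empty string
  name[!is.na(name) & name != ""]
}

cleaned_pair  <- drop_blank_names(c("Joan", ""))         # one name left
cleaned_na    <- drop_blank_names(c("Joan", NA, "Liz"))  # two names left
cleaned_empty <- drop_blank_names("")                    # nothing left

# A lone empty string leaves a zero-length vector, which matches neither
# length(name) == 1 nor length(name) > 1 in hello_fun
length(cleaned_empty)
```

With nothing left in the vector, neither branch of the function runs, so no error is raised and no greeting is returned.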
When you end up here, there are two ways to proceed:
- The first would be to fix the function to restore the older behavior. It's as simple as adding a condition that checks whether the vector has length 1 before applying the couple of lines that were mainly meant for longer vectors.
- The other approach is... well, that test we had does not make sense anymore. I ended up with a generalization that's more correct, and I could just as well get rid of that test.
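For reference, the first option could look roughly like the sketch below. It is only an illustration under my own assumptions: `hello_fun_strict` is a hypothetical name, glue is assumed to be installed, and a longer vector that contains nothing but empty values still falls through without a greeting.

```r
# Hypothetical variant restoring the error for a single empty or NA value
hello_fun_strict <- function(name = "Person with no name") {
  # A single value is validated as before, so "" or NA still raises an error
  if (length(name) == 1) {
    if (is.na(name) || name == "") {
      stop("I'd like you to enter a name. not an empty string!")
    }
    return(glue::glue("Hello {name}"))
  }
  # Longer vectors are cleaned up before greeting
  name <- name[!is.na(name) & name != ""]
  if (length(name) == 1) {
    return(glue::glue("Hello {name}"))
  }
  if (length(name) > 1) {
    names_string <- paste(name, collapse = " and ")
    return(glue::glue("Hello {names_string}!"))
  }
}

hello_fun_strict(c("Joan", ""))     # Hello Joan
hello_fun_strict(c("Joan", "Liz"))  # Hello Joan and Liz!
```

With this variant, the test expecting an error for an empty string would pass again, at the cost of treating single values and longer vectors differently.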
My choice here is to take the second approach. And this is where this quote feels just right at home:
"Unit testing is a conversation between the developer and the code."
Martin Fowler (yes, again)
So my final test-hello.R file looks like this:
test_that("Confirm that the function works with John", {
  expect_equal(hello_fun(name = "John"), "Hello John")
})
test_that("Confirm that the function works when no parameter is given", {
  expect_equal(hello_fun(), "Hello Person with no name")
})
test_that("Confirm that the function works when two names are given", {
  expect_equal(hello_fun(name = c("Joan", "Liz")), "Hello Joan and Liz!")
})
test_that("Confirm that the function works when two names are given but one is empty", {
  expect_equal(hello_fun(name = c("Joan", "")), "Hello Joan")
})
And this time, all tests pass:
And apparently, I'm not a rockstar anymore :-(
But I get to enjoy my tea before it gets cold. All's good.
Is that it?
For simple tests, yes. But there are many other tests you can plan, and different ways to write them, too. Next time, I will bring a more complex example on this topic - there's a lot to cover.
Stay tuned!