Use R packages efficiently

I would like to use this opportunity to write about some useful patterns to organize your R code. These may be well known to advanced R users, but are left out of most tutorials. So, sit back, relax, and enjoy some useful simple tips for R!

1. Loading R packages

The most common way of loading packages in R is through the library() command. One disadvantage of this option is that an error is thrown if the package is not installed. An alternative function is require(), which only emits a warning and returns FALSE if the package is not found, or loads the package and returns TRUE if it is found.
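The difference between the two functions can be seen in a minimal sketch. The package name "notARealPackage" is made up for illustration and is assumed not to be installed.

```r
# "notARealPackage" is a hypothetical name, assumed NOT to be installed.
missing_pkg <- "notARealPackage"

# library() signals an error for a missing package; tryCatch() captures it.
lib_result <- tryCatch(
  {
    library(missing_pkg, character.only = TRUE)
    "loaded"
  },
  error = function(e) "error"
)

# require() returns FALSE instead of stopping (its warning is silenced here).
req_result <- suppressWarnings(
  require(missing_pkg, character.only = TRUE, quietly = TRUE)
)

print(lib_result)
print(req_result)
```

Because require() returns a logical value, it is convenient inside an if() statement, while library() is the safer choice when the script cannot continue without the package.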

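If you want to see all packages installed on your system, you can use installed.packages(), or a command similar to "testthat" %in% rownames(installed.packages()) to check for a specific package (in this case, testthat). Matching against the row names restricts the check to package names instead of every cell of the returned matrix.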
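As a small sketch of both checks, the snippet below looks for stats (which ships with R) and for a made-up package name. requireNamespace() is shown as an alternative that tests availability without attaching the package; "notARealPackage" is a hypothetical name assumed not to be installed.

```r
# Names of all installed packages (row names of the returned matrix).
pkgs <- rownames(installed.packages())

# "stats" is distributed with R, so it should always be present.
has_stats <- "stats" %in% pkgs

# requireNamespace() returns TRUE or FALSE without attaching the package.
# "notARealPackage" is a hypothetical name, assumed NOT to be installed.
has_fake <- requireNamespace("notARealPackage", quietly = TRUE)

print(has_stats)
print(has_fake)
```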

2. Basic folder structure

The contents of an R package usually live in a structured folder containing R files along with documentation and other files. The simplest structure of a working R package should be similar to the following.

├── DESCRIPTION
├── NAMESPACE
└── R
    ├── file1.R
    └── file2.R

DESCRIPTION is a file containing the metadata of the package. It requires, at least, two fields: the name of the package and the version. NAMESPACE is a file that roxygen2 can generate automatically, in which case it should not be edited by hand. That being said, the following mock should work with any package to export every object defined in the R folder whose name does not start with a dot.

# Generated by roxygen2: fake comment so roxygen2 overwrites silently.
exportPattern("^[^\\.]")

Lastly, R is the folder containing the files with all R definitions and variables.
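The layout above can be created from R itself. The following is a minimal sketch that writes the three pieces into a temporary directory; the package name minipkg, the version 0.0.1, and the function hello() are all made up for illustration.

```r
# Hypothetical package "minipkg", created under a temporary directory.
pkg_dir <- file.path(tempdir(), "minipkg")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)

# DESCRIPTION with the two fields mentioned above.
writeLines(
  c("Package: minipkg", "Version: 0.0.1"),
  file.path(pkg_dir, "DESCRIPTION")
)

# NAMESPACE mock: export everything that does not start with a dot.
writeLines(
  c(
    "# Generated by roxygen2: fake comment so roxygen2 overwrites silently.",
    'exportPattern("^[^\\\\.]")'
  ),
  file.path(pkg_dir, "NAMESPACE")
)

# One R file holding a single (hypothetical) function definition.
writeLines(
  'hello <- function() "Hello from minipkg"',
  file.path(pkg_dir, "R", "file1.R")
)

# List what was created.
created <- list.files(pkg_dir, recursive = TRUE)
print(created)
```

The DESCRIPTION file is in Debian control format, so read.dcf(file.path(pkg_dir, "DESCRIPTION")) can be used to read its fields back.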

3. Accessing exported variables in a namespace

Attaching a whole package can take a while, not to mention that it may mask objects with the same names that are already on the search path. So, if you want to use just one function from a package, you can use the double colon operator, ::. For example, in order to use the function test_file() from the package testthat, one can simply type testthat::test_file().
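Here is a short sketch of the :: operator in action. It uses stats and tools, which ship with R, instead of testthat so that it runs without installing anything; the input values are made up.

```r
# Call one function from a package without calling library() on it.
middle <- stats::median(c(1, 5, 9))

# tools is not attached by default, yet its exported functions are reachable.
extension <- tools::file_ext("report.csv")

print(middle)     # 5
print(extension)  # "csv"
```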

4. Accessing internal variables in a namespace

The triple colon operator, :::, is, simply put, :: on steroids! This operator allows you to use variables defined in a package that are not exported to the user environment.

As an example, run the command testthat:::praise() and you will find a pleasant surprise!
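To find out which objects are only reachable through :::, one option is to compare everything defined in a namespace with its list of exports. The sketch below does this for stats, chosen only because it ships with R; print.lm is used as an example of an S3 method that lives in the namespace.

```r
# Names that the package makes available through `::`.
exported <- getNamespaceExports("stats")

# Every object defined inside the namespace, exported or not.
all_objects <- ls(asNamespace("stats"), all.names = TRUE)

# What remains is internal and needs `:::`.
internal <- setdiff(all_objects, exported)
print(head(internal))

# `:::` looks the object up directly in the namespace.
print_lm <- stats:::print.lm
print(is.function(print_lm))
```

Internal objects are not part of a package's public interface, so they may change or disappear between versions; relying on ::: is best kept to exploration and debugging.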

5. devtools is out there! Use it!

Installing the devtools package on your R environment is often not straightforward due to its many dependencies and C binaries, but it will make developing R packages much easier. So, be patient through the extra debugging required. It will pay off in the future!

Some of the most useful functions to develop R packages are: document(), install(), load_all() and test().

document(): generates all documentation, including man pages, from the comments starting with #', and rewrites the NAMESPACE file. Once NAMESPACE is rewritten, only the definitions marked with @export are exported every time that the package is loaded.
install(): installs the R package corresponding to the working directory or to the path provided, in the local R environment.
load_all(): loads all functions and variables defined in the R package corresponding to the working directory or to the path provided, without requiring installation. This is especially useful in combination with RStudio to create breakpoints and inspect the behavior of the functions written.
test(): automatically runs all the unit tests, located in tests, in the R package.
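The four functions above could be chained into a small development loop. The following is a sketch under stated assumptions: add_two() and dev_cycle() are hypothetical helpers invented for this example, and the devtools calls only run when devtools is installed and the given path contains a DESCRIPTION file.

```r
#' Add two numbers
#'
#' Hypothetical function that would live in a file under R/. The #' comments
#' are what document() turns into a man page, and @export is what places
#' the function in NAMESPACE.
#'
#' @param x,y Numeric values.
#' @return The sum of x and y.
#' @export
add_two <- function(x, y) x + y

# Hypothetical helper: run one document/load/test/install cycle.
dev_cycle <- function(pkg_path = ".") {
  ready <- requireNamespace("devtools", quietly = TRUE) &&
    file.exists(file.path(pkg_path, "DESCRIPTION"))
  if (!ready) {
    message("devtools or DESCRIPTION not found; nothing to do.")
    return(invisible(FALSE))
  }
  devtools::document(pkg_path)  # rebuild man pages and NAMESPACE
  devtools::load_all(pkg_path)  # load the code without installing
  devtools::test(pkg_path)      # run the unit tests under tests/
  devtools::install(pkg_path)   # install into the local library
  invisible(TRUE)
}
```

During day-to-day work, load_all() and test() are usually enough; document() is needed after changing the #' comments and install() once the package is ready to be used from other projects.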

References

All these ideas were gathered while exploring the help pages or source code of some R packages. The following pages have been especially helpful to me.

Conclusion

I hope you find this post useful and do not forget to share your favorite R tricks, too!
