New CRAN requirements for packages with C and C++

  Andy Teucher

The R package landscape is dynamic, with changes in infrastructure common, especially when CRAN makes changes to their policies and requirements. This is particularly true for packages that include low-level compiled code, requiring developers to be nimble in responding to these changes.

The tidyverse team at Posit is in the unique situation where we have a concentration of developers working full-time on creating and maintaining open source packages. This internal community provides the opportunity to collaborate to develop shared practices and discover solutions to problems that arise. When we can, we like to share what we’ve learned so other developers can benefit.

There have been a few recent changes at CRAN for packages containing C and C++ code that developers have had to adapt to, and we would like to share some of our learning:

NOTE regarding SystemRequirements: C++11

Many package authors might have noticed a new NOTE on R-devel when submitting a package to CRAN containing C++ code:

* checking C++ specification ...
  NOTE Specified C++11: please drop specification unless essential

This NOTE is now appearing during R CMD check on R-devel for packages where the DESCRIPTION file has the following:

SystemRequirements: C++11 

Packages that use C++11 would also usually have set CXX_STD=CXX11 in the src/Makevars and src/Makevars.win files (and src/Makevars.ucrt, if present). These specifications tell R to use the C++11 standard when compiling the code.

To understand the NOTE, a bit of history will be helpful (thanks to Winston Chang for writing this up):

  • In R 3.5 and below, on systems with an old compiler, R would default to using the C++98 standard when compiling the code. If a package needed a C++11 compiler, the DESCRIPTION file needed to have SystemRequirements: C++11, and the various src/Makevars* files needed to set CXX_STD=CXX11.
  • In R 3.6.2, R began defaulting to compiling packages with the C++11 standard, as long as the compiler supported C++11 (which was true on most systems).
  • In R 4.0, a compiler supporting C++11 became the minimum requirement, so SystemRequirements: C++11 was no longer necessary.
  • In (the forthcoming) R 4.3, the default C++ standard is C++17 where available. R CMD check now raises a NOTE if anything older than the default is specified in SystemRequirements: or CXX_STD in the various src/Makevars* files. This NOTE will block submission to CRAN. If the standard you specify is necessary for your package, you will likely need to explain why.

How to fix it

  1. Edit the DESCRIPTION file and remove SystemRequirements: C++11.
  2. Edit src/Makevars, src/Makevars.win, and src/Makevars.ucrt and remove CXX_STD=CXX11.

After making these changes, the package should install without trouble on R 3.6 and above. It may not build on R <= 3.5 on systems with very old compilers, though it is likely that the vast majority of users will have a newer version of R and/or have recent enough compilers. If you want to be confident that your package will be installable on R 3.5 and below with old compilers, there are several options; we offer two of the simplest approaches here:

  • You can use a configure script at the top level of the package, and have it add CXX_STD=CXX11 for R 3.5 and below. An example (unmerged) pull request to the readxl package demonstrates this approach. You will also need to add Biarch: true in your DESCRIPTION file. This appears to be the approach preferred by CRAN.
  • For users with R <= 3.5 on a system with an older compiler, package authors can instruct users to edit their ~/.R/Makevars file to include this line: CXX_STD=CXX11.

The tidyverse has a policy of supporting four previous versions of R. Currently that includes R 3.5, but with the upcoming release of R 4.3 (expected sometime this spring) the minimum version we will support is R 3.6. As we won’t be supporting R 3.5 in the near future, you should not feel pressured to either.

WARNING regarding the use of sprintf() in C/C++

Another recent change in CRAN checks on R-devel that authors might encounter is that the C functions sprintf() and vsprintf() are now disallowed. R CMD check on R-devel may throw warnings that look something like this:

checking compiled code ... WARNING
File 'fs/libs/fs.so':
  Found 'sprintf', possibly from 'sprintf' (C)
    Object: 'file.o'
Compiled code should not call entry points which might 
terminate R nor write to stdout/stderr instead of to the 
console, nor use Fortran I/O nor system RNGs nor [v]sprintf.
See 'Writing portable packages' in the 'Writing R Extensions' manual.

According to the NEWS for R-devel (which will be R 4.3):

The use of sprintf and vsprintf from C/C++ has been deprecated in macOS 13 and is a known security risk. R CMD check now reports (on all platforms) if their use is found in compiled code: replace by snprintf or vsnprintf respectively.

These are considered to be a security risk because they potentially allow buffer overflows that write more bytes than are available in the output buffer. This is a risk if the text that is being passed to sprintf() comes from an uncontrolled source.

Here is a very simple example:

library(cpp11)

cpp_function('
  int say_height(int height) {
    // "My height is xxx cm" is 19 characters but we need
    // to add one for the null-terminator
    char out[19 + 1];
    int n;
    n = sprintf(out, "My height is %i cm", height);
    Rprintf(out);
    return n;
  }
'
)

say_height(182)
#> My height is 182 cm
#> [1] 19
say_height(1824) # This will abort due to buffer overflow

How to fix it

In most cases, this should be a simple fix: replace sprintf() with snprintf() and vsprintf() with vsnprintf(). These n variants take a second parameter, size, which specifies the maximum number of bytes to be written, including the automatically appended null-terminator. If the output is a static buffer (an array whose size is known at compile time), you can use sizeof():

cpp_function('
  int say_height_safely(int height) {
    // "My height is xxx cm\\n" is 20 characters but we need 
    // to add one for the null-terminator
    char out[20 + 1];
    int n;
    n = snprintf(out, sizeof(out), "My height is %i cm\\n", height);
    Rprintf("%s", out);
    return n;
  }
')

say_height_safely(182)
#> My height is 182 cm
#> [1] 20
say_height_safely(1824567)
#> My height is 1824567
#> [1] 24

Notice that the return values of sprintf() and snprintf() are slightly different. sprintf() returns the total number of characters written (excluding the null-terminator), while snprintf() returns the length the full formatted string would have (again excluding the null-terminator), whether or not it was truncated to fit size.
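Because snprintf() reports the length the complete string needs, comparing that value with the buffer size is one way to detect truncation. The following is a minimal sketch in plain C, outside of R; format_height() is a hypothetical helper, not part of any package mentioned here, and the assert() is only a precondition check for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a height into `out` (capacity `size` bytes).
 * Returns 1 if the whole message fit, 0 if it was truncated or if
 * snprintf() reported an error. */
int format_height(char *out, size_t size, int height) {
  assert(out != NULL && size > 0); /* precondition for this sketch */

  int n = snprintf(out, size, "My height is %i cm", height);

  /* n is the length the full message needs, excluding the
   * null-terminator, so it fit only if 0 <= n < size. */
  return n >= 0 && (size_t) n < size;
}
```

With a 20-byte buffer, format_height(buf, sizeof(buf), 182) should report success, while a seven-digit height should report truncation and leave a shortened but still null-terminated string in buf.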

It is a bit trickier if the destination is not a static buffer: sizeof() no longer gives the buffer's capacity, so you’ll have to determine the maximum size by carefully thinking about the code.
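When the required size is not known in advance, one common C99 idiom is to call vsnprintf() (or snprintf()) once with a NULL buffer and a size of 0 to measure the output, then allocate exactly that much and format again. The sketch below is plain C and assumes malloc() is acceptable; format_alloc() is a hypothetical name, and inside a package you may prefer R's own allocators and error handling over malloc() and assert().

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a newly malloc()ed string holding the
 * formatted message, or NULL on failure. The caller must free() it. */
char *format_alloc(const char *fmt, ...) {
  assert(fmt != NULL); /* precondition for this sketch */

  va_list args, args_copy;
  va_start(args, fmt);
  va_copy(args_copy, args); /* the arguments are traversed twice */

  /* First pass: with a NULL buffer and size 0, nothing is written and
   * the return value is the length the formatted string needs. */
  int n = vsnprintf(NULL, 0, fmt, args);
  va_end(args);

  char *out = NULL;
  if (n >= 0) {
    out = malloc((size_t) n + 1); /* +1 for the null-terminator */
    if (out != NULL) {
      /* Second pass: the buffer is now known to be large enough. */
      vsnprintf(out, (size_t) n + 1, fmt, args_copy);
    }
  }
  va_end(args_copy);
  return out;
}
```

A call such as format_alloc("My height is %i cm", 1824567) would then return a 23-character string regardless of how many digits the height has.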

WARNING regarding the use of strict prototypes in C

Many maintainers with packages containing C code have also been getting hit with this warning:

warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]

This usually comes from C function declarations that look like this, with no arguments specified (which is very common):

int myfun() {
  ...
}

This new warning is because CRAN is now running checks on R-devel with the -Wstrict-prototypes compiler flag set. In R we define functions that take no arguments with myfun <- function() {...} all the time. In C, with this flag set, the fact that a function takes no arguments must be explicitly stated (i.e., the argument list cannot be empty), because before C23 an empty list means the arguments are unspecified rather than that there are none. In the upcoming C23 standard, empty function signatures will be considered valid and unambiguous; for now, however, they are the likely reason you encounter this warning from CRAN.

How to fix it

This can be fixed by placing the void keyword in the previously empty argument list:

int myfun(void) {
  ...
}

Here is an example where the authors of Cubist applied the necessary patches, and another one in rlang.
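The flag applies to declarations (for example in header files) and to function pointer types as well as to definitions, so those may need the same treatment. A small sketch, with all names (answer, int_getter, call_getter) invented purely for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Declaration, as it might appear in a header: (void), not (). */
int answer(void);

/* A function pointer type for functions that take no arguments
 * also spells out (void). */
typedef int (*int_getter)(void);

/* Definition matching the declaration above. */
int answer(void) {
  return 42;
}

/* Calls whichever no-argument getter it is handed. */
int call_getter(int_getter f) {
  assert(f != NULL); /* precondition for this sketch */
  return f();
}
```

With the prototypes in place, a call like answer(1) becomes a compile-time error instead of being silently accepted.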

Vendored code

Function declarations without a prototype are very common, and unfortunately are thus likely to appear in libraries that you include in your package. This may require you to patch that code in your package. The readxl package includes the libxls C library, which was patched in readxl here to deal with this issue.

The ideal solution in cases like this would be to submit patches to the upstream libraries so you don’t have to deal with the ongoing maintenance of your local patches, but that is not always possible. Generally, you can explain this problem when submitting your package, and as long as you’ve notified the upstream maintainer, CRAN should accept your updated package.

Unspecified types in function signature

The -Wstrict-prototypes compiler flag will also catch deprecated function definitions where the types of the arguments are not declared. This is likely the primary purpose for CRAN enabling this flag, as such definitions are ambiguous and much more dangerous than empty function signatures.

These take the form:

void myfun(x, y) {
  ...
}

where the argument types are not declared. This is solved by declaring the types of the arguments:

void myfun(int x, char* y) {
  ...
}