fs 1.0.0

  r-lib, fs

  Jim Hester

fs 1.0.0 is now available on CRAN! fs provides a cross-platform, uniform interface to file system operations. fs uses libuv under the hood, which gives a rock-solid cross-platform interface to the file system.

Install the latest version with:

install.packages("fs")

Comparison vs base equivalents

fs functions smooth over some of the idiosyncrasies of file handling with base R functions:

  • Vectorization. All fs functions are vectorized, accepting multiple paths as input. Base functions are inconsistently vectorized.

  • Predictable return values that always convey a path. All fs functions return a character vector of paths, or a named integer or logical vector whose names give the paths. Base return values are more varied: they are often logical or contain error codes that require downstream processing.

  • Explicit failure. If an fs operation fails, it throws an error. Base functions tend to generate a warning and a system-dependent error code, which makes it easy to miss a failure.

  • UTF-8 all the things! fs functions always convert input paths to UTF-8 and return results as UTF-8. This gives you path encoding consistency across OSes. Base functions rely on the native system encoding.

  • Naming convention. fs functions use a consistent naming convention. Because base R’s functions were added gradually over time, they follow a number of different conventions (e.g. path.expand() vs normalizePath(); Sys.chmod() vs file.access()).
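As a minimal sketch of the "explicit failure" and "predictable return values" points above, the snippet below tries to delete a file that does not exist, first with base R and then with fs. The variable names (missing_path, base_result, fs_result, exists_result) are purely illustrative.

```r
library(fs)

# tempfile() only generates a name, so nothing exists at this path yet.
missing_path <- tempfile(fileext = ".txt")

# Base R: returns FALSE and emits a warning, so the failure is easy to overlook.
base_result <- suppressWarnings(file.remove(missing_path))

# fs: the same operation raises an error that has to be handled explicitly.
fs_result <- tryCatch(
  {
    file_delete(missing_path)
    "deleted"
  },
  error = function(e) "error"
)

# fs predicates are vectorized and name their results by path.
exists_result <- file_exists(c(tempdir(), missing_path))
exists_result
```

Here base_result is FALSE, while the fs call ends up in the error handler; exists_result is a logical vector whose names are the two paths.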

Tidy paths

fs functions always return ‘tidy’ paths. Tidy paths:

  • always expand ~
  • use / to delimit directories
  • never have multiple / or trailing /

Tidy paths are also coloured (if your terminal supports it) based on the file permissions and file type. This colouring can be customised or extended by setting the LS_COLORS environment variable, in the same output format as GNU dircolors.
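The tidying rules listed above can be seen directly with path_tidy(). This is a small sketch using made-up relative paths; nothing needs to exist on disk for it to run.

```r
library(fs)

# Hypothetical paths: one with a doubled separator and a trailing slash,
# one that is already tidy.
untidy <- c("foo//bar/baz/", "foo/bar/baz")

# path_tidy() collapses repeated separators and drops the trailing one.
tidy <- path_tidy(untidy)
tidy
```

Both inputs should come back as the same tidy path, which makes tidy paths convenient to compare or use as keys.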

Usage

fs functions are divided into four main categories:

  • path_ for manipulating paths
  • file_ for files
  • dir_ for directories
  • link_ for links

Directories and links are special types of files, so file_ functions will generally also work when applied to a directory or link.
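For instance, here is a short sketch of file_ functions applied to a directory. It assumes a throwaway directory under the session's temporary folder, and the variable names are illustrative only.

```r
library(fs)

# Create a scratch directory to inspect.
tmp_dir <- dir_create(file_temp())

# file_ functions accept the directory just as they would a regular file.
dir_exists_result <- file_exists(tmp_dir)
dir_type <- file_info(tmp_dir)$type

# is_dir() confirms what kind of file it is.
is_directory <- is_dir(tmp_dir)

# Clean up the scratch directory.
dir_delete(tmp_dir)
```

file_info() reports the type of each path, so the same call can be used to tell files, directories and links apart.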

library(fs)

# list files in the current directory
dir_ls()
#> COPYRIGHTS  DESCRIPTION INDEX       Meta        NAMESPACE   NEWS.md     
#> R           help        html        libs        tests

# create a new directory
tmp <- dir_create(file_temp())
tmp
#> /var/folders/dt/r5s12t392tb5sk181j3gs4zw0000gn/T/RtmpmTqzrq/filed45419bda5e2

# create new files in that directory
file_create(path(tmp, "my-file.txt"))
dir_ls(tmp)
#> /var/folders/dt/r5s12t392tb5sk181j3gs4zw0000gn/T/RtmpmTqzrq/filed45419bda5e2/my-file.txt

# remove files from the directory
file_delete(path(tmp, "my-file.txt"))
dir_ls(tmp)
#> character(0)

# remove the directory
dir_delete(tmp)

fs is designed to work well with the pipe, though because it is a minimal-dependency infrastructure package it doesn’t provide the pipe itself. You will need to attach magrittr or similar.

library(magrittr)

paths <- file_temp() %>%
  dir_create() %>%
  path(letters[1:5]) %>%
  file_create()
paths
#> /var/folders/dt/r5s12t392tb5sk181j3gs4zw0000gn/T/RtmpmTqzrq/filed45416d276a/a
#> /var/folders/dt/r5s12t392tb5sk181j3gs4zw0000gn/T/RtmpmTqzrq/filed45416d276a/b
#> /var/folders/dt/r5s12t392tb5sk181j3gs4zw0000gn/T/RtmpmTqzrq/filed45416d276a/c
#> /var/folders/dt/r5s12t392tb5sk181j3gs4zw0000gn/T/RtmpmTqzrq/filed45416d276a/d
#> /var/folders/dt/r5s12t392tb5sk181j3gs4zw0000gn/T/RtmpmTqzrq/filed45416d276a/e

paths %>% file_delete()
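One "similar" option, assuming R 4.1.0 or later, is the native pipe |>, which needs no extra package. The sketch below mirrors the magrittr example above with three files instead of five; native_paths and created are illustrative names.

```r
library(fs)

# Assumes R >= 4.1.0, where the native pipe |> is available.
native_paths <- file_temp() |>
  dir_create() |>
  path(letters[1:3]) |>
  file_create()

# Record that the files exist before removing them again.
created <- file_exists(native_paths)

native_paths |> file_delete()
```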

fs functions also work well in conjunction with other tidyverse packages, like dplyr and purrr.

Some examples…

suppressMessages(library(tidyverse))

Filter files by type, permission, size and 15 other attributes.

dir_info(recursive = TRUE) %>%
  filter(type == "file", permissions == "u+r", size > "10KB") %>%
  arrange(desc(size)) %>%
  select(path, permissions, size, modification_time)
#> # A tibble: 5 x 4
#>   path                       permissions        size modification_time  
#>   <fs::path>                 <fs::perms> <fs::bytes> <dttm>             
#> 1 libs/fs.so                 rwxr-xr-x        328.6K 2018-01-19 08:32:18
#> 2 R/fs.rdb                   rw-r--r--        214.7K 2018-01-19 08:32:18
#> 3 help/fs.rdb                rw-r--r--         45.1K 2018-01-19 08:32:19
#> 4 COPYRIGHTS                 rw-r--r--         24.1K 2018-01-19 08:32:18
#> 5 tests/testthat/test-path.R rw-r--r--         11.2K 2018-01-19 08:32:18

Tabulate and display folder size.

dir_info(recursive = TRUE) %>%
  group_by(directory = path_dir(path)) %>%
  tally(wt = size, sort = TRUE)
#> # A tibble: 8 x 2
#>   directory                n
#>   <fs::path>     <fs::bytes>
#> 1 libs               328.62K
#> 2 R                   217.8K
#> 3 tests/testthat      48.18K
#> 4 help                47.67K
#> 5 .                   29.96K
#> 6 html                 9.25K
#> 7 Meta                 4.82K
#> 8 tests                  728

Read a collection of files into one data frame.

dir_ls() returns a named vector, so it can be used directly with purrr::map_df(.id).

# Create separate files for each species
iris %>%
  split(.$Species) %>%
  map(select, -Species) %>%
  iwalk(~ write_tsv(.x, paste0(.y, ".tsv")))

# Show the files
iris_files <- dir_ls(glob = "*.tsv")
iris_files
#> setosa.tsv     versicolor.tsv virginica.tsv

# Read the data into a single table, including the filenames
iris_files %>%
  map_df(read_tsv, .id = "file", col_types = cols(), n_max = 2)
#> # A tibble: 6 x 5
#>   file           Sepal.Length Sepal.Width Petal.Length Petal.Width
#>   <chr>                 <dbl>       <dbl>        <dbl>       <dbl>
#> 1 setosa.tsv             5.10        3.50         1.40       0.200
#> 2 setosa.tsv             4.90        3.00         1.40       0.200
#> 3 versicolor.tsv         7.00        3.20         4.70       1.40 
#> 4 versicolor.tsv         6.40        3.20         4.50       1.50 
#> 5 virginica.tsv          6.30        3.30         6.00       2.50 
#> 6 virginica.tsv          5.80        2.70         5.10       1.90

file_delete(iris_files)

Feedback wanted!

We hope fs is a useful tool for both analysis scripts and packages. Please open GitHub issues for any feature requests or bugs.

In particular, we have found non-ASCII filenames in non-English locales on Windows to be especially tricky to reproduce and handle correctly. One fix for this has already been made since fs was submitted to CRAN. Additional feedback from users is greatly appreciated!

Learn more about fs at http://fs.r-lib.org