Introduction to formattable package

2016-08-05

This package applies formatting to vectors and data frames to make data presentation easier, richer, and more flexible, and to help the output convey more information.

Atomic vectors are fundamental data structures in R. Some data are easier to read when formatted. For example, a numeric vector may store percentages yet print as ordinary floating-point numbers. This package provides functions that create data structures with predefined formatting rules, so that the objects store the original data but are printed with formatting.

Several typical formattable numeric vectors are provided, such as percent, comma, currency, accounting and scientific. These functions create numeric vectors with predefined formatting rules and parameters. For example,

library(formattable)
p <- percent(c(0.1, 0.02, 0.03, 0.12))
p
## [1] 10.00% 2.00%  3.00%  12.00%

The percent vector behaves like an ordinary numeric vector but is printed as percentages. It works with arithmetic operations and other common functions, and the results keep the formatting.

p + 0.05
## [1] 15.00% 7.00%  8.00%  17.00%
p + percent(0.02)
## [1] 12.00% 4.00%  5.00%  14.00%
p * 1.1
## [1] 11.00% 2.20%  3.30%  13.20%
max(p)
## [1] 12.00%
mean(p)
## [1] 6.75%
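
To make the separation between stored data and printed form concrete, the following minimal sketch inspects a percent vector with base R functions. It assumes that as.numeric() drops the formattable attributes, as base R coercion normally does, and that format() returns the strings used for printing; the names raw and shown are illustrative only.

```r
library(formattable)

p <- percent(c(0.1, 0.02, 0.03, 0.12))

# The object is still numeric; the formatting rules travel with it
# as attributes (assumption noted above).
is.numeric(p)

# as.numeric() returns the stored values without any formatting.
raw <- as.numeric(p)
raw

# format() returns the character representation used when printing.
shown <- format(p)
shown
```

Keeping the raw values intact is what allows arithmetic such as p + 0.05 to operate on the numbers rather than on the printed strings.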

It also works with subsetting and sub-assignment:

p[1:3]
## [1] 10.00% 2.00%  3.00%
p[[2]]
## [1] 2.00%
p[[3]] <- 0.05
p
## [1] 10.00% 2.00%  5.00%  12.00%

An accounting vector prints negative values in parentheses, and arithmetic keeps that format:

balance <- accounting(c(1000, 500, 200, -150, 0, 1200))
balance
## [1] 1,000.00 500.00   200.00   (150.00) 0.00     1,200.00
balance + 1000
## [1] 2,000.00 1,500.00 1,200.00 850.00   1,000.00 2,200.00

These functions are specialized applications of what formattable() is designed to do. formattable() applies customizable formatting functions to objects of a wide range of classes like numeric, logical, factor, Date, data.frame, etc.
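
As a rough illustration of the general mechanism, the sketch below calls formattable() directly on a numeric vector. It assumes that, for numeric input, the extra arguments (here format and digits) are forwarded to formatC(), the default formatter; the variable x is illustrative only.

```r
library(formattable)

# Assumption: format and digits are passed on to formatC(),
# the default formatter for numeric vectors.
x <- formattable(c(1.234, 22.5, 300), format = "f", digits = 1)
x

# Arithmetic works on the stored values and keeps the formatting rules.
x * 2

# The stored values are unchanged by the formatting.
as.numeric(x)
```

In this view, percent() and its siblings are shortcuts that fill in such formatter arguments in advance.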

When applied to a Date vector, formattable() uses format.Date() as the default formatter function. The following code creates a formattable Date vector that is printed in the format %Y%m%d. The result is not a plain integer or character vector, however: it is still of class Date and supports date calculations.

dates <- formattable(as.Date(c("2016-05-01", "2016-05-10")), format = "%Y%m%d")
dates
## [1] 20160501 20160510
dates + 30
## [1] 20160531 20160609

When applied to a logical vector, we can customize how TRUE and FALSE values are printed.

lv <- formattable(c(TRUE, FALSE, FALSE, TRUE), "yes", "no")
lv
## [1] yes no  no  yes
!lv
## [1] no  yes yes no

Note that isTRUE() does not work directly with values of lv because isTRUE() uses identical(x, TRUE) (its definition before R 3.5.0), and lv[[1]], as a formattable logical value, is not identical to a plain TRUE.

lv[[1]]
## [1] yes
isTRUE(lv[[1]])
## [1] FALSE

If isTRUE() has to be applied, lv == TRUE returns a plain logical vector and works with isTRUE(). Other vectorized logical functions work directly with formattable logical vectors and preserve the formatting.

all(lv)
## [1] no
any(lv)
## [1] yes
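
When a plain logical value is needed, for instance before calling isTRUE(), one possible approach is to coerce explicitly. The sketch below assumes that as.logical() drops the formattable class and attributes, as base R coercion normally does; the name plain is illustrative only.

```r
library(formattable)

lv <- formattable(c(TRUE, FALSE, FALSE, TRUE), "yes", "no")

# Assumption: as.logical() strips the formattable attributes and
# returns an ordinary logical vector.
plain <- as.logical(lv)
plain

# A plain TRUE works with isTRUE().
isTRUE(plain[[1]])
```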

All formattable functions work with matrices and arrays.

pm <- matrix(rnorm(6, 0.8, 0.1), 2, 3,
  dimnames = list(c("a", "b"), c("X", "Y", "Z")))
pm
##           X         Y         Z
## a 0.7424653 0.6382117 0.8519407
## b 0.8607964 0.7944438 0.8301153
fpm <- percent(pm)
fpm
##   X      Y      Z
## a 74.25% 63.82% 85.19%
## b 86.08% 79.44% 83.01%
fpm["a", c("Y", "Z")]
##      Y      Z
## 63.82% 85.19%
pa <- array(rnorm(12, 0.8, 0.1), c(2, 3, 2))
pa
## , , 1
##
##           [,1]      [,2]      [,3]
## [1,] 0.8105676 0.7150296 0.8117647
## [2,] 0.7359294 0.6975871 0.7052525
##
## , , 2
##
##           [,1]      [,2]      [,3]
## [1,] 0.7509443 0.9843862 0.8235387
## [2,] 0.7743908 0.7348050 0.8077961
percent(pa)
## , , 1
##
##      [,1]   [,2]   [,3]
## [1,] 81.06% 71.50% 81.18%
## [2,] 73.59% 69.76% 70.53%
##
## , , 2
##
##      [,1]   [,2]   [,3]
## [1,] 75.09% 98.44% 82.35%
## [2,] 77.44% 73.48% 80.78%

When formattable vectors are used as columns of a data frame, the formatting of each column is preserved. A typical data frame can be easier to read with formattable column vectors. For example,

p <- data.frame(
  id = c(1, 2, 3, 4, 5),
  name = c("A1", "A2", "B1", "B2", "C1"),
  balance = accounting(c(52500, 36150, 25000, 18300, 7600), format = "d"),
  growth = percent(c(0.3, 0.3, 0.1, 0.15, 0.15), format = "d"),
  ready = formattable(c(TRUE, TRUE, FALSE, FALSE, TRUE), "yes", "no"))
p
##   id name balance growth ready
## 1  1   A1  52,500    30%   yes
## 2  2   A2  36,150    30%   yes
## 3  3   B1  25,000    10%    no
## 4  4   B2  18,300    15%    no
## 5  5   C1   7,600    15%   yes

The subset of a data frame also preserves the formatting of each column:

p[1:3, c("name", "balance", "growth")]
##   name balance growth
## 1   A1  52,500    30%
## 2   A2  36,150    30%
## 3   B1  25,000    10%
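
Because formattable columns keep their formatting under arithmetic, a derived column can inherit the format of the column it is computed from. The following is a minimal sketch: the data frame accounts and the column projected are hypothetical names, and as.numeric() is used on growth so that only one formattable vector takes part in the multiplication.

```r
library(formattable)

accounts <- data.frame(
  name = c("A1", "A2", "B1"),
  balance = accounting(c(52500, 36150, 25000), format = "d"),
  growth = percent(c(0.3, 0.3, 0.1), format = "d"))

# Multiply the accounting column by a plain numeric growth factor,
# so the result is expected to keep the accounting format.
accounts$projected <- accounts$balance * (1 + as.numeric(accounts$growth))
accounts
```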