ale
Package Release Notes
October 19, 2023
This version introduces various ALE-based statistics that let ALE be used for statistical inference, not just interpretable machine learning. A dedicated vignette introduces this functionality (see “ALE-based statistics for statistical inference and effect sizes” from the vignettes link on the main CRAN page at https://CRAN.R-project.org/package=ale). We introduce these statistics in detail in a working paper: Okoli, Chitu. 2023. “Statistical Inference Using Machine Learning and Classical Techniques Based on Accumulated Local Effects (ALE).” arXiv. https://doi.org/10.48550/arXiv.2310.09877. Please note that they might be further refined after peer review.
ale and model_bootstrap now output these statistics. (ale_ixn will come later.)
A vignette compares the ale package with the reference ALEPlot package: “Comparison between ALEPlot and ale packages” (available from the vignettes link on the main CRAN page at https://CRAN.R-project.org/package=ale).
var_cars is a modified version of mtcars that features many different types of variables.
census is a polished version of the adult income dataset used for a vignette in the ALEPlot package.
silent = TRUE to ale, ale_ixn, or model_bootstrap.
seed argument to ale, ale_ixn, or model_bootstrap.
By far the most extensive changes have been made to ensure the accuracy and stability of the package from a software engineering perspective. Even though these changes are not visible to users, they make the package more robust and, hopefully, less prone to bugs. Indeed, the extensive data validation may help users debug their own errors.
assertthat package; if not, the function fails quickly with an appropriate error message.
ALEPlot package. These tests should ensure that any future code that breaks the accuracy of ALE calculations will be caught quickly.
ale_ixn).
ale_ixn).
August 29, 2023
This is the first CRAN release of the ale package. Here is its official description with the initial release:
Accumulated Local Effects (ALE) were initially developed as a model-agnostic approach for global explanations of the results of black-box machine learning algorithms. (Apley, Daniel W., and Jingyu Zhu. “Visualizing the effects of predictor variables in black box supervised learning models.” Journal of the Royal Statistical Society Series B: Statistical Methodology 82.4 (2020): 1059-1086 doi:10.1111/rssb.12377.) ALE has two primary advantages over other approaches like partial dependency plots (PDP) and SHapley Additive exPlanations (SHAP): its values are not affected by the presence of interactions among variables in a model and its computation is relatively rapid. This package rewrites the original code from the ‘ALEPlot’ package for calculating ALE data and it completely reimplements the plotting of ALE values.
(This package uses the same GPL-2 license as the ALEPlot package.)
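To make the description above more concrete, here is a minimal base R sketch of how one-way ALE values for a numeric predictor can be computed: local prediction differences within quantile intervals are averaged, accumulated, and then centred. It is an illustration only, not the code used by the ale or ALEPlot packages; the function name ale_1d and its arguments predict_fun, data, x_col, and n_bins are hypothetical.

```r
# Illustrative one-way ALE for a numeric predictor (hypothetical helper,
# not the implementation used by the ale or ALEPlot packages).
ale_1d <- function(predict_fun, data, x_col, n_bins = 10) {
  x <- data[[x_col]]

  # Interval edges at quantiles, so intervals hold roughly equal numbers of rows
  edges <- unique(quantile(x, probs = seq(0, 1, length.out = n_bins + 1),
                           names = FALSE))
  k <- length(edges) - 1

  # Interval index (1..k) for every row
  bin <- cut(x, breaks = edges, include.lowest = TRUE, labels = FALSE)

  # Local effect: change in prediction when x moves from the lower to the
  # upper edge of its own interval, all other columns held at observed values
  lower <- data
  upper <- data
  lower[[x_col]] <- edges[bin]
  upper[[x_col]] <- edges[bin + 1]
  delta <- predict_fun(upper) - predict_fun(lower)

  # Average the local effects per interval, then accumulate them
  local_mean <- as.numeric(tapply(delta, factor(bin, levels = seq_len(k)), mean))
  local_mean[is.na(local_mean)] <- 0  # empty intervals contribute nothing
  acc <- c(0, cumsum(local_mean))

  # Centre so that the count-weighted average effect is zero
  counts <- tabulate(bin, nbins = k)
  centre <- sum((acc[-1] + acc[-(k + 1)]) / 2 * counts) / sum(counts)

  data.frame(x = edges, ale = acc - centre)
}

# Usage with a toy additive "model": the ALE of x1 should have slope 2
set.seed(1)
d <- data.frame(x1 = runif(200), x2 = rnorm(200))
toy_model <- function(newdata) 2 * newdata$x1 + newdata$x2
res <- ale_1d(toy_model, d, "x1")
```

For a fitted model object, predict_fun could be a wrapper such as function(nd) predict(fit, newdata = nd), where fit is whatever model the reader has trained.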
This initial release replicates the full functionality of the ALEPlot package and a lot more. It currently presents three functions:
ale: create data for and plot one-way ALE (single variables). ALE values may be bootstrapped.
ale_ixn: create data for and plot two-way ALE interactions. Bootstrapping of the interaction ALE values has not yet been implemented.
model_bootstrap: bootstrap an entire model, not just the ALE values. This function returns the bootstrapped model statistics and coefficients as well as the bootstrapped ALE values. This is the appropriate approach for small samples.
This release provides more details in the following vignettes (they are all available from the vignettes link on the main CRAN page at https://CRAN.R-project.org/package=ale):
ale package
ale function handling of various datatypes for x
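As a rough illustration of what bootstrapping an entire model involves (the idea behind the model_bootstrap entry above), the base R sketch below refits a linear model on resampled rows and summarises the resulting coefficients. It does not call the ale package and does not reproduce its output format; the helper boot_coefs, its arguments, and the choice of lm on mtcars are assumptions made for the example.

```r
# Illustrative whole-model bootstrap (hypothetical helper, not the
# model_bootstrap function from the ale package).
boot_coefs <- function(data, formula, n_boot = 200, seed = 0) {
  set.seed(seed)  # fixed seed so the resamples are reproducible

  # Refit the model on each resample; result is coefficients x iterations
  fits <- replicate(n_boot, {
    idx <- sample(nrow(data), replace = TRUE)
    coef(lm(formula, data = data[idx, , drop = FALSE]))
  })

  # Summarise each coefficient: bootstrap mean and a 95% percentile interval
  t(apply(fits, 1, function(b) {
    c(estimate = mean(b), quantile(b, c(0.025, 0.975)))
  }))
}

# Usage on the built-in mtcars data
res_boot <- boot_coefs(mtcars, mpg ~ wt + hp)
```

Each row of res_boot corresponds to one model coefficient. The same resampling loop could also recompute ALE values on every refit, which is the part that distinguishes bootstrapping the whole model from bootstrapping only the ALE values of a single fitted model.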