Variables' importance calculation

This internal biomod2 function allows the user to compute a variable importance value for each variable involved in the given model.

bm_VariablesImportance(
  bm.model,
  expl.var,
  variables = NULL,
  method = "full_rand",
  nb.rep = 1,
  seed.val = NULL,
  do.progress = TRUE,
  temp.workdir = NULL
)

Arguments

bm.model: a biomod2_model object (or nnet, rpart, fda, gam, glm, lm, gbm, mars, randomForest, xgb.Booster) that can be obtained with the get_formal_model function
expl.var: a data.frame containing the explanatory variables that will be used to compute the variables importance
variables: (optional, default NULL)
A vector containing the names of the explanatory variables that will be considered
method: a character corresponding to the randomization method to be used, must be full_rand (only method available so far)
nb.rep: an integer corresponding to the number of permutations to be done for each variable
seed.val: (optional, default NULL)
An integer value corresponding to the new seed value to be set
do.progress: (optional, default TRUE)
A logical value defining whether the progress bar is to be rendered or not
temp.workdir: (optional, default NULL)
A character value corresponding to the name of the folder containing temporary prediction files when using MAXENT

Value

A 3-column data.frame containing the variable importance scores for each permutation run:

expl.var: the explanatory variable considered (the one permuted)
rand: the ID of the permutation run
var.imp: the variable's importance score
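
With nb.rep greater than 1, each variable gets one row per permutation run, so it is often convenient to average the scores per variable. The sketch below shows one way to do that in base R. The data.frame vi is built by hand here, with made-up scores, only to mimic the three columns described above; it is not real output from the function.

```r
## Hand-built stand-in for the returned data.frame (illustrative values only)
vi <- data.frame(
  expl.var = rep(c("var2", "var3"), each = 3),
  rand     = rep(1:3, times = 2),
  var.imp  = c(0.10, 0.12, 0.11, 0.80, 0.78, 0.82)
)

## Mean importance per variable across permutation runs
vi_mean <- aggregate(var.imp ~ expl.var, data = vi, FUN = mean)

## Order variables from most to least important
vi_mean <- vi_mean[order(vi_mean$var.imp, decreasing = TRUE), ]
print(vi_mean)
```

Averaging is only one possible summary; the spread across runs (for example with FUN = sd) indicates how stable each score is.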

Details

For each variable to be evaluated:

1. shuffle the original variable
2. compute model prediction with shuffled variable
3. calculate Pearson's correlation between reference and shuffled predictions
4. return score as 1 - cor

The higher the value, the less correlated the reference and shuffled predictions are, and the more influence the variable has on the model. A value of 0 indicates that the variable has no influence on the model.
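
The shuffle-and-correlate procedure described above can be sketched in a few lines of base R. This is a minimal illustration of the principle, not the package's own implementation: the helper perm_importance, the simulated data and the lm model are all assumptions made for the example, and any rounding or bounding the package may apply to the score is not reproduced.

```r
## Simulated data: y depends strongly on x1 and not at all on x2
set.seed(42)
dat <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
dat$y <- 2 * dat$x1 + rnorm(200, sd = 0.1)
fit <- lm(y ~ x1 + x2, data = dat)

## Hypothetical helper: importance of one variable for one permutation run
perm_importance <- function(model, data, var) {
  ref <- predict(model, newdata = data)         # reference predictions
  shuffled <- data
  shuffled[[var]] <- sample(shuffled[[var]])    # shuffle the variable
  shuf <- predict(model, newdata = shuffled)    # predictions after shuffling
  1 - cor(ref, shuf, method = "pearson")        # score = 1 - correlation
}

imp <- sapply(c("x1", "x2"), function(v) perm_importance(fit, dat, v))
print(round(imp, 3))
```

In this setup the score for x1 should be close to 1, because shuffling it destroys most of the agreement with the reference predictions, while the score for x2 should be close to 0.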

Note that this calculation does not account for variables' interactions.

The same principle is used in randomForest.

Author

Damien Georges

Examples

library(biomod2)

## Create simple simulated data
## (myResp.s is not used by the model fitted below)
myResp.s <- sample(c(0, 1), 20, replace = TRUE)
myExpl.s <- data.frame(var1 = sample(c(0, 1), 100, replace = TRUE),
                       var2 = rnorm(100),
                       var3 = 1:100)

## Fit a simple model, then compute variables importance
## with 3 permutation runs per explanatory variable
mod <- glm(var1 ~ var2 + var3, data = myExpl.s)
bm_VariablesImportance(bm.model = mod,
                       expl.var = myExpl.s[, c('var2', 'var3')],
                       method = "full_rand",
                       nb.rep = 3)
