Configure the modeling options for each selected model

Parameterize and/or tune the options of biomod2's single models.

bm_ModelingOptions(
  data.type = "binary",
  models,
  strategy,
  user.val = NULL,
  user.base = "bigboss",
  bm.format = NULL,
  calib.lines = NULL
)

Arguments

data.type: a character corresponding to the data type to be used, must be one of binary, count, relative, abundance, multiclass, ordinal
models: a vector containing model names to be computed, must be among ANN, CTA, DNN, FDA, GAM, GBM, GLM, MARS, MAXENT, MAXNET, RF, RFd, SRE, XGBOOST
strategy: a character corresponding to the method used to select the models' parameter values, must be one of default, bigboss, user.defined, tuned (see Details)
user.val: (optional, default NULL)
A list containing parameter values for some or all models
user.base: (optional, default bigboss)
If strategy = 'user.defined', a character corresponding to the basic set of options to be modified by user defined values, must be either default or bigboss (see Details)
bm.format: (optional, default NULL)
A BIOMOD.formated.data or BIOMOD.formated.data.PA object returned by the BIOMOD_FormatingData function
calib.lines: (optional, default NULL)
A data.frame object returned by the get_calib_lines or bm_CrossValidation function

Value

A BIOMOD.models.options object that can be used to build species distribution model(s) with the BIOMOD_Modeling function.

Details

This function creates a BIOMOD.models.options object containing parameter values for each single model that can be run within biomod2 through the BIOMOD_Modeling function.

11 different algorithms are currently available, with different versions for some (GAM, MAXENT, RF). These models are associated with binary or nonbinary data types, but not all combinations are available. They are all listed in the ModelsTable dataset.

Different strategies are available to set those parameters, through the strategy argument:

default: all parameter names and values are retrieved directly from the functions to be called, through the formalArgs and formals functions
bigboss: default parameter values are updated with values predefined by the biomod2 team (see OptionsBigboss)
user.defined: default parameter values are updated with values provided by the user
tuned: default parameter values are updated by calling the bm_Tuning function
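
The mechanism behind the default and user.defined strategies can be illustrated with base R alone. The sketch below is not biomod2's internal code: it uses stats::glm as a stand-in modelling function, and the names arg.names, arg.values and updated are chosen here for illustration only. It retrieves argument names and default values with formalArgs and formals, then overrides one default the way a user-provided value would.

```r
library(methods)  # formalArgs() is provided by the methods package

# stats::glm stands in for any modelling function (assumption for this sketch)
arg.names  <- formalArgs(stats::glm)   # character vector of argument names
arg.values <- formals(stats::glm)      # pairlist of default values

arg.names
arg.values$method                      # default value of a single argument

# Illustration of "defaults updated with user values": override one default.
# modifyList() is used here for convenience; biomod2 may proceed differently.
updated <- utils::modifyList(as.list(arg.values), list(method = "model.frame"))
updated$method
```

Arguments without a default appear in the result of formals as empty symbols, so only arguments that carry an explicit default can be read back as values.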

To define the same options for all datasets of a model, options can be provided through the user.val parameter as a list whose name is for_all_datasets.
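
A minimal sketch of such a list is given below. It assumes the options entry name SRE.binary.biomod2.bm_SRE and the quant parameter used in the Examples section; the object name opt.all is hypothetical. The call to bm_ModelingOptions is only attempted when biomod2 is installed.

```r
# Same SRE options for every dataset, using the for_all_datasets name
user.SRE <- list(for_all_datasets = list(quant = 0.01))
user.val <- list(SRE.binary.biomod2.bm_SRE = user.SRE)

str(user.val)

# Only run the biomod2 call when the package is available
if (requireNamespace("biomod2", quietly = TRUE)) {
  opt.all <- biomod2::bm_ModelingOptions(data.type = "binary",
                                         models = "SRE",
                                         strategy = "user.defined",
                                         user.val = user.val)
  opt.all
}
```

The Examples section shows the alternative of naming list elements after a specific dataset, such as '_allData_allRun'.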

Note

MAXENT being the only external model (not called through an R package), its default parameters and their values are the following:

FLAG	default	Description
`path_to_maxent.jar`	getwd()	a `character` corresponding to path to `maxent.jar` file
`memory_allocated`	512	an `integer` corresponding to the amount of memory (in MB) reserved for `java` to run `MAXENT`, must be one of `64`, `128`, `256`, `512`, `1024`... or `NULL` to use default `java` memory limitation parameter
`initial_heap_size`	NULL	a `character` corresponding to initial heap space (shared memory space) allocated to `java` (argument `-Xms` when calling `java`), must be either `1024K`, `4096M`, `10G` ... or `NULL` to use default `java` parameter. Used in `BIOMOD_Projection` but not in `BIOMOD_Modeling`.
`max_heap_size`	NULL	a `character` corresponding to maximum heap space (shared memory space) allocated to `java` (argument `-Xmx` when calling `java`), must be either `1024K`, `4096M`, `10G` ... or `NULL` to use default `java` parameter, and must be larger than `initial_heap_size`. Used in `BIOMOD_Projection` but not in `BIOMOD_Modeling`.
`background_data_dir`	'default'	a `character` corresponding to path to folder where explanatory variables are stored as `ASCII` files (raster format). If specified, `MAXENT` will generate its own background data from rasters of explanatory variables (`'default'` value). Otherwise biomod2 pseudo-absences will be used (see `BIOMOD_FormatingData`).
`visible`	FALSE	a `logical` value defining whether `MAXENT` user interface is to be used or not
`linear`	TRUE	a `logical` value defining whether linear features are to be used or not
`quadratic`	TRUE	a `logical` value defining whether quadratic features are to be used or not
`product`	TRUE	a `logical` value defining whether product features are to be used or not
`threshold`	TRUE	a `logical` value defining whether threshold features are to be used or not
`hinge`	TRUE	a `logical` value defining whether hinge features are to be used or not
`l2lqthreshold`	10	an `integer` corresponding to the number of samples at which quadratic features start being used
`lq2lqptthreshold`	80	an `integer` corresponding to the number of samples at which product and threshold features start being used
`hingethreshold`	15	an `integer` corresponding to the number of samples at which hinge features start being used
`beta_lqp`	-1.0	a `numeric` corresponding to the regularization parameter to be applied to all linear, quadratic and product features (negative value enables automatic setting)
`beta_threshold`	-1.0	a `numeric` corresponding to the regularization parameter to be applied to all threshold features (negative value enables automatic setting)
`beta_hinge`	-1.0	a `numeric` corresponding to the regularization parameter to be applied to all hinge features (negative value enables automatic setting)
`beta_categorical`	-1.0	a `numeric` corresponding to the regularization parameter to be applied to all categorical features (negative value enables automatic setting)
`betamultiplier`	1	a `numeric` corresponding to the number by which to multiply all automatic regularization parameters (higher number gives a more spread-out distribution)
`defaultprevalence`	0.5	a `numeric` corresponding to the default prevalence of the modelled species (probability of presence at ordinary occurrence points)
`togglelayerselected`	NULL	a `character` defining the prefix to be used to REMOVE environmental layers whose names begin with this prefix (all selected by default)
`maximumbackground`	10000	an `integer` corresponding to the maximum number of pixels to be selected randomly among background points if larger than this number
`maximumiterations`	500	an `integer` corresponding to the maximum number of iterations for training
`convergencethreshold`	0.00005	a `numeric` corresponding to the drop in log loss per iteration below which the training will be stopped
`autofeature`	TRUE	a `logical` value defining whether feature classes to be used will automatically be selected based on number of training samples or not
`jackknife`	FALSE	a `logical` value defining whether variables' importance is to be measured or not. If selected, training will be done with each environmental variable first omitted then used in isolation.
`writeclampgrid`	FALSE	a `logical` value defining whether clamping mask is to be saved or not (absolute difference between prediction values with and without clamping)
`writemess`	FALSE	a `logical` value defining whether multidimensional environmental similarity surface (MESS) is to be saved or not (degree of novelty of environmental conditions, and which variable is the most out of range at each point)
`logfile`	'maxent.log'	a `character` corresponding to file name in which debugging information will be written
`verbose`	FALSE	a `logical` value defining whether detailed diagnostics for debugging should be given or not
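
The constraint that `max_heap_size` must be larger than `initial_heap_size` can be checked before the values are passed on. The sketch below relies on a hypothetical helper, heap_to_bytes, which is not part of biomod2; it assumes the usual `java` convention that the K, M and G suffixes denote multiples of 1024.

```r
# Hypothetical helper (not part of biomod2): convert a java-style heap size
# such as "1024K", "4096M" or "10G" into a number of bytes.
heap_to_bytes <- function(x) {
  stopifnot(is.character(x), length(x) == 1)
  m <- regmatches(x, regexec("^([0-9]+)([KMG])$", toupper(x)))[[1]]
  if (length(m) != 3) stop("unrecognised heap size: ", x)
  mult <- c(K = 1024, M = 1024^2, G = 1024^3)
  as.numeric(m[2]) * mult[[m[3]]]
}

# Example values chosen for illustration
initial_heap_size <- "1024M"
max_heap_size     <- "4G"

# max_heap_size must be larger than initial_heap_size
stopifnot(heap_to_bytes(max_heap_size) > heap_to_bytes(initial_heap_size))
```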

Author

Damien Georges, Wilfried Thuiller, Maya Guéguen

Examples

library(terra)

# Load species occurrences (6 species available)
data(DataSpecies)
head(DataSpecies)

# Select the name of the studied species
myRespName <- 'GuloGulo'

# Get corresponding presence/absence data
myResp <- as.numeric(DataSpecies[, myRespName])

# Get corresponding XY coordinates
myRespXY <- DataSpecies[, c('X_WGS84', 'Y_WGS84')]

# Load environmental variables extracted from BIOCLIM (bio_3, bio_4, bio_7, bio_11 & bio_12)
data(bioclim_current)
myExpl <- terra::rast(bioclim_current)

# Crop to a smaller extent to keep the example fast
myExtent <- terra::ext(0, 30, 45, 70)
myExpl <- terra::crop(myExpl, myExtent)

# ---------------------------------------------------------------#
# Format Data with true absences
myBiomodData <- BIOMOD_FormatingData(resp.name = myRespName,
                                     resp.var = myResp,
                                     resp.xy = myRespXY,
                                     expl.var = myExpl)

# k-fold selection
cv.k <- bm_CrossValidation(bm.format = myBiomodData,
                           strategy = 'kfold',
                           nb.rep = 2,
                           k = 3)


# ---------------------------------------------------------------#
allModels <- c('ANN', 'CTA', 'DNN', 'FDA', 'GAM', 'GBM', 'GLM', 'MARS'
               , 'MAXENT', 'MAXNET', 'RF', 'RFd', 'SRE', 'XGBOOST')

# default parameters
opt.d <- bm_ModelingOptions(data.type = 'binary',
                            models = allModels,
                            strategy = 'default')

# providing formatted data
opt.df <- bm_ModelingOptions(data.type = 'binary',
                             models = allModels,
                             strategy = 'default',
                             bm.format = myBiomodData,
                             calib.lines = cv.k)

opt.d
opt.d@models
opt.d@options$ANN.binary.nnet.nnet
names(opt.d@options$ANN.binary.nnet.nnet@args.values)

opt.df@options$ANN.binary.nnet.nnet
names(opt.df@options$ANN.binary.nnet.nnet@args.values)


# ---------------------------------------------------------------#
# bigboss parameters
opt.b <- bm_ModelingOptions(data.type = 'binary',
                            models = allModels,
                            strategy = 'bigboss')

# user defined parameters
user.SRE <- list('_allData_allRun' = list(quant = 0.01))
user.XGBOOST <- list('_allData_allRun' = list(nrounds = 10))
user.val <- list(SRE.binary.biomod2.bm_SRE = user.SRE
                 , XGBOOST.binary.xgboost.xgboost = user.XGBOOST)

opt.u <- bm_ModelingOptions(data.type = 'binary',
                            models = c('SRE', 'XGBOOST'),
                            strategy = 'user.defined',
                            user.val = user.val)

opt.b
opt.u

if (FALSE) { # not run
# tuned parameters with formatted data
opt.t <- bm_ModelingOptions(data.type = 'binary',
                            models = c('SRE', 'XGBOOST'),
                            strategy = 'tuned',
                            bm.format = myBiomodData)
opt.t
}