R/bm_FindOptimStat.R
bm_FindOptimStat.Rd
This internal biomod2 function finds the threshold that converts continuous values into binary ones so as to obtain the best score for a given evaluation metric.
bm_FindOptimStat(
  metric.eval = "TSS",
  obs,
  fit,
  nb.thresh = 100,
  threshold = NULL,
  boyce.bg.env = NULL,
  mpa.perc = 0.9
)
get_optim_value(metric.eval)
bm_CalculateStat(misc, metric.eval = "TSS")
metric.eval: a character corresponding to the evaluation metric to be used, must be either POD, FAR, POFD, SR, ACCURACY, BIAS, ROC, TSS, KAPPA, OR, ORSS, CSI, ETS, BOYCE, MPA
obs: a vector of observed values (binary, 0 or 1)
fit: a vector of fitted values (continuous)
nb.thresh: an integer corresponding to the number of thresholds to be tested over the range of fitted values
threshold: (optional, default NULL) a numeric corresponding to the threshold used to convert the given data
boyce.bg.env: (optional, default NULL) a matrix, data.frame, SpatVector or SpatRaster object containing values of environmental variables (in columns or layers) extracted from the background (if presences are to be compared to background instead of absences or pseudo-absences selected for modeling). Note that old formats from raster and sp, such as RasterStack and SpatialPointsDataFrame objects, are still supported.
mpa.perc: a numeric between 0 and 1 corresponding to the percentage of correctly classified presences for Minimal Predicted Area (see ecospat.mpa() in ecospat)
misc: a matrix corresponding to a contingency table
A 1 row x 5 columns data.frame containing:
metric.eval: the chosen evaluation metric
cutoff: the associated cut-off used to transform the continuous values into binary
sensitivity: the sensitivity obtained on fitted values with this threshold
specificity: the specificity obtained on fitted values with this threshold
best.stat: the best score obtained for the chosen evaluation metric
POD: Probability of detection (hit rate)
FAR: False alarm ratio
POFD: Probability of false detection (fall-out)
SR: Success ratio
ACCURACY: Accuracy (fraction correct)
BIAS: Bias score (frequency bias)
ROC: Relative operating characteristic
TSS: True skill statistic (Hanssen and Kuipers discriminant, Peirce's skill score)
KAPPA: Cohen's Kappa (Heidke skill score)
OR: Odds Ratio
ORSS: Odds ratio skill score (Yule's Q)
CSI: Critical success index (threat score)
ETS: Equitable threat score (Gilbert skill score)
BOYCE: Boyce index
MPA: Minimal predicted area (cutoff optimising MPA to predict 90% of presences)
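Several of the metrics listed above are simple ratios of the four cells of a 2 x 2 contingency table. The sketch below computes a few of them in base R using the standard dichotomous-forecast definitions. The helper name contingency_scores and the toy vectors are hypothetical and are not part of biomod2; the package's own bm_CalculateStat may organise the computation differently.

```r
# Hypothetical helper (not part of biomod2): a few scores derived from
# the four cells of a 2 x 2 contingency table.
contingency_scores <- function(obs, pred) {
  hits         <- sum(pred == 1 & obs == 1)  # predicted 1, observed 1
  false_alarms <- sum(pred == 1 & obs == 0)  # predicted 1, observed 0
  misses       <- sum(pred == 0 & obs == 1)  # predicted 0, observed 1
  correct_negs <- sum(pred == 0 & obs == 0)  # predicted 0, observed 0

  pod  <- hits / (hits + misses)
  pofd <- false_alarms / (false_alarms + correct_negs)

  c(
    POD      = pod,
    FAR      = false_alarms / (hits + false_alarms),
    POFD     = pofd,
    SR       = hits / (hits + false_alarms),
    ACCURACY = (hits + correct_negs) / length(obs),
    BIAS     = (hits + false_alarms) / (hits + misses),
    TSS      = pod - pofd,
    CSI      = hits / (hits + misses + false_alarms)
  )
}

# Toy data: 4 presences and 6 absences, already converted to binary
obs  <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
pred <- c(1, 1, 1, 0, 1, 0, 0, 0, 0, 0)
scores <- contingency_scores(obs, pred)
print(round(scores, 3))
```

With these toy vectors there are 3 hits, 1 miss, 1 false alarm and 5 correct negatives, so POD is 0.75 and TSS is 0.75 - 1/6.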
The optimal value of each metric can be obtained with the get_optim_value function.
Please refer to the CAWRC website (section "Methods for dichotomous forecasts") for a detailed description of each metric.
Note that if a value is given to threshold, no optimisation will be done, and only the score for this threshold will be returned.
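The difference between searching for a threshold and scoring a supplied one can be illustrated with a small base R sketch. It is not the biomod2 implementation: the helpers tss_at and find_cutoff are hypothetical, the grid is assumed to be evenly spaced over the range of fitted values, and a value is assumed to be classified as 1 when it is greater than or equal to the cutoff.

```r
# Hypothetical helper (not the biomod2 implementation): TSS at one cutoff.
# Assumption: values >= cutoff are classified as presences.
tss_at <- function(cutoff, obs, fit) {
  pred <- as.integer(fit >= cutoff)
  sens <- sum(pred == 1 & obs == 1) / sum(obs == 1)
  spec <- sum(pred == 0 & obs == 0) / sum(obs == 0)
  sens + spec - 1
}

# Hypothetical helper: scan nb.thresh evenly spaced cutoffs, unless a
# threshold is supplied, in which case only that cutoff is scored.
find_cutoff <- function(obs, fit, nb.thresh = 100, threshold = NULL) {
  if (is.null(threshold)) {
    candidates <- seq(min(fit), max(fit), length.out = nb.thresh)
  } else {
    candidates <- threshold
  }
  scores <- vapply(candidates, tss_at, numeric(1), obs = obs, fit = fit)
  best <- which.max(scores)  # first maximum if several cutoffs tie
  data.frame(cutoff = candidates[best], best.stat = scores[best])
}

# Toy data on a 0-1000 scale
obs <- c(0, 0, 0, 0, 1, 1, 1, 1)
fit <- c(100, 200, 300, 600, 400, 700, 800, 900)

res_scan  <- find_cutoff(obs, fit)                   # optimisation over the grid
res_fixed <- find_cutoff(obs, fit, threshold = 650)  # single cutoff, no search
print(res_scan)
print(res_fixed)
```

In the fixed case the returned cutoff is simply the value that was passed in, which mirrors the behaviour described in the note above.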
The Boyce index returns NA values for SRE models because it cannot be calculated with binary predictions. This is also the reason why some NA values might appear for GLM models if they do not converge.
In order to break the dependency loop between the biomod2 and ecospat packages, the code of the ecospat.boyce() and ecospat.mpa() functions (in ecospat) has been copied within this file from version 3.2.2 (August 2022).
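As a rough intuition for the Minimal Predicted Area cutoff, the sketch below picks the cutoff that keeps a given share of presences above it. It is written only from the description of MPA and mpa.perc on this page, not copied from ecospat: the helper mpa_cutoff is hypothetical, and taking the (1 - perc) quantile of the fitted values at presences is an assumption about how such a cutoff can be obtained.

```r
# Hypothetical helper, written from the description only (not ecospat code).
# Assumption: the cutoff is the (1 - perc) quantile of fitted values
# observed at presences, so that about perc of presences lie at or above it.
mpa_cutoff <- function(fit_presences, perc = 0.9) {
  unname(quantile(fit_presences, probs = 1 - perc, na.rm = TRUE))
}

# Toy fitted values at 10 presence points (0-1000 scale)
fit_presences <- c(120, 340, 410, 480, 520, 610, 700, 760, 830, 950)

cutoff <- mpa_cutoff(fit_presences, perc = 0.9)
share_kept <- mean(fit_presences >= cutoff)  # share of presences retained
print(cutoff)
print(share_kept)
```

Lowering perc raises the cutoff, which shrinks the predicted area at the cost of leaving more presences below the threshold.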
Engler, R., Guisan, A., and Rechsteiner, L. 2004. An improved approach for predicting the distribution of rare and endangered species from occurrence and pseudo-absence data. Journal of Applied Ecology, 41(2), 263-274.
Hirzel, A. H., Le Lay, G., Helfer, V., Randin, C., and Guisan, A. 2006. Evaluating the ability of habitat suitability models to predict species presences. Ecological Modelling, 199(2), 142-152.
ecospat.boyce() and ecospat.mpa() in ecospat, BIOMOD_Modeling, bm_RunModelsLoop, BIOMOD_EnsembleModeling
Other Secondary functions: bm_BinaryTransformation(), bm_CrossValidation(), bm_MakeFormula(), bm_ModelingOptions(), bm_PlotEvalBoxplot(), bm_PlotEvalMean(), bm_PlotRangeSize(), bm_PlotResponseCurves(), bm_PlotVarImpBoxplot(), bm_PseudoAbsences(), bm_RunModelsLoop(), bm_SRE(), bm_SampleBinaryVector(), bm_SampleFactorLevels(), bm_Tuning(), bm_VariablesImportance()
library(biomod2)

## Generate a binary vector (seed fixed so that the random draws are reproducible)
set.seed(42)
vec.a <- sample(c(0, 1), 100, replace = TRUE)
## Generate a 0-1000 vector (random drawing)
vec.b <- runif(100, min = 0, max = 1000)
## Generate a 0-1000 vector (biased drawing)
BiasedDrawing <- function(x, m1 = 300, sd1 = 200, m2 = 700, sd2 = 200) {
  return(ifelse(x < 0.5, rnorm(1, m1, sd1), rnorm(1, m2, sd2)))
}
vec.c <- sapply(vec.a, BiasedDrawing)
vec.c[which(vec.c < 0)] <- 0
vec.c[which(vec.c > 1000)] <- 1000
## Find optimal threshold for a specific evaluation metric
bm_FindOptimStat(metric.eval = 'TSS', fit = vec.b, obs = vec.a)
bm_FindOptimStat(metric.eval = 'TSS', fit = vec.c, obs = vec.a, nb.thresh = 100)
## Score a fixed threshold instead (no optimisation is done)
bm_FindOptimStat(metric.eval = 'TSS', fit = vec.c, obs = vec.a, threshold = 280)