This internal biomod2 function allows the user to find the threshold to convert continuous values into binary ones leading to the best score for a given evaluation metric.

bm_FindOptimStat(
  metric.eval = "TSS",
  obs,
  fit,
  nb.thresh = 100,
  threshold = NULL,
  boyce.bg.env = NULL,
  mpa.perc = 0.9
)

get_optim_value(metric.eval)

bm_CalculateStat(misc, metric.eval = "TSS")

Arguments

metric.eval

a character corresponding to the evaluation metric to be used, must be either POD, FAR, POFD, SR, ACCURACY, BIAS, ROC, TSS, KAPPA, OR, ORSS, CSI, ETS, BOYCE, MPA

obs

a vector of observed values (binary, 0 or 1)

fit

a vector of fitted values (continuous)

nb.thresh

an integer corresponding to the number of thresholds to be tested over the range of fitted values

threshold

(optional, default NULL)
A numeric corresponding to the threshold used to convert the given data

boyce.bg.env

(optional, default NULL)
A matrix, data.frame, SpatVector or SpatRaster object containing values of environmental variables (in columns or layers) extracted from the background (if presences are to be compared to background instead of absences or pseudo-absences selected for modeling)
Note that old format from raster and sp are still supported such as RasterStack and SpatialPointsDataFrame objects.

mpa.perc

a numeric between 0 and 1 corresponding to the percentage of correctly classified presences for Minimal Predicted Area (see ecospat.mpa() in ecospat)

misc

a matrix corresponding to a contingency table

Value

A 1 row x 5 columns data.frame containing :

  • metric.eval : the chosen evaluation metric

  • cutoff : the associated cut-off used to transform the continuous values into binary

  • sensitivity : the sensibility obtained on fitted values with this threshold

  • specificity : the specificity obtained on fitted values with this threshold

  • best.stat : the best score obtained for the chosen evaluation metric

Details

simple

  • POD : Probability of detection (hit rate)

  • FAR : False alarm ratio

  • POFD : Probability of false detection (fall-out)

  • SR : Success ratio

  • ACCURACY : Accuracy (fraction correct)

  • BIAS : Bias score (frequency bias)

complex

  • ROC : Relative operating characteristic

  • TSS : True skill statistic (Hanssen and Kuipers discriminant, Peirce's skill score)

  • KAPPA : Cohen's Kappa (Heidke skill score)

  • OR : Odds Ratio

  • ORSS : Odds ratio skill score (Yule's Q)

  • CSI : Critical success index (threat score)

  • ETS : Equitable threat score (Gilbert skill score)

presence-only

  • BOYCE : Boyce index

  • MPA : Minimal predicted area (cutoff optimising MPA to predict 90% of presences)

Optimal value of each method can be obtained with the get_optim_value function.
Please refer to the CAWRC website (section "Methods for dichotomous forecasts") to get detailed description of each metric.

Note that if a value is given to threshold, no optimisation will be done., and only the score for this threshold will be returned.

The Boyce index returns NA values for SRE models because it can not be calculated with binary predictions.
This is also the reason why some NA values might appear for GLM models if they do not converge.

Note

In order to break dependency loop between packages biomod2 and ecospat, code of ecospat.boyce() and ecospat.mpa() in ecospat) functions have been copied within this file from version 3.2.2 (august 2022).

References

  • Engler, R., Guisan, A., and Rechsteiner L. 2004. An improved approach for predicting the distribution of rare and endangered species from occurrence and pseudo-absence data. Journal of Applied Ecology, 41(2), 263-274.

  • Hirzel, A. H., Le Lay, G., Helfer, V., Randin, C., and Guisan, A. 2006. Evaluating the ability of habitat suitability models to predict species presences. Ecological Modelling, 199(2), 142-152.

Author

Damien Georges

Examples

## Generate a binary vector
vec.a <- sample(c(0, 1), 100, replace = TRUE)

## Generate a 0-1000 vector (random drawing)
vec.b <- runif(100, min = 0, max = 1000)

## Generate a 0-1000 vector (biased drawing)
BiasedDrawing <- function(x, m1 = 300, sd1 = 200, m2 = 700, sd2 = 200) {
  return(ifelse(x < 0.5, rnorm(1, m1, sd1), rnorm(1, m2, sd2)))
}
vec.c <- sapply(vec.a, BiasedDrawing)
vec.c[which(vec.c < 0)] <- 0
vec.c[which(vec.c > 1000)] <- 1000

## Find optimal threshold for a specific evaluation metric
bm_FindOptimStat(metric.eval = 'TSS', fit = vec.b, obs = vec.a)
bm_FindOptimStat(metric.eval = 'TSS', fit = vec.c, obs = vec.a, nb.thresh = 100)
bm_FindOptimStat(metric.eval = 'TSS', fit = vec.c, obs = vec.a, threshold = 280)