Tools for SDM demos and education

The prediction pipeline

SDeMo.AbstractSDM Type
julia
AbstractSDM

This abstract type covers both the regular and the ensemble models.

SDeMo.AbstractEnsembleSDM Type
julia
AbstractEnsembleSDM

This abstract type covers models that combine different SDMs to make a prediction; it currently includes Bagging and Ensemble.

SDeMo.SDM Type
julia
SDM

This type specifies a full model, which is composed of a transformer (which applies a transformation to the data), a classifier (which returns a quantitative score), and a threshold (above which the score is interpreted as a predicted presence).

In addition, the SDM carries with it the training features and labels, as well as a vector of indices indicating which variables are actually used by the model.
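The three components can be pictured with a minimal sketch that does not use SDeMo itself. The z-score transformation, the logistic score, and the variables-as-rows layout are all assumptions chosen for illustration, not the package's implementation.

```julia
using Statistics

# Toy data: 2 variables (rows) × 4 instances (columns).
# This layout is an assumption made for the example.
X = [1.0 2.0 3.0 4.0;
     10.0 20.0 30.0 40.0]

# "Transformer": center and scale each variable (row)
μ = mean(X; dims=2)
σ = std(X; dims=2)
Z = (X .- μ) ./ σ

# "Classifier": one quantitative score per instance
# (here, a logistic function of the summed z-scores)
score = vec(1 ./ (1 .+ exp.(-sum(Z; dims=1))))

# Threshold: scores above τ are interpreted as presences
τ = 0.5
prediction = score .> τ
```

With this toy data, the two instances with the largest values score above 0.5 and are predicted as presences.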

SDeMo.Transformer Type
julia
Transformer

This abstract type covers all transformations that are applied to the data before fitting the classifier.

SDeMo.Classifier Type
julia
Classifier

This abstract type covers all algorithms that convert transformed data into a prediction.

Utility functions

SDeMo.features Function
julia
features(sdm::SDM)

Returns the features stored in the field X of the SDM. Note that the features are an array, and this does not return a copy of it – any change made to the output of this function will change the content of the SDM features.
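The consequence of returning the array itself rather than a copy can be shown with a small stand-in. `ToyModel` and `toyfeatures` are hypothetical names used only for this sketch; they are not part of SDeMo.

```julia
# Hypothetical container, used only to illustrate aliasing; not the SDeMo type
struct ToyModel
    X::Matrix{Float64}
end

# Accessor that returns the stored array itself (no copy)
toyfeatures(m::ToyModel) = m.X

m = ToyModel([1.0 2.0; 3.0 4.0])

A = toyfeatures(m)        # same array as m.X
A[1, 1] = 99.0            # also changes the features stored in the model

B = copy(toyfeatures(m))  # independent copy
B[2, 2] = -1.0            # the features stored in the model are untouched
```

Taking an explicit `copy` is the safe choice whenever the features need to be modified without affecting the model.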

julia
features(sdm::SDM, n)

Returns the n-th feature stored in the field X of the SDM.

SDeMo.labels Function
julia
labels(sdm::SDM)

Returns the labels stored in the field y of the SDM – note that this is not a copy of the labels, but the object itself.

SDeMo.threshold Function
julia
threshold(sdm::SDM)

This returns the value above which the score returned by the SDM is considered to be a presence.

SDeMo.threshold! Function
julia
threshold!(sdm::SDM, τ)

Sets the value of the threshold.

SDeMo.variables Function
julia
variables(sdm::SDM)

Returns the list of variables used by the SDM – these may be ordered by importance. This does not return a copy of the variables array, but the array itself.

SDeMo.variables! Function
julia
variables!(sdm::SDM, v)

Sets the list of variables.

SDeMo.instance Function
julia
instance(sdm::SDM, n; strict=true)

Returns the n-th instance stored in the field X of the SDM. If the keyword argument strict is true, only the variables used for prediction are returned.

Training and predicting

SDeMo.train! Function
julia
train!(ensemble::Bagging; kwargs...)

Trains all the models in an ensemble model. The keyword arguments are passed to train! for each model. Note that this retrains the entire model, which includes the transformers.

julia
train!(ensemble::Ensemble; kwargs...)

Trains all the models in a heterogeneous ensemble model. The keyword arguments are passed to train! for each model, and can include the training indices. Note that this retrains the entire model, which includes the transformers.

julia
train!(sdm::SDM; threshold=true, training=:, optimality=mcc)

This is the main function to train an SDM.

The keyword arguments are:

  • training: defaults to :, and is the range (or alternatively the indices) of the data that are used to train the model

  • threshold: defaults to true, and optimizes the threshold by evaluating 200 candidate values between the minimum and maximum output of the model, and keeping the one that is optimal

  • optimality: defaults to mcc, and is the function applied to the confusion matrix to evaluate which value of the threshold is the best

  • absences: defaults to false, and indicates whether the (pseudo) absences are used to train the transformer; when using actual absences, this should be set to true

Internally, this function trains the transformer, then projects the data, then trains the classifier. If threshold is true, the threshold is then optimized.
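The threshold optimization step described above can be sketched independently of the package. This is a minimal illustration under stated assumptions: `toy_mcc` and `toy_best_threshold` are hypothetical helpers, and the optimality function here takes raw confusion-matrix counts, which may differ from how SDeMo passes the confusion matrix.

```julia
# Matthews correlation coefficient from confusion-matrix counts
# (hypothetical helper, written for this sketch)
function toy_mcc(tp, tn, fp, fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return denom == 0 ? 0.0 : (tp * tn - fp * fn) / denom
end

# Evaluate n candidate thresholds between the minimum and maximum score,
# and keep the one with the best value of the optimality measure
function toy_best_threshold(scores, labels; n=200, optimality=toy_mcc)
    candidates = range(minimum(scores), maximum(scores); length=n)
    best_τ, best_val = first(candidates), -Inf
    for τ in candidates
        pred = scores .> τ
        tp = count(pred .& labels)      # predicted presence, true presence
        tn = count(.!pred .& .!labels)  # predicted absence, true absence
        fp = count(pred .& .!labels)    # predicted presence, true absence
        fn = count(.!pred .& labels)    # predicted absence, true presence
        val = optimality(tp, tn, fp, fn)
        if val > best_val
            best_τ, best_val = τ, val
        end
    end
    return best_τ
end

scores = [0.1, 0.2, 0.35, 0.6, 0.8, 0.9]
labels = [false, false, false, true, true, true]
τ = toy_best_threshold(scores, labels)
```

In this toy case the scores separate the two classes perfectly, so the selected threshold falls between the highest-scoring absence (0.35) and the lowest-scoring presence (0.6).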

StatsAPI.predict Function
julia
predict(model::RegressionModel, [newX])

Form the predicted response of model. An object with new covariate values newX can be supplied, which should have the same type and structure as that used to fit model; e.g. for a GLM it would generally be a DataFrame with the same variable names as the original predictors.

SDeMo.reset! Function
julia
reset!(sdm::SDM, thr=0.5)

Resets a model, optionally with a specified value of the threshold. This amounts to using all the variables again and discarding the tuned threshold.
