Multivariate transformations
The purpose of this vignette is to demonstrate how we can use the MultivariateStats package on layers.
This functionality is supported through an extension, which is only active when the MultivariateStats package is loaded.
using SpeciesDistributionToolkit
using CairoMakie
using Statistics
import StatsBase
using MultivariateStats

The support is currently limited to PCA.
spatial_extent = (left = 8.412, bottom = 41.325, right = 9.662, top = 43.060)
dataprovider = RasterData(CHELSA1, BioClim)
L = [SDMLayer(dataprovider; layer = i, spatial_extent...) for i in [1, 3, 8, 12]]

4-element Vector{SDMLayer{Int16}}:
🗺️ A 209 × 151 layer (14432 Int16 cells)
🗺️ A 209 × 151 layer (14432 Int16 cells)
🗺️ A 209 × 151 layer (14432 Int16 cells)
🗺️ A 209 × 151 layer (14432 Int16 cells)

We can fit a PCA the "normal" way:

M = fit(PCA, L; maxoutdim=2)

PCA(indim = 4, outdim = 2, principalratio = 0.9861689593337618)
Pattern matrix (unstandardized loadings):
StatsBase.CoefTable(Any[[-25.351684398488732, 10.803036653834626, -28.257150860010537, 153.09190256751242], [17.68337136749613, -0.8569279333946745, 23.581494704564843, 7.341384543574963]], ["PC1", "PC2"], ["1", "2", "3", "4"], 0, 0)
Importance of components:
StatsBase.CoefTable(Any[[24995.01070925078, 0.9510338472239472, 0.9510338472239472, 0.9643721171942471, 0.9643721171942471], [923.4187679218759, 0.03513511210981445, 0.9861689593337617, 0.03562788280575279, 0.9999999999999999]], ["PC1", "PC2"], ["SS Loadings (Eigenvalues)", "Variance explained", "Cumulative variance", "Proportion explained", "Cumulative proportion"], 0, 0)

Note that this will return a PCA result object. We can use it to project a vector of layers:

X = predict(M, L)

2-element Vector{SDMLayer{Float64}}:
🗺️ A 209 × 151 layer (14432 Float64 cells)
🗺️ A 209 × 151 layer (14432 Float64 cells)

The last argument must be a vector of layers, which will be used as a template to store the output values in.
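To give a sense of what the projection computes numerically, here is a minimal sketch on a plain matrix rather than on layers. The matrix V is a hypothetical stand-in for the values of four layers (one row per variable, one column per cell); how the extension actually arranges layer values internally is not shown here and should be treated as an assumption.

```julia
using MultivariateStats
using Statistics
using Random

Random.seed!(42)

# Hypothetical stand-in for layer values: 4 variables (rows) × 500 cells (columns)
V = randn(4, 500)

# Same call as for layers, but on a features × observations matrix
M_sketch = fit(PCA, V; maxoutdim = 2)

# Scores as returned by the package
scores = predict(M_sketch, V)

# The same scores computed by hand: centre each variable, then project
manual = projection(M_sketch)' * (V .- mean(M_sketch))

# Both approaches should agree up to floating point error
agreement = scores ≈ manual
```

Each row of scores is one principal component, which is why projecting four layers with maxoutdim = 2 gives back two layers.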
Data leakage
Transforming the layers before extracting the values of environmental conditions for SDMs creates data leakage and should never be done. SDeMo handles projections the correct way.
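The leakage issue can be illustrated with a minimal sketch on plain matrices. This is not how SDeMo implements it; the data, the split, and the variable names below are hypothetical. The point is only that the PCA is fitted on the training observations, and the fitted transformation is then applied to the held-out observations.

```julia
using MultivariateStats
using Random

Random.seed!(1234)

# Hypothetical data: 4 predictors (rows) × 200 observations (columns),
# the features × observations layout that MultivariateStats expects
P = randn(4, 200)

# Hold out the last 50 observations for validation
train_idx = 1:150
test_idx = 151:200

# Fit the PCA on the training observations only
M_train = fit(PCA, P[:, train_idx]; maxoutdim = 2)

# Apply the already fitted transformation to both sets
Z_train = predict(M_train, P[:, train_idx])
Z_test = predict(M_train, P[:, test_idx])
```

Fitting on all 200 observations instead would let the validation data influence the component means and axes, which is the leakage described above. In this vignette, the projected layers X are only used for visualization.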
We can visualize the result of this projection (first principal component):

Code for the figure
fig, ax, hm = heatmap(
X[1],
colormap = :Spectral,
figure = (; size = (800, 400)),
axis = (; aspect = DataAspect()),
)