alien.models
This module contains wrapper classes for various kinds of ML models. To use
an externally-built model with ALIEN's selector classes, you must first wrap
it in the appropriate subclass of Model.
The documentation for Model explains the shared interface for all
models in the ALIEN universe.
Deep learning models
An easy solution is to wrap your regression model (Pytorch, Keras or DeepChem)
with the Regressor class:
wrapped_model = alien.models.Regressor(model=model, uncertainty='dropout', **kwargs)
This will instrument your model to use dropout to produce uncertainties and embeddings. Alternatively,
you may use uncertainty='laplace', in which case we will use the Laplace approximation
on the last layer of weights to produce uncertainties.
How to choose?
If you have an existing labeled dataset in a similar problem domain, you can try running a retrospective experiment with the different options. However, we do have some hints:
MC dropout, with the CovarianceSelector, does best for regression problems in our extensive benchmarks, so that's a good place to start.
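For example, pairing a dropout-wrapped regressor with a covariance-based selector might look like the sketch below. The selector's module path, constructor arguments and select() method shown here are assumptions; check the selector documentation for your ALIEN version:
import alien

wrapped_model = alien.models.Regressor(model=model, uncertainty='dropout')
selector = alien.selection.CovarianceSelector(   # assumed path and signature
    model=wrapped_model,
    samples=unlabeled_pool,   # your pool of candidate (unlabeled) samples
    batch_size=16,
)
batch = selector.select()     # the next batch to send for labeling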
Gradient boosting models
ALIEN directly supports a number of ensemble models, including
LightGBMRegressor
CatBoostRegressor
plus a number of Scikit-Learn models, listed below.
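As a rough sketch, these wrappers can be constructed and fit like any other ALIEN model (the argument names below follow the Model base-class signature documented further down; check each wrapper's docstring for framework-specific options):
from alien.models import LightGBMRegressor, CatBoostRegressor

gbm = LightGBMRegressor(X=X_train, y=y_train)
gbm.fit()                                  # uses the data supplied at construction
cat = CatBoostRegressor(X=X_train, y=y_train, ensemble_size=40)
cat.fit()
preds = cat.predict(X_pool)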
Other models
ALIEN supports linear models in the form of Bayesian ridge regression (which is convenient for getting covariances), in its Scikit-Learn implementation:
BayesianRidgeRegressor
In fact, we support a number of Scikit-Learn models:
GaussianProcessRegressor
RandomForestRegressor
ExtraTreesRegressor
GradientBoostingRegressor
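For instance, the Bayesian ridge wrapper can be used as in this sketch (covariance() is part of the CovarianceRegressor interface documented below, and the constructor arguments follow the Model base class):
from alien.models import BayesianRidgeRegressor

ridge = BayesianRidgeRegressor(X=X_train, y=y_train)
ridge.fit()
cov = ridge.covariance(X_pool)   # epistemic covariance between the rows of X_pool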
Abstract base classes
The superclass for all model wrapper classes:
- class alien.models.Model(X=None, y=None, data=None, random_seed=None, reinitialize=True, init_seed=None, shape=None, ensemble_size=40, **kwargs)[source]
Abstract base class for wrapping a model. Implementers must provide prediction and fitting (training) methods.
- Parameters:
X – You may provide training data at the time of initialization, either by passing X and y parameters, or by passing a combined data (from which the model will extract data.X and data.y, if available, otherwise data[:-1] and data[-1]). You may instead pass in the training data when you call fit().
y – Training labels, if the features are passed separately as X.
data – A combined dataset supplying both features and labels, as an alternative to X and y.
shape – Specifies the shape of the feature space. This will be set automatically if you provide training data.
random_seed – Random seed for those models that need it.
init_seed – Random seed for initializing model weights. This is stored, and after each call to initialize(), it is incremented by INIT_SEED_INCREMENT.
reinitialize – Whether to reinitialize model weights before each fit(). Defaults to True.
ensemble_size – Sets the ensemble size. This parameter is used by predict_ensemble() to determine how many observations to produce. It is also used by some ensemble models (e.g., RandomForestRegressor and CatBoostRegressor) to set the size of their ensemble of estimators.
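A sketch of the two ways of supplying training data at construction time, using a concrete subclass (Model itself is abstract); my_dataset is a hypothetical object exposing .X and .y:
from alien.models import RandomForestRegressor

model = RandomForestRegressor(X=X_train, y=y_train, ensemble_size=100, random_seed=0)
model = RandomForestRegressor(data=my_dataset, ensemble_size=100)   # extracts data.X and data.y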
- abstract predict(X)[source]
Applies the model to input(s) X (with the last self.ndim axes corresponding to each sample), and returns prediction(s).
- predict_samples(X, n=1)[source]
Makes a prediction for the batch X, randomly selected from this model's posterior distribution. Gives an ensemble of predictions, with shape (len(X), n).
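For example (a sketch):
samples = model.predict_samples(X_pool, n=20)   # shape (len(X_pool), 20)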
- fit(X=None, y=None, reinitialize=None, fit_uncertainty=True, **kwargs)[source]
Fits the model to the given training data. If X and y are not specified, this method looks for self.X and self.y. If fit() finds an X but not a y, it treats X as a combined dataset data, and then uses X, y = data.X, data.y. If we can't find data.X and data.y, we instead use X, y = data[:-1], data[-1]. fit() should also fit any accompanying uncertainty model.
- Parameters:
reinitialize – If True, reinitializes model weights before fitting. If False, starts training from previous weight values. If not specified, uses self.reinitialize.
fit_uncertainty – If True, a call to fit() will also call fit_uncertainty(). Defaults to True.
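A sketch of the lookup order described above (my_dataset is a hypothetical combined dataset):
model.fit(X_train, y_train)     # explicit features and labels
model.fit(my_dataset)           # a combined dataset: X, y = data.X, data.y
model.fit()                     # falls back to self.X and self.y, if present
model.fit(reinitialize=False)   # continue training from the current weights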
- fit_model(X=None, y=None, **kwargs)[source]
Fit just the model component, and not the uncertainties (if these are computed separately)
- fit_uncertainty(X=None, y=None)[source]
Fit just the uncertainties (if these need additional fitting beyond just the model)
- property shape
The shape of the feature space. Can either be specified directly, or inferred from training data, in which case self.shape == X.shape[1:], i.e., the first (batch) dimension is dropped.
This property is used by any methods which use the @flatten_batch decorator.
- property ndim
The number of axes in the feature space. Equal to len(self.shape). Most commonly equal to 1. If training data have been specified, then self.ndim == X.ndim - 1.
This property is used by any methods which use the @flatten_batch decorator.
- initialize(init_seed=None, sample_input=None)[source]
(Re)initializes the model weights. If self.reinitialize is True, this should be called at the start of every fit(), and this should be the default behaviour of fit().
The class you will instantiate to wrap your deep learning models:
- class alien.models.Regressor(model=None, X=None, y=None, **kwargs)[source]
This class can accept as its first argument (or model) any of the deep learning models we currently support: Pytorch, Keras or DeepChem. Regressor's constructor will build a specialized subclass depending on the type of model. The resulting wrapped model will compute uncertainties and covariances in the way prescribed by uncertainty.
- Parameters:
model – A Pytorch, Keras or DeepChem model, to be wrapped.
uncertainty (str) – Can be 'dropout' or 'laplace'. This determines how the model will compute uncertainties and covariances.
**kwargs – You can pass in arguments destined for the specialized subclass. So, for example, if model is a DeepChem model, then **kwargs may carry any of the arguments accepted by alien.models.DeepChemRegressor.
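A sketch of wrapping a Pytorch module (the network here is purely illustrative; MC-dropout uncertainty relies on the model's dropout layers):
import torch
import alien

net = torch.nn.Sequential(
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
    torch.nn.Dropout(0.2),
    torch.nn.Linear(128, 1),
)
wrapped = alien.models.Regressor(net, uncertainty='dropout')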
- abstract predict(X, return_std_dev=False)[source]
Applies the model to input(s) X (with the last self.ndim axes corresponding to each sample), and returns prediction(s).
- Parameters:
return_std_dev – if True, returns a tuple (prediction, std_dev)
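For example (a sketch):
preds = wrapped.predict(X_pool)
preds, std = wrapped.predict(X_pool, return_std_dev=True)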
- fit(X=None, y=None, reinitialize=None, fit_uncertainty=True, **kwargs)
Fits the model to the given training data. If X and y are not specified, this method looks for self.X and self.y. If fit() finds an X but not a y, it treats X as a combined dataset data, and then uses X, y = data.X, data.y. If we can't find data.X and data.y, we instead use X, y = data[:-1], data[-1]. fit() should also fit any accompanying uncertainty model.
- Parameters:
reinitialize – If True, reinitializes model weights before fitting. If False, starts training from previous weight values. If not specified, uses self.reinitialize.
fit_uncertainty – If True, a call to fit() will also call fit_uncertainty(). Defaults to True.
- fit_model(X=None, y=None, **kwargs)
Fit just the model component, and not the uncertainties (if these are computed separately)
- fit_uncertainty(X=None, y=None)
Fit just the uncertainties (if these need additional fitting beyond just the model)
- initialize(init_seed=None, sample_input=None)
(Re)initializes the model weights. If self.reinitialize is True, this should be called at the start of every fit(), and this should be the default behaviour of fit().
- static load(path)
Loads a model. This particular implementation only works if save(path) hasn't been overloaded.
- property ndim
The number of axes in the feature space. Equal to len(self.shape). Most commonly equal to 1. If training data have been specified, then self.ndim == X.ndim - 1.
This property is used by any methods which use the @flatten_batch decorator.
- predict_samples(X, n=1)
Makes a prediction for the batch X, randomly selected from this model's posterior distribution. Gives an ensemble of predictions, with shape (len(X), n).
- save(path)
Saves the model. May well be overloaded by subclasses, if they contain non-picklable components (or pickling would be inefficient).
For any subclass, the save() and load() methods should be compatible with each other.
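A sketch (the file path is illustrative):
wrapped.save('my_model.alien')
restored = alien.models.Regressor.load('my_model.alien')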
- property shape
The shape of the feature space. Can either be specified directly, or inferred from training data, in which case self.shape == X.shape[1:], i.e., the first (batch) dimension is dropped.
This property is used by any methods which use the @flatten_batch decorator.
Then we have several abstract base classes, defining the class hierarchy:
- class alien.models.CovarianceRegressor(model=None, X=None, y=None, **kwargs)[source]
- covariance(X)[source]
Returns the covariance of the epistemic uncertainty between all rows of X. This is where memory bugs often appear, because of the large matrices involved.
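For a scalar-output regressor one would expect a square matrix over the rows of X, as in this sketch:
cov = model.covariance(X_pool)   # expected shape: (len(X_pool), len(X_pool))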
- predict_ensemble(X, multiple=1.0)[source]
Returns a correlated ensemble of predictions for samples X.
Ensembles are correlated only over the last batch dimension, corresponding to axis (-1 - self.ndim) of X. Earlier dimensions have no guarantee of correlation.
- Parameters:
multiple – standard deviation will be multiplied by this factor.
- predict_samples(X, n=1, multiple=1.0, use_covariance_for_ensemble=None)[source]
Makes a prediction for the batch X, randomly selected from this model's posterior distribution. Gives an ensemble of predictions, with shape (len(X), n).
- fit(X=None, y=None, reinitialize=None, fit_uncertainty=True, **kwargs)
Fits the model to the given training data. If X and y are not specified, this method looks for self.X and self.y. If fit() finds an X but not a y, it treats X as a combined dataset data, and then uses X, y = data.X, data.y. If we can't find data.X and data.y, we instead use X, y = data[:-1], data[-1]. fit() should also fit any accompanying uncertainty model.
- Parameters:
reinitialize – If True, reinitializes model weights before fitting. If False, starts training from previous weight values. If not specified, uses self.reinitialize.
fit_uncertainty – If True, a call to fit() will also call fit_uncertainty(). Defaults to True.
- fit_model(X=None, y=None, **kwargs)
Fit just the model component, and not the uncertainties (if these are computed separately)
- fit_uncertainty(X=None, y=None)
Fit just the uncertainties (if these need additional fitting beyond just the model)
- initialize(init_seed=None, sample_input=None)
(Re)initializes the model weights. If self.reinitialize is True, this should be called at the start of every fit(), and this should be the default behaviour of fit().
- static load(path)
Loads a model. This particular implementation only works if save(path) hasn't been overloaded.
- property ndim
The number of axes in the feature space. Equal to len(self.shape). Most commonly equal to 1. If training data have been specified, then self.ndim == X.ndim - 1.
This property is used by any methods which use the @flatten_batch decorator.
- abstract predict(X, return_std_dev=False)
Applies the model to input(s) X (with the last self.ndim axes corresponding to each sample), and returns prediction(s).
- Parameters:
return_std_dev – if True, returns a tuple (prediction, std_dev)
- save(path)
Saves the model. May well be overloaded by subclasses, if they contain non-picklable components (or pickling would be inefficient).
For any subclass, the save() and load() methods should be compatible with each other.
- property shape
The shape of the feature space. Can either be specified directly, or inferred from training data, in which case self.shape == X.shape[1:], i.e., the first (batch) dimension is dropped.
This property is used by any methods which use the @flatten_batch decorator.
- class alien.models.EnsembleRegressor(model=None, X=None, y=None, **kwargs)[source]
Inherit from EnsembleRegressor if you wish to compute ensembles directly. This class provides covariance and prediction for free, given these ensembles of predictions.
Subclasses must implement one of predict_ensemble() or predict_samples() (see the minimal subclass sketch at the end of this section).
- predict(X, return_std_dev=False)[source]
Applies the model to input(s) X (with the last self.ndim axes corresponding to each sample), and returns prediction(s).
- Parameters:
return_std_dev – if True, returns a tuple (prediction, std_dev)
- predict_ensemble(X, **kwargs)[source]
Returns an ensemble of predictions.
- Parameters:
multiple – standard deviation will be multiplied by this factor.
- predict_samples(X, n=1, **kwargs)[source]
Makes a prediction for the batch X, randomly selected from this model's posterior distribution. Gives an ensemble of predictions, with shape (len(X), n).
- covariance_ensemble(X: ArrayLike)[source]
Compute covariance from the ensemble of predictions.
- covariance(X)
Returns the covariance of the epistemic uncertainty between all rows of X. This is where memory bugs often appear, because of the large matrices involved.
- fit(X=None, y=None, reinitialize=None, fit_uncertainty=True, **kwargs)
Fits the model to the given training data. If X and y are not specified, this method looks for self.X and self.y. If fit() finds an X but not a y, it treats X as a combined dataset data, and then uses X, y = data.X, data.y. If we can't find data.X and data.y, we instead use X, y = data[:-1], data[-1]. fit() should also fit any accompanying uncertainty model.
- Parameters:
reinitialize – If True, reinitializes model weights before fitting. If False, starts training from previous weight values. If not specified, uses self.reinitialize.
fit_uncertainty – If True, a call to fit() will also call fit_uncertainty(). Defaults to True.
- fit_model(X=None, y=None, **kwargs)
Fit just the model component, and not the uncertainties (if these are computed separately)
- fit_uncertainty(X=None, y=None)
Fit just the uncertainties (if these need additional fitting beyond just the model)
- initialize(init_seed=None, sample_input=None)
(Re)initializes the model weights. If self.reinitialize is True, this should be called at the start of every fit(), and this should be the default behaviour of fit().
- static load(path)
Loads a model. This particular implementation only works if save(path) hasn't been overloaded.
- property ndim
The number of axes in the feature space. Equal to len(self.shape). Most commonly equal to 1. If training data have been specified, then self.ndim == X.ndim - 1.
This property is used by any methods which use the @flatten_batch decorator.
- save(path)
Saves the model. May well be overloaded by subclasses, if they contain non-picklable components (or pickling would be inefficient).
For any subclass, the save() and load() methods should be compatible with each other.
- property shape
The shape of the feature space. Can either be specified directly, or inferred from training data, in which case self.shape == X.shape[1:], i.e., the first (batch) dimension is dropped.
This property is used by any methods which use the @flatten_batch decorator.
- std_dev(X, **kwargs)
Returns the (epistemic) standard deviation of the model on input X.
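As noted in the EnsembleRegressor description above, subclasses need only implement predict_ensemble() or predict_samples(); covariance, standard deviation and point predictions then come from the base class. The following is a minimal, purely illustrative sketch (bootstrapped least squares). It assumes X and y are numpy arrays, that the base constructor stores random_seed and ensemble_size as attributes, and that fit() routes the training data to fit_model(); check the base-class source before relying on these assumptions:
import numpy as np
from alien.models import EnsembleRegressor

class BootstrapLinearRegressor(EnsembleRegressor):
    def fit_model(self, X=None, y=None, **kwargs):
        # Fit self.ensemble_size least-squares models on bootstrap resamples.
        rng = np.random.default_rng(self.random_seed)
        self.coefs_ = []
        for _ in range(self.ensemble_size):
            idx = rng.integers(0, len(X), size=len(X))
            coef, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
            self.coefs_.append(coef)

    def predict_samples(self, X, n=1, **kwargs):
        # One column per posterior sample: shape (len(X), n).
        cols = [X @ self.coefs_[i % len(self.coefs_)] for i in range(n)]
        return np.stack(cols, axis=-1)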