alien.models
This module contains wrapper classes for various kinds of ML models. To use
an externally-built model with ALIEN's selector classes, you must first wrap
it in the appropriate subclass of Model.
The documentation for Model explains the shared interface for all
models in the ALIEN universe.
Deep learning models
An easy solution is to wrap your regression model (Pytorch, Keras or DeepChem)
with the Regressor class:
wrapped_model = alien.models.Regressor(model=model, uncertainty='dropout', **kwargs)
This will instrument your model to use dropout to produce uncertainties and embeddings. Alternatively,
you may use uncertainty='laplace', in which case we will use the Laplace approximation
on the last layer of weights to produce uncertainties.
How to choose?
If you have an existing labeled dataset in a similar problem domain, you can try running a retrospective experiment with the different options. However, we do have some hints:
MC dropout, with the CovarianceSelector, does best for regression problems in our extensive benchmarks, so that's a good place to start.
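For example, pairing a dropout-wrapped regressor with a covariance-based selector might look like the sketch below. The selector's module path, constructor arguments and select() method shown here are assumptions; check the selector documentation for your ALIEN version:
import alien

wrapped_model = alien.models.Regressor(model=model, uncertainty='dropout')
selector = alien.selection.CovarianceSelector(   # assumed path and signature
    model=wrapped_model,
    samples=unlabeled_pool,   # your pool of candidate (unlabeled) samples
    batch_size=16,
)
batch = selector.select()     # the next batch to send for labeling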
Gradient boosting models
ALIEN directly supports a number of ensemble models, including
LightGBMRegressor
CatBoostRegressor
plus a number of Scikit-Learn models, listed below.
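As a rough sketch, these wrappers can be constructed and fit like any other ALIEN model (the argument names below follow the Model base-class signature documented further down; check each wrapper's docstring for framework-specific options):
from alien.models import LightGBMRegressor, CatBoostRegressor

gbm = LightGBMRegressor(X=X_train, y=y_train)
gbm.fit()                                  # uses the data supplied at construction
cat = CatBoostRegressor(X=X_train, y=y_train, ensemble_size=40)
cat.fit()
preds = cat.predict(X_pool)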
Other models
ALIEN supports linear models in the form of Bayesian ridge regression (which is convenient for getting covariances), in its Scikit-Learn implementation:
BayesianRidgeRegressor
In fact, we support a number of Scikit-Learn models:
GaussianProcessRegressor
RandomForestRegressor
ExtraTreesRegressor
GradientBoostingRegressor
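For instance, the Bayesian ridge wrapper can be used as in this sketch (covariance() is part of the CovarianceRegressor interface documented below, and the constructor arguments follow the Model base class):
from alien.models import BayesianRidgeRegressor

ridge = BayesianRidgeRegressor(X=X_train, y=y_train)
ridge.fit()
cov = ridge.covariance(X_pool)   # epistemic covariance between the rows of X_pool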
Abstract base classes
The superclass for all model wrapper classes:
- class alien.models.Model(X=None, y=None, data=None, random_seed=None, reinitialize=True, init_seed=None, shape=None, ensemble_size=40, **kwargs)[source]
Abstract base class for wrapping a model. Implementers must provide prediction and fitting (training) methods.
- Parameters:
X – You may provide training data at the time of initialization, either by passing X and y parameters, or by passing a combined data (from which the model will extract data.X and data.y, if available, otherwise data[:-1] and data[-1]). You may instead pass in the training data when you call fit().
y – Training labels, if the features are passed separately as X.
data – A combined dataset supplying both features and labels, as an alternative to X and y.
shape – Specifies the shape of the feature space. This will be set automatically if you provide training data.
random_seed – Random seed for those models that need it.
init_seed – Random seed for initializing model weights. This is stored, and after each call to initialize(), it is incremented by INIT_SEED_INCREMENT.
reinitialize – Whether to reinitialize model weights before each fit(). Defaults to True.
ensemble_size – Sets the ensemble size. This parameter is used by predict_ensemble() to determine how many observations to produce. It is also used by some ensemble models (e.g., RandomForestRegressor and CatBoostRegressor) to set the size of their ensemble of estimators.
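A sketch of the two ways of supplying training data at construction time, using a concrete subclass (Model itself is abstract); my_dataset is a hypothetical object exposing .X and .y:
from alien.models import RandomForestRegressor

model = RandomForestRegressor(X=X_train, y=y_train, ensemble_size=100, random_seed=0)
model = RandomForestRegressor(data=my_dataset, ensemble_size=100)   # extracts data.X and data.y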
- abstract predict(X)[source]
Applies the model to input(s) X (with the last self.ndim axes corresponding to each sample), and returns prediction(s).
- predict_samples(X, n=1)[source]
Makes a prediction for the batch X, randomly selected from this model's posterior distribution. Gives an ensemble of predictions, with shape (len(X), n).
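For example (a sketch):
samples = model.predict_samples(X_pool, n=20)   # shape (len(X_pool), 20)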
- fit(X=None, y=None, reinitialize=None, fit_uncertainty=True, **kwargs)[source]
Fits the model to the given training data. If X and y are not specified, this method looks for self.X and self.y. If fit() finds an X but not a y, it treats X as a combined dataset data, and then uses X, y = data.X, data.y. If we can't find data.X and data.y, we instead use X, y = data[:-1], data[-1]. fit() should also fit any accompanying uncertainty model.
- Parameters:
reinitialize – If True, reinitializes model weights before fitting. If False, starts training from previous weight values. If not specified, uses self.reinitialize.
fit_uncertainty – If True, a call to fit() will also call fit_uncertainty(). Defaults to True.
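A sketch of the lookup order described above (my_dataset is a hypothetical combined dataset):
model.fit(X_train, y_train)     # explicit features and labels
model.fit(my_dataset)           # a combined dataset: X, y = data.X, data.y
model.fit()                     # falls back to self.X and self.y, if present
model.fit(reinitialize=False)   # continue training from the current weights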
- fit_model(X=None, y=None, **kwargs)[source]
Fit just the model component, and not the uncertainties (if these are computed separately)
- fit_uncertainty(X=None, y=None)[source]
Fit just the uncertainties (if these need additional fitting beyond just the model)
- property shape
The shape of the feature space. Can either be specified directly, or inferred from training data, in which case self.shape == X.shape[1:], i.e., the first (batch) dimension is dropped.
This property is used by any methods which use the @flatten_batch decorator.
- property ndim
The number of axes in the feature space. Equal to len(self.shape). Most commonly equal to 1. If training data have been specified, then self.ndim == X.ndim - 1.
This property is used by any methods which use the @flatten_batch decorator.
- initialize(init_seed=None, sample_input=None)[source]
(Re)initializes the model weights. If self.reinitialize is True, this should be called at the start of every fit(), and this should be the default behaviour of fit().
The class you will instantiate to wrap your deep learning models:
- class alien.models.Regressor(model=None, X=None, y=None, **kwargs)[source]
This class can accept as its first argument (or model) any of the deep learning models we currently support: Pytorch, Keras or DeepChem. Regressor's constructor will build a specialized subclass depending on the type of model. The resulting wrapped model will compute uncertainties and covariances in the way prescribed by uncertainty.
- Parameters:
model – A Pytorch, Keras or DeepChem model, to be wrapped.
uncertainty (str) – Can be 'dropout' or 'laplace'. This determines how the model will compute uncertainties and covariances.
**kwargs – You can pass in arguments destined for the specialized subclass. So, for example, if model is a DeepChem model, then **kwargs may carry any of the arguments accepted by alien.models.DeepChemRegressor.
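A sketch of wrapping a Pytorch module (the network here is purely illustrative; MC-dropout uncertainty relies on the model's dropout layers):
import torch
import alien

net = torch.nn.Sequential(
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
    torch.nn.Dropout(0.2),
    torch.nn.Linear(128, 1),
)
wrapped = alien.models.Regressor(net, uncertainty='dropout')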
- abstract predict(X, return_std_dev=False)[source]
Applies the model to input(s) X (with the last self.ndim axes corresponding to each sample), and returns prediction(s).
- Parameters:
return_std_dev – if True, returns a tuple (prediction, std_dev)
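For example (a sketch):
preds = wrapped.predict(X_pool)
preds, std = wrapped.predict(X_pool, return_std_dev=True)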
- fit(X=None, y=None, reinitialize=None, fit_uncertainty=True, **kwargs)
Fits the model to the given training data. If X and y are not specified, this method looks for self.X and self.y. If fit() finds an X but not a y, it treats X as a combined dataset data, and then uses X, y = data.X, data.y. If we can't find data.X and data.y, we instead use X, y = data[:-1], data[-1]. fit() should also fit any accompanying uncertainty model.
- Parameters:
reinitialize – If True, reinitializes model weights before fitting. If False, starts training from previous weight values. If not specified, uses self.reinitialize.
fit_uncertainty – If True, a call to fit() will also call fit_uncertainty(). Defaults to True.
- fit_model(X=None, y=None, **kwargs)
Fit just the model component, and not the uncertainties (if these are computed separately)
- fit_uncertainty(X=None, y=None)
Fit just the uncertainties (if these need additional fitting beyond just the model)
- initialize(init_seed=None, sample_input=None)
(Re)initializes the model weights. If self.reinitialize is True, this should be called at the start of every fit(), and this should be the default behaviour of fit().
- static load(path)
Loads a model. This particular implementation only works if save(path) hasn't been overloaded.
- property ndim
The number of axes in the feature space. Equal to len(self.shape). Most commonly equal to 1. If training data have been specified, then self.ndim == X.ndim - 1.
This property is used by any methods which use the @flatten_batch decorator.
- predict_samples(X, n=1)
Makes a prediction for the batch X, randomly selected from this model's posterior distribution. Gives an ensemble of predictions, with shape (len(X), n).
- save(path)
Saves the model. May well be overloaded by subclasses, if they contain non-picklable components (or pickling would be inefficient).
For any subclass, the save() and load() methods should be compatible with each other.
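A sketch (the file path is illustrative):
wrapped.save('my_model.alien')
restored = alien.models.Regressor.load('my_model.alien')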
- property shape
The shape of the feature space. Can either be specified directly, or inferred from training data, in which case self.shape == X.shape[1:], i.e., the first (batch) dimension is dropped.
This property is used by any methods which use the @flatten_batch decorator.
Then we have several abstract base classes, defining the class hierarchy:
- class alien.models.CovarianceRegressor(model=None, X=None, y=None, **kwargs)[source]
- covariance(X)[source]
Returns the covariance of the epistemic uncertainty between all rows of X. This is where memory bugs often appear, because of the large matrices involved.
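For a scalar-output regressor one would expect a square matrix over the rows of X, as in this sketch:
cov = model.covariance(X_pool)   # expected shape: (len(X_pool), len(X_pool))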
- predict_ensemble(X, multiple=1.0)[source]
Returns a correlated ensemble of predictions for samples X.
Ensembles are correlated only over the last batch dimension, corresponding to axis (-1 - self.ndim) of X. Earlier dimensions have no guarantee of correlation.
- Parameters:
multiple – standard deviation will be multiplied by this factor.
- predict_samples(X, n=1, multiple=1.0, use_covariance_for_ensemble=None)[source]
Makes a prediction for the batch X, randomly selected from this model's posterior distribution. Gives an ensemble of predictions, with shape (len(X), n).
- fit(X=None, y=None, reinitialize=None, fit_uncertainty=True, **kwargs)
Fits the model to the given training data. If X and y are not specified, this method looks for self.X and self.y. If fit() finds an X but not a y, it treats X as a combined dataset data, and then uses X, y = data.X, data.y. If we can't find data.X and data.y, we instead use X, y = data[:-1], data[-1]. fit() should also fit any accompanying uncertainty model.
- Parameters:
reinitialize – If True, reinitializes model weights before fitting. If False, starts training from previous weight values. If not specified, uses self.reinitialize.
fit_uncertainty – If True, a call to fit() will also call fit_uncertainty(). Defaults to True.
- fit_model(X=None, y=None, **kwargs)
Fit just the model component, and not the uncertainties (if these are computed separately)
- fit_uncertainty(X=None, y=None)
Fit just the uncertainties (if these need additional fitting beyond just the model)
- initialize(init_seed=None, sample_input=None)
(Re)initializes the model weights. If self.reinitialize is True, this should be called at the start of every fit(), and this should be the default behaviour of fit().
- static load(path)
Loads a model. This particular implementation only works if save(path) hasn't been overloaded.
- property ndim
The number of axes in the feature space. Equal to len(self.shape). Most commonly equal to 1. If training data have been specified, then self.ndim == X.ndim - 1.
This property is used by any methods which use the @flatten_batch decorator.
- abstract predict(X, return_std_dev=False)
Applies the model to input(s) X (with the last self.ndim axes corresponding to each sample), and returns prediction(s).
- Parameters:
return_std_dev – if True, returns a tuple (prediction, std_dev)
- save(path)
Saves the model. May well be overloaded by subclasses, if they contain non-picklable components (or pickling would be inefficient).
For any subclass, the save() and load() methods should be compatible with each other.
- property shape
The shape of the feature space. Can either be specified directly, or inferred from training data, in which case self.shape == X.shape[1:], i.e., the first (batch) dimension is dropped.
This property is used by any methods which use the @flatten_batch decorator.
- class alien.models.EnsembleRegressor(model=None, X=None, y=None, **kwargs)[source]
Inherit from EnsembleRegressor if you wish to compute ensembles directly. This class provides covariance and prediction for free, given these ensembles of predictions.
Subclasses must implement one of predict_ensemble() or predict_samples() (see the minimal subclass sketch at the end of this section).
- predict(X, return_std_dev=False)[source]
Applies the model to input(s) X (with the last self.ndim axes corresponding to each sample), and returns prediction(s).
- Parameters:
return_std_dev – if True, returns a tuple (prediction, std_dev)
- predict_ensemble(X, **kwargs)[source]
Returns an ensemble of predictions.
- Parameters:
multiple – standard deviation will be multiplied by this factor.
- predict_samples(X, n=1, **kwargs)[source]
Makes a prediction for the batch X, randomly selected from this model's posterior distribution. Gives an ensemble of predictions, with shape (len(X), n).
- covariance_ensemble(X: ArrayLike)[source]
Compute covariance from the ensemble of predictions.
- covariance(X)
Returns the covariance of the epistemic uncertainty between all rows of X. This is where memory bugs often appear, because of the large matrices involved.
- fit(X=None, y=None, reinitialize=None, fit_uncertainty=True, **kwargs)
Fits the model to the given training data. If X and y are not specified, this method looks for self.X and self.y. If fit() finds an X but not a y, it treats X as a combined dataset data, and then uses X, y = data.X, data.y. If we can't find data.X and data.y, we instead use X, y = data[:-1], data[-1]. fit() should also fit any accompanying uncertainty model.
- Parameters:
reinitialize – If True, reinitializes model weights before fitting. If False, starts training from previous weight values. If not specified, uses self.reinitialize.
fit_uncertainty – If True, a call to fit() will also call fit_uncertainty(). Defaults to True.
- fit_model(X=None, y=None, **kwargs)
Fit just the model component, and not the uncertainties (if these are computed separately)
- fit_uncertainty(X=None, y=None)
Fit just the uncertainties (if these need additional fitting beyond just the model)
- initialize(init_seed=None, sample_input=None)
(Re)initializes the model weights. If self.reinitialize is True, this should be called at the start of every fit(), and this should be the default behaviour of fit().
- static load(path)
Loads a model. This particular implementation only works if save(path) hasn't been overloaded.
- property ndim
The number of axes in the feature space. Equal to len(self.shape). Most commonly equal to 1. If training data have been specified, then self.ndim == X.ndim - 1.
This property is used by any methods which use the @flatten_batch decorator.
- save(path)
Saves the model. May well be overloaded by subclasses, if they contain non-picklable components (or pickling would be inefficient).
For any subclass, the save() and load() methods should be compatible with each other.
- property shape
The shape of the feature space. Can either be specified directly, or inferred from training data, in which case self.shape == X.shape[1:], i.e., the first (batch) dimension is dropped.
This property is used by any methods which use the @flatten_batch decorator.
- std_dev(X, **kwargs)
Returns the (epistemic) standard deviation of the model on input X.
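As noted in the EnsembleRegressor description above, subclasses need only implement predict_ensemble() or predict_samples(); covariance, standard deviation and point predictions then come from the base class. The following is a minimal, purely illustrative sketch (bootstrapped least squares). It assumes X and y are numpy arrays, that the base constructor stores random_seed and ensemble_size as attributes, and that fit() routes the training data to fit_model(); check the base-class source before relying on these assumptions:
import numpy as np
from alien.models import EnsembleRegressor

class BootstrapLinearRegressor(EnsembleRegressor):
    def fit_model(self, X=None, y=None, **kwargs):
        # Fit self.ensemble_size least-squares models on bootstrap resamples.
        rng = np.random.default_rng(self.random_seed)
        self.coefs_ = []
        for _ in range(self.ensemble_size):
            idx = rng.integers(0, len(X), size=len(X))
            coef, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
            self.coefs_.append(coef)

    def predict_samples(self, X, n=1, **kwargs):
        # One column per posterior sample: shape (len(X), n).
        cols = [X @ self.coefs_[i % len(self.coefs_)] for i in range(n)]
        return np.stack(cols, axis=-1)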