alien.models.pytorch package

Submodules

alien.models.pytorch.last_layer module

Model wrapper for last layer linearization.

class alien.models.pytorch.last_layer.LastLayerPytorchLinearization(model=None, X=None, y=None, **kwargs)[source]

Bases: LastLayerLinearizableRegressor

Last layer linearization for Pytorch-based models.

predict_with_embedding(X)[source]

Forward pass which returns the output of the penultimate layer along with the output of the last layer. If the last layer is not known yet, it will be determined when this function is called for the first time.

Parameters:

X – one batch of data to use as input for the forward pass
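A hedged usage sketch, assuming the method returns the pair (prediction, penultimate-layer embedding); the toy network, shapes, and names below are illustrative, not part of the API:

    import torch
    from alien.models.pytorch.last_layer import LastLayerPytorchLinearization

    net = torch.nn.Sequential(
        torch.nn.Linear(16, 32),
        torch.nn.ReLU(),
        torch.nn.Linear(32, 1),
    )
    model = LastLayerPytorchLinearization(model=net)

    X_batch = torch.randn(8, 16)
    preds, embedding = model.predict_with_embedding(X_batch)  # return order assumed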

last_layer_embedding(X)[source]

Returns the activations of the last layer before the output.

linearization()[source]

Finds the last-layer linearization of the model in its current state.

Returns:

weights, bias
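Continuing the sketch above, and assuming weights has the (out_features, in_features) layout of torch.nn.Linear:

    weights, bias = model.linearization()
    phi = model.last_layer_embedding(X_batch)  # penultimate-layer activations
    y_lin = phi @ weights.T + bias             # should agree with model.predict_linear(X_batch)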

set_last_layer(last_layer_name: str) None[source]

Set the last layer of the model by its name. This sets the forward hook to get the output of the penultimate layer.

Parameters:

last_layer_name – the name of the last layer (fixed in model.named_modules()).
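For example, candidate names can be read off from model.named_modules(); the name '2' below comes from the toy Sequential in the sketch above:

    for name, module in net.named_modules():
        print(name, type(module).__name__)  # '', '0', '1', '2' for the toy Sequential
    model.set_last_layer('2')               # the final torch.nn.Linear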

find_last_layer(X: ArrayLike)[source]

Automatically determines the last layer of the model with one forward pass. It assumes that the last layer is the same for every forward pass and that it is an instance of torch.nn.Linear. Might not work with every architecture, but is tested with all PyTorch torchvision classification models (besides SqueezeNet, which has no linear last layer).

Parameters:

X – batch of samples used to find last layer.

Returns:

Returns the output of the forward pass, so as not to waste computation.

covariance(X)

Returns the covariance of the epistemic uncertainty between all rows of X. This is where memory bugs often appear, because of the large matrices involved.
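Illustrative only: for a batch of m inputs, the result holds pairwise covariances over the m rows, so memory grows roughly quadratically with batch size (the exact shape is not pinned down by the docstring):

    cov = model.covariance(X_batch)  # pairwise epistemic covariances over the 8-row batch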

covariance_linear(X, block_size=1000)
property data
property embedding_method
find_method()
fit(X=None, y=None, reinitialize=None, fit_uncertainty=True, **kwargs)

Fits the model to the given training data. If X and y are not specified, this method looks for self.X and self.y. If fit() finds an X but not a y, it treats X as a combined dataset data, and then uses X, y = data.X, data.y. If we can’t find data.X and data.y, we instead use X, y = data[:-1], data[-1].

fit() should also fit any accompanying uncertainty model.

Parameters:
  • reinitialize – If True, reinitializes model weights before fitting. If False, starts training from previous weight values. If not specified, uses self.reinitialize.

  • fit_uncertainty – If True, a call to fit() will also call fit_uncertainty(). Defaults to True.
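For concreteness, a sketch of the three calling conventions described above (X_train, y_train, and data are placeholders):

    model.fit(X_train, y_train)   # explicit inputs and targets
    model.fit(data)               # combined dataset: uses data.X and data.y if present,
                                  # otherwise X, y = data[:-1], data[-1]
    model.fit()                   # falls back to self.X and self.y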

fit_model(X=None, y=None, **kwargs)

Fit just the model component, and not the uncertainties (if these are computed separately)

fit_uncertainty(X=None, y=None)

Fit just the uncertainties (if these need additional fitting beyond just the model)

initialize(init_seed=None, sample_input=None)

(Re)initializes the model weights. If self.reinitialize is True, this should be called at the start of every fit(), and this should be the default behaviour of fit().

input_embedding(X)
static load(path)

Loads a model. This particular implementation only works if save(path) hasn’t been overloaded.

method_names = {0: ['_embedding', 'embed', 'embeddings'], 1: ['last_layer_embedding', 'last_layer_embed', 'embed_last_layer', 'last_layer'], 2: ['input_embedding']}
property ndim

The number of axes in the feature space. Equal to len(self.shape). Most commonly equal to 1. If training data have been specified, then self.ndim == X.ndim - 1.

This property is used by any methods which use the @flatten_batch decorator.
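For instance, continuing the sketch above (the relationships below follow directly from the definitions of shape and ndim):

    X_train = torch.randn(1000, 16)  # 1000 samples, 16 features
    # Once this is the training data: model.shape == (16,) and model.ndim == 1.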

abstract predict(X, return_std_dev=False)

Applies the model to input(s) X (with the last self.ndim axes corresponding to each sample), and returns prediction(s).

Parameters:

return_std_dev – if True, returns a tuple (prediction, std_dev)
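Typical usage on a concrete subclass (predict is abstract here):

    y_pred = model.predict(X_batch)
    y_pred, std = model.predict(X_batch, return_std_dev=True)  # documented tuple form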

predict_ensemble(X, multiple=1.0)

Returns a correlated ensemble of predictions for samples X.

Ensembles are correlated only over the last batch dimension, corresponding to axis (-1 - self.ndim) of X. Earlier dimensions have no guarantee of correlation.

Parameters:

multiple – standard deviation will be multiplied by this

predict_linear(X)

Use the model’s linearization for a prediction.

predict_samples(X, n=1, multiple=1.0, use_covariance_for_ensemble=None)

Makes a prediction for the batch X, randomly selected from this model’s posterior distribution. Gives an ensemble of predictions, with shape (len(X), n).
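The documented output shape makes the sketch simple:

    samples = model.predict_samples(X_batch, n=20)
    # samples has shape (len(X_batch), 20): one column per posterior draw.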

save(path)

Saves the model. May well be overloaded by subclasses, if they contain non-picklable components (or pickling would be inefficient).

For any subclass, the save() and load() methods should be compatible with each other.
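A round-trip sketch ('model.pkl' is a placeholder path):

    model.save("model.pkl")
    restored = LastLayerPytorchLinearization.load("model.pkl")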

property shape

The shape of the feature space. Can either be specified directly, or inferred from training data, in which case self.shape == X.shape[1:], i.e., the first (batch) dimension is dropped.

This property is used by any methods which use the @flatten_batch decorator.

std_dev(X, **kwargs)

Returns the (epistemic) standard deviation of the model on input X.

alien.models.pytorch.lightning module

alien.models.pytorch.pytorch module

alien.models.pytorch.pytorch.init_weights(module)[source]
alien.models.pytorch.pytorch.init_bias(module, torch)[source]
class alien.models.pytorch.pytorch.PytorchRegressor(model=None, X=None, y=None, **kwargs)[source]

Bases: LastLayerPytorchLinearization, MCDropoutRegressor, LinearizableLaplaceRegressor

Parameters:

trainer

Specifies how the model will be trained. May be:

  • 'model' — calls self.model.fit

  • 'lightning' — trains with a pytorch-lightning trainer

  • a trainer object — calls trainer.fit

  • None — chooses from the above, in order, if available
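A minimal construction sketch, assuming the documented keyword arguments; the network and data below are placeholders:

    import torch
    from alien.models.pytorch.pytorch import PytorchRegressor

    net = torch.nn.Sequential(
        torch.nn.Linear(16, 64),
        torch.nn.ReLU(),
        torch.nn.Dropout(0.2),
        torch.nn.Linear(64, 1),
    )
    X_train, y_train = torch.randn(1000, 16), torch.randn(1000)

    reg = PytorchRegressor(model=net, X=X_train, y=y_train)
    reg.fit()  # trainer=None: a training route is chosen automatically, as above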

fix_dropouts()[source]

Retools dropouts for MC dropout prediction
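For orientation, the general MC-dropout idea in plain PyTorch terms (this illustrates the technique, not ALIEN’s internals):

    # Keep dropout stochastic at inference, then aggregate several forward passes.
    net.eval()
    for m in net.modules():
        if isinstance(m, torch.nn.Dropout):
            m.train()  # re-enable only the dropout layers
    X_batch = torch.randn(8, 16)
    with torch.no_grad():
        draws = torch.stack([net(X_batch) for _ in range(20)])  # 20 stochastic passes
    mean, std = draws.mean(dim=0), draws.std(dim=0)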

get_lightning_trainer(random_seed=None, **kwargs)[source]

Returns a LightningTrainer object built from the current model.

fit_model(X=None, y=None, reinitialize=None, init_seed=None, **kwargs)[source]

Fit just the model component, and not the uncertainties (if these are computed separately)

initialize(init_seed=None, sample_input=None)[source]

(Re)initializes the model weights. If self.reinitialize is True, this should be called at the start of every fit(), and this should be the default behaviour of fit().

predict(X, return_std_dev=False, convert_dtype=True)[source]

Applies the model to input(s) X (with the last self.ndim axes corresponding to each sample), and returns prediction(s).

Parameters:

return_std_dev – if True, returns a tuple (prediction, std_dev)

covariance(X)

Returns the covariance of the epistemic uncertainty between all rows of X. This is where memory bugs often appear, because of the large matrices involved.

covariance_ensemble(X: ArrayLike)

Compute covariance from the ensemble of predictions

covariance_laplace(X, **kwargs)

Computes covariance using the Laplace approximation

covariance_linear(X, block_size=1000)
property data
property embedding_method
find_last_layer(X: ArrayLike)

Automatically determines the last layer of the model with one forward pass. It assumes that the last layer is the same for every forward pass and that it is an instance of torch.nn.Linear. Might not work with every architecture, but is tested with all PyTorch torchvision classification models (besides SqueezeNet, which has no linear last layer).

Parameters:

X – batch of samples used to find last layer.

Returns:

Returns the output of the forward pass, so as not to waste computation.

find_method()
fit(X=None, y=None, reinitialize=None, fit_uncertainty=True, **kwargs)

Fits the model to the given training data. If X and y are not specified, this method looks for self.X and self.y. If fit() finds an X but not a y, it treats X as a combined dataset data, and then uses X, y = data.X, data.y. If we can’t find data.X and data.y, we instead use X, y = data[:-1], data[-1].

fit() should also fit any accompanying uncertainty model.

Parameters:
  • reinitialize – If True, reinitializes model weights before fitting. If False, starts training from previous weight values. If not specified, uses self.reinitialize.

  • fit_uncertainty – If True, a call to fit() will also call fit_uncertainty(). Defaults to True.

fit_laplace(X=None, y=None, lamb=None)

Fits the Laplace approximation to the (last layer of) the model.
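A hedged sketch; the role of lamb is not documented here (a prior-precision-style parameter is an assumption):

    reg.fit_laplace()                      # fit the last-layer Laplace approximation
    cov = reg.covariance_laplace(X_batch)  # then query Laplace-based covariances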

fit_uncertainty(X=None, y=None)

Fit just the uncertainties (if these need additional fitting beyond just the model)

input_embedding(X)
last_layer_embedding(X)

Returns the activations of the last layer before the output.

linearization()

Finds the last-layer linearization of the model in its current state.

Returns:

weights, bias

static load(path)

Loads a model. This particular implementation only works if save(path) hasn’t been overloaded.

method_names = {0: ['_embedding', 'embed', 'embeddings'], 1: ['last_layer_embedding', 'last_layer_embed', 'embed_last_layer', 'last_layer'], 2: ['input_embedding']}
property ndim

The number of axes in the feature space. Equal to len(self.shape). Most commonly equal to 1. If training data have been specified, then self.ndim == X.ndim - 1.

This property is used by any methods which use the @flatten_batch decorator.

predict_ensemble(X, **kwargs)

Returns an ensemble of predictions.

Parameters:

multiple – standard deviation will be multiplied by this

predict_linear(X)

Use the model’s linearization for a prediction.

predict_samples(X, n=1, multiple=1.0, convert_dtype=True)[source]

Makes a prediction for the batch X, randomly selected from this model’s posterior distribution. Gives an ensemble of predictions, with shape (len(X), n).

predict_with_embedding(X)

Forward pass which returns the output of the penultimate layer along with the output of the last layer. If the last layer is not known yet, it will be determined when this function is called for the first time.

Parameters:

X – one batch of data to use as input for the forward pass

save(path)

Saves the model. May well be overloaded by subclasses, if they contain non-picklable components (or pickling would be inefficient).

For any subclass, the save() and load() methods should be compatible with each other.

set_last_layer(last_layer_name: str) None

Set the last layer of the model by its name. This sets the forward hook to get the output of the penultimate layer.

Parameters:

last_layer_name – the name of the last layer (fixed in model.named_modules()).

property shape

The shape of the feature space. Can either be specified directly, or inferred from training data, in which case self.shape == X.shape[1:], i.e., the first (batch) dimension is dropped.

This property is used by any methods which use the @flatten_batch decorator.

std_dev(X, **kwargs)

Returns the (epistemic) standard deviation of the model on input X.

std_dev_ensemble(X)

Returns the (epistemic) standard deviation of the model on input X.

property dtype

alien.models.pytorch.training_limits module

alien.models.pytorch.training_limits.get_training_limit(fn)[source]

*Decorator*

Modifies a function so that it can take any of a number of different naive arguments to specify limits to pytorch training. The inner (wrapped) function will see only an argument training_limit, which receives an instance of the fancy TrainingLimit class.

Thus, the inner fn must have an argument named training_limit (or **kwargs).

The outer, decorated fn may take additional optional kwargs:

'sample_limit', 'batch_limit', 'epoch_limit', 'samples', 'batches', 'epochs'

If these are given, they determine the value of training_limit according to a calculation that favors them in the order provided (though you may only want to provide one).
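A usage sketch under the stated contract (model, data, and the function body are hypothetical):

    from alien.models.pytorch.training_limits import get_training_limit

    @get_training_limit
    def train(model, data, training_limit=None, **kwargs):
        # receives a TrainingLimit instance assembled from the caller's kwargs
        ...

    train(model, data, epochs=10)  # 'epochs' is folded into training_limit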

class alien.models.pytorch.training_limits.TrainingLimit(min_samples: int = 0, min_epochs: float = 0, samples: int | None = None, epochs: float | None = None, batches: int | None = None, max_samples: float = inf, max_epochs: float = inf)[source]

Bases: object

Encapsulates the computation of training limits, which may depend on things like dataset length.

min_samples: int = 0
min_epochs: float = 0
samples: int | None = None
epochs: float | None = None
batches: int | None = None
max_samples: float = inf
max_epochs: float = inf
sample_limit(length=None)[source]
batch_limit(batch_size=None, length=None)[source]
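A construction sketch; the exact resolution rules of sample_limit() and batch_limit() are not spelled out above, so the comments are assumptions:

    from alien.models.pytorch.training_limits import TrainingLimit

    limit = TrainingLimit(epochs=5, max_samples=10_000)
    n_samples = limit.sample_limit(length=1500)                # presumably min(5 * 1500, 10_000)
    n_batches = limit.batch_limit(batch_size=32, length=1500)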
class alien.models.pytorch.training_limits.StdLimit(**kwargs)[source]

Bases: TrainingLimit

batch_limit(batch_size=None, length=None)
batches: int | None = None
epochs: float | None = None
max_epochs: float = inf
max_samples: float = inf
min_epochs: float = 0
min_samples: int = 0
sample_limit(length=None)
samples: int | None = None

alien.models.pytorch.utils module

Helper functions for Pytorch models.

alien.models.pytorch.utils.as_tensor(x)[source]

Returns the tensor version of x (i.e., x itself if it is already ArrayLike, or x.data otherwise).

alien.models.pytorch.utils.dropout_forward(self, x)[source]
alien.models.pytorch.utils.submodules(module, include_names=True, skip=frozenset({}))[source]

Iterates over the submodules of module (paired with their names, if include_names is True). A submodule is returned only once, on its first occurrence in a depth-first traversal.

Any modules in skip (given either as the actual module, or its name) will be skipped, along with all their submodules.

Parameters:
  • module (torch.nn.Module) – The module whose submodules we will iterate through.

  • include_names (bool) – If True, the iterator yields pairs (name, submodule); otherwise it yields just submodule. (The name returned is the one under which the submodule is first indexed in the tree; if the submodule is unnamed, its name is returned as None.)

  • skip – A collection of modules to skip. May contain the modules themselves and/or their names.
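A usage sketch (the 'encoder' entry in skip is hypothetical):

    from alien.models.pytorch.utils import submodules

    for name, sub in submodules(net, include_names=True, skip={"encoder"}):
        print(name, type(sub).__name__)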

Module contents