alien.models.pytorch package
Submodules
alien.models.pytorch.last_layer module
Model wrapper for last layer linearization.
- class alien.models.pytorch.last_layer.LastLayerPytorchLinearization(model=None, X=None, y=None, **kwargs)[source]
Bases: LastLayerLinearizableRegressor
Last layer linearization for Pytorch-based models.
- predict_with_embedding(X)[source]
Forward pass which returns the output of the penultimate layer along with the output of the last layer. If the last layer is not known yet, it will be determined when this function is called for the first time.
- Parameters:
X – one batch of data to use as input for the forward pass
- linearization()[source]
Finds the last-layer linearization of the model in its current state.
- Returns:
weights, bias
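Since linearization() returns a (weights, bias) pair, a natural sanity check is to reproduce predict_linear(X) by hand. A minimal sketch, assuming the pair follows torch.nn.Linear conventions and that last_layer_embedding (one of the embedding-method names in method_names below) returns the penultimate activations as a torch tensor; none of this is guaranteed by the docstring:

```python
import torch

# Assumes `model` is an already-fitted LastLayerPytorchLinearization
# and `X` is a batch of inputs.
weights, bias = model.linearization()

# Assumption: affine map of the last-layer embedding, with
# torch.nn.Linear shape conventions.
embedding = model.last_layer_embedding(X)
manual = embedding @ weights.T + bias

linear_pred = model.predict_linear(X)
# `manual` and `linear_pred` should agree up to numerical tolerance
# (assuming both are torch tensors of the same shape).
```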
- set_last_layer(last_layer_name: str) None [source]
Set the last layer of the model by its name. This sets the forward hook to get the output of the penultimate layer.
- Parameters:
last_layer_name – the name of the last layer, as it appears in model.named_modules().
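For instance, one might locate the name of the final torch.nn.Linear module with PyTorch's standard named_modules() and pin it explicitly, rather than relying on automatic detection. A sketch, assuming `net` is the underlying torch module wrapped by `model` (the name `net` and the selection heuristic are illustrative, not part of the API):

```python
import torch

last_name = None
for name, module in net.named_modules():
    if isinstance(module, torch.nn.Linear):
        last_name = name  # keep the last Linear encountered

model.set_last_layer(last_name)
```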
- find_last_layer(X: ArrayLike)[source]
Automatically determines the last layer of the model with one forward pass. It assumes that the last layer is the same for every forward pass and that it is an instance of torch.nn.Linear. Might not work with every architecture, but is tested with all PyTorch torchvision classification models (besides SqueezeNet, which has no linear last layer).
- Parameters:
X – batch of samples used to find last layer.
- Returns:
Returns the output of the forward pass, so as not to waste computation.
- covariance(X)
Returns the covariance of the epistemic uncertainty between all rows of X. This is where memory bugs often appear, because of the large matrices involved.
- covariance_linear(X, block_size=1000)
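To illustrate the memory concern mentioned above: if covariance(X) produces a dense matrix coupling every pair of rows, it scales quadratically in len(X). A sketch, assuming (the docstring does not state this) that the result has shape (len(X), len(X)) and that block_size chunks the computation to bound peak memory:

```python
# Dense epistemic covariance over a candidate pool -- quadratic memory
# in len(X_pool) if the result couples every pair of rows (assumption).
cov = model.covariance(X_pool)

# The linearized route exposes block_size, presumably to process X in
# chunks and keep intermediate matrices small (assumption).
cov = model.covariance_linear(X_pool, block_size=256)
```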
- property data
- property embedding_method
- find_method()
- fit(X=None, y=None, reinitialize=None, fit_uncertainty=True, **kwargs)
Fits the model to the given training data. If X and y are not specified, this method looks for self.X and self.y. If fit() finds an X but not a y, it treats X as a combined dataset data, and then uses X, y = data.X, data.y. If we can’t find data.X and data.y, we instead use X, y = data[:-1], data[-1]. fit() should also fit any accompanying uncertainty model. (A sketch of this lookup order follows below.)
- Parameters:
reinitialize – If True, reinitializes model weights before fitting. If False, starts training from previous weight values. If not specified, uses self.reinitialize.
fit_uncertainty – If True, a call to fit() will also call fit_uncertainty(). Defaults to True.
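The data-resolution order described above can be condensed into a short sketch; this mirrors the docstring, not the actual source:

```python
def resolve_training_data(X, y, self_X, self_y):
    # Fall back to self.X / self.y when nothing is passed in.
    if X is None and y is None:
        X, y = self_X, self_y
    # An X without a y is treated as a combined dataset.
    if X is not None and y is None:
        data = X
        if hasattr(data, "X") and hasattr(data, "y"):
            X, y = data.X, data.y
        else:
            X, y = data[:-1], data[-1]
    return X, y
```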
- fit_model(X=None, y=None, **kwargs)
Fit just the model component, and not the uncertainties (if these are computed separately)
- fit_uncertainty(X=None, y=None)
Fit just the uncertainties (if these need additional fitting beyond just the model)
- initialize(init_seed=None, sample_input=None)
(Re)initializes the model weights. If self.reinitialize is True, this should be called at the start of every fit(), and this should be the default behaviour of fit().
- input_embedding(X)
- static load(path)
Loads a model. This particular implementation only works if save(path) hasn’t been overloaded.
- method_names = {0: ['_embedding', 'embed', 'embeddings'], 1: ['last_layer_embedding', 'last_layer_embed', 'embed_last_layer', 'last_layer'], 2: ['input_embedding']}
- property ndim
The number of axes in the feature space. Equal to len(self.shape). Most commonly equal to 1. If training data have been specified, then self.ndim == X.ndim - 1.
This property is used by any methods which use the @flatten_batch decorator.
- abstract predict(X, return_std_dev=False)
Applies the model to input(s) X (with the last self.ndim axes corresponding to each sample), and returns prediction(s).
- Parameters:
return_std_dev – if True, returns a tuple (prediction, std_dev)
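A minimal usage sketch, assuming a fitted model and a batch X_test (both names illustrative):

```python
preds = model.predict(X_test)
preds, stds = model.predict(X_test, return_std_dev=True)
```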
- predict_ensemble(X, multiple=1.0)
Returns a correlated ensemble of predictions for samples X.
Ensembles are correlated only over the last batch dimension, corresponding to axis (-1 - self.ndim) of X. Earlier dimensions have no guarantee of correlation.
- Parameters:
multiple – standard deviation will be multiplied by this
- predict_linear(X)
Use the model’s linearization for a prediction.
- predict_samples(X, n=1, multiple=1.0, use_covariance_for_ensemble=None)
Makes a prediction for the batch X, randomly selected from this model’s posterior distribution. Gives an ensemble of predictions, with shape (len(X), n).
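For example (the output shape is per the docstring; X_test is an illustrative batch):

```python
samples = model.predict_samples(X_test, n=20)
assert samples.shape == (len(X_test), 20)  # (len(X), n), per the docstring

# Widen the posterior spread, e.g. for more exploratory ensembles:
wide = model.predict_samples(X_test, n=20, multiple=2.0)
```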
- save(path)
Saves the model. May well be overloaded by subclasses, if they contain non-picklable components (or pickling would be inefficient).
For any subclass, the save() and load() methods should be compatible with each other.
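A round-trip sketch; the file path is illustrative, and the default implementation presumably pickles the model:

```python
model.save("regressor.pkl")                   # path is hypothetical
restored = type(model).load("regressor.pkl")  # static method, per the listing
```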
- property shape
The shape of the feature space. Can either be specified directly, or inferred from training data, in which case self.shape == X.shape[1:], i.e., the first (batch) dimension is dropped.
This property is used by any methods which use the @flatten_batch decorator.
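A sketch of the inference rules for shape and ndim, assuming `model` is a fresh regressor from this package and that both properties are populated once training data are supplied:

```python
import numpy as np

X_train = np.random.rand(100, 16)   # (batch, features)
y_train = np.random.rand(100)
model.fit(X_train, y_train)

assert model.shape == (16,)  # X.shape[1:]: batch dimension dropped
assert model.ndim == 1       # len(model.shape) == X_train.ndim - 1
```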
- std_dev(X, **kwargs)
Returns the (epistemic) standard deviation of the model on input X.
alien.models.pytorch.lightning module
alien.models.pytorch.pytorch module
- class alien.models.pytorch.pytorch.PytorchRegressor(model=None, X=None, y=None, **kwargs)[source]
Bases: LastLayerPytorchLinearization, MCDropoutRegressor, LinearizableLaplaceRegressor
- Parameters:
trainer – Specifies how the model will be trained (see the construction sketch after this list). May be:
’model’ — calls self.model.fit
’lightning’ — trains with pytorch-lightning
trainer — calls trainer.fit
None — chooses from the above in order, if available
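A minimal construction sketch, using only the documented constructor arguments (model, X, y); the network architecture and data are illustrative:

```python
import numpy as np
import torch
from alien.models.pytorch.pytorch import PytorchRegressor

X_train, y_train = np.random.rand(200, 16), np.random.rand(200)

net = torch.nn.Sequential(
    torch.nn.Linear(16, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 1),  # linear last layer, as find_last_layer expects
)

model = PytorchRegressor(model=net, X=X_train, y=y_train)
model.fit()
preds, stds = model.predict(np.random.rand(10, 16), return_std_dev=True)
```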
- get_lightning_trainer(random_seed=None, **kwargs)[source]
Returns a LightningTrainer object from the current model.
- fit_model(X=None, y=None, reinitialize=None, init_seed=None, **kwargs)[source]
Fit just the model component, and not the uncertainties (if these are computed separately)
- initialize(init_seed=None, sample_input=None)[source]
(Re)initializes the model weights. If self.reinitialize is True, this should be called at the start of every fit(), and this should be the default behaviour of fit().
- predict(X, return_std_dev=False, convert_dtype=True)[source]
Applies the model to input(s) X (with the last self.ndim axes corresponding to each sample), and returns prediction(s).
- Parameters:
return_std_dev – if True, returns a tuple (prediction, std_dev)
- covariance(X)
Returns the covariance of the epistemic uncertainty between all rows of X. This is where memory bugs often appear, because of the large matrices involved.
- covariance_ensemble(X: ArrayLike)
Compute covariance from the ensemble of predictions
- covariance_laplace(X, **kwargs)
Computes covariance using the Laplace approximation
- covariance_linear(X, block_size=1000)
- property data
- property embedding_method
- find_last_layer(X: ArrayLike)
Automatically determines the last layer of the model with one forward pass. It assumes that the last layer is the same for every forward pass and that it is an instance of torch.nn.Linear. Might not work with every architecture, but is tested with all PyTorch torchvision classification models (besides SqueezeNet, which has no linear last layer).
- Parameters:
X – batch of samples used to find last layer.
- Returns:
Returns the output of the forward pass, so as not to waste computation.
- find_method()
- fit(X=None, y=None, reinitialize=None, fit_uncertainty=True, **kwargs)
Fits the model to the given training data. If X and y are not specified, this method looks for self.X and self.y. If fit() finds an X but not a y, it treats X as a combined dataset data, and then uses X, y = data.X, data.y. If we can’t find data.X and data.y, we instead use X, y = data[:-1], data[-1]. fit() should also fit any accompanying uncertainty model.
- Parameters:
reinitialize – If True, reinitializes model weights before fitting. If False, starts training from previous weight values. If not specified, uses self.reinitialize.
fit_uncertainty – If True, a call to fit() will also call fit_uncertainty(). Defaults to True.
- fit_laplace(X=None, y=None, lamb=None)
Fits the Laplace approximation to (the last layer of) the model.
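A usage sketch; the semantics of lamb are not specified in the docstring, so it is left at its default here:

```python
model.fit(X_train, y_train)           # ordinary training first
model.fit_laplace(X_train, y_train)   # then the last-layer Laplace fit
cov = model.covariance_laplace(X_test)
```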
- fit_uncertainty(X=None, y=None)
Fit just the uncertainties (if these need additional fitting beyond just the model)
- input_embedding(X)
- last_layer_embedding(X)
Returns the activations of the last layer before the output.
- linearization()
Finds the last-layer linearization of the model in its current state.
- Returns:
weights, bias
- static load(path)
Loads a model. This particular implementation only works if save(path) hasn’t been overloaded.
- method_names = {0: ['_embedding', 'embed', 'embeddings'], 1: ['last_layer_embedding', 'last_layer_embed', 'embed_last_layer', 'last_layer'], 2: ['input_embedding']}
- property ndim
The number of axes in the feature space. Equal to len(self.shape). Most commonly equal to 1. If training data have been specified, then self.ndim == X.ndim - 1.
This property is used by any methods which use the @flatten_batch decorator.
- predict_ensemble(X, **kwargs)
Returns an ensemble of predictions.
- Parameters:
multiple – standard deviation will be multiplied by this factor
- predict_linear(X)
Use the model’s linearization for a prediction.
- predict_samples(X, n=1, multiple=1.0, convert_dtype=True)[source]
Makes a prediction for the batch X, randomly selected from this model’s posterior distribution. Gives an ensemble of predictions, with shape (len(X), n).
- predict_with_embedding(X)
Forward pass which returns the output of the penultimate layer along with the output of the last layer. If the last layer is not known yet, it will be determined when this function is called for the first time.
- Parameters:
X – one batch of data to use as input for the forward pass
- save(path)
Saves the model. May well be overloaded by subclasses, if they contain non-picklable components (or pickling would be inefficient).
For any subclass, the save() and load() methods should be compatible with each other.
- set_last_layer(last_layer_name: str) None
Set the last layer of the model by its name. This sets the forward hook to get the output of the penultimate layer.
- Parameters:
last_layer_name – the name of the last layer, as it appears in model.named_modules().
- property shape
The shape of the feature space. Can either be specified directly, or inferred from training data, in which case self.shape == X.shape[1:], i.e., the first (batch) dimension is dropped.
This property is used by any methods which use the @flatten_batch decorator.
- std_dev(X, **kwargs)
Returns the (epistemic) standard deviation of the model on input X.
- std_dev_ensemble(X)
Returns the (epistemic) standard deviation of the model on input X.
- property dtype
alien.models.pytorch.training_limits module
- alien.models.pytorch.training_limits.get_training_limit(fn)[source]
*Decorator*
Modifies a function so that it can take any of a number of different naive arguments to specify limits to pytorch training. The inner (wrapped) function will see only an argument training_limit, which receives an instance of the fancy TrainingLimit class. Thus, the inner fn must have an argument named training_limit (or **kwargs).
- The outer, decorated fn may take additional optional kwargs: 'sample_limit', 'batch_limit', 'epoch_limit', 'samples', 'batches', 'epochs'. If these are given, they determine the value of training_limit according to a calculation that favors them in the order provided (though you may only want to provide one).
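A sketch of a decorated function; the inner body and the call sites are hypothetical, but the kwarg-to-training_limit conversion is what the decorator is documented to do:

```python
from alien.models.pytorch.training_limits import get_training_limit

@get_training_limit
def train_loop(model, data, training_limit=None, **kwargs):
    # Only the resolved TrainingLimit instance arrives here.
    ...

# Callers may instead pass any of the friendlier kwargs listed above:
# train_loop(model, data, epochs=10)
# train_loop(model, data, sample_limit=50_000)
```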
- class alien.models.pytorch.training_limits.TrainingLimit(min_samples: int = 0, min_epochs: float = 0, samples: int | None = None, epochs: float | None = None, batches: int | None = None, max_samples: float = inf, max_epochs: float = inf)[source]
Bases: object
Encapsulates the computation of training limits, which may depend on things like dataset length.
- min_samples: int = 0
- min_epochs: float = 0
- samples: int | None = None
- epochs: float | None = None
- batches: int | None = None
- max_samples: float = inf
- max_epochs: float = inf
- class alien.models.pytorch.training_limits.StdLimit(**kwargs)[source]
Bases: TrainingLimit
- batch_limit(batch_size=None, length=None)
- batches: int | None = None
- epochs: float | None = None
- max_epochs: float = inf
- max_samples: float = inf
- min_epochs: float = 0
- min_samples: int = 0
- sample_limit(length=None)
- samples: int | None = None
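A construction sketch; the exact resolution performed by sample_limit()/batch_limit() is not documented, so the comments describe an assumed behaviour:

```python
from alien.models.pytorch.training_limits import StdLimit

limit = StdLimit(min_epochs=1, epochs=10, max_samples=100_000)

# Presumably resolves the declared limits against a concrete dataset
# length / batch size (an assumption based on the class docstring).
n_samples = limit.sample_limit(length=5_000)
n_batches = limit.batch_limit(batch_size=32, length=5_000)
```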
alien.models.pytorch.utils module
Helper functions for Pytorch models.
- alien.models.pytorch.utils.as_tensor(x)[source]
Return the tensor version of x (i.e., itself if it already is ArrayLike, or x.data).
- alien.models.pytorch.utils.submodules(module, include_names=True, skip=frozenset({}))[source]
Iterator through submodules of module (paired with their names, if include_names is True). A submodule is returned only once, on its first occurrence in a depth-first traversal. Any modules in skip (given either as the actual module, or its name) will be skipped, along with all their submodules.
- Parameters:
module (torch.nn.Module) – The module whose submodules we will iterate through.
include_names (bool) – If True, the iterator yields pairs (name, submodule); otherwise it yields just submodule. (The returned name is the name it’s indexed as the first time it occurs in the tree. If the submodule is not named, its name will return as None.)
skip – A collection of modules to skip. Can contain either modules themselves, and/or their names.
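A usage sketch with a toy network; the skip value shows skipping by name (in a torch.nn.Sequential, the child names are '0', '1', ...):

```python
import torch
from alien.models.pytorch.utils import submodules

net = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU())

# Depth-first, de-duplicated traversal; include_names defaults to True,
# so (name, submodule) pairs are yielded.
for name, module in submodules(net):
    print(name, type(module).__name__)

# Skip a subtree, by module object or by name:
for name, module in submodules(net, skip={"1"}):
    print(name, type(module).__name__)
```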