Influence function model
This module provides several implementations of InfluenceFunctionModel using PyTorch.
TorchInfluenceFunctionModel(model, loss)
Bases: InfluenceFunctionModel[Tensor, DataLoader], ABC
Abstract base class for influence computation related to torch models
Source code in src/pydvl/influence/torch/influence_function_model.py
is_fitted (abstractmethod, property)
Override this to expose the fitting status of the instance.
fit(data) (abstractmethod)
Override this method to fit the influence function model to training data, e.g. pre-compute the Hessian matrix or matrix decompositions.
PARAMETER | DESCRIPTION |
---|---|
data | |
RETURNS | DESCRIPTION |
---|---|
The fitted instance |
Source code in src/pydvl/influence/base_influence_function_model.py
influences(x_test, y_test, x=None, y=None, mode=InfluenceMode.Up)
Compute the approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.
PARAMETER | DESCRIPTION |
---|---|
x_test | model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\) |
y_test | label tensor to compute gradients |
x | optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\); if None, use \(x=x_{\text{test}}\) |
y | optional label tensor to compute gradients |
mode | enum value of InfluenceMode |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/torch/influence_function_model.py
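As a minimal sketch of the two quantities described above, assume a linear model \(f_{\theta}(x) = \theta \cdot x\) with squared loss \(\ell = \tfrac{1}{2}(\theta \cdot x - y)^2\), for which the gradients and the Hessian have closed forms. The helper names (`grad_theta`, `mixed_grad`, `solve2`) and the toy numbers are illustrative assumptions and not part of pyDVL.

```python
# Toy illustration (not pyDVL code): linear model f(x) = theta . x,
# loss l(y, f) = 0.5 * (f - y)^2, two parameters.

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def grad_theta(theta, x, y):
    """Gradient of the loss w.r.t. theta: (theta.x - y) * x."""
    r = dot(theta, x) - y
    return [r * xi for xi in x]

def mixed_grad(theta, x, y):
    """M[i][j] = d/dx_j (d loss / d theta_i) = theta_j * x_i + r * delta_ij."""
    r = dot(theta, x) - y
    n = len(x)
    return [[theta[j] * x[i] + (r if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def solve2(h, b):
    """Solve the 2x2 system h z = b with Cramer's rule."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [(b[0] * h[1][1] - h[0][1] * b[1]) / det,
            (h[0][0] * b[1] - h[1][0] * b[0]) / det]

theta = [1.0, -1.0]
train = [([1.0, 0.0], 2.0), ([0.0, 1.0], 0.0), ([1.0, 1.0], 1.0)]
x_test, y_test = [2.0, 1.0], 0.0

# Hessian of the mean training loss: the average of x x^T.
n = len(train)
hessian = [[sum(x[i] * x[j] for x, _ in train) / n for j in range(2)]
           for i in range(2)]

# H^{-1} grad_theta l(y_test, f(x_test)), shared by both modes.
z_test = solve2(hessian, grad_theta(theta, x_test, y_test))

# Up-weighting: one scalar per training sample.
up = [dot(z_test, grad_theta(theta, x, y)) for x, y in train]

# Perturbation: one value per input feature of each training sample.
pert = [[sum(z_test[i] * m[i][j] for i in range(2)) for j in range(2)]
        for m in (mixed_grad(theta, x, y) for x, y in train)]
```

In this toy setting the up-weighting result has one entry per training sample, while the perturbation result keeps an extra axis with the shape of the input.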
influence_factors(x, y)
Compute the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
x | model input to use in the gradient computations |
y | label tensor to compute gradients |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise inverse Hessian matrix vector products |
Source code in src/pydvl/influence/torch/influence_function_model.py
influences_from_factors(z_test_factors, x, y, mode=InfluenceMode.Up)
Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
z_test_factors | pre-computed tensor, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\) |
x | model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\) |
y | label tensor to compute gradients |
mode | enum value of InfluenceMode |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/torch/influence_function_model.py
DirectInfluence(model, loss, hessian_regularization=0.0)
Bases: TorchInfluenceFunctionModel
Given a model and training data, it finds x such that \(Hx = b\), with \(H\) being the model Hessian.
PARAMETER | DESCRIPTION |
---|---|
model | instance of torch.nn.Module. |
hessian_regularization | Regularization of the Hessian. |
Source code in src/pydvl/influence/torch/influence_function_model.py
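A minimal sketch of what `hessian_regularization` can be used for, under the assumption (not stated on this page) that the value \(\lambda\) is added to the diagonal of the Hessian before solving, i.e. \((H + \lambda I)x = b\). The hand-written 2x2 solver `solve_regularized` is a hypothetical stand-in for a direct solver such as torch.linalg.solve.

```python
def solve_regularized(h, b, reg=0.0):
    """Solve (h + reg * I) x = b for a 2x2 matrix h with Cramer's rule."""
    a00, a01 = h[0][0] + reg, h[0][1]
    a10, a11 = h[1][0], h[1][1] + reg
    det = a00 * a11 - a01 * a10
    if abs(det) < 1e-12:
        raise ValueError("matrix is numerically singular; try a larger reg")
    return [(b[0] * a11 - a01 * b[1]) / det,
            (a00 * b[1] - a10 * b[0]) / det]

# A rank-deficient Hessian: the plain system has no unique solution.
h = [[1.0, 1.0], [1.0, 1.0]]
b = [1.0, 1.0]

try:
    solve_regularized(h, b)
    singular = False
except ValueError:
    singular = True

# With a small diagonal shift the system becomes well-posed.
x = solve_regularized(h, b, reg=0.1)
```

The shift trades a small bias in the solution for a system that can be solved at all, which is why a nonzero value may help when the Hessian is singular or badly conditioned.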
influence_factors(x, y)
Compute the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
x | model input to use in the gradient computations |
y | label tensor to compute gradients |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise inverse Hessian matrix vector products |
Source code in src/pydvl/influence/torch/influence_function_model.py
influences_from_factors(z_test_factors, x, y, mode=InfluenceMode.Up)
Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
z_test_factors | pre-computed tensor, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\) |
x | model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\) |
y | label tensor to compute gradients |
mode | enum value of InfluenceMode |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/torch/influence_function_model.py
fit(data)
Compute the Hessian matrix based on a provided data loader.
PARAMETER | DESCRIPTION |
---|---|
data | Instance of torch.utils.data.DataLoader |
RETURNS | DESCRIPTION |
---|---|
DirectInfluence | The fitted instance |
Source code in src/pydvl/influence/torch/influence_function_model.py
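How the Hessian is aggregated over the batches of the data loader is not spelled out on this page. The sketch below assumes a mean loss, in which case averaging the per-batch Hessians weighted by batch size reproduces the Hessian over the full data set. It uses the closed-form Hessian of a linear model with squared loss (the mean of \(x x^T\)); the function names are hypothetical.

```python
def batch_hessian(xs):
    """Hessian of the mean loss 0.5 * (theta.x - y)^2 over a batch: mean of x x^T."""
    n, d = len(xs), len(xs[0])
    return [[sum(x[i] * x[j] for x in xs) / n for j in range(d)]
            for i in range(d)]

def hessian_from_batches(batches):
    """Average batch Hessians, weighting each batch by its number of samples."""
    total = sum(len(xs) for xs in batches)
    d = len(batches[0][0])
    acc = [[0.0] * d for _ in range(d)]
    for xs in batches:
        hb = batch_hessian(xs)
        w = len(xs) / total
        for i in range(d):
            for j in range(d):
                acc[i][j] += w * hb[i][j]
    return acc

data = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
batches = [data[:2], data[2:]]  # uneven batch sizes, as a loader may yield
h_batched = hessian_from_batches(batches)
h_full = batch_hessian(data)
```

Weighting by batch size matters when the last batch is smaller than the others; a plain average of batch Hessians would over-weight it.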
influences(x_test, y_test, x=None, y=None, mode=InfluenceMode.Up)
Compute the approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The action of \(H^{-1}\) is achieved via a direct solver using torch.linalg.solve.
PARAMETER | DESCRIPTION |
---|---|
x_test | model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\) |
y_test | label tensor to compute gradients |
x | optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\); if None, use \(x=x_{\text{test}}\) |
y | optional label tensor to compute gradients |
mode | enum value of InfluenceMode |
RETURNS | DESCRIPTION |
---|---|
Tensor | torch.Tensor representing the element-wise scalar products for the provided batch. |
Source code in src/pydvl/influence/torch/influence_function_model.py
CgInfluence(model, loss, hessian_regularization=0.0, x0=None, rtol=1e-07, atol=1e-07, maxiter=None, progress=False)
Bases: TorchInfluenceFunctionModel
Given a model and training data, it uses conjugate gradient to compute the inverse Hessian-vector product. More precisely, it finds x such that \(Hx = b\), with \(H\) being the model Hessian. For more info, see Conjugate Gradient.
PARAMETER | DESCRIPTION |
---|---|
model | Instance of torch.nn.Module. |
loss | A callable that takes the model's output and target as input and returns the scalar loss. |
hessian_regularization | Regularization of the Hessian. |
x0 | Initial guess for hvp. If None, defaults to b. |
rtol | Maximum relative tolerance of result. |
atol | Absolute tolerance of result. |
maxiter | Maximum number of iterations. If None, defaults to 10*len(b). |
progress | If True, display progress bars. |
Source code in src/pydvl/influence/torch/influence_function_model.py
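The parameters above map naturally onto a textbook conjugate gradient loop. The following is a sketch under stated assumptions and not the pyDVL implementation: the stopping rule (residual norm below `max(atol, rtol * ||b||)`) is an assumption, and `conjugate_gradient` and `hvp` are hypothetical names. The solver only needs a function computing \(v \mapsto Hv\), so the Hessian never has to be formed explicitly.

```python
import math

def conjugate_gradient(matvec, b, x0=None, rtol=1e-7, atol=1e-7, maxiter=None):
    """Solve A x = b for symmetric positive definite A, given only v -> A v."""
    n = len(b)
    x = list(b) if x0 is None else list(x0)  # initial guess defaults to b
    maxiter = 10 * n if maxiter is None else maxiter
    ax = matvec(x)
    r = [bi - ai for bi, ai in zip(b, ax)]   # residual b - A x
    p = list(r)                              # search direction
    rs = sum(ri * ri for ri in r)
    b_norm = math.sqrt(sum(bi * bi for bi in b))
    for _ in range(maxiter):
        # Assumed stopping rule, combining absolute and relative tolerance.
        if math.sqrt(rs) <= max(atol, rtol * b_norm):
            break
        ap = matvec(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

h = [[4.0, 1.0], [1.0, 3.0]]
reg = 0.0  # a regularization term would enter the product as reg * v

def hvp(v):
    """Regularized Hessian-vector product for the toy matrix h."""
    return [sum(h[i][j] * v[j] for j in range(2)) + reg * v[i] for i in range(2)]

x = conjugate_gradient(hvp, [1.0, 2.0])
```

For an \(n \times n\) symmetric positive definite system, conjugate gradient reaches the exact solution in at most \(n\) steps in exact arithmetic, so the default of `10*len(b)` iterations leaves headroom for rounding errors.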
influence_factors(x, y)
Compute the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
x | model input to use in the gradient computations |
y | label tensor to compute gradients |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise inverse Hessian matrix vector products |
Source code in src/pydvl/influence/torch/influence_function_model.py
influences_from_factors(z_test_factors, x, y, mode=InfluenceMode.Up)
Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
z_test_factors | pre-computed tensor, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\) |
x | model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\) |
y | label tensor to compute gradients |
mode | enum value of InfluenceMode |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/torch/influence_function_model.py
influences(x_test, y_test, x=None, y=None, mode=InfluenceMode.Up)
Compute the approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The approximate action of \(H^{-1}\) is achieved via the [conjugate gradient method](https://en.wikipedia.org/wiki/Conjugate_gradient_method).
PARAMETER | DESCRIPTION |
---|---|
x_test | model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\) |
y_test | label tensor to compute gradients |
x | optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\); if None, use \(x=x_{\text{test}}\) |
y | optional label tensor to compute gradients |
mode | enum value of InfluenceMode |
RETURNS | DESCRIPTION |
---|---|
Tensor | torch.Tensor representing the element-wise scalar products for the provided batch. |
Source code in src/pydvl/influence/torch/influence_function_model.py
LissaInfluence(model, loss, hessian_regularization=0.0, maxiter=1000, dampen=0.0, scale=10.0, h0=None, rtol=0.0001, progress=False)
Bases: TorchInfluenceFunctionModel
Uses LiSSA, the Linear time Stochastic Second-Order Algorithm, to iteratively approximate the inverse Hessian. More precisely, it finds x s.t. \(Hx = b\), with \(H\) being the model's second derivative wrt. the parameters. This is done with the update
\[ H^{-1}_{j+1} b = b + (I - d)\, H^{-1}_{j} b - \frac{H \, H^{-1}_{j} b}{s}, \]
where \(I\) is the identity matrix, \(d\) is a dampening term and \(s\) a scaling factor that are applied to help convergence. For details, see Linear time Stochastic Second-Order Approximation (LiSSA).
PARAMETER | DESCRIPTION |
---|---|
model | instance of torch.nn.Module. |
hessian_regularization | Regularization of the Hessian. |
maxiter | Maximum number of iterations. |
dampen | Dampening factor, defaults to 0 for no dampening. |
scale | Scaling factor, defaults to 10. |
h0 | Initial guess for hvp. |
rtol | tolerance to use for early stopping |
progress | If True, display progress bars. |
Source code in src/pydvl/influence/torch/influence_function_model.py
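To make the roles of `dampen`, `scale`, `maxiter` and `rtol` concrete, here is a sketch of a LiSSA-style fixed-point iteration on a toy problem. It assumes the commonly used recursion \(h \leftarrow b + (1 - d)\,h - Hh/s\) with the final estimate divided by \(s\), a starting point of \(b\) when no `h0` is given, and a relative-change stopping rule; all three are assumptions rather than a statement about the pyDVL code, and the sketch is deterministic whereas the actual algorithm would use stochastic batch estimates of the Hessian-vector product.

```python
import math

def lissa(hvp, b, maxiter=1000, dampen=0.0, scale=10.0, h0=None, rtol=1e-4):
    """Iterate h <- b + (1 - dampen) * h - hvp(h) / scale and return h / scale."""
    h = list(b) if h0 is None else list(h0)  # assumed default start: b
    for _ in range(maxiter):
        hv = hvp(h)
        h_new = [bi + (1.0 - dampen) * hi - hvi / scale
                 for bi, hi, hvi in zip(b, h, hv)]
        # Assumed stopping rule: relative change of the iterate below rtol.
        change = math.sqrt(sum((a - c) ** 2 for a, c in zip(h_new, h)))
        norm = math.sqrt(sum(a * a for a in h_new))
        h = h_new
        if change <= rtol * norm:
            break
    return [hi / scale for hi in h]

hessian = [[4.0, 1.0], [1.0, 3.0]]

def hvp(v):
    """Exact Hessian-vector product for the toy matrix."""
    return [sum(hessian[i][j] * v[j] for j in range(2)) for i in range(2)]

x = lissa(hvp, [1.0, 2.0], rtol=1e-8)
```

With `dampen=0` the fixed point of the recursion is \(s H^{-1} b\), hence the final division by `scale`. The iteration only contracts if the eigenvalues of \(H/s\) lie in \((0, 2)\), which is the reason a scaling factor larger than the largest Hessian eigenvalue helps convergence.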
influence_factors(x, y)
Compute the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
x | model input to use in the gradient computations |
y | label tensor to compute gradients |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise inverse Hessian matrix vector products |
Source code in src/pydvl/influence/torch/influence_function_model.py
influences(x_test, y_test, x=None, y=None, mode=InfluenceMode.Up)
Compute the approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.
PARAMETER | DESCRIPTION |
---|---|
x_test | model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\) |
y_test | label tensor to compute gradients |
x | optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\); if None, use \(x=x_{\text{test}}\) |
y | optional label tensor to compute gradients |
mode | enum value of InfluenceMode |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/torch/influence_function_model.py
influences_from_factors(z_test_factors, x, y, mode=InfluenceMode.Up)
Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
z_test_factors | pre-computed tensor, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\) |
x | model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\) |
y | label tensor to compute gradients |
mode | enum value of InfluenceMode |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/torch/influence_function_model.py
ArnoldiInfluence(model, loss, hessian_regularization=0.0, rank_estimate=10, krylov_dimension=None, tol=1e-06, max_iter=None, eigen_computation_on_gpu=False)
Bases: TorchInfluenceFunctionModel
Solves the linear system Hx = b, where H is the Hessian of the model's loss function and b is the given right-hand side vector. It employs the [implicitly restarted Arnoldi method](https://en.wikipedia.org/wiki/Arnoldi_iteration) to compute a partial eigendecomposition, which is used for the inversion, i.e.
\[ x = V D^{-1} V^T b, \]
where \(D\) is a diagonal matrix with the top (in absolute value) rank_estimate eigenvalues of the Hessian and \(V\) contains the corresponding eigenvectors.
For more information, see Arnoldi.
PARAMETER | DESCRIPTION |
---|---|
model | Instance of torch.nn.Module. The Hessian will be calculated with respect to this model's parameters. |
hessian_regularization | Optional regularization parameter added to the Hessian-vector product for numerical stability. |
rank_estimate | The number of eigenvalues and corresponding eigenvectors to compute. Represents the desired rank of the Hessian approximation. |
krylov_dimension | The number of Krylov vectors to use for the Lanczos method. Defaults to min(model's number of parameters, max(2 times rank_estimate + 1, 20)). |
tol | The stopping criteria for the Lanczos algorithm. Ignored if |
max_iter | The maximum number of iterations for the Lanczos method. Ignored if |
eigen_computation_on_gpu | If True, tries to execute the eigen pair approximation on the model's device via a cupy implementation. Ensure the model size or rank_estimate is appropriate for device memory. If False, the eigen pair approximation is executed on the CPU by the scipy wrapper to ARPACK. |
Source code in src/pydvl/influence/torch/influence_function_model.py
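The low-rank inversion \(V D^{-1} V^T b\) can be illustrated on a small symmetric matrix. In the sketch below, plain power iteration with deflation stands in for the implicitly restarted Arnoldi method (it is a much simpler technique, chosen only to keep the example dependency-free), and the function names `top_eigenpairs` and `low_rank_solve` are hypothetical.

```python
import math

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def top_eigenpairs(h, k, iters=500):
    """Top-k eigenpairs (by absolute value) of a symmetric matrix.

    Power iteration with deflation; a simple stand-in for Arnoldi/Lanczos.
    """
    n = len(h)
    a = [row[:] for row in h]
    values, vectors = [], []
    for idx in range(k):
        v = [1.0 + 0.1 * (i + idx) for i in range(n)]  # arbitrary start vector
        for _ in range(iters):
            w = matvec(a, v)
            norm = math.sqrt(sum(wi * wi for wi in w))
            v = [wi / norm for wi in w]
        lam = sum(vi * wi for vi, wi in zip(v, matvec(a, v)))  # Rayleigh quotient
        values.append(lam)
        vectors.append(v)
        # Deflate: subtract the eigen-component that was just found.
        a = [[a[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return values, vectors

def low_rank_solve(values, vectors, b):
    """Apply V D^{-1} V^T to b, one eigenpair at a time."""
    x = [0.0] * len(b)
    for lam, v in zip(values, vectors):
        coeff = sum(vi * bi for vi, bi in zip(v, b)) / lam
        x = [xi + coeff * vi for xi, vi in zip(x, v)]
    return x

h = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_full = low_rank_solve(*top_eigenpairs(h, k=2), b)   # full rank: matches H^{-1} b
x_rank1 = low_rank_solve(*top_eigenpairs(h, k=1), b)  # rank 1: coarse approximation
```

Keeping only the largest eigenvalues discards the directions in which \(H^{-1}\) acts most strongly (the small eigenvalues), so a truncated rank gives an approximation whose quality depends on how much of \(b\) lies in the retained eigenspace.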
influence_factors(x, y)
Compute the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
x | model input to use in the gradient computations |
y | label tensor to compute gradients |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise inverse Hessian matrix vector products |
Source code in src/pydvl/influence/torch/influence_function_model.py
influences(x_test, y_test, x=None, y=None, mode=InfluenceMode.Up)
Compute the approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.
PARAMETER | DESCRIPTION |
---|---|
x_test | model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\) |
y_test | label tensor to compute gradients |
x | optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\); if None, use \(x=x_{\text{test}}\) |
y | optional label tensor to compute gradients |
mode | enum value of InfluenceMode |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/torch/influence_function_model.py
influences_from_factors(z_test_factors, x, y, mode=InfluenceMode.Up)
Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
z_test_factors | pre-computed tensor, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\) |
x | model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\) |
y | label tensor to compute gradients |
mode | enum value of InfluenceMode |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/torch/influence_function_model.py
fit(data)
Fitting corresponds to the computation of the low rank decomposition of the Hessian defined by the provided data loader.
PARAMETER | DESCRIPTION |
---|---|
data | Instance of torch.utils.data.DataLoader |
RETURNS | DESCRIPTION |
---|---|
ArnoldiInfluence | The fitted instance |
Source code in src/pydvl/influence/torch/influence_function_model.py
Created: 2023-12-21