pydvl.influence.torch.influence_function_model
This module provides several implementations of InfluenceFunctionModel utilizing PyTorch.
TorchInfluenceFunctionModel
Bases: InfluenceFunctionModel[Tensor, DataLoader], ABC
Abstract base class for influence computations on PyTorch models.
Source code in src/pydvl/influence/torch/influence_function_model.py
is_fitted
abstractmethod
property
Override this to expose the fitting status of the instance.
fit
abstractmethod
fit(data: DataLoaderType) -> InfluenceFunctionModel
Override this method to fit the influence function model to training data, e.g. to pre-compute the Hessian matrix or matrix decompositions.
PARAMETER | DESCRIPTION |
---|---|
data | Training data to fit to. TYPE: DataLoaderType |
RETURNS | DESCRIPTION |
---|---|
InfluenceFunctionModel | The fitted instance |
Source code in src/pydvl/influence/base_influence_function_model.py
fit_required
staticmethod
Decorator to enforce the fitted check
Source code in src/pydvl/influence/base_influence_function_model.py
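A decorator that enforces a fitted check can be sketched in a few lines. This is a minimal illustration, not the library's implementation: `NotFittedError` and `ToyModel` are hypothetical names introduced here, and the only thing assumed about the model is that it exposes an `is_fitted` property as described above.

```python
import functools


class NotFittedError(RuntimeError):
    """Hypothetical error type, raised when a method is called before fit."""


def fit_required(method):
    """Wrap a method so that it fails early on an unfitted instance."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if not self.is_fitted:
            raise NotFittedError(
                f"Call fit() before calling {method.__name__}()."
            )
        return method(self, *args, **kwargs)

    return wrapper


class ToyModel:
    """Hypothetical stand-in for an influence function model."""

    def __init__(self):
        self._scale = None

    @property
    def is_fitted(self) -> bool:
        return self._scale is not None

    def fit(self, data):
        # "Fitting" here just stores the largest value seen in the data.
        self._scale = max(data)
        return self

    @fit_required
    def transform(self, value: float) -> float:
        return value / self._scale
```

Calling `ToyModel().transform(2.0)` raises `NotFittedError`, while the same call after `fit` succeeds.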
influences
influences(
x_test: Tensor,
y_test: Tensor,
x: Optional[Tensor] = None,
y: Optional[Tensor] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> Tensor
Compute the approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. All input tensors are assumed to have the batch dimension as their first dimension (to provide a single sample z without a batch dimension, call z.unsqueeze(0)).
PARAMETER | DESCRIPTION |
---|---|
x_test | model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: Tensor |
y_test | label tensor to compute gradients. TYPE: Tensor |
x | optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\); if None, use \(x=x_{\text{test}}\). TYPE: Optional[Tensor] DEFAULT: None |
y | optional label tensor to compute gradients. TYPE: Optional[Tensor] DEFAULT: None |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/torch/influence_function_model.py
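To make the scalar product behind the up-weighting influence concrete, here is a small sketch in plain Python. It assumes a two-parameter model whose Hessian and per-sample gradients are already known; the helper names and the output layout (one row per test sample, one column per training sample) are choices made for this illustration only.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def solve_2x2(h: Matrix, b: Vector) -> Vector:
    """Solve h @ x = b for a 2x2 matrix using Cramer's rule."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [
        (b[0] * h[1][1] - h[0][1] * b[1]) / det,
        (h[0][0] * b[1] - b[0] * h[1][0]) / det,
    ]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def up_influences(
    h: Matrix, test_grads: List[Vector], train_grads: List[Vector]
) -> List[List[float]]:
    """Scalar products <H^{-1} g_test, g_train> for every test/train pair."""
    factors = [solve_2x2(h, g) for g in test_grads]
    return [[dot(f, g) for g in train_grads] for f in factors]


hessian = [[2.0, 0.0], [0.0, 4.0]]
test_grads = [[2.0, 4.0]]  # one test sample
train_grads = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three training samples
values = up_influences(hessian, test_grads, train_grads)
```

Here the inverse-Hessian-vector product of the single test gradient is `[1.0, 1.0]`, so each influence value is simply the sum of the corresponding training gradient's entries.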
influence_factors
Compute the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\). All input tensors are assumed to have the batch dimension as their first dimension (to provide a single sample z without a batch dimension, call z.unsqueeze(0)).
PARAMETER | DESCRIPTION |
---|---|
x | model input to use in the gradient computations. TYPE: Tensor |
y | label tensor to compute gradients. TYPE: Tensor |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise inverse Hessian matrix vector products |
Source code in src/pydvl/influence/torch/influence_function_model.py
influences_from_factors
influences_from_factors(
z_test_factors: Tensor,
x: Tensor,
y: Tensor,
mode: InfluenceMode = InfluenceMode.Up,
) -> Tensor
Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\). All input tensors are assumed to have the batch dimension as their first dimension (to provide a single sample z without a batch dimension, call z.unsqueeze(0)).
PARAMETER | DESCRIPTION |
---|---|
z_test_factors | pre-computed tensor, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: Tensor |
x | model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\). TYPE: Tensor |
y | label tensor to compute gradients. TYPE: Tensor |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
Tensor | Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/torch/influence_function_model.py
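The point of splitting the computation into factors and scalar products is that the inverse-Hessian-vector products for the test samples can be computed once and reused for several training batches. The sketch below illustrates this in plain Python under a strong simplifying assumption, namely a diagonal Hessian; `factors_for` and `influences_from_factors` are illustrative helpers, not the library methods.

```python
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def factors_for(hessian_diag: Vector, test_grads: List[Vector]) -> List[Vector]:
    """H^{-1} g for each test gradient g, with H diagonal (assumption)."""
    return [[g / h for g, h in zip(grad, hessian_diag)] for grad in test_grads]


def influences_from_factors(
    factors: List[Vector], train_grads: List[Vector]
) -> List[List[float]]:
    """Scalar product of every pre-computed factor with every train gradient."""
    return [[dot(f, g) for g in train_grads] for f in factors]


hessian_diag = [2.0, 4.0]
test_grads = [[2.0, 4.0], [4.0, 4.0]]

# Expensive step, done once.
factors = factors_for(hessian_diag, test_grads)

# Cheap step, repeated per training batch.
batch_a = [[1.0, 0.0]]
batch_b = [[0.0, 1.0], [1.0, 1.0]]
values_a = influences_from_factors(factors, batch_a)
values_b = influences_from_factors(factors, batch_b)
```

Only the second step touches the training data, so its cost grows with the number of training batches while the solve is paid for once.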
DirectInfluence
DirectInfluence(
model: Module,
loss: LossType,
regularization: Optional[Union[float, Dict[str, Optional[float]]]] = None,
block_structure: Union[
BlockMode, OrderedDict[str, List[str]]
] = BlockMode.FULL,
second_order_mode: SecondOrderMode = SecondOrderMode.HESSIAN,
)
Bases: TorchComposableInfluence[DirectSolveOperator]
Given a model and training data, it finds x such that \(Hx = b\), with \(H\) being the model Hessian or Gauss-Newton matrix.
PARAMETER | DESCRIPTION |
---|---|
model | The model. TYPE: Module |
loss | The loss function. TYPE: LossType |
regularization | The regularization parameter. If a dictionary is provided, the keys must be a subset of the block identifiers. TYPE: Optional[Union[float, Dict[str, Optional[float]]]] DEFAULT: None |
block_structure | The blocking structure, either a pre-defined enum or a custom block structure; see the information regarding block-diagonal approximation. TYPE: Union[BlockMode, OrderedDict[str, List[str]]] DEFAULT: BlockMode.FULL |
second_order_mode | The second-order mode, i.e. whether \(H\) is the Hessian or the Gauss-Newton matrix. TYPE: SecondOrderMode DEFAULT: SecondOrderMode.HESSIAN |
Source code in src/pydvl/influence/torch/influence_function_model.py
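As a rough picture of what a direct solve with regularization involves, the following sketch solves \((H + \lambda I)x = b\) by Gaussian elimination in plain Python. It assumes that the regularization enters as a multiple of the identity added to \(H\); `solve_regularized` is a hypothetical helper and says nothing about how the library performs the solve internally.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def solve_regularized(h: Matrix, b: Vector, reg: float = 0.0) -> Vector:
    """Solve (h + reg * I) x = b by Gaussian elimination with pivoting."""
    n = len(b)
    # Augmented matrix [h + reg * I | b].
    a = [
        [h[i][j] + (reg if i == j else 0.0) for j in range(n)] + [b[i]]
        for i in range(n)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("Matrix is singular; consider regularization.")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= factor * a[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (a[i][n] - known) / a[i][i]
    return x


singular_hessian = [[1.0, 0.0], [0.0, 0.0]]
rhs = [2.0, 3.0]
solution = solve_regularized(singular_hessian, rhs, reg=1.0)
```

The example Hessian is singular, so the unregularized solve fails, while a regularization of 1.0 turns the system into one with the unique solution `[1.0, 3.0]`.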
fit
fit(data: DataLoaderType) -> InfluenceFunctionModel
Fits the model to the provided data by internally creating a block mapper instance from it.
PARAMETER | DESCRIPTION |
---|---|
data | iterable of tensors. TYPE: DataLoaderType |
RETURNS | DESCRIPTION |
---|---|
InfluenceFunctionModel | Fitted instance |
Source code in src/pydvl/influence/base_influence_function_model.py
fit_required
staticmethod
Decorator to enforce the fitted check
Source code in src/pydvl/influence/base_influence_function_model.py
influence_factors
Computes the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\). All input tensors are assumed to have the batch dimension as their first dimension.
PARAMETER | DESCRIPTION |
---|---|
x | model input to use in the gradient computations. TYPE: TensorType |
y | label tensor to compute gradients. TYPE: TensorType |
RETURNS | DESCRIPTION |
---|---|
TensorType | Tensor representing the element-wise inverse Hessian matrix vector products |
Source code in src/pydvl/influence/base_influence_function_model.py
influences
influences(
x_test: TensorType,
y_test: TensorType,
x: Optional[TensorType] = None,
y: Optional[TensorType] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> TensorType
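Computes the approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.
PARAMETER | DESCRIPTION |
---|---|
x_test | model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: TensorType |
y_test | label tensor to compute gradients. TYPE: TensorType |
x | optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\); if None, use \(x=x_{\text{test}}\). TYPE: Optional[TensorType] DEFAULT: None |
y | optional label tensor to compute gradients. TYPE: Optional[TensorType] DEFAULT: None |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
TensorType | Tensor representing the element-wise scalar products for the provided batch |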
Source code in src/pydvl/influence/base_influence_function_model.py
influences_from_factors
influences_from_factors(
z_test_factors: TensorType,
x: TensorType,
y: TensorType,
mode: InfluenceMode = InfluenceMode.Up,
) -> TensorType
Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
z_test_factors | pre-computed array, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: TensorType |
x | model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\). TYPE: TensorType |
y | label tensor to compute gradients. TYPE: TensorType |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
TensorType | Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/base_influence_function_model.py
influences_by_block
influences_by_block(
x_test: TensorType,
y_test: TensorType,
x: Optional[TensorType] = None,
y: Optional[TensorType] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> OrderedDict[str, TensorType]
Compute the block-wise influence values for the provided data, i.e. an approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.
PARAMETER | DESCRIPTION |
---|---|
x_test | model input to use in the gradient computations of the approximation of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: TensorType |
y_test | label tensor to compute gradients. TYPE: TensorType |
x | optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\); if None, use \(x=x_{\text{test}}\). TYPE: Optional[TensorType] DEFAULT: None |
y | optional label tensor to compute gradients. TYPE: Optional[TensorType] DEFAULT: None |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType] | Ordered dictionary of tensors representing the element-wise scalar products for the provided batch per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
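Under a block-diagonal approximation, the inverse Hessian acts on each parameter block separately, so the scalar product splits into one term per block. The sketch below shows this for a single test and a single training gradient in plain Python. For brevity it assumes each block's Hessian is diagonal; the block names and the helper `influences_by_block` are made up for the illustration.

```python
from collections import OrderedDict
from typing import Dict, List

Vector = List[float]


def influences_by_block(
    block_hessian_diags: "OrderedDict[str, Vector]",
    test_grad: Dict[str, Vector],
    train_grad: Dict[str, Vector],
) -> "OrderedDict[str, float]":
    """Per-block scalar products <H_b^{-1} g_test_b, g_train_b>."""
    result: "OrderedDict[str, float]" = OrderedDict()
    for name, diag in block_hessian_diags.items():
        # Inverse of a diagonal block is an element-wise division.
        factor = [g / h for g, h in zip(test_grad[name], diag)]
        result[name] = sum(f * g for f, g in zip(factor, train_grad[name]))
    return result


hessian_blocks = OrderedDict([("layer1", [2.0, 4.0]), ("layer2", [5.0])])
test_grad = {"layer1": [2.0, 4.0], "layer2": [10.0]}
train_grad = {"layer1": [1.0, 1.0], "layer2": [3.0]}

block_values = influences_by_block(hessian_blocks, test_grad, train_grad)
total = sum(block_values.values())
```

Summing the per-block values gives the scalar product for the full block-diagonal matrix, which is why the per-block view can help to see which part of a model contributes most to an influence value.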
influence_factors_by_block
influence_factors_by_block(
x: TensorType, y: TensorType
) -> OrderedDict[str, TensorType]
Compute the block-wise approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
x | model input to use in the gradient computations. TYPE: TensorType |
y | label tensor to compute gradients. TYPE: TensorType |
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType] | Ordered dictionary of tensors representing the element-wise approximate inverse Hessian matrix vector products per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
influences_from_factors_by_block
influences_from_factors_by_block(
z_test_factors: OrderedDict[str, TensorType],
x: TensorType,
y: TensorType,
mode: InfluenceMode = InfluenceMode.Up,
) -> OrderedDict[str, TensorType]
Block-wise computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
z_test_factors | pre-computed array, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: OrderedDict[str, TensorType] |
x | model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\). TYPE: TensorType |
y | label tensor to compute gradients. TYPE: TensorType |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType] | Ordered dictionary of tensors representing the element-wise scalar products for the provided batch per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
with_regularization
with_regularization(
regularization: Union[float, Dict[str, Optional[float]]]
) -> TorchComposableInfluence
Update the regularization parameter.
PARAMETER | DESCRIPTION |
---|---|
regularization | Either a positive float or a dictionary with the block names as keys and the regularization values as values. TYPE: Union[float, Dict[str, Optional[float]]] |
RETURNS | DESCRIPTION |
---|---|
TorchComposableInfluence | The modified instance |
Source code in src/pydvl/influence/torch/influence_function_model.py
CgInfluence
CgInfluence(
model: Module,
loss: Callable[[Tensor, Tensor], Tensor],
regularization: Optional[Union[float, Dict[str, Optional[float]]]] = None,
rtol: float = 0.0001,
atol: float = 1e-06,
maxiter: Optional[int] = None,
progress: bool = False,
precompute_grad: bool = False,
preconditioner: Optional[Preconditioner] = None,
solve_simultaneously: bool = False,
warn_on_max_iteration: bool = True,
block_structure: Union[
BlockMode, OrderedDict[str, List[str]]
] = BlockMode.FULL,
second_order_mode: SecondOrderMode = SecondOrderMode.HESSIAN,
)
Bases: TorchComposableInfluence[CgOperator]
Given a model and training data, it uses conjugate gradient to calculate the inverse Hessian-vector product. More precisely, it finds x such that \(Hx = b\), with \(H\) being the model Hessian. For more info, see Conjugate Gradient.
PARAMETER | DESCRIPTION |
---|---|
model | A PyTorch model. The Hessian will be calculated with respect to this model's parameters. TYPE: Module |
loss | A callable that takes the model's output and target as input and returns the scalar loss. TYPE: Callable[[Tensor, Tensor], Tensor] |
regularization | Optional regularization parameter added to the Hessian-vector product for numerical stability. TYPE: Optional[Union[float, Dict[str, Optional[float]]]] DEFAULT: None |
rtol | Maximum relative tolerance of result. TYPE: float DEFAULT: 0.0001 |
atol | Absolute tolerance of result. TYPE: float DEFAULT: 1e-06 |
maxiter | Maximum number of iterations. If None, defaults to 10*len(b). TYPE: Optional[int] DEFAULT: None |
progress | If True, display progress bars when the right-hand sides are not solved simultaneously (solve_simultaneously=False). TYPE: bool DEFAULT: False |
preconditioner | Optional preconditioner to improve convergence of the conjugate gradient method. TYPE: Optional[Preconditioner] DEFAULT: None |
solve_simultaneously | If True, use a variant of the conjugate gradient method to simultaneously solve for several right-hand sides. TYPE: bool DEFAULT: False |
warn_on_max_iteration | If True, logs a warning if the desired tolerance is not achieved within the maximum number of iterations. TYPE: bool DEFAULT: True |
block_structure | The blocking structure, either a pre-defined enum or a custom block structure; see the information regarding block-diagonal approximation. TYPE: Union[BlockMode, OrderedDict[str, List[str]]] DEFAULT: BlockMode.FULL |
second_order_mode | The second-order mode, i.e. whether \(H\) is the Hessian or the Gauss-Newton matrix. TYPE: SecondOrderMode DEFAULT: SecondOrderMode.HESSIAN |
Source code in src/pydvl/influence/torch/influence_function_model.py
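For orientation, the textbook conjugate gradient iteration for \(Hx = b\) can be sketched in plain Python as follows. It only needs a function computing Hessian-vector products, which is what makes the method attractive for large models. The stopping rule (residual norm below `max(atol, rtol * ||b||)`) is an assumption made for this sketch; the default of `10 * len(b)` iterations follows the `maxiter` description above. Preconditioning and simultaneous solves are left out.

```python
import math
from typing import Callable, List, Optional

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(
    matvec: Callable[[Vector], Vector],
    b: Vector,
    rtol: float = 1e-4,
    atol: float = 1e-6,
    maxiter: Optional[int] = None,
) -> Vector:
    """Solve H x = b for symmetric positive definite H, given v -> H v."""
    n = len(b)
    if maxiter is None:
        maxiter = 10 * n
    x = [0.0] * n
    residual = list(b)  # b - H x for the initial guess x = 0
    direction = list(b)
    rs_old = dot(residual, residual)
    tolerance = max(atol, rtol * math.sqrt(dot(b, b)))
    for _ in range(maxiter):
        if math.sqrt(rs_old) <= tolerance:
            break
        h_dir = matvec(direction)
        step = rs_old / dot(direction, h_dir)
        x = [xi + step * di for xi, di in zip(x, direction)]
        residual = [ri - step * hi for ri, hi in zip(residual, h_dir)]
        rs_new = dot(residual, residual)
        direction = [
            ri + (rs_new / rs_old) * di for ri, di in zip(residual, direction)
        ]
        rs_old = rs_new
    return x


hessian = [[4.0, 1.0], [1.0, 3.0]]


def hvp(v: Vector) -> Vector:
    return [dot(row, v) for row in hessian]


solution = conjugate_gradient(hvp, [1.0, 2.0])
```

For this 2x2 system the iteration reaches the exact solution \((1/11, 7/11)\) after two steps, up to floating point error.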
fit
fit(data: DataLoaderType) -> InfluenceFunctionModel
Fits the model to the provided data by internally creating a block mapper instance from it.
PARAMETER | DESCRIPTION |
---|---|
data | iterable of tensors. TYPE: DataLoaderType |
RETURNS | DESCRIPTION |
---|---|
InfluenceFunctionModel | Fitted instance |
Source code in src/pydvl/influence/base_influence_function_model.py
fit_required
staticmethod
Decorator to enforce the fitted check
Source code in src/pydvl/influence/base_influence_function_model.py
influence_factors
Computes the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\). All input tensors are assumed to have the batch dimension as their first dimension.
PARAMETER | DESCRIPTION |
---|---|
x | model input to use in the gradient computations. TYPE: TensorType |
y | label tensor to compute gradients. TYPE: TensorType |
RETURNS | DESCRIPTION |
---|---|
TensorType | Tensor representing the element-wise inverse Hessian matrix vector products |
Source code in src/pydvl/influence/base_influence_function_model.py
influences
influences(
x_test: TensorType,
y_test: TensorType,
x: Optional[TensorType] = None,
y: Optional[TensorType] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> TensorType
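Computes the approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.
PARAMETER | DESCRIPTION |
---|---|
x_test | model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: TensorType |
y_test | label tensor to compute gradients. TYPE: TensorType |
x | optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\); if None, use \(x=x_{\text{test}}\). TYPE: Optional[TensorType] DEFAULT: None |
y | optional label tensor to compute gradients. TYPE: Optional[TensorType] DEFAULT: None |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
TensorType | Tensor representing the element-wise scalar products for the provided batch |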
Source code in src/pydvl/influence/base_influence_function_model.py
influences_from_factors
influences_from_factors(
z_test_factors: TensorType,
x: TensorType,
y: TensorType,
mode: InfluenceMode = InfluenceMode.Up,
) -> TensorType
Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
z_test_factors | pre-computed array, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: TensorType |
x | model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\). TYPE: TensorType |
y | label tensor to compute gradients. TYPE: TensorType |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
TensorType | Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/base_influence_function_model.py
influences_by_block
influences_by_block(
x_test: TensorType,
y_test: TensorType,
x: Optional[TensorType] = None,
y: Optional[TensorType] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> OrderedDict[str, TensorType]
Compute the block-wise influence values for the provided data, i.e. an approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.
PARAMETER | DESCRIPTION |
---|---|
x_test | model input to use in the gradient computations of the approximation of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: TensorType |
y_test | label tensor to compute gradients. TYPE: TensorType |
x | optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\); if None, use \(x=x_{\text{test}}\). TYPE: Optional[TensorType] DEFAULT: None |
y | optional label tensor to compute gradients. TYPE: Optional[TensorType] DEFAULT: None |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType] | Ordered dictionary of tensors representing the element-wise scalar products for the provided batch per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
influence_factors_by_block
influence_factors_by_block(
x: TensorType, y: TensorType
) -> OrderedDict[str, TensorType]
Compute the block-wise approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
x | model input to use in the gradient computations. TYPE: TensorType |
y | label tensor to compute gradients. TYPE: TensorType |
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType] | Ordered dictionary of tensors representing the element-wise approximate inverse Hessian matrix vector products per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
influences_from_factors_by_block
influences_from_factors_by_block(
z_test_factors: OrderedDict[str, TensorType],
x: TensorType,
y: TensorType,
mode: InfluenceMode = InfluenceMode.Up,
) -> OrderedDict[str, TensorType]
Block-wise computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
z_test_factors | pre-computed array, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: OrderedDict[str, TensorType] |
x | model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\). TYPE: TensorType |
y | label tensor to compute gradients. TYPE: TensorType |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType] | Ordered dictionary of tensors representing the element-wise scalar products for the provided batch per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
with_regularization
with_regularization(
regularization: Union[float, Dict[str, Optional[float]]]
) -> TorchComposableInfluence
Update the regularization parameter.
PARAMETER | DESCRIPTION |
---|---|
regularization | Either a positive float or a dictionary with the block names as keys and the regularization values as values. TYPE: Union[float, Dict[str, Optional[float]]] |
RETURNS | DESCRIPTION |
---|---|
TorchComposableInfluence | The modified instance |
Source code in src/pydvl/influence/torch/influence_function_model.py
LissaInfluence
LissaInfluence(
model: Module,
loss: Callable[[Tensor, Tensor], Tensor],
regularization: Optional[Union[float, Dict[str, Optional[float]]]] = None,
maxiter: int = 1000,
dampen: float = 0.0,
scale: float = 10.0,
rtol: float = 0.0001,
progress: bool = False,
warn_on_max_iteration: bool = True,
block_structure: Union[
BlockMode, OrderedDict[str, List[str]]
] = BlockMode.FULL,
second_order_mode: SecondOrderMode = SecondOrderMode.HESSIAN,
)
Bases: TorchComposableInfluence[LissaOperator[BatchOperationType]]
Uses LiSSA (Linear time Stochastic Second-Order Algorithm) to iteratively approximate the inverse Hessian. More precisely, it finds x s.t. \(Hx = b\), with \(H\) being the model's second derivative with respect to the parameters. This is done with the update
\[ H^{-1}_{j+1} b = b + (I - d)\, H^{-1}_j b - \frac{H\, H^{-1}_j b}{s}, \]
where \(I\) is the identity matrix, \(d\) is a dampening term and \(s\) a scaling factor that are applied to help convergence. For details, see Linear time Stochastic Second-Order Approximation (LiSSA).
PARAMETER | DESCRIPTION |
---|---|
model | A PyTorch model. The Hessian will be calculated with respect to this model's parameters. TYPE: Module |
loss | A callable that takes the model's output and target as input and returns the scalar loss. TYPE: Callable[[Tensor, Tensor], Tensor] |
regularization | Optional regularization parameter added to the Hessian-vector product for numerical stability. TYPE: Optional[Union[float, Dict[str, Optional[float]]]] DEFAULT: None |
maxiter | Maximum number of iterations. TYPE: int DEFAULT: 1000 |
dampen | Dampening factor, defaults to 0 for no dampening. TYPE: float DEFAULT: 0.0 |
scale | Scaling factor, defaults to 10. TYPE: float DEFAULT: 10.0 |
rtol | Tolerance to use for early stopping. TYPE: float DEFAULT: 0.0001 |
progress | If True, display progress bars. TYPE: bool DEFAULT: False |
warn_on_max_iteration | If True, logs a warning if the desired tolerance is not achieved within the maximum number of iterations. TYPE: bool DEFAULT: True |
block_structure | The blocking structure, either a pre-defined enum or a custom block structure; see the information regarding block-diagonal approximation. TYPE: Union[BlockMode, OrderedDict[str, List[str]]] DEFAULT: BlockMode.FULL |
second_order_mode | The second-order mode, i.e. whether \(H\) is the Hessian or the Gauss-Newton matrix. TYPE: SecondOrderMode DEFAULT: SecondOrderMode.HESSIAN |
Source code in src/pydvl/influence/torch/influence_function_model.py
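A deterministic toy version of a LiSSA-style recursion can be written in plain Python as below. It is a sketch under stated assumptions, not the library's code: the update `estimate <- b + (1 - dampen) * estimate - H @ estimate / scale` is one common form of the recursion, the early-stopping rule based on the relative change between iterates is chosen for this example, and the full Hessian-vector product is used in every step where a stochastic variant would use a mini-batch estimate.

```python
import math
from typing import Callable, List

Vector = List[float]


def norm(v: Vector) -> float:
    return math.sqrt(sum(a * a for a in v))


def lissa_solve(
    matvec: Callable[[Vector], Vector],
    b: Vector,
    maxiter: int = 1000,
    dampen: float = 0.0,
    scale: float = 10.0,
    rtol: float = 1e-4,
) -> Vector:
    """Approximate H^{-1} b with a fixed-point iteration (dampen = 0)."""
    estimate = list(b)
    for _ in range(maxiter):
        h_est = matvec(estimate)
        updated = [
            bi + (1.0 - dampen) * ei - hi / scale
            for bi, ei, hi in zip(b, estimate, h_est)
        ]
        change = norm([u - e for u, e in zip(updated, estimate)])
        estimate = updated
        # Assumed stopping rule: relative change between iterates.
        if change <= rtol * norm(estimate):
            break
    # The fixed point solves (dampen * I + H / scale) x = b, so undo the scale.
    return [e / scale for e in estimate]


hessian = [[4.0, 1.0], [1.0, 3.0]]


def hvp(v: Vector) -> Vector:
    return [sum(h * x for h, x in zip(row, v)) for row in hessian]


approx = lissa_solve(hvp, [1.0, 2.0], rtol=1e-12)
```

The iteration only converges when the eigenvalues of `H / scale` are small enough for the update to be a contraction, which is the role of the scaling factor; with `scale=10.0` and this Hessian the result approaches \((1/11, 7/11)\).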
fit
fit(data: DataLoaderType) -> InfluenceFunctionModel
Fits the model to the provided data by internally creating a block mapper instance from it.
PARAMETER | DESCRIPTION |
---|---|
data | iterable of tensors. TYPE: DataLoaderType |
RETURNS | DESCRIPTION |
---|---|
InfluenceFunctionModel | Fitted instance |
Source code in src/pydvl/influence/base_influence_function_model.py
fit_required
staticmethod
Decorator to enforce the fitted check
Source code in src/pydvl/influence/base_influence_function_model.py
influence_factors
Computes the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\). All input tensors are assumed to have the batch dimension as their first dimension.
PARAMETER | DESCRIPTION |
---|---|
x | model input to use in the gradient computations. TYPE: TensorType |
y | label tensor to compute gradients. TYPE: TensorType |
RETURNS | DESCRIPTION |
---|---|
TensorType | Tensor representing the element-wise inverse Hessian matrix vector products |
Source code in src/pydvl/influence/base_influence_function_model.py
influences
influences(
x_test: TensorType,
y_test: TensorType,
x: Optional[TensorType] = None,
y: Optional[TensorType] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> TensorType
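Computes the approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.
PARAMETER | DESCRIPTION |
---|---|
x_test | model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: TensorType |
y_test | label tensor to compute gradients. TYPE: TensorType |
x | optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\); if None, use \(x=x_{\text{test}}\). TYPE: Optional[TensorType] DEFAULT: None |
y | optional label tensor to compute gradients. TYPE: Optional[TensorType] DEFAULT: None |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
TensorType | Tensor representing the element-wise scalar products for the provided batch |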
Source code in src/pydvl/influence/base_influence_function_model.py
influences_from_factors
influences_from_factors(
z_test_factors: TensorType,
x: TensorType,
y: TensorType,
mode: InfluenceMode = InfluenceMode.Up,
) -> TensorType
Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
z_test_factors | pre-computed array, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: TensorType |
x | model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\). TYPE: TensorType |
y | label tensor to compute gradients. TYPE: TensorType |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
TensorType | Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/base_influence_function_model.py
influences_by_block
influences_by_block(
x_test: TensorType,
y_test: TensorType,
x: Optional[TensorType] = None,
y: Optional[TensorType] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> OrderedDict[str, TensorType]
Compute the block-wise influence values for the provided data, i.e. an approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.
PARAMETER | DESCRIPTION |
---|---|
x_test | model input to use in the gradient computations of the approximation of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: TensorType |
y_test | label tensor to compute gradients. TYPE: TensorType |
x | optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\); if None, use \(x=x_{\text{test}}\). TYPE: Optional[TensorType] DEFAULT: None |
y | optional label tensor to compute gradients. TYPE: Optional[TensorType] DEFAULT: None |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType] | Ordered dictionary of tensors representing the element-wise scalar products for the provided batch per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
influence_factors_by_block
influence_factors_by_block(
x: TensorType, y: TensorType
) -> OrderedDict[str, TensorType]
Compute the block-wise approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
x | model input to use in the gradient computations. TYPE: TensorType |
y | label tensor to compute gradients. TYPE: TensorType |
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType] | Ordered dictionary of tensors representing the element-wise approximate inverse Hessian matrix vector products per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
influences_from_factors_by_block
influences_from_factors_by_block(
z_test_factors: OrderedDict[str, TensorType],
x: TensorType,
y: TensorType,
mode: InfluenceMode = InfluenceMode.Up,
) -> OrderedDict[str, TensorType]
Block-wise computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
z_test_factors | pre-computed array, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\). TYPE: OrderedDict[str, TensorType] |
x | model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\). TYPE: TensorType |
y | label tensor to compute gradients. TYPE: TensorType |
mode | enum value of InfluenceMode. TYPE: InfluenceMode DEFAULT: InfluenceMode.Up |
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType] | Ordered dictionary of tensors representing the element-wise scalar products for the provided batch per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
with_regularization
with_regularization(
regularization: Union[float, Dict[str, Optional[float]]]
) -> TorchComposableInfluence
Update the regularization parameter.
PARAMETER | DESCRIPTION |
---|---|
regularization | Either a positive float or a dictionary with the block names as keys and the regularization values as values. TYPE: Union[float, Dict[str, Optional[float]]] |
RETURNS | DESCRIPTION |
---|---|
TorchComposableInfluence | The modified instance |
Source code in src/pydvl/influence/torch/influence_function_model.py
ArnoldiInfluence
ArnoldiInfluence(
model: Module,
loss: Callable[[Tensor, Tensor], Tensor],
regularization: Optional[Union[float, Dict[str, Optional[float]]]] = None,
rank: int = 10,
krylov_dimension: Optional[int] = None,
tol: float = 1e-06,
max_iter: Optional[int] = None,
eigen_computation_on_gpu: bool = False,
block_structure: Union[
BlockMode, OrderedDict[str, List[str]]
] = BlockMode.FULL,
second_order_mode: SecondOrderMode = SecondOrderMode.HESSIAN,
use_woodbury: bool = False,
)
Bases: TorchComposableInfluence[LowRankOperator]
Solves the linear system \(Hx = b\), where \(H\) is the Hessian of the model's loss function and \(b\) is the given right-hand side vector. It employs the implicitly restarted Arnoldi method (https://en.wikipedia.org/wiki/Arnoldi_iteration) for computing a partial eigendecomposition, which is used for the inversion, i.e.
\[ x = V D^{-1} V^T b \]
where \(D\) is a diagonal matrix with the top (in absolute value) eigenvalues of the Hessian, as many as given by rank, and \(V\) contains the corresponding eigenvectors.
For more information, see Arnoldi.
PARAMETER | DESCRIPTION |
---|---|
model | A PyTorch model. The Hessian will be calculated with respect to this model's parameters. TYPE: Module |
loss | A callable that takes the model's output and target as input and returns the scalar loss. TYPE: Callable[[Tensor, Tensor], Tensor] |
regularization | The regularization parameter. If a dictionary is provided, the keys must be a subset of the block identifiers. TYPE: Optional[Union[float, Dict[str, Optional[float]]]] DEFAULT: None |
rank | The number of eigenvalues and corresponding eigenvectors to compute. Represents the desired rank of the Hessian approximation. TYPE: int DEFAULT: 10 |
krylov_dimension | The number of Krylov vectors to use for the Lanczos method. Defaults to min(model's number of parameters, max(2 times rank + 1, 20)). TYPE: Optional[int] DEFAULT: None |
tol | The stopping criterion for the Lanczos algorithm. TYPE: float DEFAULT: 1e-06 |
max_iter | The maximum number of iterations for the Lanczos method. TYPE: Optional[int] DEFAULT: None |
eigen_computation_on_gpu | If True, tries to execute the eigenpair approximation on the model's device via a cupy implementation. Ensure the model size or rank is appropriate for device memory. If False, the eigenpair approximation is executed on the CPU by the scipy wrapper to ARPACK. TYPE: bool DEFAULT: False |
use_woodbury | If True, uses the Sherman–Morrison–Woodbury formula for the computation of the inverse action, which is more precise but needs additional computation. TYPE: bool DEFAULT: False |
Source code in src/pydvl/influence/torch/influence_function_model.py
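Once a partial eigendecomposition is available, applying the low-rank inverse is a short computation. The plain-Python sketch below assumes the eigenpairs are already known (here written down by hand for a 2x2 matrix) and only illustrates the formula \(x = V D^{-1} V^T b\); it does not perform the Arnoldi iteration itself, and `low_rank_inverse_apply` is a hypothetical helper.

```python
import math
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def low_rank_inverse_apply(
    eigenvalues: List[float], eigenvectors: List[Vector], b: Vector
) -> Vector:
    """Compute V D^{-1} V^T b from a list of (eigenvalue, eigenvector) pairs."""
    x = [0.0] * len(b)
    for lam, v in zip(eigenvalues, eigenvectors):
        coeff = dot(v, b) / lam  # component of b along v, divided by lambda
        x = [xi + coeff * vi for xi, vi in zip(x, v)]
    return x


# Eigenpairs of H = [[2, 1], [1, 2]]: eigenvalue 3 with (1, 1) / sqrt(2),
# eigenvalue 1 with (1, -1) / sqrt(2).
inv_sqrt2 = 1.0 / math.sqrt(2.0)
eigenvalues = [3.0, 1.0]
eigenvectors = [[inv_sqrt2, inv_sqrt2], [inv_sqrt2, -inv_sqrt2]]
b = [1.0, 0.0]

rank_one = low_rank_inverse_apply(eigenvalues[:1], eigenvectors[:1], b)
full_rank = low_rank_inverse_apply(eigenvalues, eigenvectors, b)
```

With both eigenpairs the result equals the exact solution \((2/3, -1/3)\); keeping only the top eigenpair drops the component of `b` along the discarded eigenvector and yields \((1/6, 1/6)\), which shows what a truncated rank leaves out.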
fit
fit(data: DataLoaderType) -> InfluenceFunctionModel
Fits the model to the provided data by internally creating a block mapper instance from it.
PARAMETER | DESCRIPTION |
---|---|
data | iterable of tensors. TYPE: DataLoaderType |
RETURNS | DESCRIPTION |
---|---|
InfluenceFunctionModel | Fitted instance |
Source code in src/pydvl/influence/base_influence_function_model.py
fit_required
staticmethod
Decorator to enforce the fitted check
Source code in src/pydvl/influence/base_influence_function_model.py
influence_factors
Computes the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\). All input tensors are assumed to have the batch dimension as their first dimension.
PARAMETER | DESCRIPTION |
---|---|
x | model input to use in the gradient computations. TYPE: TensorType |
y | label tensor to compute gradients. TYPE: TensorType |
RETURNS | DESCRIPTION |
---|---|
TensorType | Tensor representing the element-wise inverse Hessian matrix vector products |
Source code in src/pydvl/influence/base_influence_function_model.py
influences
influences(
x_test: TensorType,
y_test: TensorType,
x: Optional[TensorType] = None,
y: Optional[TensorType] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> TensorType
Computes the approximation of
for the case of up-weighting influence, resp.
for the perturbation type influence case.
PARAMETER | DESCRIPTION |
---|---|
x_test |
model input to use in the gradient computations of \(H^{-1}\nabla_{theta} \ell(y_{test}, f_{\theta}(x_{test}))\)
TYPE:
|
y_test |
label tensor to compute gradients
TYPE:
|
x |
optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\), if None, use \(x=x_{\text{test}}\)
TYPE:
|
y |
optional label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
TensorType
|
Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/base_influence_function_model.py
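As a rough illustration of the up-weighting quantity, the following pure-Python sketch evaluates the scalar product of \(H^{-1}\nabla_{\theta}\ell\) at a test point with per-sample training gradients, for a toy 2×2 Hessian. The Hessian, the gradients and the helper names (`solve_2x2`, `dot`, `up_influences`) are invented for this example and are not part of pyDVL, which works on tensors and only approximates the inverse.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def solve_2x2(h: Matrix, b: Vector) -> Vector:
    """Solve h @ x = b for a 2x2 matrix via Cramer's rule."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [
        (b[0] * h[1][1] - h[0][1] * b[1]) / det,
        (h[0][0] * b[1] - b[0] * h[1][0]) / det,
    ]


def dot(u: Vector, v: Vector) -> float:
    """Euclidean scalar product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def up_influences(
    h: Matrix, test_grads: List[Vector], train_grads: List[Vector]
) -> Matrix:
    """Entry [i][j] is <H^{-1} g_test_i, g_train_j>."""
    factors = [solve_2x2(h, g) for g in test_grads]
    return [[dot(f, g) for g in train_grads] for f in factors]


# Toy data (hypothetical): one test gradient, two training gradients.
hessian = [[2.0, 0.0], [0.0, 4.0]]
test_grads = [[2.0, 4.0]]
train_grads = [[1.0, 0.0], [0.0, 1.0]]

values = up_influences(hessian, test_grads, train_grads)
print(values)
```

The result has one row per test sample and one column per training sample, which is the layout the sketch assumes for the returned values.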
influences_from_factors
¶
influences_from_factors(
z_test_factors: TensorType,
x: TensorType,
y: TensorType,
mode: InfluenceMode = InfluenceMode.Up,
) -> TensorType
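Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).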
PARAMETER | DESCRIPTION |
---|---|
z_test_factors |
pre-computed array, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
x |
model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\)
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
TensorType
|
Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/base_influence_function_model.py
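The point of splitting the computation into factors and scalar products is that the expensive inverse-Hessian step can run once and be reused. A minimal sketch of that pattern, assuming a diagonal toy Hessian so that the inverse is an element-wise division; the two functions below are stand-alone illustrations that only borrow the method names and are not the pyDVL implementation:

```python
from typing import List

Vector = List[float]


def influence_factors(hessian_diag: Vector, grads: List[Vector]) -> List[Vector]:
    """Apply the inverse of a diagonal toy Hessian to each per-sample gradient."""
    return [[g_k / h_k for g_k, h_k in zip(g, hessian_diag)] for g in grads]


def influences_from_factors(
    factors: List[Vector], train_grads: List[Vector]
) -> List[List[float]]:
    """Scalar products of every factor with every training gradient."""
    return [
        [sum(f_k * g_k for f_k, g_k in zip(f, g)) for g in train_grads]
        for f in factors
    ]


hessian_diag = [2.0, 4.0]               # hypothetical diagonal Hessian
test_grads = [[2.0, 4.0], [4.0, 0.0]]   # gradients of two test samples
train_batches = [
    [[1.0, 0.0]],                       # first training batch
    [[0.0, 1.0], [1.0, 1.0]],           # second training batch
]

# The inverse-Hessian step runs once ...
factors = influence_factors(hessian_diag, test_grads)
# ... and its result is reused for every training batch.
per_batch = [influences_from_factors(factors, batch) for batch in train_batches]
print(per_batch)
```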
influences_by_block
¶
influences_by_block(
x_test: TensorType,
y_test: TensorType,
x: Optional[TensorType] = None,
y: Optional[TensorType] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> OrderedDict[str, TensorType]
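Compute the block-wise influence values for the provided data, i.e. an approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.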
PARAMETER | DESCRIPTION |
---|---|
x_test |
model input to use in the gradient computations of the approximation of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
y_test |
label tensor to compute gradients
TYPE:
|
x |
optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\), if None, use \(x=x_{\text{test}}\)
TYPE:
|
y |
optional label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType]
|
Ordered dictionary of tensors representing the element-wise scalar products for the provided batch per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
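To make the block-wise view concrete: if the Hessian is block-diagonal, the scalar product decomposes into one term per parameter block. The sketch below assumes diagonal blocks and hypothetical block names (`layer1`, `layer2`); `block_influences` is an illustrative helper, not a pyDVL function.

```python
from collections import OrderedDict
from typing import Dict, List

Vector = List[float]


def block_influences(
    hessian_diag_blocks: Dict[str, Vector],
    test_grad_blocks: Dict[str, Vector],
    train_grad_blocks: Dict[str, Vector],
) -> "OrderedDict[str, float]":
    """Compute <H_b^{-1} g_test_b, g_train_b> separately for every block b."""
    result: "OrderedDict[str, float]" = OrderedDict()
    for name, h_diag in hessian_diag_blocks.items():
        g_test = test_grad_blocks[name]
        g_train = train_grad_blocks[name]
        # Diagonal block: the inverse is an element-wise division.
        result[name] = sum(t / h * g for t, h, g in zip(g_test, h_diag, g_train))
    return result


# Toy data (hypothetical): two parameter blocks of sizes 2 and 1.
hessian_diag_blocks = OrderedDict([("layer1", [2.0, 4.0]), ("layer2", [0.5])])
test_grad_blocks = {"layer1": [2.0, 4.0], "layer2": [1.0]}
train_grad_blocks = {"layer1": [1.0, 1.0], "layer2": [3.0]}

per_block = block_influences(hessian_diag_blocks, test_grad_blocks, train_grad_blocks)
total = sum(per_block.values())
print(per_block, total)
```

Under the block-diagonal assumption, summing the per-block values gives the influence for the full parameter vector.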
influence_factors_by_block
¶
influence_factors_by_block(
x: TensorType, y: TensorType
) -> OrderedDict[str, TensorType]
Compute the block-wise approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
x |
model input to use in the gradient computations
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType]
|
Ordered dictionary of tensors representing the element-wise approximate inverse Hessian matrix vector products per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
influences_from_factors_by_block
¶
influences_from_factors_by_block(
z_test_factors: OrderedDict[str, TensorType],
x: TensorType,
y: TensorType,
mode: InfluenceMode = InfluenceMode.Up,
) -> OrderedDict[str, TensorType]
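Block-wise computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).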
PARAMETER | DESCRIPTION |
---|---|
z_test_factors |
pre-computed array, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
x |
model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\)
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType]
|
Ordered dictionary of tensors representing the element-wise scalar products for the provided batch per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
EkfacInfluence
¶
EkfacInfluence(
model: Module,
update_diagonal: bool = False,
hessian_regularization: float = 0.0,
progress: bool = False,
)
Bases: TorchInfluenceFunctionModel
Approximately solves the linear system Hx = b, where H is the Hessian of a model with the empirical categorical cross entropy as loss function and b is the given right-hand side vector. It employs the EK-FAC method, which is based on the Kronecker factorization of the Hessian.
Contrary to the other influence function methods, this implementation can only be used for classification tasks with a cross entropy loss function. However, it is much faster than the other methods and can be used efficiently for very large datasets and models. For more information, see Eigenvalue Corrected K-FAC.
PARAMETER | DESCRIPTION |
---|---|
model |
A PyTorch model. The Hessian will be calculated with respect to this model's parameters.
TYPE:
|
update_diagonal |
If True, the diagonal values in the EK-FAC representation are refitted from the training data after calculating the KFAC blocks. This provides a more accurate approximation of the Hessian, but it is computationally more expensive.
TYPE:
|
hessian_regularization |
Regularization of the Hessian.
TYPE:
|
progress |
If True, display progress bars.
TYPE:
|
Source code in src/pydvl/influence/torch/influence_function_model.py
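The reason a Kronecker factorization makes the solve cheap can be seen on a toy layer. With column-stacking `vec`, \((A \otimes G)\operatorname{vec}(X) = \operatorname{vec}(G X A^T)\), so for symmetric \(A\) the system is solved by \(X = G^{-1} V A^{-1}\) using only the small factors. The sketch below checks this identity in pure Python; the factor names `a_factor` and `g_factor` and all values are hypothetical, and the eigenvalue correction that distinguishes EK-FAC from plain K-FAC is not shown.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def inverse_2x2(m: Matrix) -> Matrix:
    """Closed-form inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def kron(a: Matrix, g: Matrix) -> Matrix:
    """Kronecker product: entry [i*p + k][j*p + l] equals a[i][j] * g[k][l]."""
    p = len(g)
    n = len(a) * p
    return [[a[r // p][c // p] * g[r % p][c % p] for c in range(n)] for r in range(n)]


def vec(m: Matrix) -> List[float]:
    """Stack the columns of m into a single vector."""
    return [m[i][j] for j in range(len(m[0])) for i in range(len(m))]


a_factor = [[2.0, 1.0], [1.0, 2.0]]  # symmetric "activation" factor (toy values)
g_factor = [[4.0, 0.0], [0.0, 1.0]]  # symmetric "gradient" factor (toy values)
grad = [[1.0, 2.0], [3.0, 4.0]]      # right-hand side, shaped like the layer weights

# Solve with the two small factors instead of the 4x4 Kronecker product.
solution = matmul(matmul(inverse_2x2(g_factor), grad), inverse_2x2(a_factor))

# Check against the explicit Kronecker product.
big = kron(a_factor, g_factor)
flat_solution = vec(solution)
reconstructed = [sum(row[k] * flat_solution[k] for k in range(len(row))) for row in big]
residual = max(abs(r - t) for r, t in zip(reconstructed, vec(grad)))
print(residual)
```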
fit_required
staticmethod
¶
Decorator to enforce the fitted check
Source code in src/pydvl/influence/base_influence_function_model.py
influence_factors
¶
Compute the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\). For all input tensors, the first dimension is assumed to be the batch dimension (to provide a single sample z without a batch dimension, call z.unsqueeze(0)).
PARAMETER | DESCRIPTION |
---|---|
x |
model input to use in the gradient computations
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
Tensor
|
Tensor representing the element-wise inverse Hessian matrix vector products |
Source code in src/pydvl/influence/torch/influence_function_model.py
influences
¶
influences(
x_test: Tensor,
y_test: Tensor,
x: Optional[Tensor] = None,
y: Optional[Tensor] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> Tensor
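Compute the approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. For all input tensors, the first dimension is assumed to be the batch dimension (to provide a single sample z without a batch dimension, call z.unsqueeze(0)).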
PARAMETER | DESCRIPTION |
---|---|
x_test |
model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
y_test |
label tensor to compute gradients
TYPE:
|
x |
optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\), if None, use \(x=x_{\text{test}}\) |
y |
optional label tensor to compute gradients |
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
Tensor
|
Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/torch/influence_function_model.py
influences_from_factors
¶
influences_from_factors(
z_test_factors: Tensor,
x: Tensor,
y: Tensor,
mode: InfluenceMode = InfluenceMode.Up,
) -> Tensor
Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\). For all input tensors, the first dimension is assumed to be the batch dimension (to provide a single sample z without a batch dimension, call z.unsqueeze(0)).
PARAMETER | DESCRIPTION |
---|---|
z_test_factors |
pre-computed tensor, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
x |
model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\)
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
Tensor
|
Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/torch/influence_function_model.py
fit
¶
fit(data: DataLoader) -> EkfacInfluence
Compute the KFAC blocks for each layer of the model, using the provided data. It then creates an EkfacRepresentation object that stores the KFAC blocks for each layer, their eigenvalue decomposition and diagonal values.
Source code in src/pydvl/influence/torch/influence_function_model.py
influences_by_layer
¶
influences_by_layer(
x_test: Tensor,
y_test: Tensor,
x: Optional[Tensor] = None,
y: Optional[Tensor] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> Dict[str, Tensor]
Compute the influence of the data on the test data for each layer of the model.
PARAMETER | DESCRIPTION |
---|---|
x_test |
model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
y_test |
label tensor to compute gradients
TYPE:
|
x |
optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\), if None, use \(x=x_{\text{test}}\) |
y |
optional label tensor to compute gradients |
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
Dict[str, Tensor]
|
A dictionary containing the influence of the data on the test data for each layer of the model, with the layer name as key. |
Source code in src/pydvl/influence/torch/influence_function_model.py
influence_factors_by_layer
¶
Computes the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
for each layer of the model separately.
PARAMETER | DESCRIPTION |
---|---|
x |
model input to use in the gradient computations
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
Dict[str, Tensor]
|
A dictionary containing the influence factors for each layer of the model, with the layer name as key. |
Source code in src/pydvl/influence/torch/influence_function_model.py
influences_from_factors_by_layer
¶
influences_from_factors_by_layer(
z_test_factors: Dict[str, Tensor],
x: Tensor,
y: Tensor,
mode: InfluenceMode = InfluenceMode.Up,
) -> Dict[str, Tensor]
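Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case for each layer of the model separately. The gradients are meant to be per sample of the batch \((x, y)\).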
PARAMETER | DESCRIPTION |
---|---|
z_test_factors |
pre-computed tensor, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\) |
x |
model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\)
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
Dict[str, Tensor]
|
A dictionary containing the influence of the data on the test data for each layer of the model, with the layer name as key. |
Source code in src/pydvl/influence/torch/influence_function_model.py
explore_hessian_regularization
¶
explore_hessian_regularization(
x: Tensor, y: Tensor, regularization_values: List[float]
) -> Dict[float, Dict[str, Tensor]]
Efficiently computes the influence for input x and label y for each layer of the model, for different values of the Hessian regularization parameter. This is done by computing the gradient of the loss function for the input x and label y only once and then computing the inverse Hessian vector product for each regularization value. This is useful for finding the optimal regularization value and for exploring how robust the influence values are to changes in the regularization value.
PARAMETER | DESCRIPTION |
---|---|
x |
model input to use in the gradient computations
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
regularization_values |
list of regularization values to use |
RETURNS | DESCRIPTION |
---|---|
Dict[float, Dict[str, Tensor]]
|
A dictionary with the regularization values as keys and, as values, dictionaries containing the influences for each layer of the model, with the layer name as key. |
Source code in src/pydvl/influence/torch/influence_function_model.py
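Sweeping the regularization is cheap once an eigendecomposition is available, because only the eigenvalue shift changes between runs. The following sketch assumes a toy Hessian \(H = Q \operatorname{diag}(\lambda) Q^T\) with known eigenpairs and a single fixed gradient; `regularized_solve` and all values are hypothetical, and how pyDVL reuses its stored decomposition internally is not shown here.

```python
import math
from typing import Dict, List

Vector = List[float]


def regularized_solve(
    eigvals: Vector, eigvecs: List[Vector], grad: Vector, reg: float
) -> Vector:
    """Return (H + reg * I)^{-1} grad, with H = sum_k eigvals[k] * q_k q_k^T."""
    out = [0.0] * len(grad)
    for lam, q in zip(eigvals, eigvecs):
        # Project onto the eigenvector and rescale by the shifted eigenvalue.
        coeff = sum(q_i * g_i for q_i, g_i in zip(q, grad)) / (lam + reg)
        out = [o + coeff * q_i for o, q_i in zip(out, q)]
    return out


s = 1.0 / math.sqrt(2.0)
eigvals = [3.0, 1.0]
eigvecs = [[s, s], [s, -s]]  # eigenpairs of the toy Hessian [[2, 1], [1, 2]]
grad = [1.0, 0.0]            # computed once, reused for every value below

regularization_values = [0.0, 0.1, 1.0]
results: Dict[float, Vector] = {
    reg: regularized_solve(eigvals, eigvecs, grad, reg)
    for reg in regularization_values
}
for reg, x in results.items():
    print(reg, x)
```

Comparing the entries of `results` across keys shows how sensitive the solution is to the regularization value.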
NystroemSketchInfluence
¶
NystroemSketchInfluence(
model: Module,
loss: Callable[[Tensor, Tensor], Tensor],
regularization: Union[float, Dict[str, float]],
rank: int,
block_structure: Union[
BlockMode, OrderedDict[str, List[str]]
] = BlockMode.FULL,
second_order_mode: SecondOrderMode = SecondOrderMode.HESSIAN,
)
Bases: TorchComposableInfluence[LowRankOperator]
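Given a model and training data, it uses a low-rank approximation of the Hessian (derived via random projection Nyström approximation) in combination with the Sherman–Morrison–Woodbury formula to calculate the inverse Hessian vector product. More concretely, it computes a low-rank approximation
\[ H_{\text{nys}} = (H\Omega)(\Omega^T H \Omega)^{\dagger}(H\Omega)^T = U \Lambda U^T \]
in factorized form and approximates the action of the inverse Hessian via
\[ (H_{\text{nys}} + \lambda I)^{-1} = U(\Lambda + \lambda I)^{-1}U^T + \frac{1}{\lambda}(I - UU^T). \]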
PARAMETER | DESCRIPTION |
---|---|
model |
A PyTorch model. The Hessian will be calculated with respect to this model's parameters.
TYPE:
|
loss |
A callable that takes the model's output and target as input and returns the scalar loss. |
regularization |
Optional regularization parameter added to the Hessian-vector product for numerical stability. |
rank |
rank of the low-rank approximation
TYPE:
|
Source code in src/pydvl/influence/torch/influence_function_model.py
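For a factorization \(U \Lambda U^T\) with orthonormal columns in \(U\), the regularized inverse can be applied without ever forming a full matrix, via the standard identity \((U \Lambda U^T + \lambda I)^{-1} b = U(\Lambda + \lambda I)^{-1}U^T b + \frac{1}{\lambda}(b - UU^T b)\). The sketch below applies this identity to a rank-1 toy example and checks the result. It assumes the low-rank factors are already given, and `low_rank_inverse_apply` and `regularized_apply` are hypothetical helpers rather than pyDVL functions.

```python
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Euclidean scalar product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def low_rank_inverse_apply(
    columns: List[Vector], eigvals: Vector, reg: float, b: Vector
) -> Vector:
    """Apply (U diag(eigvals) U^T + reg * I)^{-1} to b; columns of U are orthonormal."""
    # Start with b / reg, then correct the components lying in the span of U.
    out = [b_i / reg for b_i in b]
    for col, lam in zip(columns, eigvals):
        scale = dot(col, b) * (1.0 / (lam + reg) - 1.0 / reg)
        out = [o + scale * c for o, c in zip(out, col)]
    return out


def regularized_apply(
    columns: List[Vector], eigvals: Vector, reg: float, x: Vector
) -> Vector:
    """Apply U diag(eigvals) U^T + reg * I to x (used only to check the result)."""
    out = [reg * x_i for x_i in x]
    for col, lam in zip(columns, eigvals):
        scale = lam * dot(col, x)
        out = [o + scale * c for o, c in zip(out, col)]
    return out


columns = [[0.6, 0.8, 0.0]]  # single unit-norm column of U (toy values)
eigvals = [5.0]
reg = 0.5
b = [1.0, 2.0, 3.0]

x = low_rank_inverse_apply(columns, eigvals, reg, b)
residual = max(
    abs(r - b_i) for r, b_i in zip(regularized_apply(columns, eigvals, reg, x), b)
)
print(x, residual)
```

The cost scales with the rank times the parameter dimension, which is the motivation for keeping the approximation in factorized form.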
fit
¶
fit(data: DataLoaderType) -> InfluenceFunctionModel
Fits the instance to the provided data by internally creating a block mapper instance from it.
PARAMETER | DESCRIPTION |
---|---|
data |
iterable of tensors |
RETURNS | DESCRIPTION |
---|---|
InfluenceFunctionModel
|
Fitted instance |
Source code in src/pydvl/influence/base_influence_function_model.py
fit_required
staticmethod
¶
Decorator to enforce the fitted check
Source code in src/pydvl/influence/base_influence_function_model.py
influence_factors
¶
Computes the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\). For all input tensors, the first dimension is assumed to be the batch dimension.
PARAMETER | DESCRIPTION |
---|---|
x |
model input to use in the gradient computations
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
TensorType
|
Tensor representing the element-wise inverse Hessian matrix vector products |
Source code in src/pydvl/influence/base_influence_function_model.py
influences
¶
influences(
x_test: TensorType,
y_test: TensorType,
x: Optional[TensorType] = None,
y: Optional[TensorType] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> TensorType
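Computes the approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.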
PARAMETER | DESCRIPTION |
---|---|
x_test |
model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
y_test |
label tensor to compute gradients
TYPE:
|
x |
optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\), if None, use \(x=x_{\text{test}}\)
TYPE:
|
y |
optional label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
TensorType
|
Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/base_influence_function_model.py
influences_from_factors
¶
influences_from_factors(
z_test_factors: TensorType,
x: TensorType,
y: TensorType,
mode: InfluenceMode = InfluenceMode.Up,
) -> TensorType
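Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).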
PARAMETER | DESCRIPTION |
---|---|
z_test_factors |
pre-computed array, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
x |
model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\)
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
TensorType
|
Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/base_influence_function_model.py
influences_by_block
¶
influences_by_block(
x_test: TensorType,
y_test: TensorType,
x: Optional[TensorType] = None,
y: Optional[TensorType] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> OrderedDict[str, TensorType]
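Compute the block-wise influence values for the provided data, i.e. an approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.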
PARAMETER | DESCRIPTION |
---|---|
x_test |
model input to use in the gradient computations of the approximation of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
y_test |
label tensor to compute gradients
TYPE:
|
x |
optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\), if None, use \(x=x_{\text{test}}\)
TYPE:
|
y |
optional label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType]
|
Ordered dictionary of tensors representing the element-wise scalar products for the provided batch per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
influence_factors_by_block
¶
influence_factors_by_block(
x: TensorType, y: TensorType
) -> OrderedDict[str, TensorType]
Compute the block-wise approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
x |
model input to use in the gradient computations
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType]
|
Ordered dictionary of tensors representing the element-wise approximate inverse Hessian matrix vector products per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
influences_from_factors_by_block
¶
influences_from_factors_by_block(
z_test_factors: OrderedDict[str, TensorType],
x: TensorType,
y: TensorType,
mode: InfluenceMode = InfluenceMode.Up,
) -> OrderedDict[str, TensorType]
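Block-wise computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).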
PARAMETER | DESCRIPTION |
---|---|
z_test_factors |
pre-computed array, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
x |
model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\)
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType]
|
Ordered dictionary of tensors representing the element-wise scalar products for the provided batch per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
InverseHarmonicMeanInfluence
¶
InverseHarmonicMeanInfluence(
model: Module,
loss: LossType,
regularization: Union[float, Dict[str, float]],
block_structure: Union[
BlockMode, OrderedDict[str, List[str]]
] = BlockMode.FULL,
)
Bases: TorchComposableInfluence[InverseHarmonicMeanOperator]
This implementation replaces the inverse Hessian matrix in the influence computation with an approximation of the inverse Gauss-Newton vector product.
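Viewing the damped Gauss-Newton matrix
\[ G_{\lambda}(\theta) = \frac{1}{N}\sum_{i=1}^{N} \nabla_{\theta}\ell_i \nabla_{\theta}\ell_i^T + \lambda \operatorname{I}, \quad \ell_i = \ell(y_i, f_{\theta}(x_i)), \]
as an arithmetic mean of the rank-\(1\) updates, this implementation replaces it with the harmonic mean of the rank-\(1\) updates, i.e.
\[ \tilde{G}_{\lambda}(\theta) = \left(\frac{1}{N}\sum_{i=1}^{N} \left(\nabla_{\theta}\ell_i \nabla_{\theta}\ell_i^T + \lambda \operatorname{I}\right)^{-1}\right)^{-1} \]
and uses the matrix
\[ \tilde{G}_{\lambda}^{-1}(\theta) \]
instead of the inverse Hessian.
In other words, it switches the order of summation and inversion, which resolves to the inverse harmonic mean of the rank-\(1\) updates. The results are averaged over the batches provided by the data loader.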
The inverses of the rank-\(1\) updates are not calculated explicitly, but instead a vectorized version of the Sherman–Morrison formula is applied.
For more information, see Inverse Harmonic Mean.
PARAMETER | DESCRIPTION |
---|---|
model |
The model.
TYPE:
|
loss |
The loss function.
TYPE:
|
regularization |
The regularization parameter. In case a dictionary is provided, the keys must match the blocking structure and the specification must be complete, so every block needs a positive regularization value, which differs from the description in block-diagonal approximation. |
block_structure |
The blocking structure, either a pre-defined enum or a custom block structure, see the information regarding block-diagonal approximation.
TYPE:
|
Source code in src/pydvl/influence/torch/influence_function_model.py
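The Sherman–Morrison step can be sketched in a few lines. For a single gradient \(g\), \((g g^T + \lambda I)^{-1} v = \frac{1}{\lambda}\left(v - g\,\frac{g^T v}{\lambda + g^T g}\right)\), and the inverse harmonic mean applied to \(v\) is the average of these per-sample results. The code below is an unvectorized pure-Python illustration with made-up gradients; the function names are hypothetical and the batching used by pyDVL is not reproduced.

```python
from typing import List

Vector = List[float]


def sherman_morrison_apply(g: Vector, reg: float, v: Vector) -> Vector:
    """Apply (g g^T + reg * I)^{-1} to v without forming the matrix."""
    gv = sum(a * b for a, b in zip(g, v))
    gg = sum(a * a for a in g)
    return [(v_i - g_i * gv / (reg + gg)) / reg for g_i, v_i in zip(g, v)]


def inverse_harmonic_mean_apply(grads: List[Vector], reg: float, v: Vector) -> Vector:
    """Average the per-sample inverses (g_i g_i^T + reg * I)^{-1} applied to v."""
    total = [0.0] * len(v)
    for g in grads:
        x = sherman_morrison_apply(g, reg, v)
        total = [t + x_i for t, x_i in zip(total, x)]
    return [t / len(grads) for t in total]


# Toy data (hypothetical): two per-sample gradients in two dimensions.
grads = [[1.0, 0.0], [0.0, 2.0]]
reg = 1.0
v = [1.0, 1.0]

result = inverse_harmonic_mean_apply(grads, reg, v)
print(result)
```

Note that `reg` must be strictly positive here, since each rank-1 matrix is singular on its own.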
fit
¶
fit(data: DataLoaderType) -> InfluenceFunctionModel
Fits the instance to the provided data by internally creating a block mapper instance from it.
PARAMETER | DESCRIPTION |
---|---|
data |
iterable of tensors |
RETURNS | DESCRIPTION |
---|---|
InfluenceFunctionModel
|
Fitted instance |
Source code in src/pydvl/influence/base_influence_function_model.py
fit_required
staticmethod
¶
Decorator to enforce the fitted check
Source code in src/pydvl/influence/base_influence_function_model.py
influence_factors
¶
Computes the approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\). For all input tensors, the first dimension is assumed to be the batch dimension.
PARAMETER | DESCRIPTION |
---|---|
x |
model input to use in the gradient computations
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
TensorType
|
Tensor representing the element-wise inverse Hessian matrix vector products |
Source code in src/pydvl/influence/base_influence_function_model.py
influences
¶
influences(
x_test: TensorType,
y_test: TensorType,
x: Optional[TensorType] = None,
y: Optional[TensorType] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> TensorType
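Computes the approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.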
PARAMETER | DESCRIPTION |
---|---|
x_test |
model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
y_test |
label tensor to compute gradients
TYPE:
|
x |
optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\), if None, use \(x=x_{\text{test}}\)
TYPE:
|
y |
optional label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
TensorType
|
Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/base_influence_function_model.py
influences_from_factors
¶
influences_from_factors(
z_test_factors: TensorType,
x: TensorType,
y: TensorType,
mode: InfluenceMode = InfluenceMode.Up,
) -> TensorType
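Computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).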
PARAMETER | DESCRIPTION |
---|---|
z_test_factors |
pre-computed array, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
x |
model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\)
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
TensorType
|
Tensor representing the element-wise scalar products for the provided batch |
Source code in src/pydvl/influence/base_influence_function_model.py
influences_by_block
¶
influences_by_block(
x_test: TensorType,
y_test: TensorType,
x: Optional[TensorType] = None,
y: Optional[TensorType] = None,
mode: InfluenceMode = InfluenceMode.Up,
) -> OrderedDict[str, TensorType]
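Compute the block-wise influence values for the provided data, i.e. an approximation of
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case.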
PARAMETER | DESCRIPTION |
---|---|
x_test |
model input to use in the gradient computations of the approximation of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
y_test |
label tensor to compute gradients
TYPE:
|
x |
optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\), if None, use \(x=x_{\text{test}}\)
TYPE:
|
y |
optional label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType]
|
Ordered dictionary of tensors representing the element-wise scalar products for the provided batch per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
influence_factors_by_block
¶
influence_factors_by_block(
x: TensorType, y: TensorType
) -> OrderedDict[str, TensorType]
Compute the block-wise approximation of
\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]
where the gradient is meant to be per sample of the batch \((x, y)\).
PARAMETER | DESCRIPTION |
---|---|
x |
model input to use in the gradient computations
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType]
|
Ordered dictionary of tensors representing the element-wise approximate inverse Hessian matrix vector products per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
influences_from_factors_by_block
¶
influences_from_factors_by_block(
z_test_factors: OrderedDict[str, TensorType],
x: TensorType,
y: TensorType,
mode: InfluenceMode = InfluenceMode.Up,
) -> OrderedDict[str, TensorType]
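Block-wise computation of
\[ \langle z_{\text{test\_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the case of up-weighting influence, resp.
\[ \langle z_{\text{test\_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]
for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).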
PARAMETER | DESCRIPTION |
---|---|
z_test_factors |
pre-computed array, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)
TYPE:
|
x |
model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\)
TYPE:
|
y |
label tensor to compute gradients
TYPE:
|
mode |
enum value of InfluenceMode
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
OrderedDict[str, TensorType]
|
Ordered dictionary of tensors representing the element-wise scalar products for the provided batch per block. |
Source code in src/pydvl/influence/base_influence_function_model.py
with_regularization
¶
with_regularization(
regularization: Union[float, Dict[str, Optional[float]]]
) -> TorchComposableInfluence
Update the regularization parameter.
PARAMETER | DESCRIPTION |
---|---|
regularization |
Either a positive float or a dictionary with the block names as keys and the regularization values as values. |
RETURNS | DESCRIPTION |
---|---|
TorchComposableInfluence
|
The modified instance |