pydvl.influence.torch.operator

GaussNewtonOperator

GaussNewtonOperator(
    model: Module,
    loss: Callable[[Tensor, Tensor], Tensor],
    dataloader: DataLoader,
    restrict_to: Optional[Dict[str, Parameter]] = None,
)

Bases: _AveragingBatchOperator[GaussNewtonBatchOperation, PointAveraging]

Given a model and a loss function, computes the Gauss-Newton vector or matrix product with respect to the model parameters on a batch, i.e.

\[\begin{align*} G(\text{model}, \text{loss}, b, \theta) &\cdot v, \\ G(\text{model}, \text{loss}, b, \theta) &= \frac{1}{|b|}\sum_{(x, y) \in b}\nabla_{\theta}\ell (x,y; \theta) \nabla_{\theta}\ell (x,y; \theta)^t, \\ \ell(x,y; \theta) &= \text{loss}(\text{model}(x; \theta), y) \end{align*}\]

where model is a torch.nn.Module and \(v\) is a vector or matrix, and averages the results over the batches provided by the data loader.
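As a rough illustration of the formula above, the product \(G \cdot v\) can be accumulated from per-sample gradients without ever forming the matrix, since each term is \(g_i (g_i^t v)\). The sketch below uses plain Python lists instead of torch tensors; the function names and the example gradients are hypothetical and not part of pyDVL.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def gauss_newton_vector_product(grads: Sequence[Vector], v: Vector) -> Vector:
    """Compute (1/|b|) * sum_i g_i g_i^T v without building the matrix.

    `grads` holds one flattened per-sample gradient g_i per batch element
    (hypothetical input; in practice these would come from autograd).
    """
    result = [0.0] * len(v)
    for g in grads:
        coeff = dot(g, v)  # scalar g_i^T v
        for k, g_k in enumerate(g):
            result[k] += coeff * g_k
    return [r / len(grads) for r in result]


# Made-up per-sample gradients: a batch of two points, three parameters.
batch_grads = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
v = [1.0, 1.0, 1.0]
gv = gauss_newton_vector_product(batch_grads, v)
```

Each sample costs two passes over its gradient, so the work is linear in the number of parameters rather than quadratic.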

PARAMETER DESCRIPTION
model

The model.

TYPE: Module

loss

The loss function.

TYPE: Callable[[Tensor, Tensor], Tensor]

dataloader

The data loader providing batches of data.

TYPE: DataLoader

restrict_to

The parameters to restrict the differentiation to, i.e. the corresponding sub-matrix of the Jacobian. If None, the full Jacobian is used. Make sure the input matches the correct dimension, i.e. the last dimension must be equal to the property input_size.

TYPE: Optional[Dict[str, Parameter]] DEFAULT: None

Source code in src/pydvl/influence/torch/operator.py
def __init__(
    self,
    model: nn.Module,
    loss: Callable[[torch.Tensor, torch.Tensor], torch.Tensor],
    dataloader: DataLoader,
    restrict_to: Optional[Dict[str, nn.Parameter]] = None,
):
    batch_op = GaussNewtonBatchOperation(
        model,
        loss,
        restrict_to=restrict_to,
    )
    averaging = PointAveraging()
    super().__init__(batch_op, dataloader, averaging)

apply

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER DESCRIPTION
tensor

A tensor, whose trailing dimension must conform to the operator's input size.

TYPE: TensorType

RETURNS DESCRIPTION
TensorType

A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py
def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor, whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

apply_to_dict

apply_to_dict(mat: Dict[str, Tensor]) -> Dict[str, Tensor]

Applies the operator to a dictionary of tensors compatible with the structure defined by the property input_dict_structure.

PARAMETER DESCRIPTION
mat

A dictionary of tensors whose keys and shapes match the property input_dict_structure.

TYPE: Dict[str, Tensor]

RETURNS DESCRIPTION
Dict[str, Tensor]

A dictionary of tensors after applying the operator

Source code in src/pydvl/influence/torch/base.py
def apply_to_dict(self, mat: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
    """
    Applies the operator to a dictionary of tensors, compatible to the structure
    defined by the property `input_dict_structure`.

    Args:
        mat: dictionary of tensors, whose keys and shapes match the property
            `input_dict_structure`.

    Returns:
        A dictionary of tensors after applying the operator
    """

    if not self._validate_mat_dict(mat):
        raise ValueError(
            f"Incompatible input structure, expected (excluding batch "
            f"dimension): \n {self.input_dict_structure}"
        )

    return self._apply_to_dict(self._dict_to_device(mat))

HessianOperator

HessianOperator(
    model: Module,
    loss: Callable[[Tensor, Tensor], Tensor],
    dataloader: DataLoader,
    restrict_to: Optional[Dict[str, Parameter]] = None,
)

Bases: _AveragingBatchOperator[HessianBatchOperation, ChunkAveraging]

Given a model and a loss function, computes the Hessian vector or matrix product with respect to the model parameters for a given batch, i.e.

\[\begin{align*} &\nabla^2_{\theta} L(b;\theta) \cdot v, \\ &L(b;\theta) = \frac{1}{|b|} \sum_{(x,y) \in b} \text{loss}(\text{model}(x; \theta), y), \end{align*}\]

where model is a torch.nn.Module and \(v\) is a vector or matrix, and averages the results over the batches provided by the data loader.
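To make the idea of a Hessian-vector product concrete, the following sketch approximates \(\nabla^2_{\theta} L(b;\theta) \cdot v\) for a small least-squares loss without forming the Hessian. It deliberately swaps in a central finite difference of the gradient for automatic differentiation, so it is not how the operator itself is implemented; all names and data are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Batch = Sequence[Tuple[Vector, float]]


def batch_gradient(theta: Vector, batch: Batch) -> Vector:
    """Gradient of L(b; theta) = mean over the batch of (theta . x - y)^2."""
    grad = [0.0] * len(theta)
    for x, y in batch:
        residual = sum(t * x_k for t, x_k in zip(theta, x)) - y
        for k, x_k in enumerate(x):
            grad[k] += 2.0 * residual * x_k / len(batch)
    return grad


def hessian_vector_product(
    grad_fn: Callable[[Vector], Vector],
    theta: Vector,
    v: Vector,
    eps: float = 1e-4,
) -> Vector:
    """Approximate H v by a central difference of the gradient along v."""
    plus = grad_fn([t + eps * v_k for t, v_k in zip(theta, v)])
    minus = grad_fn([t - eps * v_k for t, v_k in zip(theta, v)])
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]


# Made-up batch of two (x, y) pairs for a linear model with two parameters.
batch = [([1.0, 2.0], 1.0), ([3.0, -1.0], 0.5)]
theta = [0.2, -0.4]
v = [1.0, 0.5]
hv = hessian_vector_product(lambda t: batch_gradient(t, batch), theta, v)
```

For this quadratic loss the Hessian is the batch mean of \(2 x x^t\), so the result can be checked by hand.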

PARAMETER DESCRIPTION
model

The model.

TYPE: Module

loss

The loss function.

TYPE: Callable[[Tensor, Tensor], Tensor]

dataloader

The data loader providing batches of data.

TYPE: DataLoader

restrict_to

The parameters to restrict the second order differentiation to, i.e. the corresponding sub-matrix of the Hessian. If None, the full Hessian is used. Make sure the input matches the correct dimension, i.e. the last dimension must be equal to the property input_size.

TYPE: Optional[Dict[str, Parameter]] DEFAULT: None

Source code in src/pydvl/influence/torch/operator.py
def __init__(
    self,
    model: nn.Module,
    loss: Callable[[torch.Tensor, torch.Tensor], torch.Tensor],
    dataloader: DataLoader,
    restrict_to: Optional[Dict[str, nn.Parameter]] = None,
):
    batch_op = HessianBatchOperation(model, loss, restrict_to=restrict_to)
    averaging = ChunkAveraging()
    super().__init__(batch_op, dataloader, averaging)

apply

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER DESCRIPTION
tensor

A tensor, whose trailing dimension must conform to the operator's input size.

TYPE: TensorType

RETURNS DESCRIPTION
TensorType

A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py
def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor, whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

apply_to_dict

apply_to_dict(mat: Dict[str, Tensor]) -> Dict[str, Tensor]

Applies the operator to a dictionary of tensors compatible with the structure defined by the property input_dict_structure.

PARAMETER DESCRIPTION
mat

A dictionary of tensors whose keys and shapes match the property input_dict_structure.

TYPE: Dict[str, Tensor]

RETURNS DESCRIPTION
Dict[str, Tensor]

A dictionary of tensors after applying the operator

Source code in src/pydvl/influence/torch/base.py
def apply_to_dict(self, mat: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
    """
    Applies the operator to a dictionary of tensors, compatible to the structure
    defined by the property `input_dict_structure`.

    Args:
        mat: dictionary of tensors, whose keys and shapes match the property
            `input_dict_structure`.

    Returns:
        A dictionary of tensors after applying the operator
    """

    if not self._validate_mat_dict(mat):
        raise ValueError(
            f"Incompatible input structure, expected (excluding batch "
            f"dimension): \n {self.input_dict_structure}"
        )

    return self._apply_to_dict(self._dict_to_device(mat))

InverseHarmonicMeanOperator

InverseHarmonicMeanOperator(
    model: Module,
    loss: Callable[[Tensor, Tensor], Tensor],
    dataloader: DataLoader,
    regularization: float,
    restrict_to: Optional[Dict[str, Parameter]] = None,
)

Bases: _AveragingBatchOperator[InverseHarmonicMeanBatchOperation, PointAveraging]

Given a model and a loss function, computes an approximation of the inverse Gauss-Newton vector or matrix product per batch and averages the results.

Viewing the damped Gauss-Newton matrix

\[\begin{align*} G_{\lambda}(\text{model}, \text{loss}, b, \theta) &= \frac{1}{|b|}\sum_{(x, y) \in b}\nabla_{\theta}\ell (x,y; \theta) \nabla_{\theta}\ell (x,y; \theta)^t + \lambda \operatorname{I}, \\ \ell(x,y; \theta) &= \text{loss}(\text{model}(x; \theta), y) \end{align*}\]

as an arithmetic mean of the rank-\(1\) updates, this operator replaces it with the harmonic mean of the rank-\(1\) updates, i.e.

\[ \tilde{G}_{\lambda}(\text{model}, \text{loss}, b, \theta) = \left(\frac{1}{|b|} \sum_{(x, y) \in b} \left( \nabla_{\theta}\ell (x,y; \theta) \nabla_{\theta}\ell (x,y; \theta)^t + \lambda \operatorname{I}\right)^{-1} \right)^{-1}\]

and computes

\[ \tilde{G}_{\lambda}^{-1}(\text{model}, \text{loss}, b, \theta) \cdot v.\]

for any given batch \(b\), where model is a torch.nn.Module and \(v\) is a vector or matrix.

In other words, it switches the order of summation and inversion, which resolves to the inverse harmonic mean of the rank-\(1\) updates. The results are averaged over the batches provided by the data loader.

The inverses of the rank-\(1\) updates are not calculated explicitly, but instead a vectorized version of the Sherman–Morrison formula is applied.

For more information, see Inverse Harmonic Mean.
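The Sherman–Morrison step mentioned above can be sketched in a few lines. For a single gradient \(g\), the identity gives \((g g^t + \lambda I)^{-1} v = \frac{1}{\lambda}\left(v - \frac{g^t v}{\lambda + g^t g}\, g\right)\), so no matrix is ever inverted. The code below is a plain-Python illustration with hypothetical function names, and it assumes the per-batch result is the plain average of the per-sample inverses; it is not the vectorized torch implementation.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def inverse_rank_one_apply(g: Vector, v: Vector, reg: float) -> Vector:
    """Apply (g g^T + reg * I)^{-1} to v using the Sherman-Morrison formula."""
    if reg <= 0:
        raise ValueError("reg must be positive")
    coeff = dot(g, v) / (reg + dot(g, g))
    return [(v_k - coeff * g_k) / reg for v_k, g_k in zip(v, g)]


def inverse_harmonic_mean_apply(
    grads: Sequence[Vector], v: Vector, reg: float
) -> Vector:
    """Average the per-sample inverse actions over the batch (assumption)."""
    result = [0.0] * len(v)
    for g in grads:
        for k, value in enumerate(inverse_rank_one_apply(g, v, reg)):
            result[k] += value / len(grads)
    return result


# Made-up gradient and right-hand side for a two-parameter model.
g = [1.0, 2.0]
v = [3.0, -1.0]
reg = 0.5
x = inverse_rank_one_apply(g, v, reg)

# Sanity check: applying (g g^T + reg * I) to x should reproduce v.
reconstructed = [g_k * dot(g, x) + reg * x_k for g_k, x_k in zip(g, x)]
```

The positivity check mirrors the requirement that the regularization be strictly positive, since the formula divides by it.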

PARAMETER DESCRIPTION
model

The model.

TYPE: Module

loss

The loss function.

TYPE: Callable[[Tensor, Tensor], Tensor]

dataloader

The data loader providing batches of data.

TYPE: DataLoader

regularization

The regularization parameter \(\lambda\). Must be positive.

TYPE: float

restrict_to

The parameters to restrict the differentiation to, i.e. the corresponding sub-matrix of the Jacobian. If None, the full Jacobian is used. Make sure the input matches the correct dimension, i.e. the last dimension must be equal to the property input_size.

TYPE: Optional[Dict[str, Parameter]] DEFAULT: None

Source code in src/pydvl/influence/torch/operator.py
def __init__(
    self,
    model: nn.Module,
    loss: Callable[[torch.Tensor, torch.Tensor], torch.Tensor],
    dataloader: DataLoader,
    regularization: float,
    restrict_to: Optional[Dict[str, nn.Parameter]] = None,
):
    if regularization <= 0:
        raise ValueError("regularization must be positive")

    self._regularization = regularization

    batch_op = InverseHarmonicMeanBatchOperation(
        model,
        loss,
        regularization,
        restrict_to=restrict_to,
    )
    averaging = PointAveraging()
    super().__init__(batch_op, dataloader, averaging)

apply

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER DESCRIPTION
tensor

A tensor, whose trailing dimension must conform to the operator's input size.

TYPE: TensorType

RETURNS DESCRIPTION
TensorType

A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py
def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor, whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

apply_to_dict

apply_to_dict(mat: Dict[str, Tensor]) -> Dict[str, Tensor]

Applies the operator to a dictionary of tensors compatible with the structure defined by the property input_dict_structure.

PARAMETER DESCRIPTION
mat

A dictionary of tensors whose keys and shapes match the property input_dict_structure.

TYPE: Dict[str, Tensor]

RETURNS DESCRIPTION
Dict[str, Tensor]

A dictionary of tensors after applying the operator

Source code in src/pydvl/influence/torch/base.py
def apply_to_dict(self, mat: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
    """
    Applies the operator to a dictionary of tensors, compatible to the structure
    defined by the property `input_dict_structure`.

    Args:
        mat: dictionary of tensors, whose keys and shapes match the property
            `input_dict_structure`.

    Returns:
        A dictionary of tensors after applying the operator
    """

    if not self._validate_mat_dict(mat):
        raise ValueError(
            f"Incompatible input structure, expected (excluding batch "
            f"dimension): \n {self.input_dict_structure}"
        )

    return self._apply_to_dict(self._dict_to_device(mat))

DirectSolveOperator

DirectSolveOperator(
    matrix: Tensor,
    regularization: Optional[float] = None,
    in_place_regularization: bool = False,
)

Bases: TensorOperator

Given a matrix \(A\) and an optional regularization parameter \(\lambda\), computes the solution of the system \((A+\lambda I)x = b\), where \(b\) is a vector or a matrix. Internally, it uses the routine torch.linalg.solve.

PARAMETER DESCRIPTION
matrix

the system matrix

TYPE: Tensor

regularization

the regularization parameter

TYPE: Optional[float] DEFAULT: None

in_place_regularization

If True, the input matrix is modified in-place, by adding the regularization value to the diagonal.

TYPE: bool DEFAULT: False

Source code in src/pydvl/influence/torch/operator.py
def __init__(
    self,
    matrix: torch.Tensor,
    regularization: Optional[float] = None,
    in_place_regularization: bool = False,
):
    if regularization is None:
        self.matrix = matrix
    else:
        self.matrix = self._update_diagonal(
            matrix if in_place_regularization else matrix.clone(), regularization
        )
    self._regularization = regularization

apply

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER DESCRIPTION
tensor

A tensor, whose trailing dimension must conform to the operator's input size.

TYPE: TensorType

RETURNS DESCRIPTION
TensorType

A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py
def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor, whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

LissaOperator

LissaOperator(
    batch_operation: BatchOperationType,
    data: DataLoader,
    regularization: Optional[float] = None,
    maxiter: int = 1000,
    dampen: float = 0.0,
    scale: float = 10.0,
    rtol: float = 0.0001,
    progress: bool = False,
    warn_on_max_iteration: bool = True,
)

Bases: TensorOperator, Generic[BatchOperationType]

Uses LiSSA (Linear time Stochastic Second-Order Algorithm) to iteratively approximate the solution of the system \((A + \lambda I)x = b\). Writing \(x_j = (A + \lambda I)^{-1}_j b\) for the \(j\)-th approximation, the update is

\[x_{j+1} = b + (1 - d)\, x_j - \frac{(A + \lambda I)\, x_j}{s},\]

where \(I\) is the identity matrix, \(d\) is a dampening term and \(s\) a scaling factor that are applied to help convergence. For details, see Linear time Stochastic Second-Order Approximation (LiSSA).
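A minimal deterministic sketch of such an iteration is given below. It assumes the recursion \(x_{j+1} = b + (1 - d)\, x_j - (A + \lambda I)\, x_j / s\) followed by a final division by \(s\), which for \(d = 0\) has the solution of \((A + \lambda I)x = b\) as its fixed point. Unlike the operator, it applies the full matrix in every step instead of sampling batches, and its stopping rule (relative change between iterates) is an assumption. All names are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def lissa_solve(
    matvec: Callable[[Vector], Vector],
    b: Vector,
    reg: float = 0.0,
    maxiter: int = 1000,
    dampen: float = 0.0,
    scale: float = 10.0,
    rtol: float = 1e-10,
) -> Vector:
    """Iterate h <- b + (1 - dampen) * h - (A + reg * I) h / scale."""
    h = list(b)
    for _ in range(maxiter):
        ah = matvec(h)
        new_h = [
            b_k + (1.0 - dampen) * h_k - (ah_k + reg * h_k) / scale
            for b_k, h_k, ah_k in zip(b, h, ah)
        ]
        change = max(abs(n - o) for n, o in zip(new_h, h))
        size = max(abs(n) for n in new_h)
        h = new_h
        if change <= rtol * size:  # assumed early-stopping criterion
            break
    # Undo the scaling so the result approximates (A + reg * I)^{-1} b.
    return [h_k / scale for h_k in h]


def apply_a(vec: Vector) -> Vector:
    """Matrix-vector product with the made-up matrix A = [[2, 1], [1, 3]]."""
    return [2.0 * vec[0] + 1.0 * vec[1], 1.0 * vec[0] + 3.0 * vec[1]]


x = lissa_solve(apply_a, [1.0, 2.0])
```

The iteration only contracts when the eigenvalues of \((A + \lambda I)/s\) lie strictly between 0 and 2, which is why a larger scale can help when the matrix has large eigenvalues.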

PARAMETER DESCRIPTION
batch_operation

The BatchOperation representing the action of A on a batch of the data loader.

TYPE: BatchOperationType

data

A PyTorch data loader.

TYPE: DataLoader

regularization

Optional regularization parameter added to the Hessian-vector product for numerical stability.

TYPE: Optional[float] DEFAULT: None

maxiter

Maximum number of iterations.

TYPE: int DEFAULT: 1000

dampen

Dampening factor, defaults to 0 for no dampening.

TYPE: float DEFAULT: 0.0

scale

Scaling factor, defaults to 10.

TYPE: float DEFAULT: 10.0

rtol

Tolerance to use for early stopping.

TYPE: float DEFAULT: 0.0001

progress

If True, display progress bars.

TYPE: bool DEFAULT: False

warn_on_max_iteration

If True, logs a warning if the desired tolerance is not achieved within maxiter iterations. If False, this information is logged at level logging.DEBUG.

TYPE: bool DEFAULT: True

Source code in src/pydvl/influence/torch/operator.py
def __init__(
    self,
    batch_operation: BatchOperationType,
    data: DataLoader,
    regularization: Optional[float] = None,
    maxiter: int = 1000,
    dampen: float = 0.0,
    scale: float = 10.0,
    rtol: float = 1e-4,
    progress: bool = False,
    warn_on_max_iteration: bool = True,
):

    if regularization is not None and regularization < 0:
        raise ValueError("regularization must be non-negative")

    self.data = data
    self.warn_on_max_iteration = warn_on_max_iteration
    self.progress = progress
    self.rtol = rtol
    self.scale = scale
    self.dampen = dampen
    self.maxiter = maxiter
    self.batch_operation = batch_operation
    self._regularization = regularization

apply

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER DESCRIPTION
tensor

A tensor, whose trailing dimension must conform to the operator's input size.

TYPE: TensorType

RETURNS DESCRIPTION
TensorType

A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py
def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor, whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

LowRankOperator

LowRankOperator(
    low_rank_representation: LowRankProductRepresentation,
    regularization: Optional[float] = None,
    exact: bool = True,
)

Bases: TensorOperator

Given a low rank representation of a matrix

\[ A = V D V^T\]

with a diagonal matrix \(D\) and an optional regularization parameter \(\lambda\), computes

\[ (V D V^T+\lambda I)^{-1}b.\]

If the exact flag is set to True, the inverse action is computed exactly using the Sherman–Morrison–Woodbury formula (https://en.wikipedia.org/wiki/Woodbury_matrix_identity), which requires a positive regularization parameter. If exact is set to False, the inverse action is approximated by

\[ V(D+\lambda I)^{-1}V^T b.\]
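The difference between the exact and the approximate inverse action can be seen in a small sketch. It assumes, as a simplification, that the columns of \(V\) are orthonormal, in which case the Woodbury identity reduces to \((V D V^T + \lambda I)^{-1} b = \frac{1}{\lambda}\left(b - V \operatorname{diag}\!\left(\frac{d_i}{d_i + \lambda}\right) V^T b\right)\). The approximation \(V (D + \lambda I)^{-1} V^T b\) agrees with this on the span of \(V\) but drops the component of \(b\) orthogonal to it. Function names and data are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def expand(basis: Sequence[Vector], coeffs: Sequence[float]) -> Vector:
    """Compute V c, where `basis` lists the columns of V."""
    n = len(basis[0])
    return [sum(c * col[i] for c, col in zip(coeffs, basis)) for i in range(n)]


def low_rank_inverse_apply(
    basis: Sequence[Vector],
    eigvals: Sequence[float],
    b: Vector,
    reg: float,
    exact: bool = True,
) -> Vector:
    """Apply (V D V^T + reg * I)^{-1} to b, assuming orthonormal columns."""
    coeffs = [dot(col, b) for col in basis]  # V^T b
    if exact:
        if reg <= 0:
            raise ValueError("reg must be positive when exact=True")
        shrunk = [c * d / (d + reg) for c, d in zip(coeffs, eigvals)]
        correction = expand(basis, shrunk)
        return [(b_i - c_i) / reg for b_i, c_i in zip(b, correction)]
    # Approximation: only the part of b inside span(V) is inverted.
    return expand(basis, [c / (d + reg) for c, d in zip(coeffs, eigvals)])


# Made-up orthonormal columns, eigenvalues and right-hand side.
basis = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8]]
eigvals = [4.0, 1.0]
b = [1.0, 1.0, 1.0]
reg = 0.5

x_exact = low_rank_inverse_apply(basis, eigvals, b, reg, exact=True)
x_approx = low_rank_inverse_apply(basis, eigvals, b, reg, exact=False)
```

Here `x_exact` and `x_approx` differ because `b` has a component outside the span of the two columns.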

Source code in src/pydvl/influence/torch/operator.py
def __init__(
    self,
    low_rank_representation: LowRankProductRepresentation,
    regularization: Optional[float] = None,
    exact: bool = True,
):

    if exact and (regularization is None or regularization <= 0):
        raise ValueError("regularization must be positive when exact=True")
    elif regularization is not None and regularization < 0:
        raise ValueError("regularization must be non-negative")

    self._regularization = regularization
    self._exact = exact
    self._low_rank_representation = low_rank_representation

apply

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER DESCRIPTION
tensor

A tensor, whose trailing dimension must conform to the operator's input size.

TYPE: TensorType

RETURNS DESCRIPTION
TensorType

A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py
def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor, whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

MatrixOperator

MatrixOperator(matrix: Tensor)

Bases: TensorOperator

A simple wrapper for a torch.Tensor acting as a matrix mapping.

Source code in src/pydvl/influence/torch/operator.py
def __init__(self, matrix: torch.Tensor):
    self.matrix = matrix

apply

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER DESCRIPTION
tensor

A tensor, whose trailing dimension must conform to the operator's input size.

TYPE: TensorType

RETURNS DESCRIPTION
TensorType

A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py
def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor, whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

CgOperator

CgOperator(
    operator: TensorOperator,
    regularization: Optional[float] = None,
    rtol: float = 1e-07,
    atol: float = 1e-07,
    maxiter: Optional[int] = None,
    progress: bool = False,
    preconditioner: Optional[Preconditioner] = None,
    use_block_cg: bool = False,
    warn_on_max_iteration: bool = True,
)

Bases: TensorOperator

Given an operator, it uses conjugate gradient to calculate the action of its inverse. More precisely, it finds \(x\) such that \(Ax = b\), with \(A\) being the matrix represented by the operator and \(b\) the tensor the inverse is applied to. For more info, see Conjugate Gradient.
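For orientation, a textbook conjugate gradient loop for a symmetric positive definite system \(Ax = b\) is sketched below in plain Python. It only needs matrix-vector products, which is why the method fits an operator interface. The stopping rule combining rtol and atol is an assumption, and neither preconditioning nor the block variant is covered; names are hypothetical.

```python
import math
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(
    matvec: Callable[[Vector], Vector],
    b: Vector,
    rtol: float = 1e-7,
    atol: float = 1e-7,
    maxiter: Optional[int] = None,
) -> Vector:
    """Solve A x = b for symmetric positive definite A given only x -> A x."""
    maxiter = 10 * len(b) if maxiter is None else maxiter
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the initial guess x = 0
    p = list(r)  # first search direction
    rs_old = dot(r, r)
    tol = max(atol, rtol * math.sqrt(dot(b, b)))  # assumed stopping rule
    for _ in range(maxiter):
        if math.sqrt(rs_old) <= tol:
            break
        ap = matvec(p)
        alpha = rs_old / dot(p, ap)  # step length along p
        x = [x_k + alpha * p_k for x_k, p_k in zip(x, p)]
        r = [r_k - alpha * ap_k for r_k, ap_k in zip(r, ap)]
        rs_new = dot(r, r)
        # New direction, conjugate to the previous ones with respect to A.
        p = [r_k + (rs_new / rs_old) * p_k for r_k, p_k in zip(r, p)]
        rs_old = rs_new
    return x


def apply_a(vec: Vector) -> Vector:
    """Matrix-vector product with the made-up SPD matrix [[4, 1], [1, 3]]."""
    return [4.0 * vec[0] + 1.0 * vec[1], 1.0 * vec[0] + 3.0 * vec[1]]


x = conjugate_gradient(apply_a, [1.0, 2.0])
```

A regularized system \((A + \lambda I)x = b\) fits the same loop by adding \(\lambda\) times the input inside `matvec`.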

PARAMETER DESCRIPTION
operator

The operator representing the matrix \(A\) whose inverse action is computed.

TYPE: TensorOperator

regularization

Optional regularization parameter added to the matrix vector product for numerical stability.

TYPE: Optional[float] DEFAULT: None

rtol

Maximum relative tolerance of result.

TYPE: float DEFAULT: 1e-07

atol

Absolute tolerance of result.

TYPE: float DEFAULT: 1e-07

maxiter

Maximum number of iterations. If None, defaults to 10*len(b).

TYPE: Optional[int] DEFAULT: None

progress

If True, display progress bars for computing in the non-block mode (use_block_cg=False).

TYPE: bool DEFAULT: False

preconditioner

Optional pre-conditioner to improve convergence of conjugate gradient method

TYPE: Optional[Preconditioner] DEFAULT: None

use_block_cg

If True, use block variant of conjugate gradient method, which solves several right hand sides simultaneously

TYPE: bool DEFAULT: False

warn_on_max_iteration

If True, logs a warning if the desired tolerance is not achieved within maxiter iterations. If False, this information is logged at level logging.DEBUG.

TYPE: bool DEFAULT: True

Source code in src/pydvl/influence/torch/operator.py
def __init__(
    self,
    operator: TensorOperator,
    regularization: Optional[float] = None,
    rtol: float = 1e-7,
    atol: float = 1e-7,
    maxiter: Optional[int] = None,
    progress: bool = False,
    preconditioner: Optional[Preconditioner] = None,
    use_block_cg: bool = False,
    warn_on_max_iteration: bool = True,
):

    if regularization is not None and regularization < 0:
        raise ValueError("regularization must be non-negative")

    self.progress = progress
    self.warn_on_max_iteration = warn_on_max_iteration
    self.use_block_cg = use_block_cg
    self.preconditioner = preconditioner
    self.maxiter = maxiter
    self.atol = atol
    self.rtol = rtol
    self._regularization = regularization
    self.operator = operator

apply

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER DESCRIPTION
tensor

A tensor, whose trailing dimension must conform to the operator's input size.

TYPE: TensorType

RETURNS DESCRIPTION
TensorType

A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py
def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor, whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)