pydvl.influence.torch.operator ¶

GaussNewtonOperator ¶

GaussNewtonOperator(
    model: Module,
    loss: Callable[[Tensor, Tensor], Tensor],
    dataloader: DataLoader,
    restrict_to: Optional[Dict[str, Parameter]] = None,
)

Bases: _AveragingBatchOperator[GaussNewtonBatchOperation, PointAveraging]

Given a model and a loss function, computes the Gauss-Newton vector or matrix product with respect to the model parameters on a batch, i.e.

\[\begin{align*} G(\text{model}, \text{loss}, b, \theta) &\cdot v, \\ G(\text{model}, \text{loss}, b, \theta) &= \frac{1}{|b|}\sum_{(x, y) \in b}\nabla_{\theta}\ell (x,y; \theta) \nabla_{\theta}\ell (x,y; \theta)^t, \\ \ell(x,y; \theta) &= \text{loss}(\text{model}(x; \theta), y) \end{align*}\]

where model is a torch.nn.Module and $v$ is a vector or matrix, and averages the results over the batches provided by the data loader.

PARAMETER	DESCRIPTION
`model`	The model. TYPE: `Module`
`loss`	The loss function. TYPE: `Callable[[Tensor, Tensor], Tensor]`
`dataloader`	The data loader providing batches of data. TYPE: `DataLoader`
`restrict_to`	The parameters to restrict the differentiation to, i.e. the corresponding sub-matrix of the Jacobian. If None, the full Jacobian is used. Make sure the input has the correct dimension, i.e. its last dimension must be equal to the property `input_size`. TYPE: `Optional[Dict[str, Parameter]]` DEFAULT: `None`
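The formula above can be sketched in plain PyTorch. This is not pyDVL's implementation: the tiny linear model, the mean squared error loss and the helper `flat_sample_gradient` are assumptions chosen only for illustration. The sketch stacks the per-sample gradients and computes $G \cdot v$ without materializing $G$.

```python
import torch

torch.manual_seed(0)

# Illustrative setup (an assumption, not part of the library): a tiny linear model.
model = torch.nn.Linear(3, 1).double()
loss = torch.nn.functional.mse_loss
x = torch.randn(5, 3, dtype=torch.float64)
y = torch.randn(5, 1, dtype=torch.float64)
params = list(model.parameters())


def flat_sample_gradient(xi: torch.Tensor, yi: torch.Tensor) -> torch.Tensor:
    """Gradient of the loss at a single point, flattened into one vector."""
    sample_loss = loss(model(xi.unsqueeze(0)), yi.unsqueeze(0))
    grads = torch.autograd.grad(sample_loss, params)
    return torch.cat([g.reshape(-1) for g in grads])


# Rows are the per-sample gradients, shape (|b|, number of parameters).
sample_grads = torch.stack([flat_sample_gradient(xi, yi) for xi, yi in zip(x, y)])

v = torch.randn(sample_grads.shape[1], dtype=torch.float64)

# G v = (1/|b|) * sum_i g_i (g_i . v), computed without forming G.
gn_vec = sample_grads.T @ (sample_grads @ v) / len(x)

# Explicit matrix, only feasible for small models; used here as a cross-check.
gn_matrix = sample_grads.T @ sample_grads / len(x)
```

The matrix-free form needs memory proportional to the number of parameters times the batch size, whereas the explicit matrix grows quadratically with the number of parameters.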

Source code in src/pydvl/influence/torch/operator.py

def __init__(
    self,
    model: nn.Module,
    loss: Callable[[torch.Tensor, torch.Tensor], torch.Tensor],
    dataloader: DataLoader,
    restrict_to: Optional[Dict[str, nn.Parameter]] = None,
):
    batch_op = GaussNewtonBatchOperation(
        model,
        loss,
        restrict_to=restrict_to,
    )
    averaging = PointAveraging()
    super().__init__(batch_op, dataloader, averaging)

apply ¶

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER	DESCRIPTION
`tensor`	A tensor whose trailing dimension must conform to the operator's input size. TYPE: `TensorType`

RETURNS	DESCRIPTION
`TensorType`	A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py

def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

apply_to_dict ¶

apply_to_dict(mat: Dict[str, Tensor]) -> Dict[str, Tensor]

Applies the operator to a dictionary of tensors that is compatible with the structure defined by the property input_dict_structure.

PARAMETER	DESCRIPTION
`mat`	dictionary of tensors, whose keys and shapes match the property `input_dict_structure`. TYPE: `Dict[str, Tensor]`

RETURNS	DESCRIPTION
`Dict[str, Tensor]`	A dictionary of tensors after applying the operator

Source code in src/pydvl/influence/torch/base.py

def apply_to_dict(self, mat: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
    """
    Applies the operator to a dictionary of tensors, compatible to the structure
    defined by the property `input_dict_structure`.

    Args:
        mat: dictionary of tensors, whose keys and shapes match the property
            `input_dict_structure`.

    Returns:
        A dictionary of tensors after applying the operator
    """

    if not self._validate_mat_dict(mat):
        raise ValueError(
            f"Incompatible input structure, expected (excluding batch "
            f"dimension): \n {self.input_dict_structure}"
        )

    return self._apply_to_dict(self._dict_to_device(mat))

HessianOperator ¶

HessianOperator(
    model: Module,
    loss: Callable[[Tensor, Tensor], Tensor],
    dataloader: DataLoader,
    restrict_to: Optional[Dict[str, Parameter]] = None,
)

Bases: _AveragingBatchOperator[HessianBatchOperation, ChunkAveraging]

Given a model and a loss function, computes the Hessian vector or matrix product with respect to the model parameters for a given batch, i.e.

\[\begin{align*} &\nabla^2_{\theta} L(b;\theta) \cdot v \\ &L(b;\theta) = \left( \frac{1}{|b|} \sum_{(x,y) \in b} \text{loss}(\text{model}(x; \theta), y)\right), \end{align*}\]

where model is a torch.nn.Module and $v$ is a vector or matrix, and averages the results over the batches provided by the data loader.

PARAMETER	DESCRIPTION
`model`	The model. TYPE: `Module`
`loss`	The loss function. TYPE: `Callable[[Tensor, Tensor], Tensor]`
`dataloader`	The data loader providing batches of data. TYPE: `DataLoader`
`restrict_to`	The parameters to restrict the second order differentiation to, i.e. the corresponding sub-matrix of the Hessian. If None, the full Hessian is used. Make sure the input has the correct dimension, i.e. its last dimension must be equal to the property `input_size`. TYPE: `Optional[Dict[str, Parameter]]` DEFAULT: `None`
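A Hessian-vector product of this kind can be obtained by differentiating twice, without ever building the Hessian. The following is a minimal sketch in plain PyTorch, not pyDVL's implementation; the small network, the mean squared error loss and the helper `hessian_vector_product` are assumptions made for illustration.

```python
import torch

torch.manual_seed(0)

# Illustrative setup (an assumption, not part of the library).
model = torch.nn.Sequential(
    torch.nn.Linear(3, 4), torch.nn.Tanh(), torch.nn.Linear(4, 1)
).double()
loss = torch.nn.functional.mse_loss
x = torch.randn(5, 3, dtype=torch.float64)
y = torch.randn(5, 1, dtype=torch.float64)
params = list(model.parameters())

# First derivative of the batch loss, keeping the graph for a second pass.
batch_loss = loss(model(x), y)
grads = torch.autograd.grad(batch_loss, params, create_graph=True)
flat_grad = torch.cat([g.reshape(-1) for g in grads])


def hessian_vector_product(v: torch.Tensor) -> torch.Tensor:
    """Differentiate (gradient . v) with respect to the parameters."""
    hv = torch.autograd.grad(flat_grad @ v, params, retain_graph=True)
    return torch.cat([h.reshape(-1) for h in hv])


v = torch.randn(flat_grad.numel(), dtype=torch.float64)
w = torch.randn(flat_grad.numel(), dtype=torch.float64)
hv = hessian_vector_product(v)
hw = hessian_vector_product(w)
```

Because the Hessian is symmetric, `w @ hv` and `v @ hw` agree up to floating point error, which gives a cheap sanity check.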

Source code in src/pydvl/influence/torch/operator.py

def __init__(
    self,
    model: nn.Module,
    loss: Callable[[torch.Tensor, torch.Tensor], torch.Tensor],
    dataloader: DataLoader,
    restrict_to: Optional[Dict[str, nn.Parameter]] = None,
):
    batch_op = HessianBatchOperation(model, loss, restrict_to=restrict_to)
    averaging = ChunkAveraging()
    super().__init__(batch_op, dataloader, averaging)

apply ¶

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER	DESCRIPTION
`tensor`	A tensor whose trailing dimension must conform to the operator's input size. TYPE: `TensorType`

RETURNS	DESCRIPTION
`TensorType`	A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py

def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

apply_to_dict ¶

apply_to_dict(mat: Dict[str, Tensor]) -> Dict[str, Tensor]

Applies the operator to a dictionary of tensors that is compatible with the structure defined by the property input_dict_structure.

PARAMETER	DESCRIPTION
`mat`	dictionary of tensors, whose keys and shapes match the property `input_dict_structure`. TYPE: `Dict[str, Tensor]`

RETURNS	DESCRIPTION
`Dict[str, Tensor]`	A dictionary of tensors after applying the operator

Source code in src/pydvl/influence/torch/base.py

def apply_to_dict(self, mat: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
    """
    Applies the operator to a dictionary of tensors, compatible to the structure
    defined by the property `input_dict_structure`.

    Args:
        mat: dictionary of tensors, whose keys and shapes match the property
            `input_dict_structure`.

    Returns:
        A dictionary of tensors after applying the operator
    """

    if not self._validate_mat_dict(mat):
        raise ValueError(
            f"Incompatible input structure, expected (excluding batch "
            f"dimension): \n {self.input_dict_structure}"
        )

    return self._apply_to_dict(self._dict_to_device(mat))

InverseHarmonicMeanOperator ¶

InverseHarmonicMeanOperator(
    model: Module,
    loss: Callable[[Tensor, Tensor], Tensor],
    dataloader: DataLoader,
    regularization: float,
    restrict_to: Optional[Dict[str, Parameter]] = None,
)

Bases: _AveragingBatchOperator[InverseHarmonicMeanBatchOperation, PointAveraging]

Given a model and a loss function, computes an approximation of the inverse Gauss-Newton vector or matrix product per batch and averages the results.

Viewing the damped Gauss-Newton matrix

\[\begin{align*} G_{\lambda}(\text{model}, \text{loss}, b, \theta) &= \frac{1}{|b|}\sum_{(x, y) \in b}\nabla_{\theta}\ell (x,y; \theta) \nabla_{\theta}\ell (x,y; \theta)^t + \lambda \operatorname{I}, \\ \ell(x,y; \theta) &= \text{loss}(\text{model}(x; \theta), y) \end{align*}\]

as an arithmetic mean of the rank-$1$ updates, this operator replaces it with the harmonic mean of the rank-$1$ updates, i.e.

\[ \tilde{G}_{\lambda}(\text{model}, \text{loss}, b, \theta) = \left(\frac{1}{|b|} \sum_{(x, y) \in b} \left( \nabla_{\theta}\ell (x,y; \theta) \nabla_{\theta}\ell (x,y; \theta)^t + \lambda \operatorname{I}\right)^{-1} \right)^{-1}\]

and computes

\[ \tilde{G}_{\lambda}^{-1}(\text{model}, \text{loss}, b, \theta) \cdot v\]

for any given batch $b$, where model is a torch.nn.Module and $v$ is a vector or matrix.

In other words, it switches the order of summation and inversion, which resolves to the inverse harmonic mean of the rank-$1$ updates. The results are averaged over the batches provided by the data loader.

The inverses of the rank-$1$ updates are not calculated explicitly, but instead a vectorized version of the Sherman–Morrison formula is applied.

For more information, see Inverse Harmonic Mean.

PARAMETER	DESCRIPTION
`model`	The model. TYPE: `Module`
`loss`	The loss function. TYPE: `Callable[[Tensor, Tensor], Tensor]`
`dataloader`	The data loader providing batches of data. TYPE: `DataLoader`
`regularization`	The regularization parameter $\lambda$. Must be positive. TYPE: `float`
`restrict_to`	The parameters to restrict the differentiation to, i.e. the corresponding sub-matrix of the Jacobian. If None, the full Jacobian is used. Make sure the input has the correct dimension, i.e. its last dimension must be equal to the property `input_size`. TYPE: `Optional[Dict[str, Parameter]]` DEFAULT: `None`
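The Sherman–Morrison step can be sketched as follows. For a single gradient $g$, $(g g^t + \lambda I)^{-1} v = \frac{1}{\lambda}\left(v - \frac{g \, (g^t v)}{\lambda + g^t g}\right)$, so no matrix inverse is needed. The sketch below is plain PyTorch under stated assumptions, not pyDVL's implementation: the per-sample gradients are random stand-ins, and the harmonic mean is assumed to be normalized by the batch size, so that its inverse is the average of the per-sample inverses.

```python
import torch

torch.manual_seed(0)

# Stand-ins for per-sample gradients, shape (|b|, number of parameters).
sample_grads = torch.randn(5, 4, dtype=torch.float64)
v = torch.randn(4, dtype=torch.float64)
lam = 0.1  # regularization, must be positive

# Per-sample coefficient (g_i . v) / (lam + g_i . g_i), shape (|b|,).
coeff = (sample_grads @ v) / (lam + (sample_grads * sample_grads).sum(dim=1))

# Average over the batch of (v - coeff_i * g_i) / lam, vectorized over samples.
ihm_vec = (v - (coeff.unsqueeze(1) * sample_grads).mean(dim=0)) / lam
```

The cost is linear in the number of parameters per sample, compared with the cubic cost of inverting each regularized rank-one matrix explicitly.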

Source code in src/pydvl/influence/torch/operator.py

def __init__(
    self,
    model: nn.Module,
    loss: Callable[[torch.Tensor, torch.Tensor], torch.Tensor],
    dataloader: DataLoader,
    regularization: float,
    restrict_to: Optional[Dict[str, nn.Parameter]] = None,
):
    if regularization <= 0:
        raise ValueError("regularization must be positive")

    self._regularization = regularization

    batch_op = InverseHarmonicMeanBatchOperation(
        model,
        loss,
        regularization,
        restrict_to=restrict_to,
    )
    averaging = PointAveraging()
    super().__init__(batch_op, dataloader, averaging)

apply ¶

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER	DESCRIPTION
`tensor`	A tensor whose trailing dimension must conform to the operator's input size. TYPE: `TensorType`

RETURNS	DESCRIPTION
`TensorType`	A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py

def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

apply_to_dict ¶

apply_to_dict(mat: Dict[str, Tensor]) -> Dict[str, Tensor]

Applies the operator to a dictionary of tensors that is compatible with the structure defined by the property input_dict_structure.

PARAMETER	DESCRIPTION
`mat`	dictionary of tensors, whose keys and shapes match the property `input_dict_structure`. TYPE: `Dict[str, Tensor]`

RETURNS	DESCRIPTION
`Dict[str, Tensor]`	A dictionary of tensors after applying the operator

Source code in src/pydvl/influence/torch/base.py

def apply_to_dict(self, mat: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
    """
    Applies the operator to a dictionary of tensors, compatible to the structure
    defined by the property `input_dict_structure`.

    Args:
        mat: dictionary of tensors, whose keys and shapes match the property
            `input_dict_structure`.

    Returns:
        A dictionary of tensors after applying the operator
    """

    if not self._validate_mat_dict(mat):
        raise ValueError(
            f"Incompatible input structure, expected (excluding batch "
            f"dimension): \n {self.input_dict_structure}"
        )

    return self._apply_to_dict(self._dict_to_device(mat))

DirectSolveOperator ¶

DirectSolveOperator(
    matrix: Tensor,
    regularization: Optional[float] = None,
    in_place_regularization: bool = False,
)

Bases: TensorOperator

Given a matrix $A$ and an optional regularization parameter $\lambda$, computes the solution of the system $(A+\lambda I)x = b$, where $b$ is a vector or a matrix. Internally, it uses the routine torch.linalg.solve.

PARAMETER	DESCRIPTION
`matrix`	the system matrix TYPE: `Tensor`
`regularization`	the regularization parameter TYPE: `Optional[float]` DEFAULT: `None`
`in_place_regularization`	If True, the input matrix is modified in-place, by adding the regularization value to the diagonal. TYPE: `bool` DEFAULT: `False`
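What the operator computes can be illustrated directly with torch.linalg.solve. This is a minimal sketch with made-up data, not the class itself: it adds $\lambda$ to the diagonal of a copy of the matrix, which corresponds to the default `in_place_regularization=False`, and then solves the system.

```python
import torch

torch.manual_seed(0)

# Illustrative symmetric positive semi-definite system matrix.
a = torch.randn(4, 4, dtype=torch.float64)
matrix = a @ a.T
lam = 0.5
b = torch.randn(4, dtype=torch.float64)

# Regularize a copy so the caller's matrix stays untouched.
regularized = matrix.clone()
regularized.diagonal().add_(lam)  # diagonal() is a view, so this edits the copy

# Solve (A + lam * I) x = b.
x = torch.linalg.solve(regularized, b)
```

Regularizing in place avoids the extra copy, which matters for large matrices, at the price of modifying the tensor passed in.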

Source code in src/pydvl/influence/torch/operator.py

def __init__(
    self,
    matrix: torch.Tensor,
    regularization: Optional[float] = None,
    in_place_regularization: bool = False,
):
    if regularization is None:
        self.matrix = matrix
    else:
        self.matrix = self._update_diagonal(
            matrix if in_place_regularization else matrix.clone(), regularization
        )
    self._regularization = regularization

apply ¶

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER	DESCRIPTION
`tensor`	A tensor whose trailing dimension must conform to the operator's input size. TYPE: `TensorType`

RETURNS	DESCRIPTION
`TensorType`	A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py

def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

LissaOperator ¶

LissaOperator(
    batch_operation: BatchOperationType,
    data: DataLoader,
    regularization: Optional[float] = None,
    maxiter: int = 1000,
    dampen: float = 0.0,
    scale: float = 10.0,
    rtol: float = 0.0001,
    progress: bool = False,
    warn_on_max_iteration: bool = True,
)

Bases: TensorOperator, Generic[BatchOperationType]

Uses LiSSA (Linear time Stochastic Second-Order Algorithm) to iteratively approximate the solution of the system $(A + \lambda I)x = b$. This is done with the update

\[(A + \lambda I)^{-1}_{j+1} b = b + (1 - d) \, (A + \lambda I)^{-1}_j b - \frac{(A + \lambda I) \, (A + \lambda I)^{-1}_j b}{s},\]

where $(A + \lambda I)^{-1}_j b$ denotes the $j$-th iterate, $I$ is the identity matrix, $d$ is a dampening term and $s$ a scaling factor that are applied to help convergence. For details, see Linear time Stochastic Second-Order Approximation (LiSSA).

PARAMETER	DESCRIPTION
`batch_operation`	The `BatchOperation` representing the action of A on a batch of the data loader. TYPE: `BatchOperationType`
`data`	a pytorch dataloader TYPE: `DataLoader`
`regularization`	Optional regularization parameter added to the Hessian-vector product for numerical stability. TYPE: `Optional[float]` DEFAULT: `None`
`maxiter`	Maximum number of iterations. TYPE: `int` DEFAULT: `1000`
`dampen`	Dampening factor, defaults to 0 for no dampening. TYPE: `float` DEFAULT: `0.0`
`scale`	Scaling factor, defaults to 10. TYPE: `float` DEFAULT: `10.0`
`rtol`	tolerance to use for early stopping TYPE: `float` DEFAULT: `0.0001`
`progress`	If True, display progress bars. TYPE: `bool` DEFAULT: `False`
`warn_on_max_iteration`	If True, logs a warning, if the desired tolerance is not achieved within `maxiter` iterations. If False, the log level for this information is `logging.DEBUG` TYPE: `bool` DEFAULT: `True`
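The recursion can be sketched in a few lines. This is a deterministic illustration, not pyDVL's implementation: it applies one fixed matrix in every step instead of drawing a batch from a data loader, the helper name `lissa_solve` and the stopping rule on the relative change are assumptions, and the final division by the scale assumes `dampen=0`.

```python
import torch

# Illustrative symmetric positive definite matrix with eigenvalues in [1, 6].
matrix = torch.diag(torch.tensor([1.0, 2.0, 3.0, 4.0], dtype=torch.float64))
matrix = matrix + 0.5 * torch.ones(4, 4, dtype=torch.float64)
lam = 0.1
b = torch.tensor([1.0, -2.0, 0.5, 3.0], dtype=torch.float64)


def regularized_matvec(x: torch.Tensor) -> torch.Tensor:
    """Action of (A + lam * I) on x."""
    return matrix @ x + lam * x


def lissa_solve(matvec, b, scale=10.0, dampen=0.0, maxiter=1000, rtol=1e-10):
    x = b.clone()
    for _ in range(maxiter):
        # x_{j+1} = b + (1 - d) x_j - (A + lam I) x_j / s
        x_new = b + (1 - dampen) * x - matvec(x) / scale
        converged = torch.linalg.norm(x_new - x) <= rtol * torch.linalg.norm(x_new)
        x = x_new
        if converged:
            break
    # With dampen = 0 the fixed point is s * (A + lam I)^{-1} b, so undo the scaling.
    return x / scale


solution = lissa_solve(regularized_matvec, b)
```

The iteration contracts only if the eigenvalues of $(A + \lambda I)/s$ lie in $(0, 2)$, which is why a larger `scale` helps when the matrix has large eigenvalues, at the cost of slower convergence.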

Source code in src/pydvl/influence/torch/operator.py

def __init__(
    self,
    batch_operation: BatchOperationType,
    data: DataLoader,
    regularization: Optional[float] = None,
    maxiter: int = 1000,
    dampen: float = 0.0,
    scale: float = 10.0,
    rtol: float = 1e-4,
    progress: bool = False,
    warn_on_max_iteration: bool = True,
):

    if regularization is not None and regularization < 0:
        raise ValueError("regularization must be non-negative")

    self.data = data
    self.warn_on_max_iteration = warn_on_max_iteration
    self.progress = progress
    self.rtol = rtol
    self.scale = scale
    self.dampen = dampen
    self.maxiter = maxiter
    self.batch_operation = batch_operation
    self._regularization = regularization

apply ¶

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER	DESCRIPTION
`tensor`	A tensor whose trailing dimension must conform to the operator's input size. TYPE: `TensorType`

RETURNS	DESCRIPTION
`TensorType`	A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py

def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

LowRankOperator ¶

LowRankOperator(
    low_rank_representation: LowRankProductRepresentation,
    regularization: Optional[float] = None,
    exact: bool = True,
)

Bases: TensorOperator

Given a low rank representation of a matrix

\[ A = V D V^T\]

with a diagonal matrix $D$ and an optional regularization parameter $\lambda$, computes

\[ (V D V^T+\lambda I)^{-1}b.\]

If exact is True, the inverse action is computed exactly using the [Sherman–Morrison–Woodbury formula](https://en.wikipedia.org/wiki/Woodbury_matrix_identity). If exact is set to False, the inverse action is approximated by

\[ V(D+\lambda I)^{-1}V^Tb.\]

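PARAMETER	DESCRIPTION
`low_rank_representation`	The low-rank representation $V D V^T$ of the matrix. TYPE: `LowRankProductRepresentation`
`regularization`	Optional regularization parameter $\lambda$. Must be positive if `exact` is True and non-negative otherwise. TYPE: `Optional[float]` DEFAULT: `None`
`exact`	If True, the inverse action is computed exactly. If False, it is approximated. TYPE: `bool` DEFAULT: `True`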
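The difference between the two modes can be sketched in plain PyTorch. This is not pyDVL's implementation: it assumes that $V$ has orthonormal columns, in which case the Woodbury identity reduces to $(V D V^T + \lambda I)^{-1} b = \frac{1}{\lambda}\left(b - V \operatorname{diag}\left(\frac{d_i}{d_i + \lambda}\right) V^T b\right)$, and the helper names are hypothetical.

```python
import torch

torch.manual_seed(0)

n, k, lam = 6, 2, 0.1

# Orthonormal columns from a reduced QR decomposition (an assumption of this sketch).
V, _ = torch.linalg.qr(torch.randn(n, k, dtype=torch.float64))
d = torch.tensor([3.0, 1.5], dtype=torch.float64)  # diagonal of D
b = torch.randn(n, dtype=torch.float64)


def exact_inverse_action(b: torch.Tensor) -> torch.Tensor:
    """Woodbury identity specialized to V^T V = I."""
    return (b - V @ ((d / (d + lam)) * (V.T @ b))) / lam


def approximate_inverse_action(b: torch.Tensor) -> torch.Tensor:
    """Inverts only on the span of V and drops the orthogonal complement."""
    return V @ ((V.T @ b) / (d + lam))


exact = exact_inverse_action(b)
approx = approximate_inverse_action(b)
```

Under this assumption the two results differ by $\frac{1}{\lambda}(b - V V^T b)$, so they coincide exactly when $b$ lies in the span of the columns of $V$.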

Source code in src/pydvl/influence/torch/operator.py

def __init__(
    self,
    low_rank_representation: LowRankProductRepresentation,
    regularization: Optional[float] = None,
    exact: bool = True,
):

    if exact and (regularization is None or regularization <= 0):
        raise ValueError("regularization must be positive when exact=True")
    elif regularization is not None and regularization < 0:
        raise ValueError("regularization must be non-negative")

    self._regularization = regularization
    self._exact = exact
    self._low_rank_representation = low_rank_representation

apply ¶

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER	DESCRIPTION
`tensor`	A tensor whose trailing dimension must conform to the operator's input size. TYPE: `TensorType`

RETURNS	DESCRIPTION
`TensorType`	A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py

def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

MatrixOperator ¶

MatrixOperator(matrix: Tensor)

Bases: TensorOperator

A simple wrapper for a torch.Tensor acting as a matrix mapping.
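As an illustration of a matrix mapping with the trailing-dimension convention used by apply, the sketch below multiplies a single vector and a batch of vectors by a matrix in plain PyTorch. It is not the class itself, and it assumes that the rows of a two-dimensional input are treated as separate vectors.

```python
import torch

# Illustrative matrix mapping R^3 to R^2.
matrix = torch.tensor([[2.0, 0.0, 1.0], [0.0, 3.0, -1.0]])

# A single vector: the trailing (only) dimension equals the matrix input size 3.
single = torch.tensor([1.0, 1.0, 1.0])
mapped_single = matrix @ single

# A batch of vectors stored as rows: map every row with one matrix product.
batch = torch.tensor([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]])
mapped_batch = batch @ matrix.T
```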

Source code in src/pydvl/influence/torch/operator.py

def __init__(self, matrix: torch.Tensor):
    self.matrix = matrix

apply ¶

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER	DESCRIPTION
`tensor`	A tensor whose trailing dimension must conform to the operator's input size. TYPE: `TensorType`

RETURNS	DESCRIPTION
`TensorType`	A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py

def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)

CgOperator ¶

CgOperator(
    operator: TensorOperator,
    regularization: Optional[float] = None,
    rtol: float = 1e-07,
    atol: float = 1e-07,
    maxiter: Optional[int] = None,
    progress: bool = False,
    preconditioner: Optional[Preconditioner] = None,
    use_block_cg: bool = False,
    warn_on_max_iteration: bool = True,
)

Bases: TensorOperator

Given an operator, uses the conjugate gradient method to calculate the action of its inverse. More precisely, it finds $x$ such that $Ax = b$, with $A$ being the matrix represented by the operator and $b$ the tensor the operator is applied to. For more info, see Conjugate Gradient.

PARAMETER	DESCRIPTION
`operator`	The operator representing the matrix $A$. TYPE: `TensorOperator`
`regularization`	Optional regularization parameter added to the matrix vector product for numerical stability. TYPE: `Optional[float]` DEFAULT: `None`
`rtol`	Maximum relative tolerance of result. TYPE: `float` DEFAULT: `1e-07`
`atol`	Absolute tolerance of result. TYPE: `float` DEFAULT: `1e-07`
`maxiter`	Maximum number of iterations. If None, defaults to `10*len(b)`. TYPE: `Optional[int]` DEFAULT: `None`
`progress`	If True, display progress bars for computing in the non-block mode (use_block_cg=False). TYPE: `bool` DEFAULT: `False`
`preconditioner`	Optional pre-conditioner to improve convergence of conjugate gradient method TYPE: `Optional[Preconditioner]` DEFAULT: `None`
`use_block_cg`	If True, use block variant of conjugate gradient method, which solves several right hand sides simultaneously TYPE: `bool` DEFAULT: `False`
`warn_on_max_iteration`	If True, logs a warning, if the desired tolerance is not achieved within `maxiter` iterations. If False, the log level for this information is `logging.DEBUG` TYPE: `bool` DEFAULT: `True`
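For orientation, the textbook conjugate gradient iteration for a symmetric positive definite matrix can be sketched as follows. This is not pyDVL's implementation: it has no preconditioner and no block variant, and the helper name `conjugate_gradient` and the stopping rule (residual norm below the larger of `atol` and `rtol` times the norm of $b$) are assumptions of this sketch.

```python
import torch

# Illustrative symmetric positive definite matrix.
matrix = torch.diag(torch.tensor([1.0, 2.0, 3.0, 4.0], dtype=torch.float64))
matrix = matrix + 0.5 * torch.ones(4, 4, dtype=torch.float64)
b = torch.tensor([1.0, -2.0, 0.5, 3.0], dtype=torch.float64)


def conjugate_gradient(matvec, b, rtol=1e-7, atol=1e-7, maxiter=None):
    maxiter = 10 * len(b) if maxiter is None else maxiter
    x = torch.zeros_like(b)
    r = b.clone()  # residual b - A x for the initial guess x = 0
    p = r.clone()  # first search direction
    rs = r @ r
    tol = max(rtol * float(torch.linalg.norm(b)), atol)
    for _ in range(maxiter):
        if rs.sqrt() <= tol:
            break
        ap = matvec(p)
        alpha = rs / (p @ ap)  # step length along p
        x = x + alpha * p
        r = r - alpha * ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p  # next direction, conjugate to the previous ones
        rs = rs_new
    return x


solution = conjugate_gradient(lambda x: matrix @ x, b)
```

Only matrix-vector products are required, which is why the method pairs naturally with operators such as Hessian or Gauss-Newton products that never form the matrix.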

Source code in src/pydvl/influence/torch/operator.py

def __init__(
    self,
    operator: TensorOperator,
    regularization: Optional[float] = None,
    rtol: float = 1e-7,
    atol: float = 1e-7,
    maxiter: Optional[int] = None,
    progress: bool = False,
    preconditioner: Optional[Preconditioner] = None,
    use_block_cg: bool = False,
    warn_on_max_iteration: bool = True,
):

    if regularization is not None and regularization < 0:
        raise ValueError("regularization must be non-negative")

    self.progress = progress
    self.warn_on_max_iteration = warn_on_max_iteration
    self.use_block_cg = use_block_cg
    self.preconditioner = preconditioner
    self.maxiter = maxiter
    self.atol = atol
    self.rtol = rtol
    self._regularization = regularization
    self.operator = operator

apply ¶

apply(tensor: TensorType) -> TensorType

Applies the operator to a tensor.

PARAMETER	DESCRIPTION
`tensor`	A tensor whose trailing dimension must conform to the operator's input size. TYPE: `TensorType`

RETURNS	DESCRIPTION
`TensorType`	A tensor representing the result of the operator application.

Source code in src/pydvl/influence/types.py

def apply(self, tensor: TensorType) -> TensorType:
    """
    Applies the operator to a tensor.

    Args:
        tensor: A tensor whose trailing dimension must conform to the
            operator's input size

    Returns:
        A tensor representing the result of the operator application.
    """
    self._validate_tensor_input(tensor)
    return self._apply(tensor)