pydvl.influence.torch.batch_operation

This module contains abstractions and implementations for operations carried out on a batch \(b\). These operations are of the form

$$ m(b) \cdot v, $$

where \(m(b)\) is a matrix defined by the data in the batch and \(v\) is a vector or matrix. These batch operations can be used to conveniently build aggregations or recursions over a sequence of batches, e.g. an average of the form

$$ \frac{1}{|B|} \sum_{b \in B} m(b) \cdot v, $$

which is useful when keeping \(B\) in memory is not feasible.
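
For example, such an average can be accumulated batch by batch, as in the following minimal sketch. Here batch_operation stands for any of the operations in this module, and batches is a hypothetical iterable of batches; neither name is part of the API.

import torch

def averaged_product(batch_operation, batches, v: torch.Tensor) -> torch.Tensor:
    # Accumulates (1/|B|) * sum_b m(b) @ v without holding all of B in memory.
    result = torch.zeros_like(v)
    num_batches = 0
    for batch in batches:
        result = result + batch_operation.apply(batch, v)
        num_batches += 1
    return result / num_batches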

HessianBatchOperation

HessianBatchOperation(
    model: Module,
    loss: LossType,
    restrict_to: Optional[Dict[str, Parameter]] = None,
)

Bases: _ModelBasedBatchOperation

Given a model and loss function, computes the Hessian vector or matrix product with respect to the model parameters, i.e.

\[\begin{align*} &\nabla^2_{\theta} L(b;\theta) \cdot v \\ &L(b;\theta) = \left( \frac{1}{|b|} \sum_{(x,y) \in b} \text{loss}(\text{model}(x; \theta), y)\right), \end{align*}\]

where model is a torch.nn.Module and \(v\) is a vector or matrix.

PARAMETER DESCRIPTION
model

The model.

TYPE: Module

loss

The loss function.

TYPE: LossType

restrict_to

The parameters to restrict the second-order differentiation to, i.e. the corresponding sub-matrix of the Hessian. If None, the full Hessian is used. Make sure the input matches the correct dimension, i.e. the last dimension must be equal to the property input_size.

TYPE: Optional[Dict[str, Parameter]] DEFAULT: None

Source code in src/pydvl/influence/torch/batch_operation.py
def __init__(
    self,
    model: torch.nn.Module,
    loss: LossType,
    restrict_to: Optional[Dict[str, torch.nn.Parameter]] = None,
):
    super().__init__(model, restrict_to=restrict_to)
    self._batch_hvp = create_batch_hvp_function(model, loss, reverse_only=True)
    self.loss = loss

apply

apply(batch: TorchBatch, tensor: Tensor)

Applies the batch operation to a tensor.

PARAMETER DESCRIPTION
batch

Batch of data for the computation.

TYPE: TorchBatch

tensor

A tensor compatible with the operation, i.e. it must be at most 2-dimensional and its trailing dimension must be equal to the property input_size.

TYPE: Tensor

RETURNS DESCRIPTION

A tensor after applying the batch operation

Source code in src/pydvl/influence/torch/batch_operation.py
def apply(self, batch: TorchBatch, tensor: torch.Tensor):
    """
    Applies the batch operation to a tensor.
    Args:
        batch: Batch of data for computation
        tensor: A tensor compatible with the operation, i.e. it must be
            at most 2-dimensional, and its trailing dimension must
            be equal to the property `input_size`.

    Returns:
        A tensor after applying the batch operation
    """

    if not tensor.ndim <= 2:
        raise ValueError(
            f"The input tensor must be at most 2-dimensional, got {tensor.ndim}"
        )

    if tensor.shape[-1] != self.input_size:
        raise ValueError(
            "The last dimension of the input tensor must be equal to the "
            "property `input_size`."
        )

    if tensor.ndim == 2:
        return self._apply_to_mat(batch.to(self.device), tensor.to(self.device))
    return self._apply_to_vec(batch.to(self.device), tensor.to(self.device))
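
A minimal usage sketch for a toy regression model follows. The import path for TorchBatch is an assumption and may differ between pyDVL versions; input_size equals the number of parameters the operation differentiates with respect to.

import torch
from pydvl.influence.torch.batch_operation import HessianBatchOperation
from pydvl.influence.torch.base import TorchBatch  # import path is an assumption

model = torch.nn.Linear(5, 1)
loss = torch.nn.functional.mse_loss

hvp = HessianBatchOperation(model, loss)

x, y = torch.randn(10, 5), torch.randn(10, 1)
v = torch.randn(hvp.input_size)  # one entry per model parameter

# Hessian-vector product for this single batch
result = hvp.apply(TorchBatch(x, y), v)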

GaussNewtonBatchOperation

GaussNewtonBatchOperation(
    model: Module,
    loss: LossType,
    restrict_to: Optional[Dict[str, Parameter]] = None,
)

Bases: _ModelBasedBatchOperation

Given a model and loss function, computes the Gauss-Newton vector or matrix product with respect to the model parameters, i.e.

\[\begin{align*} G(\text{model}, \text{loss}, b, \theta) &\cdot v, \\ G(\text{model}, \text{loss}, b, \theta) &= \frac{1}{|b|}\sum_{(x, y) \in b}\nabla_{\theta}\ell (x,y; \theta) \nabla_{\theta}\ell (x,y; \theta)^t, \\ \ell(x,y; \theta) &= \text{loss}(\text{model}(x; \theta), y) \end{align*}\]

where model is a torch.nn.Module and \(v\) is a vector or matrix.

PARAMETER DESCRIPTION
model

The model.

TYPE: Module

loss

The loss function.

TYPE: LossType

restrict_to

The parameters to restrict the differentiation to, i.e. the corresponding sub-matrix of the Jacobian. If None, the full Jacobian is used. Make sure the input matches the correct dimension, i.e. the last dimension must be equal to the property input_size.

TYPE: Optional[Dict[str, Parameter]] DEFAULT: None

Source code in src/pydvl/influence/torch/batch_operation.py
def __init__(
    self,
    model: torch.nn.Module,
    loss: LossType,
    restrict_to: Optional[Dict[str, torch.nn.Parameter]] = None,
):
    super().__init__(model, restrict_to=restrict_to)
    self.gradient_provider = TorchGradientProvider(
        model, loss, self.params_to_restrict_to
    )

apply

apply(batch: TorchBatch, tensor: Tensor)

Applies the batch operation to a tensor.

PARAMETER DESCRIPTION
batch

Batch of data for the computation.

TYPE: TorchBatch

tensor

A tensor compatible with the operation, i.e. it must be at most 2-dimensional and its trailing dimension must be equal to the property input_size.

TYPE: Tensor

RETURNS DESCRIPTION

A tensor after applying the batch operation

Source code in src/pydvl/influence/torch/batch_operation.py
def apply(self, batch: TorchBatch, tensor: torch.Tensor):
    """
    Applies the batch operation to a tensor.
    Args:
        batch: Batch of data for computation
        tensor: A tensor compatible with the operation, i.e. it must be
            at most 2-dimensional, and its trailing dimension must
            be equal to the property `input_size`.

    Returns:
        A tensor after applying the batch operation
    """

    if not tensor.ndim <= 2:
        raise ValueError(
            f"The input tensor must be at most 2-dimensional, got {tensor.ndim}"
        )

    if tensor.shape[-1] != self.input_size:
        raise ValueError(
            "The last dimension of the input tensor must be equal to the "
            "property `input_size`."
        )

    if tensor.ndim == 2:
        return self._apply_to_mat(batch.to(self.device), tensor.to(self.device))
    return self._apply_to_vec(batch.to(self.device), tensor.to(self.device))
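
Since the Gauss-Newton matrix is an average of rank-\(1\) outer products of per-sample gradients, the product with \(v\) can be formed without ever materializing the \(d \times d\) matrix. The following is a sketch of this identity, not the library's implementation (which delegates gradient computation to TorchGradientProvider):

import torch

def gauss_newton_product(grads: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # grads: (n, d) matrix of per-sample gradients, v: (d,) vector.
    # Computes (1/n) * sum_i g_i (g_i^T v) as two matrix-vector products.
    return grads.T @ (grads @ v) / grads.shape[0]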

InverseHarmonicMeanBatchOperation

InverseHarmonicMeanBatchOperation(
    model: Module,
    loss: Callable[[Tensor, Tensor], Tensor],
    regularization: float,
    restrict_to: Optional[Dict[str, Parameter]] = None,
)

Bases: _ModelBasedBatchOperation

Given a model and loss function, computes an approximation of the inverse Gauss-Newton vector or matrix product. Viewing the damped Gauss-Newton matrix

\[\begin{align*} G_{\lambda}(\text{model}, \text{loss}, b, \theta) &= \frac{1}{|b|}\sum_{(x, y) \in b}\nabla_{\theta}\ell (x,y; \theta) \nabla_{\theta}\ell (x,y; \theta)^t + \lambda \operatorname{I}, \\ \ell(x,y; \theta) &= \text{loss}(\text{model}(x; \theta), y) \end{align*}\]

as an arithmetic mean of the rank-\(1\) updates, this operation replaces it with the harmonic mean of the rank-\(1\) updates, i.e.

\[ \tilde{G}_{\lambda}(\text{model}, \text{loss}, b, \theta) = \left(\frac{1}{|b|} \sum_{(x, y) \in b} \left( \nabla_{\theta}\ell (x,y; \theta) \nabla_{\theta}\ell (x,y; \theta)^t + \lambda \operatorname{I}\right)^{-1} \right)^{-1}\]

and computes

\[ \tilde{G}_{\lambda}^{-1}(\text{model}, \text{loss}, b, \theta) \cdot v,\]

where model is a torch.nn.Module and \(v\) is a vector or matrix. In other words, it switches the order of summation and inversion, which yields the inverse of the harmonic mean of the rank-\(1\) updates.

The inverses of the rank-\(1\) updates are not calculated explicitly, but instead a vectorized version of the Sherman–Morrison formula is applied.

For more information, see Inverse Harmonic Mean.
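
For a single rank-\(1\) update with \(A = \lambda \operatorname{I}\), the Sherman–Morrison formula reduces to \((\lambda \operatorname{I} + g g^t)^{-1} v = v / \lambda - g \, \frac{g^t v}{\lambda(\lambda + g^t g)}\). The following is a small numeric sketch of this identity, checked against the explicit inverse; it illustrates the formula only, while the library applies a vectorized variant:

import torch

def rank_one_inverse_product(g: torch.Tensor, v: torch.Tensor, lam: float) -> torch.Tensor:
    # (lam * I + g g^T)^{-1} @ v via Sherman-Morrison, without forming the d x d matrix
    return v / lam - g * (g @ v) / (lam * (lam + g @ g))

g, v, lam = torch.randn(4), torch.randn(4), 0.1
explicit = torch.linalg.solve(lam * torch.eye(4) + torch.outer(g, g), v)
assert torch.allclose(rank_one_inverse_product(g, v, lam), explicit, atol=1e-4)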

PARAMETER DESCRIPTION
model

The model.

TYPE: Module

loss

The loss function.

TYPE: Callable[[Tensor, Tensor], Tensor]

regularization

The regularization parameter \(\lambda\); must be positive.

TYPE: float

restrict_to

The parameters to restrict the differentiation to, i.e. the corresponding sub-matrix of the Jacobian. If None, the full Jacobian is used. Make sure the input matches the correct dimension, i.e. the last dimension must be equal to the property input_size.

TYPE: Optional[Dict[str, Parameter]] DEFAULT: None

Source code in src/pydvl/influence/torch/batch_operation.py
def __init__(
    self,
    model: torch.nn.Module,
    loss: Callable[[torch.Tensor, torch.Tensor], torch.Tensor],
    regularization: float,
    restrict_to: Optional[Dict[str, torch.nn.Parameter]] = None,
):
    if regularization <= 0:
        raise ValueError("regularization must be positive")
    self.regularization = regularization

    super().__init__(model, restrict_to=restrict_to)
    self.gradient_provider = TorchGradientProvider(
        model, loss, self.params_to_restrict_to
    )

apply

apply(batch: TorchBatch, tensor: Tensor)

Applies the batch operation to a tensor.

PARAMETER DESCRIPTION
batch

Batch of data for the computation.

TYPE: TorchBatch

tensor

A tensor compatible with the operation, i.e. it must be at most 2-dimensional and its trailing dimension must be equal to the property input_size.

TYPE: Tensor

RETURNS DESCRIPTION

A tensor after applying the batch operation

Source code in src/pydvl/influence/torch/batch_operation.py
def apply(self, batch: TorchBatch, tensor: torch.Tensor):
    """
    Applies the batch operation to a tensor.
    Args:
        batch: Batch of data for computation
        tensor: A tensor compatible with the operation, i.e. it must be
            at most 2-dimensional, and its trailing dimension must
            be equal to the property `input_size`.

    Returns:
        A tensor after applying the batch operation
    """

    if not tensor.ndim <= 2:
        raise ValueError(
            f"The input tensor must be at most 2-dimensional, got {tensor.ndim}"
        )

    if tensor.shape[-1] != self.input_size:
        raise ValueError(
            "The last dimension of the input tensor must be equal to the "
            "property `input_size`."
        )

    if tensor.ndim == 2:
        return self._apply_to_mat(batch.to(self.device), tensor.to(self.device))
    return self._apply_to_vec(batch.to(self.device), tensor.to(self.device))

ChunkAveraging

Bases: _TensorAveraging[_TensorDictChunkAveraging]

Averages tensors provided by a generator, normalizing by the number of tensors.

PointAveraging

PointAveraging(batch_dim: int = 0)

Bases: _TensorAveraging[_TensorDictPointAveraging]

Averages tensors provided by a generator. The averaging is weighted by the number of points in each tensor, and the final result is normalized by the total number of points.

PARAMETER DESCRIPTION
batch_dim

Dimension from which to extract the number of points for the weighting.

TYPE: int DEFAULT: 0

Source code in src/pydvl/influence/torch/batch_operation.py
def __init__(self, batch_dim: int = 0):
    self.batch_dim = batch_dim
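
A sketch of the weighting scheme this class implements, where results_and_sizes is a hypothetical iterable of per-batch results (each already averaged within its batch) paired with the corresponding batch sizes:

import torch

def point_weighted_average(results_and_sizes) -> torch.Tensor:
    # Weight each per-batch result by its number of points, then
    # normalize by the total number of points.
    total, num_points = None, 0
    for result, batch_size in results_and_sizes:
        weighted = result * batch_size
        total = weighted if total is None else total + weighted
        num_points += batch_size
    return total / num_points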