
Influence function model

This module provides several implementations of InfluenceFunctionModel built on PyTorch.

TorchInfluenceFunctionModel(model, loss)

Bases: InfluenceFunctionModel[Tensor, DataLoader], ABC

Abstract base class for influence computation related to torch models

Source code in src/pydvl/influence/torch/influence_function_model.py
def __init__(
    self,
    model: nn.Module,
    loss: Callable[[torch.Tensor, torch.Tensor], torch.Tensor],
):
    self.loss = loss
    self.model = model
    self._n_parameters = sum(
        [p.numel() for p in model.parameters() if p.requires_grad]
    )
    self._model_device = next(
        (p.device for p in model.parameters() if p.requires_grad)
    )
    self._model_params = {
        k: p.detach() for k, p in self.model.named_parameters() if p.requires_grad
    }
    super().__init__()

is_fitted abstractmethod property

Override this to expose the fitting status of the instance.

fit(data) abstractmethod

Override this method to fit the influence function model to training data, e.g. to pre-compute the Hessian matrix or suitable matrix decompositions.

PARAMETER DESCRIPTION
data

The training data.

TYPE: DataLoaderType

RETURNS DESCRIPTION

The fitted instance

Source code in src/pydvl/influence/base_influence_function_model.py
@abstractmethod
def fit(self, data: DataLoaderType):
    """
    Override this method to fit the influence function model to training data,
    e.g. pre-compute hessian matrix or matrix decompositions

    Args:
        data: The training data.

    Returns:
        The fitted instance
    """

influences(x_test, y_test, x=None, y=None, mode=InfluenceMode.Up)

Compute the approximation of

\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the case of up-weighting influence, resp.

\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the perturbation type influence case.

PARAMETER DESCRIPTION
x_test

model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)

TYPE: Tensor

y_test

label tensor to compute gradients

TYPE: Tensor

x

optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\), if None, use \(x=x_{\text{test}}\)

TYPE: Optional[Tensor] DEFAULT: None

y

optional label tensor to compute gradients

TYPE: Optional[Tensor] DEFAULT: None

mode

enum value of [InfluenceMode][pydvl.influence.base_influence_function_model.InfluenceMode]

TYPE: InfluenceMode DEFAULT: Up

RETURNS DESCRIPTION
Tensor

Tensor representing the element-wise scalar products for the provided batch

Source code in src/pydvl/influence/torch/influence_function_model.py
def influences(
    self,
    x_test: torch.Tensor,
    y_test: torch.Tensor,
    x: Optional[torch.Tensor] = None,
    y: Optional[torch.Tensor] = None,
    mode: InfluenceMode = InfluenceMode.Up,
) -> torch.Tensor:
    r"""
    Compute the approximation of

    \[
    \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})),
        \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle
    \]

    for the case of up-weighting influence, resp.

    \[
    \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})),
        \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle
    \]

    for the perturbation type influence case.

    Args:
        x_test: model input to use in the gradient computations
            of $H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
                f_{\theta}(x_{\text{test}}))$
        y_test: label tensor to compute gradients
        x: optional model input to use in the gradient computations
            $\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            resp. $\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            if None, use $x=x_{\text{test}}$
        y: optional label tensor to compute gradients
        mode: enum value of [InfluenceMode]
            [pydvl.influence.base_influence_function_model.InfluenceMode]

    Returns:
        Tensor representing the element-wise scalar products for the provided batch

    """
    t: torch.Tensor = super().influences(x_test, y_test, x, y, mode=mode)
    return t

influence_factors(x, y)

Compute approximation of

\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]

where the gradient is meant to be per sample of the batch \((x, y)\).

PARAMETER DESCRIPTION
x

model input to use in the gradient computations

TYPE: Tensor

y

label tensor to compute gradients

TYPE: Tensor

RETURNS DESCRIPTION
Tensor

Tensor representing the element-wise inverse Hessian matrix vector products

Source code in src/pydvl/influence/torch/influence_function_model.py
def influence_factors(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    r"""
    Compute approximation of

    \[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]

    where the gradient is meant to be per sample of the batch $(x, y)$.

    Args:
        x: model input to use in the gradient computations
        y: label tensor to compute gradients

    Returns:
        Tensor representing the element-wise inverse Hessian matrix vector products

    """
    return super().influence_factors(x, y)

influences_from_factors(z_test_factors, x, y, mode=InfluenceMode.Up)

Computation of

\[ \langle z_{\text{test_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the case of up-weighting influence, resp.

\[ \langle z_{\text{test_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).

PARAMETER DESCRIPTION
z_test_factors

pre-computed tensor, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)

TYPE: Tensor

x

model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\)

TYPE: Tensor

y

label tensor to compute gradients

TYPE: Tensor

mode

enum value of [InfluenceMode][pydvl.influence.base_influence_function_model.InfluenceMode]

TYPE: InfluenceMode DEFAULT: Up

RETURNS DESCRIPTION
Tensor

Tensor representing the element-wise scalar products for the provided batch

Source code in src/pydvl/influence/torch/influence_function_model.py
def influences_from_factors(
    self,
    z_test_factors: torch.Tensor,
    x: torch.Tensor,
    y: torch.Tensor,
    mode: InfluenceMode = InfluenceMode.Up,
) -> torch.Tensor:
    r"""
    Computation of

    \[ \langle z_{\text{test_factors}},
        \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

    for the case of up-weighting influence, resp.

    \[ \langle z_{\text{test_factors}},
        \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

    for the perturbation type influence case. The gradient is meant to be per sample
    of the batch $(x, y)$.

    Args:
         z_test_factors: pre-computed tensor, approximating
            $H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
                f_{\theta}(x_{\text{test}}))$
         x: model input to use in the gradient computations
            $\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            resp. $\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))$
         y: label tensor to compute gradients
         mode: enum value of [InfluenceMode]
            [pydvl.influence.base_influence_function_model.InfluenceMode]

    Returns:
        Tensor representing the element-wise scalar products for the provided batch

    """
    if mode == InfluenceMode.Up:
        return (
            z_test_factors
            @ self._loss_grad(x.to(self.model_device), y.to(self.model_device)).T
        )
    elif mode == InfluenceMode.Perturbation:
        return torch.einsum(
            "ia,j...a->ij...",
            z_test_factors,
            self._flat_loss_mixed_grad(
                x.to(self.model_device), y.to(self.model_device)
            ),
        )
    else:
        raise UnsupportedInfluenceModeException(mode)
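
In practice these two methods are used together: compute the factors once for a test batch, then sweep over the training data. A minimal sketch, assuming `if_model` is any fitted implementation and the tensors and loader are hypothetical placeholders:

```python
import torch
from pydvl.influence import InfluenceMode  # import path assumed; adjust to your pyDVL version

# Hypothetical fitted model and data: x_test (n_test, ...), y_test (n_test, ...)
z_test = if_model.influence_factors(x_test, y_test)  # (n_test, n_parameters)

# Reuse the factors against many training batches without re-solving.
scores = [
    if_model.influences_from_factors(z_test, x_b, y_b, mode=InfluenceMode.Up)
    for x_b, y_b in train_loader  # each entry has shape (n_test, len(x_b))
]
all_scores = torch.cat(scores, dim=1)  # (n_test, n_train)
```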

DirectInfluence(model, loss, hessian_regularization=0.0)

Bases: TorchInfluenceFunctionModel

Given a model and training data, it finds \(x\) such that \(Hx = b\), with \(H\) being the Hessian of the model's loss.

PARAMETER DESCRIPTION
model

instance of torch.nn.Module.

TYPE: Module

loss

A callable that takes the model's output and target as input and returns the scalar loss.

TYPE: Callable[[Tensor, Tensor], Tensor]

hessian_regularization

Regularization of the Hessian.

TYPE: float DEFAULT: 0.0

Source code in src/pydvl/influence/torch/influence_function_model.py
def __init__(
    self,
    model: nn.Module,
    loss: Callable[[torch.Tensor, torch.Tensor], torch.Tensor],
    hessian_regularization: float = 0.0,
):
    super().__init__(model, loss)
    self.hessian_regularization = hessian_regularization

influence_factors(x, y)

Compute approximation of

\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]

where the gradient is meant to be per sample of the batch \((x, y)\).

PARAMETER DESCRIPTION
x

model input to use in the gradient computations

TYPE: Tensor

y

label tensor to compute gradients

TYPE: Tensor

RETURNS DESCRIPTION
Tensor

Tensor representing the element-wise inverse Hessian matrix vector products

Source code in src/pydvl/influence/torch/influence_function_model.py
def influence_factors(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    r"""
    Compute approximation of

    \[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]

    where the gradient is meant to be per sample of the batch $(x, y)$.

    Args:
        x: model input to use in the gradient computations
        y: label tensor to compute gradients

    Returns:
        Tensor representing the element-wise inverse Hessian matrix vector products

    """
    return super().influence_factors(x, y)

influences_from_factors(z_test_factors, x, y, mode=InfluenceMode.Up)

Computation of

\[ \langle z_{\text{test_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the case of up-weighting influence, resp.

\[ \langle z_{\text{test_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).

PARAMETER DESCRIPTION
z_test_factors

pre-computed tensor, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)

TYPE: Tensor

x

model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\)

TYPE: Tensor

y

label tensor to compute gradients

TYPE: Tensor

mode

enum value of [InfluenceMode][pydvl.influence.base_influence_function_model.InfluenceMode]

TYPE: InfluenceMode DEFAULT: Up

RETURNS DESCRIPTION
Tensor

Tensor representing the element-wise scalar products for the provided batch

Source code in src/pydvl/influence/torch/influence_function_model.py
def influences_from_factors(
    self,
    z_test_factors: torch.Tensor,
    x: torch.Tensor,
    y: torch.Tensor,
    mode: InfluenceMode = InfluenceMode.Up,
) -> torch.Tensor:
    r"""
    Computation of

    \[ \langle z_{\text{test_factors}},
        \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

    for the case of up-weighting influence, resp.

    \[ \langle z_{\text{test_factors}},
        \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

    for the perturbation type influence case. The gradient is meant to be per sample
    of the batch $(x, y)$.

    Args:
         z_test_factors: pre-computed tensor, approximating
            $H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
                f_{\theta}(x_{\text{test}}))$
         x: model input to use in the gradient computations
            $\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            resp. $\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))$
         y: label tensor to compute gradients
         mode: enum value of [InfluenceMode]
            [pydvl.influence.base_influence_function_model.InfluenceMode]

    Returns:
        Tensor representing the element-wise scalar products for the provided batch

    """
    if mode == InfluenceMode.Up:
        return (
            z_test_factors
            @ self._loss_grad(x.to(self.model_device), y.to(self.model_device)).T
        )
    elif mode == InfluenceMode.Perturbation:
        return torch.einsum(
            "ia,j...a->ij...",
            z_test_factors,
            self._flat_loss_mixed_grad(
                x.to(self.model_device), y.to(self.model_device)
            ),
        )
    else:
        raise UnsupportedInfluenceModeException(mode)

fit(data)

Compute the Hessian matrix based on a provided data loader.

PARAMETER DESCRIPTION
data

Instance of [torch.utils.data.DataLoader][]

TYPE: DataLoader

RETURNS DESCRIPTION
DirectInfluence

The fitted instance

Source code in src/pydvl/influence/torch/influence_function_model.py
def fit(self, data: DataLoader) -> DirectInfluence:
    """
    Compute the Hessian matrix based on a provided data loader.

    Args:
        data: Instance of
            [torch.utils.data.DataLoader][torch.utils.data.DataLoader]

    Returns:
        The fitted instance
    """
    self.hessian = hessian(self.model, self.loss, data)
    return self
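
A minimal end-to-end sketch of the fit-then-query workflow; the model, loss and data below are toy placeholders, not part of the library:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from pydvl.influence.torch import DirectInfluence

torch.manual_seed(0)
model = torch.nn.Linear(5, 1)                 # toy model
loss = torch.nn.functional.mse_loss           # scalar loss(output, target)

x_train, y_train = torch.randn(32, 5), torch.randn(32, 1)
train_loader = DataLoader(TensorDataset(x_train, y_train), batch_size=8)

if_model = DirectInfluence(model, loss, hessian_regularization=0.01)
if_model.fit(train_loader)                    # pre-computes the full Hessian

x_test, y_test = torch.randn(4, 5), torch.randn(4, 1)
values = if_model.influences(x_test, y_test, x_train, y_train)
print(values.shape)                           # (4, 32): one score per test-train pair
```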

influences(x_test, y_test, x=None, y=None, mode=InfluenceMode.Up)

Compute approximation of

\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle, \]

for the case of up-weighting influence, resp.

\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the perturbation type influence case. The action of \(H^{-1}\) is achieved via a direct solver using torch.linalg.solve.
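
For intuition, the up-weighting branch boils down to a dense solve against per-sample gradients; a toy illustration with hypothetical gradient matrices standing in for the real per-sample loss gradients:

```python
import torch

n_params, reg = 10, 0.01
A = torch.randn(n_params, n_params)
H = A @ A.T + reg * torch.eye(n_params)      # regularized SPD toy Hessian

grads_test = torch.randn(4, n_params)        # per-sample test gradients
grads_train = torch.randn(32, n_params)      # per-sample train gradients

# factors = H^{-1} grad_test, computed with a direct solver
factors = torch.linalg.solve(H, grads_test.T).T   # (4, n_params)
influences = factors @ grads_train.T              # (4, 32)
```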

PARAMETER DESCRIPTION
x_test

model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)

TYPE: Tensor

y_test

label tensor to compute gradients

TYPE: Tensor

x

optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\), if None, use \(x=x_{\text{test}}\)

TYPE: Optional[Tensor] DEFAULT: None

y

optional label tensor to compute gradients

TYPE: Optional[Tensor] DEFAULT: None

mode

enum value of [InfluenceMode][pydvl.influence.base_influence_function_model.InfluenceMode]

TYPE: InfluenceMode DEFAULT: Up

RETURNS DESCRIPTION
Tensor

[torch.Tensor][] representing the element-wise scalar products for the provided batch.

Source code in src/pydvl/influence/torch/influence_function_model.py
@log_duration
def influences(
    self,
    x_test: torch.Tensor,
    y_test: torch.Tensor,
    x: Optional[torch.Tensor] = None,
    y: Optional[torch.Tensor] = None,
    mode: InfluenceMode = InfluenceMode.Up,
) -> torch.Tensor:
    r"""
    Compute approximation of

    \[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
        f_{\theta}(x_{\text{test}})),
        \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle, \]

    for the case of up-weighting influence, resp.

    \[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
        f_{\theta}(x_{\text{test}})),
        \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

    for the perturbation type influence case. The action of $H^{-1}$ is achieved
    via a direct solver using [torch.linalg.solve][torch.linalg.solve].

    Args:
        x_test: model input to use in the gradient computations of
            $H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
                f_{\theta}(x_{\text{test}}))$
        y_test: label tensor to compute gradients
        x: optional model input to use in the gradient computations
            $\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            resp. $\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            if None, use $x=x_{\text{test}}$
        y: optional label tensor to compute gradients
        mode: enum value of [InfluenceMode]
            [pydvl.influence.base_influence_function_model.InfluenceMode]

    Returns:
        [torch.Tensor][torch.Tensor] representing the element-wise
            scalar products for the provided batch.

    """
    return super().influences(x_test, y_test, x, y, mode=mode)

CgInfluence(model, loss, hessian_regularization=0.0, x0=None, rtol=1e-07, atol=1e-07, maxiter=None, progress=False)

Bases: TorchInfluenceFunctionModel

Given a model and training data, it uses the conjugate gradient method to compute inverse Hessian-vector products. More precisely, it finds \(x\) such that \(Hx = b\), with \(H\) being the model's Hessian. For more info, see Conjugate Gradient.
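
For reference, a bare-bones conjugate gradient loop for \(Hx = b\) on an explicit SPD toy matrix; this is only a sketch of the idea, with an assumed stopping rule. The actual implementation works matrix-free, through Hessian-vector products:

```python
import torch

def cg_solve(mvp, b, rtol=1e-7, atol=1e-7, maxiter=None, x0=None):
    """Conjugate gradient for H x = b, given only the product v -> H v."""
    x = b.clone() if x0 is None else x0.clone()  # x0 defaults to b
    r = b - mvp(x)                               # residual
    p = r.clone()                                # search direction
    rs = r.dot(r)
    maxiter = 10 * len(b) if maxiter is None else maxiter
    for _ in range(maxiter):
        Hp = mvp(p)
        alpha = rs / p.dot(Hp)
        x = x + alpha * p
        r = r - alpha * Hp
        rs_new = r.dot(r)
        if rs_new.sqrt() < atol + rtol * b.norm():  # assumed stopping rule
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

A = torch.randn(8, 8)
H = A @ A.T + 0.1 * torch.eye(8)   # SPD toy Hessian
b = torch.randn(8)
x = cg_solve(lambda v: H @ v, b)
print(torch.allclose(H @ x, b, atol=1e-4))  # True once converged
```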

PARAMETER DESCRIPTION
model

Instance of torch.nn.Module.

TYPE: Module

loss

A callable that takes the model's output and target as input and returns the scalar loss.

TYPE: Callable[[Tensor, Tensor], Tensor]

hessian_regularization

Regularization of the Hessian.

TYPE: float DEFAULT: 0.0

x0

Initial guess for the solution \(x\). If None, defaults to \(b\).

TYPE: Optional[Tensor] DEFAULT: None

rtol

Maximum relative tolerance of result.

TYPE: float DEFAULT: 1e-07

atol

Absolute tolerance of result.

TYPE: float DEFAULT: 1e-07

maxiter

Maximum number of iterations. If None, defaults to 10*len(b).

TYPE: Optional[int] DEFAULT: None

progress

If True, display progress bars.

TYPE: bool DEFAULT: False

Source code in src/pydvl/influence/torch/influence_function_model.py
def __init__(
    self,
    model: nn.Module,
    loss: Callable[[torch.Tensor, torch.Tensor], torch.Tensor],
    hessian_regularization: float = 0.0,
    x0: Optional[torch.Tensor] = None,
    rtol: float = 1e-7,
    atol: float = 1e-7,
    maxiter: Optional[int] = None,
    progress: bool = False,
):
    super().__init__(model, loss)
    self.progress = progress
    self.maxiter = maxiter
    self.atol = atol
    self.rtol = rtol
    self.x0 = x0
    self.hessian_regularization = hessian_regularization
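
Construction mirrors DirectInfluence; a hedged sketch assuming the same fit-then-query protocol and reusing the toy model, loss and loaders from the example above:

```python
from pydvl.influence.torch import CgInfluence

if_model = CgInfluence(
    model,                      # toy torch.nn.Module from the earlier sketch
    loss,
    hessian_regularization=0.01,
    rtol=1e-5,
    atol=1e-6,
    maxiter=100,                # cap iterations for large models
    progress=True,
)
if_model.fit(train_loader)      # same fit-then-query protocol as above
scores = if_model.influences(x_test, y_test, x_train, y_train)
```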

influence_factors(x, y)

Compute approximation of

\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]

where the gradient is meant to be per sample of the batch \((x, y)\).

PARAMETER DESCRIPTION
x

model input to use in the gradient computations

TYPE: Tensor

y

label tensor to compute gradients

TYPE: Tensor

RETURNS DESCRIPTION
Tensor

Tensor representing the element-wise inverse Hessian matrix vector products

Source code in src/pydvl/influence/torch/influence_function_model.py
def influence_factors(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    r"""
    Compute approximation of

    \[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]

    where the gradient is meant to be per sample of the batch $(x, y)$.

    Args:
        x: model input to use in the gradient computations
        y: label tensor to compute gradients

    Returns:
        Tensor representing the element-wise inverse Hessian matrix vector products

    """
    return super().influence_factors(x, y)

influences_from_factors(z_test_factors, x, y, mode=InfluenceMode.Up)

Computation of

\[ \langle z_{\text{test_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the case of up-weighting influence, resp.

\[ \langle z_{\text{test_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).

PARAMETER DESCRIPTION
z_test_factors

pre-computed tensor, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)

TYPE: Tensor

x

model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\)

TYPE: Tensor

y

label tensor to compute gradients

TYPE: Tensor

mode

enum value of [InfluenceMode][pydvl.influence.base_influence_function_model.InfluenceMode]

TYPE: InfluenceMode DEFAULT: Up

RETURNS DESCRIPTION
Tensor

Tensor representing the element-wise scalar products for the provided batch

Source code in src/pydvl/influence/torch/influence_function_model.py
def influences_from_factors(
    self,
    z_test_factors: torch.Tensor,
    x: torch.Tensor,
    y: torch.Tensor,
    mode: InfluenceMode = InfluenceMode.Up,
) -> torch.Tensor:
    r"""
    Computation of

    \[ \langle z_{\text{test_factors}},
        \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

    for the case of up-weighting influence, resp.

    \[ \langle z_{\text{test_factors}},
        \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

    for the perturbation type influence case. The gradient is meant to be per sample
    of the batch $(x, y)$.

    Args:
         z_test_factors: pre-computed tensor, approximating
            $H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
                f_{\theta}(x_{\text{test}}))$
         x: model input to use in the gradient computations
            $\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            resp. $\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))$
         y: label tensor to compute gradients
         mode: enum value of [InfluenceMode]
            [pydvl.influence.base_influence_function_model.InfluenceMode]

    Returns:
        Tensor representing the element-wise scalar products for the provided batch

    """
    if mode == InfluenceMode.Up:
        return (
            z_test_factors
            @ self._loss_grad(x.to(self.model_device), y.to(self.model_device)).T
        )
    elif mode == InfluenceMode.Perturbation:
        return torch.einsum(
            "ia,j...a->ij...",
            z_test_factors,
            self._flat_loss_mixed_grad(
                x.to(self.model_device), y.to(self.model_device)
            ),
        )
    else:
        raise UnsupportedInfluenceModeException(mode)

influences(x_test, y_test, x=None, y=None, mode=InfluenceMode.Up)

Compute approximation of

\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle, \]

for the case of up-weighting influence, resp.

\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the perturbation type influence case. The approximate action of \(H^{-1}\) is achieved via the [conjugate gradient method](https://en.wikipedia.org/wiki/Conjugate_gradient_method).

PARAMETER DESCRIPTION
x_test

model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)

TYPE: Tensor

y_test

label tensor to compute gradients

TYPE: Tensor

x

optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\), if None, use \(x=x_{\text{test}}\)

TYPE: Optional[Tensor] DEFAULT: None

y

optional label tensor to compute gradients

TYPE: Optional[Tensor] DEFAULT: None

mode

enum value of [InfluenceMode][pydvl.influence.base_influence_function_model.InfluenceMode]

TYPE: InfluenceMode DEFAULT: Up

RETURNS DESCRIPTION
Tensor

[torch.Tensor][] representing the element-wise scalar products for the provided batch.

Source code in src/pydvl/influence/torch/influence_function_model.py
@log_duration
def influences(
    self,
    x_test: torch.Tensor,
    y_test: torch.Tensor,
    x: Optional[torch.Tensor] = None,
    y: Optional[torch.Tensor] = None,
    mode: InfluenceMode = InfluenceMode.Up,
) -> torch.Tensor:
    r"""
    Compute approximation of

    \[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
        f_{\theta}(x_{\text{test}})),
        \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle, \]

    for the case of up-weighting influence, resp.

    \[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
        f_{\theta}(x_{\text{test}})),
        \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

    for the perturbation type influence case. The approximate action of $H^{-1}$
    is achieved via the
    [conjugate gradient method](https://en.wikipedia.org/wiki/Conjugate_gradient_method).

    Args:
        x_test: model input to use in the gradient computations of
            $H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
                f_{\theta}(x_{\text{test}}))$
        y_test: label tensor to compute gradients
        x: optional model input to use in the gradient computations
            $\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            resp. $\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            if None, use $x=x_{\text{test}}$
        y: optional label tensor to compute gradients
        mode: enum value of [InfluenceMode]
            [pydvl.influence.base_influence_function_model.InfluenceMode]

    Returns:
        [torch.Tensor][torch.Tensor] representing the element-wise
            scalar products for the provided batch.

    """
    return super().influences(x_test, y_test, x, y, mode=mode)

LissaInfluence(model, loss, hessian_regularization=0.0, maxiter=1000, dampen=0.0, scale=10.0, h0=None, rtol=0.0001, progress=False)

Bases: TorchInfluenceFunctionModel

Uses LiSSA, the Linear time Stochastic Second-Order Algorithm, to iteratively approximate the inverse Hessian. More precisely, it finds \(x\) such that \(Hx = b\), with \(H\) being the Hessian of the loss with respect to the model parameters. This is done with the update

\[ H^{-1}_{j+1} b = b + (I - d) \, H^{-1}_j b - \frac{H \, H^{-1}_j b}{s}, \]

where \(I\) is the identity matrix, \(d\) is a dampening term and \(s\) a scaling factor that are applied to help convergence. For details, see Linear time Stochastic Second-Order Approximation (LiSSA)
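
The update is easy to check numerically; a toy sketch iterating the recursion on an explicit matrix (the real algorithm uses stochastic Hessian-vector products over batches, and the converged iterate is rescaled by \(s\)):

```python
import torch

torch.manual_seed(0)
A = torch.randn(6, 6)
H = A @ A.T / 6 + 0.1 * torch.eye(6)    # toy SPD Hessian
b = torch.randn(6)

d, s = 0.0, 10.0                        # dampen and scale
h = torch.zeros_like(b)                 # iterate converging to s * H^{-1} b
for _ in range(1000):
    h = b + (1 - d) * h - (H @ h) / s   # the LiSSA update

approx = h / s                          # rescale to recover H^{-1} b
exact = torch.linalg.solve(H, b)
print(torch.linalg.norm(approx - exact) / torch.linalg.norm(exact))  # small
```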

PARAMETER DESCRIPTION
model

instance of torch.nn.Module.

TYPE: Module

loss

A callable that takes the model's output and target as input and returns the scalar loss.

TYPE: Callable[[Tensor, Tensor], Tensor]

hessian_regularization

Regularization of the Hessian.

TYPE: float DEFAULT: 0.0

maxiter

Maximum number of iterations.

TYPE: int DEFAULT: 1000

dampen

Dampening factor, defaults to 0 for no dampening.

TYPE: float DEFAULT: 0.0

scale

Scaling factor, defaults to 10.

TYPE: float DEFAULT: 10.0

h0

Initial guess for the inverse Hessian-vector product.

TYPE: Optional[Tensor] DEFAULT: None

rtol

Tolerance to use for early stopping.

TYPE: float DEFAULT: 0.0001

progress

If True, display progress bars.

TYPE: bool DEFAULT: False

Source code in src/pydvl/influence/torch/influence_function_model.py
def __init__(
    self,
    model: nn.Module,
    loss: Callable[[torch.Tensor, torch.Tensor], torch.Tensor],
    hessian_regularization: float = 0.0,
    maxiter: int = 1000,
    dampen: float = 0.0,
    scale: float = 10.0,
    h0: Optional[torch.Tensor] = None,
    rtol: float = 1e-4,
    progress: bool = False,
):
    super().__init__(model, loss)
    self.maxiter = maxiter
    self.hessian_regularization = hessian_regularization
    self.progress = progress
    self.rtol = rtol
    self.h0 = h0
    self.scale = scale
    self.dampen = dampen
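
A hedged construction sketch, again reusing the toy objects from the DirectInfluence example; for the iteration to converge, `scale` should dominate the largest Hessian eigenvalue:

```python
from pydvl.influence.torch import LissaInfluence

if_model = LissaInfluence(
    model,              # toy torch.nn.Module and loss from the earlier sketch
    loss,
    hessian_regularization=0.01,
    maxiter=2000,
    dampen=0.0,
    scale=50.0,         # assumption: larger than the top Hessian eigenvalue
    rtol=1e-4,
    progress=True,
)
if_model.fit(train_loader)
scores = if_model.influences(x_test, y_test, x_train, y_train)
```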

influence_factors(x, y)

Compute approximation of

\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]

where the gradient is meant to be per sample of the batch \((x, y)\).

PARAMETER DESCRIPTION
x

model input to use in the gradient computations

TYPE: Tensor

y

label tensor to compute gradients

TYPE: Tensor

RETURNS DESCRIPTION
Tensor

Tensor representing the element-wise inverse Hessian matrix vector products

Source code in src/pydvl/influence/torch/influence_function_model.py
def influence_factors(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    r"""
    Compute approximation of

    \[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]

    where the gradient is meant to be per sample of the batch $(x, y)$.

    Args:
        x: model input to use in the gradient computations
        y: label tensor to compute gradients

    Returns:
        Tensor representing the element-wise inverse Hessian matrix vector products

    """
    return super().influence_factors(x, y)

influences(x_test, y_test, x=None, y=None, mode=InfluenceMode.Up)

Compute the approximation of

\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the case of up-weighting influence, resp.

\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the perturbation type influence case.

PARAMETER DESCRIPTION
x_test

model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)

TYPE: Tensor

y_test

label tensor to compute gradients

TYPE: Tensor

x

optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\), if None, use \(x=x_{\text{test}}\)

TYPE: Optional[Tensor] DEFAULT: None

y

optional label tensor to compute gradients

TYPE: Optional[Tensor] DEFAULT: None

mode

enum value of [InfluenceMode][pydvl.influence.base_influence_function_model.InfluenceMode]

TYPE: InfluenceMode DEFAULT: Up

RETURNS DESCRIPTION
Tensor

Tensor representing the element-wise scalar products for the provided batch

Source code in src/pydvl/influence/torch/influence_function_model.py
def influences(
    self,
    x_test: torch.Tensor,
    y_test: torch.Tensor,
    x: Optional[torch.Tensor] = None,
    y: Optional[torch.Tensor] = None,
    mode: InfluenceMode = InfluenceMode.Up,
) -> torch.Tensor:
    r"""
    Compute the approximation of

    \[
    \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})),
        \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle
    \]

    for the case of up-weighting influence, resp.

    \[
    \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})),
        \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle
    \]

    for the perturbation type influence case.

    Args:
        x_test: model input to use in the gradient computations
            of $H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
                f_{\theta}(x_{\text{test}}))$
        y_test: label tensor to compute gradients
        x: optional model input to use in the gradient computations
            $\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            resp. $\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            if None, use $x=x_{\text{test}}$
        y: optional label tensor to compute gradients
        mode: enum value of [InfluenceMode]
            [pydvl.influence.base_influence_function_model.InfluenceMode]

    Returns:
        Tensor representing the element-wise scalar products for the provided batch

    """
    t: torch.Tensor = super().influences(x_test, y_test, x, y, mode=mode)
    return t

influences_from_factors(z_test_factors, x, y, mode=InfluenceMode.Up)

Computation of

\[ \langle z_{\text{test_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the case of up-weighting influence, resp.

\[ \langle z_{\text{test_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).

PARAMETER DESCRIPTION
z_test_factors

pre-computed tensor, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)

TYPE: Tensor

x

model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\)

TYPE: Tensor

y

label tensor to compute gradients

TYPE: Tensor

mode

enum value of [InfluenceMode][pydvl.influence.base_influence_function_model.InfluenceMode]

TYPE: InfluenceMode DEFAULT: Up

RETURNS DESCRIPTION
Tensor

Tensor representing the element-wise scalar products for the provided batch

Source code in src/pydvl/influence/torch/influence_function_model.py
def influences_from_factors(
    self,
    z_test_factors: torch.Tensor,
    x: torch.Tensor,
    y: torch.Tensor,
    mode: InfluenceMode = InfluenceMode.Up,
) -> torch.Tensor:
    r"""
    Computation of

    \[ \langle z_{\text{test_factors}},
        \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

    for the case of up-weighting influence, resp.

    \[ \langle z_{\text{test_factors}},
        \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

    for the perturbation type influence case. The gradient is meant to be per sample
    of the batch $(x, y)$.

    Args:
         z_test_factors: pre-computed tensor, approximating
            $H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
                f_{\theta}(x_{\text{test}}))$
         x: model input to use in the gradient computations
            $\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            resp. $\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))$
         y: label tensor to compute gradients
         mode: enum value of [InfluenceMode]
            [pydvl.influence.base_influence_function_model.InfluenceMode]

    Returns:
        Tensor representing the element-wise scalar products for the provided batch

    """
    if mode == InfluenceMode.Up:
        return (
            z_test_factors
            @ self._loss_grad(x.to(self.model_device), y.to(self.model_device)).T
        )
    elif mode == InfluenceMode.Perturbation:
        return torch.einsum(
            "ia,j...a->ij...",
            z_test_factors,
            self._flat_loss_mixed_grad(
                x.to(self.model_device), y.to(self.model_device)
            ),
        )
    else:
        raise UnsupportedInfluenceModeException(mode)

ArnoldiInfluence(model, loss, hessian_regularization=0.0, rank_estimate=10, krylov_dimension=None, tol=1e-06, max_iter=None, eigen_computation_on_gpu=False)

Bases: TorchInfluenceFunctionModel

Solves the linear system \(Hx = b\), where \(H\) is the Hessian of the model's loss function and \(b\) is the given right-hand side vector. It employs the [implicitly restarted Arnoldi method](https://en.wikipedia.org/wiki/Arnoldi_iteration) for computing a partial eigendecomposition, which is used for the inversion, i.e.

\[x = V D^{-1} V^T b\]

where \(D\) is a diagonal matrix with the top (in absolute value) rank_estimate eigenvalues of the Hessian and \(V\) contains the corresponding eigenvectors. For more information, see Arnoldi.
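
The projection formula is easy to mimic with a dense eigendecomposition; a toy sketch using torch.linalg.eigh in place of the Lanczos/Arnoldi iteration:

```python
import torch

torch.manual_seed(0)
A = torch.randn(20, 20)
H = A @ A.T + 0.1 * torch.eye(20)        # toy SPD Hessian
b = torch.randn(20)

rank = 10
eigvals, eigvecs = torch.linalg.eigh(H)  # eigenvalues in ascending order
D = eigvals[-rank:]                      # top `rank` eigenvalues
V = eigvecs[:, -rank:]                   # matching eigenvectors

x = V @ ((V.T @ b) / D)                  # x = V D^{-1} V^T b
```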

PARAMETER DESCRIPTION
model

Instance of torch.nn.Module. The Hessian will be calculated with respect to this model's parameters.

loss

A callable that takes the model's output and target as input and returns the scalar loss.

hessian_regularization

Optional regularization parameter added to the Hessian-vector product for numerical stability.

TYPE: float DEFAULT: 0.0

rank_estimate

The number of eigenvalues and corresponding eigenvectors to compute. Represents the desired rank of the Hessian approximation.

TYPE: int DEFAULT: 10

krylov_dimension

The number of Krylov vectors to use for the Lanczos method. Defaults to min(model's number of parameters, max(2 times rank_estimate + 1, 20)).

TYPE: Optional[int] DEFAULT: None

tol

The stopping criterion for the Lanczos algorithm. Ignored if low_rank_representation is provided.

TYPE: float DEFAULT: 1e-06

max_iter

The maximum number of iterations for the Lanczos method. Ignored if low_rank_representation is provided.

TYPE: Optional[int] DEFAULT: None

eigen_computation_on_gpu

If True, tries to execute the eigen pair approximation on the model's device via a cupy implementation. Ensure the model size or rank_estimate is appropriate for device memory. If False, the eigen pair approximation is executed on the CPU by the scipy wrapper to ARPACK.

TYPE: bool DEFAULT: False

Source code in src/pydvl/influence/torch/influence_function_model.py
def __init__(
    self,
    model,
    loss,
    hessian_regularization: float = 0.0,
    rank_estimate: int = 10,
    krylov_dimension: Optional[int] = None,
    tol: float = 1e-6,
    max_iter: Optional[int] = None,
    eigen_computation_on_gpu: bool = False,
):

    super().__init__(model, loss)
    self.hessian_regularization = hessian_regularization
    self.rank_estimate = rank_estimate
    self.tol = tol
    self.max_iter = max_iter
    self.krylov_dimension = krylov_dimension
    self.eigen_computation_on_gpu = eigen_computation_on_gpu
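
A hedged usage sketch with the same toy placeholders as before; fitting computes the low-rank decomposition once, after which influence queries are cheap:

```python
from pydvl.influence.torch import ArnoldiInfluence

if_model = ArnoldiInfluence(
    model,              # toy torch.nn.Module and loss from the earlier sketch
    loss,
    hessian_regularization=0.01,
    rank_estimate=5,
    tol=1e-6,
)
if_model.fit(train_loader)   # runs the partial eigendecomposition
scores = if_model.influences(x_test, y_test, x_train, y_train)
```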

influence_factors(x, y)

Compute approximation of

\[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]

where the gradient is meant to be per sample of the batch \((x, y)\).

PARAMETER DESCRIPTION
x

model input to use in the gradient computations

TYPE: Tensor

y

label tensor to compute gradients

TYPE: Tensor

RETURNS DESCRIPTION
Tensor

Tensor representing the element-wise inverse Hessian matrix vector products

Source code in src/pydvl/influence/torch/influence_function_model.py
def influence_factors(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    r"""
    Compute approximation of

    \[ H^{-1}\nabla_{\theta} \ell(y, f_{\theta}(x)) \]

    where the gradient is meant to be per sample of the batch $(x, y)$.

    Args:
        x: model input to use in the gradient computations
        y: label tensor to compute gradients

    Returns:
        Tensor representing the element-wise inverse Hessian matrix vector products

    """
    return super().influence_factors(x, y)

influences(x_test, y_test, x=None, y=None, mode=InfluenceMode.Up)

Compute the approximation of

\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the case of up-weighting influence, resp.

\[ \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})), \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the perturbation type influence case.

PARAMETER DESCRIPTION
x_test

model input to use in the gradient computations of \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)

TYPE: Tensor

y_test

label tensor to compute gradients

TYPE: Tensor

x

optional model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\), if None, use \(x=x_{\text{test}}\)

TYPE: Optional[Tensor] DEFAULT: None

y

optional label tensor to compute gradients

TYPE: Optional[Tensor] DEFAULT: None

mode

enum value of [InfluenceMode][pydvl.influence.base_influence_function_model.InfluenceMode]

TYPE: InfluenceMode DEFAULT: Up

RETURNS DESCRIPTION
Tensor

Tensor representing the element-wise scalar products for the provided batch

Source code in src/pydvl/influence/torch/influence_function_model.py
def influences(
    self,
    x_test: torch.Tensor,
    y_test: torch.Tensor,
    x: Optional[torch.Tensor] = None,
    y: Optional[torch.Tensor] = None,
    mode: InfluenceMode = InfluenceMode.Up,
) -> torch.Tensor:
    r"""
    Compute the approximation of

    \[
    \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})),
        \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle
    \]

    for the case of up-weighting influence, resp.

    \[
    \langle H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}})),
        \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle
    \]

    for the perturbation type influence case.

    Args:
        x_test: model input to use in the gradient computations
            of $H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
                f_{\theta}(x_{\text{test}}))$
        y_test: label tensor to compute gradients
        x: optional model input to use in the gradient computations
            $\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            resp. $\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            if None, use $x=x_{\text{test}}$
        y: optional label tensor to compute gradients
        mode: enum value of [InfluenceMode]
            [pydvl.influence.base_influence_function_model.InfluenceMode]

    Returns:
        Tensor representing the element-wise scalar products for the provided batch

    """
    t: torch.Tensor = super().influences(x_test, y_test, x, y, mode=mode)
    return t

influences_from_factors(z_test_factors, x, y, mode=InfluenceMode.Up)

Computation of

\[ \langle z_{\text{test_factors}}, \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the case of up-weighting influence, resp.

\[ \langle z_{\text{test_factors}}, \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

for the perturbation type influence case. The gradient is meant to be per sample of the batch \((x, y)\).

PARAMETER DESCRIPTION
z_test_factors

pre-computed tensor, approximating \(H^{-1}\nabla_{\theta} \ell(y_{\text{test}}, f_{\theta}(x_{\text{test}}))\)

TYPE: Tensor

x

model input to use in the gradient computations \(\nabla_{\theta}\ell(y, f_{\theta}(x))\), resp. \(\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))\)

TYPE: Tensor

y

label tensor to compute gradients

TYPE: Tensor

mode

enum value of [InfluenceMode][pydvl.influence.base_influence_function_model.InfluenceMode]

TYPE: InfluenceMode DEFAULT: Up

RETURNS DESCRIPTION
Tensor

Tensor representing the element-wise scalar products for the provided batch

Source code in src/pydvl/influence/torch/influence_function_model.py
def influences_from_factors(
    self,
    z_test_factors: torch.Tensor,
    x: torch.Tensor,
    y: torch.Tensor,
    mode: InfluenceMode = InfluenceMode.Up,
) -> torch.Tensor:
    r"""
    Computation of

    \[ \langle z_{\text{test_factors}},
        \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

    for the case of up-weighting influence, resp.

    \[ \langle z_{\text{test_factors}},
        \nabla_{x} \nabla_{\theta} \ell(y, f_{\theta}(x)) \rangle \]

    for the perturbation type influence case. The gradient is meant to be per sample
    of the batch $(x, y)$.

    Args:
         z_test_factors: pre-computed tensor, approximating
            $H^{-1}\nabla_{\theta} \ell(y_{\text{test}},
                f_{\theta}(x_{\text{test}}))$
         x: model input to use in the gradient computations
            $\nabla_{\theta}\ell(y, f_{\theta}(x))$,
            resp. $\nabla_{x}\nabla_{\theta}\ell(y, f_{\theta}(x))$
         y: label tensor to compute gradients
         mode: enum value of [InfluenceMode]
            [pydvl.influence.base_influence_function_model.InfluenceMode]

    Returns:
        Tensor representing the element-wise scalar products for the provided batch

    """
    if mode == InfluenceMode.Up:
        return (
            z_test_factors
            @ self._loss_grad(x.to(self.model_device), y.to(self.model_device)).T
        )
    elif mode == InfluenceMode.Perturbation:
        return torch.einsum(
            "ia,j...a->ij...",
            z_test_factors,
            self._flat_loss_mixed_grad(
                x.to(self.model_device), y.to(self.model_device)
            ),
        )
    else:
        raise UnsupportedInfluenceModeException(mode)

fit(data)

Fitting corresponds to the computation of the low rank decomposition

\[ V D^{-1} V^T \]

of the Hessian defined by the provided data loader.

PARAMETER DESCRIPTION
data

Instance of [torch.utils.data.DataLoader][]

TYPE: DataLoader

RETURNS DESCRIPTION
ArnoldiInfluence

The fitted instance

Source code in src/pydvl/influence/torch/influence_function_model.py
def fit(self, data: DataLoader) -> ArnoldiInfluence:
    r"""
    Fitting corresponds to the computation of the low rank decomposition

    \[ V D^{-1} V^T \]

    of the Hessian defined by the provided data loader.

    Args:
        data: Instance of [torch.utils.data.DataLoader][torch.utils.data.DataLoader]

    Returns:
        The fitted instance

    """
    low_rank_representation = model_hessian_low_rank(
        self.model,
        self.loss,
        data,
        hessian_perturbation=0.0,  # regularization is applied when computing values
        rank_estimate=self.rank_estimate,
        krylov_dimension=self.krylov_dimension,
        tol=self.tol,
        max_iter=self.max_iter,
        eigen_computation_on_gpu=self.eigen_computation_on_gpu,
    )
    self.low_rank_representation = low_rank_representation.to(self.model_device)
    return self
