pydvl.influence.torch.pre_conditioner

PreConditioner

Bases: ABC

Abstract base class for pre-conditioners that improve the convergence of conjugate gradient (CG) for systems of the form

\[ ( A + \lambda \operatorname{I})x = \operatorname{rhs} \]

i.e. a matrix \(M\) such that \(M^{-1}(A + \lambda \operatorname{I})\) has a better condition number than \(A + \lambda \operatorname{I}\).
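The effect on the condition number can be checked numerically. The following is a minimal sketch, not library code: the test matrix, the value of \(\lambda\) and the choice of a diagonal \(M\) are all assumptions made for illustration. It compares the condition number of \(A + \lambda \operatorname{I}\) with that of the symmetrically pre-conditioned matrix \(M^{-1/2}(A + \lambda \operatorname{I})M^{-1/2}\), which has the same eigenvalues as \(M^{-1}(A + \lambda \operatorname{I})\).

```python
import torch

torch.manual_seed(0)
dtype = torch.float64

# Hypothetical test problem: a symmetric positive definite matrix whose
# diagonal entries span four orders of magnitude.
n, lam = 50, 0.1
B = torch.randn(n, n, dtype=dtype)
A = torch.diag(torch.logspace(0, 4, n, dtype=dtype)) + 0.01 * (B @ B.T) / n
system = A + lam * torch.eye(n, dtype=dtype)

# Assumed pre-conditioner for this sketch: M = diag(A) + lam * I.
m_diag = torch.diagonal(system)
m_inv_sqrt = m_diag.rsqrt()

# M^{-1/2} (A + lam I) M^{-1/2} shares its eigenvalues with M^{-1} (A + lam I).
preconditioned = m_inv_sqrt[:, None] * system * m_inv_sqrt[None, :]

cond_before = torch.linalg.cond(system).item()
cond_after = torch.linalg.cond(preconditioned).item()
print(f"condition number before: {cond_before:.1f}, after: {cond_after:.3f}")
```

The symmetric form is used only so that `torch.linalg.cond` operates on a symmetric matrix; CG itself only ever needs the action of \(M^{-1}\).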

fit abstractmethod

fit(
    mat_mat_prod: Callable[[Tensor], Tensor],
    size: int,
    dtype: dtype,
    device: device,
    regularization: float = 0.0,
)

Implement this to fit the pre-conditioner to the matrix represented by mat_mat_prod.

PARAMETER DESCRIPTION
mat_mat_prod

a callable that computes the matrix-matrix product

TYPE: Callable[[Tensor], Tensor]

size

size of the matrix represented by mat_mat_prod

TYPE: int

dtype

data type of the matrix represented by mat_mat_prod

TYPE: dtype

device

device of the matrix represented by mat_mat_prod

TYPE: device

regularization

regularization parameter \(\lambda\) in the equation \(( A + \lambda \operatorname{I})x = \operatorname{rhs}\)

TYPE: float DEFAULT: 0.0

RETURNS DESCRIPTION

self

Source code in src/pydvl/influence/torch/pre_conditioner.py
@abstractmethod
def fit(
    self,
    mat_mat_prod: Callable[[torch.Tensor], torch.Tensor],
    size: int,
    dtype: torch.dtype,
    device: torch.device,
    regularization: float = 0.0,
):
    r"""
    Implement this to fit the pre-conditioner to the matrix represented by the
    mat_mat_prod
    Args:
        mat_mat_prod: a callable that computes the matrix-matrix product
        size: size of the matrix represented by `mat_mat_prod`
        dtype: data type of the matrix represented by `mat_mat_prod`
        device: device of the matrix represented by `mat_mat_prod`
        regularization: regularization parameter $\lambda$ in the equation
            $ ( A + \lambda \operatorname{I})x = \operatorname{rhs} $
    Returns:
        self
    """
    pass
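A `mat_mat_prod` callable does not need to hold the matrix explicitly. As a sketch under stated assumptions, the example below represents \(A = X^T X / n\) for a hypothetical data matrix `X` and applies it to a block of column vectors of shape `(size, k)`, the layout used by the `fit` implementations on this page. The names `X`, `probe` and the choice of \(A\) are illustrative only.

```python
import torch

torch.manual_seed(0)

# Hypothetical data: 200 samples with 8 features, so A = X^T X / n is 8 x 8.
X = torch.randn(200, 8, dtype=torch.float64)
n_samples, size = X.shape


def mat_mat_prod(v: torch.Tensor) -> torch.Tensor:
    """Return A @ v for A = X^T X / n without materializing A.

    `v` has shape (size, k); the result has the same shape.
    """
    return X.T @ (X @ v) / n_samples


# Compare against the explicit matrix on a small block of probe vectors.
probe = torch.randn(size, 3, dtype=torch.float64)
implicit = mat_mat_prod(probe)
explicit = (X.T @ X / n_samples) @ probe
```

A callable of this kind, together with `size`, `dtype=X.dtype` and `device=X.device`, supplies the arguments that `fit` expects.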

solve

solve(rhs: Tensor)

Solve the equation \(MZ = \operatorname{rhs}\).

PARAMETER DESCRIPTION
rhs

right-hand side of the equation; corresponds to the residual vector (or matrix) in the conjugate gradient method

TYPE: Tensor

RETURNS DESCRIPTION

solution \(M^{-1}\operatorname{rhs}\)

Source code in src/pydvl/influence/torch/pre_conditioner.py
def solve(self, rhs: torch.Tensor):
    r"""
    Solve the equation $M@Z = \operatorname{rhs}$
    Args:
        rhs: right hand side of the equation, corresponds to the residuum vector
            (or matrix) in the conjugate gradient method

    Returns:
        solution $M^{-1}\operatorname{rhs}$

    """
    if not self.is_fitted:
        raise NotFittedException(type(self))

    return self._solve(rhs)

to abstractmethod

to(device: device) -> PreConditioner

Implement this to move the (potentially fitted) preconditioner to a specific device

Source code in src/pydvl/influence/torch/pre_conditioner.py
@abstractmethod
def to(self, device: torch.device) -> PreConditioner:
    """Implement this to move the (potentially fitted) preconditioner to a
    specific device"""
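The contract between `fit`, `solve` and `to` can be illustrated with a self-contained stand-in. This is a sketch, not the library's class: `PreConditionerSketch` mirrors only the members visible on this page, `is_fitted` is assumed to be a property, and `NotFittedError` and `ExactDiagonalPreConditioner` are hypothetical names introduced for the example.

```python
from abc import ABC, abstractmethod
from typing import Callable, Optional

import torch


class NotFittedError(RuntimeError):
    """Hypothetical stand-in for the NotFittedException raised in the source."""


class PreConditionerSketch(ABC):
    """Stand-in base class mirroring only the members shown on this page."""

    @property
    @abstractmethod
    def is_fitted(self) -> bool: ...

    @abstractmethod
    def fit(self, mat_mat_prod, size, dtype, device, regularization=0.0): ...

    @abstractmethod
    def _solve(self, rhs: torch.Tensor) -> torch.Tensor: ...

    @abstractmethod
    def to(self, device: torch.device): ...

    def solve(self, rhs: torch.Tensor) -> torch.Tensor:
        # Same guard as in the documented solve method.
        if not self.is_fitted:
            raise NotFittedError(type(self).__name__)
        return self._solve(rhs)


class ExactDiagonalPreConditioner(PreConditionerSketch):
    """Hypothetical subclass: reads the exact diagonal by probing with I."""

    def __init__(self):
        self._diag: Optional[torch.Tensor] = None
        self._reg = 0.0

    @property
    def is_fitted(self) -> bool:
        return self._diag is not None

    def fit(self, mat_mat_prod: Callable, size, dtype, device, regularization=0.0):
        eye = torch.eye(size, dtype=dtype, device=device)
        self._diag = torch.diagonal(mat_mat_prod(eye)).clone()
        self._reg = regularization
        return self

    def _solve(self, rhs: torch.Tensor) -> torch.Tensor:
        inv = 1.0 / (self._diag + self._reg)
        # Support both a single vector and a (size, k) block of columns.
        return rhs * inv if rhs.ndim == 1 else rhs * inv.unsqueeze(-1)

    def to(self, device: torch.device):
        if self._diag is not None:
            self._diag = self._diag.to(device)
        return self


A = torch.tensor([[4.0, 1.0], [1.0, 3.0]])
pre = ExactDiagonalPreConditioner()

guard_triggered = False
try:
    pre.solve(torch.ones(2))  # not fitted yet
except NotFittedError:
    guard_triggered = True

pre.fit(lambda v: A @ v, size=2, dtype=A.dtype, device=A.device, regularization=1.0)
z = pre.solve(torch.tensor([5.0, 4.0]))  # divides elementwise by (4 + 1, 3 + 1)
```

Probing with the identity costs `size` matrix-vector products, so this subclass is only practical for small problems; it is here to show the interface, not as a recommended pre-conditioner.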

JacobiPreConditioner

JacobiPreConditioner(num_samples_estimator: int = 1)

Bases: PreConditioner

Pre-conditioner for improving the convergence of CG for systems of the form

\[ ( A + \lambda \operatorname{I})x = \operatorname{rhs} \]

The JacobiPreConditioner uses the diagonal information of the matrix \(A\). The diagonal elements are not computed directly but estimated via Hutchinson's estimator.

\[ M = \operatorname{diag}\left( \frac{1}{m} \sum_{i=1}^m u_i \odot Au_i \right) + \lambda \operatorname{I} \]

where \(u_i\) are i.i.d. Gaussian random vectors. Works well when the matrix \(A + \lambda \operatorname{I}\) is diagonally dominant. For more information, see the documentation of Conjugate Gradient.

PARAMETER DESCRIPTION
num_samples_estimator

number of samples \(m\) to use in the computation of Hutchinson's estimator

TYPE: int DEFAULT: 1
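The diagonal estimate can be reproduced outside the class. The following sketch follows the same steps as the `fit` source on this page (Gaussian probes, elementwise product, average over samples); the helper name `hutchinson_diagonal`, the test matrix and the sample count are assumptions chosen for illustration.

```python
import torch

torch.manual_seed(0)


def hutchinson_diagonal(mat_mat_prod, size, num_samples, dtype=torch.float64, device="cpu"):
    """Estimate diag(A) as the mean of u_i * (A u_i) over Gaussian probes u_i."""
    u = torch.randn(size, num_samples, dtype=dtype, device=device)
    return (u * mat_mat_prod(u)).sum(dim=1) / num_samples


# Hypothetical test matrix: diagonal 1..20 plus a small constant everywhere.
n = 20
A = torch.diag(torch.arange(1.0, n + 1, dtype=torch.float64))
A = A + 0.05 * torch.ones(n, n, dtype=torch.float64)

estimate = hutchinson_diagonal(lambda v: A @ v, n, num_samples=20000)
exact = torch.diagonal(A)
max_abs_error = (estimate - exact).abs().max().item()
print(f"largest absolute error over the diagonal: {max_abs_error:.3f}")
```

The estimate is unbiased but noisy for small sample counts, and individual entries can even come out negative, so the number of samples trades cost against the quality of the resulting pre-conditioner.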

Source code in src/pydvl/influence/torch/pre_conditioner.py
def __init__(self, num_samples_estimator: int = 1):
    self.num_samples_estimator = num_samples_estimator

solve

solve(rhs: Tensor)

Solve the equation \(MZ = \operatorname{rhs}\).

PARAMETER DESCRIPTION
rhs

right-hand side of the equation; corresponds to the residual vector (or matrix) in the conjugate gradient method

TYPE: Tensor

RETURNS DESCRIPTION

solution \(M^{-1}\operatorname{rhs}\)

Source code in src/pydvl/influence/torch/pre_conditioner.py
def solve(self, rhs: torch.Tensor):
    r"""
    Solve the equation $M@Z = \operatorname{rhs}$
    Args:
        rhs: right hand side of the equation, corresponds to the residuum vector
            (or matrix) in the conjugate gradient method

    Returns:
        solution $M^{-1}\operatorname{rhs}$

    """
    if not self.is_fitted:
        raise NotFittedException(type(self))

    return self._solve(rhs)

fit

fit(
    mat_mat_prod: Callable[[Tensor], Tensor],
    size: int,
    dtype: dtype,
    device: device,
    regularization: float = 0.0,
)

Fits by computing an estimate of the diagonal of the matrix represented by mat_mat_prod via Hutchinson's estimator

PARAMETER DESCRIPTION
mat_mat_prod

a callable representing the matrix-matrix product

TYPE: Callable[[Tensor], Tensor]

size

size of the square matrix

TYPE: int

dtype

data type required for the inputs of mat_mat_prod

TYPE: dtype

device

device required for the inputs of mat_mat_prod

TYPE: device

regularization

regularization parameter \(\lambda\) in \((A+\lambda I)x=b\)

TYPE: float DEFAULT: 0.0

Source code in src/pydvl/influence/torch/pre_conditioner.py
def fit(
    self,
    mat_mat_prod: Callable[[torch.Tensor], torch.Tensor],
    size: int,
    dtype: torch.dtype,
    device: torch.device,
    regularization: float = 0.0,
):
    r"""
    Fits by computing an estimate of the diagonal of the matrix represented by
    `mat_mat_prod` via Hutchinson's estimator

    Args:
        mat_mat_prod: a callable representing the matrix-matrix product
        size: size of the square matrix
        dtype: needed data type of inputs for the mat_mat_prod
        device: needed device for inputs of mat_mat_prod
        regularization: regularization parameter
            $\lambda$ in $(A+\lambda I)x=b$
    """
    random_samples = torch.randn(
        size, self.num_samples_estimator, device=device, dtype=dtype
    )
    diagonal_estimate = torch.sum(
        torch.mul(random_samples, mat_mat_prod(random_samples)), dim=1
    )
    diagonal_estimate /= self.num_samples_estimator
    self._diag = diagonal_estimate
    self._reg = regularization
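The source above stores `_diag` and `_reg` but does not show how they are applied. The sketch below assumes that solving with \(M\) amounts to an elementwise division by `_diag + _reg`, and plugs that into a textbook pre-conditioned CG loop to show the effect on the iteration count. The function `preconditioned_cg`, the test matrix and the use of the exact diagonal instead of the Hutchinson estimate are all assumptions of this example, not library code.

```python
import torch

torch.manual_seed(0)
dtype = torch.float64


def preconditioned_cg(mat_vec, rhs, precond_solve, tol=1e-10, max_iter=1000):
    """Textbook pre-conditioned CG for a symmetric positive definite operator.

    Returns the approximate solution and the number of iterations used.
    """
    x = torch.zeros_like(rhs)
    r = rhs - mat_vec(x)
    z = precond_solve(r)
    p = z.clone()
    rz = torch.dot(r, z)
    for k in range(1, max_iter + 1):
        Ap = mat_vec(p)
        alpha = rz / torch.dot(p, Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if torch.linalg.norm(r) <= tol * torch.linalg.norm(rhs):
            return x, k
        z = precond_solve(r)
        rz_new = torch.dot(r, z)
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter


# Hypothetical, strongly diagonally dominant test problem.
n, lam = 50, 0.1
B = torch.randn(n, n, dtype=dtype)
A = torch.diag(torch.logspace(0, 4, n, dtype=dtype)) + 0.01 * (B @ B.T) / n
b = torch.randn(n, dtype=dtype)


def system(v):
    return A @ v + lam * v


diag = torch.diagonal(A)  # exact diagonal here; the class estimates it instead

x_plain, iters_plain = preconditioned_cg(system, b, lambda r: r)
x_jacobi, iters_jacobi = preconditioned_cg(system, b, lambda r: r / (diag + lam))
print(f"CG iterations without pre-conditioner: {iters_plain}, with Jacobi: {iters_jacobi}")
```

The gain depends on diagonal dominance: for matrices with large off-diagonal mass, a diagonal \(M\) captures little of the spectrum and the iteration count may barely change.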

NystroemPreConditioner

NystroemPreConditioner(rank: int)

Bases: PreConditioner

Pre-conditioner for improving the convergence of CG for systems of the form

\[ (A + \lambda \operatorname{I})x = \operatorname{rhs} \]

The NystroemPreConditioner computes a low-rank approximation

\[ A_{\text{nys}} = (A \Omega)(\Omega^T A \Omega)^{\dagger}(A \Omega)^T = U \Sigma U^T, \]

where \(\Omega\) is a random test matrix and \((\cdot)^{\dagger}\) denotes the Moore-Penrose inverse, and uses the matrix

\[ M^{-1} = (\lambda + \sigma_{\text{rank}})U(\Sigma+ \lambda \operatorname{I})^{-1}U^T+(\operatorname{I} - UU^T) \]

for pre-conditioning, where \( \sigma_{\text{rank}} \) is the smallest of the eigenvalues retained in \(\Sigma\), i.e. the smallest eigenvalue of the low-rank approximation.
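The two formulas above can be evaluated directly with dense tensors. This is a sketch under stated assumptions rather than the library's routine: the sketch matrix \(\Omega\) is taken to be Gaussian with `rank` columns, the test matrix is hypothetical, and everything is formed explicitly, which is only sensible for small problems. The symmetric square root of \(M^{-1}\) is used so that the condition number of the pre-conditioned system can be computed on a symmetric matrix.

```python
import torch

torch.manual_seed(0)
dtype = torch.float64
n, rank, lam = 60, 10, 1e-2

# Hypothetical PSD test matrix: `rank` large eigenvalues, the rest tiny.
Q, _ = torch.linalg.qr(torch.randn(n, n, dtype=dtype))
eigenvalues = torch.cat(
    [torch.logspace(3, 0, rank, dtype=dtype), torch.full((n - rank,), 1e-3, dtype=dtype)]
)
A = (Q * eigenvalues) @ Q.T

# A_nys = (A Omega)(Omega^T A Omega)^+ (A Omega)^T with an assumed Gaussian Omega.
omega = torch.randn(n, rank, dtype=dtype)
Y = A @ omega
A_nys = Y @ torch.linalg.pinv(omega.T @ Y) @ Y.T
A_nys = 0.5 * (A_nys + A_nys.T)  # symmetrize against round-off

# Keep the `rank` largest eigenpairs: A_nys = U diag(sigma) U^T.
evals, evecs = torch.linalg.eigh(A_nys)  # eigenvalues in ascending order
sigma, U = evals[-rank:], evecs[:, -rank:]

eye = torch.eye(n, dtype=dtype)
projector = U @ U.T
scale = lam + sigma.min()  # lambda + sigma_rank

# M^{-1} as in the formula above, and its symmetric square root.
M_inv = scale * (U / (sigma + lam)) @ U.T + (eye - projector)
M_inv_sqrt = scale.sqrt() * (U / (sigma + lam).sqrt()) @ U.T + (eye - projector)

system = A + lam * eye
cond_before = torch.linalg.cond(system).item()
cond_after = torch.linalg.cond(M_inv_sqrt @ system @ M_inv_sqrt).item()
print(f"condition number before: {cond_before:.1f}, after: {cond_after:.1f}")
```

In practice only \(U\) and \(\Sigma\) need to be stored, and \(M^{-1}\) is applied to a vector \(r\) as \((\lambda + \sigma_{\text{rank}})U(\Sigma + \lambda \operatorname{I})^{-1}(U^T r) + r - U(U^T r)\) without forming any \(n \times n\) matrix.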

Source code in src/pydvl/influence/torch/pre_conditioner.py
def __init__(self, rank: int):
    self._rank = rank

solve

solve(rhs: Tensor)

Solve the equation \(MZ = \operatorname{rhs}\).

PARAMETER DESCRIPTION
rhs

right-hand side of the equation; corresponds to the residual vector (or matrix) in the conjugate gradient method

TYPE: Tensor

RETURNS DESCRIPTION

solution \(M^{-1}\operatorname{rhs}\)

Source code in src/pydvl/influence/torch/pre_conditioner.py
def solve(self, rhs: torch.Tensor):
    r"""
    Solve the equation $M@Z = \operatorname{rhs}$
    Args:
        rhs: right hand side of the equation, corresponds to the residuum vector
            (or matrix) in the conjugate gradient method

    Returns:
        solution $M^{-1}\operatorname{rhs}$

    """
    if not self.is_fitted:
        raise NotFittedException(type(self))

    return self._solve(rhs)

fit

fit(
    mat_mat_prod: Callable[[Tensor], Tensor],
    size: int,
    dtype: dtype,
    device: device,
    regularization: float = 0.0,
)

Fits by computing a low-rank approximation of the matrix represented by mat_mat_prod via Nystroem approximation

PARAMETER DESCRIPTION
mat_mat_prod

a callable representing the matrix-matrix product

TYPE: Callable[[Tensor], Tensor]

size

size of the square matrix

TYPE: int

dtype

data type required for the inputs of mat_mat_prod

TYPE: dtype

device

device required for the inputs of mat_mat_prod

TYPE: device

regularization

regularization parameter \(\lambda\) in \((A+\lambda I)x=b\)

TYPE: float DEFAULT: 0.0

Source code in src/pydvl/influence/torch/pre_conditioner.py
def fit(
    self,
    mat_mat_prod: Callable[[torch.Tensor], torch.Tensor],
    size: int,
    dtype: torch.dtype,
    device: torch.device,
    regularization: float = 0.0,
):
    r"""
    Fits by computing a low-rank approximation of the matrix represented by
    `mat_mat_prod` via Nystroem approximation

    Args:
        mat_mat_prod: a callable representing the matrix-matrix product
        size: size of the square matrix
        dtype: needed data type of inputs for the mat_mat_prod
        device: needed device for inputs of mat_mat_prod
        regularization: regularization parameter
            $\lambda$  in $(A+\lambda I)x=b$
    """

    self._low_rank_approx = randomized_nystroem_approximation(
        mat_mat_prod, size, self._rank, dtype, mat_vec_device=device
    )
    self._regularization = regularization