pydvl.influence.torch.pre_conditioner ¶

PreConditioner ¶

Bases: ABC

Abstract base class for implementing pre-conditioners for improving the convergence of CG for systems of the form

\[ ( A + \lambda \operatorname{I})x = \operatorname{rhs} \]

i.e. a matrix $M$ such that $M^{-1}(A + \lambda \operatorname{I})$ has a better condition number than $A + \lambda \operatorname{I}$.
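As a small numerical illustration of this definition, the sketch below compares the condition number of $A + \lambda \operatorname{I}$ with that of the pre-conditioned system. The test matrix and the choice of $M$ as the diagonal of $A + \lambda \operatorname{I}$ are assumptions made for this example only and do not use pyDVL.

```python
import torch

torch.manual_seed(0)

n, lam = 50, 0.1
# Hypothetical SPD test matrix: diagonal entries spanning three orders of
# magnitude plus a small symmetric positive semi-definite perturbation.
scales = torch.logspace(0, 3, n, dtype=torch.float64)
B = 0.01 * torch.randn(n, n, dtype=torch.float64)
A = torch.diag(scales) + B @ B.T
system = A + lam * torch.eye(n, dtype=torch.float64)

# Illustrative choice of M: the diagonal of A + lam * I.
m_diag = torch.diagonal(system)

# M^{-1/2} (A + lam I) M^{-1/2} has the same eigenvalues as M^{-1} (A + lam I)
# but is symmetric, which keeps the condition number computation simple.
d = m_diag.rsqrt()
preconditioned = d[:, None] * system * d[None, :]

cond_before = torch.linalg.cond(system).item()
cond_after = torch.linalg.cond(preconditioned).item()
print(f"condition number before: {cond_before:.1f}, after: {cond_after:.3f}")
```

Because this matrix is strongly diagonally dominant, even a purely diagonal $M$ reduces the condition number substantially; for other matrices the gain can be much smaller.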

fit `abstractmethod` ¶

fit(
    mat_mat_prod: Callable[[Tensor], Tensor],
    size: int,
    dtype: dtype,
    device: device,
    regularization: float = 0.0,
)

Implement this to fit the pre-conditioner to the matrix represented by `mat_mat_prod`.

PARAMETER	DESCRIPTION
`mat_mat_prod`	a callable that computes the matrix-matrix product TYPE: `Callable[[Tensor], Tensor]`
`size`	size of the matrix represented by `mat_mat_prod` TYPE: `int`
`dtype`	data type of the matrix represented by `mat_mat_prod` TYPE: `dtype`
`device`	device of the matrix represented by `mat_mat_prod` TYPE: `device`
`regularization`	regularization parameter $\lambda$ in the equation $(A + \lambda \operatorname{I})x = \operatorname{rhs}$ TYPE: `float` DEFAULT: `0.0`

RETURNS	DESCRIPTION
	self

Source code in src/pydvl/influence/torch/pre_conditioner.py

@abstractmethod
def fit(
    self,
    mat_mat_prod: Callable[[torch.Tensor], torch.Tensor],
    size: int,
    dtype: torch.dtype,
    device: torch.device,
    regularization: float = 0.0,
):
    r"""
    Implement this to fit the pre-conditioner to the matrix represented by the
    mat_mat_prod
    Args:
        mat_mat_prod: a callable that computes the matrix-matrix product
        size: size of the matrix represented by `mat_mat_prod`
        dtype: data type of the matrix represented by `mat_mat_prod`
        device: device of the matrix represented by `mat_mat_prod`
        regularization: regularization parameter $\lambda$ in the equation
            $ ( A + \lambda \operatorname{I})x = \operatorname{rhs} $
    Returns:
        self
    """
    pass

solve ¶

solve(rhs: Tensor)

Solve the equation $M Z = \operatorname{rhs}$.

PARAMETER	DESCRIPTION
`rhs`	right-hand side of the equation; corresponds to the residual vector (or matrix) in the conjugate gradient method TYPE: `Tensor`

RETURNS	DESCRIPTION
	solution $M^{-1}\operatorname{rhs}$
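To illustrate where a `solve`-style call enters the conjugate gradient method, here is a minimal sketch of pre-conditioned CG for a single right-hand side. The function `preconditioned_cg`, the test matrix and the diagonal pre-conditioner are hypothetical and written for this example; they are not pyDVL's implementation.

```python
import torch

torch.manual_seed(0)


def preconditioned_cg(mat_vec, precond_solve, rhs, max_iter=200, tol=1e-10):
    """Textbook pre-conditioned CG; precond_solve(r) plays the role of M^{-1} r."""
    x = torch.zeros_like(rhs)
    r = rhs - mat_vec(x)           # residual
    z = precond_solve(r)           # pre-conditioned residual
    p = z.clone()                  # search direction
    rz = torch.dot(r, z)
    for k in range(max_iter):
        Ap = mat_vec(p)
        alpha = rz / torch.dot(p, Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if torch.linalg.norm(r) < tol * torch.linalg.norm(rhs):
            return x, k + 1
        z = precond_solve(r)
        rz_new = torch.dot(r, z)
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter


n, lam = 50, 0.1
# Hypothetical SPD system with a badly scaled diagonal.
scales = torch.logspace(0, 3, n, dtype=torch.float64)
B = 0.01 * torch.randn(n, n, dtype=torch.float64)
system = torch.diag(scales) + B @ B.T + lam * torch.eye(n, dtype=torch.float64)
rhs = torch.randn(n, dtype=torch.float64)

m_diag = torch.diagonal(system)    # assumed diagonal pre-conditioner M

x_plain, iters_plain = preconditioned_cg(lambda v: system @ v, lambda r: r, rhs)
x_pre, iters_pre = preconditioned_cg(
    lambda v: system @ v, lambda r: r / m_diag, rhs
)
print(f"iterations without pre-conditioner: {iters_plain}, with: {iters_pre}")
```

The pre-conditioner is applied once per iteration to the current residual, which is why `rhs` is described as the residual vector above.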

Source code in src/pydvl/influence/torch/pre_conditioner.py

def solve(self, rhs: torch.Tensor):
    r"""
    Solve the equation $M@Z = \operatorname{rhs}$
    Args:
        rhs: right hand side of the equation, corresponds to the residuum vector
            (or matrix) in the conjugate gradient method

    Returns:
        solution $M^{-1}\operatorname{rhs}$

    """
    if not self.is_fitted:
        raise NotFittedException(type(self))

    return self._solve(rhs)

to `abstractmethod` ¶

to(device: device) -> PreConditioner

Implement this to move the (potentially fitted) preconditioner to a specific device

Source code in src/pydvl/influence/torch/pre_conditioner.py

@abstractmethod
def to(self, device: torch.device) -> PreConditioner:
    """Implement this to move the (potentially fitted) preconditioner to a
    specific device"""

JacobiPreConditioner ¶

JacobiPreConditioner(num_samples_estimator: int = 1)

Bases: PreConditioner

Pre-conditioner for improving the convergence of CG for systems of the form

\[ ( A + \lambda \operatorname{I})x = \operatorname{rhs} \]

The JacobiPreConditioner uses the diagonal information of the matrix $A$. The diagonal elements are not computed directly but estimated via Hutchinson's estimator.

\[ M = \operatorname{diag}\left( \frac{1}{m} \sum_{i=1}^m u_i \odot Au_i \right) + \lambda \operatorname{I} \]

where $u_i$ are i.i.d. Gaussian random vectors. Works well when the matrix $A + \lambda \operatorname{I}$ is diagonally dominant. For more information, see the documentation of Conjugate Gradient.

PARAMETER	DESCRIPTION
`num_samples_estimator`	number of samples to use in the computation of Hutchinson's estimator TYPE: `int` DEFAULT: `1`
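The estimator can be sketched in a few lines of plain PyTorch. The helper names `hutchinson_diagonal` and `jacobi_solve` and the test matrix are hypothetical; in particular, the element-wise division in `jacobi_solve` is an assumed form of applying $M^{-1}$ for a diagonal $M$, not a copy of pyDVL's internal code.

```python
import torch

torch.manual_seed(0)


def hutchinson_diagonal(mat_mat_prod, size, num_samples, dtype=torch.float64):
    """Estimate diag(A) as the sample mean of u_i * (A u_i)."""
    u = torch.randn(size, num_samples, dtype=dtype)  # columns are the u_i
    return torch.sum(u * mat_mat_prod(u), dim=1) / num_samples


def jacobi_solve(diag_estimate, reg, rhs):
    """Apply (diag + reg * I)^{-1} to a vector or to the columns of a matrix."""
    inv = 1.0 / (diag_estimate + reg)
    return rhs * inv if rhs.ndim == 1 else rhs * inv[:, None]


n, lam = 20, 0.1
# Hypothetical symmetric, diagonally dominant test matrix.
C = 0.05 * torch.randn(n, n, dtype=torch.float64)
A = torch.diag(torch.arange(1, n + 1, dtype=torch.float64)) + 0.5 * (C + C.T)
exact = torch.diagonal(A)

estimate = hutchinson_diagonal(lambda v: A @ v, n, num_samples=20_000)
rel_err = ((estimate - exact).abs() / exact.abs()).max().item()
print(f"max relative error of the diagonal estimate: {rel_err:.4f}")

rhs = torch.randn(n, dtype=torch.float64)
z = jacobi_solve(estimate, lam, rhs)
```

The estimate is unbiased, but its variance shrinks only in proportion to $1/m$, so a small `num_samples_estimator` gives a noisy diagonal.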

Source code in src/pydvl/influence/torch/pre_conditioner.py

def __init__(self, num_samples_estimator: int = 1):
    self.num_samples_estimator = num_samples_estimator

solve ¶

solve(rhs: Tensor)

Solve the equation $M Z = \operatorname{rhs}$.

PARAMETER	DESCRIPTION
`rhs`	right-hand side of the equation; corresponds to the residual vector (or matrix) in the conjugate gradient method TYPE: `Tensor`

RETURNS	DESCRIPTION
	solution $M^{-1}\operatorname{rhs}$

Source code in src/pydvl/influence/torch/pre_conditioner.py

def solve(self, rhs: torch.Tensor):
    r"""
    Solve the equation $M@Z = \operatorname{rhs}$
    Args:
        rhs: right hand side of the equation, corresponds to the residuum vector
            (or matrix) in the conjugate gradient method

    Returns:
        solution $M^{-1}\operatorname{rhs}$

    """
    if not self.is_fitted:
        raise NotFittedException(type(self))

    return self._solve(rhs)

fit ¶

fit(
    mat_mat_prod: Callable[[Tensor], Tensor],
    size: int,
    dtype: dtype,
    device: device,
    regularization: float = 0.0,
)

Fits by computing an estimate of the diagonal of the matrix represented by mat_mat_prod via Hutchinson's estimator

PARAMETER	DESCRIPTION
`mat_mat_prod`	a callable representing the matrix-matrix product TYPE: `Callable[[Tensor], Tensor]`
`size`	size of the square matrix TYPE: `int`
`dtype`	needed data type of inputs for the mat_mat_prod TYPE: `dtype`
`device`	needed device for inputs of mat_mat_prod TYPE: `device`
`regularization`	regularization parameter $\lambda$ in $(A+\lambda I)x=b$ TYPE: `float` DEFAULT: `0.0`

Source code in src/pydvl/influence/torch/pre_conditioner.py

def fit(
    self,
    mat_mat_prod: Callable[[torch.Tensor], torch.Tensor],
    size: int,
    dtype: torch.dtype,
    device: torch.device,
    regularization: float = 0.0,
):
    r"""
    Fits by computing an estimate of the diagonal of the matrix represented by
    `mat_mat_prod` via Hutchinson's estimator

    Args:
        mat_mat_prod: a callable representing the matrix-matrix product
        size: size of the square matrix
        dtype: needed data type of inputs for the mat_mat_prod
        device: needed device for inputs of mat_mat_prod
        regularization: regularization parameter
            $\lambda$ in $(A+\lambda I)x=b$
    """
    random_samples = torch.randn(
        size, self.num_samples_estimator, device=device, dtype=dtype
    )
    diagonal_estimate = torch.sum(
        torch.mul(random_samples, mat_mat_prod(random_samples)), dim=1
    )
    diagonal_estimate /= self.num_samples_estimator
    self._diag = diagonal_estimate
    self._reg = regularization

NystroemPreConditioner ¶

NystroemPreConditioner(rank: int)

Bases: PreConditioner

Pre-conditioner for improving the convergence of CG for systems of the form

\[ (A + \lambda \operatorname{I})x = \operatorname{rhs} \]

The NystroemPreConditioner computes a low-rank approximation

\[ A_{\text{nys}} = (A \Omega)(\Omega^T A \Omega)^{\dagger}(A \Omega)^T = U \Sigma U^T, \]

where $\Omega$ is a random test matrix and $(\cdot)^{\dagger}$ denotes the Moore-Penrose inverse, and uses the matrix

\[ M^{-1} = (\lambda + \sigma_{\text{rank}})U(\Sigma+ \lambda \operatorname{I})^{-1}U^T+(\operatorname{I} - UU^T) \]

for pre-conditioning, where $ \sigma_{\text{rank}} $ is the smallest eigenvalue of the low-rank approximation.
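The two formulas above can be checked numerically with a naive dense sketch. Everything here is an assumption made for illustration: the test matrix, the Gaussian choice of $\Omega$, and the helper `apply_m_inv`. It forms $A_{\text{nys}}$ explicitly, which is not how a practical randomized Nystroem routine would proceed.

```python
import torch

torch.manual_seed(0)

n, rank, lam = 60, 10, 1e-2
dtype = torch.float64

# Hypothetical PSD matrix: `rank` dominant eigenvalues and a tiny tail.
Q, _ = torch.linalg.qr(torch.randn(n, n, dtype=dtype))
eigs = torch.cat([
    torch.logspace(3, 1, rank, dtype=dtype),
    torch.full((n - rank,), 1e-4, dtype=dtype),
])
A = (Q * eigs) @ Q.T

# Dense Nystroem approximation A_nys = (A Om) (Om^T A Om)^+ (A Om)^T.
omega = torch.randn(n, rank, dtype=dtype)
Y = A @ omega
core = omega.T @ Y
A_nys = Y @ torch.linalg.pinv(core, hermitian=True) @ Y.T
A_nys = 0.5 * (A_nys + A_nys.T)  # remove rounding asymmetry

# A_nys has rank at most `rank`, so its top eigenpairs give U and Sigma.
evals, evecs = torch.linalg.eigh(A_nys)  # eigenvalues in ascending order
sigma, U = evals[-rank:], evecs[:, -rank:]
sigma_rank = sigma.min()
scale = (lam + sigma_rank) / (sigma + lam)


def apply_m_inv(rhs):
    """Apply M^{-1} from the formula above without forming an n x n matrix."""
    coeffs = U.T @ rhs
    scaled = scale * coeffs if rhs.ndim == 1 else scale[:, None] * coeffs
    return U @ scaled + (rhs - U @ coeffs)


# Symmetric form M^{-1/2} (A + lam I) M^{-1/2} for the condition number.
eye = torch.eye(n, dtype=dtype)
system = A + lam * eye
m_inv_sqrt = (U * scale.sqrt()) @ U.T + eye - U @ U.T
preconditioned = m_inv_sqrt @ system @ m_inv_sqrt

cond_before = torch.linalg.cond(system).item()
cond_after = torch.linalg.cond(preconditioned).item()
print(f"condition number before: {cond_before:.1f}, after: {cond_after:.1f}")
```

On the subspace spanned by $U$ the pre-conditioner rescales the eigenvalues to roughly $\lambda + \sigma_{\text{rank}}$, while the orthogonal complement is left unchanged, so the remaining spread is governed by $\sigma_{\text{rank}}$ and $\lambda$ rather than by the largest eigenvalue of $A$.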

Source code in src/pydvl/influence/torch/pre_conditioner.py

def __init__(self, rank: int):
    self._rank = rank

solve ¶

solve(rhs: Tensor)

Solve the equation $M Z = \operatorname{rhs}$.

PARAMETER	DESCRIPTION
`rhs`	right-hand side of the equation; corresponds to the residual vector (or matrix) in the conjugate gradient method TYPE: `Tensor`

RETURNS	DESCRIPTION
	solution $M^{-1}\operatorname{rhs}$

Source code in src/pydvl/influence/torch/pre_conditioner.py

def solve(self, rhs: torch.Tensor):
    r"""
    Solve the equation $M@Z = \operatorname{rhs}$
    Args:
        rhs: right hand side of the equation, corresponds to the residuum vector
            (or matrix) in the conjugate gradient method

    Returns:
        solution $M^{-1}\operatorname{rhs}$

    """
    if not self.is_fitted:
        raise NotFittedException(type(self))

    return self._solve(rhs)

fit ¶

fit(
    mat_mat_prod: Callable[[Tensor], Tensor],
    size: int,
    dtype: dtype,
    device: device,
    regularization: float = 0.0,
)

Fits by computing a low-rank approximation of the matrix represented by mat_mat_prod via Nystroem approximation

PARAMETER	DESCRIPTION
`mat_mat_prod`	a callable representing the matrix-matrix product TYPE: `Callable[[Tensor], Tensor]`
`size`	size of the square matrix TYPE: `int`
`dtype`	needed data type of inputs for the mat_mat_prod TYPE: `dtype`
`device`	needed device for inputs of mat_mat_prod TYPE: `device`
`regularization`	regularization parameter $\lambda$ in $(A+\lambda I)x=b$ TYPE: `float` DEFAULT: `0.0`

Source code in src/pydvl/influence/torch/pre_conditioner.py

def fit(
    self,
    mat_mat_prod: Callable[[torch.Tensor], torch.Tensor],
    size: int,
    dtype: torch.dtype,
    device: torch.device,
    regularization: float = 0.0,
):
    r"""
    Fits by computing a low-rank approximation of the matrix represented by
    `mat_mat_prod` via Nystroem approximation

    Args:
        mat_mat_prod: a callable representing the matrix-matrix product
        size: size of the square matrix
        dtype: needed data type of inputs for the mat_mat_prod
        device: needed device for inputs of mat_mat_prod
        regularization: regularization parameter
            $\lambda$  in $(A+\lambda I)x=b$
    """

    self._low_rank_approx = randomized_nystroem_approximation(
        mat_mat_prod, size, self._rank, dtype, mat_vec_device=device
    )
    self._regularization = regularization
