pydvl.valuation.methods.delta_shapley

This module implements the \(\delta\)-Shapley valuation method.

The value of a point \(i\) is defined as:

\[ v_\delta(i) = \sum_{k=l}^u w(k) \sum_{S \subset D_{-i}^{(k)}} [U(S_{+i}) - U(S)], \]

where \(l\) and \(u\) are the lower and upper bounds on the size of the subsets to sample from, \(D_{-i}^{(k)}\) denotes the subsets of size \(k\) in the complement of \(\{i\}\), \(S_{+i} = S \cup \{i\}\), and \(w(k)\) is the weight given to subsets of size \(k\):

\[ w(k) = \begin{cases} \frac{1}{u - l + 1} & \text{if } l \leq k \leq u, \\ 0 & \text{otherwise.} \end{cases} \]
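
To make the definition concrete, the following is a minimal brute-force sketch that evaluates the formula literally on a tiny dataset. It does not use pyDVL: the function `delta_shapley_exact` and the additive `toy_utility` are hypothetical names introduced only for illustration, and enumerating all subsets is feasible only for a handful of points.

```python
from itertools import combinations


def delta_shapley_exact(utility, data, i, lower, upper):
    """Evaluate the delta-Shapley formula literally for point ``i``.

    ``utility`` maps a frozenset of points to a float. Every subset of each
    size in [lower, upper] is enumerated, so this only works for tiny data.
    """
    rest = [x for x in data if x != i]  # complement of {i}
    weight = 1.0 / (upper - lower + 1)  # w(k) for lower <= k <= upper
    total = 0.0
    for k in range(lower, upper + 1):
        for subset in combinations(rest, k):
            s = frozenset(subset)
            # Marginal contribution U(S + i) - U(S)
            total += weight * (utility(s | {i}) - utility(s))
    return total


def toy_utility(s):
    """Hypothetical additive utility: the sum of the points in the subset."""
    return float(sum(s))


data = [1, 2, 3, 4]
value = delta_shapley_exact(toy_utility, data, i=3, lower=1, upper=2)
print(value)
```

With an additive utility every marginal contribution of point 3 equals 3, so the result is just 3 times the weighted number of subsets considered (three of size 1 and three of size 2, each weighted by 1/2).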

DeltaShapleyValuation

DeltaShapleyValuation(
    utility: UtilityBase,
    is_done: StoppingCriterion,
    lower_bound: int,
    upper_bound: int,
    seed: Seed | None = None,
    skip_converged: bool = False,
    progress: bool = False,
)

Bases: SemivalueValuation

Computes \(\delta\)-Shapley values.

\(\delta\)-Shapley does not accept custom samplers. Instead, it uses a StratifiedSampler with a lower and upper bound on the size of the sets to sample from.
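
The idea of sampling subsets whose size is restricted to a range can be sketched in plain Python. This is only an illustration under stated assumptions, not pyDVL's StratifiedSampler: `delta_shapley_mc` is a hypothetical helper that draws a size uniformly from \([l, u]\), then a uniformly random subset of that size, and averages the marginal contributions. It therefore estimates the mean marginal contribution per sampled subset, without the per-size subset counts that appear in the sum of the module-level formula.

```python
import random


def delta_shapley_mc(utility, data, i, lower, upper, n_samples, seed=None):
    """Monte Carlo average of marginal contributions of point ``i``.

    Subset sizes are drawn uniformly from [lower, upper] (inclusive), and
    each subset is drawn uniformly among subsets of that size.
    Requires upper <= len(data) - 1.
    """
    rng = random.Random(seed)
    rest = [x for x in data if x != i]  # complement of {i}
    total = 0.0
    for _ in range(n_samples):
        k = rng.randint(lower, upper)  # both endpoints included
        s = frozenset(rng.sample(rest, k))
        total += utility(s | {i}) - utility(s)
    return total / n_samples


def toy_utility(s):
    """Hypothetical additive utility: the sum of the points in the subset."""
    return float(sum(s))


estimate = delta_shapley_mc(
    toy_utility, [1, 2, 3, 4], i=3, lower=1, upper=2, n_samples=200, seed=0
)
print(estimate)
```

Because the toy utility is additive, every sampled marginal contribution equals 3 and the estimate has no variance; with a real model-based utility the estimate would fluctuate with the seed and the number of samples.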

PARAMETER DESCRIPTION
utility

Object to compute utilities.

TYPE: UtilityBase

is_done

Stopping criterion to use.

TYPE: StoppingCriterion

lower_bound

The lower bound of the size of the subsets to sample from.

TYPE: int

upper_bound

The upper bound of the size of the subsets to sample from.

TYPE: int

seed

The seed for the random number generator used by the sampler.

TYPE: Seed | None DEFAULT: None

progress

Whether to show a progress bar. If a dictionary is given, it is passed to tqdm as keyword arguments and the progress bar is displayed.

TYPE: bool DEFAULT: False

skip_converged

Whether to skip converged indices, as determined by the stopping criterion's converged array.

TYPE: bool DEFAULT: False

Source code in src/pydvl/valuation/methods/delta_shapley.py
def __init__(
    self,
    utility: UtilityBase,
    is_done: StoppingCriterion,
    lower_bound: int,
    upper_bound: int,
    seed: Seed | None = None,
    skip_converged: bool = False,
    progress: bool = False,
):
    sampler = StratifiedSampler(
        sample_sizes=ConstantSampleSize(
            1, lower_bound=lower_bound, upper_bound=upper_bound
        ),
        sample_sizes_iteration=RandomSizeIteration,
        index_iteration=RandomIndexIteration,
        seed=seed,
    )
    self.lower_bound = lower_bound
    self.upper_bound = upper_bound
    super().__init__(
        utility, sampler, is_done, progress=progress, skip_converged=skip_converged
    )

values

values(sort: bool = False) -> ValuationResult

Returns a copy of the valuation result.

The valuation must have been run with fit() before calling this method.

PARAMETER DESCRIPTION
sort

Whether to sort the valuation result by value before returning it.

TYPE: bool DEFAULT: False

Returns: The result of the valuation.

Source code in src/pydvl/valuation/base.py
def values(self, sort: bool = False) -> ValuationResult:
    """Returns a copy of the valuation result.

    The valuation must have been run with `fit()` before calling this method.

    Args:
        sort: Whether to sort the valuation result by value before returning it.
    Returns:
        The result of the valuation.
    """
    if not self.is_fitted:
        raise NotFittedException(type(self))
    assert self.result is not None

    from copy import copy

    r = copy(self.result)
    if sort:
        r.sort()
    return r
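
The guard-then-copy behavior of `values()` can be sketched with stand-in classes. Everything here is hypothetical and independent of pyDVL: `ToyValuation`, `ToyResult`, and `NotFittedError` only mimic the control flow shown in the source above. In particular, `ToyResult` defines `__copy__` so that the copy owns its own data; whether `ValuationResult` behaves the same way is an assumption not confirmed by the listing.

```python
from copy import copy


class NotFittedError(RuntimeError):
    """Hypothetical stand-in for the library's NotFittedException."""


class ToyResult:
    """Hypothetical result container holding a list of values."""

    def __init__(self, values):
        self.values = list(values)

    def sort(self):
        self.values.sort()

    def __copy__(self):
        # Assumption: the copy gets its own list, so sorting it
        # leaves the stored result untouched.
        return ToyResult(self.values)


class ToyValuation:
    def __init__(self):
        self.result = None

    @property
    def is_fitted(self):
        return self.result is not None

    def fit(self, data):
        self.result = ToyResult(data)
        return self

    def values(self, sort=False):
        if not self.is_fitted:
            raise NotFittedError(type(self).__name__)
        r = copy(self.result)  # callers never receive the stored object
        if sort:
            r.sort()
        return r


valuation = ToyValuation()
try:
    valuation.values()
    raised = False
except NotFittedError:
    raised = True  # values() before fit() is rejected

valuation.fit([3, 1, 2])
sorted_values = valuation.values(sort=True).values
stored_values = valuation.result.values
```

Here `sorted_values` is sorted while `stored_values` keeps the original order, which is the point of returning a copy rather than the stored result.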