
pydvl.valuation.methods.loo

This module implements Leave-One-Out (LOO) valuation.

The LOO value of an index \(i\) is defined as:

\[ v_\text{loo}(i) = u(N) - u(N_{-i}), \]

where \(u\) is the utility function, \(N\) is the set of all indices, \(i\) is the index of interest, and \(N_{-i} = N \setminus \{i\}\) is the set of all indices except \(i\).
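
As an illustration of the definition, the following minimal sketch computes LOO values directly from the formula, without using pyDVL. The names `loo_values` and `toy_utility` are hypothetical and exist only for this example; the utility is a made-up function over index sets, not a model score.

```python
from typing import Callable, Dict, FrozenSet, Iterable


def loo_values(
    indices: Iterable[int], utility: Callable[[FrozenSet[int]], float]
) -> Dict[int, float]:
    """Compute v_loo(i) = u(N) - u(N_{-i}) for every index i in N."""
    full = frozenset(indices)
    u_full = utility(full)  # u(N) is evaluated only once
    return {i: u_full - utility(full - {i}) for i in sorted(full)}


def toy_utility(subset: FrozenSet[int]) -> float:
    """Hypothetical utility: additive weights plus a bonus when 0 and 1 co-occur."""
    weights = {0: 1.0, 1: 2.0, 2: 4.0}
    bonus = 0.5 if {0, 1} <= subset else 0.0
    return sum(weights[i] for i in subset) + bonus


values = loo_values(range(3), toy_utility)
print(values)  # {0: 1.5, 1: 2.5, 2: 4.0}
```

Indices 0 and 1 each receive the full interaction bonus on top of their own weight, because removing either one destroys it. Computing all values requires one utility evaluation for \(N\) plus one per index.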

Changing LOO

LOOValuation is preconfigured to stop once every index has been visited exactly once. Specifically, it uses a LOOSampler with a FiniteSequentialIndexIteration, together with the stopping criterion MinUpdates(n_updates=1). To change this behaviour, the easiest way is to subclass LOOValuation and override the constructor.

LOOValuation

LOOValuation(utility: UtilityBase, progress: bool = False)

Bases: SemivalueValuation

Computes LOO values for a dataset.

Source code in src/pydvl/valuation/methods/loo.py
def __init__(self, utility: UtilityBase, progress: bool = False):
    self._result: ValuationResult | None = None
    super().__init__(
        utility,
        LOOSampler(batch_size=1, index_iteration=FiniteSequentialIndexIteration),
        # LOO is done when every index has been updated once
        MinUpdates(n_updates=1),
        progress=progress,
    )

log_coefficient property

log_coefficient: SemivalueCoefficient | None

Disable importance sampling for this method since we have a fixed sampler that already provides the correct weights for the Monte Carlo approximation.

result property

The current valuation result (not a copy).

fit

fit(data: Dataset, continue_from: ValuationResult | None = None) -> Self

Fits the semi-value valuation to the data.

Access the results through the result property.

PARAMETER        DESCRIPTION
data             Data for which to compute values.
                 TYPE: Dataset
continue_from    A previously computed valuation result to continue from.
                 TYPE: ValuationResult | None DEFAULT: None

Source code in src/pydvl/valuation/methods/semivalue.py
@suppress_warnings(flag="show_warnings")
def fit(self, data: Dataset, continue_from: ValuationResult | None = None) -> Self:
    """Fits the semi-value valuation to the data.

    Access the results through the `result` property.

    Args:
        data: Data for which to compute values
        continue_from: A previously computed valuation result to continue from.

    """
    self._result = self._init_or_check_result(data, continue_from)
    ensure_backend_has_generator_return()

    self.is_done.reset()
    self.utility = self.utility.with_dataset(data)

    strategy = self.sampler.make_strategy(self.utility, self.log_coefficient)
    updater = self.sampler.result_updater(self._result)
    processor = delayed(strategy.process)

    with Parallel(return_as="generator_unordered") as parallel:
        with make_parallel_flag() as flag:
            delayed_evals = parallel(
                processor(batch=list(batch), is_interrupted=flag)
                for batch in self.sampler.generate_batches(data.indices)
            )
            for batch in Progress(delayed_evals, self.is_done, **self.tqdm_args):
                for update in batch:
                    self._result = updater.process(update)
                if self.is_done(self._result):
                    flag.set()
                    self.sampler.interrupt()
                    break
                if self.skip_converged:
                    self.sampler.skip_indices = data.indices[self.is_done.converged]
    logger.debug(f"Fitting done after {updater.n_updates} value updates.")
    return self
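
The control flow of fit can be easier to follow without the parallel machinery. The sketch below is a simplified, sequential stand-in and not pyDVL's implementation: `sequential_fit` and `make_processor` are hypothetical names, batches have size 1 (as with the LOOSampler configured above), the running-mean update is an assumption about how repeated updates would be aggregated, and the stopping rule mimics MinUpdates by requiring every index to have been updated at least `min_updates` times.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

Update = Tuple[int, float]  # (index, marginal contribution)


def make_processor(
    utility: Callable[[FrozenSet[int]], float], all_indices: Sequence[int]
) -> Callable[[List[int]], List[Update]]:
    """Hypothetical stand-in for the evaluation strategy's batch processing."""
    full = frozenset(all_indices)
    u_full = utility(full)

    def process(batch: List[int]) -> List[Update]:
        return [(i, u_full - utility(full - {i})) for i in batch]

    return process


def sequential_fit(
    indices: Sequence[int],
    process: Callable[[List[int]], List[Update]],
    min_updates: int = 1,
) -> Tuple[Dict[int, float], Dict[int, int]]:
    values = {i: 0.0 for i in indices}
    counts = {i: 0 for i in indices}

    for i in indices:  # finite sequential iteration, one index per batch
        for idx, marginal in process([i]):
            counts[idx] += 1
            # Running mean over the updates received for this index
            values[idx] += (marginal - values[idx]) / counts[idx]
        # Stop as soon as every index has enough updates
        if all(c >= min_updates for c in counts.values()):
            break
    return values, counts


indices = [0, 1, 2]
# Toy utility (an assumption for this example): the sum of the indices in the subset
processor = make_processor(lambda s: float(sum(s)), indices)
fit_values, fit_counts = sequential_fit(indices, processor)
print(fit_values)  # {0: 0.0, 1: 1.0, 2: 2.0}
```

With `min_updates=1` the loop ends exactly when the last index has been processed, which is why a single pass over the indices suffices for LOO.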

values

values(sort: bool = False) -> ValuationResult

Returns a copy of the valuation result.

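The valuation must have been run with fit() before calling this method; otherwise a NotFittedException is raised.

This method is deprecated since version 0.10.0 and scheduled for removal in 0.11.0. The result property gives direct access to the current result without copying.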

PARAMETER        DESCRIPTION
sort             Whether to sort the valuation result by value before returning it.
                 TYPE: bool DEFAULT: False

Returns: The result of the valuation.

Source code in src/pydvl/valuation/base.py
@deprecated(
    target=None,
    deprecated_in="0.10.0",
    remove_in="0.11.0",
)
def values(self, sort: bool = False) -> ValuationResult:
    """Returns a copy of the valuation result.

    The valuation must have been run with `fit()` before calling this method.

    Args:
        sort: Whether to sort the valuation result by value before returning it.
    Returns:
        The result of the valuation.
    """
    if not self.is_fitted:
        raise NotFittedException(type(self))
    assert self._result is not None

    r = self._result.copy()
    if sort:
        r.sort(inplace=True)
    return r