Skip to content

pydvl.valuation.stopping

This module provides a basic set of criteria to stop valuation algorithms, in particular all semi-values. Common examples are MinUpdates, MaxTime, or HistoryDeviation.

Stopping criteria can behave in different ways depending on the context. For example, MaxUpdates limits the number of updates to values, which depending on the algorithm may mean a different number of utility evaluations or imply other computations like solving a linear or quadratic program. In the case of SemivalueValuation, the criteria are evaluated once per batch, which might lead to different behavior depending on the batch size (e.g. for certain batch sizes it might happen that the number of updates to values after convergence is not exactly what was required, since multiple updates might happen at once).

Stopping criteria are callables that are evaluated on a ValuationResult and return a Status object. They can be combined using boolean operators.

Saving a history of values

The special stopping criterion History can be used to store a rolling history of the values, e.g. for comparing methods as they evolve.

from pydvl.valuation import ShapleyValuation, History
history = History(n_steps=1000)
stopping = MaxUpdates(10000) | history
valuation = ShapleyValuation(utility=utility, sampler=sampler, is_done=stopping)
valuation.fit(training_data)
history.data[0]  # The last update
history.data[-1]  # The 1000th update before last

Combining stopping criteria

Objects of type StoppingCriterion can be combined with the binary operators & (and), and | (or), following the truth tables of Status. The unary operator ~ (not) is also supported. See StoppingCriterion for details on how these operations affect the behavior of the stopping criteria.

How convergence is determined

Most stopping criteria keep track of the convergence of each index separately but make global decisions based on the overall convergence of some fraction of all indices. For example, if we have a stopping criterion that checks whether the standard error of 90% of values is below a threshold, then methods will keep updating all indices until 90% of them have converged, irrespective of the quality of the individual estimates, and without freezing updates for indices along the way as values individually attain low standard error.

This has some practical implications, because some values do tend to converge sooner than others. For example, assume we use the criterion AbsoluteStandardError(0.02) | MaxUpdates(1000). Then values close to 0 might be marked as "converged" rather quickly because they fulfill the first criterion, say after 20 iterations, despite being poor estimates (see the section on pitfalls below for commentary on stopping criteria based on standard errors). Because other indices take longer to reach a low standard error and the criterion is a global check, the "converged" ones keep being updated and end up being good estimates. In this case, this has been beneficial, but one might not wish for converged values to be updated, if one is sure that the criterion is adequate for individual values.

Semi-value methods include a parameter skip_converged that allows to skip the computation of values that have converged. The way to avoid doing this too early is to use a more stringent check, e.g. AbsoluteStandardError(1e-3) | MaxUpdates(1000). With skip_converged=True this check can still take less time than the first one, despite requiring more iterations for some indices.

Stopping criterion for finite samplers

Using a finite sampler naturally defines when the valuation algorithm terminates. However, in order to properly report progress, we need to use a stopping criterion that keeps track of the number of iterations. In this case, one can use NoStopping with the sampler as an argument. This quirk is due to progress reported depending on the completion attribute of a criterion. Here's how it's done:

from pydvl.valuation import ShapleyValuation, NoStopping

sampler = DeterministicUniformSampler()
stopping = NoStopping(sampler)
valuation = ShapleyValuation(
    utility=utility, sampler=sampler, is_done=stopping, progress=True
)
with parallel_config(n_jobs=4):
    valuation.fit(data)

Choosing a stopping criterion

The choice of a stopping criterion greatly depends on the algorithm and the context. A safe bet is to combine a MaxUpdates or a MaxTime with a HistoryDeviation or an RankCorrelation. The former will ensure that the computation does not run for too long, while the latter will try to achieve results that are stable enough. Note however that if the threshold is too strict, one will always end up running until a maximum number of iterations or time. Also keep in mind that different values converge at different times, so you might want to use tight thresholds and skip_converged as described above for semi-values.

Example

from pydvl.valuation import BanzhafValuation, MinUpdates, MSRSampler, RankCorrelation

model = ... # Some sklearn-compatible model
scorer = SupervisedScorer("accuracy", test_data, default=0.0)
utility = ModelUtility(model, scorer)
sampler = MSRSampler(seed=seed)
stopping = RankCorrelation(rtol=1e-2, burn_in=32) | MinUpdates(1000)
valuation = BanzhafValuation(utility=utility, sampler=sampler, is_done=stopping)
with parallel_config(n_jobs=4):
    valuation.fit(training_data)
result = valuation.result
This will compute the Banzhaf semivalues for utility until either the change in Spearman rank correlation between updates is below 1e-2 or 1000 updates have been performed. The burn_in parameter is used to discard the first 32 updates from the computation of the standard error.

Warning

Be careful not to reuse the same stopping criterion for different computations. The object has state, which is reset by fit() for some valuation methods, but this is not guaranteed for all methods. If you need to reuse the same criterion, it's safer to create a new instance.

Interactions with sampling schemes and other pitfalls of stopping criteria

When sampling over powersets with a sequential index iteration, indices' values are updated sequentially, as expected. Now, if the number of samples per index is high, it might be a long while until the next index is updated. In this case, criteria like MinUpdates will seem to stall after each index has reached the specified number of updates, even though the computation is still ongoing. A "fix" is to set the skip_converged parameter of Semi-value methods to True, so that as soon as the stopping criterion is fulfilled for an index, the computation continues. Note that this will probably break any desirable properties of certain samplers, for instance the StratifiedSampler.

Problem with 'skip_converged'

Alas, the above will fail under some circumstances, until we fix this bug

Different samplers define different "update strategies" for values. For example, MSRSampler updates the counts field of a ValuationResult only for about half of the utility evaluations, because it reuses samples. This means that a stopping criterion like MaxChecks will not work as expected, because it will count the number of calls to the criterion, not the number of updates to the values. In this case, one should use MaxUpdates or, more likely, MinUpdates instead.

Finally, stopping criteria that rely on the standard error of the values, like AbsoluteStandardError, should be used with care. The standard error is a measure of the uncertainty of the estimate, but it does not guarantee that the estimate is close to the true value. For example, if the utility function is very noisy, the standard error might be very low, but the estimate might be far from the true value. In this case, one might want to use a RankCorrelation instead, which checks whether the rank of the values is stable.

Creating stopping criteria

In order to create a new stopping criterion, one can subclass StoppingCriterion and implement the _check method. This method should return a Status and update the _converged attribute, which is a boolean array indicating whether the value for each index has converged. When this does not make sense for a particular stopping criterion, completion should be overridden to provide an overall completion value, since its default implementation attempts to compute the mean of _converged.

References


  1. Ghorbani, A., Zou, J., 2019. Data Shapley: Equitable Valuation of Data for Machine Learning. In: Proceedings of the 36th International Conference on Machine Learning, PMLR, pp. 2242–2251. 

AbsoluteStandardError

AbsoluteStandardError(
    threshold: float,
    fraction: float = 1.0,
    burn_in: int = 4,
    modify_result: bool = True,
)

Bases: StoppingCriterion

Determine convergence based on the standard error of the values.

If \(s_i\) is the standard error for datum \(i\), then this criterion returns Converged if \(s_i < \epsilon\) for all \(i\) and a threshold value \(\epsilon \gt 0\).

Warning

This criterion should be used with care. The standard error is a measure of the uncertainty of the estimate, but it does not guarantee that the estimate is close to the true value. For example, if the utility function is very noisy, the standard error might be very low, but the estimate might be far from the true value. In this case, one might want to use a RankCorrelation instead, which checks whether the rank of the values is stable.

PARAMETER DESCRIPTION
threshold

A value is considered to have converged if the standard error is below this threshold. A way of choosing it is to pick some percentage of the range of the values. For Shapley values this is the difference between the maximum and minimum of the utility function (to see this substitute the maximum and minimum values of the utility into the marginal contribution formula).

TYPE: float

fraction

The fraction of values that must have converged for the criterion to return Converged.

TYPE: float DEFAULT: 1.0

burn_in

The number of iterations to ignore before checking for convergence. This is required because computations typically start with zero variance, as a result of using zeros(). The default is set to an arbitrary minimum which is usually enough but may need to be increased.

TYPE: int DEFAULT: 4

modify_result

If True the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(
    self,
    threshold: float,
    fraction: float = 1.0,
    burn_in: int = 4,
    modify_result: bool = True,
):
    super().__init__(modify_result=modify_result)
    self.threshold = threshold
    self.fraction = fraction
    self.burn_in = burn_in

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for

NDArray[bool_]

each data point.

count property

count: int

The number of times that the criterion has been checked.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

completion

completion() -> float

Returns a value between 0 and 1 indicating the completion of the computation.

Source code in src/pydvl/valuation/stopping.py
def completion(self) -> float:
    """Returns a value between 0 and 1 indicating the completion of the
    computation."""
    if self.converged.size == 0:
        return 0.0
    return float(np.mean(self.converged).item())

History

History(n_steps: int, skip_steps: int = 0, modify_result: bool = False)

Bases: StoppingCriterion

A dummy stopping criterion that stores the last steps values in a rolling memory.

You can access the values via indexing with the [] operator. Indices should always be negative, with the most recent value being -1, the second most recent being -2, and so on.

Typical usage
stopping = MaxSamples(5000) | (history := History(5000))
valuation = TMCShapleyValuation(utility, sampler, is_done=stopping)
with parallel_config(n_jobs=-1):
    valuation.fit(training_data)
# history[-10:]  # contains the last 10 steps
Comparing histories across valuation methods

Care must be taken when comparing the histories saved while fitting different methods. The rate at which stopping criteria are checked, and hence a History is updated is not guaranteed to be the same, due to differences in sampling and batching. For instance, a deterministic powerset sampler with a FiniteSequentialIndexIteration with a batch_size=2**(n-1) will update the history exactly n times (if given enough iterations by other stopping criteria), where n is the number of indices. If instead the batch size is 1, the history will be updated 2**n times, but all the values except for one index will remain constant during 2**(n-1) iterations. Comparing any of these to, say, a PermutationSampler which results in one update to the history per permutation, requires setting the parameter skip_steps adequately in the respective History objects.

Example
result = ValuationResult.from_random(size=7)
history = History(n_steps=5)
history(result)
assert all(history[-1] == result)

Args: n_steps: The number of steps to remember. skip_steps: The number of steps to skip between updates. If 0, the memory is updated at every check of the criterion. If 1, the memory is updated every other step, and so on. This is useful to synchronize the memory of methods with different update rates. modify_result: Ignored.

Source code in src/pydvl/valuation/stopping.py
def __init__(self, n_steps: int, skip_steps: int = 0, modify_result: bool = False):
    super().__init__(modify_result=False)
    self.memory = RollingMemory(
        size=n_steps, skip_steps=skip_steps, default=np.inf, dtype=np.float64
    )

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for

NDArray[bool_]

each data point.

count property

count: int

The number of times that the criterion has been checked.

data property

A view on the data. Rows are the steps, columns are the indices

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

__len__

__len__() -> int

The number of steps that the memory has saved. This is guaranteed to be between 0 and n_steps, inclusive.

Source code in src/pydvl/valuation/stopping.py
def __len__(self) -> int:
    """The number of steps that the memory has saved. This is guaranteed to be
    between 0 and `n_steps`, inclusive."""
    return len(self.memory)

HistoryDeviation

HistoryDeviation(
    n_steps: int,
    rtol: float,
    pin_converged: bool = True,
    modify_result: bool = True,
)

Bases: StoppingCriterion

A simple check for relative distance to a previous step in the computation.

The method used by Ghorbani and Zou, (2019)1 computes the relative distances between the current values \(v_i^t\) and the values at the previous checkpoint \(v_i^{t-\tau}\). If the sum is below a given threshold, the computation is terminated.

\[\sum_{i=1}^n \frac{\left| v_i^t - v_i^{t-\tau} \right|}{v_i^t} < \epsilon.\]

When the denominator is zero, the summand is set to the value of \(v_i^{ t-\tau}\).

This implementation is slightly generalised to allow for different number of updates to individual indices, as happens with powerset samplers instead of permutations. Every subset of indices that is found to converge can be pinned to that state. Once all indices have converged the method has converged.

Warning

This criterion is meant for the reproduction of the results in the paper, but we do not recommend using it in practice.

PARAMETER DESCRIPTION
n_steps

Compare values after so many steps. A step is one evaluation of the criterion, which happens once per batch.

TYPE: int

rtol

Relative tolerance for convergence (\(\epsilon\) in the formula).

TYPE: float

pin_converged

If True, once an index has converged, it is pinned

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(
    self,
    n_steps: int,
    rtol: float,
    pin_converged: bool = True,
    modify_result: bool = True,
):
    super().__init__(modify_result=modify_result)
    self.memory = RollingMemory(n_steps + 1, default=np.inf, dtype=np.float64)
    self.rtol = validate_number("rtol", rtol, float, lower=0.0, upper=1.0)
    self.update_op = np.logical_or if pin_converged else np.logical_and

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for

NDArray[bool_]

each data point.

count property

count: int

The number of times that the criterion has been checked.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

completion

completion() -> float

Returns a value between 0 and 1 indicating the completion of the computation.

Source code in src/pydvl/valuation/stopping.py
def completion(self) -> float:
    """Returns a value between 0 and 1 indicating the completion of the
    computation."""
    if self.converged.size == 0:
        return 0.0
    return float(np.mean(self.converged).item())

MaxChecks

MaxChecks(n_checks: int | None, modify_result: bool = True)

Bases: StoppingCriterion

Terminate as soon as the number of checks exceeds the threshold.

A "check" is one call to the criterion. Note that this might have different interpretations depending on the sampler. For example, MSRSampler performs a single utility evaluation to update all indices, so that's len(training_data) checks for a single training of the model. But it also only changes the counts field of the ValuationResult for about half of the indices, which is what e.g. MaxUpdates checks.

PARAMETER DESCRIPTION
n_checks

Threshold: if None, no check is performed, effectively creating a (never) stopping criterion that always returns Pending.

TYPE: int | None

modify_result

If True the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(self, n_checks: int | None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    if n_checks is not None:
        n_checks = validate_number("n_checks", n_checks, int, lower=1)
    self.n_checks = n_checks

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for

NDArray[bool_]

each data point.

count property

count: int

The number of times that the criterion has been checked.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

MaxSamples

MaxSamples(sampler: IndexSampler, n_samples: int, modify_result: bool = True)

Bases: StoppingCriterion

Run until the sampler has sampled the given number of samples.

Warning

If the sampler is batched, and the valuation method runs in parallel, the check might be off by the sampler's batch size.

PARAMETER DESCRIPTION
sampler

The sampler to check.

TYPE: IndexSampler

n_samples

The number of samples to run until.

TYPE: int

modify_result

If True the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(
    self, sampler: IndexSampler, n_samples: int, modify_result: bool = True
):
    super().__init__(modify_result=modify_result)
    self.sampler = sampler
    self.n_samples = validate_number("n_samples", n_samples, int, lower=1)
    self._completion = 0.0

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for

NDArray[bool_]

each data point.

count property

count: int

The number of times that the criterion has been checked.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

MaxTime

MaxTime(seconds: float | None, modify_result: bool = True)

Bases: StoppingCriterion

Terminate if the computation time exceeds the given number of seconds.

Checks the elapsed time since construction.

PARAMETER DESCRIPTION
seconds

Threshold: The computation is terminated if the elapsed time between object construction and a _check exceeds this value. If None, no _check is performed, effectively creating a (never) stopping criterion that always returns Pending.

TYPE: float | None

modify_result

If True the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(self, seconds: float | None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    if seconds is None:
        seconds = np.inf
    self.max_seconds = validate_number("seconds", seconds, float, lower=1e-6)
    self.start = time()

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for

NDArray[bool_]

each data point.

count property

count: int

The number of times that the criterion has been checked.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

MaxUpdates

MaxUpdates(n_updates: int | None, modify_result: bool = True)

Bases: StoppingCriterion

Terminate if any number of value updates exceeds or equals the given threshold.

Note

If you want to ensure that all values have been updated, you probably want MinUpdates instead.

This checks the counts field of a ValuationResult, i.e. the number of times that each index has been updated. For powerset samplers, the maximum of this number coincides with the maximum number of subsets sampled. For permutation samplers, it coincides with the number of permutations sampled.

PARAMETER DESCRIPTION
n_updates

Threshold: if None, no check is performed, effectively creating a (never) stopping criterion that always returns Pending.

TYPE: int | None

modify_result

If True the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(self, n_updates: int | None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    if n_updates is not None:
        n_updates = validate_number("n_updates", n_updates, int, lower=1)
    self.n_updates = n_updates
    self.last_max = 0

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for

NDArray[bool_]

each data point.

count property

count: int

The number of times that the criterion has been checked.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

MinUpdates

MinUpdates(n_updates: int | None, modify_result: bool = True)

Bases: StoppingCriterion

Terminate as soon as all value updates exceed or equal the given threshold.

This checks the counts field of a ValuationResult, i.e. the number of times that each index has been updated. For powerset samplers, the minimum of this number is a lower bound for the number of subsets sampled. For permutation samplers, it lower-bounds the amount of permutations sampled.

PARAMETER DESCRIPTION
n_updates

Threshold: if None, no _check is performed, effectively creating a (never) stopping criterion that always returns Pending.

TYPE: int | None

modify_result

If True the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(self, n_updates: int | None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    if n_updates is not None:
        n_updates = validate_number("n_updates", n_updates, int, lower=1)
    self.n_updates = n_updates
    self.last_min = 0
    self._actual_completion = 0.0

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for

NDArray[bool_]

each data point.

count property

count: int

The number of times that the criterion has been checked.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

NoStopping

NoStopping(sampler: IndexSampler | None = None, modify_result: bool = True)

Bases: StoppingCriterion

Keep running forever or until sampling stops.

If a sampler instance is passed, and it is a finite sampler, its counter will be used to update completion status.

PARAMETER DESCRIPTION
sampler

A sampler instance to use for completion status.

TYPE: IndexSampler | None DEFAULT: None

modify_result

If True the status of the input ValuationResult is modified in place after the call

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(self, sampler: IndexSampler | None = None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    self.sampler = sampler

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for

NDArray[bool_]

each data point.

count property

count: int

The number of times that the criterion has been checked.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

RankCorrelation

RankCorrelation(
    rtol: float, burn_in: int, fraction: float = 1.0, modify_result: bool = True
)

Bases: StoppingCriterion

A check for stability of Spearman correlation between checks.

Convergence is reached when the change in rank correlation between two successive iterations is below a given threshold.

This criterion is used in (Wang et al.)2.

The meaning of successive iterations

Stopping criteria in pyDVL are typically evaluated after each batch of value updates is received. This can imply very different things, depending on the configuration of the samplers. For this reason, RankCorrelation keeps itself track of the number of updates that each index has seen, and only checks for correlation changes when a given fraction of all indices has been updated more than burn_in times and once since last time the criterion was checked.

PARAMETER DESCRIPTION
rtol

Relative tolerance for convergence (\(\epsilon\) in the formula)

TYPE: float

burn_in

The minimum number of updates an index must have seen before checking for convergence. This is required because the first correlation checks are usually meaningless.

TYPE: int

fraction

The fraction of values that must have been updated between two correlation checks. This is to avoid comparing two results where only one value has been updated, which would have almost perfect rank correlation.

TYPE: float DEFAULT: 1.0

modify_result

If True, the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Added in 0.9.0

Changed in 0.10.0

The behaviour of the burn_in parameter was changed to look at value updates. The parameter fraction was added.

Source code in src/pydvl/valuation/stopping.py
def __init__(
    self,
    rtol: float,
    burn_in: int,
    fraction: float = 1.0,
    modify_result: bool = True,
):
    super().__init__(modify_result=modify_result)

    self.rtol = validate_number("rtol", rtol, float, lower=0.0, upper=1.0)
    self.burn_in = burn_in
    self.fraction = validate_number(
        "fraction", fraction, float, lower=0.0, upper=1.0
    )
    self.memory = RollingMemory(size=2, default=np.nan, dtype=np.float64)
    self.count_memory = RollingMemory(size=2, default=0, dtype=np.int_)
    self._corr = np.nan
    self._completion = 0.0

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for

NDArray[bool_]

each data point.

count property

count: int

The number of times that the criterion has been checked.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

RollingMemory

RollingMemory(
    size: int,
    skip_steps: int = 0,
    default: Union[DT, int, float] = inf,
    *,
    dtype: Type[DT] | None = None,
)

Bases: Generic[DT]

A simple rolling memory for the last size values of each index.

Updating the memory results in new values being copied to the last column of the matrix and old ones removed from the first.

PARAMETER DESCRIPTION
size

The number of steps to remember. The internal buffer will have shape (size, n_indices), where n_indices is the number of indices in the ValuationResult.

TYPE: int

skip_steps

The number of steps to skip between updates. If 0, the memory is updated at every step. If 1, the memory is updated every other step, and so on.

TYPE: int DEFAULT: 0

default

The default value to use when the memory is empty.

TYPE: Union[DT, int, float] DEFAULT: inf

Source code in src/pydvl/valuation/stopping.py
def __init__(
    self,
    size: int,
    skip_steps: int = 0,
    default: Union[DT, int, float] = np.inf,
    *,
    dtype: Type[DT] | None = None,
):
    if not isinstance(default, np.generic):  # convert to np scalar
        default = cast(DT, np.array(default, dtype=np.result_type(default))[()])
    if dtype is not None:  # user forced conversion
        default = dtype(default)
    self.size = validate_number("size", size, int, lower=1)
    self._skip_steps = validate_number("skip_steps", skip_steps, int, lower=0)
    self._count = 0
    self._default = cast(DT, default)
    self._data: NDArray[DT] = np.full(0, default, dtype=type(default))

data property

data: NDArray[DT]

A view on the data. Rows are the steps, columns are the indices

__getitem__

__getitem__(key: Union[slice, Iterable[int], int]) -> NDArray

Get items from the memory, properly handling temporal sequence.

Source code in src/pydvl/valuation/stopping.py
def __getitem__(self, key: Union[slice, Iterable[int], int]) -> NDArray:
    """Get items from the memory, properly handling temporal sequence."""
    key = self._validate_key(key)
    return cast(NDArray, self.data[key])

__len__

__len__() -> int

The number of steps that the memory has saved. This is guaranteed to be between 0 and size, inclusive.

Source code in src/pydvl/valuation/stopping.py
def __len__(self) -> int:
    """The number of steps that the memory has saved. This is guaranteed to be
    between 0 and `size`, inclusive."""
    return min(self.size, self._count // (1 + self._skip_steps))

must_skip

must_skip()

Check if the memory must skip the current update

Source code in src/pydvl/valuation/stopping.py
def must_skip(self):
    """Check if the memory must skip the current update"""
    return self._count % (1 + self._skip_steps) > 0

reset

reset() -> Self

Empty the memory

Source code in src/pydvl/valuation/stopping.py
def reset(self) -> Self:
    """Empty the memory"""
    self._data = np.full(0, self._default, dtype=type(self._default))
    self._count = 0
    return self

update

update(values: NDArray[DT]) -> Self

Update the memory with the values of the current result.

The values are appended to the memory as its last column, and the oldest values (the first column) are removed

Source code in src/pydvl/valuation/stopping.py
def update(self, values: NDArray[DT]) -> Self:
    """Update the memory with the values of the current result.

    The values are appended to the memory as its last column, and the oldest values
    (the first column) are removed
    """
    if len(self._data) == 0:
        self._data = np.full(
            (len(values), self.size),
            self._default,
            dtype=type(self._default),
        )
    self._count += 1
    if self.must_skip():
        return self
    self._data = np.concatenate([self._data[:, 1:], values.reshape(-1, 1)], axis=1)
    return self

StoppingCriterion

StoppingCriterion(modify_result: bool = True)

Bases: ABC

A composable callable object to determine whether a computation must stop.

A StoppingCriterion is a callable taking a ValuationResult and returning a Status. It also keeps track of individual convergence of values with converged, and reports the overall completion of the computation with completion.

Instances of StoppingCriterion can be composed with the binary operators & (and), and | (or), following the truth tables of Status. The unary operator ~ (not) is also supported. These boolean operations act according to the following rules:

  • The results of check() are combined with the operator. See Status for the truth tables.
  • The results of converged are combined with the operator (returning another boolean array).
  • The completion method returns the min, max, or the complement to 1 of the completions of the operands, for AND, OR and NOT respectively. This is required for cases where one of the criteria does not keep track of the convergence of single values, e.g. MaxUpdates, because completion by default returns the mean of the boolean convergence array.

Subclassing

Subclassing this class requires implementing a check() method that returns a Status object based on a given ValuationResult. This method should update the attribute _converged, which is a boolean array indicating whether the value for each index has converged. When this does not make sense for a particular stopping criterion, completion should be overridden to provide an overall completion value, since its default implementation attempts to compute the mean of _converged.

PARAMETER DESCRIPTION
modify_result

If True the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(self, modify_result: bool = True):
    self.modify_result = modify_result
    self._converged = np.full(0, False)
    self._count = 0

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for

NDArray[bool_]

each data point.

count property

count: int

The number of times that the criterion has been checked.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

completion

completion() -> float

Returns a value between 0 and 1 indicating the completion of the computation.

Source code in src/pydvl/valuation/stopping.py
def completion(self) -> float:
    """Returns a value between 0 and 1 indicating the completion of the
    computation."""
    if self.converged.size == 0:
        return 0.0
    return float(np.mean(self.converged).item())