pydvl.valuation.stopping ¶

This module provides a basic set of criteria to stop valuation algorithms, in particular all semi-values. Common examples are MinUpdates, MaxTime, or HistoryDeviation.

Stopping criteria can behave in different ways depending on the context. For example, MaxUpdates limits the number of updates to values, which depending on the algorithm may mean a different number of utility evaluations or imply other computations like solving a linear or quadratic program. In the case of SemivalueValuation, the criteria are evaluated once per batch, which might lead to different behavior depending on the batch size (e.g. for certain batch sizes it might happen that the number of updates to values after convergence is not exactly what was required, since multiple updates might happen at once).

Stopping criteria are callables that are evaluated on a ValuationResult and return a Status object. They can be combined using boolean operators.

Saving a history of values

The special stopping criterion History can be used to store a rolling history of the values, e.g. for comparing methods as they evolve.

from pydvl.valuation import ShapleyValuation, History
history = History(n_steps=1000)
stopping = MaxUpdates(10000) | history
valuation = ShapleyValuation(utility=utility, sampler=sampler, is_done=stopping)
valuation.fit(training_data)
history.data[0]  # The last update
history.data[-1]  # The 1000th update before last

Combining stopping criteria¶

Objects of type StoppingCriterion can be combined with the binary operators & (and), and | (or), following the truth tables of Status. The unary operator ~ (not) is also supported. See StoppingCriterion for details on how these operations affect the behavior of the stopping criteria.

How convergence is determined¶

Most stopping criteria keep track of the convergence of each index separately but make global decisions based on the overall convergence of some fraction of all indices. For example, if we have a stopping criterion that checks whether the standard error of 90% of values is below a threshold, then methods will keep updating all indices until 90% of them have converged, irrespective of the quality of the individual estimates, and without freezing updates for indices along the way as values individually attain low standard error.

This has some practical implications, because some values do tend to converge sooner than others. For example, assume we use the criterion AbsoluteStandardError(0.02) | MaxUpdates(1000). Then values close to 0 might be marked as "converged" rather quickly because they fulfill the first criterion, say after 20 iterations, despite being poor estimates (see the section on pitfalls below for commentary on stopping criteria based on standard errors). Because other indices take longer to reach a low standard error and the criterion is a global check, the "converged" ones keep being updated and end up being good estimates. In this case, this has been beneficial, but one might not wish for converged values to be updated, if one is sure that the criterion is adequate for individual values.

Semi-value methods include a parameter skip_converged that allows to skip the computation of values that have converged. The way to avoid doing this too early is to use a more stringent check, e.g. AbsoluteStandardError(1e-3) | MaxUpdates(1000). With skip_converged=True this check can still take less time than the first one, despite requiring more iterations for some indices.

Stopping criterion for finite samplers

Using a finite sampler naturally defines when the valuation algorithm terminates. However, in order to properly report progress, we need to use a stopping criterion that keeps track of the number of iterations. In this case, one can use NoStopping with the sampler as an argument. This quirk is due to progress reported depending on the completion attribute of a criterion. Here's how it's done:

from pydvl.valuation import ShapleyValuation, NoStopping

sampler = DeterministicUniformSampler()
stopping = NoStopping(sampler)
valuation = ShapleyValuation(
    utility=utility, sampler=sampler, is_done=stopping, progress=True
)
with parallel_config(n_jobs=4):
    valuation.fit(data)

Choosing a stopping criterion¶

The choice of a stopping criterion greatly depends on the algorithm and the context. A safe bet is to combine a MaxUpdates or a MaxTime with a HistoryDeviation or an RankCorrelation. The former will ensure that the computation does not run for too long, while the latter will try to achieve results that are stable enough. Note however that if the threshold is too strict, one will always end up running until a maximum number of iterations or time. Also keep in mind that different values converge at different times, so you might want to use tight thresholds and skip_converged as described above for semi-values.

Example

from pydvl.valuation import BanzhafValuation, MinUpdates, MSRSampler, RankCorrelation

model = ... # Some sklearn-compatible model
scorer = SupervisedScorer("accuracy", test_data, default=0.0)
utility = ModelUtility(model, scorer)
sampler = MSRSampler(seed=seed)
stopping = RankCorrelation(rtol=1e-2, burn_in=32) | MinUpdates(1000)
valuation = BanzhafValuation(utility=utility, sampler=sampler, is_done=stopping)
with parallel_config(n_jobs=4):
    valuation.fit(training_data)
result = valuation.result

This will compute the Banzhaf semivalues for utility until either the change in Spearman rank correlation between updates is below 1e-2 or 1000 updates have been performed. The burn_in parameter is used to discard the first 32 updates from the computation of the standard error.

Warning

Be careful not to reuse the same stopping criterion for different computations. The object has state, which is reset by fit() for some valuation methods, but this is not guaranteed for all methods. If you need to reuse the same criterion, it's safer to create a new instance.

Interactions with sampling schemes and other pitfalls of stopping criteria¶

When sampling over powersets with a sequential index iteration, indices' values are updated sequentially, as expected. Now, if the number of samples per index is high, it might be a long while until the next index is updated. In this case, criteria like MinUpdates will seem to stall after each index has reached the specified number of updates, even though the computation is still ongoing. A "fix" is to set the skip_converged parameter of Semi-value methods to True, so that as soon as the stopping criterion is fulfilled for an index, the computation continues. Note that this will probably break any desirable properties of certain samplers, for instance the StratifiedSampler.

Problem with 'skip_converged'

Alas, the above will fail under some circumstances, until we fix this bug

Different samplers define different "update strategies" for values. For example, MSRSampler updates the counts field of a ValuationResult only for about half of the utility evaluations, because it reuses samples. This means that a stopping criterion like MaxChecks will not work as expected, because it will count the number of calls to the criterion, not the number of updates to the values. In this case, one should use MaxUpdates or, more likely, MinUpdates instead.

Finally, stopping criteria that rely on the standard error of the values, like AbsoluteStandardError, should be used with care. The standard error is a measure of the uncertainty of the estimate, but it does not guarantee that the estimate is close to the true value. For example, if the utility function is very noisy, the standard error might be very low, but the estimate might be far from the true value. In this case, one might want to use a RankCorrelation instead, which checks whether the rank of the values is stable.

Creating stopping criteria¶

In order to create a new stopping criterion, one can subclass StoppingCriterion and implement the _check method. This method should return a Status and update the _converged attribute, which is a boolean array indicating whether the value for each index has converged. When this does not make sense for a particular stopping criterion, completion should be overridden to provide an overall completion value, since its default implementation attempts to compute the mean of _converged.

References¶

Ghorbani, A., Zou, J., 2019. Data Shapley: Equitable Valuation of Data for Machine Learning. In: Proceedings of the 36th International Conference on Machine Learning, PMLR, pp. 2242–2251. ↩

AbsoluteStandardError ¶

AbsoluteStandardError(
    threshold: float,
    fraction: float = 1.0,
    burn_in: int = 4,
    modify_result: bool = True,
)

Bases: StoppingCriterion

Determine convergence based on the standard error of the values.

If \(s_i\) is the standard error for datum \(i\), then this criterion returns Converged if \(s_i < \epsilon\) for all \(i\) and a threshold value \(\epsilon \gt 0\).

Warning

This criterion should be used with care. The standard error is a measure of the uncertainty of the estimate, but it does not guarantee that the estimate is close to the true value. For example, if the utility function is very noisy, the standard error might be very low, but the estimate might be far from the true value. In this case, one might want to use a RankCorrelation instead, which checks whether the rank of the values is stable.

PARAMETER	DESCRIPTION
`threshold`	A value is considered to have converged if the standard error is below this threshold. A way of choosing it is to pick some percentage of the range of the values. For Shapley values this is the difference between the maximum and minimum of the utility function (to see this substitute the maximum and minimum values of the utility into the marginal contribution formula). TYPE: `float`
`fraction`	The fraction of values that must have converged for the criterion to return Converged. TYPE: `float` DEFAULT: `1.0`
`burn_in`	The number of iterations to ignore before checking for convergence. This is required because computations typically start with zero variance, as a result of using zeros(). The default is set to an arbitrary minimum which is usually enough but may need to be increased. TYPE: `int` DEFAULT: `4`
`modify_result`	If `True` the status of the input ValuationResult is modified in place after the call. TYPE: `bool` DEFAULT: `True`

Source code in src/pydvl/valuation/stopping.py

def __init__(
    self,
    threshold: float,
    fraction: float = 1.0,
    burn_in: int = 4,
    modify_result: bool = True,
):
    super().__init__(modify_result=modify_result)
    self.threshold = threshold
    self.fraction = fraction
    self.burn_in = burn_in

converged `property` ¶

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS	DESCRIPTION
`NDArray[bool_]`	A boolean array indicating whether the values have converged for
`NDArray[bool_]`	each data point.

count `property` ¶

count: int

The number of times that the criterion has been checked.

call ¶

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py

def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

completion ¶

completion() -> float

Returns a value between 0 and 1 indicating the completion of the computation.

Source code in src/pydvl/valuation/stopping.py

def completion(self) -> float:
    """Returns a value between 0 and 1 indicating the completion of the
    computation."""
    if self.converged.size == 0:
        return 0.0
    return float(np.mean(self.converged).item())

History ¶

History(n_steps: int, skip_steps: int = 0, modify_result: bool = False)

Bases: StoppingCriterion

A dummy stopping criterion that stores the last steps values in a rolling memory.

You can access the values via indexing with the [] operator. Indices should always be negative, with the most recent value being -1, the second most recent being -2, and so on.

Typical usage

stopping = MaxSamples(5000) | (history := History(5000))
valuation = TMCShapleyValuation(utility, sampler, is_done=stopping)
with parallel_config(n_jobs=-1):
    valuation.fit(training_data)
# history[-10:]  # contains the last 10 steps

Comparing histories across valuation methods

Care must be taken when comparing the histories saved while fitting different methods. The rate at which stopping criteria are checked, and hence a History is updated is not guaranteed to be the same, due to differences in sampling and batching. For instance, a deterministic powerset sampler with a FiniteSequentialIndexIteration with a batch_size=2**(n-1) will update the history exactly n times (if given enough iterations by other stopping criteria), where n is the number of indices. If instead the batch size is 1, the history will be updated 2**n times, but all the values except for one index will remain constant during 2**(n-1) iterations. Comparing any of these to, say, a PermutationSampler which results in one update to the history per permutation, requires setting the parameter skip_steps adequately in the respective History objects.

Example

result = ValuationResult.from_random(size=7)
history = History(n_steps=5)
history(result)
assert all(history[-1] == result)

Args: n_steps: The number of steps to remember. skip_steps: The number of steps to skip between updates. If 0, the memory is updated at every check of the criterion. If 1, the memory is updated every other step, and so on. This is useful to synchronize the memory of methods with different update rates. modify_result: Ignored.

Source code in src/pydvl/valuation/stopping.py

def __init__(self, n_steps: int, skip_steps: int = 0, modify_result: bool = False):
    super().__init__(modify_result=False)
    self.memory = RollingMemory(
        size=n_steps, skip_steps=skip_steps, default=np.inf, dtype=np.float64
    )

converged `property` ¶

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS	DESCRIPTION
`NDArray[bool_]`	A boolean array indicating whether the values have converged for
`NDArray[bool_]`	each data point.

count `property` ¶

count: int

The number of times that the criterion has been checked.

data `property` ¶

data: NDArray[float64]

A view on the data. Rows are the steps, columns are the indices

call ¶

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py

def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

len ¶

__len__() -> int

The number of steps that the memory has saved. This is guaranteed to be between 0 and n_steps, inclusive.

Source code in src/pydvl/valuation/stopping.py

def __len__(self) -> int:
    """The number of steps that the memory has saved. This is guaranteed to be
    between 0 and `n_steps`, inclusive."""
    return len(self.memory)

HistoryDeviation ¶

HistoryDeviation(
    n_steps: int,
    rtol: float,
    pin_converged: bool = True,
    modify_result: bool = True,
)

Bases: StoppingCriterion

A simple check for relative distance to a previous step in the computation.

The method used by Ghorbani and Zou, (2019)¹ computes the relative distances between the current values \(v_i^t\) and the values at the previous checkpoint \(v_i^{t-\tau}\). If the sum is below a given threshold, the computation is terminated.

\[\sum_{i=1}^n \frac{\left| v_i^t - v_i^{t-\tau} \right|}{v_i^t} < \epsilon.\]

When the denominator is zero, the summand is set to the value of \(v_i^{ t-\tau}\).

This implementation is slightly generalised to allow for different number of updates to individual indices, as happens with powerset samplers instead of permutations. Every subset of indices that is found to converge can be pinned to that state. Once all indices have converged the method has converged.

Warning

This criterion is meant for the reproduction of the results in the paper, but we do not recommend using it in practice.

PARAMETER	DESCRIPTION
`n_steps`	Compare values after so many steps. A step is one evaluation of the criterion, which happens once per batch. TYPE: `int`
`rtol`	Relative tolerance for convergence (\(\epsilon\) in the formula). TYPE: `float`
`pin_converged`	If `True`, once an index has converged, it is pinned TYPE: `bool` DEFAULT: `True`

Source code in src/pydvl/valuation/stopping.py

def __init__(
    self,
    n_steps: int,
    rtol: float,
    pin_converged: bool = True,
    modify_result: bool = True,
):
    super().__init__(modify_result=modify_result)
    self.memory = RollingMemory(n_steps + 1, default=np.inf, dtype=np.float64)
    self.rtol = validate_number("rtol", rtol, float, lower=0.0, upper=1.0)
    self.update_op = np.logical_or if pin_converged else np.logical_and

converged `property` ¶

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS	DESCRIPTION
`NDArray[bool_]`	A boolean array indicating whether the values have converged for
`NDArray[bool_]`	each data point.

count `property` ¶

count: int

The number of times that the criterion has been checked.

call ¶

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py

def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

completion ¶

completion() -> float

Returns a value between 0 and 1 indicating the completion of the computation.

Source code in src/pydvl/valuation/stopping.py

def completion(self) -> float:
    """Returns a value between 0 and 1 indicating the completion of the
    computation."""
    if self.converged.size == 0:
        return 0.0
    return float(np.mean(self.converged).item())

MaxChecks ¶

MaxChecks(n_checks: int | None, modify_result: bool = True)

Bases: StoppingCriterion

Terminate as soon as the number of checks exceeds the threshold.

A "check" is one call to the criterion. Note that this might have different interpretations depending on the sampler. For example, MSRSampler performs a single utility evaluation to update all indices, so that's len(training_data) checks for a single training of the model. But it also only changes the counts field of the ValuationResult for about half of the indices, which is what e.g. MaxUpdates checks.

PARAMETER	DESCRIPTION
`n_checks`	Threshold: if `None`, no check is performed, effectively creating a (never) stopping criterion that always returns `Pending`. TYPE: `int \| None`
`modify_result`	If `True` the status of the input ValuationResult is modified in place after the call. TYPE: `bool` DEFAULT: `True`

Source code in src/pydvl/valuation/stopping.py

def __init__(self, n_checks: int | None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    if n_checks is not None:
        n_checks = validate_number("n_checks", n_checks, int, lower=1)
    self.n_checks = n_checks

converged `property` ¶

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS	DESCRIPTION
`NDArray[bool_]`	A boolean array indicating whether the values have converged for
`NDArray[bool_]`	each data point.

count `property` ¶

count: int

The number of times that the criterion has been checked.

call ¶

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py

def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

MaxSamples ¶

MaxSamples(sampler: IndexSampler, n_samples: int, modify_result: bool = True)

Bases: StoppingCriterion

Run until the sampler has sampled the given number of samples.

Warning

If the sampler is batched, and the valuation method runs in parallel, the check might be off by the sampler's batch size.

PARAMETER	DESCRIPTION
`sampler`	The sampler to check. TYPE: `IndexSampler`
`n_samples`	The number of samples to run until. TYPE: `int`
`modify_result`	If `True` the status of the input ValuationResult is modified in place after the call. TYPE: `bool` DEFAULT: `True`

Source code in src/pydvl/valuation/stopping.py

def __init__(
    self, sampler: IndexSampler, n_samples: int, modify_result: bool = True
):
    super().__init__(modify_result=modify_result)
    self.sampler = sampler
    self.n_samples = validate_number("n_samples", n_samples, int, lower=1)
    self._completion = 0.0

converged `property` ¶

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS	DESCRIPTION
`NDArray[bool_]`	A boolean array indicating whether the values have converged for
`NDArray[bool_]`	each data point.

count `property` ¶

count: int

The number of times that the criterion has been checked.

call ¶

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py

def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

MaxTime ¶

MaxTime(seconds: float | None, modify_result: bool = True)

Bases: StoppingCriterion

Terminate if the computation time exceeds the given number of seconds.

Checks the elapsed time since construction.

PARAMETER	DESCRIPTION
`seconds`	Threshold: The computation is terminated if the elapsed time between object construction and a _check exceeds this value. If `None`, no _check is performed, effectively creating a (never) stopping criterion that always returns `Pending`. TYPE: `float \| None`
`modify_result`	If `True` the status of the input ValuationResult is modified in place after the call. TYPE: `bool` DEFAULT: `True`

Source code in src/pydvl/valuation/stopping.py

def __init__(self, seconds: float | None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    if seconds is None:
        seconds = np.inf
    self.max_seconds = validate_number("seconds", seconds, float, lower=1e-6)
    self.start = time()

converged `property` ¶

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS	DESCRIPTION
`NDArray[bool_]`	A boolean array indicating whether the values have converged for
`NDArray[bool_]`	each data point.

count `property` ¶

count: int

The number of times that the criterion has been checked.

call ¶

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py

def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

MaxUpdates ¶

MaxUpdates(n_updates: int | None, modify_result: bool = True)

Bases: StoppingCriterion

Terminate if any number of value updates exceeds or equals the given threshold.

Note

If you want to ensure that all values have been updated, you probably want MinUpdates instead.

This checks the counts field of a ValuationResult, i.e. the number of times that each index has been updated. For powerset samplers, the maximum of this number coincides with the maximum number of subsets sampled. For permutation samplers, it coincides with the number of permutations sampled.

PARAMETER	DESCRIPTION
`n_updates`	Threshold: if `None`, no check is performed, effectively creating a (never) stopping criterion that always returns `Pending`. TYPE: `int \| None`
`modify_result`	If `True` the status of the input ValuationResult is modified in place after the call. TYPE: `bool` DEFAULT: `True`

Source code in src/pydvl/valuation/stopping.py

def __init__(self, n_updates: int | None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    if n_updates is not None:
        n_updates = validate_number("n_updates", n_updates, int, lower=1)
    self.n_updates = n_updates
    self.last_max = 0

converged `property` ¶

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS	DESCRIPTION
`NDArray[bool_]`	A boolean array indicating whether the values have converged for
`NDArray[bool_]`	each data point.

count `property` ¶

count: int

The number of times that the criterion has been checked.

call ¶

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py

def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

MinUpdates ¶

MinUpdates(n_updates: int | None, modify_result: bool = True)

Bases: StoppingCriterion

Terminate as soon as all value updates exceed or equal the given threshold.

This checks the counts field of a ValuationResult, i.e. the number of times that each index has been updated. For powerset samplers, the minimum of this number is a lower bound for the number of subsets sampled. For permutation samplers, it lower-bounds the amount of permutations sampled.

PARAMETER	DESCRIPTION
`n_updates`	Threshold: if `None`, no _check is performed, effectively creating a (never) stopping criterion that always returns `Pending`. TYPE: `int \| None`
`modify_result`	If `True` the status of the input ValuationResult is modified in place after the call. TYPE: `bool` DEFAULT: `True`

Source code in src/pydvl/valuation/stopping.py

def __init__(self, n_updates: int | None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    if n_updates is not None:
        n_updates = validate_number("n_updates", n_updates, int, lower=1)
    self.n_updates = n_updates
    self.last_min = 0
    self._actual_completion = 0.0

converged `property` ¶

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS	DESCRIPTION
`NDArray[bool_]`	A boolean array indicating whether the values have converged for
`NDArray[bool_]`	each data point.

count `property` ¶

count: int

The number of times that the criterion has been checked.

call ¶

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py

def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

NoStopping ¶

NoStopping(sampler: IndexSampler | None = None, modify_result: bool = True)

Bases: StoppingCriterion

Keep running forever or until sampling stops.

If a sampler instance is passed, and it is a finite sampler, its counter will be used to update completion status.

PARAMETER	DESCRIPTION
`sampler`	A sampler instance to use for completion status. TYPE: `IndexSampler \| None` DEFAULT: `None`
`modify_result`	If `True` the status of the input ValuationResult is modified in place after the call TYPE: `bool` DEFAULT: `True`

Source code in src/pydvl/valuation/stopping.py

def __init__(self, sampler: IndexSampler | None = None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    self.sampler = sampler

converged `property` ¶

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS	DESCRIPTION
`NDArray[bool_]`	A boolean array indicating whether the values have converged for
`NDArray[bool_]`	each data point.

count `property` ¶

count: int

The number of times that the criterion has been checked.

call ¶

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py

def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

RankCorrelation ¶

RankCorrelation(
    rtol: float, burn_in: int, fraction: float = 1.0, modify_result: bool = True
)

Bases: StoppingCriterion

A check for stability of Spearman correlation between checks.

Convergence is reached when the change in rank correlation between two successive iterations is below a given threshold.

This criterion is used in (Wang et al.)².

The meaning of successive iterations

Stopping criteria in pyDVL are typically evaluated after each batch of value updates is received. This can imply very different things, depending on the configuration of the samplers. For this reason, RankCorrelation keeps itself track of the number of updates that each index has seen, and only checks for correlation changes when a given fraction of all indices has been updated more than burn_in times and once since last time the criterion was checked.

PARAMETER	DESCRIPTION
`rtol`	Relative tolerance for convergence (\(\epsilon\) in the formula) TYPE: `float`
`burn_in`	The minimum number of updates an index must have seen before checking for convergence. This is required because the first correlation checks are usually meaningless. TYPE: `int`
`fraction`	The fraction of values that must have been updated between two correlation checks. This is to avoid comparing two results where only one value has been updated, which would have almost perfect rank correlation. TYPE: `float` DEFAULT: `1.0`
`modify_result`	If `True`, the status of the input ValuationResult is modified in place after the call. TYPE: `bool` DEFAULT: `True`

Added in 0.9.0

Changed in 0.10.0

The behaviour of the burn_in parameter was changed to look at value updates. The parameter fraction was added.

Source code in src/pydvl/valuation/stopping.py

def __init__(
    self,
    rtol: float,
    burn_in: int,
    fraction: float = 1.0,
    modify_result: bool = True,
):
    super().__init__(modify_result=modify_result)

    self.rtol = validate_number("rtol", rtol, float, lower=0.0, upper=1.0)
    self.burn_in = burn_in
    self.fraction = validate_number(
        "fraction", fraction, float, lower=0.0, upper=1.0
    )
    self.memory = RollingMemory(size=2, default=np.nan, dtype=np.float64)
    self.count_memory = RollingMemory(size=2, default=0, dtype=np.int_)
    self._corr = np.nan
    self._completion = 0.0

converged `property` ¶

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS	DESCRIPTION
`NDArray[bool_]`	A boolean array indicating whether the values have converged for
`NDArray[bool_]`	each data point.

count `property` ¶

count: int

The number of times that the criterion has been checked.

call ¶

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py

def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

RollingMemory ¶

RollingMemory(
    size: int,
    skip_steps: int = 0,
    default: Union[DT, int, float] = inf,
    *,
    dtype: Type[DT] | None = None,
)

Bases: Generic[DT]

A simple rolling memory for the last size values of each index.

Updating the memory results in new values being copied to the last column of the matrix and old ones removed from the first.

PARAMETER	DESCRIPTION
`size`	The number of steps to remember. The internal buffer will have shape `(size, n_indices)`, where `n_indices` is the number of indices in the ValuationResult. TYPE: `int`
`skip_steps`	The number of steps to skip between updates. If `0`, the memory is updated at every step. If `1`, the memory is updated every other step, and so on. TYPE: `int` DEFAULT: `0`
`default`	The default value to use when the memory is empty. TYPE: `Union[DT, int, float]` DEFAULT: `inf`

Source code in src/pydvl/valuation/stopping.py

def __init__(
    self,
    size: int,
    skip_steps: int = 0,
    default: Union[DT, int, float] = np.inf,
    *,
    dtype: Type[DT] | None = None,
):
    if not isinstance(default, np.generic):  # convert to np scalar
        default = cast(DT, np.array(default, dtype=np.result_type(default))[()])
    if dtype is not None:  # user forced conversion
        default = dtype(default)
    self.size = validate_number("size", size, int, lower=1)
    self._skip_steps = validate_number("skip_steps", skip_steps, int, lower=0)
    self._count = 0
    self._default = cast(DT, default)
    self._data: NDArray[DT] = np.full(0, default, dtype=type(default))

data `property` ¶

data: NDArray[DT]

A view on the data. Rows are the steps, columns are the indices

getitem ¶

__getitem__(key: Union[slice, Iterable[int], int]) -> NDArray

Get items from the memory, properly handling temporal sequence.

Source code in src/pydvl/valuation/stopping.py

def __getitem__(self, key: Union[slice, Iterable[int], int]) -> NDArray:
    """Get items from the memory, properly handling temporal sequence."""
    key = self._validate_key(key)
    return cast(NDArray, self.data[key])

len ¶

__len__() -> int

The number of steps that the memory has saved. This is guaranteed to be between 0 and size, inclusive.

Source code in src/pydvl/valuation/stopping.py

def __len__(self) -> int:
    """The number of steps that the memory has saved. This is guaranteed to be
    between 0 and `size`, inclusive."""
    return min(self.size, self._count // (1 + self._skip_steps))

must_skip ¶

must_skip()

Check if the memory must skip the current update

Source code in src/pydvl/valuation/stopping.py

def must_skip(self):
    """Check if the memory must skip the current update"""
    return self._count % (1 + self._skip_steps) > 0

reset ¶

reset() -> Self

Empty the memory

Source code in src/pydvl/valuation/stopping.py

def reset(self) -> Self:
    """Empty the memory"""
    self._data = np.full(0, self._default, dtype=type(self._default))
    self._count = 0
    return self

update ¶

update(values: NDArray[DT]) -> Self

Update the memory with the values of the current result.

The values are appended to the memory as its last column, and the oldest values (the first column) are removed

Source code in src/pydvl/valuation/stopping.py

def update(self, values: NDArray[DT]) -> Self:
    """Update the memory with the values of the current result.

    The values are appended to the memory as its last column, and the oldest values
    (the first column) are removed
    """
    if len(self._data) == 0:
        self._data = np.full(
            (len(values), self.size),
            self._default,
            dtype=type(self._default),
        )
    self._count += 1
    if self.must_skip():
        return self
    self._data = np.concatenate([self._data[:, 1:], values.reshape(-1, 1)], axis=1)
    return self

StoppingCriterion ¶

StoppingCriterion(modify_result: bool = True)

Bases: ABC

A composable callable object to determine whether a computation must stop.

A StoppingCriterion is a callable taking a ValuationResult and returning a Status. It also keeps track of individual convergence of values with converged, and reports the overall completion of the computation with completion.

Instances of StoppingCriterion can be composed with the binary operators & (and), and | (or), following the truth tables of Status. The unary operator ~ (not) is also supported. These boolean operations act according to the following rules:

The results of check() are combined with the operator. See Status for the truth tables.
The results of converged are combined with the operator (returning another boolean array).
The completion method returns the min, max, or the complement to 1 of the completions of the operands, for AND, OR and NOT respectively. This is required for cases where one of the criteria does not keep track of the convergence of single values, e.g. MaxUpdates, because completion by default returns the mean of the boolean convergence array.

Subclassing¶

Subclassing this class requires implementing a check() method that returns a Status object based on a given ValuationResult. This method should update the attribute _converged, which is a boolean array indicating whether the value for each index has converged. When this does not make sense for a particular stopping criterion, completion should be overridden to provide an overall completion value, since its default implementation attempts to compute the mean of _converged.

PARAMETER	DESCRIPTION
`modify_result`	If `True` the status of the input ValuationResult is modified in place after the call. TYPE: `bool` DEFAULT: `True`

Source code in src/pydvl/valuation/stopping.py

def __init__(self, modify_result: bool = True):
    self.modify_result = modify_result
    self._converged = np.full(0, False)
    self._count = 0

converged `property` ¶

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS	DESCRIPTION
`NDArray[bool_]`	A boolean array indicating whether the values have converged for
`NDArray[bool_]`	each data point.

count `property` ¶

count: int

The number of times that the criterion has been checked.

call ¶

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py

def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

completion ¶

completion() -> float

Returns a value between 0 and 1 indicating the completion of the computation.

Source code in src/pydvl/valuation/stopping.py

def completion(self) -> float:
    """Returns a value between 0 and 1 indicating the completion of the
    computation."""
    if self.converged.size == 0:
        return 0.0
    return float(np.mean(self.converged).item())

pydvl.valuation.stopping ¶

Combining stopping criteria¶

How convergence is determined¶

Choosing a stopping criterion¶

Interactions with sampling schemes and other pitfalls of stopping criteria¶

Creating stopping criteria¶

References¶

AbsoluteStandardError ¶

converged property ¶

count property ¶

__call__ ¶

completion ¶

History ¶

converged property ¶

count property ¶

data property ¶

__call__ ¶

__len__ ¶

HistoryDeviation ¶

converged property ¶

count property ¶

__call__ ¶

completion ¶

MaxChecks ¶

converged property ¶

count property ¶

__call__ ¶

MaxSamples ¶

converged property ¶

count property ¶

__call__ ¶

MaxTime ¶

converged property ¶

count property ¶

__call__ ¶

MaxUpdates ¶

converged property ¶

count property ¶

__call__ ¶

MinUpdates ¶

converged property ¶

count property ¶

__call__ ¶

NoStopping ¶

converged property ¶

count property ¶

__call__ ¶

RankCorrelation ¶

converged property ¶

count property ¶

__call__ ¶

RollingMemory ¶

data property ¶

__getitem__ ¶

__len__ ¶

must_skip ¶

reset ¶

update ¶

StoppingCriterion ¶

Subclassing¶

converged property ¶

count property ¶

__call__ ¶

completion ¶

converged `property` ¶

count `property` ¶

call ¶

converged `property` ¶

count `property` ¶

data `property` ¶

call ¶

len ¶

converged `property` ¶

count `property` ¶

call ¶

converged `property` ¶

count `property` ¶

call ¶

converged `property` ¶

count `property` ¶

call ¶

converged `property` ¶

count `property` ¶

call ¶

converged `property` ¶

count `property` ¶

call ¶

converged `property` ¶

count `property` ¶

call ¶

converged `property` ¶

count `property` ¶

call ¶

converged `property` ¶

count `property` ¶

call ¶

data `property` ¶

getitem ¶

len ¶

converged `property` ¶

count `property` ¶

call ¶