pydvl.valuation.stopping

Stopping criteria for value computations.

This module provides a basic set of stopping criteria to be used with valuation algorithms, in particular all semi-values. Common examples are MinUpdates, MaxTime, or HistoryDeviation.

Stopping criteria can behave in different ways depending on the context. For example, MaxUpdates limits the number of updates to values, which, depending on the algorithm, may translate into a different number of utility evaluations or imply other computations like solving a linear or quadratic program. In the case of SemivalueValuation, the criteria are evaluated once per batch, which might lead to different behavior depending on the batch size (e.g. for certain values of batch_size the number of updates to values after convergence may not be exactly what was requested, since multiple updates can happen at once).

Stopping criteria are callables that are evaluated on a ValuationResult and return a Status object. They can be combined using boolean operators.

Combining stopping criteria

Objects of type StoppingCriterion can be combined with the binary operators & (and), and | (or), following the truth tables of Status. The unary operator ~ (not) is also supported. See StoppingCriterion for details on how these operations affect the behavior of the stopping criteria.
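
For instance, a quick sketch of composition (the individual criteria used here are documented further down this page):

from pydvl.valuation import AbsoluteStandardError, MaxTime, MaxUpdates

# Stop when all standard errors drop below 0.02, but never run for more than
# 10_000 updates per index or longer than one hour, whichever happens first.
stopping = AbsoluteStandardError(0.02) | MaxUpdates(10_000) | MaxTime(3600)

# Negation is also supported:
not_max = ~MaxUpdates(100)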

How convergence is determined

Most stopping criteria keep track of the convergence of each index separately but make global decisions based on the overall convergence of some fraction of all indices. For example, if we have a stopping criterion that checks whether the standard error of 90% of values is below a threshold, then methods will keep updating all indices until 90% of them have converged, irrespective of the quality of the individual estimates, and without freezing updates for indices along the way as values individually attain low standard error.

This has some practical implications, because some values tend to converge sooner than others. For example, assume we use the criterion AbsoluteStandardError(0.02) | MaxUpdates(1000). Then values close to 0 might be marked as "converged" rather quickly because they fulfill the first criterion, say after 20 iterations, despite being poor estimates (see the section on pitfalls below for commentary on stopping criteria based on standard errors). Because other indices take longer to reach a low standard error and the criterion is a global check, the "converged" ones keep being updated and end up as good estimates. In this case the continued updates are beneficial, but one might not want converged values to keep being updated if one is sure that the criterion is adequate for individual values.

Semi-value methods accept a parameter skip_converged that allows skipping further updates of values that have converged. The way to avoid doing this too early is to use a more stringent check, e.g. AbsoluteStandardError(1e-3) | MaxUpdates(1000). With skip_converged=True this check can still take less time than the laxer one above, despite requiring more iterations for some indices. A sketch follows.
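
A minimal sketch of this pattern. Here `utility` and `training_data` are assumed to be a ModelUtility and a Dataset set up as in the Banzhaf example further below:

from pydvl.valuation import (
    AbsoluteStandardError, MaxUpdates, PermutationSampler, ShapleyValuation,
)

# Stringent per-value check, with a hard cap on the number of updates
stopping = AbsoluteStandardError(1e-3) | MaxUpdates(1000)
valuation = ShapleyValuation(
    utility=utility,  # a ModelUtility built elsewhere
    sampler=PermutationSampler(),
    is_done=stopping,
    skip_converged=True,  # freeze values that already pass the check
)
valuation.fit(training_data)  # a Dataset built elsewhere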

Stopping criteria for finite samplers

Using a finite sampler naturally defines when the valuation algorithm terminates. However, in order to properly report progress, we need a stopping criterion that keeps track of the number of iterations. In this case, one can use NoStopping with the sampler as an argument. This quirk arises because progress reporting depends on the completion attribute of the criterion. Here's how it's done:

from joblib import parallel_config
from pydvl.valuation import DeterministicUniformSampler, NoStopping, ShapleyValuation

# `utility` and `data` are assumed to be a ModelUtility and a Dataset built elsewhere
sampler = DeterministicUniformSampler()
stopping = NoStopping(sampler)  # completion is reported from the sampler's counter
valuation = ShapleyValuation(
    utility=utility, sampler=sampler, is_done=stopping, progress=True
)
with parallel_config(n_jobs=4):
    valuation.fit(data)

Choosing a stopping criterion

The choice of stopping criterion greatly depends on the algorithm and the context. A safe bet is to combine a MaxUpdates or a MaxTime with a HistoryDeviation or a RankCorrelation. The former ensures that the computation does not run for too long, while the latter tries to achieve results that are stable enough. Note however that if the threshold is too strict, one will always end up running until the maximum number of iterations or the time limit. Also keep in mind that different values converge at different times, so you might want to use tight thresholds together with skip_converged, as described above for semi-values.

Example

from joblib import parallel_config
from pydvl.valuation import (
    DataBanzhafValuation, MinUpdates, ModelUtility,
    MSRSampler, RankCorrelation, SupervisedScorer,
)

model = ...  # Some sklearn-compatible model
# `test_data`, `training_data` and `seed` are assumed to be defined elsewhere
scorer = SupervisedScorer("accuracy", test_data, default=0.0)
utility = ModelUtility(model, scorer)
sampler = MSRSampler(seed=seed)
stopping = RankCorrelation(rtol=1e-2, burn_in=32) | MinUpdates(1000)
valuation = DataBanzhafValuation(utility=utility, sampler=sampler, is_done=stopping)
with parallel_config(n_jobs=4):
    valuation.fit(training_data)
result = valuation.values()
This will compute the Banzhaf semivalues for utility until either the change in Spearman rank correlation between checks is below 1e-2 or every value has received at least 1000 updates. The burn_in parameter discards the first 32 updates of each index before the rank correlation is checked.

Warning

Be careful not to reuse the same stopping criterion for different computations. The object has state, which is reset by fit() for some valuation methods, but this is not guaranteed for all methods. If you need to reuse the same criterion, it's safer to create a new instance.

Interactions with sampling schemes and other pitfalls of stopping criteria

Different samplers define different "update strategies" for values. For example, MSRSampler updates the counts field of a ValuationResult only for about half of the utility evaluations, because it reuses samples. This means that a stopping criterion like MaxChecks will not work as expected, because it will count the number of calls to the criterion, not the number of updates to the values. In this case, one should use MaxUpdates or, more likely, MinUpdates instead.
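
As a sketch of the safer choice with MSR sampling (reusing the `utility` from the Banzhaf example above; the concrete numbers are arbitrary):

from pydvl.valuation import DataBanzhafValuation, MinUpdates, MSRSampler

# With MSR sampling the counts grow unevenly across indices, so require a
# minimum number of updates for every index instead of counting checks
sampler = MSRSampler(seed=42)
stopping = MinUpdates(2000)
valuation = DataBanzhafValuation(utility=utility, sampler=sampler, is_done=stopping)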

Another pitfall is the interaction with the skip_converged parameter of Semi-value methods. If this is set to True, the stopping criterion should probably be more stringent.

Finally, stopping criteria that rely on the standard error of the values, like AbsoluteStandardError, should be used with care. The standard error is a measure of the uncertainty of the estimate, but it does not guarantee that the estimate is close to the true value. For example, if the utility function is very noisy, the standard error might be very low, but the estimate might be far from the true value. In this case, one might want to use a RankCorrelation instead, which checks whether the rank of the values is stable.

Creating stopping criteria

In order to create a new stopping criterion, one can subclass StoppingCriterion and implement the _check method. This method should return a Status and update the _converged attribute, which is a boolean array indicating whether the value for each index has converged. When this does not make sense for a particular stopping criterion, completion should be overridden to provide an overall completion value, since its default implementation attempts to compute the mean of _converged.
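
For illustration, here is a minimal sketch of a custom criterion that declares an index converged once it has received at least n updates (similar in spirit to MinUpdates; the import paths for Status and ValuationResult are assumptions based on pyDVL's module layout):

from pydvl.utils.status import Status
from pydvl.valuation.result import ValuationResult
from pydvl.valuation.stopping import StoppingCriterion


class MinCount(StoppingCriterion):
    """Hypothetical criterion: converge once every index has >= n updates."""

    def __init__(self, n: int, modify_result: bool = True):
        super().__init__(modify_result=modify_result)
        self.n = n

    def _check(self, result: ValuationResult) -> Status:
        # Update the per-index flags used by `converged` and by the default
        # implementation of `completion()`
        self._converged = result.counts >= self.n
        return Status.Converged if self._converged.all() else Status.Pending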

References


  1. Ghorbani, A., Zou, J., 2019. Data Shapley: Equitable Valuation of Data for Machine Learning. In: Proceedings of the 36th International Conference on Machine Learning, PMLR, pp. 2242–2251.
  2. Wang, J.T., Jia, R., 2023. Data Banzhaf: A Robust Data Valuation Framework for Machine Learning. In: Proceedings of the 26th International Conference on Artificial Intelligence and Statistics, PMLR.

StoppingCriterion

StoppingCriterion(modify_result: bool = True)

Bases: ABC

A composable callable object to determine whether a computation must stop.

A StoppingCriterion is a callable taking a ValuationResult and returning a Status. It also keeps track of individual convergence of values with converged, and reports the overall completion of the computation with completion.

Instances of StoppingCriterion can be composed with the binary operators & (and), and | (or), following the truth tables of Status. The unary operator ~ (not) is also supported. These boolean operations act according to the following rules:

  • The results of check() are combined with the operator. See Status for the truth tables.
  • The results of converged are combined with the operator (returning another boolean array).
  • The completion method returns the min, max, or the complement to 1 of the completions of the operands, for AND, OR and NOT respectively. This is required for cases where one of the criteria does not keep track of the convergence of single values, e.g. MaxUpdates, because completion by default returns the mean of the boolean convergence array.
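
Concretely (a sketch; the actual completion values depend on each criterion's state when it is queried):

from pydvl.valuation import MaxTime, MinUpdates

both = MinUpdates(100) & MaxTime(60)
either = MinUpdates(100) | MaxTime(60)

# AND composition: completion is the minimum of the operands' completions
both.completion()
# OR composition: completion is the maximum of the operands' completions
either.completion()
# NOT: completion is the complement to 1
(~MinUpdates(100)).completion()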

Subclassing

Subclassing this class requires implementing a check() method that returns a Status object based on a given ValuationResult. This method should update the attribute _converged, which is a boolean array indicating whether the value for each index has converged. When this does not make sense for a particular stopping criterion, completion should be overridden to provide an overall completion value, since its default implementation attempts to compute the mean of _converged.

PARAMETER DESCRIPTION
modify_result

If True the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(self, modify_result: bool = True):
    self.modify_result = modify_result
    self._converged = np.full(0, False)
    self._count = 0

count property

count: int

The number of times that the criterion has been checked.

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for each data point.

completion

completion() -> float

Returns a value between 0 and 1 indicating the completion of the computation.

Source code in src/pydvl/valuation/stopping.py
def completion(self) -> float:
    """Returns a value between 0 and 1 indicating the completion of the
    computation."""
    if self.converged.size == 0:
        return 0.0
    return float(np.mean(self.converged).item())

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

AbsoluteStandardError

AbsoluteStandardError(
    threshold: float,
    fraction: float = 1.0,
    burn_in: int = 4,
    modify_result: bool = True,
)

Bases: StoppingCriterion

Determine convergence based on the standard error of the values.

If \(s_i\) is the standard error for datum \(i\), then this criterion returns Converged if \(s_i < \epsilon\) for all \(i\) and a threshold value \(\epsilon \gt 0\).

Warning

This criterion should be used with care. The standard error is a measure of the uncertainty of the estimate, but it does not guarantee that the estimate is close to the true value. For example, if the utility function is very noisy, the standard error might be very low, but the estimate might be far from the true value. In this case, one might want to use a RankCorrelation instead, which checks whether the rank of the values is stable.

PARAMETER DESCRIPTION
threshold

A value is considered to have converged if the standard error is below this threshold. A way of choosing it is to pick some percentage of the range of the values. For Shapley values this is the difference between the maximum and minimum of the utility function (to see this, substitute the maximum and minimum values of the utility into the marginal contribution formula). See the sketch after this parameter list.

TYPE: float

fraction

The fraction of values that must have converged for the criterion to return Converged.

TYPE: float DEFAULT: 1.0

burn_in

The number of iterations to ignore before checking for convergence. This is required because computations typically start with zero variance, as a result of using zeros(). The default is set to an arbitrary minimum which is usually enough but may need to be increased.

TYPE: int DEFAULT: 4

modify_result

If True the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True
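
As a sketch of the heuristic suggested for threshold above (the utility range shown is an assumption for an accuracy-like scorer taking values in [0, 1]):

from pydvl.valuation import AbsoluteStandardError

u_min, u_max = 0.0, 1.0  # assumed range of the utility (e.g. accuracy)
threshold = 0.02 * (u_max - u_min)  # 2% of the range of the values

stopping = AbsoluteStandardError(threshold, fraction=0.9, burn_in=16)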

Source code in src/pydvl/valuation/stopping.py
def __init__(
    self,
    threshold: float,
    fraction: float = 1.0,
    burn_in: int = 4,
    modify_result: bool = True,
):
    super().__init__(modify_result=modify_result)
    self.threshold = threshold
    self.fraction = fraction
    self.burn_in = burn_in

count property

count: int

The number of times that the criterion has been checked.

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for each data point.

completion

completion() -> float

Returns a value between 0 and 1 indicating the completion of the computation.

Source code in src/pydvl/valuation/stopping.py
def completion(self) -> float:
    """Returns a value between 0 and 1 indicating the completion of the
    computation."""
    if self.converged.size == 0:
        return 0.0
    return float(np.mean(self.converged).item())

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

MaxChecks

MaxChecks(n_checks: int | None, modify_result: bool = True)

Bases: StoppingCriterion

Terminate as soon as the number of checks exceeds the threshold.

A "check" is one call to the criterion. Note that this might have different interpretations depending on the sampler. For example, MSRSampler performs a single utility evaluation to update all indices, so that's len(training_data) checks for a single training of the model. But it also only changes the counts field of the ValuationResult for about half of the indices, which is what e.g. MaxUpdates checks.

PARAMETER DESCRIPTION
n_checks

Threshold: if None, no _check is performed, effectively creating a (never) stopping criterion that always returns Pending.

TYPE: int | None

modify_result

If True the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(self, n_checks: int | None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    if n_checks is not None and n_checks < 1:
        raise ValueError("n_iterations must be at least 1 or None")
    self.n_checks = n_checks or np.inf

count property

count: int

The number of times that the criterion has been checked.

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for each data point.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

MaxUpdates

MaxUpdates(n_updates: int | None, modify_result: bool = True)

Bases: StoppingCriterion

Terminate as soon as the number of updates to any single value reaches or exceeds the given threshold.

Note

If you want to ensure that all values have been updated, you probably want MinUpdates instead.

This checks the counts field of a ValuationResult, i.e. the number of times that each index has been updated. For powerset samplers, the maximum of this number coincides with the maximum number of subsets sampled. For permutation samplers, it coincides with the number of permutations sampled.

PARAMETER DESCRIPTION
n_updates

Threshold: if None, no _check is performed, effectively creating a (never) stopping criterion that always returns Pending.

TYPE: int | None

modify_result

If True the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(self, n_updates: int | None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    if n_updates is not None and n_updates < 1:
        raise ValueError("n_updates must be at least 1 or None")
    self.n_updates = n_updates
    self.last_max = 0

count property

count: int

The number of times that the criterion has been checked.

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for each data point.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

NoStopping

NoStopping(sampler: IndexSampler | None = None, modify_result: bool = True)

Bases: StoppingCriterion

Keep running forever or until sampling stops.

If a sampler instance is passed, and it is a finite sampler, its counter will be used to update completion status.

PARAMETER DESCRIPTION
sampler

A sampler instance to use for completion status.

TYPE: IndexSampler | None DEFAULT: None

modify_result

If True the status of the input ValuationResult is modified in place after the call

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(self, sampler: IndexSampler | None = None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    self.sampler = sampler

count property

count: int

The number of times that the criterion has been checked.

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for each data point.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

MinUpdates

MinUpdates(n_updates: int | None, modify_result: bool = True)

Bases: StoppingCriterion

Terminate as soon as the number of updates to every value reaches or exceeds the given threshold.

This checks the counts field of a ValuationResult, i.e. the number of times that each index has been updated. For powerset samplers, the minimum of this number is a lower bound for the number of subsets sampled. For permutation samplers, it is a lower bound for the number of permutations sampled.

PARAMETER DESCRIPTION
n_updates

Threshold: if None, no _check is performed, effectively creating a (never) stopping criterion that always returns Pending.

TYPE: int | None

modify_result

If True the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(self, n_updates: int | None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    self.n_updates = n_updates
    self.last_min = 0
    self._actual_completion = 0.0

count property

count: int

The number of times that the criterion has been checked.

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for each data point.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

MaxTime

MaxTime(seconds: float | None, modify_result: bool = True)

Bases: StoppingCriterion

Terminate if the computation time exceeds the given number of seconds.

Checks the elapsed time since construction.

PARAMETER DESCRIPTION
seconds

Threshold: The computation is terminated if the elapsed time between object construction and a _check exceeds this value. If None, no _check is performed, effectively creating a (never) stopping criterion that always returns Pending.

TYPE: float | None

modify_result

If True the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(self, seconds: float | None, modify_result: bool = True):
    super().__init__(modify_result=modify_result)
    self.max_seconds = seconds or np.inf
    if self.max_seconds <= 0:
        raise ValueError("Number of seconds for MaxTime must be positive or None")
    self.start = time()

count property

count: int

The number of times that the criterion has been checked.

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for each data point.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

RollingMemory

RollingMemory(
    n_steps: int,
    default: Union[DT, int, float],
    *,
    dtype: Type[DT] | None = None,
)

Bases: Generic[DT]

A simple rolling memory for the last n_steps values of each index.

Updating the memory results in the new values being copied to the last column of the matrix, while the oldest ones are dropped from the first.
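
A rough sketch of the rolling behaviour, inspecting the internal _data buffer that appears in the source below (an implementation detail, shown here only for illustration):

import numpy as np

from pydvl.valuation.stopping import RollingMemory

# Remember the last 3 updates for each of 2 indices
memory = RollingMemory(n_steps=3, default=np.inf, dtype=np.float64)
memory.update(np.array([1.0, 2.0]))  # _data == [[inf, inf, 1], [inf, inf, 2]]
memory.update(np.array([3.0, 4.0]))  # _data == [[inf, 1, 3], [inf, 2, 4]]
memory.update(np.array([5.0, 6.0]))  # _data == [[1, 3, 5], [2, 4, 6]]
memory.update(np.array([7.0, 8.0]))  # oldest column dropped: [[3, 5, 7], [4, 6, 8]]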

PARAMETER DESCRIPTION
n_steps

The number of steps to remember.

TYPE: int

default

The default value to use when the memory is empty.

TYPE: Union[DT, int, float]

Source code in src/pydvl/valuation/stopping.py
def __init__(
    self,
    n_steps: int,
    default: Union[DT, int, float],
    *,
    dtype: Type[DT] | None = None,
):
    if not isinstance(default, np.generic):  # convert to np scalar
        default = cast(DT, np.array(default, dtype=np.result_type(default))[()])
    if dtype is not None:  # user forced conversion
        default = dtype(default)
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    self.n_steps = n_steps
    self._count = 0
    self._default = cast(DT, default)
    self._data: NDArray[DT] = np.full(0, default, dtype=type(default))

update

update(values: NDArray[DT]) -> Self

Update the memory with the values of the current result.

The values are appended to the memory as its last column, and the oldest values (the first column) are removed.

Source code in src/pydvl/valuation/stopping.py
def update(self, values: NDArray[DT]) -> Self:
    """Update the memory with the values of the current result.

    The values are appended to the memory as its last column, and the oldest values
    (the first column) are removed
    """
    if len(self._data) == 0:
        self._data = np.full(
            (len(values), self.n_steps),
            self._default,
            dtype=type(self._default),
        )
    self._data = np.concatenate([self._data[:, 1:], values.reshape(-1, 1)], axis=1)
    self._count += 1
    return self

HistoryDeviation

HistoryDeviation(
    n_steps: int,
    rtol: float,
    pin_converged: bool = True,
    modify_result: bool = True,
)

Bases: StoppingCriterion

A simple check for relative distance to a previous step in the computation.

The method used by Ghorbani and Zou (2019)1 computes the relative distances between the current values \(v_i^t\) and the values at the previous checkpoint \(v_i^{t-\tau}\). If the sum is below a given threshold, the computation is terminated.

\[\sum_{i=1}^n \frac{\left| v_i^t - v_i^{t-\tau} \right|}{v_i^t} < \epsilon.\]

When the denominator is zero, the summand is set to the value of \(v_i^{t-\tau}\).

This implementation is slightly generalised to allow for different numbers of updates to individual indices, as happens with powerset samplers instead of permutations. Every subset of indices that is found to converge can be pinned to that state. Once all indices have converged, the criterion reports convergence.
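
To make the quantity being checked concrete, here is a small numpy sketch of the formula above (not the library's internal code; values_now and values_then stand for \(v^t\) and \(v^{t-\tau}\), and the numbers are made up):

import numpy as np

values_now = np.array([0.10, 0.25, 0.05])   # v^t
values_then = np.array([0.11, 0.24, 0.04])  # v^{t-tau}, n_steps checks ago
rtol = 0.1

quotients = np.abs(values_now - values_then) / values_now
# When the denominator is zero, the summand falls back to the previous value
quotients = np.where(values_now == 0, values_then, quotients)

converged = quotients.sum() < rtol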

Warning

This criterion is meant for the reproduction of the results in the paper, but we do not recommend using it in practice.

PARAMETER DESCRIPTION
n_steps

Compare values after so many steps. A step is one evaluation of the criterion, which happens once per batch.

TYPE: int

rtol

Relative tolerance for convergence (\(\epsilon\) in the formula).

TYPE: float

pin_converged

If True, once an index has converged it stays marked as converged in subsequent checks.

TYPE: bool DEFAULT: True

Source code in src/pydvl/valuation/stopping.py
def __init__(
    self,
    n_steps: int,
    rtol: float,
    pin_converged: bool = True,
    modify_result: bool = True,
):
    super().__init__(modify_result=modify_result)
    if rtol <= 0 or rtol >= 1:
        raise ValueError("rtol must be in (0, 1)")

    self.memory = RollingMemory(n_steps + 1, default=np.inf, dtype=np.float64)
    self.rtol = rtol
    self.update_op = np.logical_or if pin_converged else np.logical_and

count property

count: int

The number of times that the criterion has been checked.

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for each data point.

completion

completion() -> float

Returns a value between 0 and 1 indicating the completion of the computation.

Source code in src/pydvl/valuation/stopping.py
def completion(self) -> float:
    """Returns a value between 0 and 1 indicating the completion of the
    computation."""
    if self.converged.size == 0:
        return 0.0
    return float(np.mean(self.converged).item())

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status

RankCorrelation

RankCorrelation(
    rtol: float, burn_in: int, fraction: float = 1.0, modify_result: bool = True
)

Bases: StoppingCriterion

A check for stability of Spearman correlation between checks.

Convergence is reached when the change in rank correlation between two successive iterations is below a given threshold.

This criterion is used in (Wang et al., 2023)2.

The meaning of successive iterations

Stopping criteria in pyDVL are typically evaluated after each batch of value updates is received. This can imply very different things, depending on the configuration of the samplers. For this reason, RankCorrelation itself keeps track of the number of updates that each index has seen, and only checks for correlation changes once a given fraction of all indices has been updated more than burn_in times and at least once since the last check.
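
As an illustration of the quantity involved (a sketch of the idea using scipy, not the class's internal code; the arrays are made-up value snapshots):

import numpy as np
from scipy.stats import spearmanr

previous_values = np.array([0.10, 0.25, -0.05, 0.40])
current_values = np.array([0.12, 0.24, -0.06, 0.41])

# Rank correlation between two successive snapshots of the values
corr, _ = spearmanr(previous_values, current_values)

# The criterion tracks how this correlation changes from one check to the
# next and converges when the change drops below rtol, e.g.:
# converged = abs(corr - previous_corr) < rtol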

PARAMETER DESCRIPTION
rtol

Relative tolerance for convergence (\(\epsilon\) in the formula)

TYPE: float

burn_in

The minimum number of updates an index must have seen before checking for convergence. This is required because the first correlation checks are usually meaningless.

TYPE: int

fraction

The fraction of values that must have been updated between two correlation checks. This is to avoid comparing two results where only one value has been updated, which would have almost perfect rank correlation.

TYPE: float DEFAULT: 1.0

modify_result

If True, the status of the input ValuationResult is modified in place after the call.

TYPE: bool DEFAULT: True

Added in 0.9.0

Changed in 0.10.0

The behaviour of the burn_in parameter was changed to look at value updates. The parameter fraction was added.

Source code in src/pydvl/valuation/stopping.py
def __init__(
    self,
    rtol: float,
    burn_in: int,
    fraction: float = 1.0,
    modify_result: bool = True,
):
    super().__init__(modify_result=modify_result)
    if rtol < 0 or rtol > 1:
        raise ValueError("rtol must be in [0, 1]")
    if fraction <= 0 or fraction > 1:
        raise ValueError("fraction must be in (0, 1]")

    self.rtol = rtol
    self.burn_in = burn_in
    self.fraction = fraction
    self.memory = RollingMemory(n_steps=2, default=np.nan, dtype=np.float64)
    self.count_memory = RollingMemory(n_steps=2, default=0, dtype=np.int_)
    self._corr = np.nan
    self._completion = 0.0

count property

count: int

The number of times that the criterion has been checked.

converged property

converged: NDArray[bool_]

Returns a boolean array indicating whether the values have converged for each data point.

Inheriting classes must set the _converged attribute in their check().

RETURNS DESCRIPTION
NDArray[bool_]

A boolean array indicating whether the values have converged for each data point.

__call__

__call__(result: ValuationResult) -> Status

Calls check(), maybe updating the result.

Source code in src/pydvl/valuation/stopping.py
def __call__(self, result: ValuationResult) -> Status:
    """Calls `check()`, maybe updating the result."""
    if len(result) == 0:
        logger.warning(
            "At least one iteration finished but no results where generated. "
            "Please check that your scorer and utility return valid numbers."
        )
    self._count += 1
    if self._converged.size == 0:
        self._converged = np.full_like(result.indices, False, dtype=bool)
    status = self._check(result)
    if self.modify_result:  # FIXME: this is not nice
        result._status = status
    return status