pydvl.valuation.methods._utility_values_and_sample_masks
compute_utility_values_and_sample_masks
compute_utility_values_and_sample_masks(
    utility: UtilityBase,
    sampler: IndexSampler,
    n_samples: int,
    progress: bool,
    extra_samples: Iterable[SampleT] | None = None,
) -> Tuple[NDArray[float_], NDArray[bool_]]
Calculate utility values and sample masks on samples in parallel.
Computing the utility evaluations and sample masks is the computational bottleneck of several data valuation algorithms, for example least-core and group testing.
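To make the two outputs concrete, here is a minimal NumPy sketch of what "utility values and sample masks" can look like. It does not use pyDVL: `toy_utility` and `values_and_masks` are hypothetical names, and the assumed layout (one mask row per sample, one column per data index) is an illustration rather than the library's implementation.

```python
import numpy as np


def toy_utility(indices: np.ndarray) -> float:
    """Hypothetical utility with diminishing returns in the subset size."""
    return float(np.sqrt(len(indices)))


def values_and_masks(subsets, n_indices):
    """Evaluate toy_utility on each subset and record its membership mask."""
    values = np.zeros(len(subsets), dtype=float)
    masks = np.zeros((len(subsets), n_indices), dtype=bool)
    for row, subset in enumerate(subsets):
        idx = np.asarray(subset, dtype=int)
        masks[row, idx] = True  # mark which data indices belong to this sample
        values[row] = toy_utility(idx)
    return values, masks


# Draw four random subsets of five indices (a stand-in for a real sampler).
rng = np.random.default_rng(0)
n_indices = 5
subsets = [np.flatnonzero(rng.random(n_indices) < 0.5) for _ in range(4)]
values, masks = values_and_masks(subsets, n_indices)
```

In this sketch, `values[i]` is the utility of the subset whose members are flagged in `masks[i]`, which is the pairing that algorithms such as least-core consume.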
PARAMETER | DESCRIPTION |
---|---|
utility | Utility object with model, data and scoring function. TYPE: UtilityBase |
sampler | The sampler to use for the valuation. TYPE: IndexSampler |
n_samples | The number of samples to use from the sampler. TYPE: int |
progress | Whether to show a progress bar. TYPE: bool |
extra_samples | Additional samples to evaluate. For example, this can be used to calculate the total utility of the dataset in parallel with evaluating the utility on the samples. TYPE: Iterable[SampleT] \| None. DEFAULT: None |
RETURNS | DESCRIPTION |
---|---|
Tuple[NDArray[float_], NDArray[bool_]] | A tuple containing the utility values and the sample masks. |
RAISES | DESCRIPTION |
---|---|
ValueError | If the utility object does not have training data. |
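The role of `extra_samples` can be illustrated with a small standard-library sketch: the full index set is submitted as one more sample, so the total utility is computed in the same parallel batch as the sampled subsets. Everything here is hypothetical (`toy_utility`, `evaluate`, `extra_subsets`, the thread pool), and placing the extra samples at the end of the returned arrays is an assumption of this sketch, not documented behavior of the function.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def toy_utility(indices) -> float:
    """Hypothetical utility: square root of the subset size."""
    return float(len(indices)) ** 0.5


def evaluate(subsets, n_indices, extra_subsets=None, max_workers=2):
    """Evaluate sampled and extra subsets together in one parallel batch."""
    # Assumption: extra subsets are appended after the sampled ones.
    all_subsets = list(subsets) + list(extra_subsets or [])
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the order of its inputs.
        values = np.array(list(pool.map(toy_utility, all_subsets)), dtype=float)
    masks = np.zeros((len(all_subsets), n_indices), dtype=bool)
    for row, subset in enumerate(all_subsets):
        masks[row, np.asarray(subset, dtype=int)] = True
    return values, masks


n_indices = 4
sampled = [(0, 1), (2,), (1, 2, 3)]
full_set = tuple(range(n_indices))  # the whole dataset as one extra sample
values, masks = evaluate(sampled, n_indices, extra_subsets=[full_set])
total_utility = values[-1]  # utility of the full dataset, by the assumption above
```

Batching the full set with the other samples avoids a separate sequential utility evaluation, which matters when each evaluation involves fitting a model.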