pydvl.valuation.samplers.permutation
Permutation-based samplers.
TODO: explain the formulation and the different samplers.
References
1. Mitchell, Rory, Joshua Cooper, Eibe Frank, and Geoffrey Holmes. Sampling Permutations for Shapley Value Estimation. Journal of Machine Learning Research 23, no. 43 (2022): 1–46.
2. Watson, Lauren, Zeno Kujawa, Rayna Andreeva, Hao-Tsung Yang, Tariq Elahi, and Rik Sarkar. Accelerated Shapley Value Approximation for Data Evaluation. arXiv, 9 November 2023.
PermutationSamplerBase
PermutationSamplerBase(
*args,
truncation: TruncationPolicy | None = None,
batch_size: int = 1,
**kwargs,
)
Bases: IndexSampler
Base class for permutation samplers.
skip_indices
property, writable
Indices being skipped in the sampler. The exact behaviour is sampler-dependent, so setting this property is disabled by default.
interrupt
__len__
__len__() -> int
Returns the length of the current sample generation in generate_batches.
RAISES | DESCRIPTION |
---|---|
`TypeError` | If the sampler is infinite or generate_batches has not been called yet. |
generate_batches
Batches the samples and yields them.
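As an illustration of what batching a sample stream can look like, here is a minimal sketch in plain Python. The helper name `batch_samples` is hypothetical and is not part of pyDVL; the library's own batching may differ in detail.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batch_samples(samples: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Hypothetical helper: group a (possibly infinite) stream into lists."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(samples)
    while True:
        # Take up to batch_size items; the last batch may be shorter.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


batches = list(batch_samples(range(5), batch_size=2))
print(batches)  # [[0, 1], [2, 3], [4]]
```

Because the helper pulls items lazily, it also works with generators that never terminate, as long as the consumer stops iterating.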
sample_limit
abstractmethod
sample_limit(indices: IndexSetT) -> int | None
Number of samples that can be generated from the indices.
PARAMETER | DESCRIPTION |
---|---|
indices | The indices used in the sampler. TYPE: IndexSetT |
RETURNS | DESCRIPTION |
---|---|
`int \| None` | The maximum number of samples that will be generated, or `None` if there is no limit. |
generate
abstractmethod
Generates single samples.
IndexSampler.generate_batches() will batch these samples according to the batch size set upon construction.
PARAMETER | DESCRIPTION |
---|---|
indices | |
YIELDS | DESCRIPTION |
---|---|
SampleGenerator | A tuple (idx, subset) for each sample. |
result_updater
result_updater(result: ValuationResult) -> ResultUpdater[ValueUpdateT]
Returns a callable that updates a valuation result with a value update.
Because we use log-space computation for numerical stability, the default result updater keeps track of several quantities required to maintain accurate running 1st and 2nd moments.
PARAMETER | DESCRIPTION |
---|---|
result | The result to update. TYPE: ValuationResult |
Returns: A callable object that updates the result with a value update
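To make the idea of running first and second moments concrete, the sketch below uses Welford's algorithm to update a mean and variance one value at a time. It works in linear space rather than log-space, and the class `RunningMoments` with its attributes is a hypothetical illustration, not the updater that pyDVL implements.

```python
import math


class RunningMoments:
    """Hypothetical accumulator for a streaming mean and sample variance."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        # Welford's update: numerically stabler than summing squares directly.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        # Unbiased sample variance; defined as 0.0 for fewer than two values.
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


moments = RunningMoments()
for value_update in [1.0, 2.0, 4.0]:
    moments.update(value_update)

print(moments.mean, moments.variance)
```

Keeping only `count`, `mean` and the sum of squared deviations means that no past updates need to be stored.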
PermutationSampler
PermutationSampler(
truncation: TruncationPolicy | None = None,
seed: Seed | None = None,
batch_size: int = 1,
)
Bases: StochasticSamplerMixin, PermutationSamplerBase
Samples permutations of indices.
Batching
This sampler supports batching, but using it is not recommended: PermutationEvaluationStrategy processes each permutation in one go, which already batches the computation of up to n-1 marginal utilities in one process.
PARAMETER | DESCRIPTION |
---|---|
truncation | A policy to stop the permutation early. TYPE: `TruncationPolicy \| None` |
seed | Seed for the random number generator. TYPE: `Seed \| None` |
generate
Generates the permutation samples.
PARAMETER | DESCRIPTION |
---|---|
indices | The indices to sample from. If empty, no samples are generated. If skip_indices is set, these indices are removed from the set before generating the permutation. |
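The behaviour described for `indices` can be sketched with NumPy as follows. This is an illustrative stand-in, not pyDVL's implementation: the function `random_permutations` and its parameter names are hypothetical, and it yields bare arrays rather than the library's sample objects.

```python
from typing import Iterator, Optional, Sequence

import numpy as np


def random_permutations(
    indices: Sequence[int],
    skip_indices: Sequence[int] = (),
    seed: Optional[int] = None,
) -> Iterator[np.ndarray]:
    """Hypothetical generator of endless random permutations of the indices."""
    rng = np.random.default_rng(seed)
    # Remove skipped indices before permuting (setdiff1d also sorts and dedups).
    candidates = np.setdiff1d(
        np.asarray(indices, dtype=int), np.asarray(skip_indices, dtype=int)
    )
    if candidates.size == 0:
        return  # nothing to permute, so no samples are generated
    while True:
        yield rng.permutation(candidates)


sampler = random_permutations(range(5), skip_indices=[3], seed=42)
first = next(sampler)
print(sorted(first.tolist()))  # [0, 1, 2, 4]
```

The generator is infinite by design, so a caller would typically stop it with a sample budget or a convergence criterion.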
AntitheticPermutationSampler
AntitheticPermutationSampler(
truncation: TruncationPolicy | None = None,
seed: Seed | None = None,
batch_size: int = 1,
)
Bases: PermutationSampler
Samples permutations like PermutationSampler, but after each permutation, it returns the same permutation in reverse order.
This sampler was suggested by Mitchell et al. (2022) [1].
New in version 0.7.1
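A minimal sketch of the antithetic scheme, assuming plain NumPy arrays stand in for the library's samples: each random permutation is followed by its reverse. The function name `antithetic_permutations` is hypothetical and not part of pyDVL.

```python
from itertools import islice
from typing import Iterator, Optional, Sequence

import numpy as np


def antithetic_permutations(
    indices: Sequence[int], seed: Optional[int] = None
) -> Iterator[np.ndarray]:
    """Hypothetical generator: a random permutation, then the same one reversed."""
    rng = np.random.default_rng(seed)
    candidates = np.asarray(indices, dtype=int)
    if candidates.size == 0:
        return
    while True:
        permutation = rng.permutation(candidates)
        yield permutation
        yield permutation[::-1]  # the antithetic counterpart


first, second = islice(antithetic_permutations(range(4), seed=0), 2)
print(first.tolist(), second.tolist())
```

The intuition usually given for this pairing is that an index appearing early in one permutation appears late in its reverse, so the two marginal contributions tend to be negatively correlated, which can reduce the variance of the averaged estimate.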
DeterministicPermutationSampler
DeterministicPermutationSampler(
*args,
truncation: TruncationPolicy | None = None,
batch_size: int = 1,
**kwargs,
)
Bases: PermutationSamplerBase
Samples all n! permutations of the indices deterministically, and iterates through them, returning sets as required for the permutation-based definition of semi-values.
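The enumeration can be illustrated with the standard library. The sketch below lists all n! permutations and, for each one, the set of indices preceding every position, which is the kind of set the permutation-based definition works with. Both function names are hypothetical and the code is not taken from pyDVL.

```python
import math
from itertools import permutations
from typing import FrozenSet, Iterator, Sequence, Tuple


def all_permutations(indices: Sequence[int]) -> Iterator[Tuple[int, ...]]:
    """Hypothetical helper: yield every permutation in lexicographic order."""
    yield from permutations(indices)


def predecessor_sets(
    permutation: Sequence[int],
) -> Iterator[Tuple[int, FrozenSet[int]]]:
    """Yield (index, set of indices that precede it in the permutation)."""
    for position, idx in enumerate(permutation):
        yield idx, frozenset(permutation[:position])


perms = list(all_permutations([0, 1, 2]))
print(len(perms))  # 6
pairs = list(predecessor_sets((2, 0, 1)))
print(pairs)
```

Since the number of permutations grows as n!, exhaustive enumeration is only practical for very small index sets.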
PermutationEvaluationStrategy
PermutationEvaluationStrategy(
sampler: PermutationSamplerBase,
utility: UtilityBase,
coefficient: Callable[[int, int], float] | None = None,
)
Bases: EvaluationStrategy[PermutationSamplerBase, ValueUpdate]
Computes marginal values for permutation sampling schemes in log-space.
This strategy iterates over each permutation from left to right. At each step it computes the marginal utility with respect to the subset evaluated in the previous step, which saves computation.
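The left-to-right walk can be sketched as below. This is a simplified illustration under stated assumptions: it works in linear space rather than log-space, ignores the `coefficient` argument, and assumes that once a truncation rule fires the remaining marginals are taken to be zero, as in truncated Monte Carlo schemes. The names `permutation_marginals`, `should_truncate` and `toy_utility` are hypothetical and not part of pyDVL.

```python
from typing import Callable, Dict, FrozenSet, Optional, Sequence


def permutation_marginals(
    permutation: Sequence[int],
    utility: Callable[[FrozenSet[int]], float],
    should_truncate: Optional[Callable[[int, float], bool]] = None,
) -> Dict[int, float]:
    """Hypothetical helper: marginal utility of each index along one permutation."""
    marginals: Dict[int, float] = {}
    prefix: set = set()
    previous = utility(frozenset(prefix))  # utility of the empty set
    for position, idx in enumerate(permutation):
        if should_truncate is not None and should_truncate(position, previous):
            # Assumption: after truncation the remaining marginals are zero.
            for remaining in permutation[position:]:
                marginals[remaining] = 0.0
            break
        prefix.add(idx)
        current = utility(frozenset(prefix))
        marginals[idx] = current - previous
        previous = current  # reused as the baseline for the next step
    return marginals


def toy_utility(subset: FrozenSet[int]) -> float:
    return float(len(subset) ** 2)


full = permutation_marginals((2, 0, 1), toy_utility)
truncated = permutation_marginals(
    (2, 0, 1), toy_utility, should_truncate=lambda position, _: position >= 2
)
print(full)       # {2: 1.0, 0: 3.0, 1: 5.0}
print(truncated)  # {2: 1.0, 0: 3.0, 1: 0.0}
```

Because each utility value serves as the baseline for the next step, a permutation of n indices needs n + 1 utility evaluations instead of the 2n required when every marginal is computed independently.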