pydvl.valuation.methods.beta_shapley
This module implements Beta-Shapley valuation as introduced in Kwon and Zou (2022) [1].
Background on semi-values
Beta-Shapley is a special case of the semi-value valuation method. You can read a short introduction in the documentation.
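Beta(\(\alpha\), \(\beta\))-Shapley is a semi-value whose coefficients are given by the Beta function. For a dataset of size \(n\), and with \(k\) indexed like \(j\) in eq. (5) of Kwon and Zou (2022), the coefficients are defined as:

\[
w_{\alpha, \beta}(k) := \int_0^1 t^{k-1} (1-t)^{n-k} \frac{t^{\beta-1} (1-t)^{\alpha-1}}{\text{Beta}(\alpha, \beta)} \, \mathrm{d}t = \frac{\text{Beta}(k + \beta - 1, n - k + \alpha)}{\text{Beta}(\alpha, \beta)}.
\]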
Note that this deviates by a factor \(n\) from eq. (5) in Kwon and Zou (2022) [1] because of how we define sampler weights, but the effective coefficient remains the same when using any PowersetSampler or PermutationSampler.
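As an illustration of the coefficient discussed above, here is a minimal sketch in plain Python. It assumes the coefficient is \(\text{Beta}(k + \beta - 1, n - k + \alpha) / \text{Beta}(\alpha, \beta)\), that is, eq. (5) of Kwon and Zou (2022) without the factor \(n\), with \(k\) running from 1 to \(n\). The helper names `log_beta` and `beta_shapley_log_coefficient` are hypothetical and are not part of pyDVL.

```python
import math


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function, computed via log-gamma for stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_shapley_log_coefficient(n: int, k: int, alpha: float, beta: float) -> float:
    """Return log Beta(k + beta - 1, n - k + alpha) - log Beta(alpha, beta).

    Assumption: k is in 1..n and is indexed like j in eq. (5) of
    Kwon and Zou (2022); the factor n of that equation is omitted.
    """
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    return log_beta(k + beta - 1, n - k + alpha) - log_beta(alpha, beta)


n = 5

# With alpha = beta = 1 the coefficient reduces to (k-1)! (n-k)! / n!,
# the per-subset weight of the Shapley value.
shapley_like = [
    math.exp(beta_shapley_log_coefficient(n, k, 1.0, 1.0)) for k in range(1, n + 1)
]

# Multiplied by the number of subsets of each size, the coefficients sum to 1
# for any alpha, beta > 0 (here: alpha = 4, beta = 1).
total = sum(
    math.comb(n - 1, k - 1) * math.exp(beta_shapley_log_coefficient(n, k, 4.0, 1.0))
    for k in range(1, n + 1)
)
```

Working in log-space avoids underflow of the Beta function for large \(n\), which is presumably why the class exposes a `log_coefficient` rather than the coefficient itself.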
Connection to AME
Beta-Shapley can be seen as a special case of AME, introduced in Lin et al. (2022) [2].
Todo
Explain sampler choices for AME and how to estimate Beta-Shapley with lasso.
References
1. Kwon, Yongchan, and James Zou. Beta Shapley: A Unified and Noise-Reduced Data Valuation Framework for Machine Learning. In Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, 8780–8802. PMLR, 2022.
2. Lin, Jinkun, Anqi Zhang, Mathias Lécuyer, Jinyang Li, Aurojit Panda, and Siddhartha Sen. Measuring the Effect of Training Data on Deep Learning Predictions via Randomized Experiments. In Proceedings of the 39th International Conference on Machine Learning, 13468–504. PMLR, 2022.
BetaShapleyValuation
BetaShapleyValuation(
    utility: UtilityBase,
    sampler: IndexSampler,
    is_done: StoppingCriterion,
    alpha: float,
    beta: float,
    skip_converged: bool = False,
    show_warnings: bool = True,
    progress: bool = False,
)
Bases: SemivalueValuation
Computes Beta-Shapley values.
PARAMETER | DESCRIPTION |
---|---|
utility | Object to compute utilities. TYPE: UtilityBase |
sampler | Sampling scheme to use. TYPE: IndexSampler |
is_done | Stopping criterion to use. TYPE: StoppingCriterion |
alpha | The alpha parameter of the Beta distribution. TYPE: float |
beta | The beta parameter of the Beta distribution. TYPE: float |
skip_converged | Whether to skip converged indices. Convergence is determined by the stopping criterion. TYPE: bool DEFAULT: False |
show_warnings | Whether to show any runtime warnings. TYPE: bool DEFAULT: True |
progress | Whether to show a progress bar. If a dictionary, it is passed to … TYPE: bool DEFAULT: False |
Source code in src/pydvl/valuation/methods/beta_shapley.py
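To make concrete what the class estimates, the following is a minimal sketch and not pyDVL's implementation. It enumerates every subset of a tiny index set and weights each marginal contribution by \(\text{Beta}(k + \beta - 1, n - k + \alpha) / \text{Beta}(\alpha, \beta)\), assuming \(k = |S| + 1\) as in eq. (5) of Kwon and Zou (2022) without the factor \(n\). The function names and both toy utilities are hypothetical; in practice the sum is approximated with a sampler and a stopping criterion, since exact enumeration is only feasible for very small \(n\).

```python
import math
from itertools import combinations
from typing import Callable, FrozenSet, List


def beta_fn(a: float, b: float) -> float:
    """Beta function evaluated through log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def exact_beta_shapley(
    n: int,
    utility: Callable[[FrozenSet[int]], float],
    alpha: float,
    beta: float,
) -> List[float]:
    """Exact Beta(alpha, beta)-Shapley values for indices 0..n-1 (toy sizes only)."""
    const = beta_fn(alpha, beta)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):  # size of the subset S that excludes i
            k = size + 1  # assumed index convention: k = |S| + 1
            weight = beta_fn(k + beta - 1, n - k + alpha) / const
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of i to S, weighted by the coefficient
                values[i] += weight * (utility(s | {i}) - utility(s))
    return values


contributions = [1.0, 2.0, 3.0, 4.0]


def additive_utility(s: FrozenSet[int]) -> float:
    """Each index contributes a fixed amount, independently of the others."""
    return sum(contributions[j] for j in s)


def concave_utility(s: FrozenSet[int]) -> float:
    """Diminishing returns: the square root of the additive utility."""
    return math.sqrt(additive_utility(s))


# For an additive utility every semi-value returns the individual contributions.
additive_values = exact_beta_shapley(4, additive_utility, alpha=4.0, beta=1.0)

# alpha = beta = 1 recovers the Shapley value, whose values sum to u(N) - u(empty).
shapley_values = exact_beta_shapley(4, concave_utility, alpha=1.0, beta=1.0)
```

The additive case is a convenient sanity check because the weights, counted over all subsets, sum to one for any positive \(\alpha\) and \(\beta\).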
log_coefficient
property
log_coefficient: SemivalueCoefficient | None
Beta-Shapley coefficient.
Defined, up to a constant factor n, as in eq. (5) of Kwon and Zou (2022) [1].
fit
fit(data: Dataset, continue_from: ValuationResult | None = None) -> Self
Fits the semi-value valuation to the data.
Access the results through the result property.
PARAMETER | DESCRIPTION |
---|---|
data | Data for which to compute values. TYPE: Dataset |
continue_from | A previously computed valuation result to continue from. TYPE: ValuationResult \| None DEFAULT: None |
Source code in src/pydvl/valuation/methods/semivalue.py
values
values(sort: bool = False) -> ValuationResult
Returns a copy of the valuation result.
The valuation must have been run with fit() before calling this method.
PARAMETER | DESCRIPTION |
---|---|
sort | Whether to sort the valuation result by value before returning it. TYPE: bool DEFAULT: False |
Returns: The result of the valuation.