pydvl.valuation.methods.least_core
LeastCoreValuation
LeastCoreValuation(
utility: UtilityBase,
sampler: PowersetSampler,
n_samples: int | None = None,
non_negative_subsidy: bool = False,
solver_options: dict | None = None,
progress: bool = True,
)
Bases: Valuation
Umbrella class to calculate least-core values with multiple sampling methods.
See Data valuation for an overview.
Different samplers correspond to different least-core methods from the literature. For those, we provide convenience subclasses of LeastCoreValuation: ExactLeastCoreValuation and MonteCarloLeastCoreValuation.
Other samplers allow you to create your own method and might yield computational gains over a standard Monte Carlo method.
PARAMETER | DESCRIPTION |
---|---|
utility | Utility object with model, data and scoring function. TYPE: `UtilityBase` |
sampler | The sampler to use for the valuation. TYPE: `PowersetSampler` |
n_samples | The number of samples to use for the valuation. If None, it will be set to the sample limit of the chosen sampler (for finite samplers) or … TYPE: `int \| None` |
non_negative_subsidy | If True, the least core subsidy \(e\) is constrained to be non-negative. TYPE: `bool` |
solver_options | Optional dictionary containing a CVXPY solver and options to configure it. For valid values of the "solver" key and for additional solver options, see the CVXPY documentation. TYPE: `dict \| None` |
progress | Whether to show a progress bar during the construction of the least-core problem. TYPE: `bool` |
Source code in src/pydvl/valuation/methods/least_core.py
values
values(sort: bool = False) -> ValuationResult
Returns a copy of the valuation result.
The valuation must have been run with fit() before calling this method.
PARAMETER | DESCRIPTION |
---|---|
sort | Whether to sort the valuation result before returning it. TYPE: `bool` |
Returns: The result of the valuation.
Source code in src/pydvl/valuation/base.py
fit
Calculate the least core valuation on a dataset.
This method has to be called before calling values().
Calculating the least core valuation is computationally expensive, but it can be parallelized. To do so, call fit() inside a joblib.parallel_config context manager.
Source code in src/pydvl/valuation/methods/least_core.py
ExactLeastCoreValuation
ExactLeastCoreValuation(
utility: UtilityBase,
non_negative_subsidy: bool = False,
solver_options: dict | None = None,
progress: bool = True,
batch_size: int = 1,
)
Bases: LeastCoreValuation
Class to calculate exact least-core values.
Equivalent to calling LeastCoreValuation with a DeterministicUniformSampler and n_samples=None.
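The definition of the exact least-core valuation is:
\[
\begin{array}{lll}
\text{minimize} & e & \\
\text{subject to} & \displaystyle\sum_{i \in N} x_{i} = v(N) & \\
& \displaystyle\sum_{i \in S} x_{i} + e \geq v(S) & \forall S \subseteq N
\end{array}
\]
where \(N = \{1, 2, \dots, n\}\) are the training set's indices, \(x_i\) is the value assigned to index \(i\), \(v\) is the utility and \(e\) is the least-core subsidy.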
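To make the constraint \(\sum_{i \in S} x_i + e \geq v(S)\) for every \(S \subseteq N\) concrete, the sketch below enumerates all subsets of a three-point toy game and computes the smallest subsidy \(e\) that a given allocation needs. It does not solve the optimization problem and does not use pyDVL; `toy_utility`, `required_subsidy` and both allocations are illustrative assumptions.

```python
from itertools import chain, combinations


def powerset(indices):
    """Yield every subset of `indices` as a tuple, including the empty set."""
    items = list(indices)
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )


def toy_utility(subset):
    # Hypothetical utility: points 0 and 1 are only useful together.
    return 1.0 if {0, 1} <= set(subset) else 0.0


def required_subsidy(values, utility):
    """Smallest e with sum_{i in S} x_i + e >= v(S) for every subset S."""
    n = len(values)
    return max(
        utility(s) - sum(values[i] for i in s) for s in powerset(range(n))
    )


# Both allocations satisfy the efficiency constraint sum(x) == v(N) == 1.
even_split = [1 / 3, 1 / 3, 1 / 3]
pair_split = [0.5, 0.5, 0.0]

print(required_subsidy(even_split, toy_utility))
print(required_subsidy(pair_split, toy_utility))
```

The even split underpays the coalition {0, 1} and therefore needs a positive subsidy, whereas the second allocation satisfies every constraint with \(e = 0\). The least core is the allocation for which this subsidy is minimal.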
PARAMETER | DESCRIPTION |
---|---|
utility | Utility object with model, data and scoring function. TYPE: `UtilityBase` |
non_negative_subsidy | If True, the least core subsidy \(e\) is constrained to be non-negative. TYPE: `bool` |
solver_options | Optional dictionary containing a CVXPY solver and options to configure it. For valid values of the "solver" key and for additional solver options, see the CVXPY documentation. TYPE: `dict \| None` |
progress | Whether to show a progress bar during the construction of the least-core problem. TYPE: `bool` |
Source code in src/pydvl/valuation/methods/least_core.py
fit
Calculate the least core valuation on a dataset.
This method has to be called before calling values().
Calculating the least core valuation is computationally expensive, but it can be parallelized. To do so, call fit() inside a joblib.parallel_config context manager.
Source code in src/pydvl/valuation/methods/least_core.py
values
values(sort: bool = False) -> ValuationResult
Returns a copy of the valuation result.
The valuation must have been run with fit() before calling this method.
PARAMETER | DESCRIPTION |
---|---|
sort | Whether to sort the valuation result before returning it. TYPE: `bool` |
Returns: The result of the valuation.
Source code in src/pydvl/valuation/base.py
MonteCarloLeastCoreValuation
MonteCarloLeastCoreValuation(
utility: UtilityBase,
n_samples: int,
non_negative_subsidy: bool = False,
solver_options: dict | None = None,
progress: bool = True,
seed: Seed | None = None,
batch_size: int = 1,
)
Bases: LeastCoreValuation
Class to calculate Monte Carlo least-core values.
Equivalent to calling LeastCoreValuation with a UniformSampler.
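The definition of the Monte Carlo least-core valuation is:
\[
\begin{array}{lll}
\text{minimize} & e & \\
\text{subject to} & \displaystyle\sum_{i \in N} x_{i} = v(N) & \\
& \displaystyle\sum_{i \in S} x_{i} + e \geq v(S) & \forall S \in \{S_1, S_2, \dots, S_m\}, \quad S_1, \dots, S_m \overset{\mathrm{iid}}{\sim} U(2^N)
\end{array}
\]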
Where:
- \(U(2^N)\) is the uniform distribution over the powerset of \(N\).
- \(m\) is the number of subsets that will be sampled and whose utility will be computed and used to compute the data values.
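The sampling step can be sketched in plain Python. The block below draws \(m\) subsets from \(U(2^N)\) by including each index independently with probability 1/2, then compares the subsidy required on the sampled constraints with the one required on all \(2^n\) constraints. It is an illustration only, not pyDVL's implementation; `toy_utility`, the equal-split allocation and the seed are assumptions.

```python
import random
from itertools import chain, combinations


def toy_utility(subset):
    # Hypothetical utility with diminishing returns in the subset size.
    return len(subset) ** 0.5


def sample_subset(n, rng):
    """Draw S ~ U(2^N): each index is kept independently with probability 1/2."""
    return tuple(i for i in range(n) if rng.random() < 0.5)


def subsidy_over(subsets, values, utility):
    """Smallest e with sum_{i in S} x_i + e >= v(S) on the given subsets."""
    return max(utility(s) - sum(values[i] for i in s) for s in subsets)


n, m = 8, 20
rng = random.Random(0)
values = [toy_utility(range(n)) / n] * n  # equal split, so sum(values) == v(N)

sampled = [sample_subset(n, rng) for _ in range(m)]
all_subsets = list(
    chain.from_iterable(combinations(range(n), r) for r in range(n + 1))
)

sampled_e = subsidy_over(sampled, values, toy_utility)
exact_e = subsidy_over(all_subsets, values, toy_utility)
print(sampled_e, exact_e)
```

Because the sampled constraints are a subset of the exact ones, the sampled subsidy can never exceed the exact one. This is why the Monte Carlo variant is a relaxation whose quality depends on \(m\).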
PARAMETER | DESCRIPTION |
---|---|
utility | Utility object with model, data and scoring function. TYPE: `UtilityBase` |
n_samples | The number of samples to use for the valuation. If None, it will be set to … TYPE: `int` |
non_negative_subsidy | If True, the least core subsidy \(e\) is constrained to be non-negative. TYPE: `bool` |
solver_options | Optional dictionary containing a CVXPY solver and options to configure it. For valid values of the "solver" key and for additional solver options, see the CVXPY documentation. TYPE: `dict \| None` |
progress | Whether to show a progress bar during the construction of the least-core problem. TYPE: `bool` |
Source code in src/pydvl/valuation/methods/least_core.py
fit
Calculate the least core valuation on a dataset.
This method has to be called before calling values().
Calculating the least core valuation is computationally expensive, but it can be parallelized. To do so, call fit() inside a joblib.parallel_config context manager.
Source code in src/pydvl/valuation/methods/least_core.py
values
values(sort: bool = False) -> ValuationResult
Returns a copy of the valuation result.
The valuation must have been run with fit() before calling this method.
PARAMETER | DESCRIPTION |
---|---|
sort | Whether to sort the valuation result before returning it. TYPE: `bool` |
Returns: The result of the valuation.
Source code in src/pydvl/valuation/base.py
create_least_core_problem
create_least_core_problem(
u: UtilityBase, sampler: PowersetSampler, n_samples: int, progress: bool
) -> LeastCoreProblem
Create a Least Core problem from a utility and a sampler.
PARAMETER | DESCRIPTION |
---|---|
u | Utility object with model, data and scoring function. TYPE: `UtilityBase` |
sampler | The sampler to use for the valuation. TYPE: `PowersetSampler` |
n_samples | The maximum number of samples to use for the valuation. TYPE: `int` |
progress | Whether to show a progress bar during the construction of the least-core problem. TYPE: `bool` |
RETURNS | DESCRIPTION |
---|---|
LeastCoreProblem | The least core problem to solve. |
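As a rough illustration of what "creating a least-core problem" can mean, the sketch below turns a list of subsets into the data of the inequality constraints: one 0/1 membership row per subset and the matching utility value. `ToyLeastCoreProblem`, its field names and `build_problem` are hypothetical and are not claimed to match the fields of pyDVL's LeastCoreProblem.

```python
from typing import Callable, NamedTuple, Sequence


class ToyLeastCoreProblem(NamedTuple):
    """Hypothetical container; field names are illustrative only."""

    membership: list  # one row of 0/1 indicators per subset
    utilities: list   # v(S) for the subset of each row


def build_problem(
    n: int,
    subsets: Sequence[Sequence[int]],
    utility: Callable[[Sequence[int]], float],
) -> ToyLeastCoreProblem:
    # Row k encodes the left-hand side of sum_{i in S_k} x_i + e >= v(S_k).
    rows = [[1 if i in subset else 0 for i in range(n)] for subset in subsets]
    return ToyLeastCoreProblem(rows, [utility(s) for s in subsets])


problem = build_problem(
    n=3,
    subsets=[(0,), (1, 2)],
    utility=lambda s: float(len(s)),  # toy utility: subset size
)
print(problem.membership)
print(problem.utilities)
```

A linear-programming solver can then consume these rows together with the efficiency constraint \(\sum_{i \in N} x_i = v(N)\).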