pydvl.valuation.methods.least_core ¶

This module implements the least-core valuation method.

Least-Core values were introduced by Yan and Procaccia (2021).¹ Please refer to the paper or our documentation for more details and a comparison with other methods (Benmerzoug and de Benito Delgado, 2023).²

References¶

1. Yan, Tom, and Ariel D. Procaccia. If You Like Shapley Then You’ll Love the Core. In Proceedings of the 35th AAAI Conference on Artificial Intelligence, 2021, 6:5751–59. Virtual conference: Association for the Advancement of Artificial Intelligence, 2021.
2. Benmerzoug, Anes, and Miguel de Benito Delgado. [Re] If You like Shapley, Then You’ll Love the Core. ReScience C 9, no. 2 (31 July 2023): #32.

ExactLeastCoreValuation ¶

ExactLeastCoreValuation(
    utility: UtilityBase,
    non_negative_subsidy: bool = False,
    solver_options: dict | None = None,
    progress: bool = True,
    batch_size: int = 1,
)

Bases: LeastCoreValuation

Class to calculate exact least-core values.

Equivalent to constructing a LeastCoreValuation with a DeterministicUniformSampler and n_samples=None.

PARAMETER	DESCRIPTION
`utility`	Utility object with model, data and scoring function. TYPE: `UtilityBase`
`non_negative_subsidy`	If True, the least core subsidy \(e\) is constrained to be non-negative. TYPE: `bool` DEFAULT: `False`
`solver_options`	Optional dictionary containing a CVXPY solver and options to configure it. For valid values of the "solver" key and for additional solver options, see the CVXPY documentation. TYPE: `dict \| None` DEFAULT: `None`
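`progress`	Whether to show a progress bar during the construction of the least-core problem. TYPE: `bool` DEFAULT: `True`
`batch_size`	Batch size passed to the underlying `DeterministicUniformSampler`. TYPE: `int` DEFAULT: `1`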

Source code in src/pydvl/valuation/methods/least_core.py

def __init__(
    self,
    utility: UtilityBase,
    non_negative_subsidy: bool = False,
    solver_options: dict | None = None,
    progress: bool = True,
    batch_size: int = 1,
):
    super().__init__(
        utility=utility,
        sampler=DeterministicUniformSampler(
            index_iteration=FiniteNoIndexIteration, batch_size=batch_size
        ),
        n_samples=None,
        non_negative_subsidy=non_negative_subsidy,
        solver_options=solver_options,
        progress=progress,
    )

result `property` ¶

result: ValuationResult

The current valuation result (not a copy).
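The difference between "not a copy" here and the copy returned by `values()` can be illustrated with a toy stand-in. The class below is hypothetical and is not part of pyDVL; a plain dictionary stands in for a `ValuationResult`.

```python
import copy


class ToyValuation:
    """Hypothetical stand-in for a valuation object (not a pyDVL class)."""

    def __init__(self):
        # A plain dict plays the role of the stored valuation result.
        self._result = {"values": [0.25, 0.5]}

    @property
    def result(self):
        # Returns the stored object itself: callers share state with it.
        return self._result

    def values(self):
        # Returns an independent copy: callers can modify it freely.
        return copy.deepcopy(self._result)


valuation = ToyValuation()
snapshot = valuation.values()
snapshot["values"].append(1.0)  # changes only the copy

same_object = valuation.result is valuation.result
stored_unchanged = valuation.result["values"] == [0.25, 0.5]
```

Under this reading, modifying what `result` returns would change the object held by the valuation, whereas a copy is safe to modify.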

fit ¶

fit(data: Dataset, continue_from: ValuationResult | None = None) -> Self

Calculate the exact least core valuation on a dataset.

This method evaluates the utility on every possible coalition, so the cost grows exponentially with the number of data points. It is therefore feasible only for small datasets (typically fewer than 20 samples).

PARAMETER	DESCRIPTION
`data`	Data for which to compute values TYPE: `Dataset`
`continue_from`	A previously computed valuation result to continue from. TYPE: `ValuationResult \| None` DEFAULT: `None`
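To see why the exact method is limited to small datasets, the following sketch enumerates all coalitions of a tiny index set using only the standard library. The helper `all_coalitions` is a hypothetical name for illustration and is not how pyDVL's samplers are implemented.

```python
from itertools import chain, combinations


def all_coalitions(indices):
    """Return an iterator over every subset of `indices`, smallest first."""
    items = list(indices)
    return chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )


# A dataset with 4 points has 2**4 coalitions, including the empty set
# and the full set.
coalitions = list(all_coalitions(range(4)))
n_coalitions_small = len(coalitions)

# The count doubles with every added point, so 20 points already
# require 2**20 utility evaluations.
n_coalitions_20 = 2**20
```

Since each coalition typically requires retraining the model once, the number of coalitions is a lower bound on the number of model fits.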

Source code in src/pydvl/valuation/methods/least_core.py

def fit(self, data: Dataset, continue_from: ValuationResult | None = None) -> Self:
    """Calculate the exact least core valuation on a dataset.

    This method computes all possible coalitions which makes it only feasible for
    small datasets (typically less than 20 samples).

    Args:
        data: Data for which to compute values
        continue_from: A previously computed valuation result to continue from.
    """
    return super().fit(data, continue_from)

values ¶

values(sort: bool = False) -> ValuationResult

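Returns a copy of the valuation result. Deprecated in 0.10.0 and scheduled for removal in 0.11.0; the `result` property gives access to the current result without copying.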

The valuation must have been run with fit() before calling this method.

PARAMETER	DESCRIPTION
`sort`	Whether to sort the valuation result by value before returning it. TYPE: `bool` DEFAULT: `False`

Returns: The result of the valuation.

Source code in src/pydvl/valuation/base.py

@deprecated(
    target=None,
    deprecated_in="0.10.0",
    remove_in="0.11.0",
)
def values(self, sort: bool = False) -> ValuationResult:
    """Returns a copy of the valuation result.

    The valuation must have been run with `fit()` before calling this method.

    Args:
        sort: Whether to sort the valuation result by value before returning it.
    Returns:
        The result of the valuation.
    """
    if not self.is_fitted:
        raise NotFittedException(type(self))
    assert self._result is not None

    r = self._result.copy()
    if sort:
        r.sort(inplace=True)
    return r

LeastCoreValuation ¶

LeastCoreValuation(
    utility: UtilityBase,
    sampler: PowersetSampler,
    n_samples: int | None = None,
    non_negative_subsidy: bool = False,
    solver_options: dict | None = None,
    progress: bool = True,
)

Bases: Valuation

Umbrella class to calculate least-core values with multiple sampling methods.

See the documentation for an overview.

Different samplers correspond to different least-core methods from the literature. For those, we provide convenience subclasses of LeastCoreValuation. See

ExactLeastCoreValuation
MonteCarloLeastCoreValuation

Other samplers allow you to create your own importance sampling method and might yield computational gains over the standard Monte Carlo method.
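For orientation, the optimization problem behind these classes can be written as follows, based on the formulation of Yan and Procaccia (2021). The symbols are assumptions introduced here for illustration: \(N\) is the set of data indices, \(x_i\) the value assigned to point \(i\), \(u\) the utility, \(e\) the subsidy, and \(\mathcal{S}\) the collection of coalitions produced by the sampler (all subsets of \(N\) in the exact case).

\[
\begin{array}{lll}
\text{minimize} & e & \\
\text{subject to} & \sum_{i \in N} x_i = u(N) & \\
& \sum_{i \in S} x_i + e \geq u(S), & \forall S \in \mathcal{S}
\end{array}
\]

Each sampled coalition contributes one inequality, so the choice of sampler determines which constraints the solver sees.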

PARAMETER	DESCRIPTION
`utility`	Utility object with model, data and scoring function. TYPE: `UtilityBase`
`sampler`	The sampler to use for the valuation. TYPE: `PowersetSampler`
`n_samples`	The number of samples to use for the valuation. If None, it will be set to the sample limit of the chosen sampler (for finite samplers) or `1000 * len(data)` (for infinite samplers). TYPE: `int \| None` DEFAULT: `None`
`non_negative_subsidy`	If True, the least core subsidy \(e\) is constrained to be non-negative. TYPE: `bool` DEFAULT: `False`
`solver_options`	Optional dictionary containing a CVXPY solver and options to configure it. For valid values of the "solver" key and for additional solver options, see the CVXPY documentation. TYPE: `dict \| None` DEFAULT: `None`
`progress`	Whether to show a progress bar during the construction of the least-core problem. TYPE: `bool` DEFAULT: `True`
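The default for `n_samples` described in the table above can be sketched as a small function. `default_n_samples` and its `sample_limit` argument are hypothetical stand-ins; the library resolves the default internally from the sampler and the data indices.

```python
def default_n_samples(sample_limit, n_points):
    """Hypothetical helper mirroring the documented default for `n_samples`.

    sample_limit: number of samples a finite sampler can produce, or None
        for a sampler that can produce samples indefinitely.
    n_points: number of data points, i.e. len(data).
    """
    if sample_limit is not None:
        # Finite sampler: use everything it can produce.
        return sample_limit
    # Infinite sampler: fall back to 1000 samples per data point.
    return 1000 * n_points


finite_default = default_n_samples(sample_limit=1024, n_points=10)
infinite_default = default_n_samples(sample_limit=None, n_points=10)
```

Passing an explicit `n_samples` bypasses this rule entirely.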

Source code in src/pydvl/valuation/methods/least_core.py

def __init__(
    self,
    utility: UtilityBase,
    sampler: PowersetSampler,
    n_samples: int | None = None,
    non_negative_subsidy: bool = False,
    solver_options: dict | None = None,
    progress: bool = True,
):
    super().__init__()

    _check_sampler(sampler)
    self._utility = utility
    self._sampler = sampler
    self._non_negative_subsidy = non_negative_subsidy
    self._solver_options = solver_options
    self._n_samples = n_samples
    self._progress = progress
    self.algorithm_name = f"LeastCore-{str(sampler)}"

result `property` ¶

result: ValuationResult

The current valuation result (not a copy).

fit ¶

fit(data: Dataset, continue_from: ValuationResult | None = None) -> Self

Calculate the least core valuation on a dataset.

This method has to be called before calling values().

Calculating the least core valuation is a computationally expensive task that can be parallelized. To do so, call the fit() method inside a joblib.parallel_config context manager as follows:

from joblib import parallel_config

with parallel_config(n_jobs=4):
    valuation.fit(data)

PARAMETER	DESCRIPTION
`data`	Data for which to compute values TYPE: `Dataset`
`continue_from`	A previously computed valuation result to continue from. TYPE: `ValuationResult \| None` DEFAULT: `None`

Source code in src/pydvl/valuation/methods/least_core.py

def fit(self, data: Dataset, continue_from: ValuationResult | None = None) -> Self:
    """Calculate the least core valuation on a dataset.

    This method has to be called before calling `values()`.

    Calculating the least core valuation is a computationally expensive task that
    can be parallelized. To do so, call the `fit()` method inside a
    `joblib.parallel_config` context manager as follows:

    ```python
    from joblib import parallel_config

    with parallel_config(n_jobs=4):
        valuation.fit(data)
    ```

    Args:
        data: Data for which to compute values
        continue_from: A previously computed valuation result to continue from.
    """
    self._result = self._init_or_check_result(data, continue_from)
    self._utility = self._utility.with_dataset(data)

    if self._n_samples is None:
        self._n_samples = _get_default_n_samples(
            sampler=self._sampler, indices=data.indices
        )

    problem = create_least_core_problem(
        u=self._utility,
        sampler=self._sampler,
        n_samples=self._n_samples,
        progress=self._progress,
    )

    solution = lc_solve_problem(
        problem=problem,
        u=self._utility,
        algorithm=str(self),
        non_negative_subsidy=self._non_negative_subsidy,
        solver_options=self._solver_options,
    )

    self._result += solution
    return self

values ¶

values(sort: bool = False) -> ValuationResult

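Returns a copy of the valuation result. Deprecated in 0.10.0 and scheduled for removal in 0.11.0; the `result` property gives access to the current result without copying.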

The valuation must have been run with fit() before calling this method.

PARAMETER	DESCRIPTION
`sort`	Whether to sort the valuation result by value before returning it. TYPE: `bool` DEFAULT: `False`

Returns: The result of the valuation.

Source code in src/pydvl/valuation/base.py

@deprecated(
    target=None,
    deprecated_in="0.10.0",
    remove_in="0.11.0",
)
def values(self, sort: bool = False) -> ValuationResult:
    """Returns a copy of the valuation result.

    The valuation must have been run with `fit()` before calling this method.

    Args:
        sort: Whether to sort the valuation result by value before returning it.
    Returns:
        The result of the valuation.
    """
    if not self.is_fitted:
        raise NotFittedException(type(self))
    assert self._result is not None

    r = self._result.copy()
    if sort:
        r.sort(inplace=True)
    return r

MonteCarloLeastCoreValuation ¶

MonteCarloLeastCoreValuation(
    utility: UtilityBase,
    n_samples: int,
    non_negative_subsidy: bool = False,
    solver_options: dict | None = None,
    progress: bool = True,
    seed: Seed | None = None,
    batch_size: int = 1,
)

Bases: LeastCoreValuation

Class to calculate Monte Carlo approximations of least-core values.

Equivalent to creating a LeastCoreValuation with a UniformSampler.

PARAMETER	DESCRIPTION
`utility`	Utility object with model, data and scoring function. TYPE: `UtilityBase`
`n_samples`	The number of samples to use for the valuation. The signature requires an integer; the fallback to `1000 * len(data)` for None is handled by the parent `LeastCoreValuation`. TYPE: `int`
`non_negative_subsidy`	If True, the least core subsidy \(e\) is constrained to be non-negative. TYPE: `bool` DEFAULT: `False`
`solver_options`	Optional dictionary containing a CVXPY solver and options to configure it. For valid values of the "solver" key and for additional solver options, see the CVXPY documentation. TYPE: `dict \| None` DEFAULT: `None`
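`progress`	Whether to show a progress bar during the construction of the least-core problem. TYPE: `bool` DEFAULT: `True`
`seed`	Seed passed to the underlying `UniformSampler`. TYPE: `Seed \| None` DEFAULT: `None`
`batch_size`	Batch size passed to the underlying `UniformSampler`. TYPE: `int` DEFAULT: `1`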
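The effect of `non_negative_subsidy` can be worked out by hand for two data points. The sketch below assumes that only the two singleton coalitions end up as constraints, as can happen when coalitions are sampled; the function name and the toy utility values are hypothetical.

```python
def two_point_subsidy(v1, v2, v_total, non_negative_subsidy=False):
    """Least-core subsidy for two points constrained only by the singletons.

    Constraints: x1 + x2 == v_total, x1 + e >= v1, x2 + e >= v2.
    Adding the two inequalities gives e >= (v1 + v2 - v_total) / 2, and the
    allocation x1 = v1 - e, x2 = v2 - e attains that bound.
    """
    e = (v1 + v2 - v_total) / 2
    if non_negative_subsidy:
        # The extra constraint e >= 0 clips a negative optimum to zero.
        e = max(e, 0.0)
    return e


# Toy utility where the points are worth more together than apart.
e_free = two_point_subsidy(0.25, 0.25, 1.0)
e_clamped = two_point_subsidy(0.25, 0.25, 1.0, non_negative_subsidy=True)
```

Here the unconstrained subsidy is negative (-0.25), and enabling the option raises it to 0.0.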

Source code in src/pydvl/valuation/methods/least_core.py

def __init__(
    self,
    utility: UtilityBase,
    n_samples: int,
    non_negative_subsidy: bool = False,
    solver_options: dict | None = None,
    progress: bool = True,
    seed: Seed | None = None,
    batch_size: int = 1,
):
    super().__init__(
        utility=utility,
        sampler=UniformSampler(
            index_iteration=NoIndexIteration, seed=seed, batch_size=batch_size
        ),
        n_samples=n_samples,
        non_negative_subsidy=non_negative_subsidy,
        solver_options=solver_options,
        progress=progress,
    )

result `property` ¶

result: ValuationResult

The current valuation result (not a copy).

fit ¶

fit(data: Dataset, continue_from: ValuationResult | None = None) -> Self

Calculate the Monte Carlo approximation of Least-Core valuation on a dataset.

This method uses random sampling of coalitions and can handle larger datasets than the exact method, with accuracy depending on the number of samples.

PARAMETER	DESCRIPTION
`data`	Data for which to compute values TYPE: `Dataset`
`continue_from`	A previously computed valuation result to continue from. TYPE: `ValuationResult \| None` DEFAULT: `None`
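As a rough illustration of random coalition sampling, the following standard-library sketch draws coalitions uniformly from all subsets and stores them as boolean masks. It is a stand-in under stated assumptions: `sample_coalition_masks` is a hypothetical name, and pyDVL's `UniformSampler` may differ in its details.

```python
import random


def sample_coalition_masks(n_points, n_samples, seed=None):
    """Draw `n_samples` coalitions uniformly at random from all subsets.

    Each coalition is a list of booleans: mask[i] is True when point i
    belongs to it. Including every point independently with probability
    1/2 makes each of the 2**n_points subsets equally likely.
    """
    rng = random.Random(seed)
    return [
        [rng.random() < 0.5 for _ in range(n_points)]
        for _ in range(n_samples)
    ]


# 200 sampled coalitions over 50 points, instead of all 2**50 subsets.
masks = sample_coalition_masks(n_points=50, n_samples=200, seed=42)
```

Each mask then yields one utility evaluation and one constraint, so the cost is controlled by `n_samples` rather than by the dataset size.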

Source code in src/pydvl/valuation/methods/least_core.py

def fit(self, data: Dataset, continue_from: ValuationResult | None = None) -> Self:
    """Calculate the Monte Carlo approximation of Least-Core valuation on a dataset.

    This method uses random sampling of coalitions and can handle larger datasets
    than the exact method, with accuracy depending on the number of samples.

    Args:
        data: Data for which to compute values
        continue_from: A previously computed valuation result to continue from.
    """
    return super().fit(data, continue_from)

values ¶

values(sort: bool = False) -> ValuationResult

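Returns a copy of the valuation result. Deprecated in 0.10.0 and scheduled for removal in 0.11.0; the `result` property gives access to the current result without copying.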

The valuation must have been run with fit() before calling this method.

PARAMETER	DESCRIPTION
`sort`	Whether to sort the valuation result by value before returning it. TYPE: `bool` DEFAULT: `False`

Returns: The result of the valuation.

Source code in src/pydvl/valuation/base.py

@deprecated(
    target=None,
    deprecated_in="0.10.0",
    remove_in="0.11.0",
)
def values(self, sort: bool = False) -> ValuationResult:
    """Returns a copy of the valuation result.

    The valuation must have been run with `fit()` before calling this method.

    Args:
        sort: Whether to sort the valuation result by value before returning it.
    Returns:
        The result of the valuation.
    """
    if not self.is_fitted:
        raise NotFittedException(type(self))
    assert self._result is not None

    r = self._result.copy()
    if sort:
        r.sort(inplace=True)
    return r

create_least_core_problem ¶

create_least_core_problem(
    u: UtilityBase, sampler: PowersetSampler, n_samples: int, progress: bool
) -> LeastCoreProblem

Create a Least Core problem from a utility and a sampler.

PARAMETER	DESCRIPTION
`u`	Utility object with model, data and scoring function. TYPE: `UtilityBase`
`sampler`	The sampler to use for the valuation. TYPE: `PowersetSampler`
`n_samples`	The maximum number of samples to use for the valuation. TYPE: `int`
`progress`	Whether to show a progress bar during the construction of the least-core problem. TYPE: `bool`

RETURNS	DESCRIPTION
`LeastCoreProblem`	The least core problem to solve. TYPE: `LeastCoreProblem`
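The problem object pairs one utility value with one coalition mask per sample. Reading `A_lb` as the matrix of lower-bound constraints (an interpretation based on its name), each row requires the values of the coalition members plus the subsidy to be at least the coalition's utility. The sketch below checks a candidate solution against such constraints in plain Python; `max_violation` and the toy numbers are hypothetical.

```python
def max_violation(values, subsidy, masks, utility_values):
    """Largest shortfall over all constraints, or 0.0 if all are satisfied.

    Each constraint reads: sum of values in the coalition + subsidy >= utility.
    """
    worst = 0.0
    for mask, utility in zip(masks, utility_values):
        coalition_value = sum(v for v, m in zip(values, mask) if m)
        worst = max(worst, utility - (coalition_value + subsidy))
    return worst


# Three sampled coalitions over three points, as rows of 0/1 floats.
masks = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
utility_values = [0.5, 0.5, 1.0]

# This allocation meets every constraint without any subsidy.
feasible = max_violation([0.5, 0.25, 0.25], 0.0, masks, utility_values)

# This one undervalues the first point and needs a subsidy of 0.25.
shortfall = max_violation([0.25, 0.375, 0.375], 0.0, masks, utility_values)
repaired = max_violation([0.25, 0.375, 0.375], 0.25, masks, utility_values)
```

The solver's task is to find the values and the smallest subsidy for which this shortfall is zero across all rows.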

Source code in src/pydvl/valuation/methods/least_core.py

def create_least_core_problem(
    u: UtilityBase, sampler: PowersetSampler, n_samples: int, progress: bool
) -> LeastCoreProblem:
    """Create a Least Core problem from a utility and a sampler.

    Args:
        u: Utility object with model, data and scoring function.
        sampler: The sampler to use for the valuation.
        n_samples: The maximum number of samples to use for the valuation.
        progress: Whether to show a progress bar during the construction of the
            least-core problem.

    Returns:
        LeastCoreProblem: The least core problem to solve.

    """
    utility_values, masks = compute_utility_values_and_sample_masks(
        utility=u, sampler=sampler, n_samples=n_samples, progress=progress
    )

    return LeastCoreProblem(utility_values=utility_values, A_lb=masks.astype(float))
