
pydvl.valuation.methods.twodshapley

This module implements 2D-Shapley, as introduced in (Liu et al., 2023) [1].

References


  1. Liu, Zhihong, Hoang Anh Just, Xiangyu Chang, Xi Chen, and Ruoxi Jia. 2D-Shapley: A Framework for Fragmented Data Valuation. In Proceedings of the 40th International Conference on Machine Learning, 21730–55. PMLR, 2023. 

TwoDSample dataclass

TwoDSample(
    idx: int | IndexT | None, subset: NDArray[IndexT], features: NDArray[IndexT]
)

Bases: Sample

A sample for 2D-Shapley, consisting of a set of indices and a set of features.
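
As a rough illustration of what "a set of indices and a set of features" can describe, the sketch below uses plain NumPy to select the block of a data matrix addressed by such a pair. The matrix `X` and the chosen index values are hypothetical, and this is not a statement about how pyDVL evaluates a `TwoDSample` internally.

```python
import numpy as np

# Hypothetical 4x3 data matrix: 4 data points (rows), 3 features (columns).
X = np.arange(12).reshape(4, 3)

subset = np.array([0, 2])    # row indices, playing the role of `subset`
features = np.array([1, 2])  # column indices, playing the role of `features`

# np.ix_ builds an open mesh, so the result is the 2x2 block formed by
# rows {0, 2} and columns {1, 2} rather than two individual cells.
block = X[np.ix_(subset, features)]
print(block)
```

Indexing with `np.ix_` rather than `X[subset, features]` matters here: the latter would pair the indices element-wise and return only the cells `(0, 1)` and `(2, 2)`.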

idx instance-attribute

idx: int | IndexT | None

Index of the current sample.

subset instance-attribute

subset: NDArray[IndexT]

Indices of the data points that make up the current sample's subset.

with_idx_in_subset

with_idx_in_subset() -> Self

Return a copy of sample with idx added to the subset.

Returns the original sample if idx was already part of the subset.

RETURNS DESCRIPTION
Sample

A copy of the sample with idx added to the subset.

TYPE: Self

RAISES DESCRIPTION
ValueError

If idx is None.

Source code in src/pydvl/valuation/types.py
def with_idx_in_subset(self) -> Self:
    """Return a copy of sample with idx added to the subset.

    Returns the original sample if idx was already part of the subset.

    Returns:
        Sample: A copy of the sample with idx added to the subset.

    Raises:
        ValueError: If idx is None.
    """
    if self.idx in self.subset:
        return self

    if self.idx is None:
        raise ValueError("Cannot add idx to subset if idx is None.")

    new_subset = np.array(self.subset.tolist() + [self.idx])
    return replace(self, subset=new_subset)
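
The behaviour described above can be tried out with a small stand-in class. `MiniSample` is a hypothetical name used only for this sketch and is not part of pyDVL; it assumes a frozen dataclass with just `idx` and `subset`, and it performs the `None` check first for clarity.

```python
from __future__ import annotations

from dataclasses import dataclass, replace

import numpy as np


@dataclass(frozen=True)
class MiniSample:
    """Hypothetical stand-in for a sample; not the pyDVL class."""

    idx: int | None
    subset: np.ndarray

    def with_idx_in_subset(self) -> MiniSample:
        if self.idx is None:
            raise ValueError("Cannot add idx to subset if idx is None.")
        if self.idx in self.subset:
            # Nothing to add: hand back the very same object.
            return self
        # Build a new array; the original subset is left untouched.
        return replace(self, subset=np.append(self.subset, self.idx))


s = MiniSample(idx=3, subset=np.array([0, 1]))
t = s.with_idx_in_subset()   # new sample whose subset also contains 3
u = t.with_idx_in_subset()   # idx already present, so u is t
```

Because the method never mutates `subset` in place, `s` can still be used after the call.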

with_idx

with_idx(idx: int) -> Self

Return a copy of sample with idx changed.

Returns the original sample if idx is the same.

PARAMETER DESCRIPTION
idx

New value for idx.

TYPE: int

RETURNS DESCRIPTION
Sample

A copy of the sample with idx changed.

TYPE: Self

Source code in src/pydvl/valuation/types.py
def with_idx(self, idx: int) -> Self:
    """Return a copy of sample with idx changed.

    Returns the original sample if idx is the same.

    Args:
        idx: New value for idx.

    Returns:
        Sample: A copy of the sample with idx changed.
    """
    if self.idx == idx:
        return self

    return replace(self, idx=idx)
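
A minimal sketch of the copy-on-change pattern used by `with_idx`, built on `dataclasses.replace`. The class `MiniSample` is hypothetical and uses a tuple for `subset` to keep the example dependency-free; the real class stores a NumPy array.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MiniSample:
    """Hypothetical stand-in for a sample; not the pyDVL class."""

    idx: int | None
    subset: tuple[int, ...]  # assumption: tuple instead of NDArray

    def with_idx(self, idx: int) -> MiniSample:
        if self.idx == idx:
            return self  # unchanged: no copy is made
        return replace(self, idx=idx)


s = MiniSample(idx=1, subset=(0, 2))
same = s.with_idx(1)    # same value, so the original object comes back
moved = s.with_idx(5)   # new object with idx == 5
```

`replace` only swaps the named field, so `moved.subset` refers to the same object as `s.subset`. This sharing is safe as long as samples are treated as immutable.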

with_subset

with_subset(subset: NDArray[IndexT]) -> Self

Return a copy of sample with subset changed.

Returns the original sample if subset is the same.

PARAMETER DESCRIPTION
subset

New value for subset.

TYPE: NDArray[IndexT]

RETURNS DESCRIPTION
Sample

A copy of the sample with subset changed.

TYPE: Self

Source code in src/pydvl/valuation/types.py
def with_subset(self, subset: NDArray[IndexT]) -> Self:
    """Return a copy of sample with subset changed.

    Returns the original sample if subset is the same.

    Args:
        subset: New value for subset.

    Returns:
        Sample: A copy of the sample with subset changed.
    """
    if np.array_equal(self.subset, subset):
        return self

    return replace(self, subset=subset)
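
Whether `with_subset` returns the original sample depends on `np.array_equal`, which compares shape and elements position by position. The short sketch below, with made-up index arrays, shows which inputs count as "the same" under that test.

```python
import numpy as np

current = np.array([0, 1, 2])

# Same elements in the same order: treated as unchanged.
same_values = np.array_equal(current, np.array([0, 1, 2]))

# Same elements in a different order: treated as a different subset.
reordered = np.array_equal(current, np.array([2, 1, 0]))

# Different length: treated as a different subset.
shorter = np.array_equal(current, np.array([0, 1]))

print(same_values, reordered, shorter)
```

A consequence is that passing a permutation of the current subset produces a new sample, even though it describes the same set of indices.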