pydvl.valuation.scorers

Scorer

Bases: ABC

A scoring callable that takes a model and returns a scalar.

Added in version 0.10.0

ABC added
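
Since Scorer only fixes the calling contract, a minimal custom scorer can be sketched as follows. This is illustrative only: ConstantScorer is a made-up name, and real scorers will typically also set attributes such as a default value and range.

class ConstantScorer(Scorer):
    def __call__(self, model) -> float:
        # Ignores the model entirely; a real scorer would evaluate it.
        return 0.5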

Dataset

Dataset(
    x: NDArray,
    y: NDArray,
    feature_names: Sequence[str] | NDArray[str_] | None = None,
    target_names: Sequence[str] | NDArray[str_] | None = None,
    data_names: Sequence[str] | NDArray[str_] | None = None,
    description: str | None = None,
    multi_output: bool = False,
)

A convenience class to handle datasets.

It holds a dataset, together with info on feature names, target names, and data names. It is used to pass data around to valuation methods.

The underlying data arrays can be accessed via Dataset.data(), which returns the tuple (X, y) as a read-only RawData object. The data can also be accessed by indexing the object directly, e.g. dataset[0] returns a new Dataset with the data point at index 0. For this base class its contents coincide with dataset.data([0]), the first point in the data arrays, but derived classes can behave differently.
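
As a quick illustration of this access pattern (a sketch with made-up arrays and names):

import numpy as np
from pydvl.valuation.dataset import Dataset

x = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
y = np.array([0, 1, 1, 0])
dataset = Dataset(x, y, feature_names=["f0", "f1"])

x_all, y_all = dataset.data()  # read-only RawData, unpacks into (x, y)
first = dataset[0]             # a new Dataset holding the point at index 0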

PARAMETER DESCRIPTION
x

training data

TYPE: NDArray

y

labels for training data

TYPE: NDArray

feature_names

names of the features of x data

TYPE: Sequence[str] | NDArray[str_] | None DEFAULT: None

target_names

names of the target variables (the columns of y)

TYPE: Sequence[str] | NDArray[str_] | None DEFAULT: None

data_names

names assigned to data points. For example, if the dataset is a time series, each entry can be a timestamp which can be referenced directly instead of using a row number.

TYPE: Sequence[str] | NDArray[str_] | None DEFAULT: None

description

A textual description of the dataset.

TYPE: str | None DEFAULT: None

multi_output

set to False if labels are scalars, or to True if they are vectors of dimension > 1.

TYPE: bool DEFAULT: False

Changed in version 0.10.0

No longer holds split data, but only x, y.

Changed in version 0.10.0

Slicing now returns a new Dataset object, not raw data.

Source code in src/pydvl/valuation/dataset.py
def __init__(
    self,
    x: NDArray,
    y: NDArray,
    feature_names: Sequence[str] | NDArray[np.str_] | None = None,
    target_names: Sequence[str] | NDArray[np.str_] | None = None,
    data_names: Sequence[str] | NDArray[np.str_] | None = None,
    description: str | None = None,
    multi_output: bool = False,
):
    self._x, self._y = check_X_y(
        x, y, multi_output=multi_output, estimator="Dataset"
    )

    def make_names(s: str, a: np.ndarray) -> list[str]:
        n = a.shape[1] if len(a.shape) > 1 else 1
        return [f"{s}{i:0{1 + int(np.log10(n))}d}" for i in range(1, n + 1)]

    self.feature_names = (
        list(feature_names) if feature_names is not None else make_names("x", x)
    )
    self.target_names = (
        list(target_names) if target_names is not None else make_names("y", y)
    )

    if len(self._x.shape) > 1:
        if len(self.feature_names) != self._x.shape[-1]:
            raise ValueError("Mismatching number of features and names")
    if len(self._y.shape) > 1:
        if len(self.target_names) != self._y.shape[-1]:
            raise ValueError("Mismatching number of targets and names")

    self.description = description or "No description"
    self._indices = np.arange(len(self._x), dtype=np.int_)
    self._data_names = (
        np.array(data_names, dtype=np.str_)
        if data_names is not None
        else self._indices.astype(np.str_)
    )

indices property

indices: NDArray[int_]

Index of positions in the data arrays.

Contiguous integers from 0 to len(Dataset).

names property

names: NDArray[str_]

Names of each individual datapoint.

Used for reporting Shapley values.

n_features property

n_features: int

Returns the number of dimensions of a sample.

feature

feature(name: str) -> tuple[slice, int]

Returns a slice for the feature with the given name.

Source code in src/pydvl/valuation/dataset.py
def feature(self, name: str) -> tuple[slice, int]:
    """Returns a slice for the feature with the given name."""
    try:
        return np.index_exp[:, self.feature_names.index(name)]  # type: ignore
    except ValueError:
        raise ValueError(f"Feature {name} is not in {self.feature_names}")

data

data(
    indices: int | slice | Sequence[int] | NDArray[int_] | None = None,
) -> RawData

Given a set of indices, returns the training data that refer to those indices, as a read-only tuple-like structure.

This is used mainly by subclasses of UtilityBase to retrieve subsets of the data from indices.

PARAMETER DESCRIPTION
indices

Optional indices that will be used to select points from the training data. If None, the entire training data will be returned.

TYPE: int | slice | Sequence[int] | NDArray[int_] | None DEFAULT: None

RETURNS DESCRIPTION
RawData

If indices is not None, the selected x and y arrays from the training data. Otherwise, the entire dataset.

Source code in src/pydvl/valuation/dataset.py
def data(
    self, indices: int | slice | Sequence[int] | NDArray[np.int_] | None = None
) -> RawData:
    """Given a set of indices, returns the training data that refer to those
    indices, as a read-only tuple-like structure.

    This is used mainly by subclasses of
    [UtilityBase][pydvl.valuation.utility.base.UtilityBase] to retrieve subsets of
    the data from indices.

    Args:
        indices: Optional indices that will be used to select points from
            the training data. If `None`, the entire training data will be
            returned.

    Returns:
        If `indices` is not `None`, the selected x and y arrays from the
            training data. Otherwise, the entire dataset.
    """
    if indices is None:
        return RawData(self._x, self._y)
    return RawData(self._x[indices], self._y[indices])
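
A short usage sketch, again with the hypothetical dataset from above:

subset = dataset.data([0, 2])  # RawData with rows 0 and 2
x_sub, y_sub = subset          # unpacks into the selected x and y arrays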

data_indices

data_indices(indices: Sequence[int] | None = None) -> NDArray[int_]

Returns a subset of indices.

This is equivalent to using Dataset.indices[logical_indices] but allows subclasses to define special behaviour, e.g. when indices in Dataset do not match the indices in the data.

For Dataset, this is a simple pass-through.

PARAMETER DESCRIPTION
indices

A set of indices held by this object

TYPE: Sequence[int] | None DEFAULT: None

RETURNS DESCRIPTION
NDArray[int_]

The indices of the data points in the data array.

Source code in src/pydvl/valuation/dataset.py
def data_indices(self, indices: Sequence[int] | None = None) -> NDArray[np.int_]:
    """Returns a subset of indices.

    This is equivalent to using `Dataset.indices[logical_indices]` but allows
    subclasses to define special behaviour, e.g. when indices in `Dataset` do not
    match the indices in the data.

    For `Dataset`, this is a simple pass-through.

    Args:
        indices: A set of indices held by this object

    Returns:
        The indices of the data points in the data array.
    """
    if indices is None:
        return self._indices
    return self._indices[indices]

logical_indices

logical_indices(indices: Sequence[int] | None = None) -> NDArray[int_]

Returns the indices in this Dataset for the given indices in the data array.

This is equivalent to using Dataset.indices[data_indices] but allows subclasses to define special behaviour, e.g. when indices in Dataset do not match the indices in the data.

PARAMETER DESCRIPTION
indices

A set of indices in the data array.

TYPE: Sequence[int] | None DEFAULT: None

RETURNS DESCRIPTION
NDArray[int_]

The abstract indices for the given data indices.

Source code in src/pydvl/valuation/dataset.py
def logical_indices(self, indices: Sequence[int] | None = None) -> NDArray[np.int_]:
    """Returns the indices in this `Dataset` for the given indices in the data array.

    This is equivalent to using `Dataset.indices[data_indices]` but allows
    subclasses to define special behaviour, e.g. when indices in `Dataset` do not
    match the indices in the data.

    Args:
        indices: A set of indices in the data array.

    Returns:
        The abstract indices for the given data indices.
    """
    if indices is None:
        return self._indices
    return self._indices[indices]
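
Since both mappings are pass-throughs for this base class, the following sketch holds; it only differs for subclasses with non-trivial index mappings:

import numpy as np

assert np.array_equal(dataset.data_indices([1, 3]), [1, 3])
assert np.array_equal(dataset.logical_indices([1, 3]), [1, 3])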

from_sklearn classmethod

from_sklearn(
    data: Bunch,
    train_size: int | float = 0.8,
    random_state: int | None = None,
    stratify_by_target: bool = False,
    **kwargs,
) -> tuple[Dataset, Dataset]

Constructs two Dataset objects from a sklearn.utils.Bunch, as returned by the load_* functions in scikit-learn toy datasets.

Example
>>> from pydvl.valuation.dataset import Dataset
>>> from sklearn.datasets import load_iris
>>> train, test = Dataset.from_sklearn(load_iris())
PARAMETER DESCRIPTION
data

scikit-learn Bunch object. The following attributes are supported:

  • data: covariates.
  • target: target variables (labels).
  • feature_names (optional): the feature names.
  • target_names (optional): the target names.
  • DESCR (optional): a description.

TYPE: Bunch

train_size

size of the training dataset. Used in train_test_split. Float values represent the fraction of the dataset to include in the training split and should be in (0, 1). An integer value sets the absolute number of training samples. If None, the value is automatically set to the complement of the test size.

TYPE: int | float DEFAULT: 0.8

random_state

seed for train / test split

TYPE: int | None DEFAULT: None

stratify_by_target

If True, data is split in a stratified fashion, using the target variable as labels. Read more in scikit-learn's user guide.

TYPE: bool DEFAULT: False

kwargs

Additional keyword arguments to pass to the Dataset constructor. Use this to pass e.g. is_multi_output.

DEFAULT: {}

RETURNS DESCRIPTION
tuple[Dataset, Dataset]

A tuple of two Dataset objects, one with the training data and one with the test data.

Changed in version 0.6.0

Added kwargs to pass to the Dataset constructor.

Changed in version 0.10.0

Returns a tuple of two Dataset objects.

Source code in src/pydvl/valuation/dataset.py
@classmethod
def from_sklearn(
    cls,
    data: Bunch,
    train_size: int | float = 0.8,
    random_state: int | None = None,
    stratify_by_target: bool = False,
    **kwargs,
) -> tuple[Dataset, Dataset]:
    """Constructs two [Dataset][pydvl.valuation.dataset.Dataset] objects from a
    [sklearn.utils.Bunch][], as returned by the `load_*`
    functions in [scikit-learn toy datasets](https://scikit-learn.org/stable/datasets/toy_dataset.html).

    ??? Example
        ```pycon
        >>> from pydvl.valuation.dataset import Dataset
        >>> from sklearn.datasets import load_iris
        >>> train, test = Dataset.from_sklearn(load_iris())
        ```

    Args:
        data: scikit-learn Bunch object. The following attributes are supported:

            - `data`: covariates.
            - `target`: target variables (labels).
            - `feature_names` (**optional**): the feature names.
            - `target_names` (**optional**): the target names.
            - `DESCR` (**optional**): a description.
        train_size: size of the training dataset. Used in `train_test_split`.
            Float values represent the fraction of the dataset to include in the
            training split and should be in (0,1). An integer value sets the
            absolute number of training samples. If `None`, the value is
            automatically set to the complement of the test size.
        random_state: seed for train / test split
        stratify_by_target: If `True`, data is split in a stratified
            fashion, using the target variable as labels. Read more in
            [scikit-learn's user guide](https://scikit-learn.org/stable/modules/cross_validation.html#stratification).
        kwargs: Additional keyword arguments to pass to the
            [Dataset][pydvl.valuation.dataset.Dataset] constructor. Use this to pass e.g. `is_multi_output`.

    Returns:
        A tuple of two [Dataset][pydvl.valuation.dataset.Dataset] objects, one
            with the training data and one with the test data.

    !!! tip "Changed in version 0.6.0"
        Added kwargs to pass to the [Dataset][pydvl.valuation.dataset.Dataset] constructor.
    !!! tip "Changed in version 0.10.0"
        Returns a tuple of two [Dataset][pydvl.valuation.dataset.Dataset] objects.
    """
    x_train, x_test, y_train, y_test = train_test_split(
        data.data,
        data.target,
        train_size=train_size,
        random_state=random_state,
        stratify=data.target if stratify_by_target else None,
    )
    return (
        cls(
            x_train,
            y_train,
            feature_names=data.get("feature_names"),
            target_names=data.get("target_names"),
            description=data.get("DESCR"),
            **kwargs,
        ),
        cls(
            x_test,
            y_test,
            feature_names=data.get("feature_names"),
            target_names=data.get("target_names"),
            description=data.get("DESCR"),
            **kwargs,
        ),
    )

from_arrays classmethod

from_arrays(
    X: NDArray,
    y: NDArray,
    train_size: float = 0.8,
    random_state: int | None = None,
    stratify_by_target: bool = False,
    **kwargs: Any,
) -> tuple[Dataset, Dataset]

Constructs two Dataset objects (one for training, one for testing) from X and y numpy arrays, as returned by the make_* functions in sklearn generated datasets.

Example
>>> from pydvl.valuation.dataset import Dataset
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression()
>>> train, test = Dataset.from_arrays(X, y)
PARAMETER DESCRIPTION
X

numpy array of shape (n_samples, n_features)

TYPE: NDArray

y

numpy array of shape (n_samples,)

TYPE: NDArray

train_size

size of the training dataset. Used in train_test_split

TYPE: float DEFAULT: 0.8

random_state

seed for train / test split

TYPE: int | None DEFAULT: None

stratify_by_target

If True, data is split in a stratified fashion, using the y variable as labels. Read more in sklearn's user guide.

TYPE: bool DEFAULT: False

kwargs

Additional keyword arguments to pass to the Dataset constructor. Use this to pass e.g. feature_names or target_names.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
tuple[Dataset, Dataset]

A tuple of two Dataset objects with the passed X and y arrays split across training and test sets.

New in version 0.4.0

Changed in version 0.6.0

Added kwargs to pass to the Dataset constructor.

Changed in version 0.10.0

Returns a tuple of two Dataset objects.

Source code in src/pydvl/valuation/dataset.py
@classmethod
def from_arrays(
    cls,
    X: NDArray,
    y: NDArray,
    train_size: float = 0.8,
    random_state: int | None = None,
    stratify_by_target: bool = False,
    **kwargs: Any,
) -> tuple[Dataset, Dataset]:
    """Constructs a [Dataset][pydvl.valuation.dataset.Dataset] object from X and y numpy arrays  as
    returned by the `make_*` functions in [sklearn generated datasets](https://scikit-learn.org/stable/datasets/sample_generators.html).

    ??? Example
        ```pycon
        >>> from pydvl.valuation.dataset import Dataset
        >>> from sklearn.datasets import make_regression
        >>> X, y = make_regression()
        >>> train, test = Dataset.from_arrays(X, y)
        ```

    Args:
        X: numpy array of shape (n_samples, n_features)
        y: numpy array of shape (n_samples,)
        train_size: size of the training dataset. Used in `train_test_split`
        random_state: seed for train / test split
        stratify_by_target: If `True`, data is split in a stratified fashion,
            using the y variable as labels. Read more in [sklearn's user
            guide](https://scikit-learn.org/stable/modules/cross_validation.html#stratification).
        kwargs: Additional keyword arguments to pass to the
            [Dataset][pydvl.valuation.dataset.Dataset] constructor. Use this to pass
            e.g. `feature_names` or `target_names`.

    Returns:
        A tuple of two [Dataset][pydvl.valuation.dataset.Dataset] objects with the
            passed X and y arrays split across training and test sets.

    !!! tip "New in version 0.4.0"

    !!! tip "Changed in version 0.6.0"
        Added kwargs to pass to the [Dataset][pydvl.valuation.dataset.Dataset] constructor.

    !!! tip "Changed in version 0.10.0"
        Returns a tuple of two [Dataset][pydvl.valuation.dataset.Dataset] objects.
    """
    x_train, x_test, y_train, y_test = train_test_split(
        X,
        y,
        train_size=train_size,
        random_state=random_state,
        stratify=y if stratify_by_target else None,
    )
    return cls(x_train, y_train, **kwargs), cls(x_test, y_test, **kwargs)

ClasswiseSupervisedScorer

ClasswiseSupervisedScorer(
    scoring: str
    | SupervisedScorerCallable[SupervisedModelT]
    | SupervisedModelT,
    test_data: Dataset,
    default: float = 0.0,
    range: tuple[float, float] = (0, 1),
    in_class_discount_fn: Callable[[float], float] = lambda x: x,
    out_of_class_discount_fn: Callable[[float], float] = exp,
    rescale_scores: bool = True,
    name: str | None = None,
)

Bases: SupervisedScorer[SupervisedModelT]

A Scorer designed for evaluation in classification problems.

The final score is the combination of the in-class and out-of-class scores, which are e.g. the accuracy of the trained model over the instances of the test set with the same, and different, labels, respectively. See the module's documentation for more on this.

These two scores are computed with an "inner" scoring function, which must be provided upon construction.

Multi-class support

The inner score must support multiple class labels if you intend to apply them to a multi-class problem. For instance, 'accuracy' supports multiple classes, but f1 does not. For a two-class classification problem, using f1_weighted is essentially equivalent to using accuracy.

PARAMETER DESCRIPTION
scoring

Name of the scoring function or a callable that can be passed to SupervisedScorer.

TYPE: str | SupervisedScorerCallable[SupervisedModelT] | SupervisedModelT

default

Score to use when a model fails to provide a number, e.g. when too little data was used to train it, or when errors arise.

TYPE: float DEFAULT: 0.0

range

Numerical range of the score function. Some Monte Carlo methods can use this to estimate the number of samples required for a certain quality of approximation. If not provided, it can be read from the scoring object if it provides it, for instance if it was constructed with compose_score.

TYPE: tuple[float, float] DEFAULT: (0, 1)

in_class_discount_fn

Continuous, monotonic increasing function used to discount the in-class score.

TYPE: Callable[[float], float] DEFAULT: lambda x: x

out_of_class_discount_fn

Continuous, monotonic increasing function used to discount the out-of-class score.

TYPE: Callable[[float], float] DEFAULT: exp

rescale_scores

If set to True, the scores will be denormalized. This is particularly useful when the inner score function \(a_S\) is calculated by an estimator of the form \(\frac{1}{N} \sum_i x_i\).

TYPE: bool DEFAULT: True

name

Name of the scorer. If not provided, the name of the inner scoring function will be prefixed by classwise.

TYPE: str | None DEFAULT: None

New in version 0.7.1
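
A construction sketch, assuming test_data is a classification Dataset (hypothetical name) and spelling out the default discount functions:

import numpy as np
from pydvl.valuation.scorers import ClasswiseSupervisedScorer

scorer = ClasswiseSupervisedScorer(
    "accuracy",           # inner scorer; must support multiple classes
    test_data=test_data,  # hypothetical classification Dataset
    default=0.0,
    range=(0, 1),
    in_class_discount_fn=lambda x: x,
    out_of_class_discount_fn=np.exp,
)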

Source code in src/pydvl/valuation/scorers/classwise.py
def __init__(
    self,
    scoring: str | SupervisedScorerCallable[SupervisedModelT] | SupervisedModelT,
    test_data: Dataset,
    default: float = 0.0,
    range: tuple[float, float] = (0, 1),
    in_class_discount_fn: Callable[[float], float] = lambda x: x,
    out_of_class_discount_fn: Callable[[float], float] = np.exp,
    rescale_scores: bool = True,
    name: str | None = None,
):
    disc_score_in_class = in_class_discount_fn(range[1])
    disc_score_out_of_class = out_of_class_discount_fn(range[1])
    transformed_range = (0, disc_score_in_class * disc_score_out_of_class)
    super().__init__(
        scoring=scoring,
        test_data=test_data,
        range=transformed_range,
        default=default,
        name=name or f"classwise {str(scoring)}",
    )
    self._in_class_discount_fn = in_class_discount_fn
    self._out_of_class_discount_fn = out_of_class_discount_fn
    self.label: int | None = None
    self.num_classes = len(np.unique(self.test_data.data().y))
    self.rescale_scores = rescale_scores

compute_in_and_out_of_class_scores

compute_in_and_out_of_class_scores(
    model: SupervisedModelT, rescale_scores: bool = True
) -> tuple[float, float]

Computes in-class and out-of-class scores using the provided inner scoring function. The result is

\[ a_S(D=\{(x_1, y_1), \dots, (x_K, y_K)\}) = \frac{1}{N} \sum_k s(y(x_k), y_k). \]

In this context, for label \(c\) calculations are executed twice: once for \(D_c\) and once for \(D_{-c}\) to determine the in-class and out-of-class scores, respectively. By default, the raw scores are multiplied by \(\frac{|D_c|}{|D|}\) and \(\frac{|D_{-c}|}{|D|}\), respectively. This is done to ensure that both scores are of the same order of magnitude. This normalization is particularly useful when the inner score function \(a_S\) is calculated by an estimator of the form \(\frac{1}{N} \sum_i x_i\), e.g. the accuracy.

PARAMETER DESCRIPTION
model

Model used for computing the score on the validation set.

TYPE: SupervisedModelT

rescale_scores

If set to True, the scores will be denormalized. This is particularly useful when the inner score function \(a_S\) is calculated by an estimator of the form \(\frac{1}{N} \sum_i x_i\).

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
tuple[float, float]

Tuple containing the in-class and out-of-class scores.
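
To make the rescaling concrete, here is the arithmetic with assumed values for the class sizes and raw scores:

n_in, n_out = 20, 80        # assumed test-set sizes of D_c and D_-c
raw_in, raw_out = 0.9, 0.5  # assumed raw in-/out-of-class scores
in_score = raw_in * n_in / (n_in + n_out)     # 0.9 * 0.2 = 0.18
out_score = raw_out * n_out / (n_in + n_out)  # 0.5 * 0.8 = 0.40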

Source code in src/pydvl/valuation/scorers/classwise.py
def compute_in_and_out_of_class_scores(
    self, model: SupervisedModelT, rescale_scores: bool = True
) -> tuple[float, float]:
    r"""
    Computes in-class and out-of-class scores using the provided inner
    scoring function. The result is

    $$
    a_S(D=\{(x_1, y_1), \dots, (x_K, y_K)\}) = \frac{1}{N} \sum_k s(y(x_k), y_k).
    $$

    In this context, for label $c$ calculations are executed twice: once for $D_c$
    and once for $D_{-c}$ to determine the in-class and out-of-class scores,
    respectively. By default, the raw scores are multiplied by $\frac{|D_c|}{|D|}$
    and $\frac{|D_{-c}|}{|D|}$, respectively. This is done to ensure that both
    scores are of the same order of magnitude. This normalization is particularly
    useful when the inner score function $a_S$ is calculated by an estimator of the
    form $\frac{1}{N} \sum_i x_i$, e.g. the accuracy.

    Args:
        model: Model used for computing the score on the validation set.
        rescale_scores: If set to True, the scores will be denormalized. This is
            particularly useful when the inner score function $a_S$ is calculated by
            an estimator of the form $\frac{1}{N} \sum_i x_i$.

    Returns:
        Tuple containing the in-class and out-of-class scores.
    """
    if self.label is None:
        raise ValueError(
            "The scorer's label attribute should be set before calling it"
        )

    scorer = self._scorer
    label_set_match = self.test_data.data().y == self.label
    label_set = np.where(label_set_match)[0]

    if len(label_set) == 0:
        return 0, 1 / max(1, self.num_classes - 1)

    complement_label_set = np.where(~label_set_match)[0]
    in_class_score = scorer(model, *self.test_data.data(label_set))
    out_of_class_score = scorer(model, *self.test_data.data(complement_label_set))

    if rescale_scores:
        # TODO: This can lead to NaN values
        #       We should clearly indicate this to users
        _, y_test = self.test_data.data()
        n_in_class = np.count_nonzero(y_test == self.label)
        n_out_of_class = len(y_test) - n_in_class
        in_class_score *= n_in_class / (n_in_class + n_out_of_class)
        out_of_class_score *= n_out_of_class / (n_in_class + n_out_of_class)

    return in_class_score, out_of_class_score

SupervisedScorer

SupervisedScorer(
    scoring: str
    | SupervisedScorerCallable[SupervisedModelT]
    | SupervisedModelT,
    test_data: Dataset,
    default: float,
    range: tuple[float, float] = (-inf, inf),
    name: str | None = None,
)

Bases: Generic[SupervisedModelT], Scorer

A scoring callable that takes a model, data, and labels and returns a scalar.

PARAMETER DESCRIPTION
scoring

Either a string or callable that can be passed to get_scorer.

TYPE: str | SupervisedScorerCallable[SupervisedModelT] | SupervisedModelT

test_data

Dataset where the score will be evaluated.

TYPE: Dataset

default

score to be used when a model cannot be fit, e.g. when too little data is passed, or errors arise.

TYPE: float

range

numerical range of the score function. Some Monte Carlo methods can use this to estimate the number of samples required for a certain quality of approximation. If not provided, it can be read from the scoring object if it provides it, for instance if it was constructed with compose_score().

TYPE: tuple[float, float] DEFAULT: (-inf, inf)

name

The name of the scorer. If not provided, the name of the function passed will be used.

TYPE: str | None DEFAULT: None

New in version 0.5.0

Changed in version 0.10.0

This is now SupervisedScorer and holds the test data used to evaluate the model.
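
A usage sketch, assuming train and test are Dataset objects as returned e.g. by Dataset.from_arrays (hypothetical names):

from sklearn.linear_model import LogisticRegression
from pydvl.valuation.scorers import SupervisedScorer

scorer = SupervisedScorer("accuracy", test_data=test, default=0.0, range=(0, 1))
model = LogisticRegression().fit(*train.data())
value = scorer(model)  # evaluates the fitted model on the held-out test data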

Source code in src/pydvl/valuation/scorers/supervised.py
def __init__(
    self,
    scoring: str | SupervisedScorerCallable[SupervisedModelT] | SupervisedModelT,
    test_data: Dataset,
    default: float,
    range: tuple[float, float] = (-np.inf, np.inf),
    name: str | None = None,
):
    super().__init__()
    if isinstance(scoring, SupervisedModel):
        from sklearn.metrics import check_scoring

        self._scorer = check_scoring(scoring)
        if name is None:
            name = f"Default scorer for {scoring.__class__.__name__}"
    elif isinstance(scoring, str):
        self._scorer = get_scorer(scoring)
        if name is None:
            name = scoring
    else:
        self._scorer = scoring
        if name is None:
            name = getattr(scoring, "__name__", "scorer")
    self.test_data = test_data
    self.default = default
    # TODO: auto-fill from known scorers ?
    self.range = np.array(range, dtype=np.float64)
    self.name = name

SupervisedScorerCallable

Bases: Protocol[SupervisedModelT]

Signature for a scorer: a callable that takes a fitted model together with test inputs and labels, and returns a float.
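
A conforming callable would look like this sketch, inferred from how scorers are invoked in this module:

from numpy.typing import NDArray

def my_scorer(model, X: NDArray, y: NDArray) -> float:
    # e.g. mean accuracy of `model` on (X, y)
    return float(model.score(X, y))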

compose_score

compose_score(
    scorer: SupervisedScorer,
    transformation: Callable[[float], float],
    name: str,
) -> SupervisedScorer

Composes a scoring function with an arbitrary scalar transformation.

Useful to squash unbounded scores into ranges manageable by data valuation methods.

Example
sigmoid = lambda x: 1/(1+np.exp(-x))
r2 = SupervisedScorer("r2", test_data, default=0.0)
compose_score(r2, sigmoid, name="squashed r2")
PARAMETER DESCRIPTION
scorer

The object to be composed.

TYPE: SupervisedScorer

transformation

A scalar transformation

TYPE: Callable[[float], float]

name

A string representation for the composition, for str().

TYPE: str

RETURNS DESCRIPTION
SupervisedScorer

The composite SupervisedScorer.

Source code in src/pydvl/valuation/scorers/utils.py
def compose_score(
    scorer: SupervisedScorer,
    transformation: Callable[[float], float],
    name: str,
) -> SupervisedScorer:
    """Composes a scoring function with an arbitrary scalar transformation.

    Useful to squash unbounded scores into ranges manageable by data valuation
    methods.

    ??? Example
        ```python
        sigmoid = lambda x: 1/(1+np.exp(-x))
        r2 = SupervisedScorer("r2", test_data, default=0.0)
        compose_score(r2, sigmoid, name="squashed r2")
        ```

    Args:
        scorer: The object to be composed.
        transformation: A scalar transformation
        name: A string representation for the composition, for `str()`.

    Returns:
        The composite [SupervisedScorer][pydvl.valuation.scorers.SupervisedScorer].
    """

    class CompositeSupervisedScorer(SupervisedScorer[SupervisedModelT]):
        def __call__(self, model: SupervisedModelT) -> float:
            raw = super().__call__(model)
            return transformation(raw)

    new_scorer = CompositeSupervisedScorer(
        scoring=scorer._scorer,
        test_data=scorer.test_data,
        default=transformation(scorer.default),
        range=(
            transformation(scorer.range[0].item()),
            transformation(scorer.range[1].item()),
        ),
        name=name,
    )
    return new_scorer