pydvl.utils.score
This module provides a Scorer class that wraps scoring functions with additional information.
Scorers are the fundamental building block of many data valuation methods. They are typically used by the Utility class to evaluate the quality of a model when trained on subsets of the training data.
Scorers can be constructed in the same way as in scikit-learn: either from known strings or from a callable. Greater values must be better. If they are not, a negated version can be used; see scikit-learn's make_scorer().
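The "greater is better" convention can be illustrated without any library. The sketch below assumes a scorer callable of the form `scorer(model, X, y) -> float`, as in scikit-learn. `MeanModel`, `mse_scorer` and `negate` are hypothetical names used only for illustration and are not part of pyDVL.

```python
class MeanModel:
    """Hypothetical toy model that always predicts the mean of the training labels."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def mse_scorer(model, X, y) -> float:
    """Mean squared error: lower is better, so it cannot be used as is."""
    predictions = model.predict(X)
    return sum((p - t) ** 2 for p, t in zip(predictions, y)) / len(y)


def negate(scorer):
    """Wrap a loss so that greater values are better."""

    def negated(model, X, y) -> float:
        return -scorer(model, X, y)

    return negated


X = [[0.0], [1.0], [2.0]]
y = [1.0, 2.0, 3.0]
model = MeanModel().fit(X, y)

loss = mse_scorer(model, X, y)        # smaller is better
score = negate(mse_scorer)(model, X, y)  # larger is better
print(loss, score)
```

With scikit-learn, `make_scorer(mean_squared_error, greater_is_better=False)` should achieve the same sign flip.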
Scorer provides additional information about the scoring function, like its range and default values, which can be used by some data valuation methods (like group_testing_shapley()) to estimate the number of samples required for a certain quality of approximation.
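To see why the range matters for sample-size estimates, consider the textbook Hoeffding bound for the mean of independent samples bounded in [a, b]. This is only an illustration of the general idea: it is not necessarily the bound that group_testing_shapley() uses, and `hoeffding_samples` is a hypothetical helper.

```python
import math


def hoeffding_samples(score_range, eps: float, delta: float):
    """Samples needed so that a sample mean is within eps of its
    expectation with probability at least 1 - delta (Hoeffding)."""
    low, high = score_range
    width = high - low
    if math.isinf(width):
        # An unbounded score gives no finite guarantee.
        return math.inf
    return math.ceil(width**2 * math.log(2 / delta) / (2 * eps**2))


n_unit = hoeffding_samples((0.0, 1.0), eps=0.1, delta=0.05)
n_wide = hoeffding_samples((-1.0, 1.0), eps=0.1, delta=0.05)
n_unbounded = hoeffding_samples((-math.inf, math.inf), eps=0.1, delta=0.05)
print(n_unit, n_wide, n_unbounded)  # 185 738 inf
```

Doubling the width of the range quadruples the estimate, which is one reason to prefer scorers with a known, narrow range.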
squashed_r2
module-attribute
squashed_r2 = compose_score(Scorer('r2'), _sigmoid, (0, 1), 'squashed r2')
A scorer that squashes the R² score into the range [0, 1] using a sigmoid.
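The effect of the squashing can be checked numerically. The sketch below assumes that the private `_sigmoid` is the standard logistic function 1 / (1 + exp(-x)); the local `sigmoid` is a stand-in written for this example.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function (assumed to match the module's _sigmoid)."""
    return 1.0 / (1.0 + math.exp(-x))


# R² lies in (-inf, 1], so a few representative values are enough.
squashed = {r2: sigmoid(r2) for r2 in (-10.0, -1.0, 0.0, 0.5, 1.0)}

for r2, value in squashed.items():
    print(f"R2 = {r2:6.2f} -> squashed = {value:.4f}")
```

Because R² never exceeds 1, the squashed score stays below sigmoid(1), roughly 0.73, even though the declared range is (0, 1).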
squashed_variance
module-attribute
squashed_variance = compose_score(
    Scorer("explained_variance"),
    _sigmoid,
    (0, 1),
    "squashed explained variance",
)
A scorer that squashes the explained variance score into the range [0, 1] using a sigmoid.
Scorer
Scorer(
    scoring: Union[str, ScorerCallable],
    default: float = np.nan,
    range: Tuple = (-np.inf, np.inf),
    name: Optional[str] = None,
)
A scoring callable that takes a model, data, and labels and returns a scalar.
PARAMETER | DESCRIPTION |
---|---|
scoring | Either a string or callable that can be passed to get_scorer. TYPE: Union[str, ScorerCallable] |
default | Score to be used when a model cannot be fit, e.g. when too little data is passed, or errors arise. |
range | Numerical range of the score function. Some Monte Carlo methods can use this to estimate the number of samples required for a certain quality of approximation. If not provided, it can be read from the [...] |
name | The name of the scorer. If not provided, the name of the function passed will be used. |
New in version 0.5.0
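The role of `default` can be sketched with plain callables. This is an assumption about how a caller might use the value, not pyDVL's actual implementation; `score_or_default` and `fragile_scorer` are hypothetical.

```python
import math


def score_or_default(scorer, model, X, y, default: float = math.nan) -> float:
    """Return the score, or `default` if scoring fails for any reason."""
    try:
        return float(scorer(model, X, y))
    except Exception:
        return default


def fragile_scorer(model, X, y) -> float:
    """Hypothetical scorer that fails when given too little data."""
    if len(y) < 2:
        raise ValueError("need at least two samples")
    return 1.0


ok = score_or_default(fragile_scorer, None, [[0], [1]], [0, 1], default=0.0)
fallback = score_or_default(fragile_scorer, None, [[0]], [0], default=0.0)
print(ok, fallback)  # 1.0 0.0
```

A finite default such as 0.0 keeps downstream averages well defined, whereas the `np.nan` default makes failed evaluations easy to detect.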
Source code in src/pydvl/utils/score.py
compose_score
compose_score(
    scorer: Scorer,
    transformation: Callable[[float], float],
    range: Tuple[float, float],
    name: str,
) -> Scorer
Composes a scoring function with an arbitrary scalar transformation.
Useful to squash unbounded scores into ranges manageable by data valuation methods.
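Example:
import numpy as np
from pydvl.utils.score import Scorer, compose_score

# Logistic function mapping any real score into (0, 1)
sigmoid = lambda x: 1 / (1 + np.exp(-x))
compose_score(Scorer("r2"), sigmoid, range=(0, 1), name="squashed r2")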
PARAMETER | DESCRIPTION |
---|---|
scorer | The object to be composed. TYPE: Scorer |
transformation | A scalar transformation. |
range | The range of the transformation. This will be used e.g. by Utility for the range of the composed scorer. |
name | A string representation for the composition, for [...] TYPE: str |
RETURNS | DESCRIPTION |
---|---|
Scorer | The composite Scorer. |
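The idea behind the composition can be sketched with plain functions. This is a minimal illustration, not pyDVL's implementation: `compose` and `raw_score` are hypothetical, and the real compose_score() also records the range and name on the returned Scorer.

```python
import math
from typing import Any, Callable

ScoreFn = Callable[[Any, Any, Any], float]


def compose(score_fn: ScoreFn, transformation: Callable[[float], float]) -> ScoreFn:
    """Return a scoring function that applies `transformation` to the raw score."""

    def composed(model, X, y) -> float:
        return transformation(score_fn(model, X, y))

    return composed


def raw_score(model, X, y) -> float:
    """Hypothetical unbounded score, fixed here for illustration."""
    return -3.0


squashed = compose(raw_score, lambda s: 1.0 / (1.0 + math.exp(-s)))
value = squashed(None, None, None)
print(value)
```

The transformation is applied to the scalar score only, so the model, data and labels pass through to the wrapped scorer unchanged.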