pydvl.valuation.games
¶
This module provides several predefined games and, depending on the game, the corresponding Shapley values, Least-Core values or both of them, for benchmarking purposes.
References¶
-
Castro, J., Gómez, D. and Tejada, J., 2009. Polynomial calculation of the Shapley value based on sampling. Computers & Operations Research, 36(5), pp.1726-1730. ↩
DummyGameDataset
¶
Bases: Dataset
Dummy game dataset.
Initializes a dummy game dataset with n_players and an optional description.
This class is used internally inside the Game class.
PARAMETER | DESCRIPTION |
---|---|
n_players
|
Number of players that participate in the game.
TYPE:
|
description
|
Optional description of the dataset.
TYPE:
|
Source code in src/pydvl/valuation/games.py
indices
property
¶
Index of positions in data.x_train.
Contiguous integers from 0 to len(Dataset).
names
property
¶
Names of each individual datapoint.
Used for reporting Shapley values.
feature
¶
Returns a slice for the feature with the given name.
Source code in src/pydvl/valuation/dataset.py
data
¶
Given a set of indices, returns the training data that refer to those indices, as a read-only tuple-like structure.
This is used mainly by subclasses of UtilityBase to retrieve subsets of the data from indices.
PARAMETER | DESCRIPTION |
---|---|
indices
|
Optional indices that will be used to select points from
the training data. If
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
RawData
|
If |
Source code in src/pydvl/valuation/dataset.py
data_indices
¶
Returns a subset of indices.
This is equivalent to using Dataset.indices[logical_indices]
but allows
subclasses to define special behaviour, e.g. when indices in Dataset
do not
match the indices in the data.
For Dataset
, this is a simple pass-through.
PARAMETER | DESCRIPTION |
---|---|
indices
|
A set of indices held by this object |
RETURNS | DESCRIPTION |
---|---|
NDArray[int_]
|
The indices of the data points in the data array. |
Source code in src/pydvl/valuation/dataset.py
logical_indices
¶
Returns the indices in this Dataset
for the given indices in the data array.
This is equivalent to using Dataset.indices[data_indices]
but allows
subclasses to define special behaviour, e.g. when indices in Dataset
do not
match the indices in the data.
PARAMETER | DESCRIPTION |
---|---|
indices
|
A set of indices in the data array. |
RETURNS | DESCRIPTION |
---|---|
NDArray[int_]
|
The abstract indices for the given data indices. |
Source code in src/pydvl/valuation/dataset.py
from_sklearn
classmethod
¶
from_sklearn(
data: Bunch,
train_size: int | float = 0.8,
random_state: int | None = None,
stratify_by_target: bool = False,
**kwargs,
) -> tuple[Dataset, Dataset]
Constructs two Dataset objects from a
sklearn.utils.Bunch, as returned by the load_*
functions in scikit-learn toy datasets.
Example
PARAMETER | DESCRIPTION |
---|---|
data
|
scikit-learn Bunch object. The following attributes are supported:
TYPE:
|
train_size
|
size of the training dataset. Used in |
the value is automatically set to the complement of the test size.
random_state: seed for train / test split
stratify_by_target: If True
, data is split in a stratified
fashion, using the target variable as labels. Read more in
scikit-learn's user guide.
kwargs: Additional keyword arguments to pass to the
Dataset constructor. Use this to pass e.g. is_multi_output
.
RETURNS | DESCRIPTION |
---|---|
tuple[Dataset, Dataset]
|
Object with the sklearn dataset |
Changed in version 0.6.0
Added kwargs to pass to the Dataset constructor.
Changed in version 0.10.0
Returns a tuple of two Dataset objects.
Source code in src/pydvl/valuation/dataset.py
from_arrays
classmethod
¶
from_arrays(
X: NDArray,
y: NDArray,
train_size: float = 0.8,
random_state: int | None = None,
stratify_by_target: bool = False,
**kwargs: Any,
) -> tuple[Dataset, Dataset]
Constructs a Dataset object from X and y numpy arrays as
returned by the make_*
functions in sklearn generated datasets.
Example
PARAMETER | DESCRIPTION |
---|---|
X
|
numpy array of shape (n_samples, n_features)
TYPE:
|
y
|
numpy array of shape (n_samples,)
TYPE:
|
train_size
|
size of the training dataset. Used in
TYPE:
|
random_state
|
seed for train / test split
TYPE:
|
stratify_by_target
|
If
TYPE:
|
kwargs
|
Additional keyword arguments to pass to the
Dataset constructor. Use this to pass
e.g.
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
tuple[Dataset, Dataset]
|
Object with the passed X and y arrays split across training and test sets. |
New in version 0.4.0
Changed in version 0.6.0
Added kwargs to pass to the Dataset constructor.
Changed in version 0.10.0
Returns a tuple of two Dataset objects.
Source code in src/pydvl/valuation/dataset.py
DummyModel
¶
Game
¶
Game(
n_players: int,
score_range: tuple[float, float] = (-inf, inf),
description: str | None = None,
)
Bases: ABC
Base class for games
Any Game subclass has to implement the abstract _score
method
to assign a score to each coalition/subset and at least
one of shapley_values
, least_core_values
, or banzhaf_values
.
PARAMETER | DESCRIPTION |
---|---|
n_players
|
Number of players that participate in the game.
TYPE:
|
score_range
|
Minimum and maximum values of the |
description
|
Optional string description of the dummy dataset that will be created.
TYPE:
|
ATTRIBUTE | DESCRIPTION |
---|---|
n_players |
Number of players that participate in the game.
|
data |
Dummy dataset object.
|
u |
Utility object with a dummy model and dataset.
|
Source code in src/pydvl/valuation/games.py
SymmetricVotingGame
¶
SymmetricVotingGame(n_players: int)
Bases: Game
Toy game that is used for testing and demonstration purposes.
A symmetric voting game defined in (Castro et al., 2009)1 Section 4.1
For this game the utility of a coalition is 1 if its cardinality is greater than num_samples/2, or 0 otherwise.
PARAMETER | DESCRIPTION |
---|---|
n_players
|
Number of players that participate in the game.
TYPE:
|
Source code in src/pydvl/valuation/games.py
AsymmetricVotingGame
¶
AsymmetricVotingGame(n_players: int = 51)
Bases: Game
Toy game that is used for testing and demonstration purposes.
An asymmetric voting game defined in (Castro et al., 2009)1 Section 4.2.
For this game the player set is \(N = \{1,\dots,51\}\) and the utility of a coalition is given by:
where \(w = [w_1,\dots, w_{51}]\) is a list of weights associated with each player.
PARAMETER | DESCRIPTION |
---|---|
n_players
|
Number of players that participate in the game.
TYPE:
|
Source code in src/pydvl/valuation/games.py
257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 |
|
ShoesGame
¶
Bases: Game
Toy game that is used for testing and demonstration purposes.
A shoes game defined in (Castro et al., 2009)1.
In this game, some players have a left shoe and others a right shoe.
The payoff (utility) of a coalition \(S\) is:
Where \(L\), respectively \(R\), is the set of players with left shoes, respectively right shoes. This means that the marginal contribution of a player with a left shoe to a coalition \(S\) is 1 if the number of players with a left shoe in \(S\) is strictly less than the number of players with a right shoe in \(S\), and 0 otherwise. Let player \(i\) have a left shoe, then:
The situation is analogous for players with a right shoe. In order to compute the Shapley or Banzhaf value for a player \(i\) with a left shoe, we need then the number of subsets \(S\) of \(D_{-i}\) such that \(\mid S \cap L \mid < \mid S \cap R \mid\). This number is given by the sum:
PARAMETER | DESCRIPTION |
---|---|
left
|
Number of players with a left shoe.
TYPE:
|
right
|
Number of players with a right shoe.
TYPE:
|
Source code in src/pydvl/valuation/games.py
shapley_values
cached
¶
shapley_values() -> ValuationResult
We use the fact that the marginal utility of a coalition S of size k is 1 if |S ∩ L| < |S ∩ R| and 0 otherwise, and compute Shapley values with the formula that iterates over subset sizes.
The solution for left or right shoes is symmetrical
Source code in src/pydvl/valuation/games.py
banzhaf_values
cached
¶
banzhaf_values() -> ValuationResult
We use the fact that the marginal utility of a coalition S is 1 if |S ∩ L| < |S ∩ R| and 0 otherwise, and simply count those sets.
The solution for left or right shoes is symmetrical.
Source code in src/pydvl/valuation/games.py
AirportGame
¶
AirportGame(n_players: int = 100)
Bases: Game
Toy game that is used for testing and demonstration purposes.
An airport game defined in (Castro et al., 2009)1 Section 4.3
PARAMETER | DESCRIPTION |
---|---|
n_players
|
Number of players that participate in the game.
TYPE:
|
Source code in src/pydvl/valuation/games.py
MinimumSpanningTreeGame
¶
MinimumSpanningTreeGame(n_players: int = 100)
Bases: Game
Toy game that is used for testing and demonstration purposes.
A minimum spanning tree game defined in (Castro et al., 2009)1.
Let \(G = (N \cup \{0\},E)\) be a valued graph where \(N = \{1,\dots,100\}\), and the cost associated to an edge \((i, j)\) is:
A minimum spanning tree game \((N, c)\) is a cost game, where for a given coalition \(S \subset N\), \(v(S)\) is the sum of the edge cost of the minimum spanning tree, i.e. \(v(S)\) = Minimum Spanning Tree of the graph \(G|_{S\cup\{0\}}\), which is the partial graph restricted to the players \(S\) and the source node \(0\).
PARAMETER | DESCRIPTION |
---|---|
n_players
|
Number of players that participate in the game.
TYPE:
|
Source code in src/pydvl/valuation/games.py
MinerGame
¶
MinerGame(n_players: int)
Bases: Game
Toy game that is used for testing and demonstration purposes.
Consider a group of n miners, who have discovered large bars of gold.
If two miners can carry one piece of gold, then the payoff of a coalition \(S\) is:
If there are more than two miners and there is an even number of miners, then the core consists of the single payoff where each miner gets 1/2.
If there is an odd number of miners, then the core is empty.
Taken from Wikipedia
PARAMETER | DESCRIPTION |
---|---|
n_players
|
Number of miners that participate in the game.
TYPE:
|