
pydvl.utils.types

This module contains types, protocols, decorators and generic function transformations. Some of it probably belongs elsewhere.

BaggingModel

Bases: Protocol

Any model with the attributes n_estimators and max_samples is considered a bagging model.

fit

fit(x: NDArray, y: NDArray | None)

Fit the model to the data

PARAMETER DESCRIPTION
x

Independent variables

TYPE: NDArray

y

Dependent variable

TYPE: NDArray | None

Source code in src/pydvl/utils/types.py
def fit(self, x: NDArray, y: NDArray | None):
    """Fit the model to the data

    Args:
        x: Independent variables
        y: Dependent variable
    """
    pass

predict

predict(x: NDArray) -> NDArray

Compute predictions for the input

PARAMETER DESCRIPTION
x

Independent variables for which to compute predictions

TYPE: NDArray

RETURNS DESCRIPTION
NDArray

Predictions for the input

Source code in src/pydvl/utils/types.py
def predict(self, x: NDArray) -> NDArray:
    """Compute predictions for the input

    Args:
        x: Independent variables for which to compute predictions

    Returns:
        Predictions for the input
    """
    pass
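
The structural check described for BaggingModel can be sketched with a runtime-checkable protocol. The block below is an illustration under stated assumptions, not pyDVL's definition: BaggingLike is a hypothetical local mirror of the documented members, the attribute types are guesses, and ToyBagger is an invented stand-in for a real bagging estimator.

```python
from __future__ import annotations

from typing import Protocol, runtime_checkable

import numpy as np
from numpy.typing import NDArray


@runtime_checkable
class BaggingLike(Protocol):
    """Hypothetical local mirror of the documented members (not pyDVL's class)."""

    n_estimators: int  # assumption: the attribute types are not documented
    max_samples: int | float

    def fit(self, x: NDArray, y: NDArray | None): ...

    def predict(self, x: NDArray) -> NDArray: ...


class ToyBagger:
    """Invented example: averages the means of bootstrap resamples of y."""

    def __init__(self, n_estimators: int = 5, max_samples: float = 0.8):
        self.n_estimators = n_estimators
        self.max_samples = max_samples
        self._means = []

    def fit(self, x: NDArray, y: NDArray | None):
        if y is None:
            raise ValueError("this toy model needs targets")
        rng = np.random.default_rng(0)
        size = max(1, int(self.max_samples * len(y)))
        # One resample (with replacement) per estimator.
        self._means = [
            float(rng.choice(y, size=size).mean()) for _ in range(self.n_estimators)
        ]

    def predict(self, x: NDArray) -> NDArray:
        return np.full(len(x), float(np.mean(self._means)))


model = ToyBagger()
# isinstance only checks that the members exist, not their types.
is_bagging = isinstance(model, BaggingLike)
plain_object_is_bagging = isinstance(object(), BaggingLike)
```

The match is structural: ToyBagger never subclasses BaggingLike. scikit-learn's BaggingClassifier and BaggingRegressor expose the same two attributes.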

BaseModel

Bases: Protocol

This is the minimal model protocol, requiring only the method fit().

fit

fit(x: NDArray, y: NDArray | None)

Fit the model to the data

PARAMETER DESCRIPTION
x

Independent variables

TYPE: NDArray

y

Dependent variable

TYPE: NDArray | None

Source code in src/pydvl/utils/types.py
def fit(self, x: NDArray, y: NDArray | None):
    """Fit the model to the data

    Args:
        x: Independent variables
        y: Dependent variable
    """
    pass

SupervisedModel

Bases: Protocol

This is the standard scikit-learn-style protocol with the methods fit(), predict() and score().

fit

fit(x: NDArray, y: NDArray | None)

Fit the model to the data

PARAMETER DESCRIPTION
x

Independent variables

TYPE: NDArray

y

Dependent variable

TYPE: NDArray | None

Source code in src/pydvl/utils/types.py
def fit(self, x: NDArray, y: NDArray | None):
    """Fit the model to the data

    Args:
        x: Independent variables
        y: Dependent variable
    """
    pass

predict

predict(x: NDArray) -> NDArray

Compute predictions for the input

PARAMETER DESCRIPTION
x

Independent variables for which to compute predictions

TYPE: NDArray

RETURNS DESCRIPTION
NDArray

Predictions for the input

Source code in src/pydvl/utils/types.py
def predict(self, x: NDArray) -> NDArray:
    """Compute predictions for the input

    Args:
        x: Independent variables for which to compute predictions

    Returns:
        Predictions for the input
    """
    pass

score

score(x: NDArray, y: NDArray | None) -> float

Compute the score of the model given test data

PARAMETER DESCRIPTION
x

Independent variables

TYPE: NDArray

y

Dependent variable

TYPE: NDArray | None

RETURNS DESCRIPTION
float

The score of the model on (x, y)

Source code in src/pydvl/utils/types.py
def score(self, x: NDArray, y: NDArray | None) -> float:
    """Compute the score of the model given test data

    Args:
        x: Independent variables
        y: Dependent variable

    Returns:
        The score of the model on `(x, y)`
    """
    pass
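
A protocol like this is typically used as a type annotation, so that any object with matching methods is accepted without inheritance. The following is a minimal sketch of that idea, not code from pyDVL: SupervisedLike is a hypothetical local mirror of the three documented signatures, and MeanRegressor and fit_and_score are invented for illustration.

```python
from __future__ import annotations

from typing import Protocol

import numpy as np
from numpy.typing import NDArray


class SupervisedLike(Protocol):
    """Hypothetical local mirror of the three documented methods."""

    def fit(self, x: NDArray, y: NDArray | None): ...

    def predict(self, x: NDArray) -> NDArray: ...

    def score(self, x: NDArray, y: NDArray | None) -> float: ...


class MeanRegressor:
    """Toy model: always predicts the mean of the training targets."""

    def __init__(self) -> None:
        self.mean_ = 0.0

    def fit(self, x: NDArray, y: NDArray | None):
        if y is None:
            raise ValueError("MeanRegressor needs targets")
        self.mean_ = float(np.mean(y))

    def predict(self, x: NDArray) -> NDArray:
        return np.full(len(x), self.mean_)

    def score(self, x: NDArray, y: NDArray | None) -> float:
        if y is None:
            raise ValueError("MeanRegressor needs targets")
        # Negative mean squared error, so that higher is better.
        return -float(np.mean((self.predict(x) - y) ** 2))


def fit_and_score(model: SupervisedLike, x_train, y_train, x_test, y_test) -> float:
    """Works with any object that provides fit(), predict() and score()."""
    model.fit(x_train, y_train)
    return model.score(x_test, y_test)


x = np.arange(4, dtype=float).reshape(-1, 1)
y = np.array([1.0, 2.0, 3.0, 4.0])
result = fit_and_score(MeanRegressor(), x, y, x, y)
```

A static type checker accepts MeanRegressor wherever SupervisedLike is expected because the method signatures line up; nothing is verified at runtime here.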

ensure_seed_sequence

ensure_seed_sequence(
    seed: Optional[Union[Seed, SeedSequence]] = None,
) -> SeedSequence

If the passed seed is a SeedSequence object, it is returned as is. If it is a Generator, the generator's internal seed sequence is extracted. Otherwise, a new SeedSequence object is created from the passed (optional) seed.

PARAMETER DESCRIPTION
seed

Either an int, a Generator object, a SeedSequence object or None.

TYPE: Optional[Union[Seed, SeedSequence]] DEFAULT: None

RETURNS DESCRIPTION
SeedSequence

A SeedSequence object.

New in version 0.7.0

Source code in src/pydvl/utils/types.py
def ensure_seed_sequence(
    seed: Optional[Union[Seed, SeedSequence]] = None,
) -> SeedSequence:
    """
    If the passed seed is a SeedSequence object then it is returned as is. If it is
    a Generator the internal protected seed sequence from the generator gets extracted.
    Otherwise, a new SeedSequence object is created from the passed (optional) seed.

    Args:
        seed: Either an int, a Generator object, a SeedSequence object or None.

    Returns:
        A SeedSequence object.

    !!! tip "New in version 0.7.0"
    """
    if isinstance(seed, SeedSequence):
        return seed
    elif isinstance(seed, Generator):
        return cast(SeedSequence, seed.bit_generator.seed_seq)  # type: ignore
    else:
        return SeedSequence(seed)

validate_number

validate_number(
    name: str,
    value: Any,
    dtype: Type[T],
    lower: T | None = None,
    upper: T | None = None,
) -> T

Ensure that the value is of the given type and within the given bounds.

For int and float types, this function is lenient with numpy numeric types and will convert them to the appropriate Python type as long as no precision is lost.

PARAMETER DESCRIPTION
name

The name of the variable to validate.

TYPE: str

value

The value to validate.

TYPE: Any

dtype

The type to convert the value to.

TYPE: Type[T]

lower

The lower bound for the value (inclusive).

TYPE: T | None DEFAULT: None

upper

The upper bound for the value (inclusive).

TYPE: T | None DEFAULT: None

RAISES DESCRIPTION
TypeError

If the value is not a number, i.e. not an int, a float or a NumPy numeric type.

ValueError

If the value is not within the given bounds, if there is precision loss, e.g. when forcing a float to an int, or if dtype is not a valid scalar type.

Source code in src/pydvl/utils/types.py
def validate_number(
    name: str,
    value: Any,
    dtype: Type[T],
    lower: T | None = None,
    upper: T | None = None,
) -> T:
    """Ensure that the value is of the given type and within the given bounds.

    For int and float types, this function is lenient with numpy numeric types and
    will convert them to the appropriate Python type as long as no precision is lost.

    Args:
        name: The name of the variable to validate.
        value: The value to validate.
        dtype: The type to convert the value to.
        lower: The lower bound for the value (inclusive).
        upper: The upper bound for the value (inclusive).

    Raises:
        TypeError: If the value is not of the given type.
        ValueError: If the value is not within the given bounds, if there is precision
            loss, e.g. when forcing a float to an int, or if `dtype` is not a valid
            scalar type.
    """
    if not isinstance(value, (int, float, np.number)):
        raise TypeError(f"'{name}' is not a number, it is {type(value).__name__}")
    if not issubclass(dtype, (np.number, int, float)):
        raise ValueError(f"type '{dtype}' is not a valid scalar type")

    converted = dtype(value)
    if not np.isnan(converted) and not np.isclose(converted, value, rtol=0, atol=0):
        raise ValueError(
            f"'{name}' cannot be converted to {dtype.__name__} without precision loss"
        )
    value = cast(T, converted)

    if lower is not None and value < lower:  # type: ignore
        raise ValueError(f"'{name}' is {value}, but it should be >= {lower}")
    if upper is not None and value > upper:  # type: ignore
        raise ValueError(f"'{name}' is {value}, but it should be <= {upper}")
    return value
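
The precision-loss rule used in the listing can be looked at on its own. The helper below, converts_exactly, is hypothetical and not part of pyDVL; it only reproduces the conversion check, assuming the same exact comparison via np.isclose with both tolerances set to zero.

```python
import numpy as np


def converts_exactly(value, dtype) -> bool:
    """Hypothetical helper: True if dtype(value) leaves the value unchanged."""
    converted = dtype(value)
    # NaN never compares equal to itself, so it is exempted, as in the listing.
    if np.isnan(converted):
        return True
    # rtol=0 and atol=0 turn np.isclose into an exact comparison.
    return bool(np.isclose(converted, value, rtol=0, atol=0))


float_to_int_exact = converts_exactly(np.float64(3.0), int)  # 3.0 -> 3
float_to_int_lossy = converts_exactly(2.5, int)  # 2.5 -> 2 drops the fraction
int_to_float_exact = converts_exactly(np.int64(7), float)  # 7 -> 7.0
```

In validate_number, a failed check of this kind raises ValueError before the lower and upper bounds are compared.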