
Installing pyDVL

To install the latest release, use:

pip install pyDVL

You can also install the latest development version from TestPyPI:

pip install pyDVL --index-url https://test.pypi.org/simple/

To check the installation, run:

python -c "import pydvl; print(pydvl.__version__)"
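The same check can be done from a script without importing the package, by querying the installed distribution metadata. This is a minimal sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical and not part of pyDVL.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; it reads package metadata only
    and does not import the package itself.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pyDVL")
    if found is None:
        print("pyDVL is not installed in this environment")
    else:
        print(f"pyDVL {found}")
```

Returning `None` instead of raising makes the helper easy to use in setup scripts that need to decide whether to install the package.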

Dependencies

pyDVL requires Python >= 3.8, numpy, scikit-learn, scipy, and cvxpy for the core methods, and joblib for local parallelization. Additionally, the influence functions module requires PyTorch (see Extras).
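To see whether an environment meets these requirements before installing, something like the following sketch may help. It checks the interpreter version and looks up each core dependency by its import name (note that scikit-learn is imported as `sklearn`). The names `CORE_MODULES`, `python_is_supported` and `missing_modules` are hypothetical helpers, and the check does not verify package versions.

```python
import sys
from importlib.util import find_spec
from typing import Iterable, List, Tuple

# Import names of the core dependencies listed above (assumed mapping:
# scikit-learn is imported as "sklearn").
CORE_MODULES = ("numpy", "sklearn", "scipy", "cvxpy", "joblib")


def python_is_supported(minimum: Tuple[int, int] = (3, 8)) -> bool:
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found.

    find_spec locates a module without importing it, so this stays cheap
    even for large packages.
    """
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    print("Python version supported:", python_is_supported())
    print("Missing core modules:", missing_modules(CORE_MODULES) or "none")
```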

Extras

pyDVL has a few extra dependencies that can be optionally installed:

  • influence:

    To use all features of influence functions, install pyDVL with:

    pip install pyDVL[influence]
    

    This includes a dependency on PyTorch (version 2.0 and above), which is why it is left out by default.

  • cupy:

    If you have a supported version of CUDA installed (v11.2 to 11.8 as of this writing), you can enable eigenvalue computations for low-rank approximations with CuPy on the GPU by using:

    pip install pyDVL[cupy]
    

    This installs cupy-cuda11x.

    If you use a different version of CUDA, please install CuPy manually.

  • ray:

    If you want to use Ray to distribute data valuation workloads across nodes in a cluster, install pyDVL using the command below. Ray can be used locally as well, but for local use we recommend joblib instead.

    pip install pyDVL[ray]
    

    See [[getting-started#ray]] for more details on how to use it.

  • memcached:

    If you want to use Memcached for caching utility evaluations, use:

    pip install pyDVL[memcached]
    

    This additionally installs pymemcache.
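Several extras can be requested in a single command with pip's standard comma-separated syntax, for example `pip install "pyDVL[influence,ray]"`. The quotes keep shells such as zsh from interpreting the square brackets. The sketch below builds such a requirement string from the extras named above. `KNOWN_EXTRAS` and `requirement_with_extras` are hypothetical names, and whether every combination of extras installs cleanly together is an assumption not verified here.

```python
from typing import Iterable

# Extras described in this section; adjust if the package adds others.
KNOWN_EXTRAS = frozenset({"influence", "cupy", "ray", "memcached"})


def requirement_with_extras(extras: Iterable[str]) -> str:
    """Build a pip requirement string such as 'pyDVL[influence,ray]'.

    Extras are de-duplicated and sorted so the output is deterministic.
    Raises ValueError for names not listed in KNOWN_EXTRAS.
    """
    selected = sorted(set(extras))
    unknown = [name for name in selected if name not in KNOWN_EXTRAS]
    if unknown:
        raise ValueError("Unknown extras: " + ", ".join(unknown))
    if not selected:
        return "pyDVL"
    return "pyDVL[" + ",".join(selected) + "]"


if __name__ == "__main__":
    requirement = requirement_with_extras(["ray", "influence"])
    # Quote the requirement so the shell does not expand the brackets.
    print(f'pip install "{requirement}"')
```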