Installing pyDVL

To install the latest release, run:

pip install pyDVL

To use all features of the influence functions module, install the influence extra instead:

pip install pyDVL[influence]

This extra depends on PyTorch (version 2.0 or later), which is why it is not installed by default.

If you have a supported version of CUDA installed (v11.2 to 11.8 as of this writing), you can enable GPU eigenvalue computations for low-rank approximations with CuPy by using:

pip install pyDVL[cupy]

If you use a different version of CUDA, please install CuPy manually.

To check the installation, run:

python -c "import pydvl; print(pydvl.__version__)"
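The same check can be done without importing the package, which may help in scripts that should not fail when pyDVL is absent. The following is a minimal sketch using only the standard library (importlib.metadata, available from Python 3.8); the helper name installed_version is hypothetical and not part of pyDVL.

```python
from importlib import metadata


def installed_version(dist_name="pyDVL"):
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is an illustrative helper, not a pyDVL function.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("pyDVL is not installed in this environment")
    else:
        print(f"pyDVL {version}")
```

Because it reads package metadata instead of importing pydvl, this reports the version even when an optional dependency fails to import.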

You can also install the latest development version from TestPyPI:

pip install pyDVL --index-url https://test.pypi.org/simple/

Dependencies

pyDVL requires Python >= 3.8, Memcached for caching, and Ray for parallelization in a cluster (locally it uses joblib). Additionally, the influence functions module requires PyTorch (see Installing pyDVL).

ray is used to distribute workloads across nodes in a cluster (it can also be used locally, but for that we recommend joblib instead). Please follow the instructions in the Ray documentation to set up the cluster. Once you have a running cluster, you can use it by passing the address of the head node to parallel methods via ParallelConfig.
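As a quick sanity check of the requirements above, the sketch below reports whether the running interpreter meets the minimum Python version and which of the Python packages mentioned in this guide can be found in the current environment. It uses only the standard library; the helper names and the MODULES tuple are illustrative assumptions, not part of pyDVL.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum Python version stated in this guide

# Import names of the Python packages mentioned in this guide.
MODULES = ("joblib", "ray", "torch", "cupy")


def python_is_supported(version=None):
    """Return True if `version` (default: the running interpreter) is >= 3.8."""
    version = sys.version_info[:2] if version is None else tuple(version)[:2]
    return version >= MIN_PYTHON


def available_modules(names=MODULES):
    """Map each top-level module name to True if it can be located.

    The modules are only looked up, not imported.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    for name, found in available_modules().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

A missing ray, torch, or cupy entry is not an error in itself, since each is needed only for the corresponding feature.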

Setting up the cache

memcached is an in-memory key-value store accessible over the network. pyDVL uses it to cache evaluations of the utility function and so speed up some computations (in particular, semi-value computations with the PermutationSampler, but other methods may benefit as well).

You can either install it as a package or run it inside a docker container (the simplest option). For installation instructions, refer to the Getting started section in memcached's wiki. Once it is installed, you can run it with:

memcached -u user

To run memcached inside a container in detached mode instead, run:

docker container run -d --rm -p 11211:11211 memcached:latest
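To confirm that the server is actually listening before running any cached computation, a small probe can help. The sketch below assumes memcached is reachable on localhost at port 11211 (the port published in the docker command above) and sends the version command of memcached's text protocol; the function name memcached_is_reachable is hypothetical and not part of pyDVL.

```python
import socket


def memcached_is_reachable(host="localhost", port=11211, timeout=2.0):
    """Return True if a server at host:port answers memcached's `version` command.

    Illustrative helper: host and port defaults are assumptions matching
    the commands shown in this guide.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"version\r\n")
            reply = conn.recv(128)
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
    # A memcached server replies with a line starting with "VERSION".
    return reply.startswith(b"VERSION")


if __name__ == "__main__":
    if memcached_is_reachable():
        print("memcached is up")
    else:
        print("memcached is not reachable on localhost:11211")
```

If the probe fails right after starting the container, the server may still be starting up, so retrying after a short delay is reasonable.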

Using the cache

Continue reading about the cache in the First Steps and the documentation for the caching module.


Last update: 2023-09-01
Created: 2023-05-15