pydvl.influence.torch.batch_operation
¶
This module contains abstractions and implementations for operations carried out on a batch \(b\). These operations are of the form

$$ m(b) \cdot v, $$

where \(m(b)\) is a matrix defined by the data in the batch and \(v\) is a vector or matrix. These batch operations can be used to conveniently build aggregations or recursions over a sequence of batches, e.g. an average of the form

$$ \frac{1}{|B|} \sum_{b \in B} m(b) \cdot v, $$

which is useful when keeping \(B\) in memory is not feasible.
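For example, a running average of Hessian-vector products over the batches of a data loader could be built as follows. This is an illustrative sketch, not part of the documented API; in particular, the import path of `TorchBatch` and its construction from a pair of tensors `(x, y)` are assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from pydvl.influence.torch.batch_operation import HessianBatchOperation
from pydvl.influence.torch.base import TorchBatch  # assumed import path

model = torch.nn.Linear(5, 1)
loss = torch.nn.functional.mse_loss

loader = DataLoader(
    TensorDataset(torch.randn(100, 5), torch.randn(100, 1)), batch_size=25
)

hvp = HessianBatchOperation(model, loss)
v = torch.randn(sum(p.numel() for p in model.parameters()))

# Running average of H(b) @ v over the batches b, without keeping B in memory.
avg = torch.zeros_like(v)
for n, (x, y) in enumerate(loader, start=1):
    avg += (hvp.apply(TorchBatch(x, y), v) - avg) / n
```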
            ChunkAveraging
¶
    
              Bases: _TensorAveraging[_TensorDictChunkAveraging]
Averages tensors, provided by a generator, and normalizes by the number of tensors.
            GaussNewtonBatchOperation
¶
GaussNewtonBatchOperation(
    model: Module,
    loss: LossType,
    restrict_to: Optional[Dict[str, Parameter]] = None,
)
              Bases: _ModelBasedBatchOperation
Given a model and loss function, computes the Gauss-Newton vector or matrix product with respect to the model parameters \(\theta\), i.e.

$$\begin{align*}
G(\text{model}, \text{loss}, b, \theta) &\cdot v, \\
G(\text{model}, \text{loss}, b, \theta) &= \frac{1}{|b|}\sum_{(x, y) \in b}\nabla_{\theta}\ell(x, y; \theta) \nabla_{\theta}\ell(x, y; \theta)^t, \\
\ell(x, y; \theta) &= \text{loss}(\text{model}(x; \theta), y),
\end{align*}$$

where model is a torch.nn.Module and \(v\) is a vector or matrix.
| PARAMETER | DESCRIPTION |
|---|---|
| `model` | The model. TYPE: `Module` |
| `loss` | The loss function. TYPE: `LossType` |
| `restrict_to` | The parameters to restrict the differentiation to, i.e. the corresponding sub-matrix of the Jacobian. If None, the full Jacobian is used. Make sure the input matches the correct dimension, i.e. the last dimension must be equal to the property `input_size`. TYPE: `Optional[Dict[str, Parameter]]` DEFAULT: `None` |
Source code in src/pydvl/influence/torch/batch_operation.py
                    
            apply
¶
apply(batch: TorchBatch, tensor: Tensor) -> Tensor
Applies the batch operation to a tensor.
Args:
    batch: Batch of data for computation
    tensor: A tensor consistent with the operation, i.e. it must be
        at most 2-dim, and its trailing dimension must
        be equal to the property input_size.
| RETURNS | DESCRIPTION |
|---|---|
| `Tensor` | A tensor after applying the batch operation |
Source code in src/pydvl/influence/torch/batch_operation.py
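Example (a sketch; the `TorchBatch` import path and construction are assumed, as in the module-level example above):

```python
import torch

from pydvl.influence.torch.batch_operation import GaussNewtonBatchOperation
from pydvl.influence.torch.base import TorchBatch  # assumed import path

model = torch.nn.Linear(3, 1)
loss = torch.nn.functional.mse_loss

x, y = torch.randn(10, 3), torch.randn(10, 1)
# A single vector; its last dimension must equal the number of parameters (here 4).
v = torch.randn(sum(p.numel() for p in model.parameters()))

gn = GaussNewtonBatchOperation(model, loss)
gv = gn.apply(TorchBatch(x, y), v)  # G(b) @ v for this batch
```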
              
            HessianBatchOperation
¶
HessianBatchOperation(
    model: Module,
    loss: LossType,
    restrict_to: Optional[Dict[str, Parameter]] = None,
)
              Bases: _ModelBasedBatchOperation
Given a model and loss function, computes the Hessian vector or matrix product with respect to the model parameters \(\theta\), i.e.

$$\begin{align*}
&\nabla^2_{\theta} L(b; \theta) \cdot v, \\
&L(b; \theta) = \frac{1}{|b|}\sum_{(x, y) \in b} \text{loss}(\text{model}(x; \theta), y),
\end{align*}$$

where model is a torch.nn.Module and \(v\) is a vector or matrix.
| PARAMETER | DESCRIPTION |
|---|---|
| `model` | The model. TYPE: `Module` |
| `loss` | The loss function. TYPE: `LossType` |
| `restrict_to` | The parameters to restrict the second order differentiation to, i.e. the corresponding sub-matrix of the Hessian. If None, the full Hessian is used. Make sure the input matches the correct dimension, i.e. the last dimension must be equal to the property `input_size`. TYPE: `Optional[Dict[str, Parameter]]` DEFAULT: `None` |
Source code in src/pydvl/influence/torch/batch_operation.py
                    
            apply
¶
apply(batch: TorchBatch, tensor: Tensor) -> Tensor
Applies the batch operation to a tensor.
Args:
    batch: Batch of data for computation
    tensor: A tensor consistent with the operation, i.e. it must be
        at most 2-dim, and its trailing dimension must
        be equal to the property input_size.
| RETURNS | DESCRIPTION |
|---|---|
| `Tensor` | A tensor after applying the batch operation |
Source code in src/pydvl/influence/torch/batch_operation.py
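Example of restricting the second order differentiation to a subset of the parameters (a sketch; that `restrict_to` keys should match the names from `named_parameters()` is an assumption, as is `TorchBatch` above):

```python
import torch

from pydvl.influence.torch.batch_operation import HessianBatchOperation
from pydvl.influence.torch.base import TorchBatch  # assumed import path

model = torch.nn.Linear(3, 1)
loss = torch.nn.functional.mse_loss

# Restrict to the bias only, i.e. the corresponding 1x1 sub-matrix of the Hessian.
restrict = {name: p for name, p in model.named_parameters() if name == "bias"}
hvp = HessianBatchOperation(model, loss, restrict_to=restrict)

x, y = torch.randn(10, 3), torch.randn(10, 1)
v = torch.randn(1)  # last dimension must match the size of the restricted parameters
hv = hvp.apply(TorchBatch(x, y), v)
```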
              
            InverseHarmonicMeanBatchOperation
¶
InverseHarmonicMeanBatchOperation(
    model: Module,
    loss: Callable[[Tensor, Tensor], Tensor],
    regularization: float,
    restrict_to: Optional[Dict[str, Parameter]] = None,
)
              Bases: _ModelBasedBatchOperation
Given a model and loss function, computes an approximation of the inverse Gauss-Newton vector or matrix product. Viewing the damped Gauss-Newton matrix

$$\begin{align*}
G_{\lambda}(\text{model}, \text{loss}, b, \theta) &= \frac{1}{|b|}\sum_{(x, y) \in b}\nabla_{\theta}\ell(x, y; \theta) \nabla_{\theta}\ell(x, y; \theta)^t + \lambda \operatorname{I}, \\
\ell(x, y; \theta) &= \text{loss}(\text{model}(x; \theta), y)
\end{align*}$$

as an arithmetic mean of the rank-\(1\) updates, this operation replaces it with the harmonic mean of the rank-\(1\) updates, i.e.

$$ \hat{G}_{\lambda}(\text{model}, \text{loss}, b, \theta) = \left(\frac{1}{|b|}\sum_{(x, y) \in b} \left( \nabla_{\theta}\ell(x, y; \theta) \nabla_{\theta}\ell(x, y; \theta)^t + \lambda \operatorname{I}\right)^{-1}\right)^{-1}, $$

and computes

$$ \hat{G}_{\lambda}^{-1}(\text{model}, \text{loss}, b, \theta) \cdot v, $$

where model is a torch.nn.Module and \(v\) is a vector or matrix.
In other words, it switches the order of summation and inversion, which resolves
to the inverse harmonic mean of the rank-\(1\) updates.
The inverses of the rank-\(1\) updates are not calculated explicitly, but instead a vectorized version of the Sherman–Morrison formula is applied.
For more information, see Inverse Harmonic Mean.
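To make the Sherman–Morrison step concrete: for a single damped rank-\(1\) update, the inverse applied to a vector has the closed form

$$ \left(\lambda \operatorname{I} + g g^t\right)^{-1} u = \frac{1}{\lambda}\left(u - \frac{g^t u}{\lambda + g^t g}\, g\right), $$

so only inner products are needed and no matrix is ever formed or inverted explicitly. A small numeric check of this identity in plain PyTorch (not using the class itself):

```python
import torch

torch.manual_seed(0)
d, lam = 5, 0.1
g, u = torch.randn(d), torch.randn(d)

# Direct inversion of the damped rank-1 update.
direct = torch.linalg.solve(lam * torch.eye(d) + torch.outer(g, g), u)

# Sherman-Morrison closed form: only inner products are required.
sm = (u - g * (g @ u) / (lam + g @ g)) / lam

assert torch.allclose(direct, sm, atol=1e-5)
```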
| PARAMETER | DESCRIPTION |
|---|---|
| `model` | The model. TYPE: `Module` |
| `loss` | The loss function. TYPE: `Callable[[Tensor, Tensor], Tensor]` |
| `regularization` | The regularization parameter \(\lambda\) added to the Gauss-Newton matrix. TYPE: `float` |
| `restrict_to` | The parameters to restrict the differentiation to, i.e. the corresponding sub-matrix of the Jacobian. If None, the full Jacobian is used. Make sure the input matches the correct dimension, i.e. the last dimension must be equal to the property `input_size`. TYPE: `Optional[Dict[str, Parameter]]` DEFAULT: `None` |
Source code in src/pydvl/influence/torch/batch_operation.py
                    
            apply
¶
apply(batch: TorchBatch, tensor: Tensor) -> Tensor
Applies the batch operation to a tensor.
Args:
    batch: Batch of data for computation
    tensor: A tensor consistent with the operation, i.e. it must be
        at most 2-dim, and its trailing dimension must
        be equal to the property input_size.
| RETURNS | DESCRIPTION |
|---|---|
| `Tensor` | A tensor after applying the batch operation |
Source code in src/pydvl/influence/torch/batch_operation.py
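Example (a sketch; `TorchBatch` assumed as above):

```python
import torch

from pydvl.influence.torch.batch_operation import InverseHarmonicMeanBatchOperation
from pydvl.influence.torch.base import TorchBatch  # assumed import path

model = torch.nn.Linear(3, 1)
loss = torch.nn.functional.mse_loss

op = InverseHarmonicMeanBatchOperation(model, loss, regularization=0.1)

x, y = torch.randn(10, 3), torch.randn(10, 1)
# A matrix of 7 stacked vectors; the last dimension must equal the number of parameters.
V = torch.randn(7, sum(p.numel() for p in model.parameters()))
result = op.apply(TorchBatch(x, y), V)  # approximate inverse Gauss-Newton product, row-wise
```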
              
            PointAveraging
¶
PointAveraging(batch_dim: int = 0)
              Bases: _TensorAveraging[_TensorDictPointAveraging]
Averages tensors provided by a generator. The averaging is weighted by the number of points in each tensor, and the final result is normalized by the total number of points.
| PARAMETER | DESCRIPTION |
|---|---|
| `batch_dim` | Dimension to extract the number of points for the weighting. TYPE: `int` DEFAULT: `0` |