Methods

We currently implement the following methods:

Data valuation

Influence functions


  1. Watson, L., Kujawa, Z., Andreeva, R., Yang, H.-T., Elahi, T., Sarkar, R., 2023. Accelerated Shapley Value Approximation for Data Evaluation. https://doi.org/10.48550/arXiv.2311.05346 

  2. Kwon, Y., Zou, J., 2022. Beta Shapley: A Unified and Noise-reduced Data Valuation Framework for Machine Learning, in: Proceedings of the 25th International Conference on Artificial Intelligence and Statistics (AISTATS) 2022. Presented at the AISTATS 2022, PMLR, Valencia, Spain. 

  3. Schoch, S., Xu, H., Ji, Y., 2022. CS-Shapley: Class-wise Shapley Values for Data Valuation in Classification, in: Proc. of the Thirty-Sixth Conference on Neural Information Processing Systems (NeurIPS). Presented at the Advances in Neural Information Processing Systems (NeurIPS 2022), New Orleans, Louisiana, USA. 

  4. Wang, J.T., Jia, R., 2023. Data Banzhaf: A Robust Data Valuation Framework for Machine Learning, in: Proceedings of The 26th International Conference on Artificial Intelligence and Statistics. Presented at the International Conference on Artificial Intelligence and Statistics, PMLR, pp. 6388--6421. 

  5. Wang, T., Yang, Y., Jia, R., 2022. Improving Cooperative Game Theory-based Data Valuation via Data Utility Learning. Presented at the International Conference on Learning Representations (ICLR 2022). Workshop on Socially Responsible Machine Learning, arXiv. https://doi.org/10.48550/arXiv.2107.06336 

  6. Kwon, Y., Zou, J., 2023. Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value, in: Proceedings of the 40th International Conference on Machine Learning. Presented at the International Conference on Machine Learning, PMLR, pp. 18135--18152. 

  7. Jia, R., Dao, D., Wang, B., Hubis, F.A., Hynes, N., Gürel, N.M., Li, B., Zhang, C., Song, D., Spanos, C.J., 2019. Towards Efficient Data Valuation Based on the Shapley Value, in: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics. Presented at the International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR, pp. 1167--1176. 

  8. Jia, R., Dao, D., Wang, B., Hubis, F.A., Gürel, N.M., Li, B., Zhang, C., Spanos, C., Song, D., 2019. Efficient task-specific data valuation for nearest neighbor algorithms. Proc. VLDB Endow. 12, 1610--1623. https://doi.org/10.14778/3342263.3342637 

  9. Yan, T., Procaccia, A.D., 2021. If You Like Shapley Then You'll Love the Core, in: Proceedings of the 35th AAAI Conference on Artificial Intelligence. Presented at the AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence, Virtual conference, pp. 5751--5759. https://doi.org/10.1609/aaai.v35i6.16721 

  10. Okhrati, R., Lipani, A., 2021. A Multilinear Sampling Algorithm to Estimate Shapley Values, in: 2020 25th International Conference on Pattern Recognition (ICPR). Presented at the 2020 25th International Conference on Pattern Recognition (ICPR), IEEE, pp. 7992--7999. https://doi.org/10.1109/ICPR48806.2021.9412511 

  11. Castro, J., Gómez, D., Tejada, J., 2009. Polynomial calculation of the Shapley value based on sampling. Computers & Operations Research, Selected papers presented at the Tenth International Symposium on Locational Decisions (ISOLDE X) 36, 1726--1730. https://doi.org/10.1016/j.cor.2008.04.004 

  12. Ghorbani, A., Zou, J., 2019. Data Shapley: Equitable Valuation of Data for Machine Learning, in: Proceedings of the 36th International Conference on Machine Learning. Presented at the International Conference on Machine Learning (ICML 2019), PMLR, pp. 2242--2251. 

  13. Wu, M., Jia, R., Lin, C., Huang, W., Chang, X., 2023. Variance reduced Shapley value estimation for trustworthy data valuation. Computers & Operations Research 159, 106305. https://doi.org/10.1016/j.cor.2023.106305 

  14. Koh, P.W., Liang, P., 2017. Understanding Black-box Predictions via Influence Functions, in: Proceedings of the 34th International Conference on Machine Learning. Presented at the International Conference on Machine Learning, PMLR, pp. 1885--1894. 

  15. Agarwal, N., Bullins, B., Hazan, E., 2017. Second-Order Stochastic Optimization for Machine Learning in Linear Time. JMLR 18, 1--40. 

  16. Schioppa, A., Zablotskaia, P., Vilar, D., Sokolov, A., 2022. Scaling Up Influence Functions. Proc. AAAI Conf. Artif. Intell. 36, 8179--8186. https://doi.org/10.1609/aaai.v36i8.20791 

  17. George, T., Laurent, C., Bouthillier, X., Ballas, N., Vincent, P., 2018. Fast Approximate Natural Gradient Descent in a Kronecker Factored Eigenbasis, in: Advances in Neural Information Processing Systems. Curran Associates, Inc. 

  18. Martens, J., Grosse, R., 2015. Optimizing Neural Networks with Kronecker-factored Approximate Curvature, in: Proceedings of the 32nd International Conference on Machine Learning. Presented at the International Conference on Machine Learning, PMLR, pp. 2408--2417. 

  19. Hataya, R., Yamada, M., 2023. Nyström Method for Accurate and Scalable Implicit Differentiation, in: Proceedings of The 26th International Conference on Artificial Intelligence and Statistics. Presented at the International Conference on Artificial Intelligence and Statistics, PMLR, pp. 4643--4654. 

  20. Kwon, Y., Wu, E., Wu, K., Zou, J., 2023. DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models. Presented at The Twelfth International Conference on Learning Representations. https://doi.org/10.48550/arXiv.2310.00902