Recommended Reads
The Disagreement Problem in Explainable Machine Learning: A Practitioner’s Perspective
This paper examines whether, and to what degree, different post hoc explainability tools provide consistent information about model behavior. It seeks to identify the specific scenarios and underlying reasons that drive disagreement among these tools' outputs, along with potential ways to resolve such disagreements. The evaluation combines empirical analysis with a survey of how practitioners using these tools contend with inconsistent outputs. The authors conclude that when explainability tools produce inconsistent information about model behavior, there is no established or consistent method for resolving the disagreement, and they call for the development of principled evaluation metrics to more reliably identify when such disagreements occur and what causes them.
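To make the disagreement concrete, here is a minimal sketch of a "top-k feature agreement" check, the kind of simple comparison a practitioner might run across two explanation methods. The metric, the function name, and the attribution values are all illustrative assumptions, not taken from the paper:

```python
# Hypothetical sketch: measure how much two post hoc explanations
# agree on the most important features for one prediction.
# All attribution values below are invented for illustration.

def top_k_agreement(attr_a, attr_b, k):
    """Fraction of overlap between the k features each method
    ranks highest by absolute attribution value."""
    top_a = set(sorted(attr_a, key=lambda f: abs(attr_a[f]), reverse=True)[:k])
    top_b = set(sorted(attr_b, key=lambda f: abs(attr_b[f]), reverse=True)[:k])
    return len(top_a & top_b) / k

# Two tools explaining the same model prediction (made-up numbers):
lime_like = {"age": 0.42, "income": 0.31, "tenure": -0.05, "region": 0.02}
shap_like = {"age": 0.38, "income": 0.04, "tenure": -0.33, "region": 0.21}

# The two methods agree on only one of their top-2 features ("age"),
# so agreement is 0.5 -- the disagreement problem in miniature.
print(top_k_agreement(lime_like, shap_like, k=2))  # → 0.5
```

Even such a crude check surfaces the problem the authors describe: two explanations of the same prediction can rank features quite differently, and nothing in either tool's output says which to trust.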