Comparing Effects of Attribution-based, Example-based, and Feature-based Explanation Methods on AI-Assisted Decision-Making

Trust calibration is essential in AI-assisted decision-making tasks. If human users understand the reasons for an AI model's prediction, they can assess whether or not the prediction is reasonable. Especially for high-risk tasks like mushroom hunting (where a wrong decision may be fatal), it is important that users trust or overrule the AI in the right situations. Various explainable AI (XAI) methods are currently being discussed as potentially useful for facilitating understanding and calibrating user trust. So far, however, it is unclear which approaches are most effective. Our work addresses this issue in a between-subjects experiment with 𝑁 = 501 participants, who were tasked with classifying the edibility of mushrooms depicted in images. We compare the effects of three XAI methods on human AI-assisted decision-making behavior: (i) Grad-CAM attributions; (ii) nearest neighbor examples; and (iii) an adaptation of network dissection. For nearest neighbor examples, we found a statistically significant improvement in user performance compared to a condition without explanations. Effects did not reach statistical significance for Grad-CAM and network dissection. For network dissection, however, the effect size estimators show a similar tendency as for nearest neighbor examples. We found that the effects also varied across task items (i.e., mushroom images). Explanations seem to be particularly effective if they reveal possible flaws in case of wrong AI classifications or reassure users in case of correct classifications. Our results suggest that well-established methods might not be as beneficial to end users as expected and that XAI techniques must be chosen carefully in real-world scenarios.

https://osf.io/h6dwz/

Ryan Watkins

Professor at George Washington University

I am a Professor in the Human-Technology Collaboration and Educational Technology programs at George Washington University in Washington, DC. I have written 12 books and more than 100 articles, and I co-host the Parsing Science podcast, where scientists tell the stories behind their research. I am also the developer of the WeShareScience.com online platform for sharing research videos, and of SciencePods.com, where researchers can create free podcasts about their science. My research interests include human interactions with intelligent machines, needs, needs assessments, and instructional design.