Predictive and Causal Implications of using Shapley Value for Model Interpretation

Sisi Ma; Roshan Tourani

Institute for Health Informatics

Research output: Contribution to journal › Conference article › peer-review

18 Scopus citations

Abstract

Shapley value is a concept from game theory. Recently, it has been used for explaining complex models produced by machine learning techniques. Although the mathematical definition of Shapley value is straightforward, the implication of using it as a model interpretation tool is yet to be described. In the current paper, we analyzed Shapley value in the Bayesian network framework. We established the relationship between Shapley value and conditional independence, a key concept in both predictive and causal modeling. Our results indicate that eliminating a variable with high Shapley value from a model does not necessarily impair predictive performance, whereas eliminating a variable with low Shapley value from a model could impair performance. Therefore, using Shapley value for feature selection does not result in the most parsimonious and predictively optimal model in the general case. More importantly, the Shapley value of a variable does not reflect its causal relationship with the target of interest.
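As a rough illustration of the first claim in the abstract, the sketch below computes exact Shapley values for a three-feature toy game. It is not the paper's Bayesian network analysis: the feature names (`x1`, `x2`, `noise`) and the value function `toy_value` are hypothetical, chosen so that `x1` and `x2` are perfectly redundant predictors of the target.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values by enumerating every coalition.

    players: list of hashable player (feature) names.
    value:   function mapping a frozenset of players to a number.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when added to coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(subset):
    # Hypothetical "predictive performance" of a feature subset:
    # x1 and x2 are redundant copies, so either one alone is enough.
    return 1.0 if ("x1" in subset or "x2" in subset) else 0.0


features = ["x1", "x2", "noise"]
phi = shapley_values(features, toy_value)
for name in features:
    print(name, round(phi[name], 3))
```

In this toy game `x1` and `x2` split the credit equally while `noise` gets none, yet dropping `x1` leaves `toy_value` unchanged because `x2` carries the same information. A nonzero Shapley value therefore does not by itself mean that the variable is needed for prediction.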

Original language	English (US)
Pages (from-to)	23-38
Number of pages	16
Journal	Proceedings of Machine Learning Research
Volume	127
State	Published - 2020
Event	2020 ACM SIGKDD workshop on Causal Discovery, CD 2020 - Virtual, Online, United States Duration: Aug 24 2020 → …

Bibliographical note

Publisher Copyright:
© 2020 Sisi Ma and Roshan Tourani.

Keywords

Causal Bayesian Networks
Model Explanation
Model Interpretability
Predictive Models
Shapley Value

Cite this

@article{489114685c694eb7bdab24ff10235e5d,

title = "Predictive and Causal Implications of using Shapley Value for Model Interpretation",

abstract = "Shapley value is a concept from game theory. Recently, it has been used for explaining complex models produced by machine learning techniques. Although the mathematical definition of Shapley value is straightforward, the implication of using it as a model interpretation tool is yet to be described. In the current paper, we analyzed Shapley value in the Bayesian network framework. We established the relationship between Shapley value and conditional independence, a key concept in both predictive and causal modeling. Our results indicate that eliminating a variable with high Shapley value from a model does not necessarily impair predictive performance, whereas eliminating a variable with low Shapley value from a model could impair performance. Therefore, using Shapley value for feature selection does not result in the most parsimonious and predictively optimal model in the general case. More importantly, the Shapley value of a variable does not reflect its causal relationship with the target of interest.",

keywords = "Causal Bayesian Networks, Model Explanation, Model Interpretability, Predictive Models, Shapley Value",

author = "Sisi Ma and Roshan Tourani",

note = "Publisher Copyright: {\textcopyright} 2020 Sisi Ma and Roshan Tourani.; 2020 ACM SIGKDD workshop on Causal Discovery, CD 2020 ; Conference date: 24-08-2020",

year = "2020",

language = "English (US)",

volume = "127",

pages = "23--38",

journal = "Proceedings of Machine Learning Research",

issn = "2640-3498",

}

TY - JOUR

T1 - Predictive and Causal Implications of using Shapley Value for Model Interpretation

AU - Ma, Sisi

AU - Tourani, Roshan

PY - 2020

Y1 - 2020

N2 - Shapley value is a concept from game theory. Recently, it has been used for explaining complex models produced by machine learning techniques. Although the mathematical definition of Shapley value is straightforward, the implication of using it as a model interpretation tool is yet to be described. In the current paper, we analyzed Shapley value in the Bayesian network framework. We established the relationship between Shapley value and conditional independence, a key concept in both predictive and causal modeling. Our results indicate that eliminating a variable with high Shapley value from a model does not necessarily impair predictive performance, whereas eliminating a variable with low Shapley value from a model could impair performance. Therefore, using Shapley value for feature selection does not result in the most parsimonious and predictively optimal model in the general case. More importantly, the Shapley value of a variable does not reflect its causal relationship with the target of interest.

AB - Shapley value is a concept from game theory. Recently, it has been used for explaining complex models produced by machine learning techniques. Although the mathematical definition of Shapley value is straightforward, the implication of using it as a model interpretation tool is yet to be described. In the current paper, we analyzed Shapley value in the Bayesian network framework. We established the relationship between Shapley value and conditional independence, a key concept in both predictive and causal modeling. Our results indicate that eliminating a variable with high Shapley value from a model does not necessarily impair predictive performance, whereas eliminating a variable with low Shapley value from a model could impair performance. Therefore, using Shapley value for feature selection does not result in the most parsimonious and predictively optimal model in the general case. More importantly, the Shapley value of a variable does not reflect its causal relationship with the target of interest.

KW - Causal Bayesian Networks

KW - Model Explanation

KW - Model Interpretability

KW - Predictive Models

KW - Shapley Value

UR - http://www.scopus.com/inward/record.url?scp=85162957389&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85162957389&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85162957389

SN - 2640-3498

VL - 127

SP - 23

EP - 38

JO - Proceedings of Machine Learning Research

JF - Proceedings of Machine Learning Research

T2 - 2020 ACM SIGKDD workshop on Causal Discovery, CD 2020

Y2 - 24 August 2020

ER -
