Abstract
Identifying meaningful features that drive a phenomenon (response) of interest in complex systems of interconnected factors is a challenging problem. Causal discovery methods have previously been applied to estimate bounds on the causal strengths of factors on a response, or to identify meaningful interactions between factors in complex systems, but these approaches have been used only for inferential purposes. In contrast, we posit that interactions between factors with a potential causal association with a given response could be viable candidates not only for hypothesis generation but also for predictive modeling. In this work, we propose a causality-guided feature selection methodology that identifies factors having a potential cause-effect relationship in complex systems, and selects features by clustering them based on their causal strength with respect to the response. To this end, we estimate statistically significant causal effects on the response for factors taking part in potential causal relationships, while addressing associated technical challenges, such as multicollinearity in the data. We validate the proposed methodology for predicting the response in five real-world datasets from the domains of climate science and biology. The selected features show predictive skill and consistent performance across different domains.
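The abstract does not give implementation details, so the following is only a rough sketch of the general idea of clustering features by their estimated strength on the response. It is not the authors' method. As labeled assumptions, it uses standardized ridge regression coefficients as a stand-in for causal strength (the ridge penalty is one common way to cope with multicollinearity), a simple one-dimensional two-means clustering to separate strong from weak features, and synthetic data. The significance testing mentioned in the abstract is omitted, and all function names (`ridge_effects`, `two_means_1d`, `select_features`) are hypothetical.

```python
import numpy as np

def ridge_effects(X, y, alpha=1.0):
    """Standardized ridge coefficients.

    ASSUMPTION: these are used as a stand-in for causal strength;
    they are regression effects, not causal estimates.
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    p = Xs.shape[1]
    # Closed-form ridge solution: (X'X + alpha * I)^-1 X'y
    return np.linalg.solve(Xs.T @ Xs + alpha * np.eye(p), Xs.T @ yc)

def two_means_1d(values, n_iter=100):
    """Split 1-D values into a low (0) and a high (1) cluster with Lloyd's algorithm."""
    centers = np.array([values.min(), values.max()], dtype=float)
    labels = np.zeros(len(values), dtype=int)
    for _ in range(n_iter):
        # Assign each value to the nearer of the two centers.
        labels = (np.abs(values - centers[0]) > np.abs(values - centers[1])).astype(int)
        new_centers = np.array([
            values[labels == k].mean() if np.any(labels == k) else centers[k]
            for k in (0, 1)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def select_features(X, y, alpha=1.0):
    """Return indices of features in the high-strength cluster, plus all strengths."""
    strength = np.abs(ridge_effects(X, y, alpha))
    labels = two_means_1d(strength)
    return np.flatnonzero(labels == 1), strength

# Synthetic demo (hypothetical data): features 2 and 4 drive the response,
# and feature 1 is nearly collinear with feature 0.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 6))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=n)
y = 2.0 * X[:, 2] - 3.0 * X[:, 4] + 0.1 * rng.normal(size=n)

selected, strength = select_features(X, y)
print("selected feature indices:", selected)
print("estimated strengths:", np.round(strength, 3))
```

In this toy setting the two features that generate the response should land in the high-strength cluster, while the collinear pair and the noise features stay in the low one. A faithful implementation would replace `ridge_effects` with an actual causal effect estimator and add a significance filter before clustering.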
Original language | English (US) |
---|---|
Title of host publication | Advanced Data Mining and Applications - 12th International Conference, ADMA 2016, Proceedings |
Editors | Jinyan Li, Xue Li, Shuliang Wang, Jianxin Li, Quan Z. Sheng |
Publisher | Springer |
Pages | 391-405 |
Number of pages | 15 |
ISBN (Print) | 9783319495859 |
State | Published - 2016 |
Event | 12th International Conference on Advanced Data Mining and Applications, ADMA 2016 - Gold Coast, Australia; Duration: Dec 12 2016 → Dec 15 2016 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 10086 LNAI |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Other
Other | 12th International Conference on Advanced Data Mining and Applications, ADMA 2016 |
---|---|
Country/Territory | Australia |
City | Gold Coast |
Period | 12/12/16 → 12/15/16 |
Bibliographical note
Publisher Copyright: © Springer International Publishing AG 2016.