Mini-Batch Learning Strategies for modeling long term temporal dependencies: A study in environmental applications

Shaoming Xu, Ankush Khandelwal, Xiang Li, Xiaowei Jia, Licheng Liu, Jared Willard, Rahul Ghosh, Kelly Cutler, Michael Steinbach, Christopher Duffy, John Nieber, Vipin Kumar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

In many environmental applications, recurrent neural networks (RNNs) are often used to model physical variables with long temporal dependencies. However, due to mini-batch training, temporal relationships between training segments within a batch (intra-batch) as well as between batches (inter-batch) are not considered, which can limit performance. Stateful RNNs aim to address this issue by passing hidden states between batches. Since Stateful RNNs ignore intra-batch temporal dependency, there exists a trade-off between training stability and capturing temporal dependency. In this paper, we provide a quantitative comparison of different Stateful RNN modeling strategies, and propose two strategies to enforce both intra- and inter-batch temporal dependency. First, we extend Stateful RNNs by defining a batch as a temporally ordered set of training segments, which enables intra-batch sharing of temporal information. While this approach significantly improves performance, it leads to much longer training times because training becomes highly sequential. To address this issue, we further propose a new strategy that augments a training segment with an initial value of the target variable from the timestep immediately before the start of the training segment. In other words, we provide the initial value of the target variable as an additional input so that the network can focus on learning changes relative to that initial value. With this strategy, samples can be passed in any order (standard mini-batch training), which significantly reduces training time while maintaining performance. In demonstrating the utility of our approach in hydrological modeling, we observe that the most significant gains in predictive accuracy occur when these methods are applied to state variables whose values change slowly, such as soil water and snowpack, rather than to continuously varying flux variables such as streamflow.
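As a concrete illustration of the second strategy, the following is a minimal sketch in PyTorch. It is not the authors' implementation, and names such as SegmentDataset and init_value are hypothetical; it only shows, under these assumptions, how each training segment could be augmented with the target variable's value from the timestep immediately before the segment so that segments carry their own initial condition and can be shuffled into ordinary mini-batches.

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

class SegmentDataset(Dataset):
    # Slices a long driver series x (T, F) and target series y (T,) into
    # fixed-length segments, each augmented with y from the timestep just
    # before the segment starts (hypothetical helper, for illustration only).
    def __init__(self, x, y, seg_len):
        self.x, self.y, self.seg_len = x, y, seg_len
        # Start at seg_len so every segment has a preceding target value.
        self.starts = list(range(seg_len, len(x) - seg_len + 1, seg_len))

    def __len__(self):
        return len(self.starts)

    def __getitem__(self, i):
        s = self.starts[i]
        seg_x = self.x[s:s + self.seg_len]                  # (seg_len, F)
        init_value = self.y[s - 1].expand(self.seg_len, 1)  # broadcast y[s-1]
        # Append the initial target value as an extra input feature so the
        # network can learn changes relative to it.
        return torch.cat([seg_x, init_value], dim=-1), self.y[s:s + self.seg_len]

class RNNRegressor(nn.Module):
    def __init__(self, n_features, hidden=64):
        super().__init__()
        self.rnn = nn.LSTM(n_features + 1, hidden, batch_first=True)  # +1 for init value
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        out, _ = self.rnn(x)                # (batch, seg_len, hidden)
        return self.head(out).squeeze(-1)   # (batch, seg_len)

# Because each segment carries its own initial condition, segments can be
# drawn in any order (shuffle=True), recovering standard mini-batch training.
T, F, seg_len = 3650, 5, 365
x, y = torch.randn(T, F), torch.randn(T)   # synthetic data for illustration
loader = DataLoader(SegmentDataset(x, y, seg_len), batch_size=8, shuffle=True)
model = RNNRegressor(n_features=F)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for xb, yb in loader:
    opt.zero_grad()
    loss_fn(model(xb), yb).backward()
    opt.step()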

Original language: English (US)
Title of host publication: 2023 SIAM International Conference on Data Mining, SDM 2023
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 649-657
Number of pages: 9
ISBN (Electronic): 9781611977653
State: Published - 2023
Event: 2023 SIAM International Conference on Data Mining, SDM 2023 - Minneapolis, United States
Duration: Apr 27 2023 - Apr 29 2023

Publication series

Name: 2023 SIAM International Conference on Data Mining, SDM 2023

Conference

Conference: 2023 SIAM International Conference on Data Mining, SDM 2023
Country/Territory: United States
City: Minneapolis
Period: 4/27/23 - 4/29/23

Bibliographical note

Publisher Copyright:
Copyright © 2023 by SIAM.
