TY - GEN
T1 - Dynamic performance tuning for speculative threads
AU - Luo, Yangchun
AU - Packirisamy, Venkatesan
AU - Hsu, Wei Chung
AU - Zhai, Antonia
AU - Mungre, Nikhil
AU - Tarkas, Ankit
PY - 2009
Y1 - 2009
N2 - In response to the emergence of multicore processors, various novel and sophisticated execution models have been introduced to fully utilize these processors. One such execution model is Thread-Level Speculation (TLS), which allows potentially dependent threads to execute speculatively in parallel. While TLS offers significant performance potential for applications that are otherwise non-parallel, extracting efficient speculative threads in the presence of complex control flow and ambiguous data dependences is a real challenge. This task is further complicated by the fact that the performance of speculative threads is often architecture-dependent and input-sensitive, and exhibits phase behavior. Thus we propose dynamic performance tuning mechanisms that determine where and how to create speculative threads at runtime. This paper describes the design, implementation, and evaluation of hardware and software support that takes advantage of runtime performance profiles to extract efficient speculative threads. In our proposed framework, speculative threads are monitored by hardware-based performance counters and their performance impact is estimated. The creation of speculative threads is adjusted based on the estimation. This paper proposes speculative thread performance estimation techniques that are capable of correctly determining whether speculation can improve performance for loops that correspond to 83.8% of total loop execution time across all benchmarks. This paper also examines several dynamic performance tuning policies and finds that the best tuning policy achieves an overall speedup of 36.8% on a set of benchmarks from the SPEC2000 suite, which outperforms static thread management by 9.5%.
AB - In response to the emergence of multicore processors, various novel and sophisticated execution models have been introduced to fully utilize these processors. One such execution model is Thread-Level Speculation (TLS), which allows potentially dependent threads to execute speculatively in parallel. While TLS offers significant performance potential for applications that are otherwise non-parallel, extracting efficient speculative threads in the presence of complex control flow and ambiguous data dependences is a real challenge. This task is further complicated by the fact that the performance of speculative threads is often architecture-dependent and input-sensitive, and exhibits phase behavior. Thus we propose dynamic performance tuning mechanisms that determine where and how to create speculative threads at runtime. This paper describes the design, implementation, and evaluation of hardware and software support that takes advantage of runtime performance profiles to extract efficient speculative threads. In our proposed framework, speculative threads are monitored by hardware-based performance counters and their performance impact is estimated. The creation of speculative threads is adjusted based on the estimation. This paper proposes speculative thread performance estimation techniques that are capable of correctly determining whether speculation can improve performance for loops that correspond to 83.8% of total loop execution time across all benchmarks. This paper also examines several dynamic performance tuning policies and finds that the best tuning policy achieves an overall speedup of 36.8% on a set of benchmarks from the SPEC2000 suite, which outperforms static thread management by 9.5%.
KW - Dynamic optimization
KW - Multicore
KW - Thread-level speculation
UR - http://www.scopus.com/inward/record.url?scp=70549097151&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70549097151&partnerID=8YFLogxK
U2 - 10.1145/1555754.1555812
DO - 10.1145/1555754.1555812
M3 - Conference contribution
AN - SCOPUS:70549097151
SN - 9781605585260
T3 - Proceedings - International Symposium on Computer Architecture
SP - 462
EP - 473
BT - ISCA 2009 - 36th Annual International Symposium on Computer Architecture, Conference Proceedings
T2 - ISCA 2009 - 36th Annual International Symposium on Computer Architecture
Y2 - 20 June 2009 through 24 June 2009
ER -