TY - GEN
T1 - Designing a processor from the ground up to allow voltage/reliability tradeoffs
AU - Kahng, Andrew B.
AU - Kang, Seokhyeong
AU - Kumar, Rakesh
AU - Sartori, John
PY - 2010
Y1 - 2010
N2 - Current processor designs have a critical operating point that sets a hard limit on voltage scaling. Any scaling beyond the critical voltage results in exceeding the maximum allowable error rate, i.e., there are more timing errors than can be effectively and gainfully detected or corrected by an error-tolerance mechanism. This limits the effectiveness of voltage scaling as a knob for reliability/power tradeoffs. In this paper, we present power-aware slack redistribution, a novel design-level approach to allow voltage/reliability tradeoffs in processors. Techniques based on power-aware slack redistribution reapportion timing slack of the frequently-occurring, near-critical timing paths of a processor in a power- and area-efficient manner, such that we increase the range of voltages over which the incidence of operational (timing) errors is acceptable. This results in soft architectures - designs that fail gracefully, allowing us to perform reliability/power tradeoffs by reducing voltage up to the point that produces maximum allowable errors for our application. The goal of our optimization is to minimize the voltage at which a soft architecture encounters the maximum allowable error rate, thus maximizing the range over which voltage scaling is possible and minimizing power consumption for a given error rate. Our experiments demonstrate 23% power savings over the baseline design at an error rate of 1%. Observed power reductions are 29%, 29%, 19%, and 20% for error rates of 2%, 4%, 8%, and 16% respectively. Benefits are higher in the face of error recovery using Razor. Area overhead of our techniques is up to 2.7%.
AB - Current processor designs have a critical operating point that sets a hard limit on voltage scaling. Any scaling beyond the critical voltage results in exceeding the maximum allowable error rate, i.e., there are more timing errors than can be effectively and gainfully detected or corrected by an error-tolerance mechanism. This limits the effectiveness of voltage scaling as a knob for reliability/power tradeoffs. In this paper, we present power-aware slack redistribution, a novel design-level approach to allow voltage/reliability tradeoffs in processors. Techniques based on power-aware slack redistribution reapportion timing slack of the frequently-occurring, near-critical timing paths of a processor in a power- and area-efficient manner, such that we increase the range of voltages over which the incidence of operational (timing) errors is acceptable. This results in soft architectures - designs that fail gracefully, allowing us to perform reliability/power tradeoffs by reducing voltage up to the point that produces maximum allowable errors for our application. The goal of our optimization is to minimize the voltage at which a soft architecture encounters the maximum allowable error rate, thus maximizing the range over which voltage scaling is possible and minimizing power consumption for a given error rate. Our experiments demonstrate 23% power savings over the baseline design at an error rate of 1%. Observed power reductions are 29%, 29%, 19%, and 20% for error rates of 2%, 4%, 8%, and 16% respectively. Benefits are higher in the face of error recovery using Razor. Area overhead of our techniques is up to 2.7%.
UR - http://www.scopus.com/inward/record.url?scp=77952561335&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77952561335&partnerID=8YFLogxK
U2 - 10.1109/hpca.2010.5416652
DO - 10.1109/hpca.2010.5416652
M3 - Conference contribution
AN - SCOPUS:77952561335
SN - 9781424456581
T3 - Proceedings - International Symposium on High-Performance Computer Architecture
BT - HPCA-16 2010 - The 16th International Symposium on High-Performance Computer Architecture
PB - IEEE Computer Society
T2 - 16th International Symposium on High-Performance Computer Architecture, HPCA-16 2010
Y2 - 9 January 2010 through 14 January 2010
ER -