Tensor Decomposition for Model Reduction in Neural Networks: A Review [Feature]

Xingyi Liu, Keshab K. Parhi

Research output: Contribution to journal › Article › peer-review


Abstract

Modern neural networks have revolutionized the fields of computer vision (CV) and natural language processing (NLP). They are widely used for solving complex CV and NLP tasks such as image classification, image generation, and machine translation. Most state-of-the-art neural networks are over-parameterized and incur high computational costs. One straightforward way to reduce these costs is to replace the layers of a network with low-rank tensor approximations obtained from different tensor decomposition methods. This article reviews six tensor decomposition methods and illustrates their ability to compress the model parameters of convolutional neural networks (CNNs), recurrent neural networks (RNNs), and Transformers. The accuracy of some compressed models can even exceed that of the original versions. Evaluations indicate that tensor decompositions can achieve significant reductions in model size, run time, and energy consumption, and are well suited for implementing neural networks on edge devices.
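To make the layer-replacement idea concrete, the following is a minimal sketch (not taken from the article) of one of the reviewed techniques, canonical polyadic (CP) decomposition, applied to a convolutional layer's 4-D weight tensor using the TensorLy library. The layer dimensions and the rank R are illustrative assumptions chosen only to show the parameter-count reduction.

```python
# Minimal sketch: approximate a conv kernel with a rank-R CP decomposition
# and compare parameter counts. Layer sizes and rank are assumed, not from the paper.
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# Hypothetical conv kernel: (out_channels, in_channels, kernel_h, kernel_w)
W = np.random.randn(64, 32, 3, 3)
R = 16  # assumed CP rank; controls the compression/accuracy trade-off

# CP decomposition returns one factor matrix per tensor mode, each of shape (dim, R)
weights, factors = parafac(tl.tensor(W), rank=R, init="random")

orig_params = W.size                              # 64 * 32 * 3 * 3 = 18432
cp_params = sum(f.shape[0] * R for f in factors)  # (64 + 32 + 3 + 3) * 16 = 1632
print(f"original: {orig_params} params, CP rank-{R}: {cp_params} params")
```

In practice, as the article discusses, the factor matrices are used to replace the original convolution with a sequence of smaller convolutions rather than reconstructing the full kernel, which is where the run-time and energy savings come from.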

Original language: English (US)
Pages (from-to): 8-28
Number of pages: 21
Journal: IEEE Circuits and Systems Magazine
Volume: 23
Issue number: 2
DOIs
State: Published - 2023
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2001-2012 IEEE.

Keywords

  • Tensor decomposition
  • Tucker decomposition
  • block-term decomposition
  • canonical polyadic decomposition
  • convolutional neural network acceleration
  • hierarchical Tucker decomposition
  • model compression
  • recurrent neural network acceleration
  • tensor ring decomposition
  • tensor train decomposition
  • transformer acceleration

