Background
Deep neural networks (DNNs) power modern deep learning applications, but they have substantial computational and memory requirements. Quantization is an important model compression approach that addresses the mismatch between resource-hungry DNNs and resource-constrained devices by converting full-precision model weights or activations to lower precision. Quantization-Aware Training (QAT), in particular, has shown promising early results in creating low-bit models, but it can lead to considerable accuracy loss and does not perform consistently across model architectures. There is a need for a generalized, simple, yet effective framework that can flexibly incorporate and improve QAT algorithms for both low-bit and high-bit quantization.
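For illustration, the sketch below shows one common way to convert full-precision weights to lower precision: symmetric, per-tensor uniform quantization in PyTorch. The `quantize_symmetric` helper, the default bit-width, and the per-tensor scaling are illustrative assumptions, not part of the invention.

```python
import torch

def quantize_symmetric(w: torch.Tensor, num_bits: int = 8):
    """Map a full-precision tensor onto uniformly spaced integer levels."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax      # one scale per tensor
    w_int = torch.round(w / scale).clamp(-qmax - 1, qmax).to(torch.int8)
    return w_int, scale                               # store integers + scale; recover w as w_int * scale

# 8-bit weights take 4x less memory than FP32, at the cost of a small rounding error.
w = torch.randn(256, 256)
w_int, scale = quantize_symmetric(w)
print((w - w_int.float() * scale).abs().max())        # worst-case rounding error
```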
Invention Description
Researchers at Arizona State University have developed Self-Supervised Quantization-Aware Knowledge Distillation (SQAKD), a new framework that improves the performance of low-bit deep learning models by unifying quantization-aware training and knowledge distillation without requiring labeled data. The framework formulates Quantization-Aware Training (QAT) as a co-optimization problem that minimizes the divergence loss between the full-precision and low-bit models, improving both model performance and training efficiency.
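The following is a minimal, hypothetical sketch of this kind of label-free co-optimization in PyTorch. The `FakeQuantSTE`, `QuantLinear`, and `distillation_step` names, the straight-through estimator, the KL-divergence objective, and the temperature `T` are illustrative assumptions, not the specific formulation of SQAKD.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FakeQuantSTE(torch.autograd.Function):
    """Symmetric uniform fake-quantization with a straight-through estimator (STE)."""
    @staticmethod
    def forward(ctx, w, num_bits):
        qmax = 2 ** (num_bits - 1) - 1
        scale = w.abs().max().clamp(min=1e-8) / qmax
        return torch.round(w / scale).clamp(-qmax - 1, qmax) * scale

    @staticmethod
    def backward(ctx, grad_out):
        # STE: pass the gradient through the non-differentiable rounding step.
        return grad_out, None


class QuantLinear(nn.Linear):
    """Linear layer whose weights are fake-quantized to `num_bits` on the fly."""
    def __init__(self, in_features, out_features, num_bits=4):
        super().__init__(in_features, out_features)
        self.num_bits = num_bits

    def forward(self, x):
        w_q = FakeQuantSTE.apply(self.weight, self.num_bits)
        return F.linear(x, w_q, self.bias)


def distillation_step(fp_teacher, lowbit_student, optimizer, x, T=4.0):
    """One label-free step: minimize the KL divergence between the frozen
    full-precision teacher's and the low-bit student's soft predictions."""
    with torch.no_grad():
        t_logits = fp_teacher(x)
    s_logits = lowbit_student(x)
    loss = F.kl_div(
        F.log_softmax(s_logits / T, dim=1),
        F.softmax(t_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the objective compares the student's predictions only against the teacher's, no ground-truth labels are needed; any unlabeled data the full-precision model can process can drive the low-bit model's training.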
Potential Applications:
- Deployment of efficient, low-bit deep learning models on resource-constrained devices
- Improvement of deep learning model performance in IoT devices and edge computing
- Enhancement of model training and inference speed on target hardware
Benefits and Advantages:
- Enhanced convergence speed – improves training speed and accuracy of low-bit deep learning models
- Flexible framework – incorporates various QAT algorithms
- Broad application scope – hyperparameter-free and operates without labeled data
- Reduced training costs – simplifies training procedures and promotes reproducibility
- Improved performance – outperforms state-of-the-art QAT methods across various architectures