Deep neural networks (DNNs) have been successful across many applications, but they require a large amount of computation and storage to achieve high accuracy. On the algorithmic side, the arithmetic complexity and storage requirements of DNNs have been aggressively reduced by low-precision quantization techniques. On the hardware side, many digital accelerators implement DNNs efficiently with specialized dataflows based on a systolic array of multiply-and-accumulate (MAC) engines and an on-chip memory hierarchy. Still, the energy/power breakdowns reported for recent DNN accelerators show that memory access and data communication consume a dominant portion (e.g., two-thirds or higher) of the total on-chip energy/power.
To address such bottlenecks, in-memory computing (IMC) has emerged as a promising technique. IMC performs MAC computation inside the memory (e.g., in SRAM) by activating multiple or all rows simultaneously; the result is represented as an analog bitline voltage/current (VBL/IBL) and subsequently digitized by an analog-to-digital converter (ADC) in the periphery. This substantially reduces data transfer (compared to digital accelerators with separate MAC arrays) and increases parallelism (compared to conventional row-by-row access), significantly improving the energy efficiency of MAC operations. However, IMC designs achieve this higher energy efficiency by trading off signal-to-noise ratio, since analog computation inherently involves variability and noise. As a result, IMC chips show variability in the ADC outputs for the same ideal MAC value, and often report accuracy degradation compared to the digital baseline.
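The following is a minimal Python sketch of this analog MAC-plus-ADC behavior, not taken from any chip described here: one bitline accumulates the products of the activated rows, a lumped Gaussian term stands in for analog variability, and an ADC clips and quantizes the result. The function name, ADC resolution, and noise level are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not an actual chip model):
# an IMC column accumulates row-wise products on a shared bitline, analog
# noise perturbs the result, and an ADC quantizes it to a digital code.
import numpy as np

def imc_column_mac(activations, weights, adc_bits=5, noise_sigma=0.5, rng=None):
    """Simulate one bitline partial sum with additive analog noise and ADC quantization.

    activations, weights : 1-D arrays of 0/1 values (one element per activated row)
    adc_bits             : ADC resolution (assumed value)
    noise_sigma          : std. dev. of the lumped analog noise in LSBs (assumed value)
    """
    rng = rng or np.random.default_rng()
    ideal_partial_sum = float(np.dot(activations, weights))   # ideal digital MAC result
    noisy = ideal_partial_sum + rng.normal(0.0, noise_sigma)  # analog bitline variability
    levels = 2 ** adc_bits - 1
    # The ADC clips and rounds the analog value to a discrete output code
    return int(np.clip(np.round(noisy), 0, levels))

# Example: 64 rows activated in parallel on one column
rng = np.random.default_rng(0)
acts = rng.integers(0, 2, size=64)
wts = rng.integers(0, 2, size=64)
print(imc_column_mac(acts, wts, rng=rng))
```

Repeated calls with the same inputs return different ADC codes, which is exactly the variability that motivates noise-aware training.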
Researchers at Arizona State University and Columbia University have developed a hardware noise-aware deep neural network (DNN) training scheme that largely recovers the accuracy loss of highly parallel in-memory computing (IMC) hardware. To effectively improve the DNN accuracy of IMC hardware, hardware-extracted noise is injected during DNN training at the partial-sum level, which matches the crossbar structure of the IMC hardware; the injected noise is based on actual hardware noise measured from multiple IMC chips. The hardware noise measured at the partial-sum level (the ADC output) collectively captures individual weight-/activation-level noise, bitline noise, and ADC offset/quantization noise.
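The sketch below illustrates the partial-sum-level injection idea in plain NumPy, under stated assumptions rather than as the authors' implementation: a layer's inner dimension is split into crossbar-sized slices, each slice's partial sum is perturbed with noise sampled from (placeholder) measured hardware statistics, and the noisy partial sums are accumulated as the hardware would. The function name `noisy_linear_forward`, the `rows_per_crossbar` value, and the noise samples are all illustrative.

```python
# Minimal sketch (assumed shapes and placeholder noise statistics) of
# partial-sum-level noise injection during a forward pass.
import numpy as np

def noisy_linear_forward(x, W, measured_noise, rows_per_crossbar=128, rng=None):
    """Linear layer forward pass with noise injected at the partial-sum level.

    x              : (in_features,) input activations
    W              : (out_features, in_features) weights
    measured_noise : 1-D array of noise samples taken at the ADC output
                     (random placeholder here; real chip measurements in practice)
    """
    rng = rng or np.random.default_rng()
    out_features, in_features = W.shape
    y = np.zeros(out_features)
    # Split the inner dimension into crossbar-sized chunks, mirroring the array size
    for start in range(0, in_features, rows_per_crossbar):
        end = min(start + rows_per_crossbar, in_features)
        partial = W[:, start:end] @ x[start:end]                # ideal partial sums
        noise = rng.choice(measured_noise, size=out_features)   # sampled hardware noise
        y += partial + noise                                    # accumulate noisy partial sums
    return y

# Example with random stand-ins for weights, activations, and measured noise
rng = np.random.default_rng(0)
x = rng.standard_normal(512)
W = rng.standard_normal((10, 512))
noise_samples = rng.normal(0.0, 0.3, size=10000)  # placeholder for chip measurements
print(noisy_linear_forward(x, W, noise_samples, rng=rng))
```

In training, a forward pass of this kind would replace the layer's ideal matrix multiply so the network learns weights that remain accurate under the measured partial-sum noise.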
Related publication: Improving DNN Hardware Accuracy by In-Memory Computing Noise Injection
Potential Applications:
- Deep Neural Network (DNN) training for deployment
- DNN hardware
- In-memory computing (IMC) for DNN training
- Application-specific integrated circuits (ASICs) for DNN
Benefits and Advantages:
- Better inference accuracy, especially when the IMC hardware noise level is high
- Experiments have shown consistent improvement of DNN accuracy on actual IMC hardware across various DNNs and IMC chip measurements