Variable Input Resolution SRAM In-Memory Computing Architecture

In-memory computing (IMC) is a widely used technique that performs low-precision computations inside memory elements to break the von Neumann bottleneck in conventional AI/ML hardware (e.g., CPUs and GPUs). Recently, SRAM-based IMC has gained significant attention because of its high energy efficiency and easy integration with CMOS ICs. A fundamental limitation of SRAM-based IMC is the nonlinearity of the access transistors to which the input activation is applied. When the vector-matrix multiplication (VMM) products computed by the SRAM array are large, the currents through the access transistors no longer scale linearly with those products, so the computed VMM output becomes nonlinear.

Recent works have tried to address this nonlinearity through (1) 1-bit activation; (2) conversion of analog input activations to binary pulse trains; and (3) charge-domain computation using switched capacitors. However, 1-bit activation requires boosting with multiple classifiers to achieve good performance, which reduces energy efficiency. Pulse trains improve linearity but cannot fundamentally eliminate the nonlinearity, because accumulation in the SRAM array still takes place in the current domain, and they introduce quantization error when the analog input is converted to a pulsed input. Switched capacitors also improve linearity, but current designs are still limited by the nonlinearity of the access transistors that apply the input activation to the capacitors.

Researchers at Arizona State University and the University at Buffalo have developed a solution to this fundamental nonlinearity in SRAM-based IMC: a switched-capacitor-based delta-sigma SRAM architecture. The architecture converts activations to binary pulse trains with lower in-band quantization error and performs computations in the charge domain with high linearity. By using a delta-sigma modulator (DSM), it allows variable resolution for input and output activations without requiring changes in hardware.
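
The idea of encoding an activation as a delta-sigma pulse train can be illustrated with a short behavioral sketch. This is an idealized first-order software model, not the researchers' circuit: the function names, the normalized activation range [0, 1], the signed integer weights, and the use of a plain average as the decoding filter are all assumptions made for illustration.

```python
def delta_sigma_encode(x, n_pulses):
    """Encode x in [0, 1] as a list of n_pulses bits (idealized first-order DSM)."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("activation must be normalized to [0, 1]")
    integrator = 0.0
    bits = []
    for _ in range(n_pulses):
        integrator += x                      # integrate the input
        bit = 1 if integrator >= 0.5 else 0  # 1-bit quantizer
        integrator -= bit                    # feed the quantized output back
        bits.append(bit)
    return bits


def bitstream_dot(activations, weights, n_pulses):
    """Ideal, perfectly linear accumulation of weight * pulse over all time steps.

    The weights are hypothetical signed integers standing in for stored values.
    """
    streams = [delta_sigma_encode(a, n_pulses) for a in activations]
    total = 0
    for t in range(n_pulses):
        total += sum(w * s[t] for w, s in zip(weights, streams))
    return total / n_pulses                  # simple averaging as the decoder


activations = [0.2, 0.75, 0.5]               # illustrative values only
weights = [1, -1, 1]
exact = sum(a * w for a, w in zip(activations, weights))
coarse = bitstream_dot(activations, weights, n_pulses=16)
fine = bitstream_dot(activations, weights, n_pulses=256)
```

In this model the integrator state stays within half a unit, so the decoded value of each activation is within 0.5 / n_pulses of the true value. Lengthening the pulse train therefore raises the effective resolution without changing the modulator itself, which is one way to read the variable-resolution property described above.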

Related Publications: 

A 138-TOPS/W Delta-Sigma Modulator-Based Variable-Resolution Activation in-Memory Computing Macro

SRAM In-Memory Computing Macro With Delta-Sigma Modulator Based Variable-Resolution Activation

Potential Applications:

  • The architecture can be used for:
    • Artificial intelligence algorithms (e.g., on-device AI algorithms)
    • In-memory computing
    • SRAM accelerators

Benefits and Advantages:

  • Resolution of the DSM can be reconfigured dynamically without requiring changes in hardware
  • The proposed SRAM-based IMC architecture (e.g., fabricated in 65 nm CMOS):
    • identified five human activities (i.e., sitting, standing, walking, running, and dancing) from the HAR dataset (i.e., acceleration data measured by smartphone accelerometer sensors) with more than 96% accuracy while consuming only 420.5 pJ/classification
    • achieved mean accuracies of 98.67% and 89.85% on the MNIST and CIFAR-10 datasets, respectively, with a maximum energy efficiency of 138.6 TOPS/W
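
To put the efficiency figure in per-operation terms, TOPS/W can be converted to energy per operation, since one TOPS/W equals 10^12 operations per joule. The short sketch below does this arithmetic. The function name is hypothetical, and what counts as one "operation" is whatever the reported benchmark counts.

```python
def tops_per_watt_to_fj_per_op(tops_per_watt):
    """Convert an efficiency in TOPS/W to energy per operation in femtojoules."""
    ops_per_joule = tops_per_watt * 1e12  # 1 TOPS/W = 1e12 operations per joule
    return 1e15 / ops_per_joule           # 1 J = 1e15 fJ


# Reported maximum efficiency of 138.6 TOPS/W
peak_fj_per_op = tops_per_watt_to_fj_per_op(138.6)
```

At the reported maximum efficiency this works out to roughly 7.2 fJ per operation.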