The recent decades have given advent to the rise of sophisticated High Performance Computing (HPC) accelerators, vastly speeding up calculations. In the last years dedicated AI accelerators, meant for the evaluation of Artificial Neural Networks, have gathered traction. Resistive Random Access Memory (RRAM) devices are a possible future candidate for these accelerators since crossbar implementations allow for the evaluation of matrix vector multiplications in O(1). Unfortunately integrating these novel devices into accelerators challenges since they still suffer from device variations and require sophisticated peripheral circuitry. Additionally, suitable design flows are missing since these cells are difficult to integrate into the traditional digital flow. While multiple foundries are able to fabricate promising RRAM prototypes suffering less from device variations, full system integration tends to be lacking. Fortunately the rise of the RISC-V ecosystem has enabled eased access to a fully customizable ISA. We propose to exploit the advantages of the RRAM devices combined with the flexibility of RISC-V cores by integrating multiple RRAM-based blocks into a RISC-V core via Memory Mapped I/O (MMIO), resulting in an architecture which can be reconfigured in software. Additionally, we propose a possible approach for the design, simulation and verification of large RRAM systems, namely setting up three closely intertwined simulation environments and illustrate its applicability by integrating, characterizing and validating a RRAM-based MVM block fabricated in 130 nm technology. Finally, we demonstrate that RRAM technologies might be ready for HPC.
«
The recent decades have given advent to the rise of sophisticated High Performance Computing (HPC) accelerators, vastly speeding up calculations. In the last years dedicated AI accelerators, meant for the evaluation of Artificial Neural Networks, have gathered traction. Resistive Random Access Memory (RRAM) devices are a possible future candidate for these accelerators since crossbar implementations allow for the evaluation of matrix vector multiplications in O(1). Unfortunately integrating thes...
»