Abstract
Memory hierarchy latency is one of the main problems preventing processors from achieving high performance. To eliminate the need for loading and storing large sets of data, Resistive Associative Processors (ReAPs) have been proposed as a solution to the von Neumann bottleneck. In ReAPs, logic and memory structures are combined to allow in-memory computation. In this paper, we propose a new algorithm that computes matrix multiplication inside the memory and exploits the benefits of ReAP. The proposed approach is based on Cannon's algorithm and uses a series of rotations without duplicating the data. It runs in O(n) time, where n is the dimension of the matrix. The method also applies to a large set of applications based on row-by-column matrix operations. Experimental results show an increase of several orders of magnitude in performance, along with reductions in energy and area, when compared to the latest FPGA and CPU implementations.
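The rotation scheme mentioned in the abstract can be illustrated with a minimal software sketch of the classic Cannon algorithm at element granularity. This is not the authors' ReAP implementation: the function names `rotate_left` and `cannon_matmul` are hypothetical, and the mapping onto in-memory hardware is not modeled. The sketch only shows how an initial skew followed by n single-step rotations produces the product without copying either operand.

```python
def rotate_left(values, k):
    """Return a copy of `values` cyclically rotated left by k positions."""
    k %= len(values)
    return values[k:] + values[:k]


def cannon_matmul(a, b):
    """Multiply two n x n matrices (lists of lists) using Cannon-style rotations.

    Illustrative assumption: one matrix element per processing cell.
    """
    n = len(a)
    # Initial skew: rotate row i of A left by i positions.
    a_rows = [rotate_left(a[i], i) for i in range(n)]
    # Initial skew: rotate column j of B up by j positions.
    # b_cols[j][i] holds the element currently at row i, column j.
    b_cols = [rotate_left([b[i][j] for i in range(n)], j) for j in range(n)]

    c = [[0] * n for _ in range(n)]
    for _ in range(n):  # n rotation steps
        # Element-wise multiply-accumulate. On parallel hardware every (i, j)
        # cell could do this at once; here it is serialized.
        for i in range(n):
            for j in range(n):
                c[i][j] += a_rows[i][j] * b_cols[j][i]
        # Rotate every row of A left by one and every column of B up by one.
        a_rows = [rotate_left(row, 1) for row in a_rows]
        b_cols = [rotate_left(col, 1) for col in b_cols]
    return c


if __name__ == "__main__":
    print(cannon_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The outer loop runs n times, which is where an O(n) step count comes from if all cells perform the multiply-accumulate simultaneously. Executed serially as above, the total work remains O(n^3).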
Original language | English (US) |
---|---|
Title of host publication | Proceedings of the 2018 Design, Automation and Test in Europe Conference and Exhibition, DATE 2018 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Print) | 9783981926316 |
DOIs | |
State | Published - Apr 19 2018 |
Externally published | Yes |