Design and Performance Analysis of Hardware Accelerator for Deep Neural Network in Heterogeneous Platform
Date
2018-08
Authors
Sefat, Md Syadus
Abstract
This thesis describes a new, flexible approach to implementing energy-efficient DNN accelerators on FPGAs. Our design leverages the Coherent Accelerator Processor Interface (CAPI), which provides a cache-coherent view of system memory to attached accelerators. Computational kernels are accelerated on a CAPI-supported Kintex FPGA board. Our implementation bypasses the need for device driver code and significantly reduces communication and I/O transfer overhead. To improve the performance of the entire application, we propose a collaborative model of execution in which control of the data flow within the accelerator is kept independent of the CPU, freeing up CPU cores to work on other parts of the application. For further performance enhancement, we propose a technique to exploit data locality in the cache situated in the CAPI Power Service Layer (PSL). Finally, we develop a resource-conscious implementation for more efficient utilization of resources and improved scalability. Compared with previous work, our architecture achieves both improved performance and better power efficiency.
Keywords
Hardware, Accelerator, DNN, FPGA
Citation
Sefat, M. D. S. (2018). Design and performance analysis of hardware accelerator for deep neural network in heterogeneous platform (Unpublished thesis). Texas State University, San Marcos, Texas.