
A Novel Systolic Parallel Hardware Architecture for the FPGA Acceleration of Feedforward Neural Networks


IMPORTANT: This repository has been running an outdated version since 3/12/2023. The new installation is at https://roderic.uv.es/


Show simple item record

dc.contributor.author Medus, Leandro Daniel
dc.contributor.author Iakymchuk, Taras
dc.contributor.author Francés Villora, José Vicente
dc.contributor.author Bataller Mompean, Manuel
dc.contributor.author Rosado Muñoz, Alfredo
dc.date.accessioned 2020-04-29T10:29:36Z
dc.date.available 2020-04-29T10:29:36Z
dc.date.issued 2019
dc.identifier.citation Medus, Leandro Daniel; Iakymchuk, Taras; Francés Villora, José Vicente; Bataller Mompean, Manuel; Rosado Muñoz, Alfredo (2019). A Novel Systolic Parallel Hardware Architecture for the FPGA Acceleration of Feedforward Neural Networks. IEEE Access, 7, 76084-76103.
dc.identifier.uri https://hdl.handle.net/10550/74123
dc.description.abstract New chips for machine learning applications appear continually; they are tuned for a specific topology and achieve efficiency through highly parallel designs at the cost of high power consumption or large, complex devices. However, the computational demands of deep neural networks require flexible and efficient hardware architectures able to fit different applications, neural network types, and numbers of inputs, outputs, layers, and units per layer, making the migration from software to hardware easy. This paper describes novel hardware implementing any feedforward neural network (FFNN): multilayer perceptron, autoencoder, and logistic regression. The architecture admits an arbitrary number of inputs, outputs, units per layer, and layers. The hardware combines matrix algebra concepts with serial-parallel computation. It is based on a systolic ring of neural processing elements (NPE), requiring only as many NPEs as neuron units in the largest layer, regardless of the number of layers. Resource usage grows linearly with the number of NPEs. This versatile architecture serves as an accelerator in real-time applications, and its size does not affect the system clock frequency. Unlike most approaches, a single activation function block (AFB) is required for the whole FFNN. Performance, resource usage, and accuracy are evaluated for several network topologies and activation functions. The architecture reaches a 550 MHz clock speed in a Virtex-7 FPGA. The proposed implementation uses 18-bit fixed point, achieving classification performance similar to a floating-point approach. A reduced weight bit size does not affect the accuracy, allowing more weights to be stored in the same memory. Different FFNNs for the Iris and MNIST datasets were evaluated and, for a real-time application of abnormal cardiac detection, a 256x acceleration was achieved. The proposed architecture can perform up to 1980 giga operations per second (GOPS), implementing multilayer FFNNs of up to 3600 neurons per layer in a single chip. The architecture can be extended to higher-capacity devices or to multiple chips by simple extension of the NPE ring.
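The abstract's core idea, a ring of NPEs where each element accumulates one neuron's dot product while the input vector rotates serially around the ring, can be illustrated with a minimal software sketch. This is not the authors' HDL design; the function name and ReLU activation are hypothetical choices for illustration only.

```python
def npe_ring_layer(weights, biases, inputs):
    """Simulate one FFNN layer on a systolic ring of NPEs.

    weights: one weight row per NPE (one NPE per neuron in the layer)
    biases:  one bias per neuron
    inputs:  layer input vector, streamed one element per cycle
    """
    n_npe = len(weights)       # as many NPEs as units in this layer
    n_in = len(inputs)
    acc = [0.0] * n_npe        # one multiply-accumulate register per NPE

    # Each cycle, every NPE consumes the input element currently at its
    # ring position; the inputs then rotate one position. After n_in
    # cycles, each NPE has seen every input exactly once.
    for cycle in range(n_in):
        for k in range(n_npe):
            idx = (cycle + k) % n_in   # input element at NPE k this cycle
            acc[k] += weights[k][idx] * inputs[idx]

    # A single shared activation function block (AFB) serves the whole
    # network; ReLU is used here purely as an example activation.
    return [max(0.0, a + b) for a, b in zip(acc, biases)]
```

Because the accumulation order does not change the sum, each NPE ends up with the same dot product a fully parallel design would compute, while only one input element per NPE moves each cycle, which is what keeps resource usage linear in the number of NPEs.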
dc.language.iso eng
dc.relation.ispartof IEEE Access, 2019, vol. 7, p. 76084-76103
dc.subject Computer architecture
dc.subject Computer systems
dc.subject Neural networks (Computer science)
dc.title A Novel Systolic Parallel Hardware Architecture for the FPGA Acceleration of Feedforward Neural Networks
dc.type journal article es_ES
dc.date.updated 2020-04-29T10:29:37Z
dc.identifier.doi 10.1109/ACCESS.2019.2920885
dc.identifier.idgrec 136992
dc.rights.accessRights open access es_ES

View/Open       (12.60Mb)

This item appears in the following collection(s)

