An Scalable matrix computing unit architecture for FPGA and SCUMO user design interface

IMPORTANT: This repository has been running an old version since 3/12/2023. The new installation is at https://roderic.uv.es/

dc.contributor.author	Abbaszadeh, Asgar
dc.contributor.author	Iakymchuk, Taras
dc.contributor.author	Bataller Mompean, Manuel
dc.contributor.author	Francés Villora, José Vicente
dc.contributor.author	Rosado Muñoz, Alfredo
dc.date.accessioned	2020-04-29T10:56:40Z
dc.date.available	2020-04-29T10:56:40Z
dc.date.issued	2019
dc.identifier.citation	Abbaszadeh, Asgar; Iakymchuk, Taras; Bataller Mompean, Manuel; Francés Villora, José Vicente; Rosado Muñoz, Alfredo (2019). An Scalable matrix computing unit architecture for FPGA and SCUMO user design interface. Electronics, vol. 8, num. 1, p. 94-1-94-20
dc.identifier.uri	https://hdl.handle.net/10550/74126
dc.description.abstract	High-dimensional matrix algebra is essential in numerous signal processing and machine learning algorithms. This work describes a scalable square matrix-computing unit designed on the basis of circulant matrices. It optimizes data flow for the computation of any sequence of matrix operations, removing the need for data movement for intermediate results, together with the individual matrix operations' performance in direct or transposed form (the transpose matrix operation only requires a data addressing modification). The allowed matrix operations are: matrix-by-matrix addition, subtraction, dot product and multiplication, matrix-by-vector multiplication, and matrix-by-scalar multiplication. The proposed architecture is fully scalable, with the maximum matrix dimension limited by the available resources. In addition, a design environment is also developed, which assists the user, through a friendly interface, from the customization of the hardware computing unit to the generation of the final synthesizable IP core. For N x N matrices, the architecture requires N ALU-RAM blocks and operates in O(N²) time, requiring N² + 7 and N + 7 clock cycles for matrix-matrix and matrix-vector operations, respectively. For the tested Virtex7 FPGA device, the computation for 500 x 500 matrices allows a maximum clock frequency of 346 MHz, achieving an overall performance of 173 GOPS. This architecture shows higher performance than other state-of-the-art matrix computing units.
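The cycle counts and throughput quoted in the abstract can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' method: the function names are hypothetical, and the throughput figure assumes that each of the N ALU-RAM blocks completes one operation per clock cycle, which the abstract does not state explicitly but which is consistent with 500 blocks at 346 MHz giving 173 GOPS.

```python
def matrix_matrix_cycles(n: int) -> int:
    """Clock cycles for an N x N matrix-matrix operation (N^2 + 7, per the abstract)."""
    return n * n + 7


def matrix_vector_cycles(n: int) -> int:
    """Clock cycles for an N x N matrix-vector operation (N + 7, per the abstract)."""
    return n + 7


def peak_gops(n_alus: int, clock_mhz: float) -> float:
    """Peak throughput in GOPS.

    Assumption (not stated in the abstract): every ALU-RAM block
    finishes one operation per clock cycle.
    """
    return n_alus * clock_mhz / 1000.0


def cycles_to_microseconds(cycles: int, clock_mhz: float) -> float:
    """Convert a cycle count to microseconds at the given clock frequency in MHz."""
    return cycles / clock_mhz


if __name__ == "__main__":
    n = 500            # matrix dimension reported for the Virtex7 test
    clock_mhz = 346.0  # maximum clock frequency reported in the abstract

    mm = matrix_matrix_cycles(n)
    mv = matrix_vector_cycles(n)
    print(f"matrix-matrix: {mm} cycles, {cycles_to_microseconds(mm, clock_mhz):.2f} us")
    print(f"matrix-vector: {mv} cycles, {cycles_to_microseconds(mv, clock_mhz):.2f} us")
    print(f"peak throughput: {peak_gops(n, clock_mhz):.1f} GOPS")
```

Under that assumption, the 7-cycle constant is a fixed overhead that becomes negligible for matrix-matrix operations at N = 500, while it remains a visible share of the matrix-vector latency.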
dc.language.iso	eng
dc.relation.ispartof	Electronics, 2019, vol. 8, num. 1, p. 94-1-94-20
dc.subject	Computer architecture
dc.subject	Computer systems
dc.title	An Scalable matrix computing unit architecture for FPGA and SCUMO user design interface
dc.type	journal article	es_ES
dc.date.updated	2020-04-29T10:56:41Z
dc.identifier.doi	10.3390/electronics8010094
dc.identifier.idgrec	136993
dc.rights.accessRights	open access	es_ES