Memory degradation induced by attention in recurrent neural architectures

dc.contributor.author Harvat, Mykola
dc.contributor.author Martín Guerrero, José David
dc.date.accessioned 2023-06-15T07:52:14Z
dc.date.available 2023-06-16T04:45:06Z
dc.date.issued 2022 es_ES
dc.identifier.citation Harvat, M., & Martín-Guerrero, J. D. (2022). Memory degradation induced by attention in recurrent neural architectures. Neurocomputing, 502, 161-176. es_ES
dc.identifier.uri https://hdl.handle.net/10550/87913
dc.description.abstract This paper studies the memory mechanisms in recurrent neural architectures when attention models are included. Pure-attention models like Transformers are increasingly popular, as they tend to outperform models with recurrent connections in many different tasks. Our conjecture is that attention prevents the recurrent connections from transferring information properly between consecutive time steps. This conjecture is empirically tested using five different models, namely, a model without attention, a standard Luong attention model, a standard Bahdanau attention model, and our proposal to add attention to the inputs in order to bridge the gap between recurrent and parallel architectures (for both Luong and Bahdanau attention models). Eight different problems are considered to assess the five models: a sequence-reverse copy problem, a sequence-reverse copy problem with repetitions, a filter sequence problem, a sequence-reverse copy problem with bigrams, and four translation problems (English to Spanish, English to French, English to German and English to Italian). The achieved results reinforce our conjecture on the interaction between attention and recurrence. es_ES
dc.language.iso en es_ES
dc.publisher Elsevier es_ES
dc.subject long short-term memory networks es_ES
dc.subject attention mechanisms es_ES
dc.subject recurrence es_ES
dc.subject gate activations es_ES
dc.subject forget gate es_ES
dc.title Memory degradation induced by attention in recurrent neural architectures es_ES
dc.type journal article es_ES
dc.subject.unesco UNESCO::CIENCIAS TECNOLÓGICAS es_ES
dc.identifier.doi 10.1016/j.neucom.2022.06.056 es_ES
dc.accrualmethod CI es_ES
dc.embargo.terms 0 days es_ES
dc.type.hasVersion VoR es_ES
dc.rights.accessRights open access es_ES
