Memory degradation induced by attention in recurrent neural architectures

DSpace/Manakin Repository

IMPORTANT: This repository has been running an old version since 3/12/2023. The new installation is at https://roderic.uv.es/

dc.contributor.author	Harvat, Mykola
dc.contributor.author	Martín Guerrero, José David
dc.date.accessioned	2023-06-15T07:52:14Z
dc.date.available	2023-06-16T04:45:06Z
dc.date.issued	2022	es_ES
dc.identifier.citation	Harvat, M., & Martín-Guerrero, J. D. (2022). Memory degradation induced by attention in recurrent neural architectures. Neurocomputing, 502, 161-176.	es_ES
dc.identifier.uri	https://hdl.handle.net/10550/87913
dc.description.abstract	This paper studies the memory mechanisms in recurrent neural architectures when attention models are included. Pure-attention models like Transformers are increasingly popular, as they tend to outperform models with recurrent connections in many different tasks. Our conjecture is that attention prevents the recurrent connections from transferring information properly between consecutive time steps. This conjecture is empirically tested using five different models, namely, a model without attention, a standard Luong attention model, a standard Bahdanau attention model, and our proposal to add attention to the inputs in order to fill the gap between recurrent and parallel architectures (for both Luong and Bahdanau attention models). Eight different problems are considered to assess the five models: a sequence-reverse copy problem, a sequence-reverse copy problem with repetitions, a filter sequence problem, a sequence-reverse copy problem with bigrams, and four translation problems (English to Spanish, English to French, English to German, and English to Italian). The results reinforce our conjecture on the interaction between attention and recurrence.	es_ES
dc.language.iso	en	es_ES
dc.publisher	Elsevier	es_ES
dc.subject	long short-term memory networks	es_ES
dc.subject	attention mechanisms	es_ES
dc.subject	recurrence	es_ES
dc.subject	gate activations	es_ES
dc.subject	forget gate	es_ES
dc.title	Memory degradation induced by attention in recurrent neural architectures	es_ES
dc.type	journal article	es_ES
dc.subject.unesco	UNESCO::CIENCIAS TECNOLÓGICAS	es_ES
dc.identifier.doi	10.1016/j.neucom.2022.06.056	es_ES
dc.accrualmethod	CI	es_ES
dc.embargo.terms	0 days	es_ES
dc.type.hasVersion	VoR	es_ES
dc.rights.accessRights	open access	es_ES