Memory degradation induced by attention in recurrent neural architectures

Harvat, Mykola; Martín Guerrero, José David

This document is an article, created in 2022.

This paper studies the memory mechanisms of recurrent neural architectures when attention models are included. Pure-attention models such as Transformers are increasingly popular because they tend to outperform models with recurrent connections in many different tasks. Our conjecture is that attention prevents the recurrent connections from properly transferring information between consecutive steps. This conjecture is tested empirically using five different models: a model without attention, a standard Luong attention model, a standard Bahdanau attention model, and our proposal to add attention to the inputs in order to bridge the gap between recurrent and parallel architectures (for both Luong and Bahdanau attention models). Eight different problems are considered to assess the five models: a sequence-reverse copy problem, a sequence-reverse copy problem with repetitions, a filter...
