
Memory Architectures for Deep Learning

Activity: Talk or presentation, Invited talk, Science-to-science

Description

Currently, the most successful deep learning architecture is the transformer. The transformer's attention mechanism is equivalent to the update rule of modern Hopfield networks and is therefore an associative memory. However, this associative memory has disadvantages: its complexity is quadratic in the sequence length when sequence elements are mutually associated, it is restricted to pairwise associations, it offers limited means of modifying the memory, and its abstraction capabilities are insufficient. In contrast, recurrent neural networks (RNNs) such as LSTMs have linear complexity, associate each sequence element with a representation of all previous elements, can directly modify memory content, and have high abstraction capabilities. However, RNNs cannot store sequence elements that were rare in the training data, since RNNs have to learn to store them. Transformers can store rare or even new sequence elements, which, besides their high parallelizability, is one of the main reasons why they have outperformed RNNs in language modelling. I think that future successful deep learning architectures should comprise both of these memories: attention for implementing episodic memories, and RNNs for implementing short-term memories and abstraction.
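
The contrast between the two kinds of memory can be illustrated with a minimal NumPy sketch. It is not taken from the talk: the function names, the sizes `n`, `d`, `hidden`, the inverse temperature `beta`, and the random, untrained RNN weights are all assumptions chosen for illustration. The attention part stores patterns without any training and retrieves one from a noisy query, at the cost of an n x n score matrix when all elements attend to each other. The recurrent part compresses the same sequence into a fixed-size state in a single pass.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over the last axis."""
    shifted = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)


def attention_retrieve(queries, keys, values, beta=1.0):
    """Softmax attention read as associative-memory retrieval.

    With keys == values == stored patterns, this has the form of one
    modern Hopfield update step with inverse temperature beta.
    """
    scores = beta * (queries @ keys.T)  # shape (n_queries, n_keys)
    return softmax(scores) @ values


def recurrent_summary(xs, W_h, W_x):
    """Plain tanh RNN: one pass over xs, fixed-size state."""
    h = np.zeros(W_h.shape[0])
    for x in xs:
        h = np.tanh(W_h @ h + W_x @ x)
    return h


rng = np.random.default_rng(0)
n, d, hidden = 6, 64, 8  # illustrative sizes, not from the talk

# "Store" n unit-norm patterns simply by keeping them; no training needed.
patterns = rng.standard_normal((n, d))
patterns /= np.linalg.norm(patterns, axis=1, keepdims=True)

# Retrieve pattern 2 from a noisy version of itself.
noisy_query = patterns[2] + 0.05 * rng.standard_normal(d)
retrieved = attention_retrieve(noisy_query[None, :], patterns, patterns, beta=50.0)[0]
best_match = int(np.argmax(patterns @ retrieved))

# Mutual association of all elements needs an n x n score matrix.
pairwise_scores = patterns @ patterns.T

# The RNN state has a fixed size regardless of sequence length.
# The weights here are random and untrained (assumption for illustration).
W_h = 0.1 * rng.standard_normal((hidden, hidden))
W_x = 0.1 * rng.standard_normal((hidden, d))
state = recurrent_summary(patterns, W_h, W_x)

print("retrieved pattern index:", best_match)
print("pairwise score matrix:", pairwise_scores.shape)
print("recurrent state:", state.shape)
```

The untrained RNN state carries no useful abstraction here; a real RNN would have to learn its weights so that the state retains what matters, which is the limitation the abstract points to for rare sequence elements.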
Period: 16 Nov 2023
Event title: unknown
Event type: Other
Location: Austria

Fields of science

  • 101031 Approximation theory
  • 102 Computer sciences
  • 305901 Computer-aided diagnosis and therapy
  • 102033 Data mining
  • 102032 Computational intelligence
  • 101029 Mathematical statistics
  • 102013 Human-computer interaction
  • 305905 Medical informatics
  • 101028 Mathematical modelling
  • 101027 Dynamical systems
  • 101004 Biomathematics
  • 101026 Time series analysis
  • 202017 Embedded systems
  • 101024 Probability theory
  • 305907 Medical statistics
  • 102019 Machine learning
  • 202037 Signal processing
  • 102018 Artificial neural networks
  • 103029 Statistical physics
  • 202036 Sensor systems
  • 202035 Robotics
  • 106005 Bioinformatics
  • 106007 Biostatistics
  • 101019 Stochastics
  • 101018 Statistics
  • 101017 Game theory
  • 101016 Optimisation
  • 102001 Artificial intelligence
  • 101015 Operations research
  • 102004 Bioinformatics
  • 101014 Numerical mathematics
  • 102003 Image processing

JKU focus areas

  • Digital Transformation