Modern Hopfield Networks

Activity: Talk or presentation · Invited talk (science-to-science)

Description

Associative memories are among the earliest artificial neural models, dating back to the 1960s and 1970s. The best known are Hopfield Networks, presented by John Hopfield in 1982. Recently, Modern Hopfield Networks have been introduced, which tremendously increase the storage capacity and converge extremely fast.

We generalize the energy function of Modern Hopfield Networks to continuous patterns and propose a new update rule. The new Hopfield Network has exponential storage capacity. Its update rule guarantees global convergence to energy minima and converges after one update step with exponentially small error. The new Hopfield Network has three types of energy minima (fixed points of the update): (1) a global fixed point averaging over all patterns, (2) metastable states averaging over a subset of patterns, and (3) fixed points that each store a single pattern. Surprisingly, the transformer attention mechanism is equal to the update rule of our new Modern Hopfield Network with continuous states. Transformer and BERT models preferably operate in the global averaging regime in their first layers and in metastable states in higher layers.

We provide a new PyTorch layer called "Hopfield", which allows equipping deep learning architectures with Modern Hopfield Networks as a new powerful concept comprising pooling, memory, and attention. The layer serves applications such as multiple instance learning, set-based and permutation-invariant learning, associative learning, and many more. We show several tasks for which integrating the new Hopfield layer into a deep learning architecture increased performance.
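The continuous update rule described above takes the attention-like form ξ_new = X softmax(β Xᵀ ξ), where the columns of X are the stored patterns and ξ is the current state (query). A minimal NumPy sketch of one retrieval step, with β and the pattern dimensions chosen here only for illustration:

```python
import numpy as np

def softmax(z):
    # numerically stable softmax
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def hopfield_update(X, xi, beta=8.0):
    """One step of the continuous modern Hopfield update:
    xi_new = X softmax(beta * X^T xi).
    X: (d, N) stored patterns as columns; xi: (d,) query state."""
    return X @ softmax(beta * X.T @ xi)

# Store a few random patterns and retrieve one from a noisy query.
rng = np.random.default_rng(0)
X = rng.standard_normal((64, 5))          # 5 patterns in 64 dimensions
query = X[:, 2] + 0.1 * rng.standard_normal(64)
retrieved = hopfield_update(X, query)      # close to the stored pattern X[:, 2]
```

For well-separated patterns and sufficiently large β, the softmax puts almost all weight on the closest stored pattern, which is why one update step already retrieves it with exponentially small error. With a small β the same rule instead averages over several patterns, corresponding to the metastable and global fixed points mentioned above.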
Period: 18 Nov 2020
Event title: 20th IEEE International Conference on Data Mining (ICDM 2020)
Event type: Conference
Location: Austria

Fields of science

  • 101031 Approximation theory
  • 102 Computer Sciences
  • 305901 Computer-aided diagnosis and therapy
  • 102033 Data mining
  • 102032 Computational intelligence
  • 101029 Mathematical statistics
  • 102013 Human-computer interaction
  • 305905 Medical informatics
  • 101028 Mathematical modelling
  • 101027 Dynamical systems
  • 101004 Biomathematics
  • 101026 Time series analysis
  • 202017 Embedded systems
  • 101024 Probability theory
  • 305907 Medical statistics
  • 102019 Machine learning
  • 202037 Signal processing
  • 102018 Artificial neural networks
  • 103029 Statistical physics
  • 202036 Sensor systems
  • 202035 Robotics
  • 106005 Bioinformatics
  • 106007 Biostatistics
  • 101019 Stochastics
  • 101018 Statistics
  • 101017 Game theory
  • 101016 Optimisation
  • 102001 Artificial intelligence
  • 101015 Operations research
  • 102004 Bioinformatics
  • 101014 Numerical mathematics
  • 102003 Image processing

JKU Focus areas

  • Digital Transformation