
xLSTM: LSTM is so back

Activity: Talk or presentation, Invited talk, science-to-science

Description

Long Short-Term Memory (LSTM) networks have withstood the test of time, forming the foundation of many early deep learning breakthroughs, including the first generation of Large Language Models (LLMs). However, Transformers have since overshadowed LSTMs and become the dominant architecture for LLMs.
We revisit the potential of LSTMs and ask: Can LSTMs be scaled up to compete with Transformers?
We introduce xLSTM, a significantly enhanced version of LSTM featuring exponential gating and a novel matrix memory with covariance-based updates. Our kernel implementations of xLSTM adhere to scaling laws and train faster than Transformers. More crucially, xLSTM inference is linear-time in the number of produced tokens, in stark contrast to the quadratic complexity of attention mechanisms, which makes it highly efficient for deployment.
A 7-billion-parameter xLSTM model achieves performance comparable to state-of-the-art Transformer models while offering significantly faster inference. We are currently distilling large Transformer models into xLSTM variants with accelerated inference. Additionally, we have built xLSTM time-series foundation models that already outperform leading approaches such as Chronos (Amazon), TimesFM (Google), and Moirai (Salesforce).
xLSTM is already seeing real-world adoption: companies like Spleenlab and Festo have successfully integrated it into commercial products.
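To make the matrix memory with covariance-based updates and exponential gating mentioned above more concrete, here is a minimal sketch of one recurrent step, written in plain Python. It follows the mLSTM recurrence as commonly described for xLSTM, but it is a simplification under stated assumptions: a single head, a sigmoid forget gate, no output gate, and no learned projections. The function name `mlstm_step`, its argument names, and the helper `log_sigmoid` are hypothetical and chosen only for illustration; this is not the kernel implementation referred to in the talk.

```python
import math


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) for any real x.
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def mlstm_step(C, n, m, q, k, v, i_pre, f_pre):
    """One recurrent step of a simplified matrix-memory cell (illustrative only).

    C: d x d matrix memory (list of rows), n: normalizer vector of length d,
    m: scalar stabilizer state, q/k/v: query, key, value vectors of length d,
    i_pre/f_pre: scalar pre-activations of the input and forget gates.
    """
    d = len(k)
    # Exponential input gate and sigmoid forget gate, both kept in log space.
    log_i = i_pre
    log_f = log_sigmoid(f_pre)
    # The stabilizer state keeps the exponentials below from overflowing.
    m_new = max(log_f + m, log_i)
    i_gate = math.exp(log_i - m_new)
    f_gate = math.exp(log_f + m - m_new)
    # Scale the key by 1/sqrt(d), as in attention-style key scaling.
    k_s = [x / math.sqrt(d) for x in k]
    # Covariance-style update: decay the old memory, add the outer product v k^T.
    C_new = [[f_gate * C[r][c] + i_gate * v[r] * k_s[c] for c in range(d)]
             for r in range(d)]
    n_new = [f_gate * n[c] + i_gate * k_s[c] for c in range(d)]
    # Retrieval: query the memory and normalize, lower-bounding the denominator by 1.
    numerator = [sum(C_new[r][c] * q[c] for c in range(d)) for r in range(d)]
    denominator = max(abs(sum(n_new[c] * q[c] for c in range(d))), 1.0)
    h = [x / denominator for x in numerator]
    return C_new, n_new, m_new, h
```

Under these assumptions the state carried between tokens is only `C`, `n`, and `m`, whose sizes do not depend on how many tokens have been processed. Each generated token therefore costs a fixed amount of work (on the order of d² operations here), which is the sense in which inference is linear in the number of produced tokens, whereas attention revisits all previous tokens at every step.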
Period: 03 Jul 2025
Event title: International Joint Conference on Neural Networks
Event type: Conference
Location: Rome, Italy
Degree of Recognition: International

Fields of science

  • 101019 Stochastics
  • 102003 Image processing
  • 103029 Statistical physics
  • 101018 Statistics
  • 101017 Game theory
  • 102001 Artificial intelligence
  • 202017 Embedded systems
  • 101016 Optimisation
  • 101015 Operations research
  • 101014 Numerical mathematics
  • 101029 Mathematical statistics
  • 101028 Mathematical modelling
  • 101026 Time series analysis
  • 101024 Probability theory
  • 102032 Computational intelligence
  • 102004 Bioinformatics
  • 102013 Human-computer interaction
  • 101027 Dynamical systems
  • 305907 Medical statistics
  • 101004 Biomathematics
  • 305905 Medical informatics
  • 101031 Approximation theory
  • 102033 Data mining
  • 102 Computer Sciences
  • 305901 Computer-aided diagnosis and therapy
  • 102019 Machine learning
  • 106007 Biostatistics
  • 102018 Artificial neural networks
  • 106005 Bioinformatics
  • 202037 Signal processing
  • 202036 Sensor systems
  • 202035 Robotics

JKU Focus areas

  • Digital Transformation