On Signal Propagation in Neural Networks

Research output: Thesis › Doctoral thesis

Abstract

Neural networks have proven to be powerful models for various tasks in the field of machine learning. Deep neural networks in particular, which stack many layers, have extraordinary predictive capabilities. However, the difficulty of training neural networks typically increases with depth. The main causes for this difficulty are believed to be vanishing gradients and distribution shifts.
Signal propagation theory is the study of how data propagates through a network. Controlling signal propagation through deep neural networks effectively alleviates distribution shifts and vanishing gradients. Therefore, improvements to signal propagation in neural networks have made it possible to train ever deeper models.
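The effect described above can be illustrated with a small experiment (a minimal sketch, not taken from the thesis; the function name `forward_std` and the chosen depth and width are illustrative). A random input is pushed through a stack of ReLU layers: with a naive weight scale the signal all but vanishes, while a variance-preserving scale (He initialisation) keeps its magnitude stable.

```python
import numpy as np

rng = np.random.default_rng(0)
depth, width = 50, 256

def forward_std(scale):
    """Propagate a random input through `depth` ReLU layers and
    return the standard deviation of the final activations."""
    x = rng.standard_normal(width)
    for _ in range(depth):
        # weights drawn i.i.d. with the given per-layer scale
        w = rng.standard_normal((width, width)) * scale
        x = np.maximum(w @ x, 0.0)  # ReLU
    return x.std()

naive = forward_std(1.0 / width)        # too small: signal shrinks each layer
he = forward_std(np.sqrt(2.0 / width))  # He scale: variance roughly preserved
```

After 50 layers the naively scaled signal has collapsed by many orders of magnitude, whereas the variance-preserving scale keeps the activations at a usable magnitude; the same mechanism governs the backward pass, which is why such analyses also address vanishing gradients.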
Despite extensive studies, there are still numerous boundaries that limit the possibilities of signal propagation. This cumulative thesis presents a set of publications that aim to push the boundaries of signal propagation theory on multiple levels. First, boundaries between different viewpoints on signal propagation are weakened by establishing connections between these viewpoints. In particular, the connection between propagation during the forward and backward passes is emphasised. Furthermore, the outer boundary of signal propagation theory is expanded by generalising signal propagation theory to layers that have weights with non-zero mean. This enables the creation of a principled weight initialisation for convex networks, in which weights are constrained to be positive. Finally, boundaries on the application domain are expanded by creating a novel architecture with particular propagation characteristics. The resulting Mass-Conserving LSTM (MC-LSTM) architecture guarantees perfect conservation and propagation of its inputs, unlocking remarkable generalisation performance.
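The conservation property mentioned above can be sketched in a few lines (a simplified illustration of the principle, not the full MC-LSTM; the function `mc_step` and its arguments are hypothetical names). If the redistribution matrix is column-stochastic and the input gate sums to one, every unit of incoming mass is either stored in the cell state or emitted as output, so the totals always balance.

```python
import numpy as np

def mc_step(c, x_m, R, i_gate, o_gate):
    """One illustrative mass-conserving update. Columns of `R` sum to
    one and `i_gate` sums to one, so no mass is created or destroyed."""
    m_in = i_gate * x_m   # distribute incoming mass over the cells
    m_sys = R @ c         # redistribute stored mass between cells
    c_new = m_sys + m_in
    m_out = o_gate * c_new  # a fraction of each cell leaves as output
    return c_new - m_out, m_out

rng = np.random.default_rng(1)
n = 4
c0 = rng.random(n)                                        # initial cell state
R = rng.random((n, n)); R /= R.sum(axis=0, keepdims=True) # column-stochastic
i_gate = rng.random(n); i_gate /= i_gate.sum()            # sums to one
o_gate = rng.random(n)                                    # fractions in [0, 1)
x_m = 2.0                                                 # incoming mass

c1, m_out = mc_step(c0, x_m, R, i_gate, o_gate)
# conservation: stored + emitted mass equals previous stored + input mass
assert np.isclose(c1.sum() + m_out.sum(), c0.sum() + x_m)
```

Because the constraint holds by construction rather than being learned, it remains exact far outside the training distribution, which is one way to understand the generalisation behaviour claimed for the architecture.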
Original language: English
Qualification: PhD
Awarding Institution
  • Johannes Kepler University Linz
Supervisors/Reviewers
  • Klambauer, Günter, Supervisor
  • Hernández-Lobato, José Miguel, Co-supervisor, External person
Publication status: Published - Dec 2023

Fields of science

  • 101019 Stochastics
  • 102003 Image processing
  • 103029 Statistical physics
  • 101018 Statistics
  • 101017 Game theory
  • 102001 Artificial intelligence
  • 202017 Embedded systems
  • 101016 Optimisation
  • 101015 Operations research
  • 101014 Numerical mathematics
  • 101029 Mathematical statistics
  • 101028 Mathematical modelling
  • 101026 Time series analysis
  • 101024 Probability theory
  • 102032 Computational intelligence
  • 102004 Bioinformatics
  • 102013 Human-computer interaction
  • 101027 Dynamical systems
  • 305907 Medical statistics
  • 101004 Biomathematics
  • 305905 Medical informatics
  • 101031 Approximation theory
  • 102033 Data mining
  • 102 Computer Sciences
  • 305901 Computer-aided diagnosis and therapy
  • 102019 Machine learning
  • 106007 Biostatistics
  • 102018 Artificial neural networks
  • 106005 Bioinformatics
  • 202037 Signal processing
  • 202036 Sensor systems
  • 202035 Robotics

JKU Focus areas

  • Digital Transformation
