ELENAS: Elementary Neural Architecture Search

Research output: Chapter in Book/Report/Conference proceeding › Conference proceedings › peer-review

Abstract

Deep neural networks typically rely on a few key building blocks such as feed-forward, convolution, recurrent, long short-term memory, or attention blocks. On an elementary level, these blocks consist of a relatively small number of different mathematical operations. However, as the number of all combinations of these operations is immense, crafting such novel building blocks requires profound expert knowledge and is far from being fully explored. We propose Elementary Neural Architecture Search (ELENAS), a method that learns to combine elementary mathematical operations to form new building blocks for deep neural networks. These building blocks are represented as computational graphs, which are processed by graph neural networks as part of a reinforcement learning system. Our approach contrasts with the current research direction of Neural Architecture Search, which mainly focuses on designing neural networks by altering and combining a few already established building blocks. In a set of experiments, we demonstrate that our method leads to efficient building blocks that achieve strong generalization and transfer well to real-world data. When stacked together, they approach and even outperform state-of-the-art neural networks at several prediction tasks. Our underlying methodological framework offers high flexibility and broad applicability across domains while requiring relatively low computational cost. Consequently, it has the potential to find novel building blocks that become generally important for machine learning practitioners beyond specific data or use cases.
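
To make the idea of a building block as a computational graph of elementary operations concrete, here is a minimal sketch. It is an illustration only, not the authors' implementation: the operation set `OPS`, the node format, and the `evaluate` helper are all assumptions introduced for this example. It writes a familiar feed-forward block, tanh(xW + b), as three elementary nodes.

```python
import math

# Hypothetical set of elementary operations on 2-D lists (matrices).
# The operation set actually used by ELENAS is not stated in this record.
OPS = {
    "matmul": lambda a, b: [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
                            for row in a],
    "add": lambda a, b: [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)],
    "tanh": lambda a: [[math.tanh(x) for x in row] for row in a],
}


def evaluate(graph, inputs):
    """Evaluate a graph given as (node name, op name, parent names) in topological order."""
    values = dict(inputs)
    for name, op, parents in graph:
        values[name] = OPS[op](*(values[p] for p in parents))
    return values


# A feed-forward block tanh(x W + b) expressed as elementary nodes.
feed_forward = [
    ("xw", "matmul", ("x", "W")),
    ("z", "add", ("xw", "b")),
    ("out", "tanh", ("z",)),
]

x = [[1.0, 2.0]]                 # one sample with two features
W = [[0.5, -1.0], [0.25, 0.0]]   # 2x2 weight matrix
b = [[0.0, 1.0]]                 # bias row
result = evaluate(feed_forward, {"x": x, "W": W, "b": b})["out"]
```

Under this representation, a search procedure would propose new node lists rather than choosing among pre-built blocks; how ELENAS generates and scores such graphs is described in the paper itself, not in this sketch.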
Original language: English
Title of host publication: AutoML Conference 2023
Number of pages: 19
Publication status: Published - 2023

Fields of science

  • 305907 Medical statistics
  • 202017 Embedded systems
  • 202036 Sensor systems
  • 101004 Biomathematics
  • 101014 Numerical mathematics
  • 101015 Operations research
  • 101016 Optimisation
  • 101017 Game theory
  • 101018 Statistics
  • 101019 Stochastics
  • 101024 Probability theory
  • 101026 Time series analysis
  • 101027 Dynamical systems
  • 101028 Mathematical modelling
  • 101029 Mathematical statistics
  • 101031 Approximation theory
  • 102 Computer Sciences
  • 102001 Artificial intelligence
  • 102003 Image processing
  • 102004 Bioinformatics
  • 102013 Human-computer interaction
  • 102018 Artificial neural networks
  • 102019 Machine learning
  • 102032 Computational intelligence
  • 102033 Data mining
  • 305901 Computer-aided diagnosis and therapy
  • 305905 Medical informatics
  • 202035 Robotics
  • 202037 Signal processing
  • 103029 Statistical physics
  • 106005 Bioinformatics
  • 106007 Biostatistics

JKU Focus areas

  • Digital Transformation
