Abstract
Agents interacting under partial observability require access to past observations via a memory mechanism in order to approximate the true state of the environment.
Recent work suggests that leveraging language as abstraction provides benefits for creating a representation of past events.
History Compression via Language Models (HELM) leverages a pretrained Language Model (LM) for representing the past.
It relies on a randomized attention mechanism to translate environment observations to token embeddings.
In this work, we show that the representations resulting from this attention mechanism can collapse under certain conditions.
This renders the agent blind to subtle changes in the environment that may be crucial for solving a given task.
We propose a solution to this problem consisting of two parts.
First, we improve upon HELM by substituting the attention mechanism with a feature-wise centering-and-scaling operation.
Second, we take a step toward semantic history compression by leveraging foundation models, such as CLIP, to encode observations, which further improves performance.
By combining these foundation models, our agent is able to solve the challenging MiniGrid-Memory environment.
Surprisingly, however, our experiments suggest that this is not due to the semantic enrichment of the representation presented to the LM, but rather due to the discriminative power provided by CLIP.
We make our code publicly available at https://github.com/ml-jku/helm.
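The feature-wise centering-and-scaling operation mentioned in the abstract can be illustrated as follows: observation features (e.g. from CLIP) are whitened per feature dimension and then re-colored to match the statistics of the frozen LM's token-embedding space, so the LM receives inputs distributed like its own embeddings. This is a minimal sketch of the idea only; the function and variable names are illustrative and not taken from the paper's implementation.

```python
import numpy as np

def center_and_scale(obs_features, token_embeddings, eps=1e-8):
    """Map observation features into the statistics of a frozen LM's
    token-embedding space via feature-wise centering and scaling.

    obs_features:     (batch, d) observation encoder outputs
    token_embeddings: (vocab, d) frozen LM input embeddings
    """
    # Feature-wise statistics of the incoming observation features
    obs_mean = obs_features.mean(axis=0, keepdims=True)
    obs_std = obs_features.std(axis=0, keepdims=True)

    # Target statistics: the LM's token-embedding distribution
    tok_mean = token_embeddings.mean(axis=0, keepdims=True)
    tok_std = token_embeddings.std(axis=0, keepdims=True)

    # Whiten each feature, then re-color to match the token space
    normalized = (obs_features - obs_mean) / (obs_std + eps)
    return normalized * tok_std + tok_mean
```

Unlike an attention-based mapping, this operation cannot collapse distinct observations onto the same token embedding: it is an affine, feature-wise transformation, so differences between observations are preserved up to rescaling.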
| Original language | English |
|---|---|
| Title of host publication | Neural Information Processing Systems Foundation (NeurIPS 2022) |
| Number of pages | 90 |
| Publication status | Published - 2022 |
Fields of science
- 305907 Medical statistics
- 202017 Embedded systems
- 202036 Sensor systems
- 101004 Biomathematics
- 101014 Numerical mathematics
- 101015 Operations research
- 101016 Optimisation
- 101017 Game theory
- 101018 Statistics
- 101019 Stochastics
- 101024 Probability theory
- 101026 Time series analysis
- 101027 Dynamical systems
- 101028 Mathematical modelling
- 101029 Mathematical statistics
- 101031 Approximation theory
- 102 Computer Sciences
- 102001 Artificial intelligence
- 102003 Image processing
- 102004 Bioinformatics
- 102013 Human-computer interaction
- 102018 Artificial neural networks
- 102019 Machine learning
- 102032 Computational intelligence
- 102033 Data mining
- 305901 Computer-aided diagnosis and therapy
- 305905 Medical informatics
- 202035 Robotics
- 202037 Signal processing
- 103029 Statistical physics
- 106005 Bioinformatics
- 106007 Biostatistics
JKU Focus areas
- Digital Transformation
Projects
- 1 Finished
JKU LIT SAL eSPML Lab
Baumgartner, S. (Researcher), Bognar, G. (Researcher), Hochreiter, S. (Researcher), Hofmarcher, M. (Researcher), Kovacs, P. (Researcher), Schmid, S. (Researcher), Shtainer, A. (Researcher), Springer, A. (Researcher), Wille, R. (Researcher) & Huemer, M. (PI)
01.07.2020 → 31.12.2023
Project: Other › Other project