
Rethinking Uncertainty Estimation in Natural Language Generation

Publication: Preprints, working papers and research reports (Preprint)

Abstract

Large Language Models (LLMs) are increasingly employed in real-world applications, driving the need to evaluate the trustworthiness of their generated text. To this end, reliable uncertainty estimation is essential. Since current LLMs generate text autoregressively through a stochastic process, the same prompt can lead to varying outputs. Consequently, leading uncertainty estimation methods generate and analyze multiple output sequences to determine the LLM's uncertainty. However, generating output sequences is computationally expensive, making these methods impractical at scale. In this work, we inspect the theoretical foundations of the leading methods and explore new directions to enhance their computational efficiency. Building on the framework of proper scoring rules, we find that the negative log-likelihood of the most likely output sequence constitutes a theoretically grounded uncertainty measure. To approximate this alternative measure, we propose G-NLL, which has the advantage of being obtained using only a single output sequence generated by greedy decoding. This makes uncertainty estimation more efficient and straightforward, while preserving theoretical rigor. Empirical results demonstrate that G-NLL achieves state-of-the-art performance across various LLMs and tasks. Our work lays the foundation for efficient and reliable uncertainty estimation in natural language generation, challenging the necessity of more computationally involved methods currently leading the field.
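The idea described in the abstract can be illustrated with a minimal sketch: decode greedily, and sum the negative log-probabilities of the tokens that were picked. This is an illustration under stated assumptions, not the authors' implementation. The `toy_model` lookup table, the `greedy_nll` function name, and the `<eos>` token are hypothetical stand-ins for a real LLM's next-token distribution, and the exact formulation in the paper may differ.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: maps a token prefix to a next-token distribution.
NextTokenDist = Callable[[Tuple[str, ...]], Dict[str, float]]


def greedy_nll(
    next_token_dist: NextTokenDist,
    eos: str = "<eos>",
    max_len: int = 50,
) -> Tuple[List[str], float]:
    """Greedily decode one sequence and return it with its negative log-likelihood."""
    prefix: Tuple[str, ...] = ()
    nll = 0.0
    for _ in range(max_len):
        dist = next_token_dist(prefix)
        # Greedy step: pick the single most probable next token.
        token, prob = max(dist.items(), key=lambda kv: kv[1])
        nll -= math.log(prob)
        prefix = prefix + (token,)
        if token == eos:
            break
    return list(prefix), nll


def toy_model(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Hypothetical stand-in for an LLM: a fixed table of next-token probabilities."""
    table = {
        (): {"Paris": 0.8, "Lyon": 0.2},
        ("Paris",): {"<eos>": 0.9, "!": 0.1},
    }
    return table.get(prefix, {"<eos>": 1.0})


tokens, score = greedy_nll(toy_model)
print(tokens, round(score, 4))
```

A higher score means the model assigns less probability to its own greedy output, which this sketch reads as higher uncertainty. Only one sequence is generated, which is the efficiency argument the abstract makes against sampling-based methods.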
Original language: English
Number of pages: 19
Publication status: Published - 2024

Publication series

Name: arXiv.org

Fields of science

  • 305907 Medical statistics
  • 202017 Embedded systems
  • 202036 Sensor systems
  • 101004 Biomathematics
  • 101014 Numerical mathematics
  • 101015 Operations research
  • 101016 Optimisation
  • 101017 Game theory
  • 101018 Statistics
  • 101019 Stochastics
  • 101024 Probability theory
  • 101026 Time series analysis
  • 101027 Dynamical systems
  • 101028 Mathematical modelling
  • 101029 Mathematical statistics
  • 101031 Approximation theory
  • 102 Computer sciences
  • 102001 Artificial intelligence
  • 102003 Image processing
  • 102004 Bioinformatics
  • 102013 Human-computer interaction
  • 102018 Artificial neural networks
  • 102019 Machine learning
  • 102032 Computational intelligence
  • 102033 Data mining
  • 305901 Computer-aided diagnosis and therapy
  • 305905 Medical informatics
  • 202035 Robotics
  • 202037 Signal processing
  • 103029 Statistical physics
  • 106005 Bioinformatics
  • 106007 Biostatistics

JKU focus areas

  • Digital Transformation
