Classical Estimation versus Machine Learning for Pitch Estimation

Jonas Lindenberger

Research output: Thesis › Master's / Diploma thesis

Abstract

Estimating the pitch of a periodic signal, also referred to as its fundamental frequency, plays an important role in many signal processing applications, ranging from audio and speech signal processing to industrial applications. The maximum likelihood (ML) pitch estimator is of great practical importance since it is optimal for additive white Gaussian noise (AWGN) in the sense that it is unbiased, and its variance approximately attains the Cramér-Rao lower bound (CRLB) for large data records. It also allows for a low-complexity fast Fourier transform (FFT)-based implementation, subject to some mild restrictions that are often met in practice. Recently, much work has been done on pitch estimation, tracking, and detection using machine learning approaches. While most of these approaches have been developed for specific tasks such as pitch tracking in music signals or heart rate estimation, very little effort has been made to compare them with classical benchmarks such as the CRLB and the accuracy and computational complexity of the ML pitch estimator. In this work, we describe the classical ML pitch estimator and derive its accuracy for the case where the data is windowed before the estimator is applied. We also train several neural networks, using both existing and new architectures, on simulated data and compare their accuracy and computational complexity with those of the classical ML pitch estimator. Finally, we provide a detailed description of the insights gained from training the neural networks, some of which may be useful for other estimation problems using neural networks.
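
The abstract does not reproduce the estimator itself. As a rough illustration of the kind of FFT-based implementation it refers to, the following is a minimal sketch of a harmonic-summation pitch estimator, a commonly used approximation of the ML estimator under AWGN. It is not taken from the thesis: the function name `harmonic_summation_pitch`, its parameters, the zero-padding factor, and the assumption that the number of harmonics is known in advance are all illustrative choices.

```python
import numpy as np


def harmonic_summation_pitch(x, fs, num_harmonics, f_min, f_max, nfft=None):
    """Grid-search pitch estimate by summing spectral power at harmonic bins.

    Hypothetical helper for illustration only. Assumes a real-valued signal,
    a known number of harmonics, and a record long enough that the harmonics
    are approximately orthogonal.
    """
    x = np.asarray(x, dtype=float)
    if nfft is None:
        # Zero-pad to a power of two at least 8x the record length
        # (an assumed choice) to obtain a finer frequency grid.
        nfft = 1 << int(np.ceil(np.log2(8 * x.size)))

    # Power spectrum on the non-negative frequency bins.
    power = np.abs(np.fft.rfft(x, nfft)) ** 2

    # Candidate fundamental bins; the highest harmonic must stay below Nyquist.
    k_min = max(1, int(np.ceil(f_min * nfft / fs)))
    k_max = min(int(np.floor(f_max * nfft / fs)),
                (power.size - 1) // num_harmonics)
    if k_max < k_min:
        raise ValueError("empty pitch search range")

    candidates = np.arange(k_min, k_max + 1)
    harmonics = np.arange(1, num_harmonics + 1)

    # Cost for each candidate: total power at bins k, 2k, ..., L*k.
    cost = power[np.outer(candidates, harmonics)].sum(axis=1)

    # Return the candidate frequency (in Hz) with the largest summed power.
    return candidates[np.argmax(cost)] * fs / nfft
```

The resolution of this sketch is limited to the FFT grid spacing `fs / nfft`; a practical implementation would typically refine the grid maximum, for example by interpolation or a local search.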
Original language: English
Supervisors/Reviewers
  • Haberl, Alexander, Co-supervisor, External person
  • Huemer, Mario, Supervisor
  • Lang, Oliver, Co-supervisor
  • Roland, Theresa, Co-supervisor
  • Staudinger, Clemens, Co-supervisor
Publication status: Published - Jul 2023

Fields of science

  • 102019 Machine learning
  • 202 Electrical Engineering, Electronics, Information Engineering
  • 202015 Electronics
  • 202022 Information technology
  • 202037 Signal processing

JKU Focus areas

  • Digital Transformation
