Improved Density Ratio Estimation for Evaluating Synthetic Data Quality

  • Lukas Gruber (Speaker)

Activity: Talk or presentation · Poster presentation · science-to-science

Description

High-quality synthetic data is essential for accurate downstream analysis. Density Ratio Estimation (DRE) has emerged as a powerful tool for evaluating synthetic data quality. However, existing DRE methods are highly sensitive to hyperparameter selection, where suboptimal choices lead to poor convergence rates and degraded empirical performance. To mitigate this, we propose a novel model aggregation algorithm for DRE that trains multiple models with diverse hyperparameter configurations and combines their outputs. Our approach achieves fast convergence without requiring prior knowledge of the unknown density ratio smoothness and is minimax optimal for the squared loss. We demonstrate that our method enhances the performance of established DRE techniques across benchmark datasets, achieving state-of-the-art results on MiniDomainNet and Amazon Reviews.
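The aggregation idea described above can be illustrated with a minimal sketch. It is not the algorithm from the talk: it assumes a uLSIF-style kernel least-squares estimator as the base DRE method, a small hypothetical hyperparameter grid, and a plain linear combination whose weights are fit on held-out samples under the same squared loss. All function names, the toy Gaussian data, and the `ridge` stabiliser are assumptions made for illustration only.

```python
import numpy as np

def gaussian_features(x, centers, sigma):
    """Gaussian kernel features, shape (n_samples, n_centers)."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def fit_ulsif(x_p, x_q, centers, sigma, lam):
    """Fit r(x) ~ p(x)/q(x) by regularised least squares (uLSIF-style)."""
    phi_p = gaussian_features(x_p, centers, sigma)
    phi_q = gaussian_features(x_q, centers, sigma)
    H = phi_q.T @ phi_q / len(x_q)   # second moment under q
    h = phi_p.mean(axis=0)           # first moment under p
    alpha = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return lambda x: gaussian_features(x, centers, sigma) @ alpha

def squared_loss(r, x_p, x_q):
    """Empirical squared loss for DRE, up to a constant: 0.5*E_q[r^2] - E_p[r]."""
    return 0.5 * np.mean(r(x_q) ** 2) - np.mean(r(x_p))

def aggregate(models, x_p_val, x_q_val, ridge=1e-6):
    """Assumed aggregation step: linear weights minimising the squared loss
    on held-out samples (tiny ridge term for numerical stability)."""
    R_p = np.column_stack([m(x_p_val) for m in models])
    R_q = np.column_stack([m(x_q_val) for m in models])
    G = R_q.T @ R_q / len(x_q_val)
    g = R_p.mean(axis=0)
    w = np.linalg.solve(G + ridge * np.eye(len(models)), g)
    return (lambda x: np.column_stack([m(x) for m in models]) @ w), w

# Toy data (assumption): p plays the role of real data, q of synthetic data.
rng = np.random.default_rng(0)
x_p = rng.normal(0.0, 1.0, size=(400, 1))
x_q = rng.normal(0.5, 1.2, size=(400, 1))
x_p_tr, x_p_val = x_p[:200], x_p[200:]
x_q_tr, x_q_val = x_q[:200], x_q[200:]
centers = x_p_tr[:50]

# Base models over a hypothetical grid of kernel widths and regularisers.
models = [fit_ulsif(x_p_tr, x_q_tr, centers, sigma, lam)
          for sigma in (0.3, 1.0, 3.0) for lam in (1e-3, 1e-1)]

combined, weights = aggregate(models, x_p_val, x_q_val)
base_losses = [squared_loss(m, x_p_val, x_q_val) for m in models]
agg_loss = squared_loss(combined, x_p_val, x_q_val)
print("best single model:", min(base_losses), "aggregate:", agg_loss)
```

Because each base model corresponds to one particular choice of weights, the combined estimate cannot do worse than the best single model on the held-out samples used to fit the weights, apart from the small ridge term. Performance on fresh data is not guaranteed by this sketch and would need a separate evaluation split.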
Period: 27 Apr 2025
Event title: ICLR 2025 Workshop SynthData
Event type: Workshop
Location: Singapore
Degree of Recognition: International

Fields of science

  • 101019 Stochastics
  • 102003 Image processing
  • 103029 Statistical physics
  • 101018 Statistics
  • 101017 Game theory
  • 102001 Artificial intelligence
  • 202017 Embedded systems
  • 101016 Optimisation
  • 101015 Operations research
  • 101014 Numerical mathematics
  • 101029 Mathematical statistics
  • 101028 Mathematical modelling
  • 101026 Time series analysis
  • 101024 Probability theory
  • 102032 Computational intelligence
  • 102004 Bioinformatics
  • 102013 Human-computer interaction
  • 101027 Dynamical systems
  • 305907 Medical statistics
  • 101004 Biomathematics
  • 305905 Medical informatics
  • 101031 Approximation theory
  • 102033 Data mining
  • 102 Computer Sciences
  • 305901 Computer-aided diagnosis and therapy
  • 102019 Machine learning
  • 106007 Biostatistics
  • 102018 Artificial neural networks
  • 106005 Bioinformatics
  • 202037 Signal processing
  • 202036 Sensor systems
  • 202035 Robotics

JKU Focus areas

  • Digital Transformation