Data fusion: creating new opportunities for data analysis? A study on the potential of data fusion in survey research

Research output: Contribution to journal, Article, peer-review

Abstract

Primary data collection has become increasingly challenging, for several reasons: the cost of conducting face-to-face interviews has increased, the number of available interviewers has decreased, and the ongoing shift to web-based interviewing has resulted in shorter questionnaires, making it difficult to accurately measure latent constructs and cover a wide range of topics. Therefore, although primary data collection has the advantage of being tailored to one's research questions, secondary data analysis is often more feasible. For this purpose, data archives such as the Consortium of European Social Science Data Archives (CESSDA) provide a large amount of high-quality data. However, a common problem when working with secondary data is that important variables are missing in one dataset and are available only in another. We propose a possible solution to this problem: "data fusion", which augments one dataset with the missing variables that are initially available only in another dataset. From a statistical point of view, this corresponds to a missing value problem, which is why multiple imputation is often used to fuse datasets. Despite its promise, data fusion is only sporadically applied in the social sciences. This paper discusses the potential of this statistical technique in the context of social science research and derives a guide for practitioners interested in applying the method to their own research. The method and its potential are discussed using an example with data from the European Social Survey (ESS) and the Austrian Social Survey (ASS).
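
To illustrate the idea of treating data fusion as a missing value problem, the following is a minimal sketch and not the procedure used in the paper. It assumes a "donor" dataset that contains both the common variables and the target variable, and a "recipient" dataset that contains only the common variables. The function name `fuse`, the linear model, and the synthetic data are hypothetical choices made for illustration. The sketch also simplifies multiple imputation: it redraws only the residual noise for each imputed dataset and does not redraw the regression coefficients, which a full multiple imputation procedure would do.

```python
import numpy as np


def fuse(donor_x, donor_y, recipient_x, n_imputations=5, seed=0):
    """Impute the donor-only variable for every recipient record.

    Returns a list of `n_imputations` arrays, one imputed version of the
    missing variable per array. Simplified: coefficients are held fixed
    and only the residual noise is redrawn for each imputation.
    """
    rng = np.random.default_rng(seed)

    # Design matrices (intercept + common variables) for both datasets.
    Xd = np.column_stack([np.ones(len(donor_x)), donor_x])
    Xr = np.column_stack([np.ones(len(recipient_x)), recipient_x])

    # Fit a linear model on the donor, where the target is observed.
    beta, *_ = np.linalg.lstsq(Xd, donor_y, rcond=None)
    resid = donor_y - Xd @ beta
    sigma = np.sqrt(resid @ resid / (len(donor_y) - Xd.shape[1]))

    # Predict for the recipient and add fresh noise for each imputation.
    return [
        Xr @ beta + rng.normal(0.0, sigma, size=len(Xr))
        for _ in range(n_imputations)
    ]


# Synthetic stand-ins for two surveys that share two common variables.
data_rng = np.random.default_rng(1)
donor_x = data_rng.normal(size=(200, 2))
donor_y = 1.0 + donor_x @ np.array([0.5, -0.3]) + data_rng.normal(scale=0.5, size=200)
recipient_x = data_rng.normal(size=(150, 2))

imputations = fuse(donor_x, donor_y, recipient_x)
```

Each element of `imputations` is one completed version of the missing variable for the recipient dataset. An analysis would be run on each version and the results pooled, so that the uncertainty introduced by the imputation is reflected in the final estimates.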
Original language: English
Pages (from-to): 3305-3326
Number of pages: 22
Journal: Quality and Quantity
Volume: 59
Issue number: 4
Publication status: Published - Aug 2025

Fields of science

  • 509026 Digitalisation research
  • 504007 Empirical social research
  • 101018 Statistics
  • 101029 Mathematical statistics
  • 101024 Probability theory
  • 504006 Demography
  • 504004 Population statistics
  • 102035 Data science

JKU Focus areas

  • Digital Transformation
