Abstract
Primary data collection has become increasingly challenging for several reasons. The cost of conducting face-to-face interviews has increased and the number of available interviewers has decreased. In addition, the ongoing shift to web-based interviewing has resulted in shorter questionnaires, making it difficult to accurately measure latent constructs and cover a wide range of topics. Therefore, although primary data collection can be tailored to one's research questions, secondary data analysis is often more feasible. For this purpose, data archives such as the Consortium of European Social Science Data Archives (CESSDA) provide a large amount of high-quality data. However, a common problem when working with secondary data is that important variables are missing in one dataset and are only available in another. We propose "data fusion" as a possible solution: it augments one dataset with the variables that are initially available only in another dataset. From a statistical point of view, this corresponds to a missing value problem, which is why multiple imputation is often used to fuse datasets. Although the idea is promising, data fusion is only sporadically applied in the social sciences. This paper discusses the potential of this statistical technique in the context of social science research and derives a guide for practitioners interested in applying the method to their own research. The method and its potential are illustrated with an example using data from the European Social Survey (ESS) and the Austrian Social Survey (ASS).
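The idea of treating data fusion as a missing value problem can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the procedure used in the paper: the function name `fuse_by_multiple_imputation`, the simulated data, and the choice of a normal linear regression model are all hypothetical. A "donor" dataset contains the common variables `x` and the target variable `y`, a "recipient" dataset contains only `x`, and `y` is imputed `m` times for the recipient.

```python
import numpy as np


def fuse_by_multiple_imputation(x_donor, y_donor, x_recipient, m=5, seed=0):
    """Return an (m, n_recipient) array of imputed y values.

    Hypothetical helper: imputes y for the recipient from a normal linear
    regression of y on x that is fitted on the donor data.
    """
    rng = np.random.default_rng(seed)
    # Design matrices with an intercept column.
    A = np.column_stack([np.ones(len(x_donor)), x_donor])
    B = np.column_stack([np.ones(len(x_recipient)), x_recipient])
    n, p = A.shape

    # Least-squares fit on the donor data.
    beta_hat, *_ = np.linalg.lstsq(A, y_donor, rcond=None)
    resid = y_donor - A @ beta_hat
    sigma2_hat = resid @ resid / (n - p)
    cov_unscaled = np.linalg.inv(A.T @ A)

    imputations = []
    for _ in range(m):
        # Draw the parameters anew for each imputation so that the
        # imputed values reflect estimation uncertainty.
        sigma2 = sigma2_hat * (n - p) / rng.chisquare(n - p)
        beta = rng.multivariate_normal(beta_hat, sigma2 * cov_unscaled)
        # Add residual noise so that imputations are not just predictions.
        noise = rng.normal(0.0, np.sqrt(sigma2), size=len(B))
        imputations.append(B @ beta + noise)
    return np.stack(imputations)


# Simulated data (assumption: not ESS or ASS data).
sim = np.random.default_rng(1)
x_donor = sim.normal(size=(200, 2))
y_donor = 1.0 + x_donor @ np.array([0.5, -0.3]) + sim.normal(scale=0.5, size=200)
x_recipient = sim.normal(size=(150, 2))

imputed = fuse_by_multiple_imputation(x_donor, y_donor, x_recipient, m=5)
# Pool a simple estimate (the mean of y) across the m imputed datasets.
pooled_mean = imputed.mean(axis=1).mean()
```

Each row of `imputed` is one completed version of the missing variable. An analysis would be run on every completed dataset and the results pooled afterwards, here shown only for the mean.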
| Original language | English |
|---|---|
| Pages (from - to) | 3305-3326 |
| Number of pages | 22 |
| Journal | Quality and Quantity |
| Volume | 59 |
| Issue number | 4 |
| DOIs | |
| Publication status | Published - Aug. 2025 |
Fields of science
- 509026 Digitalisation research
- 504007 Empirical social research
- 101018 Statistics
- 101029 Mathematical statistics
- 101024 Probability theory
- 504006 Demography
- 504004 Population statistics
- 102035 Data Science
JKU focus areas
- Digital Transformation
Projects
- 1 Finished
- Digitize! Computational Social Sciences in der digitalen und sozialen Transformation
Forstner, M. (Researcher), Hasengruber, K. (Researcher), Quatember, A. (Researcher), Bacher, J. (Project lead) & Prandner, D. (Project lead)
01.01.2020 → 31.12.2024
Project: Funded research › Federal / state / municipal government