TY - GEN
T1 - GOLDCASE: A Generic Ontology Layer for Data Catalog Semantics
AU - Schrott, Johannes
AU - Weidinger, Sabine
AU - Tiefengrabner, Martin
AU - Lettner, Christian
AU - Wöß, Wolfram
AU - Ehrlinger, Lisa
PY - 2023
Y1 - 2023
AB - Data catalogs automatically collect metadata from distributed data sources and provide a unified and easily accessible view of the data. Many existing data catalog tools focus on the automatic collection of technical metadata (e.g., from a data dictionary) into a central repository. The functionality these tools offer for annotating data with semantics (i.e., its meaning) is often not expressive enough to model complex real-world scenarios. In this paper, we propose a generic ontology layer (GOLDCASE), which maps the semantics of data, in the form of a highly expressive data model, to the technical metadata provided by a data catalog. This yields the following advantages: 1) Users have access to an understandable description of the data objects, their relationships, and their semantics in the domain-specific data model. 2) GOLDCASE maps this knowledge directly to the metadata provided by data catalog tools and thus enables their reuse. 3) The ontology layer is machine-readable, which greatly improves automatic evaluation and data exchange. This is accompanied by improved FAIRness of the overall system. We implemented the approach at PIERER Innovation GmbH on top of an Informatica Enterprise Data Catalog to show and evaluate its applicability.
UR - https://www.scopus.com/pages/publications/85171545262
U2 - 10.1007/978-3-031-39141-5_3
DO - 10.1007/978-3-031-39141-5_3
M3 - Conference proceedings
SN - 978-3-031-39140-8
VL - 1789
T3 - Communications in Computer and Information Science
SP - 26
EP - 38
BT - Metadata and Semantic Research - 16th Research Conference, MTSR 2022, Revised Selected Papers
A2 - Garoufallou, Emmanouel
A2 - Vlachidis, Andreas
PB - Springer
CY - Cham
ER -