GOLDCASE: A Generic Ontology Layer for Data Catalog Semantics

  • Johannes Schrott
  • Sabine Weidinger
  • Martin Tiefengrabner
  • Christian Lettner
  • Wolfram Wöß
  • Lisa Ehrlinger

Research output: Chapter in Book/Report/Conference proceeding › Conference proceedings › peer-review

Abstract

Data catalogs automatically collect metadata from distributed data sources and provide a unified and easily accessible view of the data. Many existing data catalog tools focus on the automatic collection of technical metadata (e.g., from a data dictionary) into a central repository. The functionality these tools offer for annotating data with semantics (i.e., its meaning) is often not expressive enough to model complex real-world scenarios. In this paper, we propose a generic ontology layer (GOLDCASE), which maps the semantics of data, in the form of a highly expressive data model, to the technical metadata provided by a data catalog. We thereby achieve the following advantages: (1) Users have access to an understandable description of the data objects, their relationships, and their semantics in the domain-specific data model. (2) GOLDCASE maps this knowledge directly to the metadata provided by data catalog tools and thus enables their reuse. (3) The ontology layer is machine-readable, which greatly improves automatic evaluation and data exchange. These advantages are accompanied by improved FAIRness of the overall system. We implemented the approach at PIERER Innovation GmbH on top of an Informatica Enterprise Data Catalog to show and evaluate its applicability.
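
The mapping idea described in the abstract can be pictured with a minimal sketch. This is not the GOLDCASE implementation: the triple vocabulary (`ex:hasAttribute`, `ex:mapsTo`), the `CatalogEntry` fields, and all identifiers are hypothetical and chosen only for illustration. It assumes an ontology layer stored as subject-predicate-object triples in which a domain concept points to the catalog entries that hold its technical metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Technical metadata as a data catalog might expose it (hypothetical shape)."""
    entry_id: str
    source: str
    column: str
    data_type: str


# Hypothetical ontology layer: domain semantics as subject-predicate-object triples.
TRIPLES = {
    ("ex:Customer", "ex:hasAttribute", "ex:BirthDate"),
    ("ex:BirthDate", "ex:mapsTo", "catalog:42"),
}

# Hypothetical technical metadata collected by the catalog tool.
CATALOG = {
    "catalog:42": CatalogEntry("catalog:42", "crm.customers", "birth_dt", "DATE"),
}


def objects(subject: str, predicate: str) -> list[str]:
    """Return all objects of triples matching the given subject and predicate."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def resolve(concept: str) -> list[CatalogEntry]:
    """Follow concept -> attribute -> catalog entry and collect the entries found."""
    entries = []
    for attribute in objects(concept, "ex:hasAttribute"):
        for entry_id in objects(attribute, "ex:mapsTo"):
            if entry_id in CATALOG:
                entries.append(CATALOG[entry_id])
    return entries
```

Calling `resolve("ex:Customer")` in this sketch returns the catalog entry for the `birth_dt` column, so a user can start from a domain concept and reach the technical metadata without knowing the source schema. A real ontology layer would more likely use an RDF/OWL toolkit than plain tuples.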
Original language: English
Title of host publication: Metadata and Semantic Research - 16th Research Conference, MTSR 2022, Revised Selected Papers
Editors: Emmanouel Garoufallou, Andreas Vlachidis
Place of Publication: Cham
Publisher: Springer
Pages: 26-38
Number of pages: 13
Volume: 1789
ISBN (Print): 978-3-031-39140-8
DOIs
Publication status: Published - 2023

Publication series

Name: Communications in Computer and Information Science
Volume: 1789 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Fields of science

  • 102010 Database systems
  • 102015 Information systems
  • 102025 Distributed systems
  • 102028 Knowledge engineering
  • 102033 Data mining
  • 509018 Knowledge management

JKU Focus areas

  • Digital Transformation
