Provenance in Data Science: From Data Models to Context-Aware Knowledge Graphs

English | 2021 | ISBN: 978-3030676803 | 121 Pages | PDF, EPUB | 10 MB

RDF-based knowledge graphs require additional formalisms to be fully context-aware, and this book presents those formalisms. It also provides a collection of provenance techniques and state-of-the-art metadata-enhanced, provenance-aware, knowledge graph-based representations across multiple application domains, demonstrating how to combine graph-based data models with provenance representations. This combination is important for making statements authoritative, verifiable, and reproducible, particularly in biomedical, pharmaceutical, and cybersecurity applications, where the data source and generator can be just as important as the data itself.

Capturing provenance is critical to ensure sound experimental results and rigorously designed research studies for patient and drug safety, pathology reports, and medical evidence generation. Similarly, provenance is needed for cyberthreat intelligence dashboards and attack maps that aggregate and/or fuse heterogeneous data from disparate sources to differentiate between unimportant online events and dangerous cyberattacks, as this book demonstrates. Without provenance, data reliability and trustworthiness might be limited, which causes problems with data reuse, trust, reproducibility, and accountability.

This book primarily targets researchers who use knowledge graphs in their methods and approaches, including researchers from domains such as cybersecurity, eHealth, data science, and the Semantic Web. It collects core facts about the state of the art in provenance approaches and techniques, complemented by a critical review of existing approaches. It also outlines new research directions that combine data science and knowledge graphs, an increasingly important research topic.