Authentic Data - A published dataset, available for use/consumption by 3rd parties, is built on data and data governance that are designed from "first principles" to produce a re-usable dataset.
Presentation and discussion on "filling in the details"
Authentic Data - Simple in Principle
Presentation
Slides - Authentic Data - Simple in Principle - Nov 1 DRMWG
Summary
- Authentic Data is data that has been cryptographically signed by an "authority" (role) using its private key, so that users of the data can verify the signature using the authority's public key.
- Publishable/sharable data is created via a Data Lifecycle: the data is initially captured/collected, then checked for input and consistency errors, cleaned of outliers and duplicates, and checked for overall correctness. Each of those stages needs to be persisted and linked to the dataset that is published for 3rd party use, as part of the data provenance (trust) chain.
- The Data needs to be designed with respect to structure, metadata, and "fitness for purpose". Governance needs to be designed to ensure accuracy, consistency and correctness
- Governance drives requirements for error, consistency and accuracy as an active part of the data lifecycle. Data Governance is the strategy; Data Stewardship is oversight of the data lifecycle.
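The idea of persisting each lifecycle stage and linking it to the published dataset can be sketched as a simple hash-linked chain. This is only an illustration, not something shown in the presentation: the names `StageRecord`, `make_record` and `verify_chain`, the stage labels, and the sample values are all hypothetical, and the authority's signature is deliberately left out (in practice the final record identifier could be the value the authority signs).

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any, List, Optional


def digest(obj: Any) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass(frozen=True)
class StageRecord:
    """Hypothetical persisted snapshot of one lifecycle stage."""
    stage: str               # e.g. "captured", "checked", "cleaned"
    data: list               # the data as it stood at this stage
    parent: Optional[str]    # record_id of the previous stage, None for the first
    record_id: str           # digest over stage, data and parent


def make_record(stage: str, data: list, parent: Optional[str] = None) -> StageRecord:
    """Build a record whose identifier commits to its content and its parent."""
    record_id = digest({"stage": stage, "data": data, "parent": parent})
    return StageRecord(stage, data, parent, record_id)


def verify_chain(chain: List[StageRecord]) -> bool:
    """Check that every record is intact and points at the record before it."""
    expected_parent = None
    for rec in chain:
        if rec.parent != expected_parent:
            return False  # a stage is missing or out of order
        recomputed = digest({"stage": rec.stage, "data": rec.data, "parent": rec.parent})
        if rec.record_id != recomputed:
            return False  # the content was altered after the record was made
        expected_parent = rec.record_id
    return True


# Illustrative values only: capture, check, then remove a duplicate and an outlier.
captured = make_record("captured", [3, 3, 5, 400, 4])
checked = make_record("checked", [3, 3, 5, 400, 4], parent=captured.record_id)
cleaned = make_record("cleaned", [3, 5, 4], parent=checked.record_id)
chain = [captured, checked, cleaned]

print(verify_chain(chain))
```

Because each identifier covers the parent identifier, changing or dropping any earlier stage invalidates every later link, which is what lets a 3rd party trace the published dataset back to its captured form.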
Discussion at:
- 14 mins: Kevin: Need a definition of "authentic"
- 22 mins: Burak: Going to need data transformation
- Neil: Definitely on the roadmap for future discussions
- James: Ontologies are heading to a hub model of data
- Neil: "super schemas" are out there
- James offered to present his perspective
- Burak offered to present his perspective
- James suggested open debate
- 34 mins: Carly: from a researcher's point of view, the data lifecycle is not a linear process
- 38 mins: Kevin: Root of trust is the authentic part of ACDC
- 42 mins: Burak: Let's separate secure data & semantic layers