2021-04-06 Standard Data Models and Elements Drafting Group Meeting Notes

Time	Item	Who
2 min	Welcome & Antitrust Policy Notice	Rebecca Distler & Brian Plew
5 min	Introducing Paul Knowles (new co-chair)	Paul Knowles
5 min	Data Schema	Brian Plew
10 min	Methods of Evaluating Data Models	Brian Plew
5 min	30 / 90 / 180 Day Framework	Brian Plew
30 min	CCI Data Schema Overview	Paul Knowles

Overview of CCI Schema (Paul Knowles) - Google Drive Folders

Topic: Good Health Pass - Standard Data Models and Elements
Start Time: Apr 6, 2021 12:00 PM

1. Welcome and Linux Foundation antitrust policy

2. Data Schema

Q: Do we really need to go beyond the minimum? What would we do if we were trying to describe more than minimum? What would it be used for? Do we invest extra effort into it?
- Huge amounts of data captured by legacy systems (e.g., physical address); this information isn't necessarily required as long as you have certain active identifiers you can authenticate
  - Cross-reference with CDC to have some mapping; EU has largely disregarded legacy attributes
- With eHealth group, come up with 3 specs for minimum viable capture (testing, vax, recovery)
- Same data capture used across jurisdictions
Q: Do optional attributes become the “superset”?
- Optional = not imposed on all jurisdictions
- E.g., CVX code used in North America, but not required in Europe (done with market authorization holder)
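
The required vs. optional distinction above can be sketched as a small schema check. This is only an illustration: the attribute names and the `Attribute` type are hypothetical and were not agreed in the meeting. Required attributes stand for the minimum dataset; optional ones belong to the superset and are not imposed on every jurisdiction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    required: bool  # True = part of the minimum dataset


# Hypothetical vaccination superset; names are illustrative only.
VACCINATION_SUPERSET = [
    Attribute("vaccine_product", required=True),
    Attribute("date_of_vaccination", required=True),
    Attribute("market_authorization_holder", required=False),
    Attribute("cvx_code", required=False),  # used in North America, optional elsewhere
]


def missing_required(record: dict, schema: list) -> list:
    """Return names of required attributes that are absent or empty in the record."""
    return [a.name for a in schema if a.required and record.get(a.name) in (None, "")]


def unknown_attributes(record: dict, schema: list) -> list:
    """Return record keys that are not defined anywhere in the superset."""
    known = {a.name for a in schema}
    return sorted(set(record) - known)


# A record that carries an optional CVX code but lacks a required date.
record = {"vaccine_product": "ExampleVax", "cvx_code": "000"}
gaps = missing_required(record, VACCINATION_SUPERSET)
extras = unknown_attributes(record, VACCINATION_SUPERSET)
```

Under this sketch a jurisdiction can omit `cvx_code` without failing validation, while a missing `date_of_vaccination` is reported as a gap.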

3. Methods of Evaluating Data Models

Q: If it’s a small number of jurisdictions, does it become optional? Or is it based on the element itself?
- If it’s a totally new attribute for a jurisdiction, put as optional in superset
Q: Are we doing a data dictionary (schema, layout)?

We can't define or dictate the schemas used by each country (each may use any ontology or data model)
But once data enters the capture space, we can rebuild the semantics so it can deal with multiple languages, then decide on the data model to use on the data exchange side
Immediate reality is that people won’t agree on a single schema, so we have to identify a data dictionary as a compromise position for interoperating between schemas

Discussion on using FHIR exchange (use data capture structure to come up with all the human readable labels)
Discussion on using JSON-LD (FHIR might be too healthcare focused)
- FHIR not most consumable structure for end users to understand data - and majority of consumers are not going to be healthcare entities
- Serialization shouldn’t matter at architecture level; need to fit in payload off of a W3C cred
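
To illustrate the point about the payload sitting inside a W3C credential, here is a minimal sketch of an unsigned credential in the W3C Verifiable Credentials Data Model 1.0 shape. Assumptions: the second context URL, the `HealthPassCredential` type, the issuer identifier, and the payload fields are all hypothetical placeholders, and no proof or signature is included.

```python
import json

# Hypothetical context URL for the health-pass terms; not from the meeting notes.
HEALTH_PASS_CONTEXT = "https://example.org/contexts/health-pass/v1"


def wrap_in_credential(payload: dict, issuer: str, issuance_date: str) -> dict:
    """Place a payload in the credentialSubject of an unsigned credential."""
    return {
        "@context": [
            "https://www.w3.org/2018/credentials/v1",  # VC Data Model 1.0 context
            HEALTH_PASS_CONTEXT,
        ],
        "type": ["VerifiableCredential", "HealthPassCredential"],
        "issuer": issuer,
        "issuanceDate": issuance_date,
        "credentialSubject": payload,
    }


# Illustrative payload; the field names are assumptions, not an agreed schema.
payload = {"testType": "PCR", "testResult": "negative"}
credential = wrap_in_credential(payload, "did:example:issuer", "2021-04-06T12:00:00Z")
serialized = json.dumps(credential, indent=2)
```

The payload dictionary is independent of the wrapper, which mirrors the note that serialization should not matter at the architecture level.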

Discussion on the differences between data management, data collection, and data exchange. When collecting data from multiple sources, rebuild it so it can be exchanged in machine-readable form (an architecture can be defined without defining attribute names)
Should explore a few options and weigh pros and cons against them

4. 30 / 90 / 180 Day Framework

Objective: Large orgs can make an announcement at the 30-day point on "intent to implement"; smaller orgs may be able to move quicker
- Provide guidance to enable development of credentials and passes by 90 days
- Implementation / rollout = credentials and passes first; explore end-to-end interoperability with full health records later on
  - Still helpful to have an understanding of the "health record" first (e.g., superset) and then work backwards into credentials and passes
Delineate difference between lab results vs. medical interpretation vs. passes

5. CCI Data Schema Overview

The EU's is the first document defining a recovery minimum dataset - super helpful because we can use it as a template to start with and compare it with CDC and others
EU also has vaccination and testing documents
- Lists things like COVID tests authorized across all of Europe (capturing all of this in a global spec), including some more granularity into testing devices
  - Goes into minimum data capture for testing process
- Cross referenced with CDC (vax reporting); for example, has not included address in spec

Include formats / labels / some pre-defined entries for fields (might change for jurisdictions); some text to help users understand fields
Note unique numeric ID (e.g., iRespond use case); there may be a few use cases using biometrics. Rather than leave it out, the suggestion is that we stay aware there is space for a biometric ID if we need it

Recovery = 3 fields (cryptographically link recovery to test record; link to positive test sample)
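
One possible way to cryptographically link a recovery record to a test record is to store a digest of the test record inside the recovery record. This is a sketch under stated assumptions, not the CCI schema's actual mechanism: the field names, the canonical-JSON step, and the choice of SHA-256 are all illustrative.

```python
import hashlib
import json


def record_digest(record: dict) -> str:
    """SHA-256 over a canonical JSON form (sorted keys, no extra whitespace)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_recovery_record(test_record: dict, recovery_date: str) -> dict:
    """Build a recovery record that points at the positive test via its digest."""
    return {
        "recovery_date": recovery_date,                    # hypothetical field name
        "linked_test_digest": record_digest(test_record),  # hypothetical field name
    }


def is_linked(recovery_record: dict, test_record: dict) -> bool:
    """Check that the recovery record refers to exactly this test record."""
    return recovery_record["linked_test_digest"] == record_digest(test_record)


# Illustrative data only.
positive_test = {"sample_id": "S-001", "result": "positive", "date": "2021-03-01"}
recovery = make_recovery_record(positive_test, "2021-03-20")
tampered_test = dict(positive_test, result="negative")
```

With this approach any change to the test record changes its digest, so `is_linked` fails for a tampered or unrelated test.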

Can push into an OCA form; generates JSON-LD form
Questions & discussion
- Could use as a stand-in for SNOMED; difficult to tell countries what to do about encoding data (or common set of codes)
  - International health standards org competes with WHO standard (?)
  - Different standards from FDA
- Medical codes vs. ontology
- A lot of sensitive data should not be held in the credential; should be held in a token (a report kept separately from a credential and certificate)
Yellow Fever solution example (Paul Knowles)
- All the pre-defined entries change based on language selected; when you want to store this info to aggregate it - might want to store it in one language but capture it in another
- Multiple layers of certificates, with translations just one of them
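
The idea of capturing a pre-defined entry in one language and storing it in another can be sketched with a language-neutral code and one label layer per language. This is a simplified illustration only: the dictionary layout, the `YF` code, and the function names are hypothetical and do not reproduce the OCA format.

```python
# One label layer per language, keyed by attribute and then by a neutral code.
ENTRY_LAYERS = {
    "en": {"disease": {"YF": "Yellow fever"}},
    "es": {"disease": {"YF": "Fiebre amarilla"}},
}


def label_to_code(attribute: str, label: str, lang: str) -> str:
    """Resolve a label captured in a given language to its neutral code."""
    for code, text in ENTRY_LAYERS[lang][attribute].items():
        if text == label:
            return code
    raise KeyError(f"No entry {label!r} for {attribute!r} in language {lang!r}")


def code_to_label(attribute: str, code: str, lang: str) -> str:
    """Render a neutral code using the label layer of the requested language."""
    return ENTRY_LAYERS[lang][attribute][code]


def translate_entry(attribute: str, label: str, source_lang: str, target_lang: str) -> str:
    """Capture in one language, store or display in another."""
    code = label_to_code(attribute, label, source_lang)
    return code_to_label(attribute, code, target_lang)


stored = translate_entry("disease", "Fiebre amarilla", "es", "en")
```

Because translations live in their own layer, adding a language means adding one more entry to `ENTRY_LAYERS` without touching stored data.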
Need to think about what model to use on the data exchange side
Note the high degree of convergence across all datasets

6. Wrap up

Action Items

Need to discuss how to operationalize CCI Schema mapping into GHPC recommendations (including use of template + annex)
Further discussion on FHIR vs. JSON-LD data exchange model
Move working group meetings to Mondays
