Data Standards

Data standards provide a common language for representing information, enabling different systems to understand and process data without ambiguity.

API designers MUST consider the use of appropriate data standards to ensure consistency and ease of integration.

Core Principles

  • APIs MUST use consistent data formats and standards across all endpoints.
  • APIs MUST validate all incoming data against defined schemas.
  • APIs MUST follow data protection and privacy requirements for sensitive data.
  • APIs MUST document any deviations from standard formats.
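The schema validation principle above can be illustrated with a minimal sketch. It uses only the Python standard library; the schema, the field names (`reportId`, `specimenDate`, `ageYears`) and the `validate_payload` helper are hypothetical and exist only for illustration. In practice a JSON Schema validator driven by the API's OpenAPI definition would be a more likely choice.

```python
from typing import Any

# Hypothetical schema: field name -> (expected type, required?)
CASE_REPORT_SCHEMA: dict[str, tuple[type, bool]] = {
    "reportId": (str, True),
    "specimenDate": (str, True),
    "ageYears": (int, False),
}


def validate_payload(
    payload: dict[str, Any], schema: dict[str, tuple[type, bool]]
) -> list[str]:
    """Return a list of validation errors; an empty list means the payload is valid."""
    errors: list[str] = []
    for field, (expected_type, required) in schema.items():
        if field not in payload:
            if required:
                errors.append(f"missing required field: {field}")
            continue
        value = payload[field]
        # bool is a subclass of int in Python, so reject it explicitly for int fields
        is_stray_bool = isinstance(value, bool) and expected_type is not bool
        if is_stray_bool or not isinstance(value, expected_type):
            errors.append(f"wrong type for {field}: expected {expected_type.__name__}")
    # Reject fields that the schema does not define
    for field in payload:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors
```

Returning a list of errors, rather than failing on the first one, lets an API report every problem with a request in a single response.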

Data Models vs. Data Representations

It is important to differentiate between data models and data representations:

  • Data Model defines the structure, relationships, and constraints of data within a specific domain. It is a conceptual blueprint that outlines how data elements relate to each other and the rules governing their use.

  • Data Representation is the concrete format in which data is serialised for exchange or storage. For RESTful APIs, this is commonly JSON.
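The distinction can be sketched in code. In this hypothetical example the `Specimen` dataclass stands in for a (very small) data model with one constraint, and `to_json` produces one possible JSON representation of it. The class, its fields and the helper are assumptions made for illustration, not part of any UKHSA model.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Specimen:
    """Data model: structure and constraints, independent of any wire format."""

    specimen_id: str
    collected_on: str  # ISO 8601 date, e.g. "2024-03-01"

    def __post_init__(self) -> None:
        # A constraint that belongs to the model, not to the representation
        if not self.specimen_id:
            raise ValueError("specimen_id must not be empty")


def to_json(specimen: Specimen) -> str:
    """Data representation: serialise the model as JSON for exchange."""
    return json.dumps(asdict(specimen), sort_keys=True)
```

The same model could equally be serialised to another format; only `to_json` would change, while the constraint in `Specimen` would stay the same.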

Industry Standards

APIs SHOULD adopt a domain-specific UKHSA data model or an existing industry standard where appropriate, while still using JSON as their principal data representation.

When defining new APIs or uplifting existing ones, look for industry and open standards that have already been adopted within UKHSA or by related organisations and industries. Examples include FHIR for health data, which is used by NHS England, and OMOP for data analysis.

FHIR Implementations

If implementing the FHIR standard:

  • APIs MUST use FHIR UK Core profiles where they exist.
  • APIs MUST document any extensions to standard FHIR resources.
  • APIs SHOULD implement FHIR REST API patterns as described in the FHIR specification.
  • APIs MAY create custom FHIR profiles when UK Core profiles don't meet your needs.
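As a rough sketch of the first rule, a FHIR resource can declare the profile it claims to conform to in `meta.profile`. The profile URL below is assumed to be the canonical URL of the UK Core Patient profile and should be checked against the published UK Core package. The helper names are hypothetical.

```python
from typing import Any

# Assumption: canonical URL of the UK Core Patient profile; verify before use
UK_CORE_PATIENT = "https://fhir.hl7.org.uk/StructureDefinition/UKCore-Patient"


def build_patient(patient_id: str, family: str, given: str) -> dict[str, Any]:
    """Build a minimal FHIR Patient resource that declares a UK Core profile."""
    return {
        "resourceType": "Patient",
        "id": patient_id,
        "meta": {"profile": [UK_CORE_PATIENT]},
        "name": [{"family": family, "given": [given]}],
    }


def declares_profile(resource: dict[str, Any], profile_url: str) -> bool:
    """Check whether a resource claims conformance to the given profile."""
    return profile_url in resource.get("meta", {}).get("profile", [])
```

Declaring a profile is only a claim. Actual conformance still needs to be checked with a FHIR validator against the profile definition.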

OMOP Implementations

If implementing the OMOP Common Data Model:

  • APIs MUST use standardised clinical tables as defined in the OMOP CDM specification.
  • APIs MUST map source terminologies/vocabularies to OMOP standard concepts.
  • APIs SHOULD implement OMOP data quality assessment procedures.
  • APIs MAY create ETL processes to synchronise data between OMOP and other standards, such as FHIR, when both are needed.
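The mapping rule above can be sketched as a lookup from a source code to a standard concept. The column names follow the OMOP CDM `condition_occurrence` table, and a concept ID of 0 is the OMOP convention for "no matching concept". The lookup table and the concept ID `1000001` are placeholders invented for illustration; a real ETL would usually resolve mappings from the OMOP vocabulary tables instead.

```python
from dataclasses import dataclass

NO_MATCHING_CONCEPT = 0  # OMOP convention for source values that cannot be mapped

# Hypothetical lookup: (source vocabulary, source code) -> standard concept_id.
# The concept_id here is a placeholder, NOT a real OMOP concept.
SOURCE_TO_STANDARD: dict[tuple[str, str], int] = {
    ("ICD10", "I10"): 1000001,
}


@dataclass(frozen=True)
class ConditionOccurrence:
    """A small subset of the columns in the OMOP condition_occurrence table."""

    person_id: int
    condition_concept_id: int
    condition_source_value: str


def map_condition(person_id: int, vocabulary: str, code: str) -> ConditionOccurrence:
    """Map a source code to a standard concept, keeping the original source value."""
    concept_id = SOURCE_TO_STANDARD.get((vocabulary, code), NO_MATCHING_CONCEPT)
    return ConditionOccurrence(
        person_id=person_id,
        condition_concept_id=concept_id,
        condition_source_value=code,
    )
```

Keeping the source value alongside the mapped concept means unmapped records can be found and corrected later, rather than being silently dropped.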

Terminology Standards

Terminologies (or controlled vocabularies) play a crucial role in ensuring that data has a consistent and unambiguous meaning.

Using common terminologies is essential for data quality, consistency, and interoperability.

Terminology is not the same as FHIR. FHIR provides the structure and format for exchanging data, while terminology defines the meaning of the data elements within that structure.

APIs MUST adopt standardised terminologies (e.g., SNOMED CT, ICD-10, dm+d) whenever applicable.

APIs SHOULD specify the required terminologies for each data element within their OpenAPI definition, taking into account regional differences.

Terminology Implementations

  • APIs SHOULD use SNOMED CT for clinical terms.
  • APIs SHOULD use ICD-10 for medical diagnoses.
  • APIs SHOULD use dm+d for medicines and devices in England.
  • APIs SHOULD document any regional terminology variations for Scotland, Wales, and Northern Ireland.
  • APIs SHOULD provide terminology mappings when exchanging data across regions.
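One way to enforce the terminology expected for each data element is to pair every coded value with the URI of its code system and check that pair on input. The sketch below assumes the system URIs commonly used in FHIR for SNOMED CT, ICD-10 and dm+d. The element names and the `uses_required_terminology` helper are hypothetical, and the codes in the usage example are illustrative only.

```python
# Assumption: code system URIs as commonly used in FHIR
SNOMED_CT = "http://snomed.info/sct"
ICD_10 = "http://hl7.org/fhir/sid/icd-10"
DMD = "https://dmd.nhs.uk"

# Hypothetical data elements mapped to the terminology each one requires
REQUIRED_SYSTEM: dict[str, str] = {
    "clinicalFinding": SNOMED_CT,
    "diagnosis": ICD_10,
    "medication": DMD,
}


def uses_required_terminology(element: str, coding: dict[str, str]) -> bool:
    """Return True if the coding has a code and uses the system required for the element."""
    required = REQUIRED_SYSTEM.get(element)
    if required is None:
        return False  # unknown element: nothing to validate against
    return coding.get("system") == required and bool(coding.get("code"))
```

For example, `uses_required_terminology("diagnosis", {"system": ICD_10, "code": "I10"})` passes, while the same element coded with the SNOMED CT system is rejected. This check only confirms the code system; whether the code itself exists would need a terminology server or a local code list.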

Additional Considerations

Compliance

If there are regulatory or industry compliance requirements that mandate the use of specific data standards, these MUST be adhered to.

Interoperability

When APIs are designed to exchange data with external systems, especially within a specific industry or domain, a recognised data standard SHOULD be adopted. This ensures that both the API and the consuming systems can understand the data exchanged.

Over-Engineering

Data standards MUST NOT be applied blindly to every API. If an API's scope is extremely narrow, if it is not intended for data exchange, and if there are no compelling reasons for standardisation, then a custom model and representation may be more appropriate.

Performance Degradation

If adopting a data standard would introduce significant overhead in terms of processing or data size, and if interoperability is not a critical requirement, a standard MUST NOT be forced into the design.

Fit for Purpose

If the data standard doesn't have the necessary types or fields to correctly describe the data, it MUST NOT be forced into the design.

Internal APIs (Limited Scope)

In cases where APIs are purely internal and their data is not intended for broader exchange, the use of data standards MAY be considered if it would improve the consistency between internal services.

Government Data Standards

As per the GDS guidance, you SHOULD design your APIs to follow appropriate government data standards in the Data Standards Catalog and External Standards Catalog.

Other Relevant Standards

  • JSON (RFC 8259) is a lightweight, text-based, language-independent data interchange format.
  • GeoJSON (RFC 7946) is a geospatial data interchange format based on JavaScript Object Notation (JSON).
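As a brief illustration of GeoJSON, the sketch below builds a `Feature` with a `Point` geometry. RFC 7946 specifies that positions are ordered longitude first, then latitude, which is a common source of errors. The `point_feature` helper and its range checks are assumptions made for this example.

```python
from typing import Any


def point_feature(
    longitude: float, latitude: float, properties: dict[str, Any]
) -> dict[str, Any]:
    """Build a GeoJSON Feature with a Point geometry (longitude first, then latitude)."""
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [longitude, latitude]},
        "properties": properties,
    }
```

The range checks catch some swapped coordinates, for example a longitude passed as latitude when its absolute value exceeds 90, but they cannot detect every swap.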

See Common Data Types for additional standards.