Entity Specifications

This directory contains entity specifications organized by domain model level and version. Each level represents a variant of the domain model with its own versioning.

Directory Structure

entities/
├── level2/                # Extended domain model
│   ├── type-registry.yaml # Type registry for this variant
│   ├── concept-1.0.md     # Entity specs with version in filename
│   └── ...                # Other entity specs
└── level2-ehds/           # EHDS-specific extension (future)

Current Levels

level1

The base domain model for describing Datasets, DatasetCollections, and Organizations.

level2

The extended domain model containing core statistical metadata entities. Handles variables and populations in addition to the level1 entities.

level2-ehds (planned)

EHDS (European Health Data Space) specific extensions.

URI Mapping Formula

specs/entities/{level}/{entity}-{version}.md -> https://schemas.rutdev.se/xsd/entities/{level}/{Entity}-{version}.xsd

The version is extracted from the filename suffix (e.g., concept-1.0.md → version 1.0).
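
The mapping formula can be sketched in Python as follows. This is a minimal illustration, not the project's actual generator: the names `SPEC_PATH`, `to_pascal_case`, and `spec_path_to_uri` are hypothetical, and the handling of multi-word file names (hyphen- or underscore-separated) is an assumption that the formula itself does not state.

```python
import re

# Base URI taken from the mapping formula above.
BASE_URI = "https://schemas.rutdev.se/xsd/entities"

# Matches specs/entities/{level}/{entity}-{version}.md
SPEC_PATH = re.compile(
    r"^specs/entities/(?P<level>[^/]+)/(?P<entity>[^/]+)-(?P<version>\d+(?:\.\d+)*)\.md$"
)


def to_pascal_case(name):
    """Convert a file-name stem such as 'concept' to 'Concept'.

    Assumption: multi-word stems are separated by '-' or '_'.
    """
    parts = [part for part in re.split(r"[-_]", name) if part]
    return "".join(part[:1].upper() + part[1:] for part in parts)


def spec_path_to_uri(path):
    """Map an entity spec path to the URI of its generated XSD."""
    match = SPEC_PATH.match(path)
    if match is None:
        raise ValueError(f"not an entity spec path: {path}")
    entity = to_pascal_case(match["entity"])
    return f"{BASE_URI}/{match['level']}/{entity}-{match['version']}.xsd"


print(spec_path_to_uri("specs/entities/level2/concept-1.0.md"))
# https://schemas.rutdev.se/xsd/entities/level2/Concept-1.0.xsd
```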

Entity Specification Format

Each entity file contains a YAML properties block that defines the entity's complexType for XSD generation.

File Structure

---
entity: EntityName              # PascalCase entity name (required)
---

# EntityName

## Properties

```yaml
properties:
  - name: propertyName
    type: TypeName                # PascalCase, no prefix
    required: true|false
    description: "..."
```

Properties Block Format

Each property maps to an `xsd:element` within the entity's `xsd:complexType`.

```yaml
properties:
  - name: idAtOrigin              # Property name (camelCase)
    type: IdentifierToken         # Type name (PascalCase, no prefix)
    required: true                # Required = minOccurs 1 (XSD default)
    description: "Identifier at the source system. Used during import to determine if the entity exists (update) or is new (create)."

  - name: name
    type: ShortString
    required: true
    description: "The name of the entity"

  - name: description
    type: LongString
    required: false               # Optional = minOccurs 0
    description: "Detailed description"

  - name: validityPeriod
    type: DateRangeOpenEnd        # Custom composite type
    required: false
    description: "Period during which the entity is valid"

  - name: tags
    type: ShortString
    required: false
    maxOccurs: unbounded          # List/array property
    description: "Tags attached to the entity"
```

Property Fields

| Field | Required | Description |
|-------|----------|-------------|
| name | Yes | Property name in camelCase |
| type | Yes | Type name in PascalCase (prefix resolved automatically) |
| required | Yes | true for mandatory, false for optional |
| description | Yes | Human-readable explanation |
| maxOccurs | No | Maximum occurrences (default: 1, use unbounded for lists) |
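
A check of these field rules might look like the sketch below. It assumes the properties block has already been parsed into plain dictionaries; the function name `validate_property` and the error messages are hypothetical and not part of the specification.

```python
# Fields that the Property Fields table marks as required.
REQUIRED_FIELDS = ("name", "type", "required", "description")


def validate_property(prop):
    """Return a list of problems found in one property entry (empty if valid)."""
    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in prop]

    # 'required' must be a real boolean, not a string such as "yes".
    if "required" in prop and not isinstance(prop["required"], bool):
        problems.append("required must be true or false")

    # maxOccurs defaults to 1; otherwise a positive integer or 'unbounded'.
    max_occurs = prop.get("maxOccurs", 1)
    is_count = (
        isinstance(max_occurs, int)
        and not isinstance(max_occurs, bool)
        and max_occurs >= 1
    )
    if max_occurs != "unbounded" and not is_count:
        problems.append("maxOccurs must be a positive integer or 'unbounded'")

    return problems


print(validate_property({"name": "tags", "type": "ShortString", "required": False}))
# ['missing field: description']
```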

Available Types

Types are defined in specs/types/ and discovered automatically:

| Source | Types |
|--------|-------|
| Built-in XSD | string, dateTime, date, boolean, integer, positiveInteger, long, decimal, anyURI, language, token |
| types/strings/ | IdentifierToken, ShortString, LongString, MultilingualText, MultilingualShortString, MultilingualLongString |
| types/dates/ | DateRange, DateRangeOpenEnd |
| types/coordinates/ | Coordinates |
| types/enums/ | DatasetCollection, PersonalDataStatus, StatisticalDesignType |

Note: MultilingualText properties automatically get maxOccurs="unbounded" in XSD to allow multiple language versions.

Type Registry

Each entity directory contains a type-registry.yaml that maps prefixes to type sources:

registry:
  xsd:
    namespace: "http://www.w3.org/2001/XMLSchema"

  strings:
    path: "types/strings"
    version: "1.0"

  dates:
    path: "types/dates"
    version: "1.0"

  • namespace entries: External/built-in types (used directly)
  • path entries: Local types with an explicit version field (namespace derived from path and version)

All entities in the same directory use the same type versions.
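
The two kinds of registry entry can be told apart as in the sketch below. The derivation rule for path entries (`{base}/{path}-{version}.xsd`) is an assumption inferred from the namespace shown in the "Example: Entity to XSD" section; `TYPES_BASE_URI` and `namespace_for` are hypothetical names.

```python
# Assumption: base URI inferred from the generated XSD example in this document.
TYPES_BASE_URI = "https://schemas.rutdev.se/xsd"

# In-memory equivalent of the type-registry.yaml shown above.
REGISTRY = {
    "xsd": {"namespace": "http://www.w3.org/2001/XMLSchema"},
    "strings": {"path": "types/strings", "version": "1.0"},
    "dates": {"path": "types/dates", "version": "1.0"},
}


def namespace_for(prefix, registry=REGISTRY):
    """Return the XML namespace for a registry prefix."""
    entry = registry[prefix]
    if "namespace" in entry:
        # External/built-in types: the namespace is used directly.
        return entry["namespace"]
    # Local types: derive the namespace from path and version.
    return f"{TYPES_BASE_URI}/{entry['path']}-{entry['version']}.xsd"


print(namespace_for("xsd"))      # http://www.w3.org/2001/XMLSchema
print(namespace_for("strings"))  # https://schemas.rutdev.se/xsd/types/strings-1.0.xsd
```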

XSD Generation

Property to Element Mapping

| Property field | XSD attribute | Transformation |
|----------------|---------------|----------------|
| name | @name | camelCase → PascalCase |
| type | @type | Resolve prefix via registry, output prefix:TypeName |
| required: true | @minOccurs | Omitted (default is 1) |
| required: false | @minOccurs="0" | Explicit |
| maxOccurs | @maxOccurs | Used as-is if specified |
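
As an illustration of this mapping, the sketch below renders one property as an `xsd:element` string. It assumes the type has already been resolved to `prefix:TypeName`; `property_to_element` is a hypothetical helper, not the generator's real API.

```python
from xml.sax.saxutils import quoteattr


def property_to_element(prop, resolved_type):
    """Render one property dict as an xsd:element, following the mapping table."""
    name = prop["name"]
    attrs = {
        "name": name[:1].upper() + name[1:],  # camelCase -> PascalCase
        "type": resolved_type,                # already in prefix:TypeName form
    }
    if not prop.get("required", True):
        attrs["minOccurs"] = "0"              # emitted only for optional properties
    if "maxOccurs" in prop:
        attrs["maxOccurs"] = str(prop["maxOccurs"])  # used as-is
    rendered = " ".join(f"{key}={quoteattr(value)}" for key, value in attrs.items())
    return f"<xsd:element {rendered} />"


print(property_to_element(
    {"name": "description", "type": "LongString", "required": False},
    "strings:LongString",
))
# <xsd:element name="Description" type="strings:LongString" minOccurs="0" />
```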

Type Resolution

  1. Parse property type: IdentifierToken
  2. Search type specs in registry paths for matching name
  3. Found in types/strings/strings-1.0.md → prefix is strings
  4. Derive namespace from path and version
  5. Output type="strings:IdentifierToken"
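
The resolution steps above can be sketched as a simple lookup. Here the type specs are replaced by a hypothetical in-memory index (`TYPE_INDEX`) instead of being read from the registry paths, and mapping built-in names straight to the `xsd` prefix is an assumption based on the note that namespace entries are used directly.

```python
# Hypothetical index: prefix -> type names found in that registry path.
TYPE_INDEX = {
    "strings": {"IdentifierToken", "ShortString", "LongString"},
    "dates": {"DateRange", "DateRangeOpenEnd"},
}

# Built-in XSD types listed under Available Types.
XSD_BUILTINS = {
    "string", "dateTime", "date", "boolean", "integer", "positiveInteger",
    "long", "decimal", "anyURI", "language", "token",
}


def resolve_type(type_name):
    """Resolve an unprefixed type name to prefix:TypeName."""
    if type_name in XSD_BUILTINS:
        return f"xsd:{type_name}"  # assumption: built-ins map to the xsd prefix
    for prefix, names in TYPE_INDEX.items():
        if type_name in names:
            return f"{prefix}:{type_name}"
    raise LookupError(f"unknown type: {type_name}")


print(resolve_type("IdentifierToken"))  # strings:IdentifierToken
print(resolve_type("boolean"))          # xsd:boolean
```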

Example: Entity to XSD

Entity specification:

properties:
  - name: idAtOrigin
    type: IdentifierToken
    required: true
    description: "Identifier at the source system"

  - name: name
    type: ShortString
    required: true
    description: "The name"

  - name: description
    type: LongString
    required: false
    description: "Detailed description"

Generated XSD:

<xsd:schema
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:strings="https://schemas.rutdev.se/xsd/types/strings-1.0.xsd"
    targetNamespace="https://schemas.rutdev.se/xsd/entities/level2/Concept-1.0.xsd"
    elementFormDefault="qualified"
>
    <xsd:import namespace="https://schemas.rutdev.se/xsd/types/strings-1.0.xsd"
                schemaLocation="../../types/strings-1.0.xsd" />

    <xsd:complexType name="Concept">
        <xsd:sequence>
            <xsd:element name="IdAtOrigin" type="strings:IdentifierToken" />
            <xsd:element name="Name" type="strings:ShortString" />
            <xsd:element name="Description" type="strings:LongString" minOccurs="0" />
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>

Relationships to XSD Mapping

Entity relationships (defined in Mermaid diagrams) map to XSD relation elements:

Entity Specification (Mermaid):

erDiagram
    ConceptualVariable }|--|| Concept : concept
    ConceptualVariable }|--|| UnitType : unit_type

Generated XSD RelationsType:

<xsd:complexType name="ConceptualVariableRelationsType">
    <xsd:sequence>
        <xsd:element name="concept" type="strings:EntityReference" minOccurs="0"/>
        <xsd:element name="unitType" type="strings:EntityReference" minOccurs="0"/>
    </xsd:sequence>
</xsd:complexType>

Key points:

  • Each relationship becomes an element in the {EntityName}RelationsType
  • All relationship elements use the shared strings:EntityReference type
  • The relationship name is converted from snake_case to camelCase
  • Relationships are expressed as idAtOrigin values of target entities
  • The EntityReference type allows zero or more ref elements for flexibility
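
A rough sketch of the relationship mapping follows. It reads Mermaid relationship lines with a simple regular expression and emits a RelationsType matching the example above. Two points are assumptions: the entity on the left of each line owns the relationship, and every element gets `minOccurs="0"` as in the example. `RELATION_LINE`, `snake_to_camel`, and `relations_type` are hypothetical names.

```python
import re

# Matches lines such as: ConceptualVariable }|--|| Concept : concept
RELATION_LINE = re.compile(
    r"^\s*(?P<source>\w+)\s+\S+\s+(?P<target>\w+)\s*:\s*(?P<label>\w+)\s*$"
)


def snake_to_camel(name):
    """Convert 'unit_type' to 'unitType'."""
    head, *tail = name.split("_")
    return head + "".join(part.capitalize() for part in tail)


def relations_type(entity, diagram):
    """Build the {EntityName}RelationsType complexType for one entity."""
    lines = [f'<xsd:complexType name="{entity}RelationsType">', "    <xsd:sequence>"]
    for line in diagram.splitlines():
        match = RELATION_LINE.match(line)
        if match and match["source"] == entity:
            element_name = snake_to_camel(match["label"])
            lines.append(
                f'        <xsd:element name="{element_name}" '
                'type="strings:EntityReference" minOccurs="0"/>'
            )
    lines += ["    </xsd:sequence>", "</xsd:complexType>"]
    return "\n".join(lines)


DIAGRAM = """erDiagram
    ConceptualVariable }|--|| Concept : concept
    ConceptualVariable }|--|| UnitType : unit_type
"""
print(relations_type("ConceptualVariable", DIAGRAM))
```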

See specs/_generation/xml-schema.md for complete XSD generation guidelines.

Generation Rules

  1. Type resolution: Look up each property type in registry paths; resolve to prefix:TypeName
  2. Namespace imports: Collect unique prefixes; generate xsd:import for each non-xsd prefix
  3. Element order: Elements appear in the order defined in the properties block
  4. Name transformation: camelCase property names become PascalCase XML element names
  5. Required handling: Only emit minOccurs="0" for optional properties
  6. List properties: Properties with maxOccurs: unbounded generate repeating elements
  7. Relationships: Each relationship maps to an element using strings:EntityReference type
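
Rule 2 (namespace imports) could look roughly like the sketch below: collect the unique non-xsd prefixes in first-seen order and emit one `xsd:import` per prefix. The namespace and the relative `schemaLocation` are derived here from path and version in the way the generated XSD example suggests, which is an assumption; `LOCAL_TYPES` and `imports_for` are hypothetical names.

```python
# Assumption: base URI and relative location inferred from the generated XSD example.
TYPES_BASE_URI = "https://schemas.rutdev.se/xsd"

# Path entries from the type registry (the xsd namespace entry needs no import).
LOCAL_TYPES = {
    "strings": {"path": "types/strings", "version": "1.0"},
    "dates": {"path": "types/dates", "version": "1.0"},
}


def imports_for(resolved_types):
    """Return one xsd:import line per unique non-xsd prefix, in first-seen order."""
    prefixes = []
    for resolved in resolved_types:
        prefix = resolved.split(":", 1)[0]
        if prefix != "xsd" and prefix not in prefixes:
            prefixes.append(prefix)

    lines = []
    for prefix in prefixes:
        entry = LOCAL_TYPES[prefix]
        file_name = f"{entry['path']}-{entry['version']}.xsd"
        lines.append(
            f'<xsd:import namespace="{TYPES_BASE_URI}/{file_name}" '
            f'schemaLocation="../../{file_name}" />'
        )
    return lines


for line in imports_for(["strings:IdentifierToken", "strings:ShortString", "xsd:boolean"]):
    print(line)
```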