
Add more standards-based conventions to schema #29

@bollwyvl


thanks for starting this work ❤️!

elevator pitch

Add #/{$id,$schema,title,description} to the top-level schema object, and a top-level #/properties/$schema to be validated in instance documents.

motivation

The current guidance, a tool-specific comment annotation that points at a "hot" file in a repo in order to get a reasonable developer experience, seems like a backslide into the Bad Old Days of # [meta] selectors.

Using a standards-based representation of this metadata, tools would not have to parse magic comments to determine what schema should be used to validate a given instance.

While $schema only strictly has to be an identifier, making it a locator would mean that it could resolve directly, such that it would work with yaml-language-server out of the box, and other tools would not require a comment-aware parser to use the data programmatically.
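To make the difference concrete, here is a minimal sketch of the two lookups a tool might perform. The function names are hypothetical, the example recipe content is made up, and the comment form is the `# yaml-language-server: $schema=...` modeline; only the standard library is used.

```python
from __future__ import annotations

import re

# Modeline comment understood by yaml-language-server.
MODELINE = re.compile(r"^#\s*yaml-language-server:\s*\$schema=(\S+)", re.MULTILINE)

SCHEMA_URI = "https://prefix-dev.github.io/recipe-format/v0.0.0/schema/recipe/schema.json"


def schema_from_comment(raw_text: str) -> str | None:
    """Comment-based lookup: needs the raw text, because parsers drop comments."""
    match = MODELINE.search(raw_text)
    return match.group(1) if match else None


def schema_from_instance(instance: dict) -> str | None:
    """Standards-based lookup: works on any already-parsed document."""
    value = instance.get("$schema")
    return value if isinstance(value, str) else None


# Hypothetical recipe, once as raw text and once as parsed data.
raw = f"# yaml-language-server: $schema={SCHEMA_URI}\npackage:\n  name: example\n"
parsed = {"$schema": SCHEMA_URI, "package": {"name": "example"}}
```

The second function never sees the original text, which is the point: any YAML or JSON parser output is enough.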

design ideas

"$id": "https://prefix-dev.github.io/recipe-format/v0.0.0/schema/recipe/schema.json",
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "conda recipe",
"description": "A declarative description of the build and test processes of a conda package: see https://prefix-dev.github.io/recipe-format/v0.0.0/schema/recipe/schema.html",
"properties": {
  "$schema": {
    "type": "string",
    "title": "Schema",
    "description": "An identifier for the schema used to validate this recipe",
    "format": "uri-reference",
    "default": "https://prefix-dev.github.io/recipe-format/v0.0.0/schema/recipe/schema.json"
  }
}

Or, in pydantic-ese, which generally makes a muck of human-readable schema:

from pydantic import BaseModel, Field

VERSION = "0.0.0"  # or arrived at by other means, such as `pyproject.toml`
URL_BASE = f"https://prefix-dev.github.io/recipe-format/v{VERSION}"  # f-string, so VERSION is interpolated
SCHEMA_URI = f"{URL_BASE}/schema/recipe/schema.json"  # leave space for related documents, if needed
SCHEMA_HTML = f"{URL_BASE}/schema/recipe/schema.html"  # for humans (and SEO bots), a la #21
SCHEMA_DRAFT = "http://json-schema.org/draft-07/schema#"  # or the oldest compatible with pydantic magic used

class Whatever(BaseModel):

    # assuming pydantic v2: model-level config lives in `model_config` (a plain dict is accepted)
    model_config = {
        "json_schema_extra": {
            "$id": SCHEMA_URI,
            "$schema": SCHEMA_DRAFT,
            "title": "conda recipe",
            "description": f"A declarative description of the build and test processes of a conda package: see {SCHEMA_HTML}",
        }
    }

    schema_: str | None = Field(
        SCHEMA_URI,
        alias="$schema",
        title="Schema",
        description="An identifier for the schema used to validate this recipe",
        # extra keyword arguments to Field are deprecated in pydantic v2
        json_schema_extra={"format": "uri-reference"},
    )

alternatives

  • add a #/properties/schema_version field
    • this sounds good, as it is possible to represent multiple valid versions in the same schema file
      • practically, this is a pretty enormous pain vs being able to look at a humanely-formatted diff of two checked-in .json files (see below)
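As a rough illustration of the "diff two checked-in files" workflow, the sketch below serializes two schema versions with stable formatting and diffs them using only the standard library. The helper name `schema_diff`, the trimmed-down schema contents, and the `v0.0.1` file name are all hypothetical.

```python
from __future__ import annotations

import difflib
import json


def schema_diff(old: dict, new: dict, old_name: str, new_name: str) -> str:
    """Return a unified diff of two schemas serialized with stable formatting."""

    def dump(schema: dict) -> list[str]:
        # Sorted keys and fixed indentation keep the diff free of ordering noise.
        return json.dumps(schema, indent=2, sort_keys=True).splitlines()

    return "\n".join(
        difflib.unified_diff(dump(old), dump(new), old_name, new_name, lineterm="")
    )


# Two made-up, heavily trimmed schema versions.
old_schema = {"title": "conda recipe", "properties": {}}
new_schema = {
    "title": "conda recipe",
    "properties": {"$schema": {"type": "string"}},
}

diff = schema_diff(old_schema, new_schema, "v0.0.0/schema.json", "v0.0.1/schema.json")
print(diff)
```

With one file per version, the same comparison is available from any ordinary diff tool, with no need to untangle several versions expressed inside a single schema.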

follow-on

  • adding $schema to #/required (e.g. by removing the | None) would further strengthen this critical piece of metadata
    • adding required fields without a prior period in which they are optional is fairly drastic
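One way such a transition period could look, sketched with the standard library only: a missing `$schema` produces a warning while the key is optional, and an error once a tool opts in to the stricter behavior. The function name `check_schema_key` and the `required` flag are hypothetical, not part of any existing tool.

```python
from __future__ import annotations

import warnings


def check_schema_key(instance: dict, *, required: bool = False) -> bool:
    """Return True if "$schema" is present; warn or raise when it is missing."""
    if isinstance(instance.get("$schema"), str):
        return True
    if required:
        # Strict mode: what adding "$schema" to #/required would amount to.
        raise ValueError('recipe is missing the required "$schema" key')
    # Transition mode: keep accepting the recipe, but tell the author.
    warnings.warn(
        'recipe has no "$schema" key; it may become required later',
        FutureWarning,
        stacklevel=2,
    )
    return False
```

Flipping the default of `required` at a later release would then be the only change needed to complete the migration.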

references
