sourcemeta/awesome-jsonschema

Awesome JSON Schema

A curated list of awesome JSON Schema resources, tutorials, tools, and more.

JSON Schema is a JSON-based format for annotating and validating JSON documents, backed by a vibrant community. It is defined by a set of IETF specifications and is the industry standard for defining the structure and meaning of JSON documents.
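To make "validating JSON documents" concrete, here is a minimal sketch of what a validator does, written in plain Python. It is not a compliant implementation: it assumes `type` is a single string and only understands `type`, `required`, `properties`, and `minimum`. The `SCHEMA` document and the `validate` and `matches_type` helpers are hypothetical names made up for this illustration. For real work, use one of the implementations listed under Libraries.

```python
# Hypothetical schema for illustration: an object with a required string
# "name" and an optional non-negative integer "age".
SCHEMA = {
    "type": "object",
    "required": ["name"],
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer", "minimum": 0},
    },
}


def is_number(value):
    # bool is a subclass of int in Python, but JSON booleans are not numbers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def matches_type(value, expected):
    """Check a Python value (as produced by json.loads) against a JSON type name."""
    if expected == "object":
        return isinstance(value, dict)
    if expected == "array":
        return isinstance(value, list)
    if expected == "string":
        return isinstance(value, str)
    if expected == "boolean":
        return isinstance(value, bool)
    if expected == "null":
        return value is None
    if expected == "number":
        return is_number(value)
    if expected == "integer":
        return is_number(value) and float(value).is_integer()
    raise ValueError(f"unsupported type: {expected}")


def validate(instance, schema, path="$"):
    """Return a list of error messages; an empty list means the instance is valid."""
    errors = []
    expected = schema.get("type")
    if expected is not None and not matches_type(instance, expected):
        return [f"{path}: expected {expected}"]
    # "minimum" only constrains numbers; other values pass it trivially.
    if "minimum" in schema and is_number(instance) and instance < schema["minimum"]:
        errors.append(f"{path}: below minimum {schema['minimum']}")
    # "required" and "properties" only constrain objects.
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required property {key!r}")
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(validate(instance[key], subschema, f"{path}.{key}"))
    return errors
```

Calling `validate({"name": "Ada", "age": 36}, SCHEMA)` returns an empty list, while `validate({"age": -1}, SCHEMA)` reports both the missing `name` and the out-of-range `age`.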


Would you like to promote your company or product here? Sponsor us on GitHub


Getting Started

  • Learn JSON Schema - A comprehensive JSON Schema documentation website covering all specification versions.
  • JSON Schema Tour - An interactive tutorial to learn JSON Schema step by step.

Courses

  • Master JSON Schema for OpenAPI - A comprehensive 9+ hour video course teaching advanced JSON Schema techniques for API design, covering dynamic references, unevaluated properties, schema composition, testing, linting, and deployment to registries.

Development Tools

  • AlterSchema - Convert a JSON Schema definition between specification versions.
  • JSON Schema CLI - A comprehensive command-line tool for working with JSON Schema supporting formatting, linting, testing, bundling, and validation across all JSON Schema versions.
  • JSONBuddy - A JSON editor and validator desktop application for Windows.
  • Sourcemeta Studio - A Visual Studio Code extension providing professional JSON Schema tooling with real-time linting, automatic formatting, and metaschema validation.

Books

  • (2024) Unifying Business, Data, and Code: Designing Data Products with JSON Schema - Covers topics such as writing your own JSON Schema vocabularies, understanding JSON Schema annotations, and hosting your own JSON Schema registries. More importantly, our book teaches you a methodology for effective data management.
  • (2021) API by Design - Introduces an approach to measure API complexity by analyzing entropy in JSON Schema definitions.
  • (2017) JSON at Work - A comprehensive overview of the JSON ecosystem, including JSON Schema.
  • (2014) Using JSON Schema - Learn and Apply JSON Schema by Example, with JavaScript (Node.js) and Python Programs.

Registries

  • Apicurio Registry - A runtime server system for storing and managing API designs and schemas including OpenAPI, AsyncAPI, Avro, and JSON Schema with configurable content rules for evolution control.
  • Sourcemeta One - A self-hosted JSON Schema microservice that transforms Git repositories into searchable, discoverable schema catalogs with a web explorer, editor integration, schema health checks, and a rich HTTP API.
  • Sourcemeta Schemas - A public free instance of Sourcemeta One re-offering various open source schema collections.

Articles

Related Specifications

  • Agent2Agent Protocol (A2A) - An open protocol by Google enabling communication and interoperability between agentic applications. A2A uses JSON-RPC 2.0 over HTTP and JSON Schema for defining Agent Cards and message structures.
  • AsyncAPI - AsyncAPI is an open source initiative that seeks to improve the current state of Event-Driven Architectures (EDA). The AsyncAPI specification supports data modeling using JSON Schema.
  • JSON Schema in RDF - This document introduces an RDF vocabulary for JSON Schema definitions. This vocabulary provides a stable namespace IRI for JSON Schema keywords, as well as simple axioms, defined against schema.org's meta-model.
  • Model Context Protocol (MCP) - An open standard by Anthropic that enables AI systems like large language models to integrate with external tools and data sources. MCP uses JSON-RPC 2.0 and JSON Schema for defining server capabilities.
  • OpenAPI - The OpenAPI Specification embeds and extends JSON Schema for defining API requests and responses.
  • RAML - The RAML specification supports modeling API data using JSON Schema.
  • REST API Linked Data Keywords - An Internet-Draft proposing JSON Schema keywords to attach semantic information to OpenAPI and JSON Schema documents, enabling contract-first API design with RDF type information and JSON-LD context.
  • Semantic Definition Format (SDF) - An IETF specification for modeling Internet of Things devices and their interactions through Properties, Actions, and Events. SDF uses JSON to represent definitions and incorporates JSON Schema for data validation.
  • W3C Web of Things - The Web of Things (WoT) seeks to counter the fragmentation of the IoT by using and extending existing, standardized Web technologies. WoT models data using JSON Schema.

Videos

For more video content, check out the official JSON Schema YouTube channel and the JSON Schema Conference website.

Papers

  • (2025) Blaze: Compiling JSON Schema for 10x Faster Validation - This paper introduces Blaze, a JSON Schema validator that compiles complex schemas to an efficient representation in seconds to minutes, adding minimal overhead at build time. Blaze incorporates several unique optimizations to reduce the validation time by an average of approximately 10x compared to existing validators on a variety of datasets. In some cases, Blaze achieves a reduction in validation time of multiple orders of magnitude compared to the next fastest validator. We also demonstrate that several popular validators produce incorrect results in some cases, while Blaze maintains strict adherence to the JSON Schema specification.
  • (2025) Elimination of annotation dependencies in validation for Modern JSON Schema - This paper proves that the elimination of annotation-dependent keywords cannot, in general, avoid an exponential increase of the schema dimension. We provide an algorithm to eliminate these keywords that, despite the theoretical lower bound, behaves quite well in practice, as we verify with an extensive set of experiments.
  • (2025) JSONSchemaBench: A Rigorous Benchmark of Structured Outputs for Language Models - This paper introduces JSONSchemaBench, a benchmark for constrained decoding comprising 10K real-world JSON schemas that encompass a wide range of constraints with varying complexity. We pair the benchmark with the existing official JSON Schema Test Suite and evaluate six state-of-the-art constrained decoding frameworks, including Guidance, Outlines, Llamacpp, XGrammar, OpenAI, and Gemini. Through extensive experiments, we gain insights into the capabilities and limitations of constrained decoding on structured generation with real-world JSON schemas.
  • (2024) Validation of Modern JSON Schema: Formalization and Complexity - In this paper, we give the first formal description of Modern JSON Schema, which we consider a central contribution of the work that we present here. We then prove that its data validation problem is PSPACE-complete. We prove that the origin of the problem lies in dynamic references, and not in annotation-dependent validation. We study the schema and data complexities, showing that the problem is PSPACE-complete with respect to the schema size even with a fixed instance, but is in PTIME when the schema is fixed and only the instance size is allowed to vary. Finally, we run experiments that show that there are families of schemas where the difference in asymptotic complexity between dynamic and static references is extremely visible, even with small schemas.
  • (2023) An Analysis of Defects in Public JSON Schemas - An analysis of common defects found in publicly available schemas, leading to recommended changes to the specification.
  • (2023) Comprehending Semantic Types in JSON Data with Graph Neural Networks - Graph neural networks for semantic type detection in JSON.
  • (2023) JSONoid: Distributed JSON Schema Discovery - A tool for distributed JSON schema discovery including many properties of the data.
  • (2023) JSONoid: Monoid-based Enrichment for Configurable and Scalable Data-Driven Schema Discovery - Meaningful schema information for semi-structured data.
  • (2022) Implicit JSON Schema Versioning Triggered by Temporal Updates to JSON-Based Big Data in the τJSchema Framework - This paper proposes an approach for handling implicit schema changes triggered by temporal updates of JSON-based Big Data. More precisely, when a user specifies a temporal JSON update operation that modifies a snapshot JSON component assigning a valid-time timestamp to its new value, the execution of such an operation requires the JSON component to become temporal, which is for all intents a schema change. Thus, a new version of the τJSchema temporal characteristics document is generated, with the addition of a new valid-time characteristic. New versions of the temporal JSON schema and of the temporal JSON document are also accordingly created.
  • (2022) JSON BinPack: A space-efficient schema-driven and schema-less binary serialization specification based on JSON Schema - A survey and benchmark of JSON-compatible binary serialization specifications followed by the introduction of JSON BinPack, a novel protocol-independent schema-driven and schema-less binary serialization specification that is strictly-compatible with JSON and takes advantage of JSON Schema formal definitions to produce bit-strings that are space-efficient in comparison to every considered alternative serialization specification.
  • (2022) Machine actionable metadata models - This paper discusses the use of JSON Schema to define human- and machine-readable metadata models.
  • (2022) Negation-Closure for JSON Schema - Examines how JSON Schema handles negation, demonstrates that the language lacks negation closure, explores how recent schema drafts address this limitation, and proposes enrichments to the language. Includes an algebraic reformulation of JSON Schema and a prototype system for generating schema witnesses.
  • (2022) The Usage of Negation in Real-World JSON Schema Documents - Many software tools, but also formal frameworks for working with JSON Schema, do not fully support negation. This motivates us to study whether negation is actually used in practice, for which aims, and whether it could, in principle, be replaced by simpler operators. We have collected a large corpus of 80k open source JSON Schema documents. We perform a systematic analysis, quantify usage patterns of negation, and also qualitatively analyze schemas. We show that negation is indeed used, albeit infrequently, following a stable set of patterns.
  • (2022) Validating Streaming JSON Documents with Learned VPAs - This paper presents a new streaming algorithm to validate JSON documents against a set of constraints given as a JSON schema. It proves that there always exists a visibly pushdown automaton (VPA) that accepts the same set of JSON documents as a JSON schema.
  • (2022) Witness Generation for JSON Schema - JSON Schema is an important, evolving standard schema language for families of JSON documents. It is based on a complex combination of structural and Boolean assertions, and features negation and recursion. The static analysis of JSON Schema documents comprises practically relevant problems, including schema satisfiability, inclusion, and equivalence. These three problems can be reduced to witness generation: given a schema, generate an element of the schema, if it exists, and report failure otherwise.
  • (2021) Deriving Semantics-Aware Fuzzers from Web API Schemas - Discusses JSON Schema canonicalization and JSON Schema instance derivation in the context of property-based testing of APIs.
  • (2021) Enhancing JSON Schema Discovery by Uncovering Hidden Data - Enhancing discovered JSON Schemas by disambiguating data and metadata.
  • (2021) Fast Discovery of Nested Dependencies on JSON Data - Efficient dependency mining algorithms for non-relational data.
  • (2021) Not Elimination and Witness Generation for JSON Schema - In this paper, we present an algebraic characterization of JSON Schema, obtained by adding opportune operators, and by mirroring existing ones. We then present algebra-based approaches for dealing with not-elimination and witness generation problems, which play a central role as they lead to solutions for the other mentioned complex problems.
  • (2021) TILT: A GDPR-Aligned Transparency Information Language and Toolkit for Practical Privacy Engineering - We present TILT, a transparency information language and toolkit explicitly designed to represent and process transparency information in line with the requirements of the GDPR and allowing for a more automated and adaptive use of such information than established, legalese data protection policies do.
  • (2020) Challenges in Checking JSON Schema Containment over Evolving Real-World Schemas - This paper presents the results of an empirical study of the first generation of tools for checking JSON Schema containment which is applied to a diverse collection of over 230 real-world schemas and their altogether 1k historic versions.
  • (2020) JSON Schema Inference Approaches - In the context of document NoSQL databases, namely those assuming the JSON data format, this paper focuses on several representatives of the existing inference approaches and provides a thorough comparison of them.
  • (2020) Type Safety with JSON Subschema - Deciding whether one schema is a subschema of another is non-trivial because of the richness of the JSON Schema specification language. Given a pair of schemas, our approach first canonicalizes and simplifies both schemas, then decides the subschema question on the canonical forms, dispatching simpler subschema queries to type-specific checkers.
  • (2019) What Are Real JSON Schemas Like? - A first empirical analysis of a curated collection of real-world JSON Schemas. Knowing what real JSON Schemas are like (to borrow from a title of a related study on DTDs) helps practitioners and researchers in making realistic assumptions when building tools for JSON Schema processing.
  • (2018) An Approach for Schema Extraction of JSON and Extended JSON Document Collections - This paper presents an approach that extracts a schema from a JSON or Extended JSON document collection stored in a NoSQL document-oriented database or other document repository. Aggregation operations are considered in order to obtain a schema for each distinct structure in the collection, and a hierarchical data structure is proposed to group these schemas in order to generate a global schema in JSON Schema format.
  • (2018) Top-Down Model-Driven Engineering of Web Services from Extended OpenAPI Models - Shows how OpenAPI can be extended to add implementation details inside models. These extensions link services to assemblies of components that describe computations. The result is a top-down development process that keeps model and implementation aligned.
  • (2017) Definition of REST web services with JSON schema - The aim of this article is to demonstrate how JSON Schema, and particularly the JSON Hyper Schema extension, is suitable to describe JSON-based web services that follow the REST architectural pattern.
  • (2017) Example-Driven Web API Specification Discovery - In this paper we present an example-driven discovery process that generates model-based OpenAPI specifications for REST Web APIs by using API call examples. A tool implementing our approach and a community-driven repository for the discovered APIs are also presented.
  • (2017) Schema Inference for Massive JSON Datasets - Recent years have seen the widespread use of JSON as a data format to represent massive data collections. JSON data collections are usually schemaless. While this ensures several advantages, the absence of schema information has important negative consequences: the correctness of complex queries and programs cannot be statically checked, users cannot rely on schema information to quickly figure out structural properties that could speed up the formulation of correct queries, and many schema-based optimizations are not possible. In this paper we deal with the problem of inferring a schema from massive JSON data sets.
  • (2016) Foundations of JSON Schema - In this paper we provide the first formal definition of syntax and semantics for JSON Schema and use it to show that implementing this layer on top of JSON is feasible in practice.
  • (2016) τJSchema: A Framework for Managing Temporal JSON-Based NoSQL Databases - This paper proposes a framework called Temporal JSON Schema (τJSchema), inspired by the τXSchema framework defined for XML data. τJSchema allows defining a temporal JSON schema from a conventional JSON schema and a set of temporal logical and physical characteristics. Our framework guarantees logical and physical data independence for temporal schemas and provides a low-impact solution since it requires neither modifications of existing JSON documents, nor extensions to the JSON format, the JSON Schema language, and all related tools and languages.
  • (2015) Schema extraction and structural outlier detection for JSON-based nosql data stores - Rather than designing the schema up front, extracting a schema in hindsight can be seen as a reverse-engineering step. Based on the extracted schema information, we propose a set of similarity measures that capture the degree of heterogeneity of JSON data and which reveal structural outliers in the data.
  • (2014) Jsongen: a quickcheck based library for testing JSON web services - This article describes a systematic approach to testing behavioural aspects of Web Services that communicate using the JSON data format. To generate random JSON data for populating tests we have developed a new library, jsongen, which given a characterisation of the JSON data as a JSON schema, (i) automatically derives a QuickCheck generator which can generate an infinite number of JSON values that validate against the schema, and (ii) provides a generic QuickCheck state machine which is capable of following the (hyper)links documented in the JSON schema, to automatically explore the web service.
  • (2012) User profile integration made easy: model-driven extraction and transformation of social network schemas - This paper presents, firstly, a semi-automatic approach to extract schema information from instance data. Secondly, transformations of the derived schemas to different technical spaces are utilized, thereby allowing, amongst other benefits, the application of established integration tools and methods. Finally, as a case study, schemas are derived for Facebook, Google+, and LinkedIn.
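Several of the papers above deal with inferring (or "discovering") a schema from a collection of JSON documents. The sketch below illustrates the general idea only and does not reproduce the algorithm of any listed paper: it infers a schema per document and then merges the results, keeping a property as `required` only if every document has it. The `infer`, `merge`, and `infer_all` names are hypothetical, and the fallback to `anyOf` is deliberately naive.

```python
from functools import reduce


def infer(value):
    """Infer a schema describing exactly one JSON value (as parsed by json.loads)."""
    if isinstance(value, bool):  # must come before int: bool is a subclass of int
        return {"type": "boolean"}
    if value is None:
        return {"type": "null"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, list):
        schema = {"type": "array"}
        if value:
            schema["items"] = reduce(merge, [infer(item) for item in value])
        return schema
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer(item) for key, item in value.items()},
            "required": sorted(value),
        }
    raise TypeError(f"not a JSON value: {value!r}")


def merge(a, b):
    """Combine two inferred schemas into one that accepts both inputs."""
    if a == b:
        return a
    types = (a.get("type"), b.get("type"))
    if types == ("object", "object"):
        properties = dict(a["properties"])
        for key, subschema in b["properties"].items():
            properties[key] = merge(properties[key], subschema) if key in properties else subschema
        # A property stays required only if both sides require it.
        required = sorted(set(a["required"]) & set(b["required"]))
        return {"type": "object", "properties": properties, "required": required}
    if types == ("array", "array"):
        merged = {"type": "array"}
        items = [schema["items"] for schema in (a, b) if "items" in schema]
        if items:
            merged["items"] = reduce(merge, items)
        return merged
    if set(types) == {"integer", "number"}:
        return {"type": "number"}  # every integer is also a number
    # Naive fallback: list the distinct alternatives under anyOf.
    alternatives = []
    for schema in (a, b):
        for alternative in schema.get("anyOf", [schema]):
            if alternative not in alternatives:
                alternatives.append(alternative)
    return {"anyOf": alternatives}


def infer_all(documents):
    """Infer one schema for a non-empty collection of documents."""
    return reduce(merge, [infer(document) for document in documents])
```

For instance, `infer_all([{"id": 1}, {"id": 2.5, "name": "x"}])` yields an object schema in which `id` is a `number` and only `id` is required. Approaches in the literature go much further, for example by scaling to massive datasets or detecting structural outliers.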

Libraries

The JSON Schema website includes an extensive list of implementations and related libraries: https://json-schema.org/implementations.html. Check out the Bowtie project that measures compliance of JSON Schema implementations to help users select properly compliant and maintained libraries.


Special thanks to @kinlane for curating the initial version of this list.
