From XML Schema to Excel – Reliable Documentation at Scale

Structured data exchange between systems still very often relies on XML messages defined by XML Schema (XSD).
While XSD files are ideal for machines, they are hard for humans to read, review, and discuss, especially for business analysts, domain experts, and project stakeholders.

At the same time, Excel (XLSX) remains the de facto standard for documentation, reviews, and specifications.

This leads to a common and risky situation:

  • the XSD evolves during development

  • the Excel documentation lags behind

  • inconsistencies appear — and are often discovered too late

The Idea: One Source of Truth

The core idea behind XSD2XLSX is simple and powerful:

Declare the XSD as the master artifact
and generate all human-readable documentation from it.

Instead of manually maintaining two large, complex files, teams write all documentation directly inside the XSD using the standard annotation elements (xs:annotation with xs:documentation). Excel files are then generated automatically whenever needed.
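To make the idea of documentation living inside the XSD concrete, here is a minimal sketch of reading an xs:documentation text with Python's standard library. The sample schema and the helper name read_documentation are made up for this illustration; they are not taken from XSD2XLSX itself.

```python
import xml.etree.ElementTree as ET

# XML Schema namespace in the "{uri}" form that ElementTree expects.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical sample schema, invented for this illustration.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Order" type="xs:string">
    <xs:annotation>
      <xs:documentation>
        A customer order as received from the web shop.
      </xs:documentation>
    </xs:annotation>
  </xs:element>
</xs:schema>"""


def read_documentation(component):
    """Return the whitespace-normalized xs:documentation text of one component."""
    texts = []
    for doc in component.findall(f"{XS}annotation/{XS}documentation"):
        text = " ".join("".join(doc.itertext()).split())
        if text:
            texts.append(text)
    return " ".join(texts)


schema = ET.fromstring(SAMPLE_XSD)
order = schema.find(f"{XS}element")
print(order.get("name"), "->", read_documentation(order))
```

Because the text sits next to the declaration it describes, a change to the element and a change to its documentation end up in the same file and the same review.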


What XSD2XLSX Does

XSD2XLSX is a Python-based tool that:

  • parses XML Schema definitions (XSD)

  • follows xs:include and xs:import relationships

  • resolves complex types, simple types, and inheritance

  • extracts documentation, types, and cardinalities

  • generates a structured Excel (XLSX) representation

Each element is documented with:

  • element name

  • hierarchy level

  • data type

  • min/max occurrences

  • embedded documentation

This makes even very large schemas reviewable and discussable.
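The row structure listed above can be sketched in a few lines of Python. This is an illustrative simplification, not the actual XSD2XLSX implementation: the sample schema and the function names are hypothetical, types are looked up by their unprefixed name, and xs:extension, element references (ref=), and attributes are ignored.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical sample schema, invented for this illustration.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Order" type="OrderType">
    <xs:annotation><xs:documentation>A customer order.</xs:documentation></xs:annotation>
  </xs:element>
  <xs:complexType name="OrderType">
    <xs:sequence>
      <xs:element name="Id" type="xs:string"/>
      <xs:element name="Item" type="ItemType" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="ItemType">
    <xs:sequence>
      <xs:element name="Sku" type="xs:string"/>
      <xs:element name="Note" type="xs:string" minOccurs="0">
        <xs:annotation><xs:documentation>Free-text remark.</xs:documentation></xs:annotation>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""


def documentation(node):
    """Join all xs:documentation texts of a node, with whitespace normalized."""
    docs = node.findall(f"{XS}annotation/{XS}documentation")
    return " ".join(" ".join("".join(d.itertext()).split()) for d in docs)


def child_elements(node):
    """Yield element declarations directly inside a content model."""
    for child in node:
        if child.tag == f"{XS}element":
            yield child
        elif child.tag in (f"{XS}sequence", f"{XS}choice", f"{XS}all"):
            yield from child_elements(child)


def walk(element, complex_types, level=1, seen=()):
    """Yield (name, level, type, minOccurs, maxOccurs, documentation) rows."""
    type_name = element.get("type", "")
    yield (element.get("name", ""), level, type_name,
           element.get("minOccurs", "1"), element.get("maxOccurs", "1"),
           documentation(element))
    content = element.find(f"{XS}complexType")  # inline type, if any
    if content is None:
        if type_name in seen:
            return  # recursive type: stop instead of looping forever
        content = complex_types.get(type_name)
        seen = seen + (type_name,)
    if content is None:
        return  # simple or built-in type: no child elements
    for child in child_elements(content):
        yield from walk(child, complex_types, level + 1, seen)


schema = ET.fromstring(SAMPLE_XSD)
complex_types = {ct.get("name"): ct for ct in schema.findall(f"{XS}complexType")}
root_element = next(e for e in schema.findall(f"{XS}element") if e.get("name") == "Order")
rows = list(walk(root_element, complex_types))
for row in rows:
    print(row)
```

Each tuple corresponds to one spreadsheet row; the level value is what lets a reviewer see the nesting without reading the XSD.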


What Has Changed Since the First Version (2016 → Today)

The original prototype was created in 2016 as a proof of concept.
The current version is a production-ready evolution, supporting real-world schema landscapes.

Key Improvements

  • ZIP-based schema sets
    Load complete XSD distributions instead of single files.

  • Full include/import resolution
    Handles modular schemas spread across many files.

  • Multi-namespace support
    Correctly resolves prefixed types across documents.

  • External schema awareness
    External HTTP(S) imports are detected and handled gracefully.

  • Modern Web UI
    Upload XSD or ZIP files, define the root element, and download Excel results directly from the browser.

  • Scales to complex government and enterprise schemas
    Tested with large real-world schema sets (e.g. XÖV specifications).
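As a rough sketch of what ZIP loading combined with include/import resolution can look like, the following Python example follows schemaLocation references inside an archive and records external HTTP(S) locations instead of downloading them. The function name, the return shape, and the file names are assumptions made for this illustration; the real tool may work differently.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def resolve_schema_set(archive, entry):
    """Follow xs:include / xs:import starting at `entry` in an open ZipFile.

    Returns (loaded, external, missing): parsed documents keyed by archive
    path, skipped HTTP(S) locations, and local paths not found in the archive.
    """
    loaded, external, missing = {}, [], []
    names = set(archive.namelist())
    queue = [entry]
    while queue:
        path = queue.pop()
        if path in loaded:
            continue  # already parsed; also protects against circular includes
        if path not in names:
            missing.append(path)
            continue
        root = ET.fromstring(archive.read(path))
        loaded[path] = root
        for ref in root.findall(f"{XS}include") + root.findall(f"{XS}import"):
            location = ref.get("schemaLocation")
            if not location:
                continue  # xs:import without a location hint
            if location.startswith(("http://", "https://")):
                external.append(location)  # noted, not fetched
            else:
                base = posixpath.dirname(path)
                queue.append(posixpath.normpath(posixpath.join(base, location)))
    return loaded, external, missing


# Build a tiny in-memory archive with hypothetical file names for the demo.
SCHEMA_OPEN = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("main.xsd", SCHEMA_OPEN
                + '<xs:include schemaLocation="types/common.xsd"/>'
                + '<xs:import namespace="urn:example:ext" '
                + 'schemaLocation="https://example.org/ext.xsd"/>'
                + "</xs:schema>")
    zf.writestr("types/common.xsd", SCHEMA_OPEN
                + '<xs:include schemaLocation="../main.xsd"/>'
                + "</xs:schema>")

with zipfile.ZipFile(buffer) as zf:
    loaded, external, missing = resolve_schema_set(zf, "main.xsd")

print(sorted(loaded), external, missing)
```

Relative locations are resolved against the directory of the referencing file, which is why the circular reference from types/common.xsd back to ../main.xsd is recognized as already loaded.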


Why This Matters in Practice

Using XSD2XLSX enables teams to:

  • keep technical and business views in sync

  • avoid duplicated documentation work

  • detect schema changes early

  • review interfaces without XML expertise

  • establish clear ownership of data structures
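One way to picture how generated documentation helps detect schema changes early: once two schema versions have been flattened into rows, they can be compared mechanically. The sketch below is an assumption about how such a comparison could be done, not a described feature of XSD2XLSX; the element paths and values are invented.

```python
def diff_rows(old_rows, new_rows):
    """Compare two {element path: (type, minOccurs, maxOccurs)} mappings."""
    added = sorted(new_rows.keys() - old_rows.keys())
    removed = sorted(old_rows.keys() - new_rows.keys())
    changed = sorted(path for path in old_rows.keys() & new_rows.keys()
                     if old_rows[path] != new_rows[path])
    return added, removed, changed


# Hypothetical rows extracted from two versions of the same schema.
old = {
    "Order/Id": ("xs:string", "1", "1"),
    "Order/Note": ("xs:string", "0", "1"),
}
new = {
    "Order/Id": ("xs:token", "1", "1"),
    "Order/Item": ("ItemType", "1", "unbounded"),
}

added, removed, changed = diff_rows(old, new)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```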

A common best practice is:

  1. Maintain documentation inside the XSD

  2. Treat the XSD as the single source of truth

  3. Generate Excel (or other formats) only for review and communication

From an XSD maintained this way, XSD2XLSX generates a structured Excel file suitable for reviews and workshops.


Related Tools

If you are interested in schema-based tooling, also have a look at our pptx-parser, which applies a similar philosophy to PowerPoint and other Office files — extracting structured information instead of manually maintaining it.


Try the XSD2XLSX Mapping

XSD2XLSX is ideal if you:

  • work with complex XML interfaces

  • maintain large schema landscapes

  • need reliable, up-to-date documentation

  • want to reduce manual documentation effort

Contact us to set up a customized solution for your needs.

Clear ownership of data structures is good practice: declare the XSD the master, generate the documentation from it, and import the same XSD into your development tools.

 

Andreas Bühlmeier, PhD.

Originally published September 2016

Updated & extended 2025