Moxygen

Doxygen XML to Markdown converter. Two-pass parser, deferred reference resolution, pluggable templates. Because Doxygen's HTML output looks like Windows 3.1.

TypeScript, Doxygen, Markdown, Handlebars

Doxygen's HTML output looks like it was designed for a Windows 3.1 help file. C++ developers who want their API docs to live alongside their code in Git, render in any static site generator, and not make their eyes bleed had no good option. So we built Moxygen.

Moxygen parses Doxygen’s XML output and generates clean, readable Markdown. Drop the output into GitBook, Docusaurus, MkDocs, or any Markdown-based documentation system. Version-controlled, reviewable in PRs, portable to anywhere that renders Markdown.

Two-pass parsing

Doxygen’s XML is full of cross-references. A class member might reference a type in a different namespace; a function might link to a struct in another file. Single-pass parsing breaks on forward references - you can’t resolve a link to something you haven’t seen yet.

  Doxygen XML
       |
  +----+----+
  | Pass 1  |  index.xml --> register all compounds + members
  | Pass 2  |  per-compound XML --> parse details, extract params
  +----+----+
       |
   references map (refid --> Compound | Member)
       |
  +----+----+
  | filter  |  visibility rules (public / protected / static)
  +----+----+
       |
  +----+----+
  | render  |  Handlebars templates (cpp / java)
  +----+----+
       |
  +----+----------------------------+
  |  resolve refs (deferred)        |
  |  anchor style: pandoc or HTML   |
  +----+----------------------------+
       |
  -----+----------------------------
       |          |         |
   single     per-group  per-class
   api.md    api-%s.md   api-%s.md

Pass 1 reads index.xml - a flat list of every compound (class, namespace, struct, page) and its member stubs. Every entity gets registered in a global references map keyed by refid. Members have minimal data at this point - just name, refid, and XML attributes.

Pass 2 iterates through the sorted index again, reading individual XML files like classtransport_1_1Bicycle.xml. This populates descriptions, parameter types, qualifiers, base/derived classes, section definitions, and inner classes. All cross-references resolve correctly because the map was fully populated in Pass 1.
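The two passes can be sketched roughly as below. This is a minimal illustration, not Moxygen's actual code: the Entity, IndexEntry, and Detail shapes and the twoPass function are hypothetical, and the XML parsing is replaced by plain arrays.

```typescript
// Hypothetical shapes; Moxygen's real types are not shown in this document.
interface Entity {
  refid: string;
  name: string;
  kind: string;
  description?: string;
}

// Stand-in for one parsed entry of index.xml.
interface IndexEntry {
  refid: string;
  name: string;
  kind: string;
}

// Stand-in for one parsed per-compound file: a description plus the refids it links to.
interface Detail {
  refid: string;
  description: string;
  links: string[];
}

function twoPass(index: IndexEntry[], details: Detail[]): Map<string, Entity> {
  const references = new Map<string, Entity>();

  // Pass 1: register a stub for every entity, keyed by refid.
  for (const entry of index) {
    references.set(entry.refid, { ...entry });
  }

  // Pass 2: fill in details. Every link target is already registered,
  // so forward references resolve regardless of the order of the files.
  for (const detail of details) {
    const entity = references.get(detail.refid);
    if (!entity) continue;
    entity.description = detail.description;
    for (const target of detail.links) {
      if (!references.has(target)) {
        throw new Error(`unresolved reference: ${target}`);
      }
    }
  }
  return references;
}
```

In this sketch a compound can link to an entity whose detail file has not been read yet, because the stub from the first loop is enough to satisfy the lookup.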

Sort order matters. Entries sort by refid length ascending, so parent namespaces are created before their children. A parent's refid is a strict prefix of its children's, so it is always shorter: namespacefoo is registered before namespacefoo_1_1bar, and nested namespace assignment never fails.
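A small sketch of why the length sort is enough. The helper names parentRefid and attachNamespaces are hypothetical, and the parent lookup assumes the simplified case where every refid in the list is a namespace and nesting is encoded as _1_1, as in the example above.

```typescript
// Hypothetical helper: derive the parent refid by cutting at the last "_1_1"
// (Doxygen's encoding of "::"). Simplified; only meant for namespace refids.
function parentRefid(refid: string): string | undefined {
  const cut = refid.lastIndexOf("_1_1");
  return cut === -1 ? undefined : refid.slice(0, cut);
}

// Builds a parent -> children map. Throws if a child is seen before its parent,
// which cannot happen once the refids are sorted by length.
function attachNamespaces(refids: string[]): Map<string, string[]> {
  const children = new Map<string, string[]>();
  const sorted = [...refids].sort((a, b) => a.length - b.length);
  for (const refid of sorted) {
    children.set(refid, []);
    const parent = parentRefid(refid);
    if (parent !== undefined) {
      const siblings = children.get(parent);
      if (!siblings) {
        throw new Error(`parent ${parent} not registered before ${refid}`);
      }
      siblings.push(refid);
    }
  }
  return children;
}
```

Feeding the refids in reverse order still works, because the sort puts namespacefoo ahead of anything nested inside it.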

References are deferred, not inline. During rendering, cross-references emit as {#ref refid #} placeholders. After the full document renders, resolveRefs() traces each placeholder’s parent chain and calculates the correct path - #refid for same-file, api-ClassName.md#refid for cross-file. The same template renders correctly regardless of output mode.
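The placeholder substitution could look something like the following. This is a simplified sketch: it looks up each refid's output file in a flat map (the fileOf parameter is an assumption) instead of tracing the parent chain as described above, and the function signature is illustrative rather than Moxygen's real one.

```typescript
// Replaces {#ref refid #} placeholders with link targets.
// fileOf maps refid -> output file; it is a hypothetical stand-in for
// walking the entity's parent chain.
function resolveRefs(
  content: string,
  currentFile: string,
  fileOf: Map<string, string>,
): string {
  return content.replace(/\{#ref ([^ ]+) #\}/g, (_match, refid: string) => {
    const target = fileOf.get(refid);
    // Same file (or unknown target): a local fragment is enough.
    if (target === undefined || target === currentFile) {
      return `#${refid}`;
    }
    // Different file: prefix the fragment with the relative file name.
    return `${target}#${refid}`;
  });
}

const fileOf = new Map<string, string>([
  ["classBicycle", "api-Bicycle.md"],
  ["classRider", "api-Rider.md"],
]);

const rendered = "[Bicycle]({#ref classBicycle #}) is used by [Rider]({#ref classRider #})";
const resolved = resolveRefs(rendered, "api-Bicycle.md", fileOf);
```

Because the template only ever emits placeholders, switching between single-file and per-class output changes the contents of the lookup, not the template.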

Filtering and visibility

Not everything Doxygen documents should appear in your output. Moxygen filters by section kind (public-func, protected-attrib, private-static-func) against a configurable allowlist. Filtering is recursive - compounds filter their members, then empty compounds are dropped entirely. No empty sections, no placeholder headings.

For group-based output, members are additionally filtered by groupid - only members belonging to the current @defgroup appear in that group’s file.
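The recursive filter, including the optional group restriction, might be sketched as follows. The Member and Compound shapes and the filterCompound function are hypothetical and reduced to the fields the filter needs.

```typescript
// Hypothetical, reduced shapes for illustration.
interface Member {
  name: string;
  section: string; // e.g. "public-func", "protected-attrib"
  groupid?: string;
}

interface Compound {
  name: string;
  members: Member[];
  compounds: Compound[];
}

// Returns a filtered copy of the compound, or undefined if nothing survives.
function filterCompound(
  compound: Compound,
  allowed: Set<string>,
  groupid?: string,
): Compound | undefined {
  // Keep members whose section is on the allowlist and, for group output,
  // whose groupid matches the group being written.
  const members = compound.members.filter(
    (m) => allowed.has(m.section) && (groupid === undefined || m.groupid === groupid),
  );

  // Recurse into nested compounds and drop the ones that came back empty.
  const compounds = compound.compounds
    .map((c) => filterCompound(c, allowed, groupid))
    .filter((c): c is Compound => c !== undefined);

  if (members.length === 0 && compounds.length === 0) {
    return undefined;
  }
  return { ...compound, members, compounds };
}
```

Returning undefined for an empty compound lets the parent drop it in the same pass, so no empty headings reach the templates.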

Templates

Templates are language-specific Markdown files compiled as Handlebars partials. C++ and Java ship out of the box, and --templates points at your own directory for full customization.

Built-in helpers handle the heavy lifting: signature generates complete function prototypes from structured data (not string concatenation), shortname strips namespace prefixes for cleaner headings, badges marks static/const/virtual qualifiers, sectionLabel maps kinds to human-readable labels. Templates receive typed member data - params[] arrays, templateParams, returnType, boolean qualifiers - so they make formatting decisions without parsing.
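To make the "structured data, not string concatenation" point concrete, here is a rough sketch of what the shortname and signature helpers could do. The FunctionMember shape and both function bodies are assumptions for illustration; the real helpers are registered with Handlebars and may differ in detail.

```typescript
// Hypothetical member shape, loosely following the fields named above.
interface Param {
  type: string;
  name: string;
}

interface FunctionMember {
  name: string; // fully qualified, e.g. "transport::Bicycle::ride"
  returnType: string; // empty for constructors
  params: Param[];
  isStatic?: boolean;
  isVirtual?: boolean;
  isConst?: boolean;
}

// Strips everything up to the last "::".
// Simplified: does not handle "::" inside template arguments.
function shortname(qualified: string): string {
  const cut = qualified.lastIndexOf("::");
  return cut === -1 ? qualified : qualified.slice(cut + 2);
}

// Assembles a prototype from typed fields instead of re-parsing a string.
function signature(m: FunctionMember): string {
  const params = m.params.map((p) => `${p.type} ${p.name}`).join(", ");
  const parts = [
    m.isStatic ? "static" : "",
    m.isVirtual ? "virtual" : "",
    m.returnType,
    `${shortname(m.name)}(${params})`,
  ].filter((part) => part !== "");
  return parts.join(" ") + (m.isConst ? " const" : "");
}

const ride: FunctionMember = {
  name: "transport::Bicycle::ride",
  returnType: "void",
  params: [{ type: "int", name: "speed" }],
  isVirtual: true,
  isConst: true,
};
const prototype = signature(ride); // "virtual void ride(int speed) const"
```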

Output modes

Four modes from the same pipeline:

  • Single file (default) - everything in one api.md, local anchor fragments
  • Per-group (--groups) - each @defgroup becomes a separate file, cross-group refs resolve to relative paths
  • Per-class (--classes) - each class/struct/interface gets its own file with safe filename escaping (Vector<int> becomes Vector(int))
  • Per-page (--pages) - each Doxygen page becomes a separate file

Anchor style is configurable: Pandoc ({#refid}), HTML (<a id="refid"></a>), or none. The output filename template substitutes %s with the name of each group, class, or page: --output api-%s.md.
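The anchor styles and filename handling described above can be sketched in a few lines. The names AnchorStyle, anchor, safeFilename, and outputPath are hypothetical, and the escaping covers only the angle-bracket case from the Vector<int> example; the real rules may handle more characters.

```typescript
type AnchorStyle = "pandoc" | "html" | "none";

// Emits the anchor markup for a refid in the chosen style.
function anchor(refid: string, style: AnchorStyle): string {
  switch (style) {
    case "pandoc":
      return `{#${refid}}`;
    case "html":
      return `<a id="${refid}"></a>`;
    case "none":
      return "";
  }
}

// Replaces characters that are awkward in filenames.
// Only "<" and ">" are handled here, matching the Vector<int> example.
function safeFilename(name: string): string {
  return name.replace(/</g, "(").replace(/>/g, ")");
}

// Substitutes the first %s in the template with the escaped name.
// A replacer function is used so "$" in names is not treated specially.
function outputPath(template: string, name: string): string {
  return template.replace("%s", () => safeFilename(name));
}

const classFile = outputPath("api-%s.md", "Vector<int>"); // "api-Vector(int).md"
```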

C++ edge cases

Moxygen handles nested namespaces (sort-by-length ensures parents exist first), template classes (parameter extraction + filename escaping), overloaded functions (each gets a unique refid), inheritance chains (rendered as extends/subclasses lists), forward references in comments (@ref becomes a resolved Markdown link), and free functions (stored in file compounds, rendered at top level).

Usage

# Add GENERATE_XML=YES to your Doxyfile, run doxygen, then:
npm install moxygen -g

moxygen --anchors /path/to/doxygen/xml                       # Single file
moxygen --groups --output api-%s.md /path/to/doxygen/xml      # Per-group
moxygen --classes --output api-%s.md /path/to/doxygen/xml     # Per-class
moxygen --language java --anchors /path/to/doxygen/xml        # Java

Programmatic API for build system integration:

import { run } from 'moxygen'

await run({
  directory: '/path/to/doxygen/xml',
  output: 'api.md',
  anchors: true,
  language: 'cpp',
})