DR-008-Infra: Generating documentation sources via Bazel#
Date: 2026-05-28
Status: accepted
Context / Problem#
The docs-as-code system builds documentation by reading from a static source_dir (default "docs/") on the workspace filesystem.
Three build paths coexist:
Live preview: Local development via sphinx-autobuild.
Direct Sphinx: Sphinx invoked in the same venv for fast iteration or CI.
Bazel sandbox: needs_json and similar targets run Sphinx in a hermetic sandbox.
We have no generic solution for generating parts of the documentation source directory via Bazel. See docs-as-code issue #423 for an open feature request to implement “Extra docs pages from artifacts”.
Workarounds we already have in place are:
Use . as source directory to place sources anywhere. This implies a careful maintenance of include/exclude patterns in conf.py. It is limited because a rule in the root BUILD file cannot cover files in subdirectories with another BUILD file.
Generate json files for special inputs like source links. This is limiting because we cannot generate whole pages or directories with this approach.
Implicitly read files from bazel folders. TestLinks in documentation appear if a test.xml file exists under bazel-testlogs/.
Reference integration overwrites documentation sources in a workflow. See ‘Publish build summary’ step.
The :docs_combo does compose a sources directory via sphinx-collections. It allows no control over the folder hierarchy and symlinks in the git workspace can be confusing.
We are looking for a solution that is simpler and easier to maintain, so we don’t have to keep adding more and more workarounds for each new use case.
Additionally, we repeatedly have issues with caching. Since we don’t rely on the Bazel sandbox for docs building, Bazel cannot help with hermeticity and determinism. We need incremental builds locally and determinism with caching in CI. See rules_python sphinxdocs for an idea of how to achieve this using Bazel.
The “Module API” proposal (somewhat implemented in tooling PR 95) fully relies on Bazel. It is not compatible with the docs-as-code live preview as of now. Another exploration by Useblocks is available but does not cover live preview either.
We want dashboards generated automatically to be included in the documentation. See infrastructure discussion 2026-05-18.
Cross-repository composed builds via :docs_combo must keep working for integrator repositories.
When composing doc sources from multiple places within the same repository, we need tracking such that errors and warnings point to the original source files. However, there is no need to provide that for generated or transformed files (as is common for JavaScript or CSS assets in web development with source maps).
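The tracking requirement can be pictured as a simple path translation. The following is a minimal sketch, assuming the composed tree is described by a mapping from directory prefixes to their original workspace directories. The function name `to_original`, the `prefix_map` structure, and all paths in the usage example are hypothetical and not part of docs-as-code.

```python
from pathlib import PurePosixPath


def to_original(composed_path: str, prefix_map: dict[str, str]) -> str:
    """Translate a path inside the composed tree to its workspace location.

    `prefix_map` (hypothetical) maps a directory prefix in the composed tree
    to the repository directory it was taken from. The longest matching
    prefix wins. Paths without a match (for example generated files) are
    returned unchanged.
    """
    path = PurePosixPath(composed_path)
    for prefix in sorted(prefix_map, key=len, reverse=True):
        try:
            rest = path.relative_to(prefix)
        except ValueError:
            continue  # this prefix does not contain the path
        return str(PurePosixPath(prefix_map[prefix]) / rest)
    return composed_path
```

For example, with `{"api": "src/lib/docs"}` a warning reported for `api/index.rst` would be shown as `src/lib/docs/index.rst`, while a generated page that matches no prefix keeps its composed path.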
Requirements (all options satisfy these)#
Allow generating parts of the documentation via Bazel (including whole pages or directories).
Live preview with fast edit-preview cycles.
Errors and warnings point to the original source files if they are in the repo.
Combo build can include full sources from multiple repositories.
Goals (optimize these with this decision)#
Flexibility: Minimise the effort for potential future extensions.
Effort: Minimise one-time implementation and ongoing maintenance cost.
Speed: Minimise the build time for documentation builds, especially for live preview.
UX: Minimise the effort needed for documentation work.
Non-Goals#
Replacing Sphinx or Sphinx-Needs as documentation tools.
Keeping the Esbonio language server alive, as we assume nobody is using it.
Options Considered#
Option N: No change (status quo)#
Keep the current architecture.
The docs() macro in docs.bzl accepts a source_dir parameter and reads
documentation sources directly from that directory on the workspace filesystem.
graph LR
docs@{ shape: docs, label: "docs/" }
docs --> :docs
docs --> :live_preview
:live_preview -- watch --> docs
Effort 💚: No implementation work required.
Flexibility 😡: More workarounds instead of generic solution.
Speed 💚: Fast but only covers source updates (not test result updates, for example).
UX 💚: Status quo
While sphinx-autobuild --pre-build is available to trigger some build steps before each rebuild,
this does not work with Bazel:
If you bazel run :live_preview and do a bazel build inside,
that inner build waits for the outer run to finish, which results in a deadlock.
Option B: Introduce :docs_src_dir Bazel target#
In short: run sphinx-autobuild on a source tree that a Bazel target re-materializes continuously via ibazel (bazel-watcher).
We add an extra_docs attribute to the docs() macro
for additional sources specified via sphinx_docs_library,
which allows adapting path prefixes.
The source tree is materialized using hardlinks (ln) inside a Bazel declare_directory action.
Symlinks fail Bazel’s output tree validation (dangling link detection),
while copies are unnecessarily expensive for large doc sets.
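The materialization step can be illustrated outside of Bazel. This is a minimal sketch in plain Python, not the actual declare_directory action (which uses ln). The function name `materialize`, the `sources` mapping, and the copy fallback are assumptions made for illustration.

```python
import os
import shutil
from pathlib import Path


def materialize(sources: dict[str, Path], out_dir: Path) -> list[Path]:
    """Populate out_dir with hardlinks to the given source files.

    `sources` (hypothetical) maps a destination path relative to out_dir,
    already including any adapted prefix, to the original file.
    """
    created = []
    for rel_dest, src in sorted(sources.items()):
        dest = out_dir / rel_dest
        dest.parent.mkdir(parents=True, exist_ok=True)
        if dest.exists():
            dest.unlink()  # replace a stale entry from an earlier run
        try:
            os.link(src, dest)  # hardlink: no data copy, cannot dangle
        except OSError:
            # Assumption: fall back to a copy, e.g. across filesystems.
            shutil.copy2(src, dest)
        created.append(dest)
    return created
```

A hardlink shares the file content with the original, so the tree is cheap to create even for large doc sets, and unlike a symlink it never points at a path that might not exist in the output tree.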
graph LR
docs@{ shape: docs, label: "docs/" }
extradocs@{ shape: docs, label: "extra_srcs" }
preview@{ shape: subproc, label: "live_preview" }
allsrc@{ label: ":docs_src_dir" }
docs --> allsrc --> :docs --> preview
extradocs --> allsrc
preview -- rebuild --> :docs
allsrc --> :needs_json
The live preview is replaced by a custom implementation.
This live preview cannot be executed via bazel run because of the need to rebuild via Bazel internally.
Thus, there is no :live_preview target but a live_preview script.
We cannot rely on watching file system changes to trigger rebuilds because the source directory is composed by Bazel
and may contain generated files.
The live_preview script runs two concurrent processes:
ibazel build :docs_src_dir watches workspace sources and re-materializes the tree on change.
sphinx-autobuild watches the materialized tree and serves HTML with websocket-based browser reload.
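The two-process structure described above might look roughly like the following. This is a sketch, not the generated live_preview script: the function names are hypothetical, and the source and output directories passed to sphinx-autobuild are assumptions that depend on where Bazel places the materialized tree.

```python
import subprocess


def preview_commands(src_dir: str, html_dir: str) -> list[list[str]]:
    """Return the two commands of a hypothetical live_preview script."""
    return [
        # Re-materializes the source tree whenever workspace sources change.
        ["ibazel", "build", ":docs_src_dir"],
        # Watches the materialized tree and serves HTML with browser reload.
        ["sphinx-autobuild", src_dir, html_dir],
    ]


def run_concurrently(commands: list[list[str]]) -> list[int]:
    """Start all commands at once, wait for each, and return the exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        return [proc.wait() for proc in procs]
    finally:
        # On interruption or error, do not leave child processes behind.
        for proc in procs:
            if proc.poll() is None:
                proc.terminate()
```

A script would call `run_concurrently(preview_commands(src, out))` with the directory of the materialized tree as `src`. Because both processes are started outside of bazel run, the Bazel build triggered by ibazel does not have to wait for a running Bazel command.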
The score_sync_toml extension writes a ubproject.toml file to the source directory,
which fails under Bazel sandboxing.
The write is therefore redirected via
--define=needscfg_outpath=<workspace>/docs/ubproject.toml,
which works without modifications to the extension itself.
Effort 💛: Some implementation effort but prototype already works.
Flexibility 💚: Generic solution for all build paths and future extensions.
Speed 💛: Overall latency is comparable to the status quo for edit-preview cycles, but the initial cold start is a little slower due to the extra Bazel invocation.
UX 😡: Requires a two-step setup: bazel run //:ide_support (venv) then
bazel run //:gen_live_preview (script).
The generated script is workspace-specific and should be gitignored.
The general idea is also described in the rules_python documentation:
bazel run //docs:docs.serve # Run in separate terminal
ibazel build //docs:docs # Automatically rebuilds docs
This docs.serve target implemented in rules_python does not have
auto-refresh in the browser though.
Option D: Dual-path — keep :live_preview, add :docs_src_dir#
Keep the existing bazel run :live_preview target unchanged (sphinx-autobuild watching docs/ on the workspace filesystem).
In parallel, introduce a separate hermetic bazel build :docs_src_dir target
that materialises a composed source directory inside the Bazel sandbox before invoking Sphinx.
The obvious risk here is that the two paths do not produce the same output.
Effort 😡: By definition, requires nearly the combined effort of Options N and B.
Flexibility 😡: Still requires all the workarounds of option N.
Speed 💚: No slowdown.
UX 😡: Live-preview UX is unchanged, but risk of downstream breaks.
Option M: Mount external source bundles via sphinx-mounts#
In short: keep source files where they are and mount them into the Sphinx project
using [[mounts]] entries written to ubproject.toml.
This option has been explored as a proof of concept in
docs-as-code PR #554.
The PoC adds a mounts attribute to docs()
and wires it through all relevant paths,
including :docs_combo / :docs_sources and sandboxed builds.
In contrast to option B, the source files are not re-materialized. This uses the external sphinx-mounts extension. The extension modifies Sphinx internal data structures to integrate files outside of the source directory.
TODO:
Apparently :docs_combo composes external :docs_sources trees,
but does not define a clear cross-repository aggregation mechanism for mount metadata declared by upstream repositories.
Therefore, Option M does not satisfy all requirements and is not a valid option at this time.
graph LR
docs@{ shape: docs, label: "docs/" }
bundle@{ shape: docs, label: "src/docs/ or generated dir" }
mounts@{ label: "[[mounts]]" }
docs --> :docs
bundle --> mounts --> :docs
The approach avoids copying/symlinking documentation trees into docs/
and aims to keep IDE tooling aligned by reading the same ubproject.toml as Sphinx.
Known constraints from the PoC:
Each mount currently expects a directory-shaped Bazel output (e.g. via a helper like files_to_dir).
Per-bundle ubproject.toml generation is workspace-only (bazel run), while sandboxed bazel build skips those workspace writes by design.
Nested mounts are not supported.
Effort 💛: Similar order of magnitude as Option B prototype work.
Flexibility 😡: Fails strict cross-repository :docs_combo requirement in current form.
Also, using this immature external dependency is risky.
Speed 💛: Comparable iterative speed; initial setup and mount resolution add some overhead.
UX 😡: Not acceptable while strict cross-repository composition behavior remains unclear/unmet.
Evaluation#
In order of importance, most important first.
| Goals | Option N | Option B | Option D | Option M |
|---|---|---|---|---|
| Flexibility | 😡 | 💚 | 😡 | 😡 |
| Effort | 💚 | 💛 | 😡 | 💛 |
| Speed | 💚 | 💛 | 💚 | 💛 |
| UX | 💚 | 😡 | 😡 | 😡 |
Decision: Option B, because Option N loses with respect to flexibility, the most important goal. Option D has no advantage over B. Option M does not satisfy all requirements at this time.
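One way to read "in order of importance" is a lexicographic comparison: options are compared on Flexibility first, and later goals only break ties. The sketch below applies that reading to the ratings from the evaluation table. The numeric scores and the lexicographic rule are assumptions for illustration; the decision record does not prescribe a formula.

```python
# Assumed numeric scale for the ratings; higher is better.
SCORE = {"💚": 2, "💛": 1, "😡": 0}

# Goals in order of importance, most important first.
GOALS = ["Flexibility", "Effort", "Speed", "UX"]

# Ratings copied from the evaluation table, in the order of GOALS.
RATINGS = {
    "N": ["😡", "💚", "💚", "💚"],
    "B": ["💚", "💛", "💛", "😡"],
    "D": ["😡", "😡", "💚", "😡"],
    "M": ["😡", "💛", "💛", "😡"],
}


def rank(ratings: dict[str, list[str]]) -> list[str]:
    """Order options lexicographically by their goal scores, best first."""
    return sorted(
        ratings,
        key=lambda option: tuple(SCORE[r] for r in ratings[option]),
        reverse=True,
    )


print(rank(RATINGS))  # ['B', 'N', 'M', 'D']
```

Under this reading Option B comes first because it is the only option rated 💚 on Flexibility, even though Option N scores better on every other goal.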
Appendix: any_folder experiment#
For a brief moment, we had an any_folder extension but removed it before the docs-as-code release.
It breaks when using such documentation in :docs_combo:
It relied on configuration in conf.py but with :docs_combo
the modules’ conf.py is ignored and only the root conf.py is used.
More information can be found in PR #450.