Jason Stirnaman 4cb455b1ae feat(ci): add incremental builds and shared content-utils (#6582 ) - Incremental Markdown build for PRs, full build for production - Shared content-utils library for: - Mapping shared content to consuming pages (Markdown generation, Cypress) - Listing changed content pages (committed, uncommitted, staged) - Extracting source frontmatter (docs edit) - Fix CSS parsing warnings with JSDOM VirtualConsole - Remove unused imports and variables		2025-12-01 19:45:42 -05:00
lib	feat(ci): add incremental builds and shared content-utils (#6582 )	2025-12-01 19:45:42 -05:00
rust-markdown-converter	Feature: Generate documentation in LLM-friendly Markdown (#6555 )	2025-12-01 12:32:28 -06:00
schemas	Jts agentsmd (#6486 )	2025-10-28 07:20:13 -05:00
templates	chore(docs): Redesign docs CLI tools for creating and editing content, add content/create.md tutorial page for the How to creat… (#6506 )	2025-11-03 10:18:15 -06:00
README.md	Feature: Generate documentation in LLM-friendly Markdown (#6555 )	2025-12-01 12:32:28 -06:00
add-placeholders.js	chore(docs): Redesign docs CLI tools for creating and editing content, add content/create.md tutorial page for the How to creat… (#6506 )	2025-11-03 10:18:15 -06:00
build-llm-markdown.js	feat(ci): add incremental builds and shared content-utils (#6582 )	2025-12-01 19:45:42 -05:00
deploy-staging.sh	Feature: Generate documentation in LLM-friendly Markdown (#6555 )	2025-12-01 12:32:28 -06:00
docs-cli.js	chore(docs): Redesign docs CLI tools for creating and editing content, add content/create.md tutorial page for the How to creat… (#6506 )	2025-11-03 10:18:15 -06:00
docs-create.js	chore(docs): Redesign docs CLI tools for creating and editing content, add content/create.md tutorial page for the How to creat… (#6506 )	2025-11-03 10:18:15 -06:00
docs-edit.js	feat(ci): add incremental builds and shared content-utils (#6582 )	2025-12-01 19:45:42 -05:00
html-to-markdown.js	Feature: Generate documentation in LLM-friendly Markdown (#6555 )	2025-12-01 12:32:28 -06:00
setup-local-bin.js	chore(docs): Redesign docs CLI tools for creating and editing content, add content/create.md tutorial page for the How to creat… (#6506 )	2025-11-03 10:18:15 -06:00

README.md

Documentation Build Scripts

html-to-markdown.js

Converts Hugo-generated HTML files to fully rendered Markdown, with shortcodes evaluated, shared content dereferenced, and comments removed.

Purpose

This script generates production-ready Markdown output for LLM consumption and user downloads. The generated Markdown:

- Has all Hugo shortcodes evaluated to text (e.g., {{% product-name %}} → "InfluxDB 3 Core")
- Includes dereferenced shared content in the body
- Removes HTML/Markdown comments
- Adds product context to frontmatter
- Mirrors the HTML version but in clean Markdown format
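
As an illustration of the comment-removal step listed above, here is a minimal sketch. The function name `stripComments` is hypothetical and the approach is an assumption; the actual script may remove comments at the HTML stage instead.

```javascript
// Hypothetical helper (not taken from html-to-markdown.js):
// remove HTML-style comments (<!-- ... -->) from Markdown text.
// Caveat: this naive version also strips comments inside code samples.
function stripComments(markdown) {
  return markdown
    .replace(/<!--[\s\S]*?-->/g, '') // drop each comment, including multi-line ones
    .replace(/\n{3,}/g, '\n\n'); // collapse the blank lines left behind
}
```

A production version would likely need to skip fenced code blocks so that documented comment syntax survives.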

Usage

# Generate all markdown files (run after Hugo build)
yarn build:md

# Generate with verbose logging
yarn build:md:verbose

# Generate for specific path
node scripts/html-to-markdown.js --path influxdb3/core

# Generate limited number for testing
node scripts/html-to-markdown.js --limit 10

# Combine options
node scripts/html-to-markdown.js --path telegraf/v1 --verbose

Options

- --path <path>: Process specific path within public/ (default: process all)
- --limit <n>: Limit number of files to process (useful for testing)
- --verbose: Enable detailed logging of conversion progress
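
The three options above could be parsed roughly as follows. This is a sketch using Node's built-in `util.parseArgs`; the function name `parseCliOptions`, the defaults, and the validation are assumptions, and the real script may parse its arguments differently.

```javascript
const { parseArgs } = require('node:util');

// Hypothetical parser for the documented flags: --path, --limit, --verbose.
function parseCliOptions(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      path: { type: 'string' },
      limit: { type: 'string' },
      verbose: { type: 'boolean' },
    },
  });

  // Assumption: no --limit means "process everything".
  const limit =
    values.limit === undefined ? Infinity : Number.parseInt(values.limit, 10);
  if (Number.isNaN(limit) || limit < 1) {
    throw new Error(`--limit expects a positive integer, got "${values.limit}"`);
  }

  return {
    path: values.path ?? null, // null means "process all of public/"
    limit,
    verbose: values.verbose === true,
  };
}
```

For example, `parseCliOptions(process.argv.slice(2))` would read the flags from the command line. `util.parseArgs` requires Node.js 18.3 or later and rejects unknown flags by default.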

Build Process

Hugo generates HTML (with all shortcodes evaluated):
```
npx hugo --quiet
```
Script converts HTML to Markdown:
```
yarn build:md
```
Generated files:
- Location: public/**/index.md (alongside index.html)
- Git status: Ignored (entire public/ directory is gitignored)
- Deployment: Generated at build time, like API docs
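
To make the output location concrete, the sketch below maps an HTML file to the Markdown file written beside it. The helper name `markdownPathFor` is hypothetical, and treating only `index.html` files as inputs is an assumption based on the `public/**/index.md` pattern above.

```javascript
const path = require('node:path');

// Hypothetical helper: given public/.../index.html, return public/.../index.md.
// Uses POSIX separators so the result is the same on every platform.
function markdownPathFor(htmlPath) {
  if (path.posix.basename(htmlPath) !== 'index.html') {
    return null; // assumption: only index.html pages get a Markdown twin
  }
  return path.posix.join(path.posix.dirname(htmlPath), 'index.md');
}
```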

Features

Product Context Detection

Automatically detects and adds product information to frontmatter:

---
title: Set up InfluxDB 3 Core
description: Install, configure, and set up authorization...
url: /influxdb3/core/get-started/setup/
product: InfluxDB 3 Core
product_version: core
date: 2025-11-13
lastmod: 2025-11-13
---

Supported products:

InfluxDB 3 Core, Enterprise, Cloud Dedicated, Cloud Serverless, Clustered
InfluxDB v2, v1, Cloud (TSM), Enterprise v1
Telegraf, Chronograf, Kapacitor, Flux
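
Product detection can be pictured as a lookup from URL path prefix to product metadata. The sketch below is an assumption about the shape of such a map, not the script's actual PRODUCT_MAP: the keys, the `detectProduct` name, and every entry except the InfluxDB 3 Core values shown in the frontmatter example are hypothetical.

```javascript
// Hypothetical subset of a product map keyed by URL path prefix.
const PRODUCT_MAP = {
  '/influxdb3/core/': { product: 'InfluxDB 3 Core', product_version: 'core' },
  // The two entries below are illustrative guesses at the naming pattern.
  '/influxdb3/enterprise/': { product: 'InfluxDB 3 Enterprise', product_version: 'enterprise' },
  '/telegraf/v1/': { product: 'Telegraf', product_version: 'v1' },
};

// Return the metadata for the longest prefix that matches the page URL,
// or null when no product is known for that path.
function detectProduct(urlPath) {
  const prefix = Object.keys(PRODUCT_MAP)
    .filter((candidate) => urlPath.startsWith(candidate))
    .sort((a, b) => b.length - a.length)[0];
  return prefix ? PRODUCT_MAP[prefix] : null;
}
```

Preferring the longest match keeps nested paths unambiguous if a shorter prefix is ever added to the map.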

Turndown Configuration

Custom Turndown rules for InfluxData documentation:

- Code blocks: Preserves language identifiers
- GitHub callouts: Converts to > [!Note] format
- Tables: GitHub-flavored markdown tables
- Lists: Preserves nested lists and formatting
- Links: Keeps relative links intact
- Images: Preserves alt text and paths
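
As a sketch of what one such rule might look like, the object below has the `filter` and `replacement` shape that Turndown's `addRule(key, rule)` accepts. The rule name and the assumption that Hugo emits `<pre><code class="language-…">` are illustrative; the script's actual rules may differ.

```javascript
const FENCE = '`'.repeat(3);

// Hypothetical rule: convert <pre><code class="language-xyz"> into a fenced
// block that keeps the language identifier.
const fencedCodeWithLanguage = {
  // Match only <pre> elements whose first child is <code>.
  filter: (node) =>
    node.nodeName === 'PRE' &&
    Boolean(node.firstChild) &&
    node.firstChild.nodeName === 'CODE',

  replacement: (content, node) => {
    const className = node.firstChild.getAttribute('class') || '';
    const match = className.match(/language-(\S+)/);
    const language = match ? match[1] : '';
    const code = node.firstChild.textContent.replace(/\n$/, '');
    return `\n\n${FENCE}${language}\n${code}\n${FENCE}\n\n`;
  },
};

// Usage, assuming the turndown package is installed:
//   const TurndownService = require('turndown');
//   const service = new TurndownService();
//   service.addRule('fencedCodeWithLanguage', fencedCodeWithLanguage);
```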

Content Extraction

Extracts only article content (removes navigation, footer, etc.):

Target selector: article.article--content
Skips files without article content (with warning)
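
The extraction step can be sketched as a single selector lookup with a warning fallback. The function name `extractArticleHtml` and the injectable `warn` parameter are assumptions; `document` stands for any DOM Document, for example one produced by jsdom.

```javascript
// Hypothetical helper: return the article's inner HTML, or null (with a
// warning) when the page has no article.article--content element.
function extractArticleHtml(document, filePath, warn = console.warn) {
  const article = document.querySelector('article.article--content');
  if (!article) {
    warn(`⚠️  No article content found in ${filePath}`);
    return null; // caller skips this file
  }
  return article.innerHTML;
}
```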

Integration

Local Development:

# After making content changes
npx hugo --quiet && yarn build:md

CircleCI Build Pipeline:

The script runs automatically in the CircleCI build pipeline after Hugo generates HTML:

# .circleci/config.yml
- run:
    name: Hugo Build
    command: yarn hugo --environment production --logLevel info --gc --destination workspace/public
- run:
    name: Generate LLM-friendly Markdown
    command: node scripts/html-to-markdown.js

Build order:

Hugo builds HTML → workspace/public/**/*.html
html-to-markdown.js converts HTML → workspace/public/**/*.md
All files deployed to S3

Production Build (Manual):

npx hugo --quiet
yarn build:md

Watch Mode: During development, run the Hugo server so that HTML regenerates automatically, then rerun the Markdown build manually after content changes:

# Terminal 1: Hugo server
npx hugo server

# Terminal 2: After making changes
yarn build:md

Performance

Processing speed: ~10-20 files/second
Full site: 5,581 HTML files in ~5 minutes
Memory usage: Minimal (processes files sequentially)
Caching: None (regenerates from HTML each time)
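
The figures above can be cross-checked with simple arithmetic. The helper below is only an illustration (the name `estimateMinutes` is hypothetical): at 20 files per second, 5,581 files take about 4.7 minutes, which matches the quoted ~5 minutes; at 10 files per second the same run would take about 9.3 minutes.

```javascript
// Estimated wall-clock minutes for a sequential run at a steady rate.
function estimateMinutes(fileCount, filesPerSecond) {
  return fileCount / filesPerSecond / 60;
}

console.log(estimateMinutes(5581, 20).toFixed(1)); // 4.7
console.log(estimateMinutes(5581, 10).toFixed(1)); // 9.3
```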

Troubleshooting

No article content found:

⚠️  No article content found in /path/to/file.html

- The file has no element matching the article.article--content selector
- This usually happens with navigation pages or redirects
- Safe to ignore

Shortcodes still present:

Run the script only after Hugo has finished generating HTML. Hugo evaluates the shortcodes, and the script only converts the rendered output.
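
One way to confirm this symptom is to scan generated Markdown for leftover shortcode delimiters. The checker below is a hypothetical sketch, not part of the script; it looks for `{{% ... %}}` and `{{< ... >}}` patterns.

```javascript
// Hypothetical check: list any Hugo shortcode tags still present in the text.
function findUnrenderedShortcodes(markdown) {
  const pattern = /\{\{[<%][\s\S]*?[%>]\}\}/g;
  return markdown.match(pattern) || [];
}
```

Pages that document shortcode syntax in code samples would trigger false positives, so treat matches as hints rather than errors.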

Missing product context:

Check that URL path matches patterns in PRODUCT_MAP
Add new products to the map if needed