Commit Graph

74 Commits (gw/data-https-insecure-certificate)

Author SHA1 Message Date
Jason Stirnaman 4cb455b1ae
feat(ci): add incremental builds and shared content-utils (#6582)
- Incremental Markdown build for PRs, full build for production
- Shared content-utils library for:
  - Mapping shared content to consuming pages (Markdown generation, Cypress)
  - Listing changed content pages (committed, uncommitted, staged)
  - Extracting source frontmatter (docs edit)
- Fix CSS parsing warnings with JSDOM VirtualConsole
- Remove unused imports and variables
2025-12-01 19:45:42 -05:00
Jason Stirnaman 2dd1956a18
fix(ci): add --public-dir argument to build-llm-markdown.js (#6581)
CircleCI builds Hugo to workspace/public, but the markdown generator
was hardcoded to look in public/. Added --public-dir argument:

- Default: public (for local dev and staging)
- CI: --public-dir workspace/public

Staging deployment (deploy-staging.sh) uses default public/ and
continues to work unchanged.
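
The argument handling described above could be sketched as follows. This is a minimal illustration that assumes Node's built-in `util.parseArgs`; the helper name `resolvePublicDir` is hypothetical, and the actual parsing in build-llm-markdown.js may differ.

```javascript
const { parseArgs } = require('node:util');

// Hypothetical helper: returns the directory that holds Hugo's HTML output.
// Falls back to "public" (local dev and staging) when --public-dir is absent.
function resolvePublicDir(argv) {
  const { values } = parseArgs({
    args: argv,
    options: { 'public-dir': { type: 'string' } },
  });
  return values['public-dir'] ?? 'public';
}

// Example call from a script entry point:
//   const publicDir = resolvePublicDir(process.argv.slice(2));
```

In CI the script would then be invoked with `--public-dir workspace/public`, while local and staging runs pass nothing and get `public`.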
2025-12-01 13:19:01 -06:00
Jason Stirnaman c2093c8212
Feature: Generate documentation in LLM-friendly Markdown (#6555)
* feat(llms): LLM-friendly Markdown, ChatGPT and Claude links.

This enables LLM-friendly documentation for entire sections,
allowing users to copy complete documentation sections with a single click.

Lambda@Edge now generates .md files on-demand with:
- Evaluated Hugo shortcodes
- Proper YAML frontmatter with product metadata
- Clean markdown without UI elements
- Section aggregation (parent + children in single file)

The llms.txt files are now generated automatically during build from
content structure and product metadata in data/products.yml, eliminating
the need for hardcoded files and ensuring maintainability.

**Testing**:
- Automated markdown generation in test setup via cy.exec()
- Implement dynamic content validation that extracts HTML content and
  verifies it appears in markdown version

**Documentation**:
Documents LLM-friendly markdown generation

**Details**:
Add gzip decompression for S3 HTML files in Lambda markdown generator

HTML files stored in S3 are gzip-compressed but the Lambda was attempting
to parse compressed data as UTF-8, causing JSDOM to fail to find article
elements. This resulted in 404 errors for .md and .section.md requests.

- Add zlib gunzip decompression in s3-utils.js fetchHtmlFromS3()
- Detect gzip via ContentEncoding header or magic bytes (0x1f 0x8b)
- Add configurable DEBUG constant for verbose logging
- Add debug logging for buffer sizes and decompression in both files

The decompression adds ~1-5ms per request but is necessary to parse
HTML correctly. CloudFront caching minimizes Lambda invocations.
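
The detection step described above might look roughly like this. It is a sketch only: `isGzip` and `decodeHtml` are hypothetical names, not the contents of s3-utils.js, and the synchronous `zlib.gunzipSync` call is used here for brevity.

```javascript
const zlib = require('node:zlib');

// Hypothetical helper: true when the header says gzip or the buffer
// starts with the gzip magic bytes 0x1f 0x8b.
function isGzip(buffer, contentEncoding) {
  if (typeof contentEncoding === 'string' &&
      contentEncoding.toLowerCase().includes('gzip')) {
    return true;
  }
  return buffer.length >= 2 && buffer[0] === 0x1f && buffer[1] === 0x8b;
}

// Hypothetical helper: returns the HTML as a UTF-8 string,
// decompressing first when the payload is gzip-compressed.
function decodeHtml(buffer, contentEncoding) {
  const raw = isGzip(buffer, contentEncoding) ? zlib.gunzipSync(buffer) : buffer;
  return raw.toString('utf8');
}
```

Checking the magic bytes as well as the header covers objects that were uploaded compressed but without a `ContentEncoding` value.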

Await async markdown conversion functions

The convertToMarkdown and convertSectionToMarkdown functions are async
but weren't being awaited, causing the Lambda to return a Promise object
instead of a string. This resulted in CloudFront validation errors:
"The body is not a string, is not an object, or exceeds the maximum size"

**Troubleshooting**:

- Set DEBUG for troubleshooting in lambda
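
For the missing-await fix described under Details above, the following sketch shows why the response body was a Promise rather than a string. The `convertToMarkdown` body here is a trivial stand-in and `buildResponse` is a hypothetical name; neither reflects the real converter or handler.

```javascript
// Stand-in for the real async converter (assumption: it resolves to a string).
async function convertToMarkdown(html) {
  return html.replace(/<[^>]+>/g, '').trim();
}

// Without `await`, `body` would be a pending Promise, which CloudFront
// rejects because the response body must be a string.
async function buildResponse(html) {
  const body = await convertToMarkdown(html);
  return { status: '200', body };
}
```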

* feat(llms): Add build-time LLM-friendly Markdown generation

Implements static Markdown generation during Hugo build.

**Key Features:**
- Two-phase generation: HTML→MD (memory-bounded), MD→sections (fast)
- Automatic redirect detection via file size check (skips Hugo aliases)
- Product detection using compiled TypeScript product-mappings module
- Token estimation for LLM context planning (4 chars/token heuristic)
- YAML serialization with description sanitization

**Performance:**
- ~105 seconds for 5,000 pages + 500 sections
- ~300MB peak memory (safe for 2GB CircleCI environment)
- 23 files/sec conversion rate with controlled concurrency

**Configuration Parameters:**
- MIN_HTML_SIZE_BYTES (default: 1024) - Skip files below threshold
- CHARS_PER_TOKEN (default: 4) - Token estimation ratio
- Concurrency: 10 workers (CI), 20 workers (local)
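
The size threshold and token heuristic listed above can be illustrated with a short sketch. The constant names and defaults come from this commit message; the function names and the choice to round up are assumptions, not the script's actual code.

```javascript
const MIN_HTML_SIZE_BYTES = 1024; // default from the commit message
const CHARS_PER_TOKEN = 4;        // default from the commit message

// Hypothetical helper: files below the threshold are treated as
// Hugo alias (redirect) stubs and skipped.
function isLikelyRedirect(sizeBytes) {
  return sizeBytes < MIN_HTML_SIZE_BYTES;
}

// Hypothetical helper: rough token count for LLM context planning.
// Rounding up is an assumption made for this sketch.
function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}
```

Under these assumptions, a 10,000-character page would be estimated at 2,500 tokens.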

**Output:**
- Single pages: public/*/index.md (with frontmatter + content)
- Section bundles: public/*/index.section.md (aggregated child pages)

**Files Changed:**
- scripts/build-llm-markdown.js (new) - Main build script
- scripts/lib/markdown-converter.cjs (renamed from .js) - Core conversion
- scripts/html-to-markdown.js - Updated import path
- package.json - Updated exports for .cjs module

Related: Replaces Lambda@Edge on-demand generation (5s response time)
with build-time static generation for production deployment.

feat(deploy): Add staging deployment workflow and update CI

Integrates LLM markdown generation into deployment workflows with
a complete staging deployment solution.

**CircleCI Updates:**
- Switch from legacy html-to-markdown.js to optimized build:md
- 2x performance improvement (105s vs 200s+ for 5000 pages)
- Better memory management (300MB vs variable)
- Enables section bundle generation (index.section.md files)

**Staging Deployment:**
- New scripts/deploy-staging.sh for local staging deploys
- Complete workflow: Hugo build → markdown gen → S3 upload
- Environment variable driven configuration
- Optional step skipping for faster iteration
- CloudFront cache invalidation support

**NPM Scripts:**
- Added deploy:staging command for convenience
- Wraps deploy-staging.sh script

**Documentation:**
- Updated DOCS-DEPLOYING.md with comprehensive guide
- Merged staging/production workflows with Lambda@Edge docs
- Build-time generation now primary, Lambda@Edge fallback
- Troubleshooting section with common issues
- Environment variable reference
- Performance metrics and optimization tips

**Benefits:**
- Manual staging validation before production
- Consistent markdown generation across environments
- Faster CI builds with optimized script
- Better error handling and progress reporting
- Section aggregation for improved LLM context

**Usage:**
```bash
export STAGING_BUCKET="test2.docs.influxdata.com"
export AWS_REGION="us-east-1"
export STAGING_CF_DISTRIBUTION_ID="E1XXXXXXXXXX"

yarn deploy:staging
```

Related: Completes build-time markdown generation implementation

refactor: Remove Lambda@Edge implementation

Build-time markdown generation has replaced Lambda@Edge on-demand
generation as the primary method. Removed Lambda code and updated
documentation to focus on build-time generation and testing.

Removed:
- deploy/llm-markdown/ directory (Lambda@Edge code)
- Lambda@Edge section from DOCS-DEPLOYING.md

Added:
- Testing and Validation section in DOCS-DEPLOYING.md
- Focus on build-time generation workflow

* feat: Add Rust HTML-to-Markdown prototype

Implements core markdown-converter.cjs functions in Rust for performance comparison.

Performance results:
- Rust: ~257 files/sec (10× faster)
- JavaScript: ~25 files/sec average

Recommendation: Keep JavaScript for now, implement incremental builds first.
Rust migration provides 10× speedup but requires 3-4 weeks integration effort.

Files:
- Cargo.toml: Rust dependencies (html2md, scraper, serde_yaml, clap)
- src/main.rs: Core conversion logic + CLI benchmark tool
- benchmark-comparison.js: Side-by-side performance testing
- README.md: Comprehensive findings and recommendations

* fix(ui): improve dropdown positioning on viewport resize

- Ensure dropdown stays within viewport bounds (min 8px padding)
- Reposition dropdown on window resize and scroll events
- Clean up event listeners when dropdown closes
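
The viewport-bounds rule above amounts to clamping the dropdown's position. A minimal sketch, assuming a horizontal-only calculation and a hypothetical `clampDropdownLeft` helper (the real component may also handle vertical placement):

```javascript
const VIEWPORT_PADDING = 8; // minimum gap from the viewport edge, in px

// Hypothetical helper: keeps the dropdown's left edge inside the viewport.
// If the dropdown is wider than the viewport, it pins to the left padding.
function clampDropdownLeft(desiredLeft, dropdownWidth, viewportWidth,
                           padding = VIEWPORT_PADDING) {
  const maxLeft = viewportWidth - dropdownWidth - padding;
  return Math.max(padding, Math.min(desiredLeft, maxLeft));
}
```

A resize or scroll listener would recompute this value while the dropdown is open and be removed when it closes.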

* chore(deps): add remark and unified packages for markdown processing

Add remark-parse, remark-frontmatter, remark-gfm, and unified for
enhanced markdown processing capabilities.

* fix(edge): add return to prevent trailing-slash redirect for valid extensions

Without the return statement, the Lambda@Edge function would continue
executing after the callback, eventually hitting the trailing-slash
redirect logic. This caused .md files to redirect to URLs with trailing
slashes, which returned 404 from S3.
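
The control-flow bug described above can be reproduced in miniature. This sketch uses a simplified request/response shape, not the real CloudFront event format, and `handleRequest` is a hypothetical name.

```javascript
// Simplified stand-in for an edge handler (not the real CloudFront event shape).
function handleRequest(uri, callback) {
  if (uri.endsWith('.md')) {
    callback(null, { uri });
    return; // without this, execution falls through to the redirect below
  }
  if (!uri.endsWith('/')) {
    callback(null, { status: '301', location: uri + '/' });
    return;
  }
  callback(null, { uri });
}
```

Invoking the callback does not stop the function, so the explicit `return` is what prevents a second, conflicting response.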

* fix(md): add built-in product mappings and full URL support

- Add URL_PATTERN_MAP and PRODUCT_NAME_MAP constants directly in the
  CommonJS module (ESM product-mappings.js cannot be require()'d)
- Update generateFrontmatter() to accept baseUrl parameter and construct
  full URLs for the frontmatter url field
- Update generateSectionFrontmatter() similarly for section pages
- Update all call sites to pass baseUrl parameter

This fixes empty product fields and relative URLs in generated markdown
frontmatter when served via Lambda@Edge.

* feat(md): add environment flag for base URL control

Add -e, --env flag to html-to-markdown.js to control the base URL
in generated markdown frontmatter. This matches Hugo's -e flag behavior
and allows generating markdown with staging or production URLs.

Also update build-llm-markdown.js with similar environment support.

* feat(md): add Rust markdown converter and improve validation

- Add Rust-based HTML-to-Markdown converter with NAPI-RS bindings
- Update Cypress markdown validation tests
- Update deploy-staging.sh with force upload flag

* deploy-staging.sh:
  - Defaults STAGING_URL to https://test2.docs.influxdata.com if not set
  - Exports it so yarn build:md -e staging can use it
  - Displays it in the summary

* Delete scripts/prototypes/rust-markdown/benchmark-comparison.js

* Delete scripts/prototypes directory

* fix(llms): Include full URL for section page Markdown and list of child pages

* feat(llms): clarify format selector text for AI use case

Update button and dropdown text to make the AI/LLM purpose clearer:
- Button: "Copy page for AI" / "Copy section for AI"
- Sublabel: "Clean Markdown optimized for AI assistants"
- Section sublabel: "{N} pages combined as clean Markdown for AI assistants"

Cypress tests updated and passing (13/13).

---------

Co-authored-by: Scott Anderson <scott@influxdata.com>
2025-12-01 12:32:28 -06:00
Jason Stirnaman c034ba8f5d Updated config.yml 2025-09-15 14:19:11 -05:00
Jason Stirnaman 717ec5cd1d chore(js): Extract Hugo params imports to single-purpose modules, fix environment-specific Hugo configs, use Hugo environments instead of specifying the config file, configure source maps and ESM for development and testing 2025-06-09 16:46:26 -05:00
Jason Stirnaman 4973026adf Merge pull request #6079 from influxdata/chore-js-refactor-footer-scripts-modules
Chore: JavaScript: refactor footer scripts modules
2025-06-09 14:40:37 -05:00
Jason Stirnaman f819f48de9
Revert "Chore: JavaScript: refactor footer scripts modules" 2025-06-05 09:46:37 -05:00
Jason Stirnaman aac841a749 fix(ci): Disable unnecessary SCSS processing when building JS assets.
- Build resources if not cached
- Ensure node_modules dependencies are available for asset processing
- Be more precise in the template when building assets in production mode and avoid conflicts with SCSS and CSS processing.
- Ignore node_modules when loading source maps
- Add .vscode/launch.json with debugging configuration for localhost:1313. In VSCode, go to Run and select the site to launch Chrome, connect to Developer Tools, and start debugging.
- Add support for debugging in VS Code without using source maps. Adds a debug helpers module for developers to use in IDE debugging and interact with the browser console. This is a workaround for lack of good source map support with js.Build and the Hugo asset pipeline.

Background:
Hugo and js.Build don't have good support for external source maps.
Internal source mapping is also unreliable; the base64 in the source map reference for some files is too long for the browser console to keep on 1 line--remaining characters are printed on the next line, resulting in a syntax error.
2025-06-04 14:21:26 -05:00
Jason Stirnaman 5c419c18bb fix(ci): Clear Hugo cache before build:
- When building in CircleCI, the previous Hugo config changes prevent Hugo from finding previously processed CSS in the file cache. Clean up the cache before building.
2025-06-04 13:24:51 -05:00
Jason Stirnaman 13129c5687 chore(js): Unbundle assets in Hugo development and testing environments:
- Keep assets unbundled for easier debugging in development and testing.
- Copies other config from development to testing.
2025-06-04 13:24:51 -05:00
Jason Stirnaman 1dce052e56 fix(JS): Rename CommonJS scripts to .cjs extension, keep type: module as the project default. Update and fix ESLint configuration.
- Renames JavaScript files in flux-build-scripts and api-docs/openapi/plugins to .cjs file extension to declare them as CommonJS module syntax.
2025-05-19 11:34:42 -05:00
Andreas Deininger 35ba19d3cb
Docu: fix broken link (#5410)
Fix broken link

Co-authored-by: Scott Anderson <sanderson@users.noreply.github.com>
2024-04-08 16:07:49 -06:00
Andreas Deininger 0488345ab7
Fix errors when running latest hugo version v0.123.8 (#5356)
Co-authored-by: Scott Anderson <sanderson@users.noreply.github.com>
2024-03-11 17:31:33 -06:00
Andreas Deininger 476a73e95e
Fixing typos (#5315)
* Fix typos

* Bump hugo to latest version v0.122.0

---------

Co-authored-by: Scott Anderson <sanderson@users.noreply.github.com>
2024-02-05 09:51:51 -07:00
Andreas Deininger 57e4929746
Bump hugo to latest version v0.121.2 (#5293)
* Bump hugo to latest version v0.121.2

* Fix deprecation warnings emitted from hugo

---------

Co-authored-by: Jason Stirnaman <stirnamanj@gmail.com>
Co-authored-by: Scott Anderson <sanderson@users.noreply.github.com>
2024-01-17 09:15:00 -07:00
Andreas Deininger 7634b9ba1a
Upgrade hugo to latest version 0.120.4 (#5241) 2023-11-27 13:39:23 -07:00
Scott Anderson 08b40beab4 updated circleci config 2023-01-31 13:40:57 -07:00
Scott Anderson 176ea2e420 updated circleci config 2023-01-31 13:31:46 -07:00
Scott Anderson 0082012e9a updated circleci image 2023-01-31 13:18:08 -07:00
Scott Anderson 5ce0d1ec47
Show what versions of InfluxDB Flux functions are supported in (#4219)
* automatically build flux version data file

* rendered modals on each stdlib page

* polish on flux-influxdb version support

* reverted modified frontmatter

* added new page that shows what flux versions are packaged with each InfluxDB version

* added comment to js file

* updated to address PR feedback

* moved flux build scripts into their own directory

* updated inject-flux-frontmatter script to work in subdirectory

* updated flux-versions script to work in nested directory

* fixed bug in flux-versions script
2022-07-19 15:14:10 -06:00
Scott Anderson 909980c765
Auto-generated Flux docs (#4158)
* WIP testing autogen

* WIP autogen docs, added contributors partial

* updated stdlib

* WIP autogen

* Inject Flux stdlib frontmatter script (#4157)

* added frontmatter injection script and frontmatter data file

* regenerated stdlib

* finalize frontmatter injection script

* add frontmatter injection to CI build

* remove debug line from frontmatter script

* fresh docs generate

* fresh generate
2022-06-23 16:21:58 -06:00
pierwill 4effc16f4a
Use hardcoded checksum for `s3deploy` (#2800)
Also remove unused install script
2021-06-30 15:48:17 -05:00
Jason Stirnaman f13e34de6d
Install dependencies as project dependencies from NPM repo (#2476)
* Added hugo-extended, postcss, postcss-cli, and autoprefixer as devDependencies. Run npm install or yarn install. (#2474)

* Replaced global hugo and yarn installs with project-level yarn install.

* Replaced npm package.lock with yarn.lock (#2474).

* enhancement: update README with instructions for installing Node.js dependencies. (#2474)

* updated api doc generator script to use npx

* Update README.md

Co-authored-by: kelseiv <47797004+kelseiv@users.noreply.github.com>

* Update README.md

Co-authored-by: kelseiv <47797004+kelseiv@users.noreply.github.com>

* Update README.md

Co-authored-by: kelseiv <47797004+kelseiv@users.noreply.github.com>

* Update README.md

Co-authored-by: kelseiv <47797004+kelseiv@users.noreply.github.com>

* Update README.md

Co-authored-by: Scott Anderson <sanderson@users.noreply.github.com>

* Update package.json

Co-authored-by: Scott Anderson <sanderson@users.noreply.github.com>

* fix: indentation. (#2476)

* update: Added separate dependencies list for api-docs.
- Moved redoc-cli to a separate package.json in api-docs. Excluded
api-docs/node_modules from generate-api-docs.sh.
- Updated redoc-cli argument sequence to agree with their docs.
- Updated READMEs.
- Fixed typos.

* update: add api-docs > yarn install to .circleci

* Added language and consistency to code block. Specify where to run the command.

Co-authored-by: Scott Anderson <scott@influxdata.com>
Co-authored-by: kelseiv <47797004+kelseiv@users.noreply.github.com>
Co-authored-by: Scott Anderson <sanderson@users.noreply.github.com>
2021-05-24 12:11:01 -05:00
Scott Anderson 9b312413c1 hotfix: trying to fix api docs 2021-03-31 11:12:05 -06:00
Scott Anderson d48e149544 hotfix: trying to fix api docs 2021-03-31 11:10:10 -06:00
Scott Anderson c0157b9622
Redoc fix (#2376)
* update redoc-cli version

* reverted redoc-cli version

* removed redoc version
2021-03-31 10:09:34 -06:00
Scott Anderson 05d9119720
update redoc-cli version (#2374) 2021-03-31 09:39:24 -06:00
Scott Anderson fb515407bb
Redoc fix (#2228)
* bumped redoc version
2021-03-01 14:34:05 -07:00
Scott Anderson 2c5d92fd4a
Redoc fix (#2227)
* specified version for redoc-cli

* revert redoc-cli version change

* troubleshooting cli docs build

* fixed bad merge conflict resolution
2021-03-01 14:18:18 -07:00
Scott Anderson 0e8689ab31
specified version for redoc-cli (#2225) 2021-03-01 14:04:29 -07:00
Scott Anderson 46d0f16dc0
upgraded hugo to 0.81.0, fixed isset build error, fixed formatting in grafana doc (#2218) 2021-02-26 21:54:41 -07:00
Scott Anderson 1657a0a9f1 use specific postcss-cli version 2020-12-14 13:33:24 -07:00
Scott Anderson 9f911ed81b hotfix: added specific autoprefixer version to fix CI build 2020-09-16 10:04:12 -06:00
Scott Anderson 66dba909c1 updated circle ci script 2020-09-02 17:55:33 -06:00
Scott Anderson 689b6b571f
updated deploy files (#1387) 2020-09-02 14:18:46 -06:00
Scott Anderson aa82e71618 switched from npm to yarn in ci build 2020-04-22 12:20:18 -06:00
Scott Anderson a5b4684d25 switched from npm to yarn in ci build 2020-04-22 12:18:44 -06:00
Scott Anderson 074d1b25e3 updated ci build script 2020-04-22 12:08:31 -06:00
Scott Anderson ae22623ca5 updated npm cache clear build command 2020-04-22 12:03:33 -06:00
Scott Anderson 332da50d56 updated npm cache clear build command 2020-04-22 11:58:38 -06:00
Scott Anderson 18ff330495 updated npm cache clear build command 2020-04-22 11:58:03 -06:00
Scott Anderson 7210c64729 updated npm cache clear build command 2020-04-22 11:57:29 -06:00
Scott Anderson 1f93ef576d updated npm cache clear build command 2020-04-22 11:56:31 -06:00
Scott Anderson 171e5bee3a removed tmp cache command from build script 2020-04-22 11:51:36 -06:00
Scott Anderson 14cc863ac0 added correct cache permission to ci build script 2020-04-22 11:47:55 -06:00
Scott Anderson 1e107f46bc added CI cache versions to other build commands 2020-04-22 11:46:26 -06:00
Scott Anderson 9c7032197c fixed yaml in circle ci build config 2020-04-22 11:45:47 -06:00
Scott Anderson b32c52a734 added tmp ci command to flush npm cache 2020-04-22 11:42:54 -06:00
Scott Anderson 0c62c0f470 added cache version to circle ci build config 2020-04-22 11:36:46 -06:00
Scott Anderson 31d8ee2ab6 updated circle ci image 2020-04-22 11:31:58 -06:00