chore(instructions): restructure agent instructions build process

Remove contributing.instructions.md build logic and simplify to only
build platform reference from products.yml. Update lefthook to only
trigger on products.yml changes instead of CONTRIBUTING.md.
pull/6455/head
Jason Stirnaman 2025-10-08 16:15:30 -05:00
parent 189d86fcda
commit 3a857320e8
3 changed files with 3 additions and 615 deletions


@@ -1,302 +0,0 @@
---
applyTo: "content/**/*.md, layouts/**/*.html"
---
# Contributing instructions for InfluxData Documentation
## Purpose and scope
Help document InfluxData products
by creating clear, accurate technical content with proper
code examples, frontmatter, shortcodes, and formatting.
## Quick Start
Ready to contribute?
1. [Sign the InfluxData CLA](#sign-the-influxdata-cla) (for substantial changes)
2. [Fork and clone](#fork-and-clone-influxdata-documentation-repository) this repository
3. [Install dependencies](#development-environment-setup) (Node.js, Yarn, Docker)
4. Make your changes following [style guidelines](#making-changes)
5. [Test your changes](TESTING.md) (pre-commit and pre-push hooks run automatically)
6. [Submit a pull request](#submission-process)
For detailed setup and reference information, see the sections below.
---
### Sign the InfluxData CLA
The InfluxData Contributor License Agreement (CLA) is part of the legal framework
for the open source ecosystem that protects both you and InfluxData.
To make substantial contributions to InfluxData documentation, first sign the InfluxData CLA.
What constitutes a "substantial" change is at the discretion of InfluxData documentation maintainers.
[Sign the InfluxData CLA](https://www.influxdata.com/legal/cla/)
_**Note:** Typo and broken link fixes are greatly appreciated and do not require signing the CLA._
_If you're new to contributing or you're looking for an easy update, see [`docs-v2` good-first-issues](https://github.com/influxdata/docs-v2/issues?q=is%3Aissue+is%3Aopen+label%3Agood-first-issue)._
### Fork and clone InfluxData Documentation Repository
[Fork this repository](https://help.github.com/articles/fork-a-repo/) and
[clone it](https://help.github.com/articles/cloning-a-repository/) to your local machine.
---
### Prerequisites
docs-v2 automatically runs format linting (Markdown, JS, and CSS) and code block tests on staged files when you commit.
For the linting and tests to run, you need to install:
- **Node.js and Yarn**: For managing dependencies and running build scripts
- **Docker**: For running Vale linter and code block tests
- **VS Code extensions** (optional): For enhanced editing experience
To skip the pre-commit hooks for a single commit, use the `--no-verify` flag:
```sh
git commit -m "<COMMIT_MESSAGE>" --no-verify
```
# ... (see full CONTRIBUTING.md for complete example)
To build the Docker image that the code block tests use:
```bash
docker build -t influxdata/docs-pytest:latest -f Dockerfile.pytest .
```
### Install Visual Studio Code extensions
- Comment Anchors: recognizes tags (for example, `//SOURCE`) and makes links and filepaths clickable in comments.
- Vale: shows linter errors and suggestions in the editor.
- YAML Schemas: validates frontmatter attributes.
_See full DOCS-CONTRIBUTING.md for complete details._
#### Markdown
Most docs-v2 documentation content uses [Markdown](https://en.wikipedia.org/wiki/Markdown).
_Some parts of the documentation, such as `./api-docs`, contain Markdown within YAML and rely on additional tooling._
#### Semantic line feeds
Use semantic line feeds: break lines at natural semantic boundaries, such as after each sentence or clause, to keep diffs small and reviews easy. For example:
```diff
-Data is taking off. This data is time series. You need a database that specializes in time series. You should check out InfluxDB.
+Data is taking off. This data is time series. You need a database that specializes in time series. You need InfluxDB.
# ... (see full CONTRIBUTING.md for complete example)
```
### Essential Frontmatter Reference
```yaml
title: # Title of the page used in the page's h1
description: # Page description displayed in search engine results
# ... (see full CONTRIBUTING.md for complete example)
```
### Shared Content
```yaml
source: /shared/path/to/content.md
```
_See full DOCS-CONTRIBUTING.md for complete details._
#### Callouts (notes and warnings)
```md
> [!Note]
> Insert note markdown content here.

> [!Warning]
> Insert warning markdown content here.

> [!Caution]
> Insert caution markdown content here.

> [!Important]
> Insert important markdown content here.

> [!Tip]
> Insert tip markdown content here.
```
#### Tabbed content
```md
{{< tabs-wrapper >}}
{{% tabs %}}
[Button text for tab 1](#)
[Button text for tab 2](#)
{{% /tabs %}}
{{% tab-content %}}
Markdown content for tab 1.
{{% /tab-content %}}
{{% tab-content %}}
Markdown content for tab 2.
{{% /tab-content %}}
{{< /tabs-wrapper >}}
```
#### Required elements
```md
{{< req >}}
{{< req type="key" >}}
- {{< req "\*" >}} **This element is required**
- {{< req "\*" >}} **This element is also required**
- **This element is NOT required**
```
For the complete shortcodes reference with all available shortcodes and usage examples, see **[SHORTCODES.md](SHORTCODES.md)**.
Test shortcodes with working examples in **[content/example.md](content/example.md)**.
---
### InfluxDB API documentation
docs-v2 includes the InfluxDB API reference documentation in the `/api-docs` directory. The files are written in YAML and follow the [OpenAPI 3.0](https://swagger.io/specification/) specification.
InfluxData uses [Redoc](https://github.com/Redocly/redoc) to generate the full
InfluxDB API documentation when the documentation is deployed.
For more information about editing and generating InfluxDB API documentation, see the
[API Documentation README](https://github.com/influxdata/docs-v2/tree/master/api-docs#readme).
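For orientation, a minimal OpenAPI 3.0 document has this shape (the path and descriptions below are illustrative placeholders, not taken from the actual `api-docs` sources):

```yaml
openapi: 3.0.0
info:
  title: Example API
  version: 1.0.0
paths:
  /ping:
    get:
      summary: Check server status
      responses:
        '204':
          description: Service is ready
```

Redoc renders each entry under `paths` as an endpoint reference page.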
---
## Testing & Quality Assurance
Pre-commit hooks run automatically when you commit changes, testing your staged files with Vale, Prettier, Cypress, and Pytest. To skip hooks if needed:
```sh
git commit -m "<COMMIT_MESSAGE>" --no-verify
```
### Quick Testing Reference
```bash
# Test code blocks
yarn test:codeblocks:all
# Test links
yarn test:links content/influxdb3/core/**/*.md
# Run style linting
docker compose run -T vale content/**/*.md
```
For comprehensive testing information, including code block testing, link validation, style linting, and advanced testing procedures, see **[TESTING.md](TESTING.md)**.
---
## Submission Process
_See full DOCS-CONTRIBUTING.md for complete details._
### Commit Guidelines
When creating commits, follow these guidelines:
- Use a clear, descriptive commit message that explains the change
- Start with a type prefix: `fix()`, `feat()`, `style()`, `refactor()`, `test()`, `chore()`
- For product-specific changes, include the product in parentheses: `fix(enterprise)`, `fix(influxdb3)`, `fix(core)`
- Keep the first line under 72 characters
- Reference issues with "closes" or "fixes": `closes #123` or `closes influxdata/DAR#123`
- For multiple issues, use comma separation: `closes influxdata/DAR#517, closes influxdata/DAR#518`
**Examples:**
```
fix(enterprise): correct Docker environment variable name for license email
fix(influxdb3): correct Docker environment variable and compose examples for monolith
feat(telegraf): add new plugin documentation
chore(ci): update Vale configuration
```
### Submit a pull request
Push your changes up to your forked repository, then [create a new pull request](https://help.github.com/articles/creating-a-pull-request/).
---
## Reference Documentation
For detailed reference documentation, see:
- **[FRONTMATTER.md](FRONTMATTER.md)** - Complete frontmatter field reference with all available options
- **[SHORTCODES.md](SHORTCODES.md)** - Complete shortcodes reference with usage examples for all available shortcodes
#### Vale style linting configuration
docs-v2 includes Vale writing style linter configurations to enforce documentation writing style rules, guidelines, branding, and vocabulary terms.
**Advanced Vale usage:**
```sh
docker compose run -T vale --config=content/influxdb/cloud-dedicated/.vale.ini --minAlertLevel=error content/influxdb/cloud-dedicated/write-data/**/*.md
```
- **Error**: Required rules, such as branding and vocabulary terms
- **Warning**: General style guide rules and best practices
- **Suggestion**: Style preferences that may require refactoring or updates to an exceptions list
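As a sketch, a Vale configuration that applies these alert levels might look like the following `.vale.ini` (the style name and paths here are illustrative assumptions, not the repository's actual settings):

```ini
StylesPath = styles
MinAlertLevel = suggestion

[*.md]
BasedOnStyles = InfluxDataDocs
```

`MinAlertLevel` controls which of the three levels are reported; the `--minAlertLevel` flag in the command above overrides it per run.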
#### Configure style rules
_See full DOCS-CONTRIBUTING.md for complete details._
#### JavaScript in the documentation UI
The InfluxData documentation UI uses TypeScript and JavaScript with ES6+ syntax, with
`assets/js/main.js` as the entry point for importing modules.
1. In your HTML file, add a `data-component` attribute to the element that the component should attach to.
# ... (see full CONTRIBUTING.md for complete example)
```js
import { debugLog, debugBreak, debugInspect } from './utils/debug-helpers.js';
const data = debugInspect(someData, 'Data');
debugLog('Processing data', 'myFunction');
function processData() {
// Add a breakpoint that works with DevTools
debugBreak();
// Your existing code...
}
```
3. Start Hugo in development mode. For example:
```bash
yarn hugo server
```
4. In VS Code, go to Run > Start Debugging, and select the "Debug JS (debug-helpers)" configuration.
Your system uses the configuration in `launch.json` to launch the site in Chrome
and attach the debugger to the Developer Tools console.
Make sure to remove the debug statements before merging your changes.
The debug helpers are designed to be used in development and should not be used in production.
_See full DOCS-CONTRIBUTING.md for complete details._


@@ -10,326 +10,16 @@ import process from 'process';
 import { execSync } from 'child_process';
 // Get the current file path and directory
-export { buildContributingInstructions, buildPlatformReference };
+export { buildPlatformReference };
 (async () => {
   try {
-    // Check if DOCS-CONTRIBUTING.md exists before trying to build instructions
-    const contributingPath = path.join(process.cwd(), 'DOCS-CONTRIBUTING.md');
-    if (fs.existsSync(contributingPath)) {
-      buildContributingInstructions();
-    } else {
-      console.log('⚠️ DOCS-CONTRIBUTING.md not found, skipping contributing instructions');
-    }
     await buildPlatformReference();
   } catch (error) {
     console.error('Error generating agent instructions:', error);
   }
 })();
-/** Build instructions from DOCS-CONTRIBUTING.md
- * This script reads DOCS-CONTRIBUTING.md, formats it appropriately,
- * and saves it to .github/instructions/contributing.instructions.md
- * Includes optimization to reduce file size for better performance
- */
-function buildContributingInstructions() {
-  // Paths
-  const contributingPath = path.join(process.cwd(), 'DOCS-CONTRIBUTING.md');
-  const instructionsDir = path.join(process.cwd(), '.github', 'instructions');
-  const instructionsPath = path.join(
-    instructionsDir,
-    'contributing.instructions.md'
-  );
-  // Ensure the instructions directory exists
-  if (!fs.existsSync(instructionsDir)) {
-    fs.mkdirSync(instructionsDir, { recursive: true });
-  }
-  // Read the CONTRIBUTING.md file
-  let content = fs.readFileSync(contributingPath, 'utf8');
-  // Optimize content by removing less critical sections for Copilot
-  content = optimizeContentForContext(content);
-  // Format the content for Copilot instructions with applyTo attribute
-  content = `---
-applyTo: "content/**/*.md, layouts/**/*.html"
----
-# Contributing instructions for InfluxData Documentation
-## Purpose and scope
-Help document InfluxData products
-by creating clear, accurate technical content with proper
-code examples, frontmatter, shortcodes, and formatting.
-${content}`;
-  // Write the formatted content to the instructions file
-  fs.writeFileSync(instructionsPath, content);
-  const fileSize = fs.statSync(instructionsPath).size;
-  const sizeInKB = (fileSize / 1024).toFixed(1);
-  console.log(
-    `✅ Generated instructions at ${instructionsPath} (${sizeInKB}KB)`
-  );
-  if (fileSize > 40000) {
-    console.warn(
-      `⚠️ Instructions file is large (${sizeInKB}KB > 40KB) and may ` +
-        `impact performance`
-    );
-  }
-  // Add the file to git if it has changed
-  try {
-    const gitStatus = execSync(
-      `git status --porcelain "${instructionsPath}"`
-    ).toString();
-    if (gitStatus.trim()) {
-      execSync(`git add "${instructionsPath}"`);
-      console.log('✅ Added instructions file to git staging');
-    }
-    // Also add any extracted files to git
-    const extractedFiles = execSync(
-      `git status --porcelain "${instructionsDir}/*.md"`
-    ).toString();
-    if (extractedFiles.trim()) {
-      execSync(`git add "${instructionsDir}"/*.md`);
-      console.log('✅ Added extracted files to git staging');
-    }
-  } catch (error) {
-    console.warn('⚠️ Could not add files to git:', error.message);
-  }
-}
-/**
- * Optimize content for AI agents by processing sections based on tags
- * while preserving essential documentation guidance and structure
- */
-function optimizeContentForContext(content) {
-  // Split content into sections based on agent:instruct tags
-  const sections = [];
-  const tagRegex =
-    /<!-- agent:instruct: (essential|condense|remove|extract\s+\S+) -->/g;
-  let lastIndex = 0;
-  let matches = [...content.matchAll(tagRegex)];
-  // Process each tagged section
-  for (let i = 0; i < matches.length; i++) {
-    const match = matches[i];
-    // Add untagged content before this tag
-    if (match.index > lastIndex) {
-      sections.push({
-        type: 'untagged',
-        content: content.slice(lastIndex, match.index),
-      });
-    }
-    // Find the end of this section (next tag or end of content)
-    const nextMatch = matches[i + 1];
-    const endIndex = nextMatch ? nextMatch.index : content.length;
-    sections.push({
-      type: match[1],
-      content: content.slice(match.index, endIndex),
-    });
-    lastIndex = endIndex;
-  }
-  // Add any remaining untagged content
-  if (lastIndex < content.length) {
-    sections.push({
-      type: 'untagged',
-      content: content.slice(lastIndex),
-    });
-  }
-  // Process sections based on their tags
-  let processedContent = '';
-  sections.forEach((section) => {
-    switch (section.type) {
-      case 'essential':
-        processedContent += cleanupSection(section.content);
-        break;
-      case 'condense':
-        processedContent += condenseSection(section.content);
-        break;
-      case 'remove':
-        // Skip these sections entirely
-        break;
-      default:
-        if (section.type.startsWith('extract ')) {
-          const filename = section.type.substring(8); // Remove 'extract ' prefix
-          processedContent += processExtractSection(section.content, filename);
-        } else {
-          processedContent += processUntaggedSection(section.content);
-        }
-        break;
-      case 'untagged':
-        processedContent += processUntaggedSection(section.content);
-        break;
-    }
-  });
-  // Final cleanup
-  return cleanupFormatting(processedContent);
-}
-/**
- * Clean up essential sections while preserving all content
- */
-function cleanupSection(content) {
-  // Remove the tag comment itself
-  content = content.replace(/<!-- agent:instruct: essential -->\n?/g, '');
-  // Only basic cleanup for essential sections
-  content = content.replace(/\n{4,}/g, '\n\n\n');
-  return content;
-}
-/**
- * Condense sections to key information
- */
-function condenseSection(content) {
-  // Remove the tag comment
-  content = content.replace(/<!-- agent:instruct: condense -->\n?/g, '');
-  // Extract section header
-  const headerMatch = content.match(/^(#+\s+.+)/m);
-  if (!headerMatch) return content;
-  // Condense very long code examples
-  content = content.replace(/```[\s\S]{300,}?```/g, (match) => {
-    const firstLines = match.split('\n').slice(0, 3).join('\n');
-    return `${firstLines}\n# ... (see full CONTRIBUTING.md for complete example)\n\`\`\``;
-  });
-  // Keep first paragraph and key bullet points
-  const lines = content.split('\n');
-  const processedLines = [];
-  let inCodeBlock = false;
-  let paragraphCount = 0;
-  for (const line of lines) {
-    if (line.startsWith('```')) {
-      inCodeBlock = !inCodeBlock;
-      processedLines.push(line);
-    } else if (inCodeBlock) {
-      processedLines.push(line);
-    } else if (line.startsWith('#')) {
-      processedLines.push(line);
-    } else if (line.trim() === '') {
-      processedLines.push(line);
-    } else if (
-      line.startsWith('- ') ||
-      line.startsWith('* ') ||
-      line.match(/^\d+\./)
-    ) {
-      processedLines.push(line);
-    } else if (paragraphCount < 2 && line.trim() !== '') {
-      processedLines.push(line);
-      if (line.trim() !== '' && !line.startsWith(' ')) {
-        paragraphCount++;
-      }
-    }
-  }
-  return (
-    processedLines.join('\n') +
-    '\n\n_See full DOCS-CONTRIBUTING.md for complete details._\n\n'
-  );
-}
-/**
- * Process extract sections to create separate files and placeholders
- */
-function processExtractSection(content, filename) {
-  // Remove the tag comment
-  content = content.replace(/<!-- agent:instruct: extract \S+ -->\n?/g, '');
-  // Extract section header
-  const headerMatch = content.match(/^(#+\s+.+)/m);
-  if (!headerMatch) return content;
-  const header = headerMatch[1];
-  const sectionTitle = header.replace(/^#+\s+/, '');
-  // Write the section content to a separate file
-  const instructionsDir = path.join(process.cwd(), '.github', 'instructions');
-  const extractedFilePath = path.join(instructionsDir, filename);
-  // Add frontmatter to the extracted file
-  const extractedContent = `---
-applyTo: "content/**/*.md, layouts/**/*.html"
----
-${content}`;
-  fs.writeFileSync(extractedFilePath, extractedContent);
-  console.log(`✅ Extracted ${sectionTitle} to ${extractedFilePath}`);
-  // Create a placeholder that references the extracted file
-  return `${header}\n\n_For the complete ${sectionTitle} reference, see $(unknown)._\n\n`;
-}
-/**
- * Process untagged sections with moderate optimization
- */
-function processUntaggedSection(content) {
-  // Apply moderate processing to untagged sections
-  // Condense very long code examples but keep structure
-  content = content.replace(/```[\s\S]{400,}?```/g, (match) => {
-    const firstLines = match.split('\n').slice(0, 5).join('\n');
-    return `${firstLines}\n# ... (content truncated)\n\`\`\``;
-  });
-  return content;
-}
-/**
- * Clean up formatting issues in the processed content
- */
-function cleanupFormatting(content) {
-  // Fix multiple consecutive newlines
-  content = content.replace(/\n{4,}/g, '\n\n\n');
-  // Remove agent-instructions comments that might remain
-  content = content.replace(/<!-- agent:instruct: \w+ -->\n?/g, '');
-  // Fix broken code blocks
-  content = content.replace(
-    /```(\w+)?\n\n+```/g,
-    '```$1\n# (empty code block)\n```'
-  );
-  // Fix broken markdown headers
-  content = content.replace(/^(#+)\s*$/gm, '');
-  // Fix broken list formatting
-  content = content.replace(/^(-|\*|\d+\.)\s*$/gm, '');
-  // Remove empty sections
-  content = content.replace(/^#+\s+.+\n+(?=^#+\s+)/gm, (match) => {
-    if (match.trim().split('\n').length <= 2) {
-      return '';
-    }
-    return match;
-  });
-  return content;
-}
 /**
  * Build PLATFORM_REFERENCE.md from data/products.yml


@@ -10,12 +10,12 @@ pre-commit:
       run: yarn eslint {staged_files}
       fail_text: "Debug helpers found! Remove debug imports and calls before committing."
     build-agent-instructions:
-      glob: "CONTRIBUTING.md"
+      glob: "data/products.yml"
       run: yarn build:agent:instructions
     # Report linting warnings and errors, don't output files to stdout
     lint-markdown:
       tags: lint
-      glob: "{CONTRIBUTING.md,content/*.md}"
+      glob: "{README.md,DOCS-*.md,api-docs/README.md,content/*.md}"
       run: |
         docker compose run --rm --name remark-lint remark-lint '{staged_files}'
     cloud-lint:
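Read back from the hunk above, the updated lefthook entry now triggers the instructions build only when the platform data file changes:

```yaml
build-agent-instructions:
  glob: "data/products.yml"
  run: yarn build:agent:instructions
```

Commits that touch only CONTRIBUTING.md no longer rebuild agent instructions, which matches the removal of the contributing-instructions build logic in the script diff.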