Advanced

Step 4: Custom Rules & Config

Build a .ai-review.yml configuration system that lets teams define project-specific review rules, file filters, severity thresholds, and custom instructions for the LLM.

Why Configuration Matters

Every team has different standards. A generic code reviewer that applies the same rules everywhere becomes noisy and gets ignored. Configuration solves this by letting teams:

  • Choose which files and directories to review (or skip)
  • Define custom rules specific to their codebase
  • Set minimum severity thresholds (ignore suggestions, only show warnings+)
  • Add project-specific context to the LLM prompt
  • Choose the LLM provider and model

The .ai-review.yml Config File

Teams place this file in the root of their repository. Here is the full specification:

# .ai-review.yml - AI Code Review Configuration

# LLM settings
model:
  provider: openai          # openai | anthropic
  name: gpt-4o              # Model name
  temperature: 0.1          # Lower = more consistent

# File filtering
files:
  # Only review files matching these patterns (glob syntax)
  include:
    - "src/**/*.ts"
    - "src/**/*.js"
    - "lib/**/*.py"
  # Never review files matching these patterns
  exclude:
    - "**/*.test.ts"
    - "**/*.spec.js"
    - "**/migrations/**"
    - "**/generated/**"
    - "**/*.min.js"

# Severity filtering
severity:
  minimum: warning           # critical | warning | suggestion
  # If any critical issues found, block the PR
  block_on_critical: true

# Maximum number of comments per review
max_comments: 15

# Custom rules - these are injected into the LLM prompt
rules:
  - name: no-console-log
    description: "Do not use console.log in production code. Use the logger service instead."
    severity: warning
    files: "src/**/*.ts"

  - name: require-error-handling
    description: "All async functions must have try/catch or .catch() error handling."
    severity: critical
    files: "src/**/*.ts"

  - name: no-hardcoded-secrets
    description: "Never hardcode API keys, tokens, or passwords. Use environment variables."
    severity: critical

  - name: require-input-validation
    description: "All API endpoint handlers must validate input parameters."
    severity: warning
    files: "src/routes/**"

  - name: no-sql-injection
    description: "Never use string concatenation for SQL queries. Use parameterized queries."
    severity: critical

# Additional context for the LLM (project-specific)
context: |
  This is a Node.js/TypeScript REST API.
  We use Express for routing, Prisma for database access,
  and Zod for input validation.
  Our logging service is imported from '@/lib/logger'.
  All API responses should use the standardized response format
  from '@/lib/response'.

Building the Config Loader

Create src/config.js to load the configuration and merge it with sensible defaults:

// src/config.js
const yaml = require('js-yaml');

// Default configuration when no .ai-review.yml exists
const DEFAULT_CONFIG = {
  model: {
    provider: 'openai',
    name: 'gpt-4o',
    temperature: 0.1,
  },
  files: {
    include: ['**/*'],
    exclude: [
      '**/node_modules/**',
      '**/vendor/**',
      '**/dist/**',
      '**/build/**',
      '**/*.min.*',
      '**/package-lock.json',
      '**/yarn.lock',
    ],
  },
  severity: {
    minimum: 'suggestion',
    block_on_critical: true,
  },
  max_comments: 20,
  rules: [],
  context: '',
};

/**
 * Load the .ai-review.yml config from the repository.
 * Falls back to defaults if the file does not exist.
 */
async function loadConfig(owner, repo, ref) {
  // Require lazily: github.js requires this module back, and a top-level
  // require here would create a circular dependency that leaves octokit
  // undefined when this module loads first.
  const { octokit } = require('./github');
  try {
    const response = await octokit.repos.getContent({
      owner,
      repo,
      path: '.ai-review.yml',
      ref,
    });

    // GitHub returns base64-encoded content
    const content = Buffer.from(response.data.content, 'base64').toString();
    const userConfig = yaml.load(content);

    // Deep merge with defaults
    return mergeConfig(DEFAULT_CONFIG, userConfig);
  } catch (error) {
    if (error.status === 404) {
      console.log('No .ai-review.yml found, using defaults');
      return { ...DEFAULT_CONFIG };
    }
    console.error('Error loading config:', error.message);
    return { ...DEFAULT_CONFIG };
  }
}

/**
 * Deep merge user config with defaults.
 * User values override defaults where provided.
 */
function mergeConfig(defaults, userConfig) {
  if (!userConfig) return { ...defaults };

  return {
    model: { ...defaults.model, ...userConfig.model },
    files: {
      include: userConfig.files?.include || defaults.files.include,
      exclude: [
        ...defaults.files.exclude,
        ...(userConfig.files?.exclude || []),
      ],
    },
    severity: { ...defaults.severity, ...userConfig.severity },
    max_comments: userConfig.max_comments ?? defaults.max_comments,
    rules: userConfig.rules || defaults.rules,
    context: userConfig.context || defaults.context,
  };
}

/**
 * Check if a file should be reviewed based on config.
 */
function shouldReviewFile(filename, config) {
  const { include, exclude } = config.files;

  // Check exclude patterns first
  for (const pattern of exclude) {
    if (matchGlob(filename, pattern)) return false;
  }

  // Check include patterns
  for (const pattern of include) {
    if (matchGlob(filename, pattern)) return true;
  }

  return false;
}

/**
 * Simple glob matcher supporting * and ** patterns.
 * A globstar before a slash matches zero or more directories, so the
 * exclude patterns also match files at the repository root.
 */
function matchGlob(filepath, pattern) {
  // Convert glob to regex: escape regex metacharacters first, then
  // translate glob tokens via placeholders so the replacements do not
  // interfere with each other
  const regex = pattern
    .replace(/[.+^${}()|[\]\\]/g, '\\$&')
    .replace(/\*\*\//g, '{{GLOBSTARSLASH}}')
    .replace(/\*\*/g, '{{GLOBSTAR}}')
    .replace(/\*/g, '[^/]*')
    .replace(/\{\{GLOBSTARSLASH\}\}/g, '(?:.*/)?')
    .replace(/\{\{GLOBSTAR\}\}/g, '.*');

  return new RegExp(`^${regex}$`).test(filepath);
}

/**
 * Check if an issue meets the minimum severity threshold.
 */
function meetsMinimumSeverity(issueSeverity, minimumSeverity) {
  const levels = { suggestion: 0, warning: 1, critical: 2 };
  return (levels[issueSeverity] || 0) >= (levels[minimumSeverity] || 0);
}

module.exports = {
  loadConfig,
  mergeConfig,
  shouldReviewFile,
  matchGlob,
  meetsMinimumSeverity,
  DEFAULT_CONFIG,
};

Building the Rule Engine

Custom rules are injected into the LLM prompt as additional instructions. Create src/rules.js:

// src/rules.js
const { matchGlob } = require('./config');

/**
 * Build additional prompt instructions from custom rules.
 * Only includes rules that match the current file being reviewed.
 */
function buildRulePrompt(rules, filename) {
  const applicableRules = rules.filter(rule => {
    // If the rule specifies file patterns, check them
    if (rule.files) {
      return matchGlob(filename, rule.files);
    }
    // Rules without file patterns apply to all files
    return true;
  });

  if (applicableRules.length === 0) return '';

  let prompt = '\n\nAdditional project-specific rules to enforce:\n';

  for (const rule of applicableRules) {
    prompt += `\n- [${rule.severity?.toUpperCase() || 'WARNING'}] ${rule.name}: ${rule.description}`;
  }

  return prompt;
}

/**
 * Build context prompt from config.
 */
function buildContextPrompt(config) {
  if (!config.context) return '';

  return `\n\nProject context:\n${config.context}`;
}

/**
 * Filter issues based on config settings.
 */
function filterIssues(issues, config) {
  let filtered = issues;

  // Filter by minimum severity
  if (config.severity?.minimum) {
    const levels = { suggestion: 0, warning: 1, critical: 2 };
    const minLevel = levels[config.severity.minimum] || 0;

    filtered = filtered.filter(issue => {
      const issueLevel = levels[issue.severity] || 0;
      return issueLevel >= minLevel;
    });
  }

  // Limit number of comments
  if (config.max_comments && filtered.length > config.max_comments) {
    // Prioritize by severity (critical first, then warning, then suggestion)
    filtered.sort((a, b) => {
      const levels = { critical: 2, warning: 1, suggestion: 0 };
      return (levels[b.severity] || 0) - (levels[a.severity] || 0);
    });
    filtered = filtered.slice(0, config.max_comments);
  }

  return filtered;
}

module.exports = { buildRulePrompt, buildContextPrompt, filterIssues };

Integrating Config into the Pipeline

Now update src/github.js to use the config system. The key changes are loading the config at the start, using it for file filtering, and passing rules to the analyzer:

// Updated handlePullRequest in src/github.js
const { loadConfig, shouldReviewFile } = require('./config');
const { filterIssues } = require('./rules');

async function handlePullRequest({ owner, repo, pullNumber, headSha }) {
  // Load project config
  const config = await loadConfig(owner, repo, headSha);
  console.log(`Config loaded: ${config.rules.length} custom rules`);

  // Fetch and parse the diff
  const diff = await fetchPRDiff(owner, repo, pullNumber);
  const files = parseDiff(diff);

  // Filter using config-based rules
  const reviewableFiles = files.filter(f =>
    f.status !== 'removed' && shouldReviewFile(f.filename, config)
  );

  if (reviewableFiles.length === 0) {
    console.log('No reviewable files after config filtering');
    return;
  }

  // Analyze with LLM (pass config for rules and context)
  const rawIssues = await analyzeCode(reviewableFiles, config);

  // Filter issues based on severity threshold and max comments
  const issues = filterIssues(rawIssues, config);

  // Post comments
  if (issues.length > 0) {
    await postReviewComments({ owner, repo, pullNumber, headSha, issues });
  }
}
💡 Testing configs: Create a test repository with a .ai-review.yml file and open a PR that violates one of your custom rules. Verify that the bot catches the violation and applies the correct severity.

Example Configs for Common Projects

React/TypeScript Project

files:
  include:
    - "src/**/*.tsx"
    - "src/**/*.ts"
  exclude:
    - "src/**/*.test.tsx"
    - "src/**/*.stories.tsx"
rules:
  - name: no-any
    description: "Avoid using 'any' type. Use proper TypeScript types."
    severity: warning
  - name: use-memo
    description: "Expensive computations in render should use useMemo."
    severity: suggestion
context: |
  React 18 project with TypeScript strict mode.
  State management with Zustand. Styling with Tailwind CSS.

Python FastAPI Project

files:
  include:
    - "app/**/*.py"
  exclude:
    - "app/tests/**"
    - "app/migrations/**"
rules:
  - name: type-hints
    description: "All function parameters and return values must have type hints."
    severity: warning
  - name: async-db
    description: "Database operations must use async/await. Never use synchronous DB calls."
    severity: critical
context: |
  Python FastAPI backend with SQLAlchemy async ORM.
  All endpoints use Pydantic models for validation.

What's Next

The config system is complete. Teams can now customize every aspect of their AI code review. In the next lesson, we will build Step 5: Deploy as GitHub App, packaging everything as a proper GitHub App with installation flows, production deployment, and monitoring.