Step 4: Custom Rules & Config
Build a .ai-review.yml configuration system that lets teams define project-specific review rules, file filters, severity thresholds, and custom instructions for the LLM.
Why Configuration Matters
Every team has different standards. A generic code reviewer that applies the same rules everywhere becomes noisy and gets ignored. Configuration solves this by letting teams:
- Choose which files and directories to review (or skip)
- Define custom rules specific to their codebase
- Set minimum severity thresholds (ignore suggestions, only show warnings+)
- Add project-specific context to the LLM prompt
- Choose the LLM provider and model
The .ai-review.yml Config File
Teams place this file in the root of their repository. Here is the full specification:
```yaml
# .ai-review.yml - AI Code Review Configuration

# LLM settings
model:
  provider: openai   # openai | anthropic
  name: gpt-4o       # Model name
  temperature: 0.1   # Lower = more consistent

# File filtering
files:
  # Only review files matching these patterns (glob syntax)
  include:
    - "src/**/*.ts"
    - "src/**/*.js"
    - "lib/**/*.py"
  # Never review files matching these patterns
  exclude:
    - "**/*.test.ts"
    - "**/*.spec.js"
    - "**/migrations/**"
    - "**/generated/**"
    - "**/*.min.js"

# Severity filtering
severity:
  minimum: warning   # critical | warning | suggestion
  # If any critical issues are found, block the PR
  block_on_critical: true

# Maximum number of comments per review
max_comments: 15

# Custom rules - these are injected into the LLM prompt
rules:
  - name: no-console-log
    description: "Do not use console.log in production code. Use the logger service instead."
    severity: warning
    files: "src/**/*.ts"
  - name: require-error-handling
    description: "All async functions must have try/catch or .catch() error handling."
    severity: critical
    files: "src/**/*.ts"
  - name: no-hardcoded-secrets
    description: "Never hardcode API keys, tokens, or passwords. Use environment variables."
    severity: critical
  - name: require-input-validation
    description: "All API endpoint handlers must validate input parameters."
    severity: warning
    files: "src/routes/**"
  - name: no-sql-injection
    description: "Never use string concatenation for SQL queries. Use parameterized queries."
    severity: critical

# Additional context for the LLM (project-specific)
context: |
  This is a Node.js/TypeScript REST API.
  We use Express for routing, Prisma for database access,
  and Zod for input validation.
  Our logging service is imported from '@/lib/logger'.
  All API responses should use the standardized response format
  from '@/lib/response'.
```
Building the Config Loader
Create src/config.js to load and validate the configuration:
```javascript
// src/config.js
const yaml = require('js-yaml');
const { octokit } = require('./github');

// Default configuration when no .ai-review.yml exists
const DEFAULT_CONFIG = {
  model: {
    provider: 'openai',
    name: 'gpt-4o',
    temperature: 0.1,
  },
  files: {
    include: ['**/*'],
    exclude: [
      '**/node_modules/**',
      '**/vendor/**',
      '**/dist/**',
      '**/build/**',
      '**/*.min.*',
      '**/package-lock.json',
      '**/yarn.lock',
    ],
  },
  severity: {
    minimum: 'suggestion',
    block_on_critical: true,
  },
  max_comments: 20,
  rules: [],
  context: '',
};

/**
 * Load the .ai-review.yml config from the repository.
 * Falls back to defaults if the file does not exist.
 */
async function loadConfig(owner, repo, ref) {
  try {
    const response = await octokit.repos.getContent({
      owner,
      repo,
      path: '.ai-review.yml',
      ref,
    });
    // GitHub returns base64-encoded content
    const content = Buffer.from(response.data.content, 'base64').toString();
    const userConfig = yaml.load(content);
    // Deep merge with defaults
    return mergeConfig(DEFAULT_CONFIG, userConfig);
  } catch (error) {
    if (error.status === 404) {
      console.log('No .ai-review.yml found, using defaults');
      return { ...DEFAULT_CONFIG };
    }
    console.error('Error loading config:', error.message);
    return { ...DEFAULT_CONFIG };
  }
}

/**
 * Deep merge user config with defaults.
 * User values override defaults where provided.
 */
function mergeConfig(defaults, userConfig) {
  if (!userConfig) return { ...defaults };
  return {
    model: { ...defaults.model, ...userConfig.model },
    files: {
      include: userConfig.files?.include || defaults.files.include,
      exclude: [
        ...defaults.files.exclude,
        ...(userConfig.files?.exclude || []),
      ],
    },
    severity: { ...defaults.severity, ...userConfig.severity },
    max_comments: userConfig.max_comments ?? defaults.max_comments,
    rules: userConfig.rules || defaults.rules,
    context: userConfig.context || defaults.context,
  };
}

/**
 * Check if a file should be reviewed based on config.
 */
function shouldReviewFile(filename, config) {
  const { include, exclude } = config.files;
  // Check exclude patterns first: exclusions always win
  for (const pattern of exclude) {
    if (matchGlob(filename, pattern)) return false;
  }
  // Then check include patterns
  for (const pattern of include) {
    if (matchGlob(filename, pattern)) return true;
  }
  return false;
}

// Simple glob matcher supporting * and ** patterns.
// A '**/' segment matches zero or more directory levels, so
// 'src/**/*.ts' also matches 'src/app.ts', not just nested files.
function matchGlob(filepath, pattern) {
  // Convert glob to regex
  const regex = pattern
    .replace(/\./g, '\\.')
    .replace(/\*\*\//g, '{{GLOBSTAR_SLASH}}')
    .replace(/\*\*/g, '{{GLOBSTAR}}')
    .replace(/\*/g, '[^/]*')
    .replace(/\{\{GLOBSTAR_SLASH\}\}/g, '(?:.*/)?')
    .replace(/\{\{GLOBSTAR\}\}/g, '.*');
  return new RegExp(`^${regex}$`).test(filepath);
}

/**
 * Check if an issue meets the minimum severity threshold.
 */
function meetsMinimumSeverity(issueSeverity, minimumSeverity) {
  const levels = { suggestion: 0, warning: 1, critical: 2 };
  return (levels[issueSeverity] || 0) >= (levels[minimumSeverity] || 0);
}

module.exports = {
  loadConfig,
  mergeConfig,
  shouldReviewFile,
  matchGlob,
  meetsMinimumSeverity,
  DEFAULT_CONFIG,
};
```
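As a quick sanity check of the glob filtering, the sketch below inlines matchGlob and shouldReviewFile so it runs standalone; it assumes a '**/' segment matches zero or more directory levels (so src/**/*.ts also covers src/app.ts):

```javascript
// Standalone sanity check for the filtering helpers (inlined so this runs on its own).
function matchGlob(filepath, pattern) {
  const regex = pattern
    .replace(/\./g, '\\.')
    .replace(/\*\*\//g, '{{GLOBSTAR_SLASH}}')
    .replace(/\*\*/g, '{{GLOBSTAR}}')
    .replace(/\*/g, '[^/]*')
    .replace(/\{\{GLOBSTAR_SLASH\}\}/g, '(?:.*/)?')
    .replace(/\{\{GLOBSTAR\}\}/g, '.*');
  return new RegExp(`^${regex}$`).test(filepath);
}

function shouldReviewFile(filename, config) {
  const { include, exclude } = config.files;
  for (const pattern of exclude) {
    if (matchGlob(filename, pattern)) return false; // exclusions win
  }
  for (const pattern of include) {
    if (matchGlob(filename, pattern)) return true;
  }
  return false;
}

const config = {
  files: {
    include: ['src/**/*.ts'],
    exclude: ['**/*.test.ts'],
  },
};

console.log(shouldReviewFile('src/app.ts', config));            // true
console.log(shouldReviewFile('src/api/users.ts', config));      // true
console.log(shouldReviewFile('src/api/users.test.ts', config)); // false
console.log(shouldReviewFile('README.md', config));             // false
```

Note the order: exclude patterns are checked before include patterns, so a test file under src/ stays excluded even though it matches the include glob.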
Building the Rule Engine
Custom rules are injected into the LLM prompt as additional instructions. Create src/rules.js:
```javascript
// src/rules.js
const { matchGlob } = require('./config');

/**
 * Build additional prompt instructions from custom rules.
 * Only includes rules that match the current file being reviewed.
 */
function buildRulePrompt(rules, filename) {
  const applicableRules = rules.filter(rule => {
    // If the rule specifies file patterns, check them
    if (rule.files) {
      return matchGlob(filename, rule.files);
    }
    // Rules without file patterns apply to all files
    return true;
  });
  if (applicableRules.length === 0) return '';

  let prompt = '\n\nAdditional project-specific rules to enforce:\n';
  for (const rule of applicableRules) {
    prompt += `\n- [${rule.severity?.toUpperCase() || 'WARNING'}] ${rule.name}: `;
    prompt += `${rule.description}`;
  }
  return prompt;
}

/**
 * Build context prompt from config.
 */
function buildContextPrompt(config) {
  if (!config.context) return '';
  return `\n\nProject context:\n${config.context}`;
}

/**
 * Filter issues based on config settings.
 */
function filterIssues(issues, config) {
  let filtered = issues;

  // Filter by minimum severity
  if (config.severity?.minimum) {
    const levels = { suggestion: 0, warning: 1, critical: 2 };
    const minLevel = levels[config.severity.minimum] || 0;
    filtered = filtered.filter(issue => {
      const issueLevel = levels[issue.severity] || 0;
      return issueLevel >= minLevel;
    });
  }

  // Limit number of comments
  if (config.max_comments && filtered.length > config.max_comments) {
    // Prioritize by severity (critical first, then warning, then suggestion)
    filtered.sort((a, b) => {
      const levels = { critical: 2, warning: 1, suggestion: 0 };
      return (levels[b.severity] || 0) - (levels[a.severity] || 0);
    });
    filtered = filtered.slice(0, config.max_comments);
  }

  return filtered;
}

module.exports = { buildRulePrompt, buildContextPrompt, filterIssues };
```
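To see the filtering behavior in isolation, here is a small sketch with the filterIssues logic inlined so it runs standalone (the sample issues are illustrative):

```javascript
// Standalone demo of the issue-filtering logic from src/rules.js.
function filterIssues(issues, config) {
  const levels = { suggestion: 0, warning: 1, critical: 2 };
  let filtered = issues;
  // Drop issues below the minimum severity
  if (config.severity?.minimum) {
    const minLevel = levels[config.severity.minimum] || 0;
    filtered = filtered.filter(i => (levels[i.severity] || 0) >= minLevel);
  }
  // Cap the comment count, keeping the most severe issues
  if (config.max_comments && filtered.length > config.max_comments) {
    filtered = [...filtered]
      .sort((a, b) => (levels[b.severity] || 0) - (levels[a.severity] || 0))
      .slice(0, config.max_comments);
  }
  return filtered;
}

const issues = [
  { severity: 'suggestion', message: 'Consider renaming this variable' },
  { severity: 'warning', message: 'Missing error handling' },
  { severity: 'critical', message: 'Possible SQL injection' },
];

const result = filterIssues(issues, {
  severity: { minimum: 'warning' },
  max_comments: 1,
});
console.log(result); // only the critical issue survives
```

With minimum 'warning', the suggestion is dropped first; then max_comments: 1 keeps only the highest-severity remaining issue.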
Integrating Config into the Pipeline
Now update src/github.js to use the config system. The key changes are loading the config at the start, using it for file filtering, and passing rules to the analyzer:
```javascript
// Updated handlePullRequest in src/github.js
const { loadConfig, shouldReviewFile } = require('./config');
const { filterIssues } = require('./rules');

async function handlePullRequest({ owner, repo, pullNumber, headSha }) {
  // Load project config
  const config = await loadConfig(owner, repo, headSha);
  console.log(`Config loaded: ${config.rules.length} custom rules`);

  // Fetch and parse the diff
  const diff = await fetchPRDiff(owner, repo, pullNumber);
  const files = parseDiff(diff);

  // Filter using config-based rules
  const reviewableFiles = files.filter(f =>
    f.status !== 'removed' && shouldReviewFile(f.filename, config)
  );
  if (reviewableFiles.length === 0) {
    console.log('No reviewable files after config filtering');
    return;
  }

  // Analyze with the LLM (pass config for rules and context)
  const rawIssues = await analyzeCode(reviewableFiles, config);

  // Filter issues based on severity threshold and max comments
  const issues = filterIssues(rawIssues, config);

  // Post comments
  if (issues.length > 0) {
    await postReviewComments({ owner, repo, pullNumber, headSha, issues });
  }
}
```
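analyzeCode itself comes from the earlier analyzer step and is not shown here. As a rough, hypothetical sketch of the splice point, the per-file prompt assembly might look like the following — buildFilePrompt and the literal strings are illustrative assumptions, not part of the earlier code:

```javascript
// Hypothetical sketch: where the config-derived text joins the per-file prompt.
// ruleText and contextText would come from buildRulePrompt()/buildContextPrompt().
function buildFilePrompt(basePrompt, file, ruleText, contextText) {
  return [
    basePrompt,
    contextText, // project context from config (may be an empty string)
    ruleText,    // custom rules matching this file (may be an empty string)
    `\n\nFile: ${file.filename}\nDiff:\n${file.patch}`,
  ].join('');
}

const prompt = buildFilePrompt(
  'You are a code reviewer. Report issues in the diff as JSON.',
  { filename: 'src/app.ts', patch: '+console.log("boot");' },
  '\n\nAdditional project-specific rules to enforce:\n- [WARNING] no-console-log: Use the logger service instead.',
  '\n\nProject context:\nNode.js/TypeScript REST API.'
);
console.log(prompt.includes('no-console-log')); // true
```

Because both helpers return empty strings when nothing applies, repositories without custom rules or context get exactly the same prompt as before.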
To test it, commit a .ai-review.yml file and open a PR that violates one of your custom rules. Verify that the bot catches the violation and applies the correct severity.

Example Configs for Common Projects
React/TypeScript Project
```yaml
files:
  include:
    - "src/**/*.tsx"
    - "src/**/*.ts"
  exclude:
    - "src/**/*.test.tsx"
    - "src/**/*.stories.tsx"

rules:
  - name: no-any
    description: "Avoid using 'any' type. Use proper TypeScript types."
    severity: warning
  - name: use-memo
    description: "Expensive computations in render should use useMemo."
    severity: suggestion

context: |
  React 18 project with TypeScript strict mode.
  State management with Zustand. Styling with Tailwind CSS.
```
Python FastAPI Project
```yaml
files:
  include:
    - "app/**/*.py"
  exclude:
    - "app/tests/**"
    - "app/migrations/**"

rules:
  - name: type-hints
    description: "All function parameters and return values must have type hints."
    severity: warning
  - name: async-db
    description: "Database operations must use async/await. Never use synchronous DB calls."
    severity: critical

context: |
  Python FastAPI backend with SQLAlchemy async ORM.
  All endpoints use Pydantic models for validation.
```
What's Next
The config system is complete. Teams can now customize every aspect of their AI code review. In the next lesson, we will build Step 5: Deploy as GitHub App — packaging everything as a proper GitHub App with installation flows, production deployment, and monitoring.
Lilly Tech Systems