Intermediate

ControlNet

Take precise control of image generation using ControlNet conditions, img2img, inpainting, and outpainting techniques.

What is ControlNet?

ControlNet adds spatial conditioning to Stable Diffusion. Instead of relying solely on a text prompt, you also provide a conditioning image (an edge map, depth map, pose skeleton, etc.), usually extracted from a reference image, that guides the composition and structure of the generated image.

ControlNet Types

Canny Edge Detection

Extracts edges from a reference image and generates a new image following those edges. Best for maintaining the outline and structure of objects.
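
As a rough illustration of this workflow, the sketch below extracts an edge map with OpenCV and passes it to a ControlNet pipeline in diffusers. The checkpoint name `lllyasviel/sd-controlnet-canny`, the threshold values, the 512x512 size, and the function name `generate_from_edges` are assumptions chosen for the example, not something this lesson specifies. Heavy imports are deferred into the function, so defining it needs neither a GPU nor a model download.

```python
def generate_from_edges(prompt, reference_path, low_threshold=100, high_threshold=200):
    """Sketch: Canny edge map -> ControlNet-guided generation (assumed setup)."""
    # Sanity check chosen for this example, done before loading anything heavy.
    if not 0 <= low_threshold < high_threshold:
        raise ValueError("expected 0 <= low_threshold < high_threshold")

    # Deferred imports: require opencv-python, numpy, Pillow and diffusers.
    import cv2
    import numpy as np
    from PIL import Image
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    # 1. Turn the reference image into a white-on-black edge map.
    reference = np.array(Image.open(reference_path).convert("RGB").resize((512, 512)))
    edges = cv2.Canny(reference, low_threshold, high_threshold)
    edge_image = Image.fromarray(np.stack([edges] * 3, axis=-1))  # 3 channels

    # 2. Load the ControlNet (assumed checkpoint) alongside the base model.
    controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny")
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet
    )

    # 3. The prompt sets content and style; the edge map fixes the composition.
    return pipe(prompt=prompt, image=edge_image).images[0]
```

Lower thresholds keep more fine edges and constrain the result more tightly; higher thresholds keep only strong outlines and leave the model more freedom.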

Depth Map

Uses an estimated depth map to preserve the 3D spatial layout. Objects in the foreground and background keep their relative positions. Well suited to architectural scenes and landscapes.

OpenPose

Detects the human body pose (a skeleton of keypoints) in a reference image. The generated figure will match that pose. Essential for character art and action scenes.

Other ControlNet Models

  • Scribble: Turn rough sketches into polished images
  • Segmentation: Use color-coded regions to define areas (sky, ground, building)
  • Normal Map: Preserve surface detail and lighting angles
  • Lineart: Convert line drawings to full illustrations
  • SoftEdge: A softer alternative to Canny edges that leaves more artistic freedom

Img2Img

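Img2img starts from an existing image instead of pure random noise: the image is partially noised and then denoised toward the prompt. The denoising strength controls how much noise is added, and therefore how much of the original image is preserved: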

Python (diffusers)
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

# Load the img2img pipeline (weights are downloaded on first use)
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"
)

# Convert to RGB so PNGs with an alpha channel or palette also work
init_image = Image.open("sketch.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="beautiful oil painting of a mountain landscape",
    image=init_image,
    strength=0.75,   # 0.0 = keep original, 1.0 = ignore original
    guidance_scale=7.5,  # how strongly the prompt steers the result
).images[0]
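
To make the effect of strength more concrete: as far as I understand the diffusers img2img pipelines, strength also decides how many of the scheduled denoising steps actually run. The helper below mirrors that calculation in plain Python. The function name is hypothetical and the 50-step default is an assumption.

```python
def effective_img2img_steps(strength, num_inference_steps=50):
    """Approximate how many denoising steps an img2img run performs."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    # The input image is noised part-way along the schedule, and only the
    # remaining portion of the schedule is denoised.
    return min(int(num_inference_steps * strength), num_inference_steps)


for s in (0.25, 0.75, 1.0):
    print(f"strength={s}: {effective_img2img_steps(s)} of 50 steps")
```

A low strength therefore both preserves more of the input and runs fewer steps, which is why small values finish faster.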

Inpainting

Inpainting replaces specific regions of an image while keeping the rest untouched. You provide a mask indicating which areas to regenerate:

Python (diffusers)
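from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting"
)

# Placeholder file names; the image and its mask must have the same size
original_image = Image.open("room.png").convert("RGB").resize((512, 512))
mask_image = Image.open("room_mask.png").convert("L").resize((512, 512))

result = pipe(
    prompt="a fluffy orange cat sitting on the chair",
    image=original_image,
    mask_image=mask_image,    # White = replace, Black = keep
).images[0]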

Outpainting

Outpainting extends an image beyond its original borders. Place the original image on a larger canvas, mask the empty areas, and let the model fill them in while maintaining visual coherence.
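
The canvas-and-mask preparation described above can be sketched as follows. This is a minimal illustration, not a fixed recipe: the function names, the gray filler color, and the padding parameters are all assumptions. The layout helper is plain Python, and Pillow is imported only when the images are actually built.

```python
def outpaint_layout(width, height, left=0, top=0, right=0, bottom=0):
    """Return (canvas_size, paste_offset, keep_box) for an outpainting canvas."""
    if min(left, top, right, bottom) < 0:
        raise ValueError("padding must be non-negative")
    canvas_size = (width + left + right, height + top + bottom)
    paste_offset = (left, top)
    keep_box = (left, top, left + width, top + height)  # region to leave untouched
    return canvas_size, paste_offset, keep_box


def build_outpaint_inputs(image, **padding):
    """Build the enlarged canvas and its mask (white = fill in, black = keep)."""
    from PIL import Image  # deferred so the layout helper has no dependencies

    canvas_size, paste_offset, keep_box = outpaint_layout(*image.size, **padding)
    canvas = Image.new("RGB", canvas_size, "gray")  # neutral filler for new areas
    canvas.paste(image.convert("RGB"), paste_offset)
    mask = Image.new("L", canvas_size, 255)         # start fully white (regenerate)
    mask.paste(0, keep_box)                         # black out the original region
    return canvas, mask
```

The resulting `canvas` and `mask` would then be passed as `image` and `mask_image` to an inpainting pipeline. Stable Diffusion expects dimensions divisible by 8, and you may need to pass the canvas width and height to the pipeline explicitly so the enlarged image is not resized back to the default resolution.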

💡 Tip: You can combine multiple ControlNet models in a single generation. For example, use OpenPose for body position and Canny for structural details.
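
One way this combination might look in diffusers is sketched below: the pipeline accepts a list of ControlNet models together with one conditioning image per model. The checkpoint names, the default conditioning scales, and the function name are assumptions made for illustration, and the pose and edge images are expected to be prepared beforehand.

```python
def generate_with_pose_and_edges(prompt, pose_image, edge_image, scales=(1.0, 0.6)):
    """Sketch: condition one generation on an OpenPose map and a Canny map."""
    if len(scales) != 2:
        raise ValueError("expected one conditioning scale per ControlNet (2)")

    # Deferred import: requires diffusers and downloads weights on first use.
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    # Assumed checkpoints; the order must match the order of the images below.
    controlnets = [
        ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-openpose"),
        ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny"),
    ]
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnets
    )

    return pipe(
        prompt=prompt,
        image=[pose_image, edge_image],              # one image per ControlNet
        controlnet_conditioning_scale=list(scales),  # relative weight of each
    ).images[0]
```

Lowering one of the scales lets the other condition dominate, which can help when the pose and the edge map disagree.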

What's Next?

The next lesson covers fine-tuning — how to train Stable Diffusion on your own images using DreamBooth, LoRA, and textual inversion.