ControlNet
Take precise control of image generation using ControlNet conditions, img2img, inpainting, and outpainting techniques.
What is ControlNet?
ControlNet adds spatial conditioning to Stable Diffusion. Instead of relying solely on text prompts, you provide a reference image (edges, depth map, pose, etc.) that guides the composition and structure of the generated image.
ControlNet Types
Canny Edge Detection
Extracts edges from a reference image and generates a new image following those edges. Best for maintaining the outline and structure of objects.
Depth Map
Uses an estimated depth map to preserve the 3D spatial layout, so objects in the foreground and background keep their relative positions. Well suited to architectural scenes and landscapes.
OpenPose
Detects the human body pose (a keypoint skeleton) in a reference image so that the generated figure adopts the same pose. Especially useful for character art and action scenes.
Other ControlNet Models
- Scribble: Turn rough sketches into polished images
- Segmentation: Use color-coded regions to define areas (sky, ground, building)
- Normal Map: Preserve surface detail and lighting angles
- Lineart: Convert line drawings to full illustrations
- SoftEdge: Softer edge maps than Canny, leaving more artistic freedom
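To make the condition types above concrete, here is a minimal sketch of running one of them through the diffusers ControlNet pipeline. The checkpoint IDs in the lookup table are assumptions about what is published on the Hugging Face Hub for Stable Diffusion 1.5 and cover only some of the types listed; verify them before use. The helper names (CONTROLNET_CHECKPOINTS, snap_to_multiple_of_8, generate_with_controlnet) are hypothetical, and the control image is assumed to be already preprocessed (for example, an edge map for the Canny model).

```python
# Assumed Hub checkpoint IDs for Stable Diffusion 1.5 ControlNets; verify before use.
CONTROLNET_CHECKPOINTS = {
    "canny": "lllyasviel/sd-controlnet-canny",
    "depth": "lllyasviel/sd-controlnet-depth",
    "openpose": "lllyasviel/sd-controlnet-openpose",
    "scribble": "lllyasviel/sd-controlnet-scribble",
    "segmentation": "lllyasviel/sd-controlnet-seg",
    "normal": "lllyasviel/sd-controlnet-normal",
}


def snap_to_multiple_of_8(width, height):
    """Round dimensions down to multiples of 8, which the SD pipelines expect."""
    return (width // 8) * 8, (height // 8) * 8


def generate_with_controlnet(prompt, control_image, control_type="canny",
                             conditioning_scale=1.0):
    """Generate an image whose structure follows `control_image` (a PIL image).

    `control_image` must already be the preprocessed condition
    (edge map, depth map, pose skeleton, ...), not the raw photo.
    """
    # Heavy imports are kept local so the helpers above work without a GPU stack.
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    controlnet = ControlNetModel.from_pretrained(CONTROLNET_CHECKPOINTS[control_type])
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet
    )
    pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

    size = snap_to_multiple_of_8(*control_image.size)
    control_image = control_image.convert("RGB").resize(size)

    return pipe(
        prompt=prompt,
        image=control_image,
        # Lower values let the prompt override the condition more freely.
        controlnet_conditioning_scale=conditioning_scale,
    ).images[0]
```

Swapping the condition type is then a matter of passing a different control_type together with a control image produced by the matching preprocessor.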
Img2Img
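Img2img takes an existing image as a starting point instead of random noise. The strength parameter sets how much noise is added to that image before denoising, and therefore how much of the original survives: values near 0.0 keep it almost unchanged, while values near 1.0 mostly ignore it: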
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

# Load the Stable Diffusion 1.5 img2img pipeline
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"
)

# Convert to RGB so an input with an alpha channel does not break the pipeline
init_image = Image.open("sketch.png").convert("RGB").resize((512, 512))

result = pipe(
    prompt="beautiful oil painting of a mountain landscape",
    image=init_image,
    strength=0.75,  # 0.0 = keep original, 1.0 = ignore original
    guidance_scale=7.5,
).images[0]
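One practical consequence of strength is that it also changes how many denoising steps actually run. The sketch below reproduces that arithmetic in plain Python. It reflects my understanding of how the diffusers img2img pipeline derives its schedule and should be treated as an assumption rather than a specification; the function name img2img_schedule is hypothetical.

```python
def img2img_schedule(num_inference_steps, strength):
    """Return (start_index, steps_run) for a given strength.

    Assumed behavior: the pipeline skips the first part of the noise
    schedule and only runs roughly `num_inference_steps * strength` steps.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    steps_run = min(int(num_inference_steps * strength), num_inference_steps)
    start_index = max(num_inference_steps - steps_run, 0)
    return start_index, steps_run


for strength in (0.25, 0.5, 0.75, 1.0):
    start, run = img2img_schedule(50, strength)
    print(f"strength={strength}: skip {start} steps, run {run} steps")
```

Under this assumption, a low strength is also cheaper to run, because most of the schedule is skipped.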
Inpainting
Inpainting replaces specific regions of an image while keeping the rest untouched. You provide a mask indicating which areas to regenerate:
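from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting"
)

# Placeholder file names; the image and the mask must have the same size
original_image = Image.open("original.png").convert("RGB").resize((512, 512))
mask_image = Image.open("mask.png").convert("L").resize((512, 512))

result = pipe(
    prompt="a fluffy orange cat sitting on the chair",
    image=original_image,
    mask_image=mask_image,  # White = replace, Black = keep
).images[0]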
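If no mask file exists yet, one can be drawn programmatically. The following is a minimal sketch using Pillow that builds a rectangular mask following the white = replace, black = keep convention above. The function name make_box_mask and the box coordinates are illustrative assumptions.

```python
from PIL import Image, ImageDraw


def make_box_mask(size, box):
    """Build a grayscale mask: black keeps pixels, white marks the box to regenerate."""
    mask = Image.new("L", size, 0)                 # start fully black (keep everything)
    ImageDraw.Draw(mask).rectangle(box, fill=255)  # paint the region to replace in white
    return mask


# Hypothetical region covering the chair in a 512x512 image
mask_image = make_box_mask((512, 512), (128, 192, 384, 448))
```

A soft-edged mask (for example, one blurred slightly) can blend the regenerated region more smoothly, at the cost of touching a few pixels outside the box.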
Outpainting
Outpainting extends an image beyond its original borders. Place the original image on a larger canvas, mask the empty areas, and let the model fill them in while maintaining visual coherence.
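The canvas-and-mask preparation described above can be sketched with Pillow as follows. The function name prepare_outpaint, the padding values, and the gray fill color are assumptions made for illustration; the resulting canvas and mask would then be passed to an inpainting pipeline as its image and mask.

```python
from PIL import Image


def prepare_outpaint(image, pad_left=0, pad_top=0, pad_right=0, pad_bottom=0,
                     fill=(127, 127, 127)):
    """Place `image` on a larger canvas and build the matching inpainting mask."""
    width, height = image.size
    new_size = (width + pad_left + pad_right, height + pad_top + pad_bottom)

    # Larger canvas with the original pasted at its offset
    canvas = Image.new("RGB", new_size, fill)
    canvas.paste(image.convert("RGB"), (pad_left, pad_top))

    # White everywhere (regenerate), black over the original (keep)
    mask = Image.new("L", new_size, 255)
    mask.paste(0, (pad_left, pad_top, pad_left + width, pad_top + height))
    return canvas, mask


# Stand-in for a real photo: extend a 384x512 image to 512x512
original = Image.new("RGB", (384, 512), (200, 120, 40))
canvas, mask = prepare_outpaint(original, pad_left=64, pad_right=64)
```

Extending in moderate increments and repeating tends to give the model more context per step than one very large extension, though the best padding size depends on the image.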
What's Next?
The next lesson covers fine-tuning: how to train Stable Diffusion on your own images using DreamBooth, LoRA, and textual inversion.