Beginner
Image Processing
Image processing is the foundation of computer vision. Before any model can analyze an image, the image often needs to be loaded, transformed, filtered, and prepared.
Image Representation
Digital images are stored as multi-dimensional arrays (matrices). A color image is a 3D array with shape (height, width, channels).
Python - OpenCV Basics
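import cv2

# Read an image from disk (OpenCV stores channels in BGR order by default)
img = cv2.imread("photo.jpg")
if img is None:
    # cv2.imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError("Could not read photo.jpg")

print(f"Shape: {img.shape}")       # e.g. (480, 640, 3): height, width, channels
print(f"Dtype: {img.dtype}")       # uint8
print(f"Size: {img.size} values")  # 921600 = 480 * 640 * 3 (array elements, not pixels)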
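The array layout described above can be explored without an image file. The following is a minimal sketch that uses a synthetic NumPy array as a stand-in for a decoded image; the 480x640 size and the pixel coordinates are illustrative assumptions, not values from a real photo.

```python
import numpy as np

# Hypothetical stand-in for a decoded image: 480 rows, 640 columns, 3 channels.
img = np.zeros((480, 640, 3), dtype=np.uint8)

# Indexing is [row, column], i.e. [y, x]. In OpenCV's BGR order this pixel is pure blue.
img[100, 200] = (255, 0, 0)

height, width, channels = img.shape
num_pixels = height * width   # number of pixels
num_values = img.size         # number of array elements (pixels * channels)

blue, green, red = img[100, 200]
print(height, width, channels)
print(num_pixels, num_values)
print(int(blue), int(green), int(red))
```

Note that `img.size` counts every channel value, so it is three times the pixel count for a color image.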
Color Spaces
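Different color spaces suit different tasks: RGB is the channel order that display libraries such as matplotlib expect, grayscale reduces the image to a single intensity channel, and HSV separates hue from saturation and brightness, which makes color-based detection easier.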
Python - Color Space Conversion
import cv2
import numpy as np

img = cv2.imread("photo.jpg")

# BGR to RGB (for matplotlib display)
rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# BGR to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(f"Grayscale shape: {gray.shape}")  # e.g. (480, 640) - single channel

# BGR to HSV (useful for color detection)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# HSV color filtering - detect blue objects
# For 8-bit images OpenCV stores hue in the range 0-179, so blue sits around 100-130
lower_blue = np.array([100, 50, 50])
upper_blue = np.array([130, 255, 255])
mask = cv2.inRange(hsv, lower_blue, upper_blue)  # 255 where the pixel is in range, else 0
result = cv2.bitwise_and(img, img, mask=mask)    # keep only the masked pixels
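To make the conversions above less opaque, here is a NumPy-only sketch of what they compute on a tiny hand-made image. It assumes the standard luma weights (0.299, 0.587, 0.114) for grayscale, which is the formula OpenCV documents for RGB-to-gray conversion; the three-pixel arrays are invented for illustration, and rounding details in a real library may differ slightly.

```python
import numpy as np

# A 1x3 BGR image: one blue, one green, and one red pixel (illustrative data).
bgr = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

# BGR -> RGB is just a reversal of the channel axis.
rgb = bgr[..., ::-1]

# Grayscale as a weighted sum of the channels (standard luma weights).
b, g, r = (bgr[..., i].astype(np.float64) for i in range(3))
gray = np.round(0.299 * r + 0.587 * g + 0.114 * b).astype(np.uint8)

# Range masking in the style of cv2.inRange: 255 where every channel is in bounds.
hsv = np.array([[[110, 200, 200], [60, 200, 200], [120, 30, 200]]], dtype=np.uint8)
lower = np.array([100, 50, 50])
upper = np.array([130, 255, 255])
mask = np.all((hsv >= lower) & (hsv <= upper), axis=-1).astype(np.uint8) * 255

print(gray)
print(mask)
```

Only the first HSV pixel passes the mask: the second has a hue outside the blue range, and the third is too desaturated.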
Basic Operations
Python - Resize, Crop, Rotate
import cv2

img = cv2.imread("photo.jpg")

# Resizing
resized = cv2.resize(img, (320, 240))           # Exact size, given as (width, height)
half = cv2.resize(img, None, fx=0.5, fy=0.5)    # Scale both axes by 50%

# Cropping (NumPy slicing; the end index is exclusive)
cropped = img[100:300, 200:500]  # rows 100-299, cols 200-499

# Rotating around the image center
h, w = img.shape[:2]
center = (w // 2, h // 2)        # (x, y)
matrix = cv2.getRotationMatrix2D(center, angle=45, scale=1.0)  # positive angle = counter-clockwise
rotated = cv2.warpAffine(img, matrix, (w, h))   # corners outside the (w, h) canvas are cut off

# Flipping
flipped_h = cv2.flip(img, 1)  # Horizontal flip (mirror left-right)
flipped_v = cv2.flip(img, 0)  # Vertical flip (mirror top-bottom)
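The 2x3 matrix returned by `cv2.getRotationMatrix2D` can be reproduced by hand. The sketch below follows the formula given in the OpenCV documentation for that function; the helper name `rotation_matrix` and the sample points are assumptions made for illustration.

```python
import numpy as np

def rotation_matrix(center, angle_deg, scale=1.0):
    """Hypothetical helper: 2x3 affine matrix for rotation about `center` (x, y)."""
    cx, cy = center
    a = scale * np.cos(np.radians(angle_deg))
    b = scale * np.sin(np.radians(angle_deg))
    return np.array([
        [a,  b, (1 - a) * cx - b * cy],
        [-b, a, b * cx + (1 - a) * cy],
    ])

center = (320, 240)

# The rotation center must map to itself for any angle.
m45 = rotation_matrix(center, 45)
center_mapped = m45 @ np.array([320.0, 240.0, 1.0])

# A point 10 px to the right of the center, rotated by 90 degrees.
m90 = rotation_matrix(center, 90)
point_mapped = m90 @ np.array([330.0, 240.0, 1.0])

print(center_mapped)
print(point_mapped)
```

With a 90 degree angle the point ends up 10 px above the center (smaller y), which matches the counter-clockwise convention in image coordinates, where the y axis points down.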
Filtering
Filters modify an image by sliding a kernel (a small matrix) across it and computing each output pixel from the neighborhood the kernel covers:
Blur (Smoothing)
Python - Blur Filters
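import cv2

img = cv2.imread("photo.jpg")

# Average blur: each pixel becomes the mean of its 5x5 neighborhood
blurred = cv2.blur(img, (5, 5))

# Gaussian blur (most common); sigma 0 lets OpenCV derive it from the kernel size
gaussian = cv2.GaussianBlur(img, (5, 5), 0)

# Median blur (good for salt-and-pepper noise); the kernel size must be odd
median = cv2.medianBlur(img, 5)

# Bilateral filter (smooths while preserving edges)
# Arguments: neighborhood diameter, sigmaColor, sigmaSpace
bilateral = cv2.bilateralFilter(img, 9, 75, 75)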
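The difference between averaging and median filtering is easiest to see on a single noisy pixel. This is a NumPy-only sketch with a 3x3 window; the helper `neighborhood_stack` and the reflect padding are assumptions chosen for illustration, not a reproduction of OpenCV's internals.

```python
import numpy as np

def neighborhood_stack(img, k=3):
    """Hypothetical helper: stack the k*k shifted views so axis 0 lists each pixel's neighbors."""
    pad = k // 2
    padded = np.pad(img.astype(np.float64), pad, mode="reflect")
    h, w = img.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(k) for dx in range(k)])

# A black 5x5 image with one bright "salt" pixel in the middle.
img = np.zeros((5, 5))
img[2, 2] = 255

windows = neighborhood_stack(img)
mean_blur = windows.mean(axis=0)           # average blur
median_blur = np.median(windows, axis=0)   # median blur

print(np.round(mean_blur, 1))
print(median_blur)
```

The mean filter smears the outlier over its nine neighbors, while the median filter removes it entirely, which is why median blur is the usual choice for salt-and-pepper noise.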
Edge Detection
Python - Edge Detection
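import cv2

img = cv2.imread("photo.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Sobel edge detection (directional)
# CV_64F keeps negative gradient values that uint8 would clip
sobel_x = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)  # x-gradient: responds to vertical edges
sobel_y = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)  # y-gradient: responds to horizontal edges

# Canny edge detection (most popular); the two values are the hysteresis thresholds
edges = cv2.Canny(gray, threshold1=50, threshold2=150)

# Laplacian edge detection (second derivative, not directional)
laplacian = cv2.Laplacian(gray, cv2.CV_64F)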
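Which direction a Sobel filter responds to is a common source of confusion, so a small worked case helps. The sketch below applies the standard 3x3 Sobel kernels by hand to a synthetic image containing one vertical edge; the helper `correlate_valid` is hypothetical and only covers the region where the kernel fits entirely inside the image.

```python
import numpy as np

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])
sobel_y = sobel_x.T

def correlate_valid(img, kernel):
    """Hypothetical helper: slide the kernel over the image without padding."""
    kh, kw = kernel.shape
    out_h = img.shape[0] - kh + 1
    out_w = img.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            out[y, x] = np.sum(img[y:y + kh, x:x + kw] * kernel)
    return out

# Left half dark, right half brighter: a single vertical edge.
img = np.zeros((5, 6))
img[:, 3:] = 10

gx = correlate_valid(img, sobel_x)  # derivative along x
gy = correlate_valid(img, sobel_y)  # derivative along y

print(gx)
print(gy)
```

The x-derivative is non-zero only where the window straddles the vertical edge, and the y-derivative is zero everywhere because the image does not change from row to row.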
Sharpening
Python - Sharpening
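import cv2
import numpy as np

img = cv2.imread("photo.jpg")

# Sharpening kernel: boosts the center pixel and subtracts its four neighbors
kernel = np.array([[ 0, -1,  0],
                   [-1,  5, -1],
                   [ 0, -1,  0]])

# ddepth=-1 keeps the output depth the same as the input
sharpened = cv2.filter2D(img, -1, kernel)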
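One way to read the sharpening kernel above is as the identity kernel minus a Laplacian kernel: the image minus its second derivative. The following sketch checks that decomposition and evaluates the kernel on two hand-made 3x3 patches; the patch values are illustrative assumptions.

```python
import numpy as np

identity = np.array([[0, 0, 0],
                     [0, 1, 0],
                     [0, 0, 0]])
laplacian = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]])
kernel = identity - laplacian  # [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]

# Flat region: the weights sum to 1, so the value is unchanged.
flat_patch = np.full((3, 3), 100)
flat_response = int((flat_patch * kernel).sum())

# Dark pixel sitting next to a brighter column: it is pushed darker.
edge_patch = np.array([[100, 100, 200],
                       [100, 100, 200],
                       [100, 100, 200]])
edge_response = int((edge_patch * kernel).sum())

print(kernel.sum(), flat_response, edge_response)
```

Flat areas pass through untouched while pixels next to an edge are pushed away from their neighbors, which increases local contrast.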
Morphological Operations
Morphological operations process binary or grayscale images based on shapes:
Python - Morphological Operations
import cv2
import numpy as np

# Build a binary image to work on (the threshold of 127 is an illustrative choice)
gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)
_, binary_img = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# Create a structuring element
kernel = np.ones((5, 5), np.uint8)

# Erosion - shrinks white regions
eroded = cv2.erode(binary_img, kernel, iterations=1)

# Dilation - expands white regions
dilated = cv2.dilate(binary_img, kernel, iterations=1)

# Opening - erosion followed by dilation (removes small white noise)
opened = cv2.morphologyEx(binary_img, cv2.MORPH_OPEN, kernel)

# Closing - dilation followed by erosion (fills small dark holes)
closed = cv2.morphologyEx(binary_img, cv2.MORPH_CLOSE, kernel)
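Erosion and dilation with a square structuring element amount to taking the minimum or maximum over each pixel's neighborhood. Here is a NumPy-only sketch with a 3x3 element on synthetic binary images; the helper names and the zero padding are assumptions for illustration (the shapes are kept away from the border so the padding choice does not affect the result).

```python
import numpy as np

def _neighbors(img, k=3):
    """Hypothetical helper: stack the k*k shifted views of a zero-padded image."""
    pad = k // 2
    padded = np.pad(img, pad, mode="constant", constant_values=0)
    h, w = img.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(k) for dx in range(k)])

def erode(img, k=3):
    return _neighbors(img, k).min(axis=0)   # white survives only if all neighbors are white

def dilate(img, k=3):
    return _neighbors(img, k).max(axis=0)   # white spreads to any pixel with a white neighbor

# A 5x5 white square plus one isolated noise pixel.
block = np.zeros((10, 10), dtype=np.uint8)
block[2:7, 2:7] = 255
noisy = block.copy()
noisy[8, 8] = 255

# The same square with a one-pixel hole in the middle.
holed = block.copy()
holed[4, 4] = 0

opened = dilate(erode(noisy))   # opening: erosion then dilation
closed = erode(dilate(holed))   # closing: dilation then erosion

print(np.array_equal(opened, block), np.array_equal(closed, block))
```

Opening deletes the isolated pixel and restores the square to its original size, while closing fills the hole without growing the square.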
Key takeaway: Image processing is the foundation upon which all computer vision is built. OpenCV provides a comprehensive toolkit for loading, transforming, filtering, and preparing images. These operations are essential both as standalone tools and as preprocessing steps for deep learning models.
Lilly Tech Systems