Beginner

Image Processing

Image processing is the foundation of computer vision. Before a model can analyze an image, the image usually has to be loaded, transformed, filtered, and otherwise prepared.

Image Representation

Digital images are stored as multi-dimensional arrays (matrices). A color image is a 3D array with shape (height, width, channels); a grayscale image is a 2D array with shape (height, width).
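
To make the array layout concrete, here is a minimal sketch that builds a tiny synthetic image with NumPy instead of loading a file. The 4x6 size and the pixel values are made up for illustration.

```python
import numpy as np

# A hypothetical 4x6 color image: 4 rows (height), 6 columns (width), 3 channels.
img = np.zeros((4, 6, 3), dtype=np.uint8)

# Indexing is [row, column], i.e. [y, x], not [x, y].
img[1, 2] = (255, 0, 0)   # read in OpenCV's BGR order, this pixel is pure blue

height, width, channels = img.shape
print(height, width, channels)   # 4 6 3
print(img[1, 2])                 # [255   0   0]
print(img.size)                  # 72 values = 4 * 6 * 3

# A grayscale image has no channel axis and is a plain 2D array.
gray = np.zeros((4, 6), dtype=np.uint8)
print(gray.ndim)                 # 2
```

Note that the shape lists height before width, which is the opposite of the (width, height) order used by many drawing and resizing functions.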

Python - OpenCV Basics
import cv2
import numpy as np

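# Read an image (OpenCV loads color images in BGR channel order)
img = cv2.imread("photo.jpg")
if img is None:  # imread returns None instead of raising when the file cannot be read
    raise FileNotFoundError("Could not read photo.jpg")

print(f"Shape: {img.shape}")        # e.g. (480, 640, 3) -> (height, width, channels)
print(f"Dtype: {img.dtype}")        # uint8
print(f"Size: {img.size} values")   # 921600 = 480 * 640 * 3 (array elements, not pixels)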

Color Spaces

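Different color spaces suit different tasks: RGB for display, grayscale for intensity-based operations such as edge detection, and HSV for color detection, because it separates hue from brightness: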

Python - Color Space Conversion
import cv2
import numpy as np

img = cv2.imread("photo.jpg")

# BGR to RGB (for matplotlib display)
rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# BGR to Grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(f"Grayscale shape: {gray.shape}")  # (480, 640) - single channel

# BGR to HSV (useful for color detection)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# HSV color filtering - detect blue objects
# (for 8-bit images OpenCV stores hue as 0-179, so blue sits around 100-130)
lower_blue = np.array([100, 50, 50])
upper_blue = np.array([130, 255, 255])
mask = cv2.inRange(hsv, lower_blue, upper_blue)
result = cv2.bitwise_and(img, img, mask=mask)
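
The two conversions above that matter most in practice can be sketched in plain NumPy. This is an illustration of the idea, not OpenCV's implementation: the grayscale weights are the standard luma coefficients that OpenCV documents for its BGR-to-gray conversion, and the range test mimics what `cv2.inRange` returns. All pixel values are hypothetical.

```python
import numpy as np

# Hypothetical 1x3 BGR image: one blue, one green, one red pixel.
bgr = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

# Grayscale as a weighted sum of the channels (weights listed in B, G, R order).
weights = np.array([0.114, 0.587, 0.299])
gray = np.clip(np.rint(bgr @ weights), 0, 255).astype(np.uint8)
print(gray)  # [[ 29 150  76]]

# Range mask in the spirit of cv2.inRange: 255 where every channel lies
# inside [lower, upper] (bounds inclusive), 0 elsewhere.
hsv = np.array([[[110, 200, 200], [60, 200, 200], [120, 30, 200]]], dtype=np.uint8)
lower = np.array([100, 50, 50])
upper = np.array([130, 255, 255])
inside = np.all((hsv >= lower) & (hsv <= upper), axis=-1)
mask = inside.astype(np.uint8) * 255
print(mask)  # [[255   0   0]]
```

Only the first HSV pixel passes: the second has the wrong hue, and the third has a blue hue but too little saturation.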

Basic Operations

Python - Resize, Crop, Rotate
import cv2

img = cv2.imread("photo.jpg")

# Resizing
resized = cv2.resize(img, (320, 240))                    # Exact size, given as (width, height)
resized = cv2.resize(img, None, fx=0.5, fy=0.5)          # Scale by 50%

# Cropping (NumPy slicing)
cropped = img[100:300, 200:500]  # rows 100-299, cols 200-499 (end index is exclusive)

# Rotating
h, w = img.shape[:2]
center = (w // 2, h // 2)
matrix = cv2.getRotationMatrix2D(center, angle=45, scale=1.0)
rotated = cv2.warpAffine(img, matrix, (w, h))

# Flipping
flipped_h = cv2.flip(img, 1)   # Horizontal flip
flipped_v = cv2.flip(img, 0)   # Vertical flip
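
Two details of these operations are easy to miss, and a NumPy-only sketch can show both. First, a crop made by slicing is a view that shares memory with the original image. Second, the 2x3 matrix used for rotation keeps the chosen center fixed. The helper `rotation_matrix` below is hypothetical; it follows the formula given in the OpenCV documentation for `getRotationMatrix2D`, but it is a re-derivation for illustration, not the library code.

```python
import numpy as np

# Cropping by slicing returns a view, not a copy.
img = np.zeros((480, 640, 3), dtype=np.uint8)
crop_view = img[100:300, 200:500]          # shares memory with img
crop_copy = img[100:300, 200:500].copy()   # independent buffer
crop_view[:] = 255                         # also paints that region inside img
print(crop_view.shape)                     # (200, 300, 3)
print(img[100, 200], crop_copy[0, 0])      # [255 255 255] [0 0 0]

def rotation_matrix(center, angle_deg, scale=1.0):
    """2x3 affine matrix for a rotation about `center` (hypothetical helper)."""
    cx, cy = center
    theta = np.deg2rad(angle_deg)
    a = scale * np.cos(theta)
    b = scale * np.sin(theta)
    return np.array([[a,  b, (1 - a) * cx - b * cy],
                     [-b, a, b * cx + (1 - a) * cy]])

M = rotation_matrix((320, 240), 45)
# The rotation center is a fixed point of the transform.
mapped_center = M @ np.array([320, 240, 1.0])
print(np.allclose(mapped_center, [320, 240]))  # True
```

If a crop needs to be modified without touching the source image, calling `.copy()` on the slice is the usual safeguard.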

Filtering

Filters modify images by applying a kernel (small matrix) across the image:
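
As a rough sketch of what "applying a kernel across the image" means, the function below slides a small matrix over a 2D array and sums the element-wise products at each position. The name `apply_kernel` and the edge-replicating border are assumptions made for this example; OpenCV's own filters are far faster and use a different default border mode.

```python
import numpy as np

def apply_kernel(image, kernel):
    """Slide an odd-sized `kernel` over a 2D image (cross-correlation)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # Replicate edge pixels so the output has the same size as the input.
    padded = np.pad(image.astype(np.float64), ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros(image.shape, dtype=np.float64)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            window = padded[y:y + kh, x:x + kw]
            out[y, x] = np.sum(window * kernel)
    return out

# A single bright pixel in a dark 5x5 image.
image = np.zeros((5, 5))
image[2, 2] = 9.0

box = np.ones((3, 3)) / 9.0          # 3x3 averaging (box blur) kernel
blurred = apply_kernel(image, box)
print(blurred)                       # the bright pixel is spread over a 3x3 area
```

The total brightness is unchanged because the kernel weights sum to 1; the single value of 9 becomes nine values of about 1.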

Blur (Smoothing)

Python - Blur Filters
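import cv2

img = cv2.imread("photo.jpg")

# Average blur: each pixel becomes the mean of its 5x5 neighborhood
blurred = cv2.blur(img, (5, 5))

# Gaussian blur (most common); sigma 0 lets OpenCV derive it from the kernel size
gaussian = cv2.GaussianBlur(img, (5, 5), 0)

# Median blur (good for salt-and-pepper noise); the aperture size must be odd
median = cv2.medianBlur(img, 5)

# Bilateral filter (smooths while preserving edges)
# Arguments: neighborhood diameter 9, sigmaColor 75, sigmaSpace 75
bilateral = cv2.bilateralFilter(img, 9, 75, 75)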
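
A small numeric example suggests why the median is the better choice for salt-and-pepper noise. The 3x3 neighborhood below is made up: a flat region of value 10 with one corrupted "salt" pixel.

```python
import numpy as np

# A flat 3x3 neighborhood with one "salt" pixel (hypothetical values).
window = np.array([[10, 10, 10],
                   [10, 255, 10],
                   [10, 10, 10]], dtype=np.uint8)

print(window.mean())       # about 37.2: the outlier leaks into the average
print(np.median(window))   # 10.0: the outlier is discarded entirely
```

An averaging or Gaussian filter would leave a faint smudge where the noisy pixel was, while the median restores the surrounding value.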

Edge Detection

Python - Edge Detection
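import cv2

img = cv2.imread("photo.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Sobel edge detection (directional); CV_64F keeps negative gradient values
sobel_x = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)  # x-derivative: responds to vertical edges
sobel_y = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)  # y-derivative: responds to horizontal edges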

# Canny edge detection (most popular)
edges = cv2.Canny(gray, threshold1=50, threshold2=150)

# Laplacian edge detection
laplacian = cv2.Laplacian(gray, cv2.CV_64F)
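
The direction of a Sobel response is a common source of confusion, so here is a hand-computed sketch using the standard 3x3 Sobel kernels on a made-up patch. The derivative along x measures change from left to right, which means it fires on vertical edges.

```python
import numpy as np

# Standard 3x3 Sobel kernels for the x- and y-derivatives.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
sobel_y = sobel_x.T

# Hypothetical 3x3 patch straddling a vertical edge: dark left, bright right.
patch = np.array([[0, 0, 255],
                  [0, 0, 255],
                  [0, 0, 255]], dtype=np.float64)

gx = np.sum(patch * sobel_x)   # x-derivative response at the patch center
gy = np.sum(patch * sobel_y)   # y-derivative response at the patch center
print(gx, gy)                  # 1020.0 0.0

magnitude = np.hypot(gx, gy)   # combined edge strength from both directions
print(magnitude)               # 1020.0
```

The response of 1020 exceeds the 8-bit range, and a bright-to-dark edge would give -1020, which is why a signed floating-point output depth such as `cv2.CV_64F` is commonly used for gradients.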

Sharpening

Python - Sharpening
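import cv2
import numpy as np

img = cv2.imread("photo.jpg")

# Sharpening kernel (weights sum to 1, so overall brightness is preserved)
kernel = np.array([[ 0, -1,  0],
                   [-1,  5, -1],
                   [ 0, -1,  0]])

sharpened = cv2.filter2D(img, -1, kernel)  # -1 keeps the same depth as the input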
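
To see what this kernel does, it helps to apply it by hand to two made-up 3x3 patches. This is a sketch of the arithmetic only; the final clip stands in for the saturation that happens when the result is stored back into an 8-bit image.

```python
import numpy as np

kernel = np.array([[ 0, -1,  0],
                   [-1,  5, -1],
                   [ 0, -1,  0]])

print(kernel.sum())            # 1, so flat regions keep their brightness

flat = np.full((3, 3), 100)    # hypothetical uniform patch
print(np.sum(flat * kernel))   # 100: unchanged

# Hypothetical patch centered on the bright side of a vertical edge.
edge = np.array([[50, 200, 200],
                 [50, 200, 200],
                 [50, 200, 200]])
raw = np.sum(edge * kernel)    # 5*200 - 200 - 200 - 50 - 200
print(raw)                     # 350: the bright side is pushed even brighter
print(int(np.clip(raw, 0, 255)))  # 255 once limited to the uint8 range
```

The contrast across the edge grows while flat areas stay the same, which is the visual effect perceived as sharpening.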

Morphological Operations

Morphological operations process binary or grayscale images based on shape, probing each pixel's neighborhood with a small structuring element:

Python - Morphological Operations
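import cv2
import numpy as np

# Build a binary image (0 or 255) by thresholding a grayscale image
# (127 is an arbitrary example threshold)
gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)
_, binary_img = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# Create a structuring element
kernel = np.ones((5, 5), np.uint8)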

# Erosion - shrinks white regions
eroded = cv2.erode(binary_img, kernel, iterations=1)

# Dilation - expands white regions
dilated = cv2.dilate(binary_img, kernel, iterations=1)

# Opening - erosion followed by dilation (removes noise)
opened = cv2.morphologyEx(binary_img, cv2.MORPH_OPEN, kernel)

# Closing - dilation followed by erosion (fills holes)
closed = cv2.morphologyEx(binary_img, cv2.MORPH_CLOSE, kernel)
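
For intuition, erosion and dilation with a square structuring element can be sketched as a neighborhood minimum and maximum. The `erode` and `dilate` functions below are simplified stand-ins written for this example, not OpenCV's implementation, and the 7x7 test images are made up.

```python
import numpy as np

def erode(img, k=3):
    """Erosion with a k x k square: each pixel becomes its neighborhood minimum."""
    p = k // 2
    padded = np.pad(img, p, mode="constant", constant_values=255)  # border never erodes
    out = np.empty_like(img)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = padded[y:y + k, x:x + k].min()
    return out

def dilate(img, k=3):
    """Dilation with a k x k square: each pixel becomes its neighborhood maximum."""
    p = k // 2
    padded = np.pad(img, p, mode="constant", constant_values=0)    # border never dilates
    out = np.empty_like(img)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = padded[y:y + k, x:x + k].max()
    return out

# 7x7 binary image: a 3x3 white square plus one isolated noise pixel.
noisy = np.zeros((7, 7), dtype=np.uint8)
noisy[2:5, 2:5] = 255
noisy[0, 6] = 255

opened = dilate(erode(noisy))        # opening = erosion, then dilation
print(opened[0, 6], opened[3, 3])    # 0 255: noise removed, square kept

# The same square with a one-pixel hole in the middle.
holed = np.zeros((7, 7), dtype=np.uint8)
holed[2:5, 2:5] = 255
holed[3, 3] = 0

closed = erode(dilate(holed))        # closing = dilation, then erosion
print(closed[3, 3])                  # 255: hole filled
```

In both cases the square ends up at its original size, because the second step undoes the growth or shrinkage caused by the first.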

Key takeaway: Image processing is the foundation upon which all computer vision is built. OpenCV provides a comprehensive toolkit for loading, transforming, filtering, and preparing images. These operations are essential both as standalone tools and as preprocessing steps for deep learning models.