Image Processing Pipeline
When scraping images, you often need to pre-process them before analysis or storage.
OpenCV Basics
OpenCV is a widely used open-source library for computer vision. Its Python bindings are imported as cv2.
python
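import cv2
# Read image; cv2.imread returns None (instead of raising) if the file cannot be read
img = cv2.imread('screenshot.png')
if img is None:
    raise FileNotFoundError('screenshot.png could not be read')
# Convert to grayscale (OpenCV loads images in BGR channel order)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Binarize first: findContours treats every non-zero pixel as foreground
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# Find contours (e.g., to isolate a document)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)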
Bounding Boxes & Cropping
Use an object detection model (such as YOLO or Grounding DINO) to detect objects, then use OpenCV to crop each bounding box and save it as an individual file.