Lab 5: Signature Detection & Cropper

Difficulty: Intermediate · Estimated time: ~3 hours

Objective

  1. Load Grounding DINO Tiny model
  2. Detect signature regions in scanned document images
  3. Crop detected signatures with OpenCV
  4. Output cropped PNGs + bounding box JSON
  5. Serve via FastAPI endpoint

Step 1 — Install Grounding DINO

bash
pip install groundingdino-py
pip install opencv-python pillow fastapi uvicorn python-multipart

Step 2 — Detect signatures

python
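from groundingdino.util.inference import load_image, load_model, predict
import cv2
import json
import torch
from pathlib import Path

# load_model takes the config path first, then the checkpoint path.
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
model = load_model("GroundingDINO_SwinT_OGC.cfg.py", "groundingdino_swint_ogc.pth", device=DEVICE)

def detect_signatures(image_path: str, threshold: float = 0.3):
    # load_image returns the RGB array plus the preprocessed tensor that predict expects.
    image_rgb, image_tensor = load_image(image_path)
    image = cv2.cvtColor(image_rgb, cv2.COLOR_RGB2BGR)  # BGR so cv2.imwrite saves correct colors
    h, w = image.shape[:2]
    boxes, logits, phrases = predict(
        model=model,
        image=image_tensor,
        caption="signature . handwritten signature",
        box_threshold=threshold,
        text_threshold=0.25,
        device=DEVICE,
    )
    results = []
    for box, score, phrase in zip(boxes, logits, phrases):
        # Boxes are normalized (cx, cy, width, height); convert to pixel corners
        # and clamp them to the image bounds.
        cx, cy, bw, bh = box.tolist()
        x1 = max(0, int((cx - bw / 2) * w))
        y1 = max(0, int((cy - bh / 2) * h))
        x2 = min(w, int((cx + bw / 2) * w))
        y2 = min(h, int((cy + bh / 2) * h))
        results.append({
            "bbox": [x1, y1, x2, y2],
            "confidence": float(score),
            "label": phrase,
        })
    return results, image

def crop_signatures(image, detections: list, output_dir: str):
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    crops = []
    for i, det in enumerate(detections):
        x1, y1, x2, y2 = det["bbox"]
        crop = image[y1:y2, x1:x2]
        if crop.size == 0:  # skip degenerate boxes; cv2.imwrite fails on empty arrays
            continue
        path = f"{output_dir}/signature_{i}.png"
        cv2.imwrite(path, crop)
        crops.append(path)
    return crops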
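
The objective and the rubric both ask for bounding box JSON, which the code above does not write to disk. One way to do it is sketched below. The helper name save_detections_json, the "_boxes.json" file suffix, and the payload layout are assumptions for illustration, not part of the lab specification; the only thing taken from the lab is the shape of each detection dict (bbox, confidence, label).

```python
import json
from pathlib import Path

def save_detections_json(detections, image_name, output_dir):
    """Write detections to <output_dir>/<image stem>_boxes.json and return the path.

    Hypothetical helper: the file naming and payload layout are assumptions.
    """
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    payload = {
        "image": image_name,
        "bbox_format": "x1,y1,x2,y2 in pixels",  # matches the "bbox" lists built in Step 2
        "detections": detections,
    }
    out_path = out_dir / f"{Path(image_name).stem}_boxes.json"
    out_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return str(out_path)
```

A typical call would pass the list returned by detect_signatures, for example save_detections_json(detections, "scan_01.png", "output"), so the JSON file lands next to the cropped PNGs.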

Step 3 — FastAPI endpoint

python
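from fastapi import FastAPI, UploadFile, File
import tempfile

# Assumes detect_signatures and crop_signatures from Step 2 are defined in
# (or imported into) this module.
app = FastAPI()

@app.post("/detect-signatures")
async def detect(file: UploadFile = File(...)):
    # Note: on Windows a NamedTemporaryFile cannot be reopened by name while it is still open.
    with tempfile.NamedTemporaryFile(suffix=".png") as tmp:
        tmp.write(await file.read())
        tmp.flush()  # make sure the bytes are on disk before reading the file by name
        detections, image = detect_signatures(tmp.name)
    crops = crop_signatures(image, detections, "output")
    return {"detections": detections, "crops": crops}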
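
To check the endpoint by hand, a small client can post an image as a multipart form upload. The sketch below uses only the standard library. The helper names build_multipart and post_image are hypothetical, and the URL in the usage note assumes the app is served locally on uvicorn's default port 8000. The form field is named "file" to match the endpoint's parameter.

```python
import json
import urllib.request
import uuid

def build_multipart(field_name, filename, data, content_type="image/png"):
    """Return (body, content_type_header) for a single-file multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"

def post_image(url, image_path):
    """POST an image file to the endpoint and return the decoded JSON response."""
    with open(image_path, "rb") as fh:
        body, ctype = build_multipart("file", "scan.png", fh.read())
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": ctype}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, post_image("http://127.0.0.1:8000/detect-signatures", "scan.png") should return the same "detections" and "crops" keys that the endpoint builds.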

Submission

GitHub repo with: detection script, FastAPI app, Dockerfile, sample results

Grading rubric

Criterion | Points
Grounding DINO loads and runs | 20
Signatures detected with >70% recall | 25
Cropped images saved correctly | 20
FastAPI endpoint works | 20
Bounding box JSON output | 15
Total | 100
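
The rubric does not say how recall is measured. A common approach is to count a ground-truth signature as found when some predicted box overlaps it with a high enough intersection over union (IoU). The sketch below assumes you have hand-labeled ground-truth boxes in the same [x1, y1, x2, y2] pixel format; the 0.5 IoU threshold and the greedy one-to-one matching are assumptions, not requirements stated in the lab.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def recall(ground_truth, predicted, iou_threshold=0.5):
    """Fraction of ground-truth boxes matched by a distinct predicted box.

    Greedy matching: each ground-truth box takes the best remaining prediction.
    The 0.5 threshold is an assumed default, not a value from the rubric.
    """
    if not ground_truth:
        raise ValueError("recall is undefined without ground-truth boxes")
    unused = list(predicted)
    matched = 0
    for gt in ground_truth:
        best = max(unused, key=lambda p: iou(gt, p), default=None)
        if best is not None and iou(gt, best) >= iou_threshold:
            matched += 1
            unused.remove(best)  # each prediction can match only one ground-truth box
    return matched / len(ground_truth)
```

Under these assumptions, the >70% criterion corresponds to recall(...) > 0.7 when the boxes from all test images are evaluated together.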