Lab 5: Signature Detection & Cropper
Difficulty: Intermediate · Estimated time: ~3 hours
Objective
- Load Grounding DINO Tiny model
- Detect signature regions in scanned document images
- Crop detected signatures with OpenCV
- Output cropped PNGs + bounding box JSON
- Serve via FastAPI endpoint
Step 1 — Install Grounding DINO
bash
pip install groundingdino-py
pip install opencv-python pillow fastapi uvicorn python-multipart
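After installing, it can help to confirm that every package is importable before moving on. The following is a minimal sketch, not part of the lab's required code: the helper name `missing_modules` and the `REQUIRED_MODULES` list are illustrative assumptions, and it only checks that the modules can be found, not that the model weights are present.

```python
from importlib.util import find_spec

# Import names, which can differ from the pip package names
# (opencv-python -> cv2, pillow -> PIL).
REQUIRED_MODULES = ["groundingdino", "cv2", "PIL", "fastapi", "uvicorn"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

If anything is listed as missing, rerun the corresponding `pip install` line in the same environment you will use to start the server.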
Step 2 — Detect signatures
python
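from pathlib import Path

import cv2
from groundingdino.util.inference import load_image, load_model, predict

# load_model takes the config path first, then the checkpoint path.
# Both files must already be in the working directory.
# Pass device="cpu" here and to predict() if no CUDA GPU is available.
model = load_model(
    model_config_path="GroundingDINO_SwinT_OGC.cfg.py",
    model_checkpoint_path="groundingdino_swint_ogc.pth",
)


def detect_signatures(image_path: str, threshold: float = 0.3):
    """Return (detections, BGR image) for one scanned page."""
    # load_image returns the RGB source array plus the normalized tensor
    # that predict() expects; a raw cv2.imread array will not work.
    image_source, image_tensor = load_image(image_path)
    boxes, logits, phrases = predict(
        model=model,
        image=image_tensor,
        caption="signature . handwritten signature",
        box_threshold=threshold,
        text_threshold=0.25,
    )
    h, w = image_source.shape[:2]
    results = []
    for box, score, phrase in zip(boxes, logits, phrases):
        # Boxes are normalized (cx, cy, width, height); convert to pixel
        # corners and clamp them to the image bounds.
        cx, cy, bw, bh = box.tolist()
        x1 = max(0, int((cx - bw / 2) * w))
        y1 = max(0, int((cy - bh / 2) * h))
        x2 = min(w, int((cx + bw / 2) * w))
        y2 = min(h, int((cy + bh / 2) * h))
        results.append({
            "bbox": [x1, y1, x2, y2],
            "confidence": float(score),
            "label": phrase,
        })
    # OpenCV writes BGR, so convert before handing the image to the cropper.
    return results, cv2.cvtColor(image_source, cv2.COLOR_RGB2BGR)


def crop_signatures(image, detections: list, output_dir: str):
    """Save each detected region as a PNG and return the file paths."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    crops = []
    for i, det in enumerate(detections):
        x1, y1, x2, y2 = det["bbox"]
        crop = image[y1:y2, x1:x2]
        if crop.size == 0:  # skip degenerate (zero-area) boxes
            continue
        path = str(out / f"signature_{i}.png")
        cv2.imwrite(path, crop)
        crops.append(path)
    return crops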
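The objective also asks for bounding box JSON alongside the cropped PNGs, which the functions above do not write to disk. One possible way to do that is sketched below. The function name `save_detections_json`, the `_boxes.json` file suffix, and the payload layout are assumptions made for illustration, not a format the lab prescribes.

```python
import json
from pathlib import Path


def save_detections_json(detections, image_name, output_dir):
    """Write the detections for one image to <output_dir>/<image stem>_boxes.json."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    payload = {
        "image": image_name,
        "count": len(detections),
        "detections": detections,  # list of {"bbox", "confidence", "label"} dicts
    }
    json_path = out / f"{Path(image_name).stem}_boxes.json"
    json_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return str(json_path)


if __name__ == "__main__":
    # Made-up detection, only to show the expected input shape.
    sample = [{"bbox": [120, 840, 410, 930], "confidence": 0.62, "label": "signature"}]
    print(save_detections_json(sample, "contract_page3.png", "output"))
```

Keeping the JSON next to the crops makes it straightforward to include both in the sample results for submission.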
Step 3 — FastAPI endpoint
python
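import tempfile

from fastapi import FastAPI, File, UploadFile

# detect_signatures and crop_signatures come from Step 2; import them here
# if the Step 2 code lives in a separate module.

app = FastAPI()


@app.post("/detect-signatures")
async def detect(file: UploadFile = File(...)):
    with tempfile.NamedTemporaryFile(suffix=".png") as tmp:
        tmp.write(await file.read())
        tmp.flush()  # make sure the bytes are on disk before the file is reopened
        detections, image = detect_signatures(tmp.name)
    crops = crop_signatures(image, detections, "output")
    return {"detections": detections, "crops": crops}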
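Because the endpoint writes every request's crops to the same `output` directory with the same `signature_{i}.png` names, a later upload can overwrite the files from an earlier one. A minimal sketch of one way around this is to give each request its own subdirectory. The helper name `make_request_dir` is hypothetical and not part of the lab's starter code.

```python
import uuid
from pathlib import Path


def make_request_dir(base_dir="output"):
    """Create and return a fresh, uniquely named subdirectory of base_dir."""
    request_dir = Path(base_dir) / uuid.uuid4().hex
    # exist_ok=False so that an (extremely unlikely) name clash fails loudly.
    request_dir.mkdir(parents=True, exist_ok=False)
    return str(request_dir)
```

Inside the endpoint, the literal `"output"` passed to `crop_signatures` could then be replaced with `make_request_dir()`, so the paths returned in `crops` stay valid after later requests.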
Submission
GitHub repo with: detection script, FastAPI app, Dockerfile, sample results
Grading rubric
| Criterion | Points |
|---|---|
| Grounding DINO loads and runs | 20 |
| Signatures detected with >70% recall | 25 |
| Cropped images saved correctly | 20 |
| FastAPI endpoint works | 20 |
| Bounding box JSON output | 15 |
| Total | 100 |
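The rubric does not say how recall is measured. A common approach, sketched here under stated assumptions, is to count a hand-labeled signature as found when some predicted box overlaps it with an intersection-over-union (IoU) of at least 0.5. The 0.5 threshold, the greedy one-to-one matching, and the function names are all assumptions, so confirm the exact protocol with your instructor.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def recall(ground_truth, predictions, iou_threshold=0.5):
    """Fraction of ground-truth boxes matched by a distinct predicted box."""
    if not ground_truth:
        return 1.0  # nothing to find; treated as perfect recall in this sketch
    unused = list(predictions)
    matched = 0
    for gt in ground_truth:
        # Greedily take the best-overlapping prediction that is still unused.
        best = max(unused, key=lambda p: iou(gt, p), default=None)
        if best is not None and iou(gt, best) >= iou_threshold:
            matched += 1
            unused.remove(best)
    return matched / len(ground_truth)


if __name__ == "__main__":
    # Made-up boxes: one of the two labeled signatures is detected.
    labeled = [[0, 0, 10, 10], [20, 20, 30, 30]]
    predicted = [[1, 1, 10, 10]]
    print(recall(labeled, predicted))  # 0.5
```

To compare against the 70% target, collect the labeled and predicted boxes for every test page and compute recall over the whole set, not page by page.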