Plate Evaluation
Runs the current model against the validation set and returns JSON metrics with automated recommendations.
Install
mkdir -p .claude/skills/plate-evaluation && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/13915" && unzip -o skill.zip -d .claude/skills/plate-evaluation && rm skill.zip
Installs to .claude/skills/plate-evaluation
About this skill
Plate Evaluation Skill
Quick evaluation utility for assessing license plate detection model performance and providing actionable recommendations.
Usage
cd /home/lol/Downloads/autotoll-ai/backend
python scripts/evaluate.py --data ../data.yaml --model specialized_plate_detector.pt
Workflow
1. Execute Evaluation
Run the evaluation script on the validation dataset:
python scripts/evaluate.py \
  --data ../data.yaml \
  --model specialized_plate_detector.pt \
  --output current_metrics.json
Output format:
{
  "model_path": "specialized_plate_detector.pt",
  "timestamp": "2026-01-31T11:30:00",
  "detection_metrics": {
    "precision": 0.891,
    "recall": 0.911,
    "mAP50": 0.906,
    "mAP50_95": 0.631
  },
  "ocr_metrics": {
    "character_accuracy": 0.94,
    "exact_match_rate": 0.82,
    "avg_confidence": 0.87
  },
  "performance": {
    "avg_inference_time_ms": 145,
    "device": "cuda:0"
  }
}
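Before a metrics file is used in later steps, it can help to confirm that it has the shape shown above. The following is a minimal sketch, assuming the key names from the example output. The `REQUIRED_KEYS` table and the `missing_keys` and `load_metrics` helpers are illustrative names and are not part of the skill.

```python
import json

# Sections and keys copied from the example output above.
# The helpers below are hypothetical, not part of scripts/evaluate.py.
REQUIRED_KEYS = {
    "detection_metrics": ("precision", "recall", "mAP50", "mAP50_95"),
    "ocr_metrics": ("character_accuracy", "exact_match_rate", "avg_confidence"),
    "performance": ("avg_inference_time_ms", "device"),
}


def missing_keys(metrics: dict) -> list[str]:
    """Return dotted paths of expected keys that are absent from a metrics dict."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        block = metrics.get(section)
        if not isinstance(block, dict):
            # The whole section is absent or malformed; report it once.
            missing.append(section)
            continue
        missing.extend(f"{section}.{key}" for key in keys if key not in block)
    return missing


def load_metrics(path: str) -> dict:
    """Load a metrics JSON file, raising ValueError if expected keys are missing."""
    with open(path) as f:
        metrics = json.load(f)
    problems = missing_keys(metrics)
    if problems:
        raise ValueError(f"{path} is missing: {', '.join(problems)}")
    return metrics
```

Calling `load_metrics("current_metrics.json")` would then fail early with a readable message instead of a `KeyError` further down the workflow.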
2. Compare Against Baseline
Load baseline metrics and calculate deltas:
python -c "
import json
# Load the stored baseline and the metrics from the latest run
with open('baseline.json') as f:
    baseline = json.load(f)
with open('current_metrics.json') as f:
    current = json.load(f)
# Print baseline -> current with the relative change in percent
print('Performance Comparison:')
print(f\"mAP50: {baseline['detection_metrics']['mAP50']:.3f} → {current['detection_metrics']['mAP50']:.3f} ({((current['detection_metrics']['mAP50']/baseline['detection_metrics']['mAP50']-1)*100):+.1f}%)\")
print(f\"OCR Accuracy: {baseline['ocr_metrics']['exact_match_rate']:.3f} → {current['ocr_metrics']['exact_match_rate']:.3f} ({((current['ocr_metrics']['exact_match_rate']/baseline['ocr_metrics']['exact_match_rate']-1)*100):+.1f}%)\")
"
3. Automated Decision Logic
The skill analyzes metrics and provides recommendations:
If OCR Accuracy < 95%
Diagnosis: OCR stage is the bottleneck
Recommendations:
- Increase EasyOCR magnification ratio:
  - Edit backend/main.py lines 934, 984
  - Change mag_ratio=1.5 to mag_ratio=2.0 or higher
  - Trade-off: Slower inference (+20-40ms) for better accuracy
- Implement TrOCR (ViT-based OCR):
  - Install: pip install transformers torch
  - Add TrOCR inference as fallback for low-confidence detections
  - Expected: +10-15% OCR accuracy improvement
- Enhance preprocessing:
  - Add bilateral filter before CLAHE
  - Implement adaptive thresholding for varied lighting
  - Use unsharp masking to sharpen text edges
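The TrOCR recommendation describes a fallback for low-confidence reads. One way that dispatch could look is sketched below, with both OCR engines passed in as plain callables so that no model library is assumed. `OcrResult`, `read_with_fallback`, and the 0.6 threshold are hypothetical; real EasyOCR and TrOCR wrappers would have to be written to return a `(text, confidence)` pair, and how each derives its confidence is left to the wrapper.

```python
from typing import Callable, NamedTuple


class OcrResult(NamedTuple):
    text: str
    confidence: float
    engine: str  # "primary" or "fallback"


# An OCR engine here is any callable mapping an image to (text, confidence).
OcrEngine = Callable[[object], tuple[str, float]]


def read_with_fallback(
    image,
    primary: OcrEngine,
    fallback: OcrEngine,
    min_confidence: float = 0.6,  # assumed threshold, tune on validation data
) -> OcrResult:
    """Run the primary engine; consult the fallback only for low-confidence reads."""
    text, conf = primary(image)
    if conf >= min_confidence:
        return OcrResult(text, conf, "primary")
    fb_text, fb_conf = fallback(image)
    # Keep whichever engine is more confident about its own reading.
    if fb_conf > conf:
        return OcrResult(fb_text, fb_conf, "fallback")
    return OcrResult(text, conf, "primary")
```

Because the slower engine runs only when the primary read is uncertain, the extra inference cost is paid on a subset of plates rather than on every crop.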
If mAP50 < 0.85
Diagnosis: Detection stage missing plates or producing false positives
Recommendations:
- Increase training epochs:
  - Current model may be undertrained
  - Re-run training with --epochs 150 or --epochs 200
- Improve data augmentation:
  - Add perspective transforms for skewed angles
  - Increase HSV variation for lighting robustness
  - Enable mosaic augmentation (multi-scale)
- Upgrade model architecture:
  - Switch from YOLOv8n (nano) to YOLOv8s (small)
  - Consider YOLOv11 with OBB for rotation handling
  - Trade-off: Higher accuracy but slower inference
If Recall < 0.90
Diagnosis: Model missing valid plates (false negatives)
Recommendations:
- Lower confidence threshold:
  - Edit detection confidence in backend/main.py
  - Default YOLO confidence is 0.25, try 0.15-0.20
  - Increases detections but may add false positives
- Check for dataset imbalance:
  - Verify validation set has diverse scenarios
  - Inspect missed detections for patterns (angles, lighting)
  - Add more training data for underrepresented cases
- Enable specialized plate model:
  - Train custom detector: python train.py --data ../data.yaml --epochs 100
  - Specialist model focuses only on plates, not general vehicles
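Lowering the confidence threshold trades precision for recall, and the effect can be checked offline before editing backend/main.py. The sketch below recomputes both metrics at several thresholds. It assumes each detection has already been matched against ground truth (for example by IoU), and both the `precision_recall_at` helper and the toy data are invented for illustration.

```python
def precision_recall_at(detections, num_ground_truth, threshold):
    """Precision and recall after dropping detections below `threshold`.

    `detections` is a list of (confidence, is_true_positive) pairs; matching
    each detection to ground truth is assumed to have been done already.
    """
    kept = [is_tp for conf, is_tp in detections if conf >= threshold]
    true_positives = sum(kept)
    # With nothing kept, precision is undefined; report 0.0 in that case.
    precision = true_positives / len(kept) if kept else 0.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    return precision, recall


# Toy data, invented for illustration: 6 detections against 5 real plates.
toy_detections = [(0.92, True), (0.81, True), (0.47, True),
                  (0.33, False), (0.22, True), (0.18, False)]

for threshold in (0.15, 0.25, 0.40):
    p, r = precision_recall_at(toy_detections, 5, threshold)
    print(f"conf>={threshold:.2f}: precision={p:.2f} recall={r:.2f}")
```

On this toy data, lowering the threshold from 0.25 to 0.15 raises recall from 0.60 to 0.80 while precision drops from 0.75 to 0.67, which is the trade-off described above.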
If Precision < 0.85
Diagnosis: Too many false positives (non-plates detected)
Recommendations:
- Raise confidence threshold:
  - Increase from default 0.25 to 0.35-0.40
  - Reduces false alarms at cost of some missed plates
- Improve regex filtering:
  - Strengthen validation in score_plate() function
  - Add more brand names to blacklist
  - Require minimum alphanumeric mix
- Add negative samples to training:
  - Include images without plates
  - Train the model to treat them as background (no plate present)
  - Reduces false positives on signage, text
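The regex filtering recommendation could look roughly like the sketch below. The body of `score_plate()` is not shown in this document, so the example uses a separate, hypothetical `looks_like_plate` function. The blacklist entries and the 4-10 character length bound are placeholders, and the letter-plus-digit rule is one possible reading of "minimum alphanumeric mix" that would reject valid plates in regions with all-digit or all-letter formats.

```python
import re

# Placeholder entries; the project's real blacklist is not shown here.
BRAND_BLACKLIST = {"TOYOTA", "HONDA", "FORD"}

# Assumed shape: 4 to 10 uppercase letters or digits after normalization.
PLATE_CHARS = re.compile(r"[A-Z0-9]{4,10}")


def looks_like_plate(text: str) -> bool:
    """Heuristic check that OCR text is plausibly a plate, not signage or a brand."""
    # Normalize: uppercase, drop spaces and hyphens that OCR often inserts.
    candidate = re.sub(r"[\s\-]", "", text.upper())
    if candidate in BRAND_BLACKLIST:
        return False
    if not PLATE_CHARS.fullmatch(candidate):
        return False
    # Require a mix of letters and digits.
    has_letter = any(c.isalpha() for c in candidate)
    has_digit = any(c.isdigit() for c in candidate)
    return has_letter and has_digit
```

A filter like this runs after OCR, so it removes false positives from the final output without changing the detector's own precision metric.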
Output Summary
The skill returns a JSON summary with actionable items:
{
  "evaluation_date": "2026-01-31",
  "overall_grade": "B+",
  "bottleneck": "OCR accuracy",
  "recommendations": [
    {
      "priority": "HIGH",
      "action": "Increase mag_ratio to 2.5",
      "expected_improvement": "+8% OCR accuracy",
      "implementation_time": "5 minutes"
    },
    {
      "priority": "MEDIUM",
      "action": "Add bilateral filter preprocessing",
      "expected_improvement": "+3% mAP, +5% OCR in low light",
      "implementation_time": "15 minutes"
    }
  ],
  "deploy_decision": "RECOMMEND_IMPROVEMENTS_FIRST"
}
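The threshold checks from the decision logic can be expressed as a small rule table, sketched below. The thresholds and diagnoses are the ones listed in this document. Reading "OCR Accuracy" as `exact_match_rate` is an assumption based on the baseline comparison script, and `RULES` and `diagnose` are hypothetical names. The sketch does not compute the grade, priorities, or expected improvements, because the rules for those are not given here.

```python
# (section, key, threshold, diagnosis), in the order the checks are listed above.
# Mapping "OCR Accuracy" to exact_match_rate is an assumption.
RULES = [
    ("ocr_metrics", "exact_match_rate", 0.95,
     "OCR stage is the bottleneck"),
    ("detection_metrics", "mAP50", 0.85,
     "Detection stage missing plates or producing false positives"),
    ("detection_metrics", "recall", 0.90,
     "Model missing valid plates (false negatives)"),
    ("detection_metrics", "precision", 0.85,
     "Too many false positives (non-plates detected)"),
]


def diagnose(metrics: dict) -> list[dict]:
    """Return one finding per metric that falls below its threshold."""
    findings = []
    for section, key, threshold, diagnosis in RULES:
        value = metrics[section][key]
        if value < threshold:
            findings.append({
                "metric": key,
                "value": value,
                "threshold": threshold,
                "diagnosis": diagnosis,
            })
    return findings
```

Applied to the example metrics from step 1, only the OCR rule fires (0.82 is below 0.95), which matches the "bottleneck" field in the summary above.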
Integration with ALPR Optimizer
This skill is designed to be called within the ALPR Optimizer workflow:
- Pre-training: Baseline evaluation
- Post-training: Improved model evaluation
- Decision point: Auto-approve deployment if metrics meet thresholds
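The decision point could be implemented as a gate over the pre-training and post-training metrics files, as in the sketch below. The minimums reuse the thresholds from the decision logic, with "OCR Accuracy" read as `exact_match_rate`. The no-regression rule, the `AUTO_APPROVE` label, and the function names are assumptions; only `RECOMMEND_IMPROVEMENTS_FIRST` appears in this document.

```python
# Minimums taken from the decision logic; the exact_match_rate mapping is assumed.
MINIMUMS = {
    "exact_match_rate": 0.95,
    "mAP50": 0.85,
    "recall": 0.90,
    "precision": 0.85,
}


def flatten(metrics: dict) -> dict:
    """Merge detection and OCR metrics into one flat dict for easy lookup."""
    return {**metrics["detection_metrics"], **metrics["ocr_metrics"]}


def deploy_decision(baseline: dict, candidate: dict, minimums: dict = MINIMUMS) -> str:
    """Approve only if the candidate meets every minimum and regresses on none."""
    before, after = flatten(baseline), flatten(candidate)
    meets_minimums = all(after[key] >= floor for key, floor in minimums.items())
    # Assumed extra rule: a retrained model must not score below the baseline.
    no_regression = all(after[key] >= before[key] for key in minimums)
    if meets_minimums and no_regression:
        return "AUTO_APPROVE"  # hypothetical label
    return "RECOMMEND_IMPROVEMENTS_FIRST"
```

Here `baseline` would be the pre-training evaluation and `candidate` the post-training one, both in the output format produced by scripts/evaluate.py.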