ml-engineer
Build, serve, monitor, and scale production machine learning systems. Use when the task involves training infrastructure, model serving, feature pipelines, experiment tracking, online or batch inference, ML observability, or deployment tradeoffs.
Install
mkdir -p .claude/skills/ml-engineer-tontide1 && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/14791" && unzip -o skill.zip -d .claude/skills/ml-engineer-tontide1 && rm skill.zip
Installs to .claude/skills/ml-engineer-tontide1
About this skill
ML Engineer
Quick Start
- Define the prediction task, serving pattern, and business success metric.
- Separate concerns across data prep, training, validation, registry, and inference.
- Design for reproducibility, rollback, and monitoring before optimizing throughput.
- Implement with explicit data contracts and model versioning.
- Validate both model quality and operational behavior.
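As a rough illustration of "explicit data contracts and model versioning", the sketch below declares an input contract, validates a row against it, and folds a hash of the contract into the model's version tag. Everything named here (`FeatureSpec`, `CONTRACT_V1`, the field names, the tag format) is a hypothetical choice for the example, not something this skill prescribes.

```python
from dataclasses import dataclass
import hashlib
import json


@dataclass(frozen=True)
class FeatureSpec:
    """One field of a hypothetical input contract."""
    name: str
    dtype: type
    nullable: bool = False


# Illustrative contract; the field names are assumptions, not from the skill.
CONTRACT_V1 = (
    FeatureSpec("age", int),
    FeatureSpec("country", str),
    FeatureSpec("avg_basket", float, nullable=True),
)


def contract_fingerprint(contract):
    """Short, stable hash of the contract so a model can record what it expects."""
    payload = json.dumps([[f.name, f.dtype.__name__, f.nullable] for f in contract])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def validate_row(row, contract):
    """Return a list of contract violations for one input row (empty if valid)."""
    errors = []
    for spec in contract:
        if spec.name not in row:
            errors.append(f"missing field: {spec.name}")
            continue
        value = row[spec.name]
        if value is None:
            if not spec.nullable:
                errors.append(f"null not allowed: {spec.name}")
        elif not isinstance(value, spec.dtype):
            errors.append(f"wrong type for {spec.name}: {type(value).__name__}")
    return errors


def model_version_tag(name, version, contract):
    """Combine model name, version, and contract hash into one identifier."""
    return f"{name}:{version}+{contract_fingerprint(contract)}"
```

Tying the contract hash to the version tag means a schema change shows up as a new identifier, which makes it easier to spot a model being served against inputs it was not trained on.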
Workflow
Design the system
- Decide whether the workload is batch, streaming, synchronous, or asynchronous.
- Keep training-time and serving-time feature logic aligned to avoid training-serving skew.
- Choose model packaging and deployment paths that match the runtime environment.
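One common way to keep training-time and serving-time feature logic aligned is to route both paths through a single transform function. The sketch below assumes a toy record with `amount` and `day_of_week` fields and a `predict` callable supplied by the caller; all of these names are hypothetical.

```python
import math


def build_features(raw):
    """Single source of truth for feature logic, shared by training and serving."""
    return {
        # log1p compresses large amounts; negative inputs are clamped to zero.
        "log_amount": math.log1p(max(raw["amount"], 0.0)),
        # Assumes day_of_week uses 0 = Monday ... 6 = Sunday.
        "is_weekend": 1 if raw["day_of_week"] in (5, 6) else 0,
    }


def training_rows(records):
    """Offline path: transform a batch of historical records."""
    return [build_features(r) for r in records]


def serve(request, predict):
    """Online path: transform one request with the same function, then predict."""
    return predict(build_features(request))
```

Because both paths call `build_features`, a change to the feature definition cannot reach training without also reaching serving.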
Build for operations
- Version datasets, features, models, configs, and metrics together.
- Add validation around schemas, feature freshness, and model input shape.
- Plan for rollback, shadow traffic, or canary release before broad rollout.
- Monitor latency, throughput, drift, error rate, and business KPIs separately.
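Two of the operational steps above lend themselves to a small sketch: deterministic canary routing and a drift check. The example assumes requests carry a string ID and uses the population stability index (PSI) as the drift measure; PSI is a choice made for this illustration, and the function names and bin edges are hypothetical.

```python
import hashlib
import math


def route(request_id, canary_percent):
    """Send a stable slice of traffic to the canary, keyed on the request ID."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"


def _bin_shares(values, edges, eps):
    """Fraction of values per bin; edges are ascending interior cut points."""
    if not values:
        raise ValueError("cannot compute bin shares for an empty sample")
    counts = [0] * (len(edges) + 1)
    for v in values:
        index = sum(1 for e in edges if v >= e)  # number of edges at or below v
        counts[index] += 1
    # Floor each share at eps so empty bins do not produce log(0).
    return [max(c / len(values), eps) for c in counts]


def psi(expected, actual, edges, eps=1e-6):
    """Population stability index between a baseline sample and a live sample."""
    e = _bin_shares(expected, edges, eps)
    a = _bin_shares(actual, edges, eps)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Hashing the request ID keeps a given caller on the same variant across retries, which makes canary metrics easier to compare. The PSI threshold that should trigger an alert depends on the feature and is left to the operator.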
Validate the result
- Run offline evaluation with leakage-aware splits, for example by time or by entity, so related rows never land on both sides.
- Test inference paths with realistic payloads and failure cases.
- Report system limits, retraining triggers, and operational ownership.
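Two simple split strategies that guard against leakage are sketched below: a group split that keeps every row for one entity on the same side, and a time split that trains only on rows before a cutoff. The row layout (dicts with a group key and a time key) and the function names are assumptions for the example.

```python
import hashlib


def group_split(rows, group_key, test_percent=20):
    """Assign whole groups to train or test so no entity appears in both."""
    train, test = [], []
    for row in rows:
        digest = hashlib.sha256(str(row[group_key]).encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100  # stable per group value
        (test if bucket < test_percent else train).append(row)
    return train, test


def time_split(rows, time_key, cutoff):
    """Train on rows strictly before the cutoff, evaluate on the rest."""
    train = [r for r in rows if r[time_key] < cutoff]
    test = [r for r in rows if r[time_key] >= cutoff]
    return train, test
```

The group split is approximate in size, since the share of rows in the test set depends on how the hash distributes the groups, but the assignment is reproducible without storing a split table.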
Deliverables
- A production-minded ML system design or implementation.
- Clear tradeoffs across quality, cost, latency, and maintenance.
- A rollout and monitoring plan tied to the model lifecycle.