organize-ml-workspace

Name: organize-ml-workspace
Author: vathymut

Install

mkdir -p .claude/skills/organize-ml-workspace && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/15846" && unzip -o skill.zip -d .claude/skills/organize-ml-workspace && rm skill.zip

Installs to .claude/skills/organize-ml-workspace

Activation

This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.

Decide where files live in an ML experimentation project and scaffold the workspace layout. One `# %%` script per experiment, design notes in `journal/`, reusable code in `src/<pkg>/`. Triggers on new project setup, first experiment file, or questions about project structure.

276 chars · no explicit “when” trigger · longer than Claude Code's old 250-char listing cap (fine on current versions)

About this skill

Organize ML Workspace

Where things live, when to create a new file, what each file is allowed to contain.

Next-step pointers

You came here for…	→ next
Bootstrap a fresh workspace	→ `python-env-manager` § Bootstrap; then `iterate-ml-experiment` § 0
First experiment script	→ `iterate-ml-experiment` § 0 for the design note
Add a new experiment iteration	→ `iterate-ml-experiment` § 1 (new vs edit decision)
Pipeline / evaluate / smoke-test content	→ `build-ml-pipeline` / `evaluate-ml-pipeline` / `smoke-test-ml-pipeline`

Always re-emit the Pre-flight checklist with evidence before declaring the turn done.

Sibling skills — open just-in-time

Don't pre-read every sibling at session start. Open each sibling SKILL.md when a step calls for it. Emit this tracker once per turn:

Sibling skills (just-in-time):
  - data-science-python-stack, python-env-manager, python-api,
    python-code-style, iterate-ml-experiment, explore-ml-data,
    build-ml-pipeline, evaluate-ml-pipeline, test-ml-pipeline,
    smoke-test-ml-pipeline

Stop conditions — read before anything else

Missing dependency. If import skore raises, STOP. Invoke python-env-manager. Do not drop skore.Project for mlflow/pickles.
Symbol from memory is forbidden. Signatures must come from python-api this turn.
Existing layout wins — detect first. Run Detection before scaffolding.
Notebooks are not silent. .ipynb in experiment folder → ask first.
Scratch is read-only against skore. MUST NOT call skore.evaluate or project.put(...). Lookup: summarize() → (key, id) → get(id).
G-TABULAR. Invoke data-science-python-stack. Persisted in JOURNAL.md.
G-PKG-NAME. AskUserQuestion before any manifest creation.
G-SKORE-MODE. AskUserQuestion for local|hub|mlflow before any template with skore.Project(...). Persists as skore mode:. Hub → follow-up for workspace name. MLflow → follow-up for tracking URI. → references/g_skore_mode.md for details.
Switching skore mode mid-project forbidden by default. Requires explicit confirmation. → references/g_skore_mode.md.
G-ENV-MGR. Hand off to python-env-manager. No pixi init/uv init until G-ENV-MGR has passed.
No-clarifying-questions hints do NOT waive these gates.
Post-hoc audit — required before ending the turn. Surface any unfilled pre-flight Evidence cell explicitly.

Forbidden shortcuts

→ See references/forbidden-shortcuts.md for the full table.

Pre-flight — emit before any code

Each ticked box needs an Evidence line (format spec in iterate-ml-experiment § "Pre-flight evidence requirements"). Inline python -c is NOT evidence — use scratch files.

Pre-flight (organize-ml-workspace):
- [ ] `Workspace decisions` in JOURNAL.md Status checked
      Evidence: <gate>: <value | not recorded> | "n/a — fresh"
- [ ] Tier 1 libs importable: sklearn, skrub, skore
      Evidence: scratch/<ts>_check_tier1.py + pixi run python
- [ ] Layout detection: <existing | fresh>
      Evidence: ls/Glob + matched Detection signal
- [ ] G-TABULAR resolved
      Evidence: AskUserQuestion | JOURNAL.md | user quote
- [ ] G-ENV-MGR resolved
      Evidence: AskUserQuestion | JOURNAL.md
- [ ] G-PKG-NAME resolved
      Evidence: AskUserQuestion | JOURNAL.md | manifest confirmed
- [ ] G-SKORE-MODE resolved
      Evidence: AskUserQuestion | JOURNAL.md `skore mode:` row
- [ ] pyproject.toml present + editable install wired
      Evidence: Read pyproject.toml + manager call
- [ ] python-api consulted: Project, put, evaluate
      Evidence: scratch/api cache or "n/a — cached"
- [ ] Decision: new file vs edit existing
      Evidence: AskUserQuestion | "n/a — first experiment"
- [ ] journal/ scaffolded
      Evidence: Write JOURNAL.md | "already exists"
- [ ] Pre-flight re-emitted with evidence
      Evidence: appears in end-of-turn summary
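
The post-hoc audit of this checklist could be mechanized roughly as follows. This is a minimal sketch rather than part of the skill: the function name `unfilled_evidence` is hypothetical, and it assumes a ticked box is written as `- [x]` with its `Evidence:` line directly underneath, as in the template above.

```python
def unfilled_evidence(checklist: str) -> list[str]:
    """Return the labels of ticked items whose Evidence line is missing or blank."""
    lines = [line.strip() for line in checklist.splitlines()]
    missing = []
    for i, line in enumerate(lines):
        # Only ticked boxes need evidence; unticked ones are skipped.
        if not line.lower().startswith("- [x]"):
            continue
        label = line[5:].strip()
        following = lines[i + 1] if i + 1 < len(lines) else ""
        if following.startswith("Evidence:"):
            evidence = following[len("Evidence:"):].strip()
        else:
            evidence = ""
        if not evidence:
            missing.append(label)
    return missing
```

Any label returned here would be surfaced explicitly in the end-of-turn summary instead of being silently left blank.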

Detection — existing workspace first

Signal	Meaning
`pyproject.toml` with `[project] name` + setuptools/poetry/hatch	Package declared installable
`pixi.toml` / `[tool.poetry]` / `[tool.uv]` with name but no `[project]`	Manager knows project but package isn't installable
`src/<pkg>/__init__.py` or `<pkg>/__init__.py` at root	Package dir already chosen
`<pkg>.egg-info/` at root or under `src/`	Stale `pip install -e .` — flag drift
`experiments/`, `notebooks/`, `scripts/`, `analyses/`	Experiment location chosen
`audit/` with `# %%` files	Audit location chosen
`journal/`, `plans/`, `proposals/`	Journal location chosen
`reports/`, `results/`, `runs/`	Report location chosen
`tests/`	Test location chosen
`mlflow.db` / `mlruns/` at root	Prior tracker artifacts — leave alone
`.ipynb` files in experiment folder	Surface the shift and ask

Any signal present → glue to existing convention. No renames. None present → fresh scaffold (below).

→ next: G-PKG-NAME, then python-env-manager for G-ENV-MGR.
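
The directory rows of the Detection table can be checked with a few lines of `pathlib`. The sketch below is illustrative only: `DIR_SIGNALS` and `detect_layout` are hypothetical names, and the manifest, package, `audit/`, tracker, and `.ipynb` signals are deliberately left out to keep it short.

```python
from pathlib import Path

# Directory names copied from the Detection table; the role keys are illustrative.
DIR_SIGNALS = {
    "experiments": ("experiments", "notebooks", "scripts", "analyses"),
    "journal": ("journal", "plans", "proposals"),
    "reports": ("reports", "results", "runs"),
    "tests": ("tests",),
}


def detect_layout(root: Path) -> dict[str, str]:
    """Map each role to the first existing directory that signals it."""
    found: dict[str, str] = {}
    for role, candidates in DIR_SIGNALS.items():
        for name in candidates:
            if (root / name).is_dir():
                found[role] = name  # glue to this name, never rename it
                break
    return found
```

An empty result from this partial check, combined with no match on the omitted signals, corresponds to the fresh-scaffold branch; any hit means the existing name is reused as-is.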

Default layout (fresh workspace)

project/
├── pyproject.toml          # declares src/<pkg>/ as installable
├── <manager manifest>      # pixi.toml / poetry / uv / hatch
├── src/<pkg>/
│   ├── __init__.py         # exposes PROJECT_ROOT
│   ├── data.py             # data loading, splits, split_kwargs
│   ├── features.py         # transformers, encoders, feature fns
│   ├── pipeline.py         # the learner declaration (skrub DataOps)
│   └── evaluate.py         # ONLY: CV strategy + optional metric overrides
├── journal/
│   ├── JOURNAL.md          # session-start log; index of experiments
│   └── 01_baseline.md      # one `.md` per planned experiment
├── experiments/
│   └── 01_baseline.py      # one `# %%` script per experiment
├── audit/
│   └── 01_baseline.py      # body owned by audit-ml-pipeline (read-only)
├── tests/
│   └── smoke/              # body owned by smoke-test-ml-pipeline
├── scratch/                # agent-only (gitignored entirely)
└── reports/                # skore Project lives here

The package is installable. pyproject.toml declares src/<pkg>/, and the manager installs it in editable mode. Runtime deps (sklearn, skrub, skore, and the tabular library chosen under G-TABULAR) live in the manager's manifest, not in [project.dependencies].

Deliberately absent: no data/ (user-owned), no models/ (out of scope). The sole writer into data/ is explore-ml-data.
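
As a rough illustration of the directory part of this layout, the sketch below creates the default folders and nothing else. `scaffold_dirs` and `DEFAULT_DIRS` are hypothetical names; the gates (G-PKG-NAME, G-ENV-MGR, G-SKORE-MODE) and the file templates are assumed to be handled elsewhere and are not modeled here.

```python
from pathlib import Path

# Directories from the default layout; data/ and models/ are intentionally not listed.
DEFAULT_DIRS = ("journal", "experiments", "audit", "tests/smoke", "scratch", "reports")


def scaffold_dirs(root: Path, pkg: str) -> list[Path]:
    """Create the default directories under root and return the ones that were new."""
    created = []
    for rel in (f"src/{pkg}", *DEFAULT_DIRS):
        path = root / rel
        if not path.exists():
            path.mkdir(parents=True)  # also creates src/ and tests/ as needed
            created.append(path)
    return created
```

Running it a second time returns an empty list, so an existing workspace is never touched.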

File-creation rules

Design note first, then code

Before creating experiments/NN_<short_name>.py, the matching journal/NN_<short_name>.md must exist and have been validated by the user.

Four-way stem pairing

Every experiment is identified by NN_<short_name> in four places:

journal/NN_<short_name>.md            (design note)
experiments/NN_<short_name>.py        (script)
tests/smoke/test_NN_<short_name>.py   (smoke test)
audit/NN_<short_name>.py              (audit file — read-only)
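
The pairing above can be expressed as a small helper that derives all four paths from one stem. This is a sketch under assumptions: `paired_paths` and `missing_pairs` are hypothetical names, and `NN` is read as exactly two digits, which the source does not state explicitly.

```python
import re
from pathlib import Path

# Assumed reading of NN_<short_name>: two digits, underscore, word characters.
STEM_RE = re.compile(r"\d{2}_\w+")


def paired_paths(stem: str) -> dict[str, Path]:
    """Return the four files that share one experiment stem."""
    if not STEM_RE.fullmatch(stem):
        raise ValueError(f"not an NN_<short_name> stem: {stem!r}")
    return {
        "journal": Path("journal") / f"{stem}.md",
        "experiment": Path("experiments") / f"{stem}.py",
        "smoke_test": Path("tests/smoke") / f"test_{stem}.py",
        "audit": Path("audit") / f"{stem}.py",
    }


def missing_pairs(root: Path, stem: str) -> list[str]:
    """List the roles whose file does not exist yet under root."""
    return [role for role, rel in paired_paths(stem).items() if not (root / rel).is_file()]
```

A `"journal"` entry in the result of `missing_pairs` would mean the design note has not been written, so the experiment script should not be created yet.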

New experiment → new file. Iterating → ask first.

Default: new file. When the user says "let's tweak experiment 02", fire AskUserQuestion:

Should this be a new experiment file (e.g. 04_text_encoder_v2.py) or an in-place edit of 02_text_encoder.py?

In-place edits overwrite the prior result in the skore Project if the same key is reused — flag this.
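
When the answer is "new file", the next stem has to be chosen. One possible convention, assumed here and not stated by the skill, is to take the highest existing number and add one. `next_experiment_stem` is a hypothetical helper name.

```python
import re
from pathlib import Path


def next_experiment_stem(experiments_dir: Path, short_name: str) -> str:
    """Suggest the next NN_<short_name> stem after the highest existing number."""
    numbers = []
    for script in experiments_dir.glob("*.py"):
        match = re.match(r"(\d+)_", script.name)
        if match:
            numbers.append(int(match.group(1)))
    # An empty folder yields 01; otherwise one past the current maximum.
    return f"{max(numbers, default=0) + 1:02d}_{short_name}"
```

Because the new stem differs from every existing one, the key written to the skore Project is new as well, which avoids the overwrite described above.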

Decision flow (12 steps)

Full version: references/scaffold_steps.md.

1. Read project root. Detection matches → glue. No match → continue.
2. G-PKG-NAME structured ask. Record in Workspace decisions.
2a. G-SKORE-MODE ask: local | hub | mlflow (+ follow-ups).
3. Drop pyproject.toml from template. Hand off to python-env-manager.
4. Create src/<pkg>/ with skeletons from templates/src_*.py.
5. Create experiments/01_baseline.py (substitute <SKORE_PROJECT_INIT>).
6. Create empty tests/smoke/. Verify pytest.
6a. Create empty audit/.
7. Create journal/JOURNAL.md one-line placeholder.
8. Create empty scratch/.
9. Create empty reports/.
10. Touch .gitignore: always ask about reports/. Never ignore data/.
11. Hand off to python-code-style § Initial setup.
12. Hand back to relevant sibling.
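
The .gitignore step in this flow carries three rules: scratch/ is always ignored, reports/ is ignored only if the user says so, and data/ is never ignored. A minimal sketch of that logic, with hypothetical function names, might look like this:

```python
def gitignore_entries(ignore_reports: bool) -> list[str]:
    """Entries the scaffold may add; ignore_reports comes from asking the user."""
    entries = ["scratch/"]
    if ignore_reports:
        entries.append("reports/")
    return entries


def merge_gitignore(existing: str, entries: list[str]) -> str:
    """Append entries that are not present yet, refusing to ignore data/."""
    lines = existing.splitlines()
    for entry in entries:
        if entry.strip("/") == "data":
            raise ValueError("data/ must never be added to .gitignore")
        if entry not in lines:
            lines.append(entry)
    return "\n".join(lines) + "\n"
```

The merge is idempotent, so touching .gitignore again on a later turn does not duplicate lines.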

Files in src/<pkg>/

__init__.py — exposes PROJECT_ROOT (absolute, from __file__).
data.py — loaders, X, y, split_kwargs at the X marker.
features.py — feature functions and transformers.
pipeline.py — the learner declaration (SkrubLearner).
evaluate.py — CV strategy + optional metric overrides only.
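
For the `__init__.py` entry, an absolute PROJECT_ROOT can be derived from `__file__` by walking up the src layout. The helper name `project_root_from` is hypothetical, and the sketch assumes the file sits at src/<pkg>/__init__.py exactly as in the default layout.

```python
from pathlib import Path


def project_root_from(init_file: Path) -> Path:
    """Resolve the project root from the path of src/<pkg>/__init__.py."""
    # parents[0] is src/<pkg>, parents[1] is src, parents[2] is the project root.
    return Path(init_file).resolve().parents[2]


# Inside src/<pkg>/__init__.py this would read:
# PROJECT_ROOT = project_root_from(Path(__file__))
```

If an existing workspace keeps the package at the root (`<pkg>/__init__.py`), the index would be 1 instead of 2.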

Experiment scripts — `experiments/NN_*.py`

# %% cell markers, not .ipynb. Template: templates/experiment.py.

Open / attach to the skore.Project at reports/.
Import learner from <pkg>.pipeline and CV from <pkg>.evaluate.
Call skore.evaluate(...).
Call project.put("<experiment-key>", report).

Project init substitution — <SKORE_PROJECT_INIT> in template replaced per recorded skore mode:. Three forms: → references/g_skore_mode.md.

Experiment key convention — file stem (e.g. 01_baseline.py → "01_baseline").
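
The key convention amounts to taking the file stem. A small sketch, with the hypothetical helper name `experiment_key`:

```python
from pathlib import Path


def experiment_key(script_path: str) -> str:
    """Derive the experiment key from the script's file stem."""
    # "experiments/01_baseline.py" -> "01_baseline"
    return Path(script_path).stem
```

In a script this could be called as `experiment_key(__file__)`; note that `__file__` may not be defined when cells are run in an interactive kernel, in which case the stem would need to be written out literally.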

Companion skills

iterate-ml-experiment (journal/design notes), explore-ml-data (EDA in data/), build-ml-pipeline (pipeline.py/features.py/data.py), evaluate-ml-pipeline (evaluate.py), test-ml-pipeline (tests/ layout), smoke-test-ml-pipeline (smoke test body), audit-ml-pipeline (audit/), python-api (signatures), python-env-manager (install/bootstrap), data-science-python-stack (what to install), python-code-style (ruff.toml + NumPyDoc).

References (load on demand)

references/scaffold_steps.md: full 12-step Decision flow
references/g_skore_mode.md: G-SKORE-MODE gate detail
references/forbidden-shortcuts.md: shortcuts table

Safety

No risk patterns found

Automated static scan of the SKILL.md and repo. A flag describes what the skill can do — not a verdict. Always review code before installing.

Source & maintenance

Updated: 1d ago
Loads: ~2,443 tokens
Author: vathymut (2 skills published)
