sue-update-lesson
Use whenever a durable SUE / scale-up lesson is learned in any sue-*
Install
mkdir -p .claude/skills/sue-update-lesson && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/13522" && unzip -o skill.zip -d .claude/skills/sue-update-lesson && rm skill.zip
Installs to .claude/skills/sue-update-lesson
Sue Update Lesson
Overview
Append durable SUE lessons to the correct L1 or L2 SCALE_UP.md file after a
workflow issue has been solved and encoded in durable behavior.
DeepResearch Root Contract (HIGHEST PRIORITY)
Before any SUE action in this skill, verify the workspace output-root contract. If any check fails, STOP and abort; do not proceed with discovery, planning, submission, monitoring, cleanup, reporting, or lessons.
- The workspace path must be $<SANDBOX>_DEEPRESEARCH_ROOT/workspace/<workspace>/.
- All scale-up outputs must be written to $<SANDBOX>_DEEPRESEARCH_ROOT/workspace/<workspace>/scale_up_outputs/ or a subpath declared in runtime.yaml under that root.
- $<SANDBOX>_DEEPRESEARCH_ROOT must be resolvable via sue-context (or the selected sandbox's private config for remote backends). Do not fall back to $HOME, arbitrary scratch, or the source tree.
This contract takes precedence over sandbox selection, GPU accounting, manifest
validation, and all downstream steps.
Hard Sandbox Repo Root Gate
Before any remote query, sync, cleanup, dryrun/fullrun submission, monitor, quota check, or output-path mutation, verify that the selected backend exports a non-empty <SANDBOX>_DEEPRESEARCH_ROOT (one of LUMI_DEEPRESEARCH_ROOT, NM5_DEEPRESEARCH_ROOT, SNELLIUS_DEEPRESEARCH_ROOT, BREV_DEEPRESEARCH_ROOT, AUTODL_DEEPRESEARCH_ROOT, or RUNPOD_DEEPRESEARCH_ROOT) and that the directory exists. If the root is unset, empty, missing, or only inferable from cwd, scratch/project roots, PROJECT_ROOT, NM5_PROJECT_ROOT, workspace name, or marker walking, terminate immediately with a blocker naming the missing key. Do not derive a replacement root, do not continue with a guessed checkout, and do not run destructive or quota-consuming commands.
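The root contract and the hard gate above can be sketched as one preflight check. This is a minimal sketch under stated assumptions, not the skill's actual implementation: the names RootGateError, resolve_root, and scale_up_outputs_dir are hypothetical, and the env argument exists only so the check can be exercised without touching the real environment.

```python
import os
from pathlib import Path

# Backends whose <SANDBOX>_DEEPRESEARCH_ROOT key is named in the gate above.
# "local" is not listed there, so this sketch rejects it (an assumption).
KNOWN_SANDBOXES = ("LUMI", "NM5", "SNELLIUS", "BREV", "AUTODL", "RUNPOD")


class RootGateError(RuntimeError):
    """Hypothetical blocker type: raised instead of guessing a root."""


def resolve_root(sandbox, env=None):
    """Return the sandbox root, or raise a blocker naming the missing key."""
    env = os.environ if env is None else env
    name = sandbox.upper()
    if name not in KNOWN_SANDBOXES:
        raise RootGateError(f"blocker: unknown sandbox {sandbox!r}")
    key = f"{name}_DEEPRESEARCH_ROOT"
    value = (env.get(key) or "").strip()
    if not value:
        raise RootGateError(f"blocker: {key} is unset or empty")
    root = Path(value)
    if not root.is_dir():
        raise RootGateError(f"blocker: {key} does not point to an existing directory")
    return root


def scale_up_outputs_dir(root, workspace):
    """Default output location under the root contract."""
    return root / "workspace" / workspace / "scale_up_outputs"
```

The check never falls back to cwd, $HOME, or a project root: every failure path raises with the key name so the caller can report the blocker and stop.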
Sandbox Communication
When this workflow communicates with a remote sandbox, keep SSH/API calls as
control-plane actions: launch, inspect, fetch logs, or stop work. If a remote
command is likely to run long, stream substantial output, or require repeated
polling, start it inside a detached tmux session on the sandbox and return
after verifying the session name and durable log path. Do not keep a local SSH
connection open as the job supervisor.
Use a stable session name and a log under the workspace's configured
logs_root or scale_up_outputs/logs/. Prefer
tmux new-session -d -s <name> 'bash <script> 2>&1 | tee -a <log>'; avoid
tmux send-keys. On Slurm-capable sandboxes, submit scheduler jobs normally and
use tmux only for long login-node orchestration or monitor loops. On direct-run
sandboxes such as Brev, AutoDL, or RunPod, use tmux for long-running remote
commands unless the platform provides an equivalent detached process supervisor.
If tmux is unavailable on the sandbox, stop and report the exact missing tool
instead of silently keeping the connection open.
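As an illustration of the preferred launch form, the following sketch builds the tmux command lines without executing them. The helper names tmux_launch_argv and tmux_has_session_argv are hypothetical; the session name, script, and log path in the usage note are placeholders, and how the command reaches the sandbox (SSH or a platform API) is left to the caller.

```python
import shlex


def tmux_launch_argv(session, script, log):
    """Build argv for a detached tmux session that tees output to a durable log."""
    # Quote the script and log paths so spaces or shell metacharacters are safe
    # inside the single command string that tmux hands to the shell.
    inner = f"bash {shlex.quote(script)} 2>&1 | tee -a {shlex.quote(log)}"
    return ["tmux", "new-session", "-d", "-s", session, inner]


def tmux_has_session_argv(session):
    """Build argv that checks the session exists before the caller returns."""
    return ["tmux", "has-session", "-t", session]
```

For example, tmux_launch_argv("sue-fullrun", "run.sh", "scale_up_outputs/logs/fullrun.log") yields a command that can be serialized with shlex.join and sent as a single control-plane call, followed by the has-session check to verify the session name.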
Record a durable SUE / scale-up lesson so the same mistake is not repeated.
When to Use
Invoke this skill at the end of any sue-* workflow when:
- An unexpected issue was solved.
- A sandbox-specific quirk was discovered.
  - If the quirk changes a backend's reusable access, preflight, launch, or lifecycle rules, treat it as a sandbox-specific lesson and update the matching deepresearch-sandbox/<SANDBOX>_SANDBOX.md file.
  - If it is a one-off or project-specific workaround, record it in the workspace SCALE_UP.md instead.
- A session-specific check or command turned out to be required.
- The user says "remember this", "add this to SCALE_UP.md", "record the lesson", "update sandbox memory", or any equivalent phrase.
This skill is the canonical way to persist lessons across
workspace/<workspace_name> projects.
Before debugging any SUE / scale-up issue from scratch, first check the root
SCALE_UP.md, the active workspace's SCALE_UP.md, and the active backend's
deepresearch-sandbox/<SANDBOX>_SANDBOX.md for an existing solution in the
relevant session or section. Apply any matching lesson directly. If the issue is
new or an existing lesson needs strengthening, use this skill to update the
appropriate memory file, then update the affected .codex/skills/sue-*/SKILL.md
when workflow behavior changed.
No Interview
This skill is auto-invoked. Resolve every input from the calling context
and the workspace's authoritative scale_up_outputs/<exp_dir>/config/runtime.yaml when it
exists. If scale_up_outputs/<exp_dir>/config/runtime.yaml and scale_up_outputs/<exp_dir>/config/scale-up.yaml
are missing, stale, or disagree on backend or environment policy, stop and
report the contradiction before recording a lesson.
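The contradiction check can be sketched over already-parsed configs. This is a hedged illustration only: config_problems is a hypothetical helper, the "environment" key name is an assumption (the source confirms only runtime.yaml's backend field), treating a missing file as a blocker is one reading of the rule above, and staleness detection is not covered.

```python
def config_problems(runtime, scale_up, keys=("backend", "environment")):
    """Return human-readable blockers; an empty list means the configs agree.

    `runtime` and `scale_up` are parsed mappings, or None when the file is
    missing. The "environment" key name is an assumption for illustration.
    """
    problems = []
    if runtime is None:
        problems.append("runtime.yaml is missing")
    if scale_up is None:
        problems.append("scale-up.yaml is missing")
    if problems:
        return problems
    for key in keys:
        left, right = runtime.get(key), scale_up.get(key)
        if left != right:
            problems.append(f"{key}: runtime.yaml={left!r}, scale-up.yaml={right!r}")
    return problems
```

A non-empty result would be reported verbatim and the lesson would not be recorded.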
Per-run artifacts and final_result bundles belong under the per-experiment
output directory (paths.exp_dir / SUE_EXP_DIR), not directly under a loose
<scale_up_outputs_root>/<run_id>/ tree.
- Workspace path: from the caller's resolved workspace.
- Sandbox: from runtime.yaml.backend or the caller's sandbox context.
- Session: from the caller's workflow phase (e.g., dryrun, fullrun).
- Scope: inferred from whether the lesson depends on project-specific code, data, or config.
- Lesson content: trigger, wrong behavior, correct behavior, from the caller's failure/fix summary.
- Source: caller workspace + current date (redact private values).
If a required value cannot be resolved from context, stop and report the exact missing field; do not interview the user.
Inputs
- Workspace path: caller's resolved workspace.
- Sandbox: LUMI, Snellius, RunPod, Brev, AutoDL, local, or any.
- Session: one of env_prep, dataset_prep, interface_check, scripts_writing, dryrun, fullrun, monitor, summarize, cleanup, diagnose, reset, audit.
- Scope: is the lesson:
  - Generic: applies to multiple workspaces (e.g., a LUMI container trap).
  - Workspace-specific: tied to this project's data, code paths, or config (e.g., a CO3D LMDB quirk in loop-vggt).
  - Sandbox-specific: changes the reusable access, preflight, launch, or lifecycle rules for a specific backend (e.g., a new LUMI MIOPEN cache requirement). Sandbox-specific lessons go to deepresearch-sandbox/<SANDBOX>_SANDBOX.md.
- Lesson content: trigger, wrong behavior, correct behavior.
- Source: workspace name and date (redact private values).
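The inputs listed above, together with the no-interview rule, can be sketched as a validation step that either returns a complete record or stops with the exact missing fields. The LessonInputs type and resolve_inputs helper are hypothetical names, and the flat context dictionary is an assumed shape for whatever the calling workflow provides.

```python
from dataclasses import dataclass, fields

# Allowed values copied from the Inputs list above.
SANDBOXES = ("LUMI", "Snellius", "RunPod", "Brev", "AutoDL", "local", "any")
SESSIONS = (
    "env_prep", "dataset_prep", "interface_check", "scripts_writing",
    "dryrun", "fullrun", "monitor", "summarize", "cleanup",
    "diagnose", "reset", "audit",
)


@dataclass(frozen=True)
class LessonInputs:
    workspace: str
    sandbox: str
    session: str
    scope: str
    trigger: str
    wrong: str
    correct: str
    source: str


def resolve_inputs(context):
    """Build LessonInputs from caller context, or fail naming what is missing."""
    names = [f.name for f in fields(LessonInputs)]
    missing = [n for n in names if not str(context.get(n) or "").strip()]
    if missing:
        # Stop and report; never prompt the user for the value.
        raise ValueError("missing required field(s): " + ", ".join(missing))
    inputs = LessonInputs(**{n: context[n] for n in names})
    if inputs.sandbox not in SANDBOXES:
        raise ValueError(f"unknown sandbox: {inputs.sandbox}")
    if inputs.session not in SESSIONS:
        raise ValueError(f"unknown session: {inputs.session}")
    return inputs
```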
Resolve DeepResearch repo context
Invoke sue-context to discover the deepresearch repo root and load
memory/project.md, AGENTS.md, memory/sue/SCALE_UP.md, config/codex_sync.json,
config/sue-templates/runtime.yaml, and .codex/skills/AGENTS.md. Do not proceed
until the repo root is resolved.
Targets
| scope | target file |
|---|---|
| generic | <repo>/memory/sue/SCALE_UP.md |
| workspace-specific | <workspace_dir>/SCALE_UP.md |
| sandbox-specific | <repo>/deepresearch-sandbox/<SANDBOX>_SANDBOX.md |
| sandbox-convention | <repo>/deepresearch-sandbox/README.md (only when the lesson is a cross-backend convention) |
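The table above can be expressed as a small lookup. This is a sketch, assuming the repo root and workspace directory have already been resolved; target_file is a hypothetical helper name, and the uppercase normalization follows the workflow's rule for sandbox file names.

```python
from pathlib import Path


def target_file(scope, repo, workspace_dir=None, sandbox=None):
    """Map a lesson scope to the memory file it should be appended to."""
    repo = Path(repo)
    if scope == "generic":
        return repo / "memory" / "sue" / "SCALE_UP.md"
    if scope == "workspace-specific":
        if workspace_dir is None:
            raise ValueError("workspace-specific lessons need workspace_dir")
        return Path(workspace_dir) / "SCALE_UP.md"
    if scope == "sandbox-specific":
        if not sandbox:
            raise ValueError("sandbox-specific lessons need a sandbox name")
        # Backend names are normalized to uppercase for the file name.
        return repo / "deepresearch-sandbox" / f"{sandbox.upper()}_SANDBOX.md"
    if scope == "sandbox-convention":
        return repo / "deepresearch-sandbox" / "README.md"
    raise ValueError(f"unknown scope: {scope}")
```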
Workflow
1. Resolve the target path via sue-context.
   - For generic lessons: <repo>/memory/sue/SCALE_UP.md. If that file is missing, fall back to <repo>/../deepresearch-workspace/SCALE_UP.md for backward compatibility.
   - For workspace-specific lessons: <workspace_dir>/SCALE_UP.md. If the file does not exist, run sue-init first to create it with the standard session sections.
   - For sandbox-specific lessons: <repo>/deepresearch-sandbox/<SANDBOX>_SANDBOX.md. Normalize the backend name to uppercase. If the file does not exist, stop and report the missing path.
   - For cross-backend conventions: <repo>/deepresearch-sandbox/README.md.
2. Read the target file. Check whether an identical or equivalent lesson already exists. If it does, do not duplicate it; instead, strengthen the existing entry if the new evidence adds detail.
3. Ensure the right section exists.
   - For SCALE_UP.md targets: if ## <Session> is missing, add it.
   - For sandbox targets: find the matching rule section (e.g., ## Launch Rules, ## Preflight, ## Storage, ## Lifecycle Rules). If no section matches, add a new one at the same level as the existing sections.
4. Append the lesson using the format below. Use a short, stable title.
   - For sandbox targets, keep the entry reusable: use placeholders such as <account>, <scratch-root>, or <host> and put concrete values in the ignored deepresearch-sandbox/config_<sandbox>.txt.
5. If the lesson changes a skill's workflow, update the affected .codex/skills/sue-*/SKILL.md in the same turn (Codex-first; do not edit .claude/ mirrors directly).
6. Report the result. State which file was updated and the lesson title.
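The duplicate check, section check, and append steps of the workflow can be sketched as a pure function over the target file's text. This is an illustrative sketch only: append_lesson is a hypothetical helper, it works on SCALE_UP.md-style targets with ## <Session> headings, and it detects duplicates by exact title match, which is a narrower test than the "identical or equivalent" judgment the workflow asks for.

```python
def append_lesson(text, session, title, entry):
    """Return (new_text, changed); skip the append when the title already exists."""
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    if f"### {title}" in stripped:
        # Existing lesson: the caller should strengthen it instead of duplicating.
        return text, False
    entry_lines = entry.strip("\n").splitlines()
    section = f"## {session}"
    if section not in stripped:
        # Section missing: add it at the end of the file, then the entry.
        while lines and not lines[-1].strip():
            lines.pop()
        if lines:
            lines.append("")
        lines += [section, "", *entry_lines]
        return "\n".join(lines) + "\n", True
    start = stripped.index(section)
    # The section ends at the next level-2 heading, or at end of file.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].startswith("## ")),
        len(lines),
    )
    insert_at = end
    while insert_at > start + 1 and not lines[insert_at - 1].strip():
        insert_at -= 1
    new_lines = lines[:insert_at] + ["", *entry_lines]
    tail = lines[end:]
    if tail:
        new_lines += ["", *tail]
    return "\n".join(new_lines) + "\n", True
```

Keeping the function free of file I/O makes the behavior easy to check before any memory file is written; the caller reads the file, applies the function, and writes back only when changed is true.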
Lesson Format
### <Short lesson title>
- **sandbox**: LUMI | Snellius | RunPod | Brev | AutoDL | any
- **session**: env_prep | dataset_prep | interface_check | scripts_writing |
dryrun | fullrun | monitor | summarize | cleanup | diagnose | reset | audit
- **date**: YYYY-MM-DD
- **trigger**: What symptom or question revealed the issue?
- **wrong**: The mistake the agent made or almost made.
- **correct**: The required behavior, check, or command.
- **source**: workspace/<name> or session (private values redacted)
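A lesson entry in the format above can be rendered from its fields as follows. This is a minimal sketch: render_lesson is a hypothetical helper, the values in the usage note are placeholders rather than a real lesson, and redaction of private values is assumed to happen before the call.

```python
from datetime import date


def render_lesson(title, sandbox, session, trigger, wrong, correct, source, on=None):
    """Render one lesson entry in the documented Markdown format."""
    on = date.today() if on is None else on
    return "\n".join([
        f"### {title}",
        f"- **sandbox**: {sandbox}",
        f"- **session**: {session}",
        f"- **date**: {on.isoformat()}",  # YYYY-MM-DD
        f"- **trigger**: {trigger}",
        f"- **wrong**: {wrong}",
        f"- **correct**: {correct}",
        f"- **source**: {source}",
    ])
```

For example, render_lesson("Example title", "any", "dryrun", "<symptom>", "<mistake>", "<required check>", "workspace/<name>") produces an eight-line entry dated today that can be appended under the matching session section.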
Output Template
Recorded SUE lesson:
scope: generic | workspace-specific (<workspace>) | sandbox (<sandbox>) | sandbox-convention
target: <path>
sandbox: <sandbox>
session: <session>
title: <title>
skill updated: <sue-*/SKILL.md> | none
Anti-Patterns
- Do not record one-off accidents or environment-specific transient failures.
- Do not duplicate an existing lesson.
- Do not commit raw transcripts, secrets, project IDs, hostnames, or tokens.
- Do not put concrete private values i
Content truncated.