failure-classify

Name: failure-classify
Author: majiayu000

Install

mkdir -p .claude/skills/failure-classify && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/13648" && unzip -o skill.zip -d .claude/skills/failure-classify && rm skill.zip

Installs to .claude/skills/failure-classify

Activation

This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.

Group failing tests by naming convention so a real assertion failure is not hidden among tests that exist to verify a system rejects something. Use after running the test suite and before reading the report.


About this skill

failure-classify

Triage failing tests so a real regression does not get lost among tests that are expected to fail (because they verify that the system rejects bad input).

When to use

After every test run that produced at least one failing test.
Before triaging a CI failure manually.
In the report-render skill's input pipeline.

Do not use this skill to decide whether a regression is "important enough to fix". That is a human judgement. The skill only sorts.

Inputs

| Name | Required | Default | Description |
| --- | --- | --- | --- |
| `log_path` | yes | none | Path to the test runner's output. The skill scans for `✘` lines (or `FAIL` lines, depending on the language pattern). |
| `language` | no | `swift` | One of `swift`, `python`, `go`, `rust`. Picks the line-extraction regex and the classification patterns. |
| `self_test` | no | `false` | When `true`, runs the skill's internal unit tests and exits. Use after editing the classification rules. |

Output

| Channel | Content |
| --- | --- |
| stdout | A JSON object with three keys: `failures` (deduplicated list of failing test names), `failures_by_class` (counts per class), `failures_grouped` (test names bucketed by class). |
| exit code | `0` on success regardless of test outcomes. `1` if the log could not be read. `2` on self-test failure. |
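
Because the exit code is `0` regardless of test outcomes, a caller has to read the JSON on stdout to learn whether anything failed. The sketch below is one way a consumer might sanity-check that the three keys agree with each other. The function name `check_report` is hypothetical, and the invariant it checks (counts equal bucket sizes, buckets cover exactly the `failures` list) is an assumption inferred from the key descriptions, not something the skill documents.

```python
import json


def check_report(report):
    """Return a list of inconsistencies between the three keys (empty if none)."""
    problems = []
    counts = report["failures_by_class"]
    grouped = report["failures_grouped"]
    # Assumed invariant: every per-class count equals the size of its bucket.
    for cls in sorted(set(counts) | set(grouped)):
        if counts.get(cls, 0) != len(grouped.get(cls, [])):
            problems.append(f"count mismatch for {cls}")
    # Assumed invariant: the buckets together hold exactly the failing names.
    bucketed = [name for names in grouped.values() for name in names]
    if sorted(bucketed) != sorted(report["failures"]):
        problems.append("grouped names do not match the failures list")
    return problems


sample = json.loads("""
{
  "failures": ["barRejectsZero", "bazShouldBehave"],
  "failures_by_class": {"EXPECTED_FAILURE": 1, "ASSERTION_FAILURE": 1},
  "failures_grouped": {
    "EXPECTED_FAILURE": ["barRejectsZero"],
    "ASSERTION_FAILURE": ["bazShouldBehave"]
  }
}
""")
print(check_report(sample))  # prints []
```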

Classification taxonomy

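Seven classes. Order matters: the first matching pattern wins. A test name that matches no pattern falls into `ASSERTION_FAILURE` by default. `UNKNOWN` is the fallback for entries the skill could not classify at all, which usually means the runner produced an unexpected line format.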

| Class | Meaning | Trigger keyword (regex fragment) |
| --- | --- | --- |
| `EXPECTED_FAILURE` | the test verifies the system rejects something | `Reject`, `Refuse`, `ErrorContains`, `WithInvalid`, `DataCorrupted`, `InvalidDuration`, `LongSessionOnBattery` |
| `IO_BACKEND` | the test exercises a system API (IOKit, libusb, raw sockets) | language-specific prefix, e.g. `powerAssertion*` in Swift |
| `ENVIRONMENT_ERROR` | the test depends on a live system reading | `PowerSourceMonitor`, `BluetoothState`, `NetworkLink` |
| `TEST_SCAFFOLD` | the test depends on a missing setup artifact | reserved; no patterns in v0.1 |
| `ASSERTION_FAILURE` | real bug | default bucket |
| `PRECONDITION_MISSING` | required env var / file is missing | `requireEnv`, `skipIf` markers |
| `UNKNOWN` | could not classify | fallback (rare; usually means the runner produced an unexpected line format) |
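
The first-match-wins rule can be sketched as an ordered list of (class, regex) pairs. This is a minimal illustration, not the skill's actual code: the names `RULES` and `classify` are hypothetical, the keywords are copied from the table, and treating them as case-sensitive substring matches (with `powerAssertion` anchored as a prefix) is an assumption.

```python
import re

# Ordered (class, pattern) pairs; earlier entries win.
# Keywords come from the taxonomy table. TEST_SCAFFOLD has no patterns in v0.1.
RULES = [
    ("EXPECTED_FAILURE", re.compile(
        r"Reject|Refuse|ErrorContains|WithInvalid|DataCorrupted"
        r"|InvalidDuration|LongSessionOnBattery")),
    ("IO_BACKEND", re.compile(r"^powerAssertion")),  # Swift example prefix
    ("ENVIRONMENT_ERROR", re.compile(
        r"PowerSourceMonitor|BluetoothState|NetworkLink")),
    ("PRECONDITION_MISSING", re.compile(r"requireEnv|skipIf")),
]


def classify(test_name):
    """Return the class of the first rule that matches, else the default bucket."""
    for cls, pattern in RULES:
        if pattern.search(test_name):
            return cls
    return "ASSERTION_FAILURE"


print(classify("barRejectsZero"))                # prints EXPECTED_FAILURE
print(classify("powerAssertionRejectsInvalid"))  # prints EXPECTED_FAILURE
print(classify("powerAssertionCreated"))         # prints IO_BACKEND
print(classify("bazShouldBehave"))               # prints ASSERTION_FAILURE
```

The second call shows why order matters: the name matches both the `Reject` keyword and the `powerAssertion` prefix, and the earlier rule decides the class.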

Algorithm

1. Extract failing test names from the log. Swift Testing emits two `✘` lines per failing test (one for the issue, one for the summary). The skill deduplicates while keeping the first-seen order.
2. Classify each name against the patterns, first-match-wins.
3. Emit JSON: the deduplicated list, the per-class counts, and the names grouped by class.
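
The three steps above can be sketched end to end as follows. Everything here is an assumption made for illustration: the function names, the line-extraction regex (modelled on the `✘ Test name() ...` shape shown in the worked example), the reduced rule set, and the "recorded an issue" line in the sample log, which is a hypothetical stand-in for the first of the two `✘` lines.

```python
import json
import re

# Assumed line shape, modelled on the worked example: "✘ Test name() ...".
FAIL_LINE = re.compile(r"^✘ Test (\w+)\(")

# Deliberately tiny rule set; the full taxonomy has more classes.
RULES = [("EXPECTED_FAILURE", re.compile(r"Reject|Refuse"))]


def extract_failures(log_text):
    """Step 1: failing test names, deduplicated, in first-seen order."""
    seen = {}  # dicts keep insertion order, so this acts as an ordered set
    for line in log_text.splitlines():
        match = FAIL_LINE.match(line.strip())
        if match:
            seen.setdefault(match.group(1), None)
    return list(seen)


def classify(name):
    """Step 2: first matching rule wins, else the default bucket."""
    for cls, pattern in RULES:
        if pattern.search(name):
            return cls
    return "ASSERTION_FAILURE"


def build_report(log_text):
    """Step 3: assemble the list, the counts, and the grouped names."""
    failures = extract_failures(log_text)
    grouped = {}
    for name in failures:
        grouped.setdefault(classify(name), []).append(name)
    return {
        "failures": failures,
        "failures_by_class": {cls: len(names) for cls, names in grouped.items()},
        "failures_grouped": grouped,
    }


# The second line is a hypothetical issue line, added to exercise deduplication.
LOG = """\
✔ Test foo() passed after 0.001 seconds.
✘ Test barRejectsZero() recorded an issue at FooTests.swift:12:5: Expectation failed
✘ Test barRejectsZero() failed after 0.001 seconds with 1 issue.
✘ Test bazShouldBehave() failed after 0.001 seconds with 1 issue.
"""

print(json.dumps(build_report(LOG), indent=2))
```

Using a dict as an ordered set keeps deduplication and first-seen ordering in one pass, so `barRejectsZero` appears once even though two `✘` lines mention it.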

Worked example (Swift)

Log:

✔ Test foo() passed after 0.001 seconds.
✘ Test barRejectsZero() failed after 0.001 seconds with 1 issue.
✘ Test bazShouldBehave() failed after 0.001 seconds with 1 issue.

Output:

{
  "failures": ["barRejectsZero", "bazShouldBehave"],
  "failures_by_class": {
    "EXPECTED_FAILURE": 1,
    "ASSERTION_FAILURE": 1
  },
  "failures_grouped": {
    "EXPECTED_FAILURE": ["barRejectsZero"],
    "ASSERTION_FAILURE": ["bazShouldBehave"]
  }
}

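The first test matches the `Reject` keyword (in `barRejectsZero`), so it lands in `EXPECTED_FAILURE`. The second matches no pattern, falls into the default `ASSERTION_FAILURE` bucket, and is a real regression candidate.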

Anti-patterns

Treating EXPECTED_FAILURE as a green light. A failure classified as EXPECTED_FAILURE still failed. The classification only means the failure is the test's purpose, not that the system under test behaves correctly. The test author must read the message.
Adding patterns that match everything. Classification is first-match-wins, so a pattern like `.*` would swallow every test that reaches it into one bucket.
Assuming a runner that is not in the language list still works. The v0.1 patterns cover Swift Testing only. For other runners, the user must extend the patterns and add a self-test.
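
One way to guard against the catch-all anti-pattern when extending the rules is a small self-test that probes each pattern with names no rule is meant to match. This is a sketch under stated assumptions: `NEUTRAL_PROBES` and `catch_all_rules` are hypothetical names, and the skill's real self-test may work differently.

```python
import re

# Hypothetical probe names that no classification rule is meant to match.
NEUTRAL_PROBES = ["bazShouldBehave", "fooReturnsTotal", "x"]


def catch_all_rules(rules):
    """Return the classes whose pattern matches every neutral probe."""
    return [
        cls
        for cls, pattern in rules
        if all(re.search(pattern, probe) for probe in NEUTRAL_PROBES)
    ]


rules = [
    ("EXPECTED_FAILURE", r"Reject|Refuse"),
    ("IO_BACKEND", r".*"),  # deliberately broken: swallows every name that reaches it
]
print(catch_all_rules(rules))  # prints ['IO_BACKEND']
```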

Cross-references

report-render — consumes the JSON output and embeds it in the Markdown report.
drill — uses this skill to verify the loop reacts to a real failure.
drift-check — the previous step in the loop.

Source & maintenance

Updated

27d ago

License

MIT

Loads

~1,063 tokens
