gtkb-benchmarks

Name: gtkb-benchmarks
Author: Remaker-Digital

Run GT-KB read-only measurement benchmarks. Outputs JSON plus markdown summary.

Install

mkdir -p .claude/skills/gtkb-benchmarks && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/13810" && unzip -o skill.zip -d .claude/skills/gtkb-benchmarks && rm skill.zip

Installs to .claude/skills/gtkb-benchmarks

About this skill

GT-KB Benchmark Suite

Read-only measurement benchmarks for the GT-KB platform. Each benchmark computes a structured observation with a headline scalar plus per-dimension breakdown, and emits both JSON and markdown summary artifacts.
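As a rough illustration of what "a headline scalar plus per-dimension breakdown" could look like in code, here is a minimal sketch. The `Observation` class and its field names are hypothetical and not taken from the GT-KB source; the suite's actual schema may differ.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class Observation:
    """Hypothetical container for one benchmark result (not the real GT-KB type)."""

    benchmark_id: str
    headline: float  # the single headline scalar
    breakdown: dict[str, float] = field(default_factory=dict)  # per-dimension values

    def to_json(self) -> str:
        # Sorted keys keep the serialized form stable across runs.
        return json.dumps(asdict(self), sort_keys=True, indent=2)

    def to_markdown(self) -> str:
        # One header line, then a small table with one row per dimension.
        lines = [
            f"{self.benchmark_id}: {self.headline:.3f}",
            "",
            "| Dimension | Value |",
            "| --- | --- |",
        ]
        lines += [
            f"| {name} | {value:.3f} |"
            for name, value in sorted(self.breakdown.items())
        ]
        return "\n".join(lines)
```

Both renderings come from the same object, which is one simple way to keep the JSON and markdown artifacts consistent with each other.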

Operationalizes SPEC-1662 (GOV-18 Assertion Quality Standard) and GOV-ARTIFACT-ORIENTED-GOVERNANCE-001 per Self-Diagnostic Leak Closure Slice 2.

Benchmarks

ID	Question Answered
linkage_heatmap	What fraction of cross-artifact references survive across SPEC, WI, ADR or DCL or GOV, DELIB, BRIDGE pairs?
recall_coverage	What fraction of recent mutations cite prior-state evidence in change_reason?
tool_identification	What fraction of recent insertions carry a structured attribution marker?
deliberation_recall	What is the recall at 3 of the semantic index over recent owner-decision deliberations?
advisory_latency	What is the median wall-clock latency from advisory filing to first Prime acknowledgement?
assertion_signal_noise	What fraction of categorized assertions land outside chronic_noise (signal-bearing)?
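To make one row of the table concrete, the sketch below computes a fraction in the spirit of assertion_signal_noise: the share of categorized assertions that fall outside chronic_noise. The function name, the input shape (a flat list of category labels), and every category other than chronic_noise are assumptions for illustration only.

```python
from collections import Counter


def signal_fraction(categories):
    """Return (headline, breakdown) for a list of assertion category labels.

    Hypothetical helper: the real benchmark reads its inputs from GT-KB,
    and its category set is not documented here.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        # No categorized assertions in the window: no meaningful fraction.
        return None, {}
    noise = counts.get("chronic_noise", 0)
    headline = (total - noise) / total
    # Per-category share of all categorized assertions.
    breakdown = {name: n / total for name, n in counts.items()}
    return headline, breakdown
```

For example, `signal_fraction(["chronic_noise", "regression", "regression", "flaky"])` yields a headline of 0.75, since three of the four assertions are outside chronic_noise.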

Subcommands

run

Execute one or all benchmarks. Defaults to a one-year window ending now.

python -m scripts.benchmarks.cli run --all
python -m scripts.benchmarks.cli run --benchmark assertion_signal_noise
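The default window could be derived along these lines. This is a sketch under two assumptions that the text above does not confirm: that "one year" means 365 days, and that timestamps are UTC. The `default_window` name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def default_window(now=None):
    """Return (start, end) for a one-year window ending at `now`.

    Assumes 365 days and UTC; the CLI's actual behavior may differ.
    """
    end = now if now is not None else datetime.now(timezone.utc)
    start = end - timedelta(days=365)
    return start, end
```

Accepting `now` as a parameter keeps the function deterministic under test, which matters if the window bounds feed into a reproducible key.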

report

Print a previously emitted run summary.

python -m scripts.benchmarks.cli report --run-id 20260514-040000

compare

Diff two runs by idempotency_key and benchmark value.

python -m scripts.benchmarks.cli compare --baseline RUN_A --candidate RUN_B
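A comparison of two runs might be sketched as follows. The top-level keys idempotency_key and results come from the output contract, but the inner shape of results (a mapping from benchmark ID to an object with a "value" field) is an assumption, as is the `compare_runs` name.

```python
def compare_runs(baseline: dict, candidate: dict) -> dict:
    """Diff two run payloads by idempotency key and per-benchmark value.

    Assumes results looks like {"<benchmark_id>": {"value": <float>}}.
    """
    base = baseline.get("results", {})
    cand = candidate.get("results", {})
    deltas = {}
    for benchmark_id in sorted(set(base) | set(cand)):
        b = base.get(benchmark_id, {}).get("value")
        c = cand.get(benchmark_id, {}).get("value")
        # A delta is only defined when both runs report the benchmark.
        delta = c - b if b is not None and c is not None else None
        deltas[benchmark_id] = {"baseline": b, "candidate": c, "delta": delta}
    return {
        # Matching keys mean the two runs were computed over identical inputs.
        "same_inputs": baseline.get("idempotency_key") == candidate.get("idempotency_key"),
        "deltas": deltas,
    }
```

When `same_inputs` is true, any nonzero delta points at nondeterminism rather than a change in the underlying data.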

Output Contract

Each run writes two files under the runs directory:

run.json -- full structured payload (run_id, idempotency_key, results).
summary.md -- human-readable markdown summary table.

The idempotency_key is a SHA-256 hash of the window bounds, benchmark IDs, and source commit. Identical inputs over identical commits produce identical keys.
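A key with these properties could be computed roughly as shown below. The function name, the payload field names, and the canonical JSON serialization are assumptions; the source only states that the key is a SHA-256 hash of the window bounds, benchmark IDs, and source commit.

```python
import hashlib
import json


def idempotency_key(window_start, window_end, benchmark_ids, source_commit):
    """Hash the run inputs into a hex SHA-256 digest.

    Hypothetical encoding: the real suite may serialize its inputs differently,
    so keys produced here will not match keys in actual run.json files.
    """
    payload = {
        "window": [window_start, window_end],  # e.g. ISO 8601 strings
        "benchmarks": sorted(benchmark_ids),   # order-independent
        "commit": source_commit,
    }
    # Sorted keys and fixed separators give a byte-stable serialization.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Sorting the benchmark IDs before hashing is a design choice in this sketch: it makes the key independent of the order in which benchmarks were requested.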

Governing Artifacts

SPEC-1662 (GOV-18 Assertion Quality Standard)
GOV-ARTIFACT-ORIENTED-GOVERNANCE-001
GOV-STANDING-BACKLOG-001
ADR-DA-READ-SURFACE-PLACEMENT-001
DELIB-S312-DETERMINISTIC-SERVICES-PRINCIPLE

Source & maintenance

Updated: 20d ago
License: Proprietary - Remaker Digital
Loads: ~653 tokens
