agentskills.codes
agilab-ui-robot-validation

Validate AGILAB Streamlit UI changes with the repo's browser and widget robots. Use when touching ABOUT, PROJECT, ORCHESTRATE, ANALYSIS, SETTINGS, sidebar flows, first-proof wizard links, notebook import, screenshots, or public demo UI evidence.

Install

mkdir -p .claude/skills/agilab-ui-robot-validation && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/16556" && unzip -o skill.zip -d .claude/skills/agilab-ui-robot-validation && rm skill.zip

Installs to .claude/skills/agilab-ui-robot-validation

About this skill

AGILAB UI Robot Validation

Use this skill when a change affects user-visible Streamlit behavior, page navigation, sidebar actions, wizard links, notebook import/upload flows, UI screenshots, or public demo evidence.

The goal is to catch real browser/session-state failures that helper tests and static AppTest checks can miss: broken st.switch_page paths, recursive deep-links, hidden upload controls, stale sidebar state, and Streamlit exceptions that only appear after clicking through the UI.

Tool Choice

  • Use focused unit/helper tests first when the bug is pure Python state logic.
  • Use Streamlit AppTest when the failure is widget wiring, page hydration, or session-state initialization and no real browser behavior is needed.
  • Use tools/agilab_web_robot.py for browser-level entrypoint checks, hosted demo checks, screenshots, and notebook upload handoff behavior.
  • Use tools/agilab_widget_robot.py for page-by-page Streamlit widget flows, selected action buttons, artifact assertions, and stateful project journeys.
  • Use tools/agilab_widget_robot_matrix.py when the change touches navigation, sidebar project actions, notebook import, settings, first-launch, or broad UI behavior that must stay consistent across pages.
  • For Streamlit or React/agi-web page validation, inspect browser dev-log evidence too: console errors/warnings, pageerror, failed requests, and HTTP 4xx/5xx responses. A page is not validated just because the visible DOM rendered when the browser log shows a relevant runtime or asset failure.

Do not replace a deterministic helper regression with a slow robot. Robots are for user journeys, browser-only behavior, and release/public-demo evidence.

Preflight

  1. Confirm the repo is the source checkout you intend to test.
git status --short --branch --untracked-files=no
  2. Check the exact local workflow profile before inventing commands.
uv --preview-features extra-build-dependencies run python tools/workflow_parity.py --profile ui-robot-matrix --print-only
  3. If a change affects release evidence, also inspect the release shortcut.
./dev --print-only release
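
If the first preflight step needs to be scripted, the short status output can be split into a branch header and a list of changed paths. The following is a minimal sketch, not part of the repo's tooling: the function names parse_short_status and read_checkout_state are hypothetical, and the parser relies only on git's documented short format (a "## " branch line, then two status characters, a space, and the path).

```python
import subprocess


def parse_short_status(output: str) -> tuple[str, list[str]]:
    """Split `git status --short --branch` output into (branch header, changed paths)."""
    branch = ""
    changed: list[str] = []
    for line in output.splitlines():
        if line.startswith("## "):
            # Branch header, for example "main...origin/main [ahead 1]".
            branch = line[3:]
        elif line.strip():
            # Short format: two status characters, one space, then the path.
            changed.append(line[3:])
    return branch, changed


def read_checkout_state(repo_dir: str = ".") -> tuple[str, list[str]]:
    """Run the preflight status command and parse its output (hypothetical helper)."""
    result = subprocess.run(
        ["git", "status", "--short", "--branch", "--untracked-files=no"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_short_status(result.stdout)
```

An empty list of changed paths means the tracked files match the index and HEAD, which makes it easier to tie robot evidence to a known commit.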

Fast Local Commands

Use this for a first-launch smoke when the entry shell, ABOUT page, or default navigation changed:

UV_PYTHON=3.13 uv --preview-features extra-build-dependencies run python tools/first_launch_robot.py --json --output /tmp/agilab-first-launch-robot.json

Use this for Streamlit dependency, run-configuration, theme, or blank-page frontend issues. It launches the dev app, checks JS/CSS MIME types, then verifies the first page hydrates in Chromium:

UV_PYTHON=3.13 uv --preview-features extra-build-dependencies run --extra ui --with playwright python tools/agilab_web_robot.py \
  --frontend-smoke-only \
  --timeout 45 \
  --target-seconds 45 \
  --json \
  --screenshot-dir /tmp/agilab-frontend-smoke-screenshots \
  > /tmp/agilab-frontend-smoke.json
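
The MIME-type check described above can be illustrated with a small standalone function. This is a sketch under stated assumptions, not the logic inside tools/agilab_web_robot.py: the name mime_mismatch, the accepted media types, and the example URLs are all hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed acceptable media types per asset suffix (illustrative, not the robot's list).
EXPECTED_MEDIA_TYPES = {
    ".js": ("application/javascript", "text/javascript"),
    ".css": ("text/css",),
}


def mime_mismatch(url: str, content_type: str) -> bool:
    """Return True when a JS/CSS asset is served with an unexpected Content-Type."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    expected = EXPECTED_MEDIA_TYPES.get(suffix)
    if expected is None:
        # Not a JS/CSS asset, so this check has no opinion.
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type not in expected
```

A script served as text/plain is the kind of response a browser may refuse to execute, which leaves a blank page even though the HTTP status is 200.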

Use this for browser-level ABOUT and notebook handoff issues:

UV_PYTHON=3.13 uv --preview-features extra-build-dependencies run --extra ui --with playwright python tools/agilab_web_robot.py \
  --json \
  --screenshot-dir /tmp/agilab-web-robot-screenshots \
  > /tmp/agilab-web-robot.json

Use this for a selected page/action journey. Keep labels exact and fail if the requested action is missing:

AGILAB_WIDGET_ROBOT_RUNTIME_ISOLATION=current-home \
UV_PYTHON=3.13 uv --preview-features extra-build-dependencies run --with playwright python tools/agilab_widget_robot.py \
  --apps flight_project \
  --pages ORCHESTRATE \
  --apps-pages none \
  --json \
  --json-output /tmp/agilab-widget-robot.json \
  --progress-log /tmp/agilab-widget-robot.ndjson \
  --interaction-mode full \
  --action-button-policy click-selected \
  --click-action-labels "CHECK distribute" \
  --preselect-labels "Run now" \
  --missing-selected-action-policy fail \
  --runtime-isolation current-home
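
The .ndjson progress log written by --progress-log can be read one JSON document per line. The sketch below assumes only that the file is newline-delimited JSON; the record fields are not documented here, so the hypothetical helper read_ndjson returns the parsed records without interpreting them.

```python
import json
from pathlib import Path


def read_ndjson(path: str) -> list:
    """Parse a newline-delimited JSON file into a list, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            text = line.strip()
            if not text:
                continue
            try:
                records.append(json.loads(text))
            except json.JSONDecodeError as exc:
                # Report the offending line so a truncated log is easy to spot.
                raise ValueError(f"{path}:{line_number}: not valid JSON") from exc
    return records
```

Calling read_ndjson("/tmp/agilab-widget-robot.ndjson") after the run above would return one entry per progress record. A ValueError pointing at the last line usually means the robot was interrupted mid-write.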

Use this when an embedded ANALYSIS app surface must expose app-owned controls without firing callbacks. The text and button probes inspect the top page and child iframes, and --required-action-labels performs trial clicks only, so the buttons are checked for clickability but their callbacks never run:

UV_PYTHON=3.13 uv --preview-features extra-build-dependencies run --with playwright python tools/agilab_widget_robot_matrix.py \
  --scenario isolated-pytorch-playground-analysis \
  --json \
  --quiet-progress \
  --no-result-cache \
  --output-dir /tmp/agilab-pytorch-analysis-robot \
  --screenshot-dir /tmp/agilab-pytorch-analysis-robot-screenshots

Use explicit browser-error evidence when React/agi-web, custom components, iframes, or Streamlit frontend assets are part of the change. The widget robot captures Chromium console warnings/errors, pageerror, failed requests, and HTTP error responses into its JSON/progress evidence and failure bundle:

UV_PYTHON=3.13 uv --preview-features extra-build-dependencies run --with playwright python tools/agilab_widget_robot_matrix.py \
  --scenario isolated-browser-error-core-pages \
  --json \
  --quiet-progress \
  --no-result-cache \
  --output-dir /tmp/agilab-browser-error-robot \
  --screenshot-dir /tmp/agilab-browser-error-robot-screenshots
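
The rule that browser issues count as failures unless explicitly ignored can be sketched as a filter over captured events. The event shape used here (kind, level, status, url, text) is a hypothetical schema chosen for illustration, not the actual format of the robot's JSON or browser-issues.json output, and the function names are likewise invented.

```python
def is_failure(issue: dict) -> bool:
    """Classify one captured browser event (hypothetical schema) as a failure or not."""
    kind = issue.get("kind")
    if kind in ("pageerror", "requestfailed"):
        return True
    if kind == "console":
        # Only console errors fail here; warnings are left for manual review.
        return issue.get("level") == "error"
    if kind == "response":
        return int(issue.get("status", 0)) >= 400
    return False


def relevant_failures(issues: list, ignore_substrings: tuple = ()) -> list:
    """Return failing events that no explicit ignore rule matches."""
    failures = []
    for issue in issues:
        if not is_failure(issue):
            continue
        haystack = f"{issue.get('url', '')} {issue.get('text', '')}"
        if any(marker in haystack for marker in ignore_substrings):
            continue
        failures.append(issue)
    return failures
```

Keeping the ignore list explicit and small matches the intent above: a page that renders correctly still fails validation when an unignored 4xx/5xx response or page error is present.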

Use this before release or after broad navigation/sidebar work:

UV_PYTHON=3.13 uv --preview-features extra-build-dependencies run python tools/workflow_parity.py --profile ui-robot-matrix

Use this to inspect the exact sharded matrix commands without launching the robots:

uv --preview-features extra-build-dependencies run python tools/workflow_parity.py --profile ui-robot-matrix --print-only

Choosing Scenarios

  • ABOUT / first-proof wizard: run first_launch_robot.py, the focused ABOUT tests, and at least the matrix scenario that covers entry and app pages. When the first visible copy or product journey wording changes, update tools/first_launch_robot.py expectations in the same change so CI validates the current pitch instead of stale labels.
  • Streamlit dependency, pyproject.toml, run config, theme, or launch wrapper: run tools/agilab_web_robot.py --frontend-smoke-only first. This is the fastest real-browser guard for blank pages caused by static frontend assets being served with the wrong MIME type.
  • React/agi-web, custom components, canvas/WebGL, or embedded iframe changes: run a Chromium/Chrome browser robot and inspect the captured browser issues even when the page looks correct. Treat relevant console errors, page errors, failed asset/API requests, and HTTP 4xx/5xx responses as validation failures unless there is an explicit ignore rule.
  • PROJECT sidebar, create/import/rename/delete: run focused PROJECT tests plus matrix scenarios for project page, project-import-sidebar, project-rename-sidebar, and notebook import.
  • Notebook import/upload: run notebook-import helper tests plus agilab_web_robot.py when the file chooser, upload handoff, or built-in notebook route changed.
  • ORCHESTRATE action buttons: use agilab_widget_robot.py --action-button-policy click-selected with the exact visible button labels the end user is expected to press. If the button is intentionally not clicked by generic robots because it writes local state, launches external work, or is advisory-only, add or update its disposition in tools/ui_robot_action_contract.py and cover the behavior with focused helper/AppTest regressions. Examples include LAN discovery/cache controls and advisory planning actions such as Build cluster plan.
  • When a UI action keeps the same visible button label but changes semantics behind a selector or multiselect, update the robot action disposition for the exact visible label and add focused regressions for the selector state. Do not rename robot dispositions to internal semantics such as Update selected unless that is the actual button text users see.
  • SETTINGS or Streamlit system-menu changes: run settings page tests plus the settings matrix scenario.
  • Public demo or HF Space UI: run tools/hf_space_smoke.py --json first, then run the web robot against the hosted URL if the claim is about browser-visible behavior.

Evidence Rules

  • Save JSON summaries and progress logs under /tmp for local debugging, or under test-results/ only when the artifact is intentionally part of CI or release evidence.
  • For any browser validation, inspect the dev-log evidence before declaring the page valid. In widget/matrix runs this means checking the JSON/progress output and, on failure, the browser-issues.json file in the failure bundle. For manual Chrome validation, open DevTools Console and Network and report whether relevant console errors/warnings, pageerror equivalents, failed requests, or HTTP 4xx/5xx responses were present.
  • Use --screenshot-dir for browser/UI failures. Screenshots should include the manifest generated by the robot so evidence can be traced back to the command.
  • For full matrix runs, prefer the sharded ui-robot-matrix profile. CI keeps successful scenarios lightweight and reruns only failed scenarios with --retry-failed-with-artifacts, producing trace, HAR, and video evidence under each shard's failure-artifacts/ directory.
  • When diagnosing a matrix failure, inspect the aggregate artifact first. The ui-robot-matrix-aggregate-* report links the shard, failure bundle, replay command, artifact-retry status, and any trace/HAR/video directories.
  • Use tools/ui_robot_failure_replay.py <bundle> to print the exact command recorded in a failure bundle, and add --execute only when you intentionally want to rerun that recorded command.
  • In final notes, report the scenario name, command class, JSON output path, and whether screenshots were generated. Do not claim a full UI sweep when only one page or button was tested.
  • If a robot fails from environment setup, missing Playwright, port c

Content truncated.