agentskills.codes

cloudflare-workers-bot-scan-defense


Install

mkdir -p .claude/skills/cloudflare-workers-bot-scan-defense && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/14357" && unzip -o skill.zip -d .claude/skills/cloudflare-workers-bot-scan-defense && rm skill.zip

Installs to .claude/skills/cloudflare-workers-bot-scan-defense

Activation

This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.

Make a Cloudflare Workers app resilient to bot scans that arrive within minutes of HTTPS publication via CT Log enumeration. Use when deploying a new Worker (especially with auth or paid bindings), when budget/cost is a concern, or when you want to detect "/.env" / "/admin" / "/wp-login.php" / "/.git/config" probing. Covers the mental model (CT Log → bot scan → which paths actually cost you money), the edge-cache absorption that makes most scans free, the narrow set of unauthenticated routes that do need rate limiting (auth `begin`/`verify`), the exact `wrangler.jsonc` `observability` + `ratelimits` config (plus the wrangler 3.x `unsafe.bindings` fallback for projects still on v3), the IP-keyed Hono middleware pattern (with the fail-open variant), the verification flow via `wrangler versions view` + Workers Observability — including the trap that v3's `versions view` renders neither the `unsafe` ratelimit binding nor `observability`, and a credential-free verification path for sandboxed agents / keyless CI — and the documented eventual-consistency caveat that makes synthetic burst tests look like the limiter is broken.

About this skill

Cloudflare Workers Bot Scan Defense

The day you point HTTPS at a domain — even a *.workers.dev subdomain you haven't told anyone about — your hostname appears in Certificate Transparency Logs, and within minutes scanner bots start probing /.env, /.git/config, /admin, /wp-login.php, /index.php.bak, /.DS_Store, and dozens more. Family-only apps get scanned. Staging gets scanned. Internal tools get scanned. Even sites with no UI link anywhere get scanned.

The bots are cheap to run and free to scale, so they don't care if you're a Fortune 500 or a hobby project — they just throw the wordlist and harvest whatever responds.

This skill captures the mental model, the small set of changes that actually matter on Cloudflare Workers, and the verification flow.

When to use this skill

  • Deploying a Worker that will be reachable on HTTPS (with *.workers.dev or custom domain) — even if the URL is private
  • The Worker has unauthenticated routes that do CPU work (auth begin/verify, magic-link issuance, signup, captcha, public webhooks)
  • Cost is a concern: paid plan with Workers AI / D1 reads / outbound subrequests, or you're near a free-tier ceiling
  • You don't currently have observability — i.e., you couldn't answer "how many bot probes hit my Worker last night?" right now
  • Auditing an existing Worker's exposed surface to decide what to harden first

The exclusions below apply specifically to the rate-limit binding + middleware portion of this skill. Workers Observability (the first artefact in "The minimum viable defense" below) is universally beneficial and should still be enabled even in the cases listed here — it costs nothing and gives you visibility regardless of how requests are gated.

Do not apply the rate-limit binding portion for:

  • A Worker that returns hard 401/403 with no DB hit on every unauthenticated route (you're already fine — bots can't drain you)
  • A Worker fronted by Cloudflare Access / IP allowlist where every route is already gated above the Worker layer (Access returns 302/403 before the Worker is invoked, so there's nothing left to rate-limit). Caveat: a custom domain protected by Access does not automatically protect the <worker-name>.<account>.workers.dev URL — that endpoint stays open by default and bypasses your Access policy. Set workers_dev: false in wrangler.jsonc to close it (recommended), or attach a separate Access application targeting the *.workers.dev hostname (Dashboard → Zero Trust → Access → Applications → Add → Self-hosted, with the workers.dev hostname). Verify with curl -I https://<worker-name>.<account>.workers.dev/ after deploy — expect a 302 redirect or 403.
  • DDoS-grade attacks (you need WAF / Cloudflare Pro+ rules, not just a Worker binding)
  • Application-level brute force (e.g., trying credentials against a known username) — that's auth-brute-force territory and needs per-account lockout, not just per-IP rate limit

The mental model — what bots actually drain

Most bot scans target paths that don't exist in your app: /.env, /.git/config, /wp-admin/, etc. On a Cloudflare Workers + SPA setup with not_found_handling: "single-page-application", those paths fall through to the SPA fallback (index.html). After the first Worker invocation for each unique URL, the Cloudflare edge caches the response and serves subsequent requests with cf-cache-status: HIT, so the Worker is not re-invoked. Bot scanners hammer the same wordlist URLs, so the long tail is absorbed by the edge cache; only the first request per unique path costs you a Worker invocation. D1 isn't touched, because the SPA fallback path never queries the database, and cached responses incur no Worker CPU time. For the wordlist portion of a scan, which is the overwhelming majority of the traffic, you're already fine.

This holds regardless of run_worker_first: true|false. With run_worker_first: true, the first request per path invokes the Worker, which falls through to c.env.ASSETS.fetch(c.req.raw) for unknown paths; the response is cacheable and the edge memoizes it. With run_worker_first: false, the asset binding serves directly without Worker invocation. Either way, repeat scans hit cache.
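The absorption argument above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: a single cache, no TTL or eviction, and every response treated as cacheable. Real edge caches are per-location and expire entries, so real counts will be somewhat higher. `countWorkerInvocations` is a hypothetical helper, not a Cloudflare API.

```typescript
// Toy model (assumptions: one cache, no expiry, every response cacheable).
function countWorkerInvocations(requestPaths: string[]): number {
  const cached = new Set<string>();
  let invocations = 0;
  for (const path of requestPaths) {
    if (!cached.has(path)) {
      invocations += 1; // cache MISS: the first request for this path does real work
      cached.add(path);
    }
    // cache HIT: served from the edge, nothing counted
  }
  return invocations;
}

const wordlist = ["/.env", "/.git/config", "/admin", "/wp-login.php"];

// A scanner replaying the same wordlist 250 times: 1000 requests in total.
const scan: string[] = [];
for (let i = 0; i < 250; i++) scan.push(...wordlist);

console.log(scan.length, countWorkerInvocations(scan)); // 1000 4
```

Under these assumptions the cost scales with the number of unique paths rather than the number of requests, which is why the wordlist itself is rarely the problem.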

What's actually expensive is the small set of unauthenticated routes that do real work:

| Route | Cost per call | Bot drain risk |
|---|---|---|
| POST /api/auth/login/begin (WebAuthn challenge) | crypto + JWT sign + cookie | High: challenge generation is CPU |
| POST /api/auth/register/begin | DB read + crypto | High if registration is open |
| POST /api/auth/login/verify | DB read + crypto | High: D1 read fires before validation can short-circuit |
| GET /health | constant body | Low: cheap and harmless |
| /api/* with session middleware | nothing if no cookie (just 401) | Low: no D1 touch on missing cookie |

The defense you actually need is a small one: observability everywhere, rate limit on the 3-5 routes that do CPU work pre-auth. Don't bother adding rate limit to every /api/* endpoint — sessionMiddleware + missing cookie already returns 401 in microseconds without a DB hit.

Audit your attack surface in 2 minutes

Before you change anything, probe what's actually exposed. The output tells you which paths are edge-cached (free) vs. which paths reach the Worker (potentially expensive).

BASE=https://your-app.example.workers.dev

for path in "/" "/.env" "/.git/config" "/admin" "/wp-login.php" "/.DS_Store" \
            "/api" "/api/cats" "/api/auth/me" "/api/auth/login/begin" "/health" "/robots.txt"; do
  printf '%-25s ' "$path"
  curl -skI --max-time 8 "$BASE$path" \
    | grep -iE '^(HTTP/|cf-cache-status|content-type)' \
    | tr '\n' ' '
  echo
done

What to look for:

  • cf-cache-status: HIT on bogus paths (/.env, /admin, ...) → edge is absorbing them, no Worker invocation. You don't need to do anything for these.
  • HTTP/2 401 on /api/* with application/json → session middleware is short-circuiting, good. Verify no DB query happens (read worker/middleware/session.ts).
  • HTTP/2 200 on a public unauthenticated POST endpoint (e.g., /api/auth/login/begin) → this is your protect-with-rate-limit target.
  • HTTP/2 500 on anything random → suspicious. The app.onError handler should return a generic {error:{type:"internal"}} 500 — never a stack trace. Fix this before adding rate limit.
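The triage rules in this list can be written down as a small classifier. This is a minimal sketch, not part of the skill: the `ProbeResult` shape, the verdict names, and `classifyProbe` are all hypothetical, and the input is assumed to be the status line and cf-cache-status header you collected with the loop above.

```typescript
// Hypothetical record of one probe: the method used, the HTTP status,
// and the cf-cache-status header value if the response carried one.
type ProbeResult = {
  method: "GET" | "HEAD" | "POST";
  status: number;
  cacheStatus?: string;
};

type Verdict =
  | "edge-absorbed"      // cache HIT: no Worker invocation, nothing to do
  | "short-circuited"    // 401/403: gated, but confirm no DB query happens first
  | "rate-limit-target"  // public POST that succeeds: protect with rate limiting
  | "investigate"        // 5xx: fix the error handling before anything else
  | "review-manually";   // anything the rules above do not cover

function classifyProbe(p: ProbeResult): Verdict {
  if (p.status >= 500) return "investigate";
  if (p.cacheStatus?.toUpperCase() === "HIT") return "edge-absorbed";
  if (p.status === 401 || p.status === 403) return "short-circuited";
  if (p.method === "POST" && p.status >= 200 && p.status < 300) {
    return "rate-limit-target";
  }
  return "review-manually";
}

console.log(classifyProbe({ method: "HEAD", status: 200, cacheStatus: "HIT" })); // edge-absorbed
```

The 5xx check runs first on purpose: a server error on a random path deserves attention regardless of how the response was cached.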

Full probing recipe in references/attack-surface-audit.md.

The minimum viable defense

Three artefacts. None requires a paid plan.

1. Enable Workers Observability

Add to wrangler.jsonc:

{
  "observability": {
    "enabled": true,
    "head_sampling_rate": 1
  }
}

head_sampling_rate: 1 logs 100% of requests. For a low-traffic family/internal app this is fine; for a high-traffic public app drop to 0.1 or 0.01 to control cost. Without this block the Workers Observability dataset is empty for your script, so you cannot see scan traffic at all.

2. Add a Workers Rate Limit binding

{
  "ratelimits": [
    {
      "name": "AUTH_RATE_LIMITER",
      "namespace_id": "1001",
      "simple": {
        "limit": 30,
        "period": 60
      }
    }
  ]
}

namespace_id is an account-unique integer string (any number you pick). simple.period must be 10 or 60 — other values fail config validation. Two bindings sharing the same namespace_id share counters, which is intentional if you want "one rate limit across multiple Workers".
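Because a bad value only surfaces at deploy time, it can help to check these constraints yourself first. The sketch below is a hypothetical pre-deploy check, not something wrangler provides. It encodes the period rule stated above; treating namespace_id and limit as positive integers is an assumption of this sketch. `sharedNamespaceIds` flags bindings that would share counters, so you can confirm the sharing is intentional.

```typescript
type RateLimitBindingConfig = {
  name: string;
  namespace_id: string;
  simple: { limit: number; period: number };
};

// Returns a list of human-readable problems; an empty list means the config passed.
function validateRateLimitConfig(cfg: RateLimitBindingConfig): string[] {
  const problems: string[] = [];
  // Assumption: namespace_id is a positive integer written as a string.
  if (!/^[1-9]\d*$/.test(cfg.namespace_id)) {
    problems.push("namespace_id must be a positive integer string");
  }
  // From the text above: only 10 or 60 seconds are accepted.
  if (cfg.simple.period !== 10 && cfg.simple.period !== 60) {
    problems.push("simple.period must be 10 or 60");
  }
  // Assumption: a limit below 1 is never what you want.
  if (!Number.isInteger(cfg.simple.limit) || cfg.simple.limit < 1) {
    problems.push("simple.limit must be a positive integer");
  }
  return problems;
}

// namespace_ids used by more than one binding (these share counters).
function sharedNamespaceIds(configs: RateLimitBindingConfig[]): string[] {
  const counts = new Map<string, number>();
  for (const cfg of configs) {
    counts.set(cfg.namespace_id, (counts.get(cfg.namespace_id) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, n]) => n > 1)
    .map(([id]) => id);
}

const authLimiter: RateLimitBindingConfig = {
  name: "AUTH_RATE_LIMITER",
  namespace_id: "1001",
  simple: { limit: 30, period: 60 },
};

console.log(validateRateLimitConfig(authLimiter)); // []
```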

Wrangler 3.x fallback (no top-level ratelimits key)

The top-level ratelimits key above requires wrangler 4.36.0+. On a project still on wrangler 3.x, that key is rejected — but the same binding is available through the unsafe.bindings escape hatch with type: "ratelimit". The runtime binding is identical (env.AUTH_RATE_LIMITER.limit(...)); only the config shape differs:

{
  // wrangler 3.x: top-level `ratelimits` is unsupported — use the unsafe form.
  "unsafe": {
    "bindings": [
      {
        "name": "AUTH_RATE_LIMITER",
        "type": "ratelimit",
        "namespace_id": "1001",
        "simple": { "limit": 30, "period": 60 }
      }
    ]
  }
}

This was verified working on wrangler 3.x. Note the cost: unsafe bindings are not validated by wrangler at config-parse time, and (see "Verify after deploy") wrangler 3.x's versions view does not render them, so a typo here fails silently. Prefer upgrading to 4.36+ and the top-level form when you can; use this only when the upgrade is out of scope. Pair it with the fail-open middleware (below) so a missing/misnamed binding degrades to "no rate limit" rather than locking every user out.

RateLimit is a global type from @cloudflare/workers-types — no import needed in a worker tsconfig. Add it to your Bindings:

type Bindings = {
  AUTH_RATE_LIMITER: RateLimit;
  // ... other bindings
};

3. Apply per-route in Hono via middleware

// worker/middleware/rate-limit.ts
import { createMiddleware } from "hono/factory";
import type { Env } from "../types";

export const authRateLimit = createMiddleware<Env>(async (c, next) => {
  const ip = c.req.header("CF-Connecting-IP") ?? "unknown";
  const { success } = await c.env.AUTH_RATE_LIMITER.limit({ key: ip });
  if (!success) {
    return c.json({ error: { type: "rate_limited", message: "Too many requests" } }, 429);
  }
  await next();
});

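// worker/routes/auth.ts: apply only to the unauthenticated CPU-spending routes
import { Hono } from "hono";
import type { Env } from "../types";
import { authRateLimit } from "../middleware/rate-limit";

export const authRoutes = new Hono<Env>()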
  .post("/register/begin", authRateLimit, async (c) => { /* ... */ })
  .post("/register/verify", authRateLimit, async (c) => { /* ... */ })
  .post("/login/begin", authRateLimit, async (c) => { /* ... */ })
  .post("/login/verify", authRateLimit, async (c) => { /* ... */ });
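The fail-open behaviour mentioned earlier (a missing or misnamed binding should degrade to "no rate limit" instead of locking users out) can be isolated in one small function. The skill's own fail-open variant is not shown in this excerpt, so treat the following as a hypothetical sketch: `RateLimiterLike` and `allowRequest` are illustrative names, and the interface only mirrors the `limit({ key })` call used in the middleware above.

```typescript
// Minimal shape of the binding as used above; in a Worker the real type is the global RateLimit.
interface RateLimiterLike {
  limit(options: { key: string }): Promise<{ success: boolean }>;
}

// Resolves to true when the request may proceed.
async function allowRequest(
  limiter: RateLimiterLike | undefined,
  key: string,
): Promise<boolean> {
  if (!limiter) return true; // binding missing or misnamed: fail open
  try {
    const { success } = await limiter.limit({ key });
    return success; // false means the caller should answer 429
  } catch {
    return true; // limiter threw: fail open rather than lock every user out
  }
}

// Stand-in limiters for a quick local check (not real bindings).
const alwaysDeny: RateLimiterLike = { limit: async () => ({ success: false }) };
const alwaysThrow: RateLimiterLike = {
  limit: async () => {
    throw new Error("binding unavailable");
  },
};
```

Inside the middleware, the direct `c.env.AUTH_RATE_LIMITER.limit(...)` call would become `await allowRequest(c.env.AUTH_RATE_LIMITER, ip)`, returning the 429 response when it resolves to false. The trade-off is deliberate: a misconfiguration silently disables the limiter, which is why the observability step matters.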

Don't apply it to authenticated routes — sessionMiddleware already gates those, and rate-limi


Content truncated.
