agentskills.codes
GC

gcp-cloud-run

Specialized skill for building production-ready serverless

Install

mkdir -p .claude/skills/gcp-cloud-run-tjsndhu && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/15494" && unzip -o skill.zip -d .claude/skills/gcp-cloud-run-tjsndhu && rm skill.zip

Installs to .claude/skills/gcp-cloud-run-tjsndhu

Activation

This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.

Specialized skill for building production-ready serverless
58 charsno explicit “when” trigger

About this skill

GCP Cloud Run

Specialized skill for building production-ready serverless applications on GCP. Covers Cloud Run services (containerized), Cloud Run Functions (event-driven), cold start optimization, and event-driven architecture with Pub/Sub.

Principles

  • Cloud Run for containers, Functions for simple event handlers
  • Optimize for cold starts with startup CPU boost and min instances
  • Set concurrency based on workload (start with 8, adjust)
  • Memory includes /tmp filesystem - plan accordingly
  • Use VPC Connector only when needed (adds latency)
  • Containers should start fast and be stateless
  • Handle signals gracefully for clean shutdown

Patterns

Cloud Run Service Pattern

Containerized web service on Cloud Run

When to use: Web applications and APIs,Need any runtime or library,Complex services with multiple endpoints,Stateless containerized workloads

# Dockerfile - Multi-stage build for smaller image
FROM node:20-slim AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production

FROM node:20-slim
WORKDIR /app

# Copy only production dependencies
COPY --from=builder /app/node_modules ./node_modules
COPY src ./src
COPY package.json ./

# Cloud Run uses PORT env variable
ENV PORT=8080
EXPOSE 8080

# Run as non-root user
USER node

CMD ["node", "src/index.js"]
// src/index.js
const express = require('express');
const app = express();

app.use(express.json());

// Health check endpoint
app.get('/health', (req, res) => {
  res.status(200).send('OK');
});

// API routes
app.get('/api/items/:id', async (req, res) => {
  try {
    const item = await getItem(req.params.id);
    res.json(item);
  } catch (error) {
    console.error('Error:', error);
    res.status(500).json({ error: 'Internal server error' });
  }
});

// Graceful shutdown
process.on('SIGTERM', () => {
  console.log('SIGTERM received, shutting down gracefully');
  server.close(() => {
    console.log('Server closed');
    process.exit(0);
  });
});

const PORT = process.env.PORT || 8080;
const server = app.listen(PORT, () => {
  console.log(`Server listening on port ${PORT}`);
});
# cloudbuild.yaml
steps:
  # Build the container image
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA', '.']

  # Push the container image
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA']

  # Deploy to Cloud Run
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'deploy'
      - 'my-service'
      - '--image=gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA'
      - '--region=us-central1'
      - '--platform=managed'
      - '--allow-unauthenticated'
      - '--memory=512Mi'
      - '--cpu=1'
      - '--min-instances=1'
      - '--max-instances=100'
      - '--concurrency=80'
      - '--cpu-boost'

images:
  - 'gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA'

Structure

project/ ├── Dockerfile ├── .dockerignore ├── src/ │ ├── index.js │ └── routes/ ├── package.json └── cloudbuild.yaml

Gcloud_deploy

Direct gcloud deployment

gcloud run deploy my-service
--source .
--region us-central1
--allow-unauthenticated
--memory 512Mi
--cpu 1
--min-instances 1
--max-instances 100
--concurrency 80
--cpu-boost

Cloud Run Functions Pattern

Event-driven functions (formerly Cloud Functions)

When to use: Simple event handlers,Pub/Sub message processing,Cloud Storage triggers,HTTP webhooks

// HTTP Function
// index.js
const functions = require('@google-cloud/functions-framework');

functions.http('helloHttp', (req, res) => {
  const name = req.query.name || req.body.name || 'World';
  res.send(`Hello, ${name}!`);
});
// Pub/Sub Function
const functions = require('@google-cloud/functions-framework');

functions.cloudEvent('processPubSub', (cloudEvent) => {
  // Decode Pub/Sub message
  const message = cloudEvent.data.message;
  const data = message.data
    ? JSON.parse(Buffer.from(message.data, 'base64').toString())
    : {};

  console.log('Received message:', data);

  // Process message
  processMessage(data);
});
// Cloud Storage Function
const functions = require('@google-cloud/functions-framework');

functions.cloudEvent('processStorageEvent', async (cloudEvent) => {
  const file = cloudEvent.data;

  console.log(`Event: ${cloudEvent.type}`);
  console.log(`Bucket: ${file.bucket}`);
  console.log(`File: ${file.name}`);

  if (cloudEvent.type === 'google.cloud.storage.object.v1.finalized') {
    await processUploadedFile(file.bucket, file.name);
  }
});
# Deploy HTTP function
gcloud functions deploy hello-http \
  --gen2 \
  --runtime nodejs20 \
  --trigger-http \
  --allow-unauthenticated \
  --region us-central1

# Deploy Pub/Sub function
gcloud functions deploy process-messages \
  --gen2 \
  --runtime nodejs20 \
  --trigger-topic my-topic \
  --region us-central1

# Deploy Cloud Storage function
gcloud functions deploy process-uploads \
  --gen2 \
  --runtime nodejs20 \
  --trigger-event-filters="type=google.cloud.storage.object.v1.finalized" \
  --trigger-event-filters="bucket=my-bucket" \
  --region us-central1

Cold Start Optimization Pattern

Minimize cold start latency for Cloud Run

When to use: Latency-sensitive applications,User-facing APIs,High-traffic services

1. Enable Startup CPU Boost

gcloud run deploy my-service \
  --cpu-boost \
  --region us-central1

2. Set Minimum Instances

gcloud run deploy my-service \
  --min-instances 1 \
  --region us-central1

3. Optimize Container Image

# Use distroless for minimal image
FROM node:20-slim AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production

FROM gcr.io/distroless/nodejs20-debian12
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY src ./src
CMD ["src/index.js"]

4. Lazy Initialize Heavy Dependencies

// Lazy load heavy libraries
let bigQueryClient = null;

function getBigQueryClient() {
  if (!bigQueryClient) {
    const { BigQuery } = require('@google-cloud/bigquery');
    bigQueryClient = new BigQuery();
  }
  return bigQueryClient;
}

// Only initialize when needed
app.get('/api/analytics', async (req, res) => {
  const client = getBigQueryClient();
  const results = await client.query({...});
  res.json(results);
});

5. Increase Memory (More CPU)

# Higher memory = more CPU during startup
gcloud run deploy my-service \
  --memory 1Gi \
  --cpu 2 \
  --region us-central1

Optimization_impact

  • Startup_cpu_boost: 50% faster cold starts
  • Min_instances: Eliminates cold starts for traffic spikes
  • Distroless_image: Smaller attack surface, faster pull
  • Lazy_init: Defers heavy loading to first request

Concurrency Configuration Pattern

Proper concurrency settings for Cloud Run

When to use: Need to optimize instance utilization,Handle traffic spikes efficiently,Reduce cold starts

Understanding Concurrency

# Default concurrency is 80
# Adjust based on your workload

# For I/O-bound workloads (most web apps)
gcloud run deploy my-service \
  --concurrency 80 \
  --cpu 1

# For CPU-bound workloads
gcloud run deploy my-service \
  --concurrency 1 \
  --cpu 1

# For memory-intensive workloads
gcloud run deploy my-service \
  --concurrency 10 \
  --memory 2Gi

Node.js Concurrency

// Node.js is single-threaded but handles I/O concurrently
// Use async/await for all I/O operations

// GOOD - async I/O
app.get('/api/data', async (req, res) => {
  const [users, products] = await Promise.all([
    fetchUsers(),
    fetchProducts()
  ]);
  res.json({ users, products });
});

// BAD - blocking operation
app.get('/api/compute', (req, res) => {
  const result = heavyCpuOperation(); // Blocks other requests!
  res.json(result);
});

Python Concurrency with Gunicorn

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

# 4 workers for concurrency
CMD exec gunicorn --bind :$PORT --workers 4 --threads 2 main:app
# main.py
from flask import Flask
app = Flask(__name__)

@app.route('/api/data')
def get_data():
    return {'status': 'ok'}

Concurrency_guidelines

  • Concurrency=1: Only for CPU-bound or unsafe code
  • Concurrency=8 20: Memory-intensive workloads
  • Concurrency=80: Default, good for I/O-bound
  • Concurrency=250: Maximum, for very lightweight handlers

Pub/Sub Integration Pattern

Event-driven processing with Cloud Pub/Sub

When to use: Asynchronous message processing,Decoupled microservices,Event-driven architecture

Push Subscription to Cloud Run

# Create topic
gcloud pubsub topics create orders

# Create push subscription to Cloud Run
gcloud pubsub subscriptions create orders-push \
  --topic orders \
  --push-endpoint https://my-service-xxx.run.app/pubsub \
  --ack-deadline 600
// Handle Pub/Sub push messages
const express = require('express');
const app = express();
app.use(express.json());

app.post('/pubsub', async (req, res) => {
  // Verify the request is from Pub/Sub
  if (!req.body.message) {
    return res.status(400).send('Invalid Pub/Sub message');
  }

  try {
    // Decode message data
    const message = req.body.message;
    const data = message.data
      ? JSON.parse(Buffer.from(message.data, 'base64').toString())
      : {};

    console.log('Processing order:', data);

    await processOrder(data);

    // Return 200 to acknowledge
    res.status(200).send('OK');
  } catch (error) {
    console.error('Processing failed:', error);
    // Return 500 to trigger retry
    res.status(500).send('Processing failed');
  }
});

Publishing Messages

const { PubSub } = require('@google-cloud/pubsub');
con

---

*Content truncated.*

Search skills

Search the agent skills registry