logging-monitoring

Name: logging-monitoring
Author: jnPiyush


Install

mkdir -p .claude/skills/logging-monitoring && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/16629" && unzip -o skill.zip -d .claude/skills/logging-monitoring && rm skill.zip

Installs to .claude/skills/logging-monitoring

Activation

This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.

Implement observability patterns including structured logging, log levels, correlation IDs, metrics, and distributed tracing. Use when adding structured logging, implementing correlation IDs for request tracing, configuring metrics collection, setting up distributed tracing, or designing alerting rules.


About this skill

Logging & Monitoring

Purpose: Implement observability for production systems.
Goal: Structured logs, correlation across requests, actionable metrics.
Note: For implementation, see C# Development or Python Development.

When to Use This Skill

Adding structured logging to applications
Implementing request correlation IDs
Configuring metrics collection
Setting up distributed tracing (OpenTelemetry)
Designing alerting rules and health checks

Prerequisites

Logging framework installed
Monitoring platform access

Decision Tree

Observability concern?
+- What to log?
|  +- Request start/end -> INFO with correlation ID
|  +- Expected errors -> WARN (validation, not-found)
|  +- Unexpected errors -> ERROR with stack trace
|  \- Debug details -> DEBUG (disabled in production)
+- What NOT to log?
|  \- PII, passwords, tokens, credit cards -> NEVER
+- Metrics needed?
|  +- RED metrics: Rate, Errors, Duration (for services)
|  \- USE metrics: Utilization, Saturation, Errors (for resources)
+- Distributed tracing?
|  \- OpenTelemetry for cross-service correlation
\- Alerting?
   +- SLO-based: alert on error budget burn rate
   \- Avoid alert fatigue: page only for actionable issues
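
The "alert on error budget burn rate" branch can be made concrete with a little arithmetic. The following is a minimal sketch, not a prescribed implementation: the function names `burn_rate` and `should_page` are hypothetical, and the default threshold of 14.4 is only an illustrative value (it corresponds to spending 2% of a 30-day budget in one hour) that should be tuned per service.

```python
def burn_rate(error_count: int, request_count: int, slo_target: float) -> float:
    """Return how fast the error budget is being consumed.

    1.0 means the budget would be spent exactly at the end of the SLO window;
    higher values mean it runs out sooner.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, e.g. 0.999")
    if request_count == 0:
        return 0.0  # no traffic, so nothing was burned
    error_budget = 1.0 - slo_target              # allowed error ratio
    observed_error_ratio = error_count / request_count
    return observed_error_ratio / error_budget


def should_page(short_window_rate: float, long_window_rate: float,
                threshold: float = 14.4) -> bool:
    """Page only when both a short and a long window exceed the threshold.

    Requiring both windows is one way to avoid paging on brief spikes.
    The default threshold is an assumption for illustration.
    """
    return short_window_rate >= threshold and long_window_rate >= threshold
```

For example, 10 errors in 1,000 requests against a 99.9% SLO gives a burn rate of roughly 10, which would not page under the illustrative default but would under a lower threshold.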

Structured Logging

Concept

Log structured data (key-value pairs) instead of plain text for better searchability and analysis.

[FAIL] Unstructured (hard to parse):
  "User user@example.com logged in from 192.168.1.1 at 2024-01-15 10:30:00"

[PASS] Structured (machine-readable):
  {
    "event": "user_login",
    "user_email": "user@example.com",
    "ip_address": "192.168.1.1",
    "timestamp": "2024-01-15T10:30:00Z",
    "level": "INFO"
  }

Benefits

Searchable: Query by any field
Filterable: Show only errors, specific users, etc.
Aggregatable: Count events, calculate averages
Parseable: Tools can process automatically
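
One way to produce entries in the structured form described above is a custom formatter on Python's standard `logging` module. This is a sketch under assumptions: the names `JsonFormatter` and `build_logger` are hypothetical, and the output field names (`timestamp`, `level`, `event`) are chosen only to mirror the example earlier in this section.

```python
import json
import logging
from datetime import datetime, timezone

# Attributes present on every LogRecord; anything else was supplied via `extra=`.
_STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None))) | {
    "message",
    "asctime",
}


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Copy caller-supplied fields passed via `extra={...}`.
        for key, value in vars(record).items():
            if key not in _STANDARD_ATTRS:
                entry[key] = value
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        # default=str keeps non-JSON types (dates, UUIDs) from raising.
        return json.dumps(entry, default=str)


def build_logger(name: str = "app") -> logging.Logger:
    """Attach a JSON-formatting stream handler to the named logger."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A call such as `build_logger().info("user_login", extra={"user_id": "u_123"})` would then emit one JSON line with `user_id` as its own queryable field. Keys passed through `extra` must not collide with built-in record attributes such as `name` or `message`.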

Log Levels

Standard Levels

Level	When to Use	Example
TRACE	Very detailed debugging	"Entering function with params: {x: 1, y: 2}"
DEBUG	Debugging information	"Cache hit for key: user_123"
INFO	Normal operations	"User logged in", "Order created"
WARN	Unexpected but recoverable	"Retry attempt 2 of 3", "Rate limit approaching"
ERROR	Failures requiring attention	"Payment failed", "Database connection lost"
FATAL	Application cannot continue	"Out of memory", "Configuration invalid"

Level Configuration by Environment

Development: DEBUG or TRACE
 - See detailed information for debugging

Staging: INFO
 - Normal operations plus warnings/errors

Production: INFO (or WARN)
 - Reduce noise, focus on significant events
 - Keep ERROR/FATAL always enabled
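
The per-environment defaults above could be wired up roughly as follows. This is a sketch, not a required layout: the environment variable names `APP_ENV` and `LOG_LEVEL` and the function names are assumptions. Python's standard library has no TRACE level, so only the standard names (plus the `WARN` and `FATAL` aliases) are mapped.

```python
import logging
import os
from typing import Optional

# Defaults mirroring the list above (environment names are hypothetical).
DEFAULT_LEVELS = {
    "development": logging.DEBUG,
    "staging": logging.INFO,
    "production": logging.INFO,
}

# Level names accepted as an explicit override.
LEVEL_NAMES = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARN": logging.WARNING,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "FATAL": logging.CRITICAL,
    "CRITICAL": logging.CRITICAL,
}


def resolve_log_level(environment: str, override: Optional[str] = None) -> int:
    """Pick a numeric logging level for the given environment.

    A valid `override` wins; unknown environments fall back to INFO.
    """
    if override and override.upper() in LEVEL_NAMES:
        return LEVEL_NAMES[override.upper()]
    return DEFAULT_LEVELS.get(environment.lower(), logging.INFO)


def configure_logging() -> int:
    """Read the (assumed) APP_ENV and LOG_LEVEL variables and apply the level."""
    level = resolve_log_level(
        os.environ.get("APP_ENV", "production"),
        os.environ.get("LOG_LEVEL"),
    )
    logging.basicConfig(level=level)
    return level
```

Keeping the override separate from the environment default makes it possible to raise verbosity temporarily without redeploying with a different environment name.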

Core Rules

Practice	Description
Structured logging	JSON format with key-value pairs
Correlation IDs	Trace requests across services
Appropriate levels	DEBUG in dev, INFO+ in prod
No sensitive data	Never log passwords, tokens, PII
Context in errors	Include what, why, and how to fix
Meaningful metrics	Track rate, errors, duration
Health checks	Liveness + readiness endpoints
Actionable alerts	Include runbooks, reduce noise
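
To illustrate the "rate, errors, duration" rule in the table above, here is a small in-memory sketch using only the standard library. `RedMetrics` and its methods are hypothetical names; a real service would more likely export these values through a metrics client for whichever platform it uses.

```python
import math
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import List


@dataclass
class RedMetrics:
    """In-memory Rate / Errors / Duration counters for one operation."""

    requests: int = 0
    errors: int = 0
    durations: List[float] = field(default_factory=list)

    @contextmanager
    def track(self):
        """Count one request, record its duration, and count it as an error if it raises."""
        start = time.perf_counter()
        self.requests += 1
        try:
            yield
        except Exception:
            self.errors += 1
            raise
        finally:
            self.durations.append(time.perf_counter() - start)

    def error_ratio(self) -> float:
        """Fraction of tracked requests that raised an exception."""
        return self.errors / self.requests if self.requests else 0.0

    def p95_seconds(self) -> float:
        """95th-percentile duration using the nearest-rank method."""
        if not self.durations:
            return 0.0
        ordered = sorted(self.durations)
        index = max(0, math.ceil(0.95 * len(ordered)) - 1)
        return ordered[index]
```

Wrapping a handler body in `with metrics.track():` records all three signals at once. Rate is left as a raw request count here on the assumption that the metrics backend derives per-second rates from it.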

Anti-Patterns

Log and Forget: Writing logs but never querying or reviewing them -> Set up dashboards and alerts on ERROR/FATAL; review logs in incident postmortems
PII in Logs: Logging email addresses, passwords, tokens, or credit card numbers -> Scrub sensitive fields before logging; use allowlists for loggable fields
Unstructured Strings: Logging plain text messages that are hard to parse or search -> Use structured logging (JSON key-value pairs) for all log entries
Missing Correlation: Logs from different services with no shared request ID -> Propagate W3C trace context or a correlation ID header across all service calls
Alert Fatigue: Alerting on every warning or non-actionable metric -> Page only on SLO budget burn rate; group related alerts; include runbook links
Debug in Production: Running production with DEBUG or TRACE level enabled -> Use INFO or WARN in production; enable DEBUG temporarily and only on specific components
Metric Overload: Tracking hundreds of custom metrics with no clear purpose -> Focus on RED (Rate, Errors, Duration) for services and USE (Utilization, Saturation, Errors) for resources
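
The allowlist approach mentioned under "PII in Logs" might look like the sketch below. The field names in `LOGGABLE_FIELDS` and the `scrub` helper are hypothetical, and the sketch handles flat dictionaries only (nested values would need a recursive variant).

```python
from typing import Any, Dict, FrozenSet

# Hypothetical allowlist: only these keys may appear verbatim in a log entry.
LOGGABLE_FIELDS: FrozenSet[str] = frozenset(
    {"event", "level", "timestamp", "user_id", "order_id", "status", "duration_ms"}
)


def scrub(fields: Dict[str, Any],
          allowed: FrozenSet[str] = LOGGABLE_FIELDS) -> Dict[str, Any]:
    """Return a copy where every non-allowlisted value is replaced by a marker."""
    return {
        key: (value if key in allowed else "[REDACTED]")
        for key, value in fields.items()
    }
```

An allowlist fails safe: a newly added field such as `password` or `card_number` is redacted by default, whereas a denylist would log it until someone remembers to add it. Keeping the key with a `[REDACTED]` marker, rather than dropping it, leaves a visible trace that something tried to log that field.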

Observability Tools

Category	Tools
Logging	ELK Stack, Splunk, Datadog Logs, CloudWatch Logs
Metrics	Prometheus + Grafana, Datadog, New Relic, CloudWatch
Tracing	Jaeger, Zipkin, Datadog APM, Application Insights
All-in-One	Datadog, New Relic, Dynatrace, Elastic Observability

See Also: Error Handling - C# Development - Python Development

Troubleshooting

Issue	Solution
Logs not appearing in monitoring platform	Check log level configuration, verify sink/exporter endpoint
Correlation IDs missing across services	Propagate W3C trace context headers in all HTTP calls
Alert fatigue from too many notifications	Set meaningful thresholds, group related alerts, add alert suppression windows
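
For the missing-correlation case, the W3C `traceparent` header has the form `00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>`. The sketch below shows the idea of continuing an incoming trace on outgoing calls. The helper names are hypothetical, it assumes header keys are already lowercased, and it only handles version `00`. In practice a tracing SDK's propagator would normally do this.

```python
import re
import secrets
from typing import Dict, Optional, Tuple

_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(value: str) -> Optional[Tuple[str, str, str]]:
    """Return (trace_id, parent_id, flags), or None if the header is malformed."""
    match = _TRACEPARENT.match(value.strip().lower())
    if not match:
        return None
    trace_id, parent_id, flags = match.groups()
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None  # all-zero IDs are not valid
    return trace_id, parent_id, flags


def outgoing_headers(incoming: Dict[str, str]) -> Dict[str, str]:
    """Build headers for a downstream call, continuing the incoming trace if any."""
    parsed = parse_traceparent(incoming.get("traceparent", ""))
    if parsed:
        trace_id, _, flags = parsed           # keep the caller's trace ID
    else:
        trace_id, flags = secrets.token_hex(16), "01"  # start a new sampled trace
    span_id = secrets.token_hex(8)            # fresh ID for this hop
    return {"traceparent": f"00-{trace_id}-{span_id}-{flags}"}
```

Logging the `trace_id` portion as a field on every entry is what lets logs from different services be joined on a single request.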



Source & maintenance

Updated: 4mo ago
License: Apache-2.0
Loads: ~1,415 tokens
