KQL language expertise for writing correct, efficient Kusto queries using the Fabric RTI MCP tools. Covers syntax gotchas, join patterns, dynamic types, datetime pitfalls, regex patterns, serialization, memory management, result-size discipline, and advanced functions (geo, vector, graph). USE THIS
Install
mkdir -p .claude/skills/kql && curl -L -o skill.zip "https://agentskills.codes/api/skills/download/13416" && unzip -o skill.zip -d .claude/skills/kql && rm skill.zipInstalls to .claude/skills/kql
Activation
This is the description your AI agent reads to decide when to run this skill — the better it matches your request, the more reliably it fires.
KQL language expertise for writing correct, efficient Kusto queries using the Fabric RTI MCP tools. Covers syntax gotchas, join patterns, dynamic types, datetime pitfalls, regex patterns, serialization, memory management, result-size discipline, and advanced functions (geo, vector, graph). USE THIS SKILL whenever writing, debugging, or reviewing KQL queries — even simple ones — because the gotchas section prevents the most common errors that waste tool calls and cause expensive retry cascades. Trigger on: KQL, Kusto, ADX, Azure Data Explorer, Fabric Eventhouse, log analysis, data exploration, time series, anomaly detection, summarize, where clause, join, extend, project, let statement, parse operator, extract function, any mention of pipe-forward query syntax.About this skill
KQL Mastery
Try it yourself: All
✅examples in this skill can be run against the public help cluster:https://help.kusto.windows.net, databaseSamples(containsStormEvents,SimpleGraph_Nodes/Edges,nyc_taxi, and more).
1. Running KQL with Fabric RTI MCP
Fabric RTI MCP exposes Kusto functionality as MCP tools. Authentication is handled transparently using Azure Identity.
Available tools
| Tool | Purpose |
|---|---|
kusto_query | Execute a KQL query on a database |
kusto_command | Execute a management command (.show, .create, etc.) |
kusto_list_entities | List databases, tables, external tables, materialized views, functions, graphs |
kusto_describe_database | Get schema for all entities in a database |
kusto_describe_database_entity | Get schema for a specific entity (table, function, etc.) |
kusto_sample_entity | Get sample data from a table or other entity |
kusto_graph_query | Execute a graph query using snapshots or transient graphs |
kusto_ingest_inline_into_table | Ingest inline CSV data into a table |
kusto_known_services | List configured Kusto services |
kusto_get_shots | Retrieve semantically similar shots from a shots table |
kusto_deeplink_from_query | Build a deeplink URL to open a query in the web explorer |
kusto_show_queryplan | Get the execution plan for a query without running it |
kusto_diagnostics | Get a best-effort cluster health and capacity summary |
Query vs management commands
KQL has two execution planes, each with its own MCP tool:
| Plane | Tool | Starts with | Examples |
|---|---|---|---|
| Query | kusto_query | Table name, let, print, datatable | StormEvents | where State == "TEXAS" |
| Management | kusto_command | .show, .create, .set, .drop, .alter | .show tables, .show table T schema |
Basic usage
# Query plane — use kusto_query
kusto_query(
cluster_uri="https://help.kusto.windows.net",
database="Samples",
query="StormEvents | summarize count() by EventType | top 5 by count_ desc"
)
# Management plane — use kusto_command
kusto_command(
cluster_uri="https://help.kusto.windows.net",
database="Samples",
command=".show tables"
)
# Schema exploration — use kusto_describe_database or kusto_describe_database_entity
kusto_describe_database(
cluster_uri="https://help.kusto.windows.net",
database="Samples"
)
# Sample data — use kusto_sample_entity
kusto_sample_entity(
cluster_uri="https://help.kusto.windows.net",
database="Samples",
entity_name="StormEvents",
entity_type="table",
sample_size=5
)
# Graph queries — use kusto_graph_query
kusto_graph_query(
cluster_uri="https://mycluster.kusto.windows.net",
database="MyDB",
graph_name="MyGraph",
query="| graph-match (node) project labels=labels(node)"
)
# Deeplinks — use kusto_deeplink_from_query
kusto_deeplink_from_query(
cluster_uri="https://help.kusto.windows.net",
database="Samples",
query="StormEvents | count"
)
Exploration workflow
When encountering a new cluster or database:
- List entities:
kusto_list_entities(cluster_uri, entity_type="tables", database="MyDB") - Get schema:
kusto_describe_database_entity(entity_name="MyTable", entity_type="table", cluster_uri=..., database=...) - Sample data:
kusto_sample_entity(entity_name="MyTable", entity_type="table", cluster_uri=..., sample_size=5) - Count rows:
kusto_query(query="MyTable | count", cluster_uri=..., database=...) - Run analysis:
kusto_query(query="MyTable | where ... | summarize ...", cluster_uri=..., database=...)
2. Dynamic Type Discipline
KQL's dynamic type is flexible but strict in certain contexts. A common mistake is using a dynamic column in summarize by, order by, or join on without casting.
The rule: Any time you use a dynamic-typed column in by, on, or order by, wrap it in an explicit cast.
// ❌ ERROR: "Summarize group key 'Partners' is of a 'dynamic' type"
| summarize count() by Partners
// ✅ FIX
| summarize count() by tostring(Partners)
// ❌ ERROR: "order operator: key can't be of dynamic type"
| order by Area desc
// ✅ FIX
| order by tostring(Area) desc
// ❌ ERROR in join: dynamic join key
| join kind=inner other on $left.Area == $right.Area
// ✅ FIX — cast both sides
| extend Area_str = tostring(Area)
| join kind=inner (other | extend Area_str = tostring(Area)) on Area_str
Self-correction: When you see "is of a 'dynamic' type" in an error, add tostring(), tolong(), or todouble().
3. Join Patterns & Pitfalls
KQL joins have constraints that differ from SQL.
Equality only
KQL join conditions support only ==. No <, >, !=, or function calls in join predicates.
// ❌ ERROR: "Only equality is allowed in this context"
| join on geo_distance_2points(a.Lat, a.Lon, b.Lat, b.Lon) < 1000
// ✅ WORKAROUND — pre-bucket into spatial cells, then join on cell ID
| extend cell = geo_point_to_s2cell(Lon, Lat, 8)
| join kind=inner (other | extend cell = geo_point_to_s2cell(Lon, Lat, 8)) on cell
For range joins, pre-bin values: | extend bin_val = bin(Value, 100), then join on bin_val.
Left/right attribute matching
Both sides of a join on clause must reference column entities only — not expressions, not aggregates.
// ❌ ERROR: "for each left attribute, right attribute should be selected"
| join kind=inner other on $left.col1
// ✅ FIX — specify both sides explicitly
| join kind=inner other on $left.col1 == $right.col1
Cardinality check before large joins
Always check cardinality before joining tables with >10K rows. A cross-join explosion was the source of the single E_RUNAWAY_QUERY error (25K × 195 = potential 4.8M rows).
// Before joining, check how many rows each side contributes
TableA | summarize dcount(JoinKey) // → 25,000? Too many for an unconstrained join
TableB | summarize dcount(JoinKey) // → 195? OK if filtered first
4. Regex in KQL
KQL handles regex natively — no need for Python.
The extract_all gotcha
Unlike Python's re.findall(), KQL's extract_all requires capturing groups in the regex:
// ❌ ERROR: "extractall(): argument 2 must be a valid regex with [1..16] matching groups"
| extend words = extract_all(@"[a-zA-Z]{3,}", Text)
// ✅ FIX — add parentheses around the pattern
| extend words = extract_all(@"([a-zA-Z]{3,})", Text)
Regex toolkit — don't fall back to Python
| Function | Use case | Example |
|---|---|---|
extract(regex, group, source) | Single match | extract(@"User '([^']+)'", 1, Msg) |
extract_all(regex, source) | All matches (needs ()) | extract_all(@"(\w+)", Text) |
parse | Structured extraction | parse Msg with * "User '" Sender "' sent" * |
matches regex | Boolean filter | where Url matches regex @"^https?://" |
replace_regex | Find and replace | replace_regex(Text, @"\s+", " ") |
5. Serialization Requirements
Window functions need serialized (ordered) input.
// ❌ ERROR: "Function 'row_cumsum' cannot be invoked. The row set must be serialized."
| summarize Online = sum(Direction) by bin(Timestamp, 5m)
| extend CumulativeOnline = row_cumsum(Online)
// ✅ FIX — add | serialize (or | order by, which implicitly serializes)
| summarize Online = sum(Direction) by bin(Timestamp, 5m)
| order by Timestamp asc
| extend CumulativeOnline = row_cumsum(Online)
Functions requiring serialization: row_number(), row_cumsum(), prev(), next(), row_window_session().
6. Memory-Safe Query Patterns
The most common memory error. Caused by scanning too much data without pre-filtering.
The progression of safety
Safest ──────────────────────────────────────────────── Most dangerous
| count | take 10 | where + summarize | summarize (no filter) | full scan
Rules for large tables (>1M rows)
- Always start with
| countto understand table size - Always
| wherebefore| summarize— filter time range, partition key, or category first - Never
dcount()on high-cardinality columns without pre-filtering - Check join cardinality before executing (see Section 3)
- Use
materialize()for subqueries referenced multiple times
// ❌ OUT OF MEMORY — 24M rows, no filter, dcount on every column
Consumption
| summarize dcount(Consumed), count() by Timestamp, HouseholdId, MeterType
| where dcount_Consumed > 1
// ✅ SAFE — filter first, then aggregate
Consumption
| where Timestamp between (datetime(2023-04-15) .. datetime(2023-04-16))
| summarize dcount(Consumed) by HouseholdId, MeterType
| where dcount_Consumed > 1
When you see E_LOW_MEMORY_CONDITION
The query touched too much data. Your options:
- Add
| wherefilters (time range, partition key) - Reduce the number of
bycolumns insummarize - Break into smaller time windows and union results
- Use
| sample 10000for exploratory work instead of full scans
When you see E_RUNAWAY_QUERY
A join or aggregation produced too many output rows. Check join cardinality — one or both sides is too large.
7. Result Size Discipline
Large results slow down analysis. Prevention:
| Query type | Safeguard |
|---|---|
| Exploratory | Always end with | take 10 or | take 20 |
| Aggregation | Use | top 20 by ... not unbounded summarize |
| Wide rows (vectors, JSON) | | project only needed columns |
make_list() / make_set() | Avoid on high-cardinality groups (produces huge cells) |
| Unknown size | Run | count first |
The vector trap: Tables with embedding columns (1536-dim float arrays) produce ~30KB per row. Even | take 20 yields 600KB. Always | project away vector columns unless you specifically need them.
With MCP tools: Use `kusto_sample_ent
Content truncated.