The pattern transform extracts recurring patterns from log messages using the Drain algorithm. A message like "User alice logged in from 10.0.0.1" becomes the pattern "User <*> logged in from <*>" — grouping thousands of similar messages into a handful of templates.
Patterns are used by Tell’s anomaly detection to spot unusual log activity.
Quick start
[[routing.rules.transformers]]
type = "pattern_matcher"
That’s it. The defaults work well for most log volumes. Each log message gets a pattern ID attached, enabling pattern-based grouping and anomaly scoring downstream.
How it works
The Drain algorithm builds a tree of log patterns:
- Incoming messages are tokenized (split on whitespace)
- Tokens that look like variables (numbers, IPs, UUIDs, URLs, timestamps, paths, emails) are detected automatically and replaced with the <*> wildcard
- Messages are matched against existing patterns by similarity
- If a match is found, the pattern’s count increments. If not, a new pattern is created.
The result is a set of templates like:
"User <*> logged in from <*>" count: 14,203
"Payment <*> failed with code <*>" count: 847
"Request to <*> timed out after <*> ms" count: 312
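The steps above can be sketched in a few lines of Python. This is a simplified illustration, not Tell's implementation: it compares each message against a flat list of templates instead of walking a Drain tree, and the names `VARIABLE_PATTERNS`, `tokenize`, `similarity`, and `PatternStore` are hypothetical, as are the exact regular expressions used to spot variables.

```python
import re

# Hypothetical variable detectors; Tell's actual detection rules may differ.
VARIABLE_PATTERNS = [
    re.compile(r"^\d+(\.\d+)?$"),               # plain numbers
    re.compile(r"^\d{1,3}(\.\d{1,3}){3}$"),     # IPv4 addresses
    re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$"),  # UUIDs
    re.compile(r"^https?://\S+$"),              # URLs
    re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),  # emails
]
WILDCARD = "<*>"


def tokenize(message):
    """Split on whitespace and mask tokens that look like variables."""
    return [
        WILDCARD if any(p.match(tok) for p in VARIABLE_PATTERNS) else tok
        for tok in message.split()
    ]


def similarity(template, tokens):
    """Fraction of positions where the template and the message agree."""
    if len(template) != len(tokens) or not tokens:
        return 0.0
    return sum(1 for a, b in zip(template, tokens) if a == b) / len(tokens)


class PatternStore:
    """Flat list of templates; a real Drain tree narrows the candidates first."""

    def __init__(self, similarity_threshold=0.5):
        self.similarity_threshold = similarity_threshold
        self.patterns = []  # each entry: {"template": [tokens], "count": int}

    def add(self, message):
        tokens = tokenize(message)
        best, best_score = None, 0.0
        for pattern in self.patterns:
            score = similarity(pattern["template"], tokens)
            if score > best_score:
                best, best_score = pattern, score
        if best is not None and best_score >= self.similarity_threshold:
            # Positions that disagree become wildcards in the shared template.
            best["template"] = [
                a if a == b else WILDCARD for a, b in zip(best["template"], tokens)
            ]
            best["count"] += 1
            return best
        new_pattern = {"template": tokens, "count": 1}
        self.patterns.append(new_pattern)
        return new_pattern


store = PatternStore()
store.add("User alice logged in from 10.0.0.1")
store.add("User bob logged in from 10.0.0.2")
first = store.patterns[0]
print(" ".join(first["template"]), first["count"])
# User <*> logged in from <*> 2
```

The IP address is masked during tokenization, while the user name only becomes a wildcard once a second message disagrees at that position.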
Similarity threshold
The similarity_threshold controls how aggressively messages are clustered:
[[routing.rules.transformers]]
type = "pattern_matcher"
similarity_threshold = 0.5
| Value | Effect |
|---|---|
| Lower (0.3) | More lenient — fewer patterns, broader clusters |
| Default (0.5) | Balanced clustering |
| Higher (0.7) | More strict — more patterns, tighter clusters |
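To get a feel for the table, the sketch below clusters four messages at each threshold. It assumes a position-by-position token comparison and a greedy "join the first cluster that is similar enough" rule, and it skips variable masking so that the numbers stay visible. The helper names `similarity` and `cluster` are illustrative, and Tell's actual scoring may differ.

```python
def similarity(a, b):
    """Share of token positions on which two equal-length messages agree."""
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster(messages, similarity_threshold):
    """Greedy clustering against the first message of each cluster."""
    clusters = []  # each cluster is a list of token lists; item 0 is the representative
    for message in messages:
        tokens = message.split()
        for members in clusters:
            if similarity(members[0], tokens) >= similarity_threshold:
                members.append(tokens)
                break
        else:
            clusters.append([tokens])
    return clusters


messages = [
    "Payment 1001 failed with code 402",
    "Payment 1002 failed with code 500",
    "Payment 1003 failed at gateway stripe",
    "Payment 1004 failed at gateway adyen",
]

for threshold in (0.3, 0.5, 0.7):
    print(threshold, len(cluster(messages, threshold)))
# 0.3 1
# 0.5 2
# 0.7 4
```

Messages within a pair agree on 4 of 6 tokens (about 0.67) and messages across pairs agree on 2 of 6 (about 0.33), which is why each threshold produces a different number of clusters.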
Persistence
By default, patterns live in memory and are lost on restart. Enable file persistence to save them:
[[routing.rules.transformers]]
type = "pattern_matcher"
persistence_enabled = true
persistence_file = "/var/lib/tell/patterns.json"
Patterns are saved in the background — persistence doesn’t slow down the transform pipeline.
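One common way to keep disk writes off the hot path is to hand snapshots to a worker thread. The sketch below shows that idea under stated assumptions: the `BackgroundSaver` class, the `load_patterns` helper, and the JSON layout (a list of template/count objects) are hypothetical and do not describe Tell's actual file format or internals.

```python
import json
import os
import queue
import tempfile
import threading


class BackgroundSaver:
    """Writes pattern snapshots on a worker thread so callers never wait on disk."""

    def __init__(self, path):
        self.path = path
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def save(self, patterns):
        # Copy each entry so later in-memory updates do not change the snapshot.
        self._queue.put([dict(p) for p in patterns])

    def close(self):
        self._queue.put(None)  # sentinel: finish pending writes, then stop
        self._worker.join()

    def _run(self):
        while True:
            snapshot = self._queue.get()
            if snapshot is None:
                return
            self._write(snapshot)

    def _write(self, snapshot):
        # Write to a temp file in the same directory, then swap it into place,
        # so a crash mid-write does not leave a truncated patterns file behind.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(snapshot, handle)
        os.replace(tmp_path, self.path)


def load_patterns(path):
    """Return saved patterns, or an empty list if nothing was saved yet."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return []
```

In this sketch `save` only enqueues a copy and returns immediately; the write itself happens on the worker, and `close` waits for any queued snapshots to reach disk.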
Caching
The pattern matcher uses a 3-level cache for performance:
| Level | What it caches | Typical hit rate / role |
|---|---|---|
| L1 | Exact message hash | 70-80% |
| L2 | Normalized template hash | Catches similar messages |
| L3 | Drain tree lookup | Fallback for new messages |
Most messages hit L1 (identical to a recent message) and skip the tree entirely. The cache_size setting controls L1 capacity.
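The lookup order can be sketched as follows. Several details here are assumptions rather than documented behavior: the `TieredLookup` class, the digit-based `normalize` helper, the SHA-1 keys, the LRU eviction policy for L1, and the unbounded L2 map are all illustrative. The `tree_lookup` argument stands in for the Drain tree.

```python
import hashlib
import re
from collections import OrderedDict

_HAS_DIGIT = re.compile(r"\d")


def normalize(message):
    """Hypothetical normalizer: mask any token that contains a digit."""
    return " ".join("<*>" if _HAS_DIGIT.search(t) else t for t in message.split())


def digest(text):
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


class TieredLookup:
    def __init__(self, tree_lookup, cache_size=100_000):
        self.tree_lookup = tree_lookup  # L3: callable(normalized text) -> pattern ID
        self.cache_size = cache_size    # capacity of L1 only
        self.l1 = OrderedDict()         # exact message hash -> pattern ID (LRU order)
        self.l2 = {}                    # normalized template hash -> pattern ID
        self.hits = {"L1": 0, "L2": 0, "L3": 0}

    def pattern_id(self, message):
        exact_key = digest(message)
        if exact_key in self.l1:
            self.l1.move_to_end(exact_key)  # mark as most recently used
            self.hits["L1"] += 1
            return self.l1[exact_key]

        normalized = normalize(message)
        template_key = digest(normalized)
        if template_key in self.l2:
            self.hits["L2"] += 1
            pid = self.l2[template_key]
        else:
            self.hits["L3"] += 1
            pid = self.tree_lookup(normalized)
            self.l2[template_key] = pid

        self.l1[exact_key] = pid
        if len(self.l1) > self.cache_size:
            self.l1.popitem(last=False)  # evict the least recently used entry
        return pid


ids = {}
lookup = TieredLookup(lambda template: ids.setdefault(template, len(ids)))
for line in ["job 1 done", "job 1 done", "job 2 done", "disk full"]:
    lookup.pattern_id(line)
print(lookup.hits)
# {'L1': 1, 'L2': 1, 'L3': 2}
```

The repeated line is served from L1, the line that differs only in its number is served from L2, and only genuinely new shapes reach the tree.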
Reference
| Field | Default | Description |
|---|---|---|
| type | — | "pattern_matcher" |
| similarity_threshold | 0.5 | How similar messages must be to share a pattern (0.0–1.0) |
| max_child_nodes | 100 | Maximum branches per tree node |
| cache_size | 100000 | L1 cache capacity (exact message hashes) |
| persistence_enabled | false | Save patterns to disk |
| persistence_file | — | File path for pattern storage |
| enabled | true | Set to false to disable |
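As a rough illustration of how the defaults in this table combine with a partial config, the sketch below merges a transformer table (already parsed into a dict) with the documented defaults and checks the documented range. The `resolve` function is hypothetical, and the rule that `persistence_file` must be set when `persistence_enabled` is true is an assumption, not something the reference states.

```python
# Defaults copied from the reference table; None marks fields without a default.
DEFAULTS = {
    "similarity_threshold": 0.5,
    "max_child_nodes": 100,
    "cache_size": 100000,
    "persistence_enabled": False,
    "persistence_file": None,
    "enabled": True,
}


def resolve(transformer):
    """Fill in defaults for one [[routing.rules.transformers]] entry."""
    if transformer.get("type") != "pattern_matcher":
        raise ValueError('type must be "pattern_matcher"')
    config = {**DEFAULTS, **transformer}
    if not 0.0 <= config["similarity_threshold"] <= 1.0:
        raise ValueError("similarity_threshold must be between 0.0 and 1.0")
    # Assumption: enabling persistence without a file path is treated as an error.
    if config["persistence_enabled"] and not config["persistence_file"]:
        raise ValueError("persistence_file is required when persistence_enabled is true")
    return config


config = resolve({"type": "pattern_matcher", "similarity_threshold": 0.7})
print(config["similarity_threshold"], config["cache_size"], config["persistence_enabled"])
# 0.7 100000 False
```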