The pattern transform extracts recurring patterns from log messages using the Drain algorithm. A message like "User alice logged in from 10.0.0.1" becomes the pattern "User <*> logged in from <*>" — grouping thousands of similar messages into a handful of templates. Patterns are used by Tell’s anomaly detection to spot unusual log activity.

Quick start

[[routing.rules.transformers]]
type = "pattern_matcher"
That’s it. The defaults work well for most log volumes. Each log message gets a pattern ID attached, enabling pattern-based grouping and anomaly scoring downstream.

How it works

The Drain algorithm builds a tree of log patterns:
  1. Incoming messages are tokenized (split on whitespace)
  2. Tokens that look like variables — numbers, IPs, UUIDs, URLs, timestamps, paths, emails — are detected automatically
  3. Messages are matched against existing patterns by similarity
  4. If a match is found, the pattern’s count increments. If not, a new pattern is created.
The result is a set of templates like:
"User <*> logged in from <*>"              count: 14,203
"Payment <*> failed with code <*>"         count: 847
"Request to <*> timed out after <*> ms"    count: 312
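The matching loop above can be sketched in a few lines of Python. This is an illustrative toy, not Tell's actual Drain implementation: only numbers, IPv4 addresses, and UUIDs are masked here, and the similarity metric (positional token agreement) is an assumption.

```python
import re

# Variable-looking tokens are masked to <*>, then each message is matched
# against known templates by positional token similarity.
VAR_RE = re.compile(
    r"^(\d+"                                          # plain number
    r"|\d{1,3}(\.\d{1,3}){3}"                         # IPv4 address
    r"|[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12})$",  # UUID
    re.IGNORECASE,
)

def mask(message: str) -> list[str]:
    """Tokenize on whitespace, replacing variable-like tokens with <*>."""
    return ["<*>" if VAR_RE.match(t) else t for t in message.split()]

def similarity(a: list[str], b: list[str]) -> float:
    """Fraction of positions where two same-length token lists agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def match(message: str, templates: dict[str, int], threshold: float = 0.5) -> str:
    """Return the template for a message, creating or generalizing as needed."""
    tokens = mask(message)
    for tmpl, count in list(templates.items()):
        t_tokens = tmpl.split()
        if len(t_tokens) == len(tokens) and similarity(tokens, t_tokens) >= threshold:
            # Matched: generalize any still-differing tokens to <*>
            # and increment the pattern's count.
            merged = " ".join(x if x == y else "<*>" for x, y in zip(t_tokens, tokens))
            del templates[tmpl]
            templates[merged] = count + 1
            return merged
    tmpl = " ".join(tokens)  # no match above the threshold: new pattern
    templates[tmpl] = 1
    return tmpl

templates: dict[str, int] = {}
match("User alice logged in from 10.0.0.1", templates)
match("User bob logged in from 10.0.0.2", templates)
# templates is now {"User <*> logged in from <*>": 2}
```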

Similarity threshold

The similarity_threshold controls how aggressively messages are clustered:
[[routing.rules.transformers]]
type = "pattern_matcher"
similarity_threshold = 0.5
| Value | Effect |
| --- | --- |
| Lower (0.3) | More lenient — fewer patterns, broader clusters |
| Default (0.5) | Balanced clustering |
| Higher (0.7) | More strict — more patterns, tighter clusters |
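To make the threshold concrete, here is a minimal positional-similarity calculation. The metric (fraction of token positions that agree) is an assumption for illustration; Tell's exact scoring may differ.

```python
def token_similarity(a: str, b: str) -> float:
    """Fraction of positions where two same-length messages share a token."""
    ta, tb = a.split(), b.split()
    return sum(x == y for x, y in zip(ta, tb)) / len(ta)

s = token_similarity(
    "User alice logged in from 10.0.0.1",
    "User bob logged out from 10.0.0.1",
)
# 4 of 6 tokens agree -> similarity ~0.67: the messages share a pattern
# at the default threshold of 0.5, but split into two patterns at 0.7
```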

Persistence

By default, patterns live in memory and are lost on restart. Enable file persistence to save them:
[[routing.rules.transformers]]
type = "pattern_matcher"
persistence_enabled = true
persistence_file = "/var/lib/tell/patterns.json"
Patterns are saved in the background — persistence doesn’t slow down the transform pipeline.
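One common way to keep persistence off the hot path is a write-behind save with an atomic rename. The sketch below is hypothetical, not Tell's actual code: a snapshot of the patterns is serialized on a background thread, and the rename guarantees readers never observe a half-written file.

```python
import json
import os
import tempfile
import threading

def save_patterns(patterns: dict[str, int], path: str) -> None:
    """Serialize patterns to a temp file, then rename it into place."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(patterns, f)
    os.replace(tmp, path)  # atomic when tmp and path share a filesystem

def save_in_background(patterns: dict[str, int], path: str) -> threading.Thread:
    """Snapshot the patterns and persist them without blocking the pipeline."""
    snapshot = dict(patterns)  # the transform can keep mutating the original
    t = threading.Thread(target=save_patterns, args=(snapshot, path))
    t.start()
    return t
```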

Caching

The pattern matcher uses a 3-level cache for performance:
| Level | What it caches | Typical hit rate |
| --- | --- | --- |
| L1 | Exact message hash | 70–80% |
| L2 | Normalized template hash | Catches similar messages |
| L3 | Drain tree lookup | Fallback for new messages |
Most messages hit L1 (identical to a recent message) and skip the tree entirely. The cache_size setting controls L1 capacity.
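The tiered lookup can be sketched as follows. The class and names here are illustrative, not Tell's API; the Drain tree itself is stubbed out as a callback, and L1 is modeled as a simple LRU keyed on the exact message.

```python
import re
from collections import OrderedDict

# Toy variable masker for the L2 key (numbers and dotted numbers only).
VAR_RE = re.compile(r"\b\d+(\.\d+)*\b")

class TieredCache:
    def __init__(self, l1_capacity: int = 100_000):
        self.l1: OrderedDict[str, str] = OrderedDict()  # exact message -> pattern
        self.l2: dict[str, str] = {}                    # masked template -> pattern
        self.capacity = l1_capacity

    def lookup(self, message: str, tree_lookup) -> str:
        if message in self.l1:             # L1: exact message seen recently
            self.l1.move_to_end(message)
            return self.l1[message]
        masked = VAR_RE.sub("<*>", message)
        pattern = self.l2.get(masked)      # L2: same template, different variables
        if pattern is None:
            pattern = tree_lookup(masked)  # L3: full Drain tree walk
            self.l2[masked] = pattern
        self.l1[message] = pattern
        if len(self.l1) > self.capacity:   # evict the least recently used entry
            self.l1.popitem(last=False)
        return pattern
```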

Reference

| Field | Default | Description |
| --- | --- | --- |
| type | | Must be "pattern_matcher" |
| similarity_threshold | 0.5 | How similar messages must be to share a pattern (0.0–1.0) |
| max_child_nodes | 100 | Maximum branches per tree node |
| cache_size | 100000 | L1 cache capacity (exact message hashes) |
| persistence_enabled | false | Save patterns to disk |
| persistence_file | | File path for pattern storage |
| enabled | true | Set to false to disable |
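Putting it all together, a fully specified transformer block looks like this. Values are the defaults from the table above, except that persistence is switched on (and given the path from the persistence example) to show persistence_file in context:

```toml
[[routing.rules.transformers]]
type = "pattern_matcher"
enabled = true
similarity_threshold = 0.5
max_child_nodes = 100
cache_size = 100000
persistence_enabled = true
persistence_file = "/var/lib/tell/patterns.json"
```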

What’s next