Configuration
| Field | Default | Notes |
|---|---|---|
| host | — | ClickHouse HTTP address (required) |
| database | "default" | Database name |
| username | "default" | Authentication username |
| password | "" | Authentication password |
| batch_size | 1000 | Rows per table before flush |
| flush_interval | "5s" | Time-based flush interval |

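
As a rough illustration of how these fields and defaults fit together, the sketch below mirrors the table in a Rust struct. The names `ClickHouseSinkConfig` and `with_host` and the field types are assumptions made for this example, not the sink's actual definitions.

```rust
use std::time::Duration;

/// Hypothetical mirror of the configuration table above.
/// Struct name, constructor, and field types are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct ClickHouseSinkConfig {
    /// ClickHouse HTTP address (required, no default).
    pub host: String,
    pub database: String,
    pub username: String,
    pub password: String,
    /// Rows buffered per table before a flush.
    pub batch_size: usize,
    /// Time-based flush interval.
    pub flush_interval: Duration,
}

impl ClickHouseSinkConfig {
    /// Takes the required host and fills every other field with its documented default.
    pub fn with_host(host: impl Into<String>) -> Self {
        Self {
            host: host.into(),
            database: "default".to_string(),
            username: "default".to_string(),
            password: String::new(),
            batch_size: 1000,
            flush_interval: Duration::from_secs(5),
        }
    }
}
```

Calling `ClickHouseSinkConfig::with_host("http://localhost:8123")` (a placeholder address) would yield a config where only `host` was supplied and everything else carries the defaults from the table.
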
Tables
The sink writes to 7 tables following the Tell v1.1 schema:

| Table | Populated by |
|---|---|
| events_v1 | TRACK events from SDKs |
| users_v1 | IDENTIFY events (core identity) |
| user_devices | IDENTIFY events (device links) |
| user_traits | IDENTIFY events (key-value traits) |
| context_v1 | CONTEXT events (device/location) |
| logs_v1 | Log entries from syslog and SDKs |
| snapshots_v1 | Connector snapshots (GitHub, Stripe, etc.) |

IDENTIFY events populate three tables: users_v1, user_devices, and user_traits.
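
The mapping from input kind to destination tables can be sketched as a simple lookup. The `RecordKind` enum and `target_tables` function are hypothetical names introduced for this example; only the table names and the kinds of input come from the table above.

```rust
/// Kinds of input named in the table above. The enum itself is a
/// hypothetical stand-in, not a type taken from the sink.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RecordKind {
    Track,
    Identify,
    Context,
    Log,
    Snapshot,
}

/// Returns the tables that one record of the given kind is written to.
pub fn target_tables(kind: RecordKind) -> &'static [&'static str] {
    match kind {
        RecordKind::Track => &["events_v1"],
        // IDENTIFY is the only kind that fans out to more than one table.
        RecordKind::Identify => &["users_v1", "user_devices", "user_traits"],
        RecordKind::Context => &["context_v1"],
        RecordKind::Log => &["logs_v1"],
        RecordKind::Snapshot => &["snapshots_v1"],
    }
}
```

Summing the table counts across all five kinds gives the seven tables listed above.
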
Performance
- Per-table batching — each table accumulates rows independently, so a burst of logs doesn’t delay event flushes
- Concurrent flush — all tables are flushed in parallel using tokio tasks
- LZ4 compression — enabled by default for wire transport
- Retry with backoff — 3 attempts with exponential backoff (100ms base, 10s max) on transient failures
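
Two of the points above, per-table batching and the retry delay schedule, can be sketched with the standard library alone. This is an illustration under assumptions, not the sink's code: `TableBatcher` and `backoff_delay` are hypothetical names, and the doubling factor of 2 is assumed because the text gives only the base and the cap.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical per-table buffer: each table accumulates rows on its own,
/// so a full buffer for one table never forces a flush of another.
pub struct TableBatcher<R> {
    batch_size: usize,
    pending: HashMap<&'static str, Vec<R>>,
}

impl<R> TableBatcher<R> {
    pub fn new(batch_size: usize) -> Self {
        Self { batch_size, pending: HashMap::new() }
    }

    /// Buffers one row. Returns the accumulated batch for that table once
    /// it holds `batch_size` rows, leaving the table's buffer empty.
    pub fn push(&mut self, table: &'static str, row: R) -> Option<Vec<R>> {
        let rows = self.pending.entry(table).or_default();
        rows.push(row);
        if rows.len() >= self.batch_size {
            Some(std::mem::take(rows))
        } else {
            None
        }
    }
}

/// Delay before retry number `retry` (0-based): `base` doubled on each
/// retry and capped at `max`. The factor of 2 is an assumption.
pub fn backoff_delay(retry: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(retry);
    base.saturating_mul(factor).min(max)
}
```

With a 100ms base and a 10s cap, this schedule waits 100ms before the first retry and 200ms before the second; the cap only comes into play if more attempts are configured than the three mentioned above.
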
Arrow HTTP transport
Two implementations are available:

- Row-based (default) — uses the clickhouse crate with row structs
- Arrow-based — sends Arrow-format HTTP inserts for higher throughput
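
For context on what an Arrow-format HTTP insert looks like: the ClickHouse HTTP interface accepts the INSERT statement in the `query` URL parameter and the data in the POST body. The sketch below only builds such a URL. The function names are hypothetical, and the choice of `FORMAT Arrow` (rather than `ArrowStream`) is an assumption, since the text does not say which format the sink sends.

```rust
/// Percent-encodes every byte except RFC 3986 unreserved characters.
pub fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            other => out.push_str(&format!("%{:02X}", other)),
        }
    }
    out
}

/// Builds the URL for an Arrow-format insert into `database.table`.
/// Hypothetical helper: the Arrow-encoded rows would be sent as the POST
/// body, which is outside the scope of this sketch.
pub fn arrow_insert_url(host: &str, database: &str, table: &str) -> String {
    let query = format!("INSERT INTO {}.{} FORMAT Arrow", database, table);
    format!("{}/?query={}", host.trim_end_matches('/'), percent_encode(&query))
}
```

For a placeholder host of `http://localhost:8123`, the database `default`, and the table `events_v1`, this produces `http://localhost:8123/?query=INSERT%20INTO%20default.events_v1%20FORMAT%20Arrow`.
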