The Arrow IPC sink writes data to Arrow IPC files (also known as Feather) for fast I/O and zero-copy reads. Use it for hot data that needs frequent access — real-time dashboards, local development, and inter-process communication.

## Configuration

```toml
[sinks.hot]
type = "arrow_ipc"
path = "/data/arrow"
rotation = "daily"
```
| Field | Default | Notes |
| --- | --- | --- |
| `path` | (required) | Output directory |
| `rotation` | `"daily"` | `"hourly"` or `"daily"` |
| `buffer_size` | `10,000` | Rows buffered before flush |
| `flush_interval` | `"60s"` | Time-based flush interval |

## Tables

The Arrow IPC sink writes the full Tell v1.1 schema — the same 7 tables as the ClickHouse sink:
| File | Populated by |
| --- | --- |
| `events_v1.arrow` | TRACK events |
| `logs_v1.arrow` | Log entries |
| `snapshots_v1.arrow` | Connector snapshots |
| `context_v1.arrow` | CONTEXT events (device/location) |
| `users_v1.arrow` | IDENTIFY events (core identity) |
| `user_devices.arrow` | IDENTIFY events (device links) |
| `user_traits.arrow` | IDENTIFY events (key-value traits) |
As with the ClickHouse sink, each IDENTIFY event fans out to `users_v1`, `user_devices`, and `user_traits`, while CONTEXT events have their device and location fields extracted from the payload.
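
To make the fan-out concrete, here is a minimal sketch of how a single IDENTIFY event splits into rows for the three tables. The event shape and column names are illustrative assumptions, not the actual Tell v1.1 schema:

```python
# Hypothetical IDENTIFY event (field names are assumptions for illustration)
identify = {
    "user_id": "u_42",
    "device_id": "d_7",
    "traits": {"plan": "pro", "region": "eu"},
}

# users_v1: one core-identity row per IDENTIFY event
users_row = {"user_id": identify["user_id"]}

# user_devices: one row linking the user to the device seen on the event
device_row = {"user_id": identify["user_id"], "device_id": identify["device_id"]}

# user_traits: one row per key-value trait
trait_rows = [
    {"user_id": identify["user_id"], "key": k, "value": str(v)}
    for k, v in identify["traits"].items()
]
```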

## File organization

```text
arrow_ipc/
└── {workspace_id}/
    └── {date}/
        ├── events_v1.arrow
        ├── logs_v1.arrow
        ├── snapshots_v1.arrow
        ├── context_v1.arrow
        ├── users_v1.arrow
        ├── user_devices.arrow
        └── user_traits.arrow
```
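
Because the layout is deterministic, locating a table file is a matter of joining the path segments. A minimal sketch; the `table_path` helper is illustrative, not part of Tell:

```python
from pathlib import Path

import polars as pl

# Illustrative helper: build the path for one table file following the
# layout above: {root}/{workspace_id}/{date}/{table}.arrow
def table_path(root: str, workspace_id: int, date: str, table: str) -> Path:
    return Path(root) / str(workspace_id) / date / f"{table}.arrow"

df = pl.read_ipc(table_path("arrow_ipc", 1, "2025-01-15", "events_v1"))
```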

## Reading Arrow IPC files

Arrow IPC is supported natively by all major data tools:

```python
# Polars (recommended — zero-copy read)
import polars as pl

df = pl.read_ipc("arrow_ipc/1/2025-01-15/events_v1.arrow")
```

```sql
-- DuckDB
SELECT * FROM 'arrow_ipc/1/2025-01-15/events_v1.arrow';
```

```python
# PyArrow
import pyarrow.ipc as ipc

reader = ipc.open_file("arrow_ipc/1/2025-01-15/events_v1.arrow")
table = reader.read_all()  # materialize the file as a pyarrow.Table
```
Also supported by DataFusion (native) and any Arrow-compatible tool.
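
Because files are partitioned by workspace and date, a glob pattern covers a workspace's full history, and Polars' lazy scanner only materializes what the query needs. A sketch; the `event_name` column is an assumption about the schema, used here only for illustration:

```python
import polars as pl

# Lazily scan every daily events file for workspace 1 (glob follows the
# directory layout above), then rank events by volume.
top_events = (
    pl.scan_ipc("arrow_ipc/1/*/events_v1.arrow")
    .group_by("event_name")          # hypothetical column name
    .agg(pl.len().alias("count"))
    .sort("count", descending=True)
    .limit(10)
    .collect()
)
print(top_events)
```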

## Local query backend

Tell’s local query engine (Polars backend) reads from Arrow IPC files directly. Configure this in `[query]`:

```toml
[query]
backend = "polars"
data_dir = "/data/arrow"
```
This is useful for development and small deployments where ClickHouse isn’t needed.
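
As a rough mental model (an assumption about the backend's behavior, not its actual implementation), a query against this backend amounts to a lazy scan over `data_dir` with filters applied before the result is materialized:

```python
import polars as pl

# Rough model only: scan every workspace's daily log files under data_dir
# and filter before collecting. "level" is a hypothetical column name.
errors = (
    pl.scan_ipc("/data/arrow/*/*/logs_v1.arrow")
    .filter(pl.col("level") == "error")
    .collect()
)
```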