Migrate Observability from TimescaleDB to ClickHouse

Temps stores all of its observability and analytics data in PostgreSQL/TimescaleDB by default — no extra services, single binary. When your traces, metrics, proxy logs, or analytics events outgrow TimescaleDB, you can route them to a ClickHouse cluster at runtime by setting a few environment variables on temps serve. There is no rebuild, no cargo feature flag, and no data-loss window: the ClickHouse backends are always compiled in and stay dormant until you configure them.


When to migrate

TimescaleDB is the right default for the vast majority of self-hosted deployments and needs no tuning. Reach for ClickHouse when you hit columnar-scale workloads that a row store struggles with:

  • Millions of traces where the list view's per-request GROUP BY trace_id and duration sort over the spans hypertable becomes slow.
  • High-cardinality metrics with long retention and heavy dashboard fan-out.
  • High-volume proxy/request logs (hundreds of thousands to millions of rows) queried with wide time windows and many filters.
  • Large analytics-event volumes where uniq()-based approximate counts and columnar scans beat exact PostgreSQL aggregation.

If you are not hitting these limits, stay on TimescaleDB — it is one less service to run.


What changes

Four independent domains can be backed by ClickHouse. They all activate from the same four TEMPS_CLICKHOUSE_* variables and share one ClickHouse database (temps by default), but each owns its own tables and migration-tracking table so they never collide.

| Domain | ClickHouse table(s) | Migration-tracking table | Write model |
|---|---|---|---|
| Analytics events | events, events_5m_mv, sessions | _temps_ch_migrations | Dual-write: PostgreSQL is source of truth, fan-out worker replicates |
| Traces / spans | spans | _temps_ch_otel_migrations | Backend swap: writes go to ClickHouse instead of TimescaleDB |
| Resource metrics | service_metrics | _temps_ch_metrics_migrations | Backend swap |
| Proxy / request logs | proxy_logs | _temps_ch_proxy_logs_migrations | Backend swap |

The write model is the most important distinction for planning your migration, because it determines whether your existing data comes along automatically — see Historical data & backfill.


Prerequisites

  • A reachable ClickHouse instance or cluster (self-hosted or managed) speaking the HTTP protocol (default port 8123).
  • A ClickHouse user with permission to CREATE DATABASE, CREATE TABLE, and read/write within the target database. Temps bootstraps its own schema on first connect.
  • Network reachability from the host running temps serve to the ClickHouse HTTP endpoint.
  • Your existing PostgreSQL/TimescaleDB instance, untouched — it remains the system of record for analytics and the fallback for all four domains.
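Before touching Temps, it can help to confirm the reachability prerequisite from the host that runs temps serve. The sketch below probes ClickHouse's HTTP /ping endpoint, which answers "Ok." when the server is up. The function names and the clickhouse hostname are illustrative assumptions, not part of Temps.

```shell
# Hypothetical pre-flight helpers; these are not shipped with Temps.

# Build the health-check URL from a base endpoint, tolerating one trailing slash.
ch_ping_url() {
  local base="${1%/}"
  printf '%s/ping\n' "$base"
}

# Return 0 if the ClickHouse HTTP interface answers within 5 seconds.
ch_reachable() {
  curl --silent --fail --max-time 5 "$(ch_ping_url "$1")" >/dev/null
}

# Example, run from the host that runs `temps serve`:
#   ch_reachable http://clickhouse:8123 && echo "ClickHouse is reachable"
```

A successful ping only proves network reachability; it does not exercise the user's credentials or permissions.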

Step 1 — Stand up ClickHouse

Run a ClickHouse server reachable over HTTP. A minimal Docker example for evaluation:

docker

docker run -d --name temps-clickhouse \
  -p 8123:8123 -p 9000:9000 \
  -e CLICKHOUSE_USER=temps \
  -e CLICKHOUSE_PASSWORD=your-clickhouse-password \
  -e CLICKHOUSE_DB=temps \
  clickhouse/clickhouse-server:latest

You do not need to create any tables by hand. On first connect, each Temps backend issues CREATE DATABASE IF NOT EXISTS for the target database and applies its own idempotent ClickHouse schema migrations — tracked per-domain in its own _temps_ch_*_migrations table.
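To check that the container accepts the credentials before pointing Temps at it, one option is to send a trivial statement over the HTTP interface with basic auth. This is a minimal sketch: ch_query is a hypothetical helper name, and the URL, user, and password in the example are the placeholder values from the Docker command above.

```shell
# Run one SQL statement over the ClickHouse HTTP interface.
# ch_query is an illustrative helper, not a Temps or ClickHouse command.
ch_query() {
  local url="$1" user="$2" password="$3" sql="$4"
  # The statement is sent as the POST body; credentials use HTTP basic auth.
  curl --silent --show-error --fail \
    --user "${user}:${password}" \
    --data-binary "$sql" \
    "${url%/}/"
}

# Example against the evaluation container:
#   ch_query http://localhost:8123 temps your-clickhouse-password 'SELECT 1'
```

If the call fails with an authentication error here, Temps will fail the same way once the variables in the next step are set.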


Step 2 — Configure Temps

Set the ClickHouse variables on the process running temps serve:

environment

export TEMPS_CLICKHOUSE_URL=http://clickhouse:8123
export TEMPS_CLICKHOUSE_USER=temps
export TEMPS_CLICKHOUSE_PASSWORD=your-clickhouse-password
# Optional — defaults to "temps" when TEMPS_CLICKHOUSE_URL is set:
export TEMPS_CLICKHOUSE_DATABASE=temps

| Variable | Required | Default | Description |
|---|---|---|---|
| TEMPS_CLICKHOUSE_URL | Yes | (none) | ClickHouse HTTP endpoint (e.g. http://host:8123) |
| TEMPS_CLICKHOUSE_USER | Yes | (none) | ClickHouse username |
| TEMPS_CLICKHOUSE_PASSWORD | Yes | (none) | ClickHouse password |
| TEMPS_CLICKHOUSE_DATABASE | No | temps | Target database; auto-resolves to temps when the URL is set |

Fail-closed activation

The ClickHouse backends activate only when the URL, user, and password are all present and non-empty (the database name fills in automatically). A variable that is set but empty counts as unset, so a half-configured deployment never silently loses data: Temps simply stays in the default TimescaleDB-only mode. There is no partial-ClickHouse state.
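The activation rule can be mirrored in a small pre-flight check that you run in the same environment as temps serve. This is a sketch of the documented rule, not Temps's own code; the function names and messages are assumptions.

```shell
# Hypothetical pre-flight check mirroring the documented activation rule.

# Succeed only if the given value is non-empty; otherwise name the culprit.
require_nonempty() {
  if [ -z "$2" ]; then
    echo "ClickHouse disabled: $1 is unset or empty" >&2
    return 1
  fi
}

# URL, user, and password must all be set and non-empty.
# The database name falls back to "temps" when it is unset or empty.
temps_clickhouse_configured() {
  require_nonempty TEMPS_CLICKHOUSE_URL "${TEMPS_CLICKHOUSE_URL:-}" &&
  require_nonempty TEMPS_CLICKHOUSE_USER "${TEMPS_CLICKHOUSE_USER:-}" &&
  require_nonempty TEMPS_CLICKHOUSE_PASSWORD "${TEMPS_CLICKHOUSE_PASSWORD:-}" &&
  echo "ClickHouse enabled, database=${TEMPS_CLICKHOUSE_DATABASE:-temps}"
}
```

Calling temps_clickhouse_configured before a restart shows which of the three required variables, if any, would keep Temps in TimescaleDB-only mode.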


Step 3 — Restart and verify

Restart temps serve so it reads the new environment. On boot, for each enabled domain Temps will:

  1. CREATE DATABASE IF NOT EXISTS for the target database (run against the always-present default database so the session never depends on a database that doesn't exist yet).
  2. Apply the embedded ClickHouse schema migrations, tracked per-domain in its own _temps_ch_*_migrations table.
  3. Swap the read (and, for the swap-model domains, write) path to the ClickHouse backend.

Migrations run on background tasks so plugin init does not block on remote calls. If a migration fails, the error surfaces on the first query for that domain rather than crashing startup.

Confirm it's live

Check the server logs for the activation lines, for example:

ClickHouse OTel backend enabled (ADR-016) — applying migrations url=http://clickhouse:8123 database=temps
ClickHouse OTel migrations applied applied=["0001_spans"] skipped_count=0
Proxy/request logs: ClickHouse backend enabled (TEMPS_CLICKHOUSE_* configured)
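If the server output is noisy, a filter along these lines narrows it to the activation messages. The pattern is derived only from the three sample lines above; the exact wording for the other domains is not shown here, so treat the regular expression as an assumption and loosen it if lines are missing.

```shell
# Keep only log lines that look like ClickHouse activation messages.
# Reads log text on stdin (a file, journalctl output, docker logs, ...).
clickhouse_activation_lines() {
  grep -E 'ClickHouse.*(backend enabled|migrations applied)'
}

# Example:
#   docker logs temps 2>&1 | clickhouse_activation_lines
```

The container name temps in the example is a placeholder for however you run the server.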

Then query ClickHouse directly to confirm the schema and that new data is flowing:

clickhouse-client

# Tables exist in the shared database.
# Add --host, --user, and --password if ClickHouse is remote or requires auth.
clickhouse-client -q "SHOW TABLES FROM temps"
# Expect: events, events_5m_mv, sessions, spans, service_metrics, proxy_logs,
#         and the four _temps_ch_*_migrations tracking tables.

# New telemetry is landing (run after generating some traffic)
clickhouse-client -q "SELECT count() FROM temps.spans"
clickhouse-client -q "SELECT count() FROM temps.proxy_logs"
clickhouse-client -q "SELECT count() FROM temps.service_metrics"
clickhouse-client -q "SELECT count() FROM temps.events FINAL"
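The four migration-tracking tables can be checked the same way. Their column layout is not documented here, so this sketch only counts rows; the helper name and the tracked_rows alias are illustrative.

```shell
# Emit one row-count query per migration-tracking table.
# Table names come from the "What changes" table; the default database is temps.
migration_check_sql() {
  local db="${1:-temps}" table
  for table in _temps_ch_migrations _temps_ch_otel_migrations \
               _temps_ch_metrics_migrations _temps_ch_proxy_logs_migrations; do
    printf "SELECT '%s' AS tracking_table, count() AS tracked_rows FROM %s.%s;\n" \
      "$table" "$db" "$table"
  done
}

# Example:
#   migration_check_sql temps | clickhouse-client --multiquery
```

A tracking table that is missing or empty points at a domain whose migrations have not run yet or have failed.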

Finally, open the console — the Observe, Analytics, Monitoring, and traces views now read from ClickHouse. They should render identically; the data just comes from the columnar store.


Historical data & backfill

This is where the two write models diverge. Read this section before you migrate so you know what comes along.

Analytics events — history replicates automatically

Analytics events dual-write: record_event always writes to PostgreSQL synchronously (the system of record), then enqueues the event id into an events_ch_outbox table (INSERT ... ON CONFLICT (event_id) DO NOTHING). A background fan-out worker drains that outbox into ClickHouse, claiming batches with FOR UPDATE SKIP LOCKED.

Because the outbox is a queue, existing analytics events replicate to ClickHouse as the worker catches up after you enable the backend. Retries are safe — the events table uses ReplacingMergeTree(_version) and dedupes over its sort key, so re-delivering an event collapses to one row.
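The claiming step is a standard PostgreSQL queue pattern, sketched below for illustration. Only the table name events_ch_outbox, the event_id column, and FOR UPDATE SKIP LOCKED come from the description above; the batch size, the ordering, and the helper name are assumptions, and the actual worker query may differ.

```shell
# Print an illustrative batch-claim transaction for the outbox table.
# This shows the general technique, not the query Temps actually runs.
outbox_claim_sql() {
  local batch_size="${1:-500}"
  cat <<SQL
BEGIN;
SELECT event_id
FROM events_ch_outbox
ORDER BY event_id
LIMIT ${batch_size}
FOR UPDATE SKIP LOCKED;
-- ...deliver the claimed events to ClickHouse, then remove or mark them...
COMMIT;
SQL
}
```

Two workers running this concurrently receive disjoint batches, because each skips rows the other has already locked instead of waiting on them.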

Traces, metrics, and proxy logs — new data only

These three domains do not dual-write. When ClickHouse is enabled, the backend is chosen at construction time: new writes go to ClickHouse instead of TimescaleDB. There is no automatic backfill of the spans, service_metrics, or proxy_logs rows already sitting in TimescaleDB.

In practice this means:

  • After you switch on ClickHouse, the traces / metrics / proxy-log views show data ingested from that point forward.
  • Your historical TimescaleDB data is not deleted — it stays in the hypertables and ages out under its existing 90-day retention policy. It is simply no longer the backend the console reads from for those domains.

Behavior differences

The ClickHouse backends are designed to be drop-in, but a few behaviors differ by design:

  • Approximate unique counts. The analytics-events backend uses uniq(), ClickHouse's approximate distinct-count estimator, accurate to within ~1% at scale, instead of the exact COUNT(DISTINCT) used by PostgreSQL.
  • Self-referral filter is PostgreSQL-only. The referrer-hostname self-referral filter checks the project_custom_domains table, which is not replicated to ClickHouse, so a referrer breakdown on the ClickHouse backend may include your own domain.
  • Visitor/session analytics stay on PostgreSQL. Only the analytics events read path is backed by ClickHouse; the separate visitor/session path always uses PostgreSQL.
  • Geolocation is denormalized. Country/region/city are written onto each ClickHouse event row at fan-out time, so breakdowns group on plain columns with no cross-database join.
  • Deduplication semantics. ClickHouse tables use ReplacingMergeTree; reads that must be exact use FINAL to collapse duplicates before aggregating.
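To get a feel for the approximate-count difference without touching Temps data, you can compare uniq() with uniqExact() on synthetic input. The sketch below uses ClickHouse's numbers() table function, so it assumes nothing about the Temps schema; the helper name is illustrative.

```shell
# Print a query comparing approximate and exact distinct counts
# over one million synthetic rows containing 100000 distinct values.
uniq_comparison_sql() {
  cat <<'SQL'
SELECT
  uniq(number % 100000)      AS approximate,
  uniqExact(number % 100000) AS exact
FROM numbers(1000000)
SQL
}

# Example:
#   uniq_comparison_sql | clickhouse-client
```

The exact column is 100000 by construction; the approximate column should land close to it, which is the trade-off the analytics backend accepts for faster aggregation.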

Rollback

Rolling back is symmetric with enabling and loses no system-of-record data:

  1. Unset TEMPS_CLICKHOUSE_URL, TEMPS_CLICKHOUSE_USER, and TEMPS_CLICKHOUSE_PASSWORD (and TEMPS_CLICKHOUSE_DATABASE if you set it).
  2. Restart temps serve.

Temps reverts every domain to the PostgreSQL/TimescaleDB path with no other changes. For analytics events, PostgreSQL was the system of record the entire time, so nothing is lost. For traces / metrics / proxy logs, the console reads from TimescaleDB again. Data ingested while ClickHouse was enabled stays in ClickHouse and does not appear in the TimescaleDB-backed views, which mirrors the gap at forward cutover.
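When the variables were set with export in a shell, step 1 can be done in one call. This is a minimal sketch with a hypothetical function name; if the variables live in a service unit, compose file, or secrets manager, remove them there instead.

```shell
# Clear all four ClickHouse variables from the current shell.
# Must run in the shell that launches `temps serve`, not in a subshell.
temps_clickhouse_clear_env() {
  unset TEMPS_CLICKHOUSE_URL TEMPS_CLICKHOUSE_USER \
        TEMPS_CLICKHOUSE_PASSWORD TEMPS_CLICKHOUSE_DATABASE
  echo "ClickHouse variables cleared; restart temps serve to apply"
}
```

After the restart, the activation log lines from Step 3 should no longer appear.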


Troubleshooting

  • ClickHouse stays disabled after setting variables (config)
    All of TEMPS_CLICKHOUSE_URL, _USER, and _PASSWORD must be present and non-empty. A variable that is set but empty counts as unset. Check for trailing whitespace or an empty value, and confirm the variables are exported into the same process environment as temps serve.

  • “Database temps does not exist” on boot (error)
    Temps issues CREATE DATABASE IF NOT EXISTS against the always-present default database before any other statement. If you see this, the configured user likely lacks CREATE DATABASE permission. Grant it, or pre-create the database and grant table privileges.

  • Migration SQL error at boot (error)
    Each domain applies idempotent migrations tracked in its own _temps_ch_*_migrations table. Re-running is safe (CREATE ... IF NOT EXISTS). If a migration genuinely fails, the error surfaces on the first query for that domain; check the server logs for the offending statement.

  • Analytics events not appearing in ClickHouse (data)
    Events replicate via the events_ch_outbox. Verify the fan-out worker is running and ClickHouse is reachable. A growing dead-letter count (rows with attempts >= max_attempts) in the warn logs indicates delivery failures, almost always caused by ClickHouse connectivity or auth.

  • Traces/metrics/proxy logs missing history (expected)
    Expected. These three domains do not backfill; see Historical data & backfill. Only data ingested after enabling ClickHouse appears in those views.
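For the analytics-events entry above, the dead-letter count can also be read straight from PostgreSQL rather than from the warn logs. This sketch assumes attempts is a column on events_ch_outbox, as the description implies; the max_attempts value is not documented here, so you pass in whatever your deployment uses. The helper name and the DATABASE_URL variable in the example are illustrative.

```shell
# Print a query counting outbox rows that have exhausted their retries.
# $1 is the max_attempts threshold; printf rejects non-numeric input.
dead_letter_sql() {
  local max_attempts="$1"
  printf 'SELECT count(*) AS dead_letters FROM events_ch_outbox WHERE attempts >= %d;\n' \
    "$max_attempts"
}

# Example:
#   dead_letter_sql 5 | psql "$DATABASE_URL"
```

A count that keeps rising between runs points at persistent delivery failures rather than a transient outage.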

