Data Flow Architecture
This section explains how data flows through Temps for analytics, error tracking, session replay, and monitoring.
Analytics Data Flow
Request Analytics Pipeline
Every HTTP request generates an analytics event:
1. Request arrives at Pingora
↓
2. ProxyContext captures metadata
├── Method, path, headers
├── Client IP, User-Agent
├── Visitor ID, Session ID
└── Timestamps
↓
3. Request routed to upstream
↓
4. Response received
├── Status code
├── Response time
├── Content type
└── Response headers
↓
5. ProxyLogService creates event
└── CreateProxyLogRequest struct
↓
6. Event inserted into database
└── proxy_logs table (TimescaleDB hypertable)
↓
7. Dashboard queries events
├── Real-time display
├── Historical analysis
├── Funnels and paths
└── Performance metrics
Captured Event Data
{
  "request_id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2024-01-15T10:30:45.123Z",
  "project_id": 5,
  "deployment_id": 42,
  "method": "GET",
  "path": "/api/users",
  "query_string": "page=1&limit=10",
  "status": 200,
  "duration_ms": 45,
  "request_size": 256,
  "response_size": 5120,
  "ip_address": "203.0.113.45",
  "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
  "referrer": "https://example.com/dashboard",
  "visitor_id": "v_xyz789",
  "session_id": "s_abc123",
  "content_type": "application/json",
  "cache_status": "MISS",
  "request_headers": {
    "Accept": "application/json",
    "Authorization": "Bearer ..."
  }
}
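The write in step 5 must stay off the request's hot path. A minimal sketch, assuming a CreateProxyLogRequest built from the ProxyContext is handed to ProxyLogService over a channel and written to the proxy_logs hypertable by a background task (field and method names beyond those shown above are illustrative):

use tokio::sync::mpsc;

// Illustrative subset of the event named in step 5; the real struct has more fields.
pub struct CreateProxyLogRequest {
    pub project_id: i32,
    pub method: String,
    pub path: String,
    pub status: i16,
    pub duration_ms: i64,
    pub visitor_id: Option<String>,
    pub session_id: Option<String>,
}

pub struct ProxyLogService {
    tx: mpsc::Sender<CreateProxyLogRequest>,
}

impl ProxyLogService {
    // Called from the proxy's response-complete hook; must not block the request.
    pub fn log(&self, event: CreateProxyLogRequest) {
        // try_send drops the event if the buffer is full rather than stalling the proxy.
        let _ = self.tx.try_send(event);
    }
}

// Background task: drain the channel and insert rows off the request path.
async fn writer(mut rx: mpsc::Receiver<CreateProxyLogRequest>, pool: sqlx::PgPool) {
    while let Some(event) = rx.recv().await {
        sqlx::query(
            "INSERT INTO proxy_logs (project_id, method, path, status, duration_ms, visitor_id, session_id)
             VALUES ($1, $2, $3, $4, $5, $6, $7)",
        )
        .bind(event.project_id)
        .bind(&event.method)
        .bind(&event.path)
        .bind(event.status)
        .bind(event.duration_ms)
        .bind(&event.visitor_id)
        .bind(&event.session_id)
        .execute(&pool)
        .await
        .ok();
    }
}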
Error Tracking Data Flow
Error Capture to Storage
1. Error occurs in user app
├── Uncaught exception
├── API error
└── Runtime error
↓
2. Error SDK captures
├── Stack trace
├── Error message
├── Context (URL, user, etc.)
└── Source code line
↓
3. Send to Temps API
└── POST /api/errors
↓
4. Error Service processes
├── Extract error type
├── Generate fingerprint
├── Group similar errors
└── Deduplicate
↓
5. Store in database
└── errors table
↓
6. Trigger notifications
├── Email to team
├── Slack message
└── Webhook call
↓
7. Dashboard aggregates
├── Error groups
├── Trend analysis
├── Stack traces
└── Environment context
Error Event Structure
pub struct ErrorEvent {
    pub project_id: i32,
    pub deployment_id: Option<i32>,
    pub error_type: String,          // e.g., "TypeError"
    pub error_message: String,
    pub stack_trace: String,
    pub fingerprint: String,         // For grouping
    pub environment: String,         // dev, prod, etc.
    pub user_id: Option<String>,
    pub user_email: Option<String>,
    pub context: serde_json::Value,  // Additional context
    pub source_map: Option<String>,  // For transpiled JS code
    pub first_seen: DateTime<Utc>,
    pub last_seen: DateTime<Utc>,
    pub occurrence_count: i64,
}
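The fingerprint field drives the grouping and deduplication in step 4. A minimal sketch, assuming the fingerprint hashes the error type plus stack frames with volatile suffixes (line and column numbers) stripped; the exact normalization Temps applies may differ:

use sha2::{Digest, Sha256};

// Hash the stable parts of an error so recurring errors collapse into one group.
fn fingerprint(error_type: &str, stack_trace: &str) -> String {
    let mut hasher = Sha256::new();
    hasher.update(error_type.as_bytes());
    for frame in stack_trace.lines().take(8) {
        // Strip trailing ":42:13)"-style positions so a redeploy that shifts
        // line numbers still maps to the same group.
        let stable = frame
            .trim()
            .trim_end_matches(|c: char| c.is_ascii_digit() || c == ':' || c == ')');
        hasher.update(stable.as_bytes());
    }
    hex::encode(hasher.finalize())
}

Two occurrences of the same TypeError from different deployments then share a fingerprint and increment occurrence_count instead of creating a new error group.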
Session Replay Data Flow
Recording to Playback
1. User visits web app
↓
2. Analytics SDK initializes
└── Loads session replay recorder
↓
3. Record DOM mutations
├── Element additions
├── Element removals
├── Text changes
├── Attribute changes
└── Style changes
↓
4. Record user interactions
├── Clicks
├── Form inputs
├── Scrolls
├── Keyboard events
└── Touch events
↓
5. Buffer events in memory
├── Capture a snapshot every 100ms
├── Store mutations between snapshots
└── Keep ~5 minutes of data
↓
6. Periodically send to server
├── POST /api/session-replay
├── Batch multiple events
├── Compress with gzip
└── Retry on failure
↓
7. Store in database
└── session_replay_events table
↓
8. Dashboard replays
├── Load stored events
├── Rebuild DOM state
├── Play mutations frame-by-frame
├── Sync with console logs
└── Show user interactions
Session Replay Event
{
  "session_id": "s_abc123",
  "visitor_id": "v_xyz789",
  "project_id": 5,
  "timestamp": "2024-01-15T10:30:45.123Z",
  "type": "mutation",
  "mutations": [
    {
      "type": "added_node",
      "id": 42,
      "parent_id": 41,
      "tag": "div",
      "attributes": {
        "class": "new-element",
        "data-testid": "modal"
      }
    },
    {
      "type": "text_update",
      "id": 43,
      "text": "User clicked button"
    },
    {
      "type": "attribute_update",
      "id": 44,
      "attribute": "aria-hidden",
      "value": "false"
    }
  ]
}
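On the server side this payload deserializes naturally into a tagged enum. A minimal sketch with serde, covering just the three mutation types shown above (the recorder emits more):

use serde::Deserialize;
use std::collections::HashMap;

// Mirrors the "type" field of each mutation in the replay payload.
#[derive(Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
enum Mutation {
    AddedNode {
        id: u64,
        parent_id: u64,
        tag: String,
        #[serde(default)]
        attributes: HashMap<String, String>,
    },
    TextUpdate {
        id: u64,
        text: String,
    },
    AttributeUpdate {
        id: u64,
        attribute: String,
        value: String,
    },
}

#[derive(Deserialize)]
struct ReplayEvent {
    session_id: String,
    visitor_id: String,
    project_id: i32,
    timestamp: String,
    #[serde(rename = "type")]
    kind: String,
    #[serde(default)]
    mutations: Vec<Mutation>,
}

During playback (step 8) the dashboard applies these mutations in timestamp order to a virtual DOM keyed by node id.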
Performance Metrics Data Flow
Monitoring Application Performance
1. User loads web app
↓
2. Browser measures Core Web Vitals
├── LCP (Largest Contentful Paint)
├── FID (First Input Delay)
├── CLS (Cumulative Layout Shift)
└── FCP (First Contentful Paint)
↓
3. Analytics SDK collects metrics
├── Navigation timing
├── Resource timing
├── Custom metrics
└── Error rates
↓
4. Send to Temps Analytics API
├── POST /api/analytics/metrics
├── Batch with other events
└── Include device/browser info
↓
5. Store in database
└── performance_metrics table (TimescaleDB)
↓
6. Dashboard displays
├── Performance trends
├── Device/browser comparison
├── Slow page detection
└── Alerts on degradation
Performance Metric Event
{
  "project_id": 5,
  "session_id": "s_abc123",
  "timestamp": "2024-01-15T10:30:45.123Z",
  "page": "/dashboard",
  "metrics": {
    "lcp": 1250,   // ms
    "fid": 150,    // ms
    "cls": 0.05,   // unitless
    "fcp": 800,    // ms
    "ttfb": 200,   // Time to First Byte (ms)
    "navigation_start": 0,
    "load_event_end": 2000
  },
  "device": {
    "type": "desktop",
    "os": "Windows 10",
    "browser": "Chrome 120"
  }
}
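Dashboards summarize Web Vitals at a percentile rather than a mean, since a few slow outliers should not hide a generally fast page (or vice versa). A hedged sketch of the step-6 trend query, assuming columns named after the JSON payload above:

// p75 LCP per page over the last 24 hours, using Postgres' percentile_cont.
async fn lcp_p75(pool: &sqlx::PgPool, project_id: i32) -> sqlx::Result<Vec<(String, f64)>> {
    sqlx::query_as(
        "SELECT page,
                percentile_cont(0.75) WITHIN GROUP (ORDER BY lcp) AS lcp_p75
         FROM performance_metrics
         WHERE project_id = $1
           AND timestamp > now() - interval '24 hours'
         GROUP BY page
         ORDER BY lcp_p75 DESC",
    )
    .bind(project_id)
    .fetch_all(pool)
    .await
}

The 75th percentile is the threshold Google recommends for judging Core Web Vitals, which is why p75 rather than an average appears here.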
Funnel Analysis Data Flow
Converting Events to Funnels
1. User specifies funnel definition
├── Step 1: /pricing (view page)
├── Step 2: /signup (view page)
├── Step 3: signed_up (custom event)
└── Step 4: /dashboard (view page)
↓
2. System queries events
└── Find all sessions matching pattern
↓
3. Analyze conversion
├── Step 1: 1000 sessions
├── Step 2: 800 sessions (80% conversion)
├── Step 3: 600 sessions (75% conversion from step 2)
└── Step 4: 500 sessions (83% conversion from step 3)
↓
4. Calculate dropout
├── Lost 200 sessions between step 1 and 2
├── Lost 200 sessions between step 2 and 3
├── Lost 100 sessions between step 3 and 4
└── Overall: 50% conversion
↓
5. Display in dashboard
├── Visual funnel diagram
├── Conversion rates
├── Dropout analysis
├── Time between steps
└── Segment by user properties
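The arithmetic in steps 3 and 4 is simple set counting: each step's sessions are a subset of the previous step's. A minimal sketch using the counts above:

// Per-step conversion and dropout from raw session counts, in funnel order.
fn funnel_stats(counts: &[u64]) -> Vec<(f64, u64)> {
    counts
        .windows(2)
        .map(|w| {
            let (prev, curr) = (w[0], w[1]);
            let rate = curr as f64 / prev as f64; // step conversion
            let lost = prev - curr;               // dropout between the two steps
            (rate, lost)
        })
        .collect()
}

fn main() {
    let counts = [1000, 800, 600, 500]; // sessions reaching each step
    for (i, (rate, lost)) in funnel_stats(&counts).iter().enumerate() {
        println!(
            "step {} -> {}: {:.0}% conversion, {} sessions lost",
            i + 1, i + 2, rate * 100.0, lost
        );
    }
    println!("overall: {:.0}%", 100.0 * counts[3] as f64 / counts[0] as f64);
}

Running this reproduces the numbers in the diagram: 80%, 75%, and 83% step conversions, and 50% overall.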
Visitor Segmentation Data Flow
Building User Cohorts
1. User defines segment
├── Browser = Chrome
├── Country = USA
├── Has signed up
└── Last seen in last 7 days
↓
2. Query database
├── Find visitors matching criteria
├── Get their sessions
├── Calculate properties
└── Group by dimension
↓
3. Calculate metrics
├── Segment size
├── Bounce rate
├── Average session duration
├── Pages per session
└── Conversion rate
↓
4. Display results
├── Size breakdown
├── Comparative metrics
├── Trend over time
└── Behavior analysis
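A hedged sketch of step 1's segment definition as data, which step 2 can translate into a query; the criteria mirror the example above, and all names are illustrative:

// A segment is a conjunction of visitor criteria.
enum Criterion {
    Browser(String),
    Country(String),
    HasEvent(String),          // e.g. a "signed_up" custom event
    LastSeenWithinDays(u32),
}

// Real code would use bound parameters; interpolation is shown for brevity.
fn to_sql(criteria: &[Criterion]) -> String {
    let clauses: Vec<String> = criteria
        .iter()
        .map(|c| match c {
            Criterion::Browser(b) => format!("browser = '{b}'"),
            Criterion::Country(co) => format!("country = '{co}'"),
            Criterion::HasEvent(e) => format!(
                "EXISTS (SELECT 1 FROM events e WHERE e.visitor_id = v.id AND e.name = '{e}')"
            ),
            Criterion::LastSeenWithinDays(d) => {
                format!("last_seen > now() - interval '{d} days'")
            }
        })
        .collect();
    format!("SELECT * FROM visitors v WHERE {}", clauses.join(" AND "))
}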
Real-Time Dashboard Updates
WebSocket-Based Updates
1. Client connects to dashboard
├── Opens WebSocket connection
├── Subscribes to project_id
└── Subscribes to metric types
↓
2. Proxy logs new event
├── Event inserted into database
├── Insert trigger fires a Postgres NOTIFY
├── Temps receives the notification
└── Event is broadcast to subscribed clients
↓
3. Updates appear in dashboard
├── Page view count updates
├── Live visitor count
├── Error count badge
├── Performance metrics
└── No page refresh needed
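A minimal sketch of the server half of this loop, assuming the insert trigger calls pg_notify on a channel named proxy_events (channel name and payload shape are illustrative). sqlx's PgListener receives the NOTIFY, and a tokio broadcast channel fans it out to WebSocket sessions:

use sqlx::postgres::PgListener;
use tokio::sync::broadcast;

async fn relay_events(database_url: &str, tx: broadcast::Sender<String>) -> sqlx::Result<()> {
    let mut listener = PgListener::connect(database_url).await?;
    listener.listen("proxy_events").await?; // channel the insert trigger notifies
    loop {
        let notification = listener.recv().await?;
        // Payload is the JSON event; each WebSocket task holds a broadcast::Receiver
        // and forwards events matching its subscribed project_id to the client.
        let _ = tx.send(notification.payload().to_string());
    }
}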
Data Retention
Storage Strategy
Real-time data (24 hours)
├── All events stored at full resolution
├── Quick access for debugging
└── High write throughput
↓
Short-term data (7 days)
├── Aggregated by hour
├── Details available on demand
└── Moderate compression
↓
Medium-term data (30 days)
├── Aggregated by day
├── Trends and patterns
└── Heavy compression
↓
Long-term data (1 year)
├── Only summary statistics
├── Archived in cold storage
└── For annual reports
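The hourly and daily tiers map naturally onto TimescaleDB continuous aggregates, which keep rollups up to date incrementally as new events arrive. A hedged sketch of the 7-day tier (view and column names are illustrative):

// One-time migration: hourly rollup of raw proxy events for the 7-day tier.
async fn create_hourly_rollup(pool: &sqlx::PgPool) -> sqlx::Result<()> {
    sqlx::query(
        "CREATE MATERIALIZED VIEW proxy_logs_hourly
         WITH (timescaledb.continuous) AS
         SELECT time_bucket('1 hour', timestamp) AS bucket,
                project_id,
                count(*)         AS requests,
                avg(duration_ms) AS avg_duration_ms
         FROM proxy_logs
         GROUP BY bucket, project_id",
    )
    .execute(pool)
    .await?;
    Ok(())
}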
TimescaleDB Hypertables
Temps uses TimescaleDB for time-series optimization:
Features:
- Automatic chunking - Data split by time intervals (daily chunks)
- Automatic compression - Old data automatically compressed for storage efficiency
- Optimized queries - Fast time-range queries with automatic index selection
- Parallel processing - Large aggregations processed across chunks
Result: Queries over 24 hours of data return in milliseconds, even with millions of events; queries over a full year complete in seconds.
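A hedged sketch of how the proxy_logs hypertable and its policies could be declared; the interval choices follow the chunking and retention notes above, but the exact settings are illustrative:

// Migration: turn proxy_logs into a hypertable with daily chunks,
// compress chunks older than 7 days, and drop raw data after 30 days.
async fn setup_hypertable(pool: &sqlx::PgPool) -> sqlx::Result<()> {
    for stmt in [
        "SELECT create_hypertable('proxy_logs', 'timestamp', chunk_time_interval => interval '1 day')",
        "ALTER TABLE proxy_logs SET (timescaledb.compress, timescaledb.compress_segmentby = 'project_id')",
        "SELECT add_compression_policy('proxy_logs', interval '7 days')",
        "SELECT add_retention_policy('proxy_logs', interval '30 days')",
    ] {
        sqlx::query(stmt).execute(pool).await?;
    }
    Ok(())
}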
Monitoring the System
System Health Metrics
Database Performance
├── Query latency
├── Connection pool usage
├── Slow query log
└── Table sizes
Proxy Performance
├── Request rate (req/sec)
├── Response time (p50, p95, p99)
├── Error rate (4xx, 5xx)
├── Upstream latency
└── Certificate cache hits
Application Performance
├── Memory usage
├── CPU usage
├── Async task count
├── File descriptors
└── Error rates
Data Privacy
Sensitive Data Handling
PII Protection
├── IP addresses captured for geographic info
├── User email optional (for notifications)
├── Custom data depends on user implementation
└── Client-side control with analytics SDK
Data Encryption
├── Secrets encrypted with AES-GCM
├── Password hashes with Argon2
├── TLS for all data in transit
└── At-rest encryption available
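A minimal sketch of the two primitives named above, using the aes-gcm and argon2 crates; key management and the exact parameters Temps uses are not shown here:

use aes_gcm::aead::{Aead, AeadCore, KeyInit, OsRng};
use aes_gcm::{Aes256Gcm, Key};
use argon2::password_hash::{PasswordHasher, SaltString};
use argon2::Argon2;

// Encrypt a secret with AES-256-GCM; the random nonce is stored alongside the ciphertext.
fn encrypt_secret(key: &Key<Aes256Gcm>, plaintext: &[u8]) -> (Vec<u8>, Vec<u8>) {
    let cipher = Aes256Gcm::new(key);
    let nonce = Aes256Gcm::generate_nonce(&mut OsRng);
    let ciphertext = cipher.encrypt(&nonce, plaintext).expect("encryption failure");
    (nonce.to_vec(), ciphertext)
}

// Hash a password with Argon2id; the PHC-format string embeds salt and parameters.
fn hash_password(password: &str) -> String {
    let salt = SaltString::generate(&mut OsRng);
    Argon2::default()
        .hash_password(password.as_bytes(), &salt)
        .expect("hashing failure")
        .to_string()
}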
Data Retention
├── User can configure retention policy
├── Automatic deletion after retention period
├── Manual deletion of specific events
└── Data export capability
Next Steps
- Request Flow - How requests generate events
- Pingora Load Balancer - Where events originate
- Security Architecture - How data is protected