How StatsAware Works

Technical overview for developers and system administrators

§Architecture Overview

StatsAware is a distributed team activity tracking system that provides real-time visibility into when team members are online and working across different timezones.

Core Components

  • Frontend: Next.js application with real-time dashboard
  • Backend API: FastAPI/Express server handling integrations
  • Database: PostgreSQL with time-series optimized schema
  • Integration Layer: Connectors to external services (Slack, Git, etc.)

§Data Collection

Activity Sources

StatsAware aggregates activity data from multiple sources:

  • Communication tools (Slack, Teams, Discord) - presence status
  • Development tools (Git, GitHub, GitLab) - commit activity
  • Project management (Jira, Linear, Asana) - task updates
  • Calendar systems (Google Calendar, Outlook) - meeting times

Data Types

We track two primary data patterns:

Activity Blocks

Time-duration activities stored as start/end timestamps:

-- Duration-based tracking
CREATE TABLE activity_blocks (
  start_time timestamp,
  end_time timestamp,
  duration_minutes integer,
  kind text CHECK (kind IN ('meeting', 'focus_time', 'break'))
);
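To illustrate how `duration_minutes` relates to the two timestamps, here is a minimal sketch that derives it from `start_time` and `end_time`. The `ActivityBlock` type and the `makeBlock` helper are hypothetical names, and rounding to the nearest whole minute is an assumption rather than confirmed behavior.

```typescript
// Hypothetical in-memory shape mirroring the activity_blocks fields.
type BlockKind = 'meeting' | 'focus_time' | 'break'

interface ActivityBlock {
  start_time: Date
  end_time: Date
  duration_minutes: number
  kind: BlockKind
}

// Builds a block and derives duration_minutes from the two timestamps.
// Assumption: durations are rounded to the nearest whole minute.
function makeBlock(start: Date, end: Date, kind: BlockKind): ActivityBlock {
  if (end.getTime() < start.getTime()) {
    throw new RangeError('end_time must not precede start_time')
  }
  const duration_minutes = Math.round((end.getTime() - start.getTime()) / 60000)
  return { start_time: start, end_time: end, duration_minutes, kind }
}

const standup = makeBlock(
  new Date('2025-06-07T14:30:00Z'),
  new Date('2025-06-07T14:45:00Z'),
  'meeting'
)
console.log(standup.duration_minutes) // 15
```

Storing the derived duration alongside the timestamps trades a little redundancy for cheaper aggregation queries.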

Activity Events

Point-in-time events with metadata:

-- Event-based tracking
CREATE TABLE activity_events (
  "timestamp" timestamp,
  kind text CHECK (kind IN ('commit', 'message', 'task_update')),
  metadata jsonb
);
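Point-in-time events are typically consumed as counts per kind, for example to derive message frequency. The sketch below shows one way to do that in application code; the `ActivityEvent` type and the `countByKind` helper are hypothetical names used only for illustration.

```typescript
// Hypothetical in-memory shape mirroring the activity_events fields.
type EventKind = 'commit' | 'message' | 'task_update'

interface ActivityEvent {
  timestamp: Date
  kind: EventKind
  metadata: Record<string, unknown>
}

// Tallies how many events of each kind occur in the given list.
function countByKind(events: ActivityEvent[]): Record<EventKind, number> {
  const counts: Record<EventKind, number> = { commit: 0, message: 0, task_update: 0 }
  for (const event of events) {
    counts[event.kind] += 1
  }
  return counts
}

const sampleEvents: ActivityEvent[] = [
  { timestamp: new Date('2025-06-07T14:30:00Z'), kind: 'commit', metadata: { commits: 2 } },
  { timestamp: new Date('2025-06-07T14:31:00Z'), kind: 'message', metadata: {} },
  { timestamp: new Date('2025-06-07T14:32:00Z'), kind: 'message', metadata: {} },
]
const kindCounts = countByKind(sampleEvents)
console.log(kindCounts.message) // 2
```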

§Privacy & Data Handling

What We Track

  • Presence status: Online/offline, active/idle
  • Activity timing: When work sessions start/end
  • Event metadata: Commit counts, message frequency (not content)
  • Timezone data: Used for local time calculations

What We Don't Track

  • Message content or file contents
  • Keystrokes or screen recordings
  • Personal data beyond work activity
  • Location data (only timezone for scheduling)
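One way to enforce the boundary between tracked and untracked data is an allowlist applied before anything is stored. The following is a sketch under assumptions: the key names in `ALLOWED_METADATA_KEYS` and the shape of `rawPayload` are hypothetical, not the actual StatsAware schema.

```typescript
// Hypothetical allowlist of metadata keys that describe activity, not content.
const ALLOWED_METADATA_KEYS = ['commits', 'message_count', 'channel_count'] as const

// Copies only allowlisted keys, so message text or file contents never reach storage.
function toSafeMetadata(raw: Record<string, unknown>): Record<string, unknown> {
  const safe: Record<string, unknown> = {}
  for (const key of ALLOWED_METADATA_KEYS) {
    if (key in raw) safe[key] = raw[key]
  }
  return safe
}

// Hypothetical raw payload from an integration, including content fields.
const rawPayload = { commits: 3, text: 'secret message body', files: ['a.ts'] }
const safeMetadata = toSafeMetadata(rawPayload)
console.log(JSON.stringify(safeMetadata)) // {"commits":3}
```

An allowlist fails closed: a new field added by an upstream service is dropped unless it is explicitly permitted.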

Data Retention

  • Active accounts: Full history during subscription
  • Free accounts: 7 days of activity data
  • Cancelled accounts: Data deleted within 30 days
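The retention windows above reduce to simple date arithmetic. This sketch computes the two boundaries; the function names are hypothetical, and how the deletion job itself is scheduled is not covered here.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000

// Oldest timestamp a free account still keeps (7 days of activity data).
function freeTierCutoff(now: Date): Date {
  return new Date(now.getTime() - 7 * DAY_MS)
}

// Latest moment by which a cancelled account's data must be deleted (30 days).
function deletionDeadline(cancelledAt: Date): Date {
  return new Date(cancelledAt.getTime() + 30 * DAY_MS)
}

const referenceTime = new Date('2025-06-07T14:30:00Z')
console.log(freeTierCutoff(referenceTime).toISOString())   // 2025-05-31T14:30:00.000Z
console.log(deletionDeadline(referenceTime).toISOString()) // 2025-07-07T14:30:00.000Z
```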

§Integration Architecture

API Pattern

All integrations follow a consistent pattern:

interface Integration {
  kind: 'presence' | 'events' | 'duration'
  poll_interval: number
  authentication: 'oauth' | 'api_key' | 'webhook'
  endpoints: {
    auth: string
    data: string
    webhook?: string
  }
}
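As an illustration, a presence integration might be described like this. All values are hypothetical: the `example.com` URLs are placeholders, and treating `poll_interval` as seconds is an assumption, since the interface does not state a unit.

```typescript
interface Integration {
  kind: 'presence' | 'events' | 'duration'
  poll_interval: number
  authentication: 'oauth' | 'api_key' | 'webhook'
  endpoints: {
    auth: string
    data: string
    webhook?: string
  }
}

// Hypothetical configuration for a polled, OAuth-based presence source.
const presenceIntegration: Integration = {
  kind: 'presence',
  poll_interval: 120, // assumed to be seconds
  authentication: 'oauth',
  endpoints: {
    auth: 'https://example.com/oauth/authorize',
    data: 'https://example.com/api/presence',
  },
}

// An integration can receive pushed updates only if it declares a webhook endpoint.
function supportsWebhooks(integration: Integration): boolean {
  return integration.endpoints.webhook !== undefined
}

console.log(supportsWebhooks(presenceIntegration)) // false
```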

Polling vs Webhooks

  • Polling: Regular API calls for most services (1-5 minute intervals)
  • Webhooks: Real-time updates where supported (Git pushes, Slack events)
  • Hybrid: Polling and webhooks combined, depending on what each service supports

Rate Limiting & Reliability

  • Exponential backoff on API failures
  • Circuit breaker pattern for unhealthy services
  • Request queuing to respect rate limits
  • Graceful degradation when integrations are offline
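The first two items can be sketched as follows. This is a simplified illustration, not the production implementation: the default delays, failure threshold, and cooldown are assumed values, jitter is omitted, and time is passed in explicitly so the behavior is easy to test.

```typescript
// Delay before retry number `attempt` (0-based): doubles each time, up to a cap.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt)
}

// Minimal circuit breaker: opens after `threshold` consecutive failures
// and allows requests again once `cooldownMs` has elapsed.
class CircuitBreaker {
  private failures = 0
  private openedAt: number | null = null
  private threshold: number
  private cooldownMs: number

  constructor(threshold = 3, cooldownMs = 30000) {
    this.threshold = threshold
    this.cooldownMs = cooldownMs
  }

  canRequest(nowMs: number): boolean {
    return this.openedAt === null || nowMs - this.openedAt >= this.cooldownMs
  }

  recordSuccess(): void {
    this.failures = 0
    this.openedAt = null
  }

  recordFailure(nowMs: number): void {
    this.failures += 1
    if (this.failures >= this.threshold) this.openedAt = nowMs
  }
}

console.log(backoffDelayMs(0), backoffDelayMs(3), backoffDelayMs(10)) // 1000 8000 60000
```

While the breaker is open, the dashboard can keep serving the last known data for that integration, which is one form of graceful degradation.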

Calendar View Logic

The dashboard renders a calendar-style view:

  • Y-axis: Time (24-hour local time)
  • X-axis: Team members
  • Blocks: Activity sessions with color coding
  • Events: Overlaid as dots/markers
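Placing a block or event on the Y-axis requires converting a UTC instant into the member's local time of day. The sketch below uses the standard `Intl.DateTimeFormat` API for that conversion; the function names and the pixel-based layout are assumptions made for illustration.

```typescript
// Minutes elapsed since local midnight for `instant` in the given IANA timezone.
function minutesIntoLocalDay(instant: Date, timeZone: string): number {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hour: '2-digit',
    minute: '2-digit',
    hour12: false,
  }).formatToParts(instant)
  const get = (type: string): number =>
    Number(parts.find((p) => p.type === type)?.value ?? 0)
  // Some runtimes format midnight as hour 24; normalise it to 0.
  return (get('hour') % 24) * 60 + get('minute')
}

// Vertical offset within a column whose full height represents 24 hours.
function yOffsetPx(instant: Date, timeZone: string, columnHeightPx: number): number {
  return (minutesIntoLocalDay(instant, timeZone) / 1440) * columnHeightPx
}

const instant = new Date('2025-06-07T14:30:00Z')
console.log(minutesIntoLocalDay(instant, 'America/New_York')) // 630 (10:30 local)
console.log(minutesIntoLocalDay(instant, 'Asia/Tokyo'))       // 1410 (23:30 local)
```

The same instant lands at different heights in each member's column, which is what makes overlapping working hours visible at a glance.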

§Performance & Scaling

Caching Strategy

  • Redis cache for frequently accessed data
  • CDN caching for static dashboard assets
  • Browser caching for integration metadata
  • Query result caching for expensive aggregations
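Query result caching generally comes down to storing a value with an expiry time. The sketch below uses an in-memory `Map` as a stand-in for Redis, purely to show the time-to-live logic; the `TtlCache` class, the cache key, and the TTL value are hypothetical.

```typescript
// In-memory stand-in for a TTL cache; a real deployment would use Redis instead.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>()
  private ttlMs: number

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string, nowMs: number): V | undefined {
    const entry = this.entries.get(key)
    if (entry === undefined) return undefined
    if (nowMs >= entry.expiresAt) {
      this.entries.delete(key)
      return undefined
    }
    return entry.value
  }

  set(key: string, value: V, nowMs: number): void {
    this.entries.set(key, { value, expiresAt: nowMs + this.ttlMs })
  }
}

const aggregateCache = new TtlCache<number>(60000) // assumed 60-second TTL
aggregateCache.set('account:1:weekly_commits', 42, 0)
console.log(aggregateCache.get('account:1:weekly_commits', 30000)) // 42
console.log(aggregateCache.get('account:1:weekly_commits', 60000)) // undefined
```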

Database Performance

  • Connection pooling to handle concurrent requests
  • Read replicas for dashboard queries
  • Batch processing for data imports
  • Async job queues for heavy operations

Monitoring

  • Health checks on all services
  • Error tracking with detailed logging
  • Performance metrics (API response times, query performance)
  • Integration status monitoring

§Security

Authentication

  • Passwordless auth using 6-digit email codes
  • JWT tokens for API authentication
  • OAuth flows for integration connections
  • Multi-factor auth for admin accounts
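A 6-digit email code can be issued and checked roughly as follows. This is a sketch under assumptions: the 10-minute validity window is an invented value, the names `issueCode` and `verifyCode` are hypothetical, and a production system would also limit attempts and use a constant-time comparison. It relies on `randomInt` from Node.js's built-in `crypto` module.

```typescript
import { randomInt } from 'node:crypto'

const CODE_TTL_MS = 10 * 60 * 1000 // assumed 10-minute validity window

interface PendingCode {
  code: string
  expiresAt: number
}

// Issues a zero-padded 6-digit code from a cryptographically secure source.
function issueCode(nowMs: number): PendingCode {
  const code = String(randomInt(0, 1000000)).padStart(6, '0')
  return { code, expiresAt: nowMs + CODE_TTL_MS }
}

// Accepts the submitted code only if it matches and has not expired.
function verifyCode(pending: PendingCode, submitted: string, nowMs: number): boolean {
  return nowMs < pending.expiresAt && submitted === pending.code
}

const pending = issueCode(0)
console.log(pending.code.length) // 6
```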

Data Protection

  • Encryption at rest for sensitive data
  • TLS encryption for all API communications
  • API key rotation for integrations
  • Access logging for audit trails

Authorization

  • Role-based access (admin, manager, user)
  • Account-level isolation (multi-tenant security)
  • Integration permissions (read-only by default)
  • API rate limiting per account
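Role checks and account isolation can be combined in a single guard. The policy below is an assumption made for illustration (admins and managers may view anyone in their own account, users only themselves); the actual permission matrix is not specified in this document, and the type and function names are hypothetical.

```typescript
type Role = 'admin' | 'manager' | 'user'

interface Principal {
  accountId: string
  personId: string
  role: Role
}

interface ActivityTarget {
  accountId: string
  personId: string
}

// Assumed policy: account isolation is checked first, then the role.
function canViewActivity(viewer: Principal, target: ActivityTarget): boolean {
  if (viewer.accountId !== target.accountId) return false // multi-tenant isolation
  if (viewer.role === 'admin' || viewer.role === 'manager') return true
  return viewer.personId === target.personId
}

const manager: Principal = { accountId: 'acct-1', personId: 'p-1', role: 'manager' }
console.log(canViewActivity(manager, { accountId: 'acct-1', personId: 'p-2' })) // true
console.log(canViewActivity(manager, { accountId: 'acct-2', personId: 'p-9' })) // false
```

Checking the account boundary before the role means that even an admin cannot read across tenants.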

§Deployment

Infrastructure

  • Containerized services (Docker)
  • Auto-scaling based on load
  • Load balancing across multiple instances
  • Database clustering for high availability

CI/CD Pipeline

  • Automated testing (unit, integration, E2E)
  • Staged deployments (dev → staging → production)
  • Database migrations with rollback capability
  • Zero-downtime deployments

§API Reference

Health Check

GET /health
Response: { "status": "ok", "timestamp": "2025-06-07T14:30:00Z" }

Authentication

POST /auth/signup
Body: { "email": "[email protected]", "integrations": ["slack", "github"] }

Activity Data

GET /api/v1/accounts/{account_id}/activity
Query: ?start_date=2025-06-01&end_date=2025-06-07&person_id=123
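A client might assemble this request path as in the sketch below. The `activityPath` helper and the `ActivityQuery` type are hypothetical names, and treating `person_id` as optional is an assumption; only the path and parameter names come from the endpoint shown above.

```typescript
interface ActivityQuery {
  start_date: string // YYYY-MM-DD
  end_date: string   // YYYY-MM-DD
  person_id?: number // assumed optional
}

// Builds the request path and query string for the activity endpoint.
function activityPath(accountId: string, query: ActivityQuery): string {
  const params = new URLSearchParams({
    start_date: query.start_date,
    end_date: query.end_date,
  })
  if (query.person_id !== undefined) {
    params.set('person_id', String(query.person_id))
  }
  return `/api/v1/accounts/${encodeURIComponent(accountId)}/activity?${params.toString()}`
}

const examplePath = activityPath('acme', {
  start_date: '2025-06-01',
  end_date: '2025-06-07',
  person_id: 123,
})
console.log(examplePath)
// /api/v1/accounts/acme/activity?start_date=2025-06-01&end_date=2025-06-07&person_id=123
```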

§Integration Examples

Slack Integration

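import { WebClient } from '@slack/web-api'
const slack = new WebClient(process.env.SLACK_TOKEN) // env variable name is an assumption
// Polls Slack's users.getPresence method every 2 minutes; userId is the tracked Slack user ID
const slackPresence = await slack.users.getPresence({ user: userId })
// Slack reports presence as 'active' or 'away'; 'active' maps to:
// { status: 'online', timestamp: Date.now() }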

Git Integration

// Webhook on git push events
webhook.on('push', (data) => {
  createActivityEvent({
    kind: 'commit',
    timestamp: data.timestamp,
    metadata: { commits: data.commits.length }
  })
})

§Troubleshooting

Common Issues

  • Integration auth failures: Check OAuth token expiration
  • Missing activity data: Verify polling intervals and API limits
  • Timezone display issues: Confirm user timezone settings
  • Performance problems: Check database query performance

Debug Endpoints

  • GET /debug/integrations - Integration health status
  • GET /debug/activity/{person_id} - Recent activity for debugging
  • GET /debug/api-calls - Integration API call logs

Need help? Contact our technical team at [email protected]
