Distributed Task Orchestrator

A fault-tolerant task scheduling system built on event sourcing, handling millions of jobs across heterogeneous worker pools.

Date: Nov 20, 2024
Tags: Systems, Go, Distributed
Code: GitHub

Problem

Coordinating long-running, heterogeneous tasks across a fleet of workers is deceptively hard.

Approach

Built a custom orchestrator in Go, grounded in event sourcing and optimistic concurrency.

  • Event store: Append-only log (PostgreSQL) as the source of truth
  • Scheduler: Priority queue with deadline awareness
  • Worker protocol: gRPC-based with heartbeat and graceful drain

Scale

  • Sustained throughput: ~50K tasks/hour across 200 workers
  • Median scheduling latency: 12ms
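
For a rough sense of what the throughput figure implies, the arithmetic below converts it into fleet-wide and per-worker rates. It assumes load is spread evenly across workers and that the hourly rate holds for a full day; the page states neither, so treat the results as back-of-envelope averages. The function names are hypothetical.

```go
package main

import "fmt"

// perSecond converts an hourly task rate into tasks per second.
func perSecond(tasksPerHour float64) float64 {
	return tasksPerHour / 3600
}

// perWorkerPerHour is the average hourly load per worker,
// assuming tasks are spread evenly across the fleet.
func perWorkerPerHour(tasksPerHour float64, workers int) float64 {
	return tasksPerHour / float64(workers)
}

// perDay extrapolates the hourly rate to 24 hours,
// assuming the rate is sustained around the clock.
func perDay(tasksPerHour float64) float64 {
	return tasksPerHour * 24
}

func main() {
	const tasksPerHour = 50_000.0 // "~50K tasks/hour" from the list above
	const workers = 200

	fmt.Printf("%.1f tasks/s fleet-wide\n", perSecond(tasksPerHour))
	fmt.Printf("%.0f tasks/hour per worker\n", perWorkerPerHour(tasksPerHour, workers))
	fmt.Printf("%.1fM tasks/day\n", perDay(tasksPerHour)/1e6)
}
```

Under those assumptions the fleet handles about 13.9 tasks per second, each worker averages 250 tasks per hour (one every 14.4 seconds), and a full day comes to roughly 1.2 million tasks.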