Distributed Task Orchestrator
A fault-tolerant task scheduling system built on event sourcing, handling millions of jobs across heterogeneous worker pools.
Systems
Problem
Coordinating long-running, heterogeneous tasks across a fleet of workers is deceptively hard.
Approach
Built a custom orchestrator in Go, grounded in event sourcing and optimistic concurrency.
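As a rough illustration of how optimistic concurrency can guard an append-only event stream, here is a minimal in-memory sketch. It is not the project's actual code: `EventStore`, `Append`, `Load`, `ErrVersionConflict`, and the event type strings are hypothetical names, and a mutex-protected map stands in for the database. In a PostgreSQL-backed store, a unique constraint on (stream ID, version) could plausibly play the same role as the version check below.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrVersionConflict signals that another writer appended to the stream first.
// (Hypothetical name, for illustration only.)
var ErrVersionConflict = errors.New("version conflict")

// Event is a hypothetical record of one state change in a task's history.
type Event struct {
	StreamID string // e.g. a task ID
	Version  int    // 1-based position within the stream
	Type     string // e.g. "TaskScheduled"
}

// EventStore is an in-memory stand-in for the append-only log.
type EventStore struct {
	mu      sync.Mutex
	streams map[string][]Event
}

// NewEventStore returns an empty store.
func NewEventStore() *EventStore {
	return &EventStore{streams: make(map[string][]Event)}
}

// Append adds an event only if the stream still has exactly expectedVersion
// events. A writer holding a stale version gets ErrVersionConflict and is
// expected to reload the stream and retry.
func (s *EventStore) Append(streamID string, expectedVersion int, eventType string) (Event, error) {
	s.mu.Lock()
	defer s.mu.Unlock()

	current := len(s.streams[streamID])
	if current != expectedVersion {
		return Event{}, ErrVersionConflict
	}
	ev := Event{StreamID: streamID, Version: current + 1, Type: eventType}
	s.streams[streamID] = append(s.streams[streamID], ev)
	return ev, nil
}

// Load returns a copy of the stream so callers cannot mutate the log.
func (s *EventStore) Load(streamID string) []Event {
	s.mu.Lock()
	defer s.mu.Unlock()
	return append([]Event(nil), s.streams[streamID]...)
}

func main() {
	store := NewEventStore()
	_, _ = store.Append("task-1", 0, "TaskScheduled")

	// Two writers both observed version 1 and race to append.
	_, errA := store.Append("task-1", 1, "TaskStarted")
	_, errB := store.Append("task-1", 1, "TaskCancelled")

	fmt.Println("writer A:", errA)
	fmt.Println("writer B:", errB)
	fmt.Println("events stored:", len(store.Load("task-1")))
}
```

In this sketch the second writer loses the race and receives `ErrVersionConflict`, so the log never holds two events claiming the same version. No lock is held across the read and the write; conflicts are detected at append time instead.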
- Event store: Append-only log (PostgreSQL) as the source of truth
- Scheduler: Priority queue with deadline awareness
- Worker protocol: gRPC-based with heartbeat and graceful drain
Scale
- Sustained throughput: ~50K tasks/hour across 200 workers
- Median scheduling latency: 12 ms