Case Study

Distributed Task Scheduler (High-Throughput Scaling)

A high-throughput, event-driven orchestration system built to capture users who dropped off during multi-stage financial funnels. The system dynamically calculates wait times and schedules highly targeted re-engagement notifications.

The Scale Problem: Breaking the Cron Boundary

The V1 MVP relied on synchronous database polling: a cron job ran every minute, queried a PostgreSQL outbox table to find delayed events, and assigned them to workers. This worked well on Day 1. During peak seasonal traffic, however, the volume of funnel drop-offs slowed the queries until individual cron executions ran longer than their 1-minute interval. Runs began to overlap, lock contention on the outbox table became severe, and customer touchpoints went out late, costing the business high-intent leads.
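As a rough illustration of why overrunning the interval compounds, the sketch below estimates how many cron runs are in flight at once when a new run starts on every tick regardless of whether the previous one has finished. The class and method names are hypothetical and the durations are made-up inputs, not measurements from the production system.

```java
import java.time.Duration;

// Hypothetical helper, for illustration only.
final class CronOverlap {

    /**
     * Peak number of runs in flight when a new run starts every `interval`
     * and each run takes `runTime` to finish: ceil(runTime / interval),
     * with a floor of 1.
     */
    static long peakConcurrentRuns(Duration interval, Duration runTime) {
        long intervalMs = interval.toMillis();
        long runMs = runTime.toMillis();
        if (intervalMs <= 0 || runMs < 0) {
            throw new IllegalArgumentException("interval must be positive and runTime non-negative");
        }
        // Integer ceiling division without floating point.
        return Math.max(1, (runMs + intervalMs - 1) / intervalMs);
    }
}
```

With a 60-second interval, a 45-second run never overlaps its successor, while a 150-second run means three runs are querying the same outbox table at once, each one slowing the others further.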

Event-Driven Backpressure

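To handle this scale, I deprecated the synchronous DB polling model and architected an asynchronous, event-driven pipeline that handles backpressure natively: consumers pull work at their own pace, so a traffic spike accumulates as backlog in Kafka rather than as query load on the database.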

Kafka Fan-Out & Parallel Processing

  • Kafka as a Buffer: Drop-off events were evaluated and periodically pushed to a Kafka topic, completely decoupling event generation from event processing and protecting the database from read-heavy lockups.
  • Consumer Tuning: To clear massive event backlogs during peak spikes, I tuned our Kafka consumer groups, raising the maximum number of records returned per poll (max.poll.records) by 10X, from the default 500 up to 5000.
  • Multi-Threading: Once a batch of 5000 events was pulled, the application processed it in parallel across multiple threads, dispatching events to worker queues concurrently and maximizing CPU utilization.
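The consumer tuning and parallel dispatch described above can be sketched roughly as follows. This is a minimal illustration that uses only the JDK, not the production code: the property keys are standard Kafka consumer configuration names, but the class name, the chunking strategy, the thread count, the choice to disable auto-commit, and the `dispatcher` callback (standing in for "send to a worker queue") are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical sketch; names and structure are illustrative assumptions.
final class BatchDispatchSketch {

    /** Consumer settings as plain properties (keys are standard Kafka consumer configs). */
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("max.poll.records", "5000"); // Kafka's default is 500
        // Assumption: offsets are committed manually once a batch has been dispatched.
        props.setProperty("enable.auto.commit", "false");
        return props;
    }

    /**
     * Splits one polled batch into roughly equal chunks, dispatches the chunks
     * on a fixed thread pool, and blocks until every chunk has finished.
     * Returns the number of events dispatched.
     */
    static <T> int dispatchInParallel(List<T> batch, int threads, Consumer<? super T> dispatcher) {
        if (threads <= 0) {
            throw new IllegalArgumentException("threads must be positive");
        }
        if (batch.isEmpty()) {
            return 0;
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunkSize = (batch.size() + threads - 1) / threads; // ceiling division
            List<Future<Integer>> pending = new ArrayList<>();
            for (int start = 0; start < batch.size(); start += chunkSize) {
                List<T> chunk = batch.subList(start, Math.min(start + chunkSize, batch.size()));
                pending.add(pool.submit(() -> {
                    chunk.forEach(dispatcher);
                    return chunk.size();
                }));
            }
            int dispatched = 0;
            for (Future<Integer> result : pending) {
                dispatched += result.get(); // waits for the chunk; surfaces any failure
            }
            return dispatched;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new IllegalStateException("interrupted while dispatching batch", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("dispatch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

In a real consumer loop, the list passed to `dispatchInParallel` would be the records returned by a single poll, and the `dispatcher` callback must be thread-safe because it is invoked from several threads at once. Waiting for the whole batch before polling again keeps offset handling simple, at the cost of the slowest chunk setting the pace.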

Period

2024 - 2025

Kafka, PostgreSQL, Java, Multithreading

Outcomes

  • The architectural pivot scaled the system's throughput by 3X.
  • By eliminating cron overlap and database locking, we improved our workflow SLA adherence by 35%.
  • Most importantly, delivering these targeted re-engagement messages on schedule drove a 35% overall increase in lead conversions across our digital channels.