How We Built Our Outreach Sequences Engine (and What Hurt)

TL;DR

We built an outreach engine on top of an identity graph linking Twitter IDs, emails, and crypto addresses. Users defined Sequences (follow → DM → branch on reply), and a 6-hour cron advanced every lead through the steps. It worked at scale, but we hit latency, rate-limit pain, and limited observability. This post documents the v1 architecture, the data model, and the cracks we found—groundwork for the next post on the event-driven rebuild.


Context & Goals

  • Audience graph: ~100M scraped Twitter profiles, plus separate tables for emails and crypto addresses; mapping tables linked the identities.
  • Static segments: audiences were point-in-time snapshots, not dynamic filters.
  • Sequences: actions and triggers modeled in SQL; templates supported variables and A/B variants.
  • Execution: a cron ran every 6 hours, locking leads row-by-row and firing each lead's next action, with exponential backoff on failures (see the sketch after this list).
  • Signals: reply/follow-back detection via polling (Twitter inbox/relationships).
  • Limits: per-account daily and hourly caps, tuned per provider (lowest for Twitter, higher for email).
  • Metrics: replies, positive replies, reads (basic counts).
  • Gaps: no DND/consent list, no cross-channel dedup.
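
To make the execution model concrete, here is a minimal sketch of that 6-hour tick, assuming Postgres row locks via node-postgres. The table, column, and function names (`lead_progress`, `performAction`) are illustrative, not our production code.

```ts
import { Client } from "pg";

// Stand-in for whatever provider call (DM, follow, ...) a node maps to.
declare function performAction(leadId: string, nodeId: string): Promise<void>;

export async function tick(db: Client): Promise<void> {
  await db.query("BEGIN");
  try {
    // Row-level locks keep concurrent workers off the same leads.
    const { rows } = await db.query(
      `SELECT lead_id, node_id, attempt
         FROM lead_progress
        WHERE next_run_at <= now()
        FOR UPDATE SKIP LOCKED
        LIMIT 1000`
    );
    for (const { lead_id, node_id, attempt } of rows) {
      try {
        await performAction(lead_id, node_id);
        // Success: reset backoff and wait for the next tick
        // (advancing to the next node is edge evaluation, shown later).
        await db.query(
          `UPDATE lead_progress
              SET attempt = 0,
                  next_run_at = now() + interval '6 hours'
            WHERE lead_id = $1 AND node_id = $2`,
          [lead_id, node_id]
        );
      } catch {
        // Failure: exponential backoff — 6h, 12h, 24h, ...
        await db.query(
          `UPDATE lead_progress
              SET attempt = attempt + 1,
                  next_run_at = now() + make_interval(hours => $3::int)
            WHERE lead_id = $1 AND node_id = $2`,
          [lead_id, node_id, 6 * 2 ** attempt]
        );
      }
    }
    await db.query("COMMIT");
  } catch (err) {
    await db.query("ROLLBACK");
    throw err;
  }
}
```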

HLD — v1 Component Diagram


Data Model (Core Tables)

We modeled sequences as nodes (actions) and edges (triggers), with per-lead progress.
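
In rough TypeScript terms, the core shapes looked like this. Field names are illustrative, not our exact columns:

```ts
// Illustrative shapes for the core tables; names are simplified.
interface SequenceNode {
  id: string;
  sequenceId: string;
  action: string;             // one of the Action enum values below
  templateId: string | null;  // message template (variables + A/B variants)
}

interface SequenceEdge {
  fromNodeId: string;
  toNodeId: string | null;    // null for terminal (END) edges
  trigger: string;            // one of the Trigger enum values below
}

interface LeadProgress {
  leadId: string;
  sequenceId: string;
  currentNodeId: string;
  attempt: number;            // drives exponential backoff
  nextRunAt: Date;            // earliest time the cron may act again
}
```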

Enums (Actions & Triggers)

  • Actions: SEND_CONNECTION_REQUEST, SEND_TWITTER_DM, SEND_LINKEDIN_DM, SEND_XMTP_DM, FOLLOW_ON_TWITTER, UNFOLLOW_ON_TWITTER, CHECK_FOR_REPLY
  • Triggers: REPLIED, NOT_REPLIED, CONNECTION_ACCEPTED, CONNECTION_NOT_ACCEPTED, FOLLOWED_BACK, NOT_FOLLOWED_BACK, END, NORMAL_EDGE, REPLIED_TO_NOTE
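
Transcribed as TypeScript union types, these read:

```ts
// Direct transcription of the Action and Trigger enums above.
type Action =
  | "SEND_CONNECTION_REQUEST"
  | "SEND_TWITTER_DM"
  | "SEND_LINKEDIN_DM"
  | "SEND_XMTP_DM"
  | "FOLLOW_ON_TWITTER"
  | "UNFOLLOW_ON_TWITTER"
  | "CHECK_FOR_REPLY";

type Trigger =
  | "REPLIED"
  | "NOT_REPLIED"
  | "CONNECTION_ACCEPTED"
  | "CONNECTION_NOT_ACCEPTED"
  | "FOLLOWED_BACK"
  | "NOT_FOLLOWED_BACK"
  | "END"
  | "NORMAL_EDGE"
  | "REPLIED_TO_NOTE";
```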

LLD — v1 Execution Flow (Follow → DM → Branch)
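
After an action fires, the next tick polls for signals and walks the lead's outgoing edges, taking the first whose trigger matches. A minimal sketch, with hypothetical helper shapes:

```ts
// Illustrative walk of one lead through Follow → DM → branch.
// The Signals shape and edge layout here are hypothetical.

type BranchTrigger =
  | "FOLLOWED_BACK" | "NOT_FOLLOWED_BACK"
  | "REPLIED" | "NOT_REPLIED"
  | "NORMAL_EDGE" | "END";

interface Edge { trigger: BranchTrigger; toNodeId: string | null }

// Polled per lead (see "Signals" above): v1 read the Twitter inbox
// and relationship endpoints on a schedule, not via webhooks.
interface Signals { followedBack: boolean; replied: boolean }

// Pick the outgoing edge whose trigger matches the polled signals.
function nextNode(edges: Edge[], s: Signals): string | null {
  for (const e of edges) {
    switch (e.trigger) {
      case "NORMAL_EDGE":       return e.toNodeId;
      case "FOLLOWED_BACK":     if (s.followedBack) return e.toNodeId; break;
      case "NOT_FOLLOWED_BACK": if (!s.followedBack) return e.toNodeId; break;
      case "REPLIED":           if (s.replied) return e.toNodeId; break;
      case "NOT_REPLIED":       if (!s.replied) return e.toNodeId; break;
      case "END":               return null; // sequence finished
    }
  }
  // No edge matched yet: the lead waits for the next 6-hour tick —
  // one source of the latency discussed below.
  return null;
}
```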


What Hurt (and Why)

  • High latency: 6-hour ticks meant we reacted to replies hours late and missed engagement windows.
  • Polling overhead: reading inboxes for every account is noisy and rate-limit heavy.
  • Coarse rate limiting: per-day/hour caps help, but without per-send token buckets you get bursts at window edges or unnecessary throttling (sketched after this list).
  • Idempotency risk: retries plus crashes can double-send unless each (lead, action) pair is strictly guarded (also sketched below).
  • Limited observability: no immutable, normalized event log → harder to build funnels and audit trails.
  • Compliance gaps: no DND/consent enforcement; limited auto-unsubscribe handling.
  • No cross-channel dedup: the same human could be approached across multiple channels unintentionally.
  • Versioning friction: editing a sequence in place leaves in-flight leads pointing at nodes and edges that may have changed under them.
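
To ground the rate-limiting and idempotency bullets, here is a hedged sketch of the two guards v1 lacked: a per-account token bucket plus a unique (lead, action) send record. The `sends` table and helper names are hypothetical.

```ts
import { Client } from "pg";

// Per-account token bucket: smooths sends over time instead of
// coarse per-day/hour caps, which allow bursts at window edges.
class TokenBucket {
  private tokens: number;
  private last = Date.now();
  constructor(private capacity: number, private refillPerSec: number) {
    this.tokens = capacity;
  }
  tryTake(): boolean {
    const now = Date.now();
    this.tokens = Math.min(
      this.capacity,
      this.tokens + ((now - this.last) / 1000) * this.refillPerSec
    );
    this.last = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Idempotency: record the (lead, node) pair under a unique constraint
// BEFORE calling the provider, so a crash-and-retry cannot double-send.
async function sendOnce(
  db: Client,
  leadId: string,
  nodeId: string,
  send: () => Promise<void>
): Promise<void> {
  const res = await db.query(
    `INSERT INTO sends (lead_id, node_id)
     VALUES ($1, $2)
     ON CONFLICT (lead_id, node_id) DO NOTHING`,
    [leadId, nodeId]
  );
  if (res.rowCount === 0) return; // already sent (or in flight): skip
  await send();
}
```

Recording the send before calling the provider means a crash between the two steps can drop a message, but it can never double-send; for outreach, at-most-once is the safer default.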

That’s our honest snapshot of v1—the foundation for the next evolution.