feat: add hourly dune large-transfer monitor by ctmotox2 · Pull Request #212 · yearn/monitoring

ctmotox2 · 2026-04-29T11:15:21Z

hourly query to Dune for USDai, iUSD, cUSD

spalen0

Thanks for the contribution! Nice routing pattern — flagged a few issues below that I think need to be addressed before merge. The cache-corruption one is the main blocker since it would affect every other hourly monitor.

🔴 Blocking

1. Cache row-key contains `:`, corrupting the shared cache file

_row_key returns f"{tx_hash}|{block_time}|{contract}". Dune block_time columns are timestamps (e.g. 2026-05-12 14:30:00.000 UTC) — they contain colons. The cache file format in utils/cache.py:55 is parsed with key, value = line.strip().split(":") (no maxsplit). The first time this monitor writes a value, every subsequent read of cache-id.txt — across all hourly monitors sharing this file — will raise ValueError: too many values to unpack.

Suggested fix: drop block_time from the key (tx_hash is usually enough to dedup a transfer; add log_index if you need to disambiguate multiple transfers in one tx), or strip : from the stored value. Don't change utils/cache.py itself — other monitors rely on the current format.

2. Dedup only compares the top row — older rows get re-alerted

if newest_key == last_key:
    return

If the top row changes but rows #2–#10 are unchanged from the previous run, all of them get re-alerted. Walk alert_rows until you hit last_key and only alert on the new prefix.

3. Ordering assumption is implicit

alert_rows[0] is assumed to be the newest, but nothing in the Python code sorts, and the Dune query's ORDER BY isn't pinned anywhere code-side. If someone edits the query and drops the ordering, dedup silently breaks. Either sort defensively by block_time desc, or at minimum add a comment in the module docstring stating the query MUST ORDER BY block_time DESC.

🟡 Design / robustness

Threshold fallback is unit-mixed. _is_large_transfer uses amount_usd when > 0 else falls back to raw amount. A 5M-token transfer of a token where amount_usd is missing or 0 (e.g. price feed failure Dune-side) would pass the $5M threshold even if actual USD value is $50. Safer: skip the row when amount_usd is missing/0 rather than falling back to a unitless amount.
DUNE_LARGE_TRANSFER_THRESHOLD is read but undocumented in .env.example. Add a commented-out line so operators know the knob exists.
stables protocol fallback. Unrouted (chain, addr) pairs alert to protocol \"stables\", which requires TELEGRAM_BOT_TOKEN_STABLES / TELEGRAM_CHAT_ID_STABLES — neither documented nor configured today. If the Dune query only ever returns the three routed tokens, the fallback is dead code; otherwise it's a silent send-failure path. Either document the env vars or drop the fallback and skip unrouted rows.
MAX_ROWS_IN_ALERT = 10 is per-protocol-group, not global. With 3 tokens in scope, you can get 3 separate alerts of up to 10 rows each. Probably fine — just a naming clarification.

🟢 Style / minor

LOGGER (uppercase) is inconsistent with the rest of the codebase (logger). See stables/main.py:13 and utils/logging.py convention.
No unit tests added for _route_for_row / _row_key / _is_large_transfer. These are pure functions and would be cheap to cover in tests/.

✅ Nice

dune-client already pinned in pyproject.toml.
DUNE_API_KEY correctly wired as a secret, query ID as a var.
Per-protocol Telegram routing matches the existing stables/main.py pattern.

spalen0

Re-reviewed after 34eb9f9 — all 9 items from the prior review are resolved (cache-key colon, top-row-only dedup, implicit ordering, unit-mixed threshold, undocumented env var, stables fallback, naming, LOGGER casing, no tests). Fix is minimal and surgical, and the new test file pins the cache-key fix nicely.

A few things still worth a look before merge — none are blockers on their own, but the first one is a silent failure mode:

🟡 Cap-then-cache silently drops rows on bursty hours

At stables/dune_large_transfers.py:197 the cache is written as _row_key(alert_rows[0]) (newest across all protocols), but each protocol group is truncated to MAX_ROWS_PER_PROTOCOL_ALERT = 10 at line 187. If a single protocol gets, say, 15 routed transfers above threshold in one hour, rows #11–15 are dropped from the alert AND the cache marker is advanced past them, so they're never alerted on the next run. Low probability, silent failure. Two safe fixes:

(a) cache the oldest-included row's key per protocol instead of the global newest, or
(b) append a +N more not shown — see Dune query directly tail line when truncating, so an operator knows to look.

(b) is a one-liner and probably sufficient.

🟡 `_row_key` won't actually use `log_index` for the current Dune query

The expected-columns list in the module docstring (lines 4–13) doesn't include log_index, so in practice the key is just tx_hash|contract. Two Transfer events of the same token in the same tx (router splits, multi-leg flows) would collide — only one alerts. The test test_row_key_uses_log_index_when_present exercises a branch the real query never hits. Either add log_index to the Dune SQL (preferred — it's the canonical disambiguator) or extend the docstring to require it.

🟢 Defensive sort uses lexicographic string compare

_sort_rows_newest_first sorts by _as_str(row.get(\"block_time\")). Works for the 2026-05-12 14:30:00.000 UTC format Dune currently emits, but silently breaks if the column ever switches to epoch int or ISO-with-T. Worth a one-line comment documenting the expected format.

🟢 `from` / `to` listed in docstring but never used

The docstring lists from and to as expected columns, but _build_row_line never references them. They're the most informative fields in a large-transfer alert — consider including them in the message (or drop from the docstring).

Approving in spirit pending the cap-then-cache fix; everything else is nits.

feat: add hourly dune large-transfer monitor

eca4e07

ctmotox2 marked this pull request as ready for review May 5, 2026 12:06

spalen0 requested changes May 12, 2026

View reviewed changes

fix dune large transfer dedup

34eb9f9

spalen0 reviewed May 18, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add hourly dune large-transfer monitor#212

feat: add hourly dune large-transfer monitor#212
ctmotox2 wants to merge 2 commits into
yearn:mainfrom
ctmotox2:dune-hourly-transfer-monitor

ctmotox2 commented Apr 29, 2026

Uh oh!

spalen0 left a comment

Uh oh!

spalen0 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ctmotox2 commented Apr 29, 2026

Uh oh!

spalen0 left a comment

Choose a reason for hiding this comment

🔴 Blocking

1. Cache row-key contains :, corrupting the shared cache file

2. Dedup only compares the top row — older rows get re-alerted

3. Ordering assumption is implicit

🟡 Design / robustness

🟢 Style / minor

✅ Nice

Uh oh!

spalen0 left a comment

Choose a reason for hiding this comment

🟡 Cap-then-cache silently drops rows on bursty hours

🟡 _row_key won't actually use log_index for the current Dune query

🟢 Defensive sort uses lexicographic string compare

🟢 from / to listed in docstring but never used

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. Cache row-key contains `:`, corrupting the shared cache file

🟡 `_row_key` won't actually use `log_index` for the current Dune query

🟢 `from` / `to` listed in docstring but never used