The open-source Data Hub for the Bitcoin development ecosystem. This repository provides unified ingestion and forensic analytics for Bitcoin Core (git-logs), BIPs, Mailing Lists, and Delving Bitcoin research.
To ensure data integrity and full support for the analytical pipeline, all scripts MUST be run using the Anaconda environment:
# Core execution path for AI and Humans:
/opt/anaconda3/bin/python3 scripts/rebuild_daily.py

The scripts are organized by functional stage to maintain a clean Sources → Raw → Enriched → Output lifecycle:
- 01_ingest/: Raw extraction from Git mirrors and Discourse APIs.
- 02_process/: Identity resolution, social merging, and technical categorization.
- 03_analyze/: Global PageRank influence and expertise fingerprinting.
- 04_deliver/: Final public artifact generation for the UI dashboards.
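
The numbered folder prefixes make the stage order recoverable from the directory layout alone. The following is a minimal sketch of that idea, not a description of how `rebuild_daily.py` actually works: the helper name `scripts_in_stage_order` is hypothetical, and it assumes every stage folder is named with two digits followed by an underscore.

```python
from pathlib import Path
from typing import List


def scripts_in_stage_order(scripts_root: Path) -> List[Path]:
    """List *.py files under numbered stage folders, ordered by stage, then by name.

    Hypothetical helper: assumes stage folders look like '01_ingest', '02_process', ...
    Folders without a two-digit prefix and non-Python files are ignored.
    """
    stage_dirs = sorted(
        d
        for d in scripts_root.iterdir()
        if d.is_dir() and d.name[:2].isdigit() and d.name[2:3] == "_"
    )
    ordered: List[Path] = []
    for stage in stage_dirs:
        # Sort within a stage so the listing is deterministic.
        ordered.extend(sorted(stage.glob("*.py")))
    return ordered
```

Calling `scripts_in_stage_order(Path("scripts"))` from the repository root would list ingest scripts first and delivery scripts last, which is a quick way to inspect the lifecycle without running anything.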
/opt/anaconda3/bin/python3 scripts/rebuild_daily.py
/opt/anaconda3/bin/python3 scripts/rebuild_monthly.py

For detailed architectural maps, see the /docs folder:
- Architecture: Three-tier system overview, data lifecycle, shared utilities.
- Pipeline Walkthrough: Step-by-step script reference covering inputs, outputs, counts, and the daily vs monthly diff.
- Identity Resolution: 4-level resolution hierarchy, build_identities.py mechanics, curation workflow, audit procedures.
- Metadata Reference: Schema and ownership of every file in metadata/.
- Script Reference: Index of all pipeline scripts with folder, cadence, and one-line purpose.
- Environment: Python & Anaconda configuration.
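
The requirement to run everything under the Anaconda interpreter could also be checked from inside a script. The sketch below is one possible guard and is not taken from the repository: `is_under_prefix` and `require_anaconda` are hypothetical names, and the only value drawn from this README is the `/opt/anaconda3` prefix used in the commands above.

```python
import sys
from pathlib import PurePosixPath

# Prefix taken from the command path shown in this README.
EXPECTED_PREFIX = PurePosixPath("/opt/anaconda3")


def is_under_prefix(executable: str, prefix: PurePosixPath = EXPECTED_PREFIX) -> bool:
    """Return True if the interpreter path lies inside the given prefix.

    The comparison is by whole path components, so '/opt/anaconda3x/...'
    does not match '/opt/anaconda3'.
    """
    try:
        PurePosixPath(executable).relative_to(prefix)
        return True
    except ValueError:
        return False


def require_anaconda(executable: str = sys.executable) -> None:
    """Exit with a readable message when run under a different interpreter."""
    if not is_under_prefix(executable):
        raise SystemExit(
            f"Run with {EXPECTED_PREFIX}/bin/python3, not {executable}"
        )
```

A script would call `require_anaconda()` before doing any work. The check is purely lexical and does not resolve symlinks, so an environment that links to the Anaconda interpreter from another location would be rejected.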