feat: add fastCRW document loader node #6511
Code Review
This pull request introduces the fastCRW document loader, adding support for crawling, scraping, extracting, and searching web content using a Firecrawl-compatible API. It includes the credential configuration, the core loader class, the Flowise node integration, and comprehensive unit tests. The code review identified three high-severity issues: a potential runtime TypeError when modifying this.params without initialization, and premature job failures in both the crawl and extract status polling loops due to unhandled intermediate statuses like 'pending', 'active', and 'cancelled'.
if (!this.url) {
    throw new Error('fastCRW: URL is required for extract mode')
}
this.params!.urls = [this.url]
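The non-null assertion in `this.params!.urls` only silences the compiler; if `params` is undefined at runtime, the assignment still throws a TypeError. A minimal sketch of a guard, assuming a simplified loader shape (the `ExtractLoaderSketch` class and its `prepareExtract` method are hypothetical and are not the PR's actual code):

```typescript
// Hypothetical, simplified stand-in for the loader.
// Only `url`, `params`, and the error message mirror the snippet above.
class ExtractLoaderSketch {
    url?: string
    params?: Record<string, unknown>

    constructor(url?: string, params?: Record<string, unknown>) {
        this.url = url
        this.params = params
    }

    prepareExtract(): Record<string, unknown> {
        if (!this.url) {
            throw new Error('fastCRW: URL is required for extract mode')
        }
        // Fall back to an empty object instead of asserting with `!`,
        // so an undefined `params` cannot cause a TypeError here.
        const params = this.params ?? {}
        params.urls = [this.url]
        this.params = params
        return params
    }
}
```

With this shape, `new ExtractLoaderSketch('https://example.com').prepareExtract()` succeeds even though no params were passed, while a missing URL still raises the original error.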
case 'scraping':
case 'failed':
    if (statusData.status === 'failed') {
        throw new Error('Crawl job failed')
    }
    await new Promise((resolve) => setTimeout(resolve, Math.max(checkInterval, 2) * 1000))
    break
The crawl job can have intermediate statuses like 'pending' or 'active' before transitioning to 'scraping' or 'completed'. Currently, any status other than 'completed', 'scraping', or 'failed' will hit the default block and throw an error, causing the crawl to fail prematurely. We should include 'pending' and 'active' in the polling loop.
case 'pending':
case 'active':
case 'scraping':
case 'failed':
    if (statusData.status === 'failed') {
        throw new Error('Crawl job failed')
    }
    await new Promise((resolve) => setTimeout(resolve, Math.max(checkInterval, 2) * 1000))
    break

case 'processing':
case 'failed':
    if (statusData.status === 'failed') {
        throw new Error('Extract job failed')
    }
    await new Promise((resolve) => setTimeout(resolve, Math.max(checkInterval, 2) * 1000))
    break
The ExtractStatusResponse interface explicitly defines 'pending' and 'cancelled' as valid statuses. However, they are not handled in the switch-case, meaning a 'pending' status will hit the default block and throw an error immediately. We should handle 'pending' by waiting/polling, and 'cancelled' by throwing a specific error.
case 'pending':
case 'processing':
case 'failed':
case 'cancelled':
    if (statusData.status === 'failed') {
        throw new Error('Extract job failed')
    }
    if (statusData.status === 'cancelled') {
        throw new Error('Extract job was cancelled')
    }
    await new Promise((resolve) => setTimeout(resolve, Math.max(checkInterval, 2) * 1000))
    break

- guard this.params init in extract mode to avoid TypeError when params is undefined
- treat non-terminal crawl statuses (scraping/pending/active) as keep-polling; only fail on failed/error/cancelled
- handle pending/processing extract statuses as keep-polling; fail explicitly on failed/cancelled
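The keep-polling rule for crawl jobs can be sketched as a small status classifier plus a loop. This is an illustration under stated assumptions, not the code in the PR: `PollAction`, `classifyCrawlStatus`, and `pollUntilDone` are hypothetical names, the status strings are the ones mentioned in the review comments and the commit message, and `fetchStatus` stands in for whatever call returns the job status.

```typescript
type PollAction = 'done' | 'wait' | 'fail'

// Status names come from the review discussion; the real API may report others.
function classifyCrawlStatus(status: string): PollAction {
    switch (status) {
        case 'completed':
            return 'done'
        case 'pending':
        case 'active':
        case 'scraping':
            // Non-terminal: the job is still running, so keep polling.
            return 'wait'
        default:
            // 'failed', 'error', 'cancelled', and anything unrecognized stop the loop.
            return 'fail'
    }
}

// `fetchStatus` and `sleep` are injected so the loop can be exercised without a network.
async function pollUntilDone(
    fetchStatus: () => Promise<{ status: string }>,
    checkInterval: number,
    sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<{ status: string }> {
    while (true) {
        const statusData = await fetchStatus()
        const action = classifyCrawlStatus(statusData.status)
        if (action === 'done') return statusData
        if (action === 'fail') {
            throw new Error(`Crawl job ended with status: ${statusData.status}`)
        }
        // Same minimum two-second interval as the snippets in the review.
        await sleep(Math.max(checkInterval, 2) * 1000)
    }
}
```

Separating the classification from the loop keeps the list of terminal and non-terminal statuses in one place. Treating unknown statuses as failures matches the behavior of the default block described in the review; whether that is the right choice depends on how the API evolves.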
What
Adds a fastCRW document loader node + credential, mirroring the existing FireCrawl node.
Why
fastCRW is a Firecrawl-API-compatible web engine in a single ~8MB binary — self-host free or managed cloud. Flat pricing (1 credit = 1 page; no 4x stealth surcharge, no billed-on-failure), free anti-bot stealth — a drop-in alternative to the FireCrawl loader.
Changes (additive only)
- packages/components/nodes/documentloaders/Crw/Crw.ts (+ icon) mirroring FireCrawl/FireCrawl.ts (scrape/crawl modes, same inputs).
- packages/components/credentials/CrwApi.credential.ts mirroring the FireCrawl credential.
- Crw.test.ts.

CRW_API_KEY from https://fastcrw.com/dashboard (free tier). Happy to adjust; I maintain it and can provide free credits.