fixing output writing#2462
Conversation
Neo - PR Security ReviewNo security issues found Comment |
WalkthroughAdded Options.HasMatcherOrFilter() to detect active matcher/filter settings. Adjusted runner response persistence to skip immediate on-disk writes when matchers/filters are active (deferring to the output loop). Added tests covering the new detection method and StoreResponse interaction. ChangesMatcher detection + response persistence
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
LGTM, just:
|
|
I tested this PR locally and it works as expected on my side. I also checked edge cases and did not notice regressions. If helpful, I can add a small test case covering this behavior to help with review/merge confidence. |
# Conflicts: # runner/runner_test.go
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
runner/runner_test.go (1)
518-539: ⚡ Quick winThese tests don't actually verify storage behavior despite their names.
TestStoreResponse_withoutMatchersStoresAllandTestStoreResponse_withMatcherSetsFlagonly assert the value ofHasMatcherOrFilter()— they create anOptionsvalue, callValidateOptions(), and check the flag. The runner is never instantiated, no responses are processed, and the contents ofStoreResponseDirare never inspected. They cover the same surface asTestOptions_hasMatcherOrFilterwould with two extra rows.Given the regression being fixed (responses persisted regardless of
-ms), a test that exercises the analyze/output pipeline and asserts on which files end up underStoreResponseDir/response/for matched vs. unmatched responses would more credibly cover the bug. At minimum, rename these to reflect that they only assert flag detection (e.g.,..._FlagFalseWithoutMatchers/..._FlagTrueWithMatcher) so the names don't suggest behavior they don't verify.Minimal rename diff
-func TestStoreResponse_withoutMatchersStoresAll(t *testing.T) { +func TestStoreResponse_hasMatcherOrFilterFalseWithoutMatchers(t *testing.T) { @@ -func TestStoreResponse_withMatcherSetsFlag(t *testing.T) { +func TestStoreResponse_hasMatcherOrFilterTrueWithMatcher(t *testing.T) {Want me to draft an integration-style test that runs
analyzeagainst a localhttptest.Serverand asserts on the files actually written underStoreResponseDirfor matched vs. unmatched responses?🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@runner/runner_test.go` around lines 518 - 539, The two tests TestStoreResponse_withoutMatchersStoresAll and TestStoreResponse_withMatcherSetsFlag are misnamed because they only exercise Options.ValidateOptions() and HasMatcherOrFilter() rather than verifying files written to StoreResponseDir; rename them to e.g. TestOptions_FlagFalseWithoutMatchers and TestOptions_FlagTrueWithMatcher (update the test function names and any comments) to reflect they only assert the matcher-flag behavior (references: Options, ValidateOptions, HasMatcherOrFilter, StoreResponseDir); alternatively replace them with an integration-style test that instantiates a runner, sends requests to an httptest.Server, runs analyze/output, and asserts which files appear under StoreResponseDir/response/ for matched vs unmatched responses if you want full coverage.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@runner/runner.go`:
- Around line 2461-2512: Remove the analyze-side file write block that runs when
scanopts.StoreResponse or scanopts.StoreChain is true (the code that creates
response files using os.OpenFile with O_EXCL, builds data from
requestDump/resp.Raw/resp.GetChain(), and sets fileNameHash), and instead
implement the O_EXCL + incremental _<idx> filename creation, the
truncation/omit-body logic that uses resp.Data (resp.Raw =
strings.ReplaceAll(resp.Raw, string(resp.Data), "") when r.options.OmitBody),
and the StoreChain append behavior inside the existing output loop that
currently calls os.WriteFile(responsePath, data, 0644); after successfully
creating/writing the file there update Result.FileNameHash (fileNameHash) to
include the _<idx> suffix so downstream consumers see the exact filename — this
consolidates persistence to a single path, prevents double-writes, preserves
uniqueness when r.options.HasMatcherOrFilter() is true, and ensures omit-body
works consistently.
---
Nitpick comments:
In `@runner/runner_test.go`:
- Around line 518-539: The two tests TestStoreResponse_withoutMatchersStoresAll
and TestStoreResponse_withMatcherSetsFlag are misnamed because they only
exercise Options.ValidateOptions() and HasMatcherOrFilter() rather than
verifying files written to StoreResponseDir; rename them to e.g.
TestOptions_FlagFalseWithoutMatchers and TestOptions_FlagTrueWithMatcher (update
the test function names and any comments) to reflect they only assert the
matcher-flag behavior (references: Options, ValidateOptions, HasMatcherOrFilter,
StoreResponseDir); alternatively replace them with an integration-style test
that instantiates a runner, sends requests to an httptest.Server, runs
analyze/output, and asserts which files appear under StoreResponseDir/response/
for matched vs unmatched responses if you want full coverage.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f70c0739-a36b-4fc6-b39a-4a0b207fb8f9
📒 Files selected for processing (2)
runner/runner.gorunner/runner_test.go
| // store response — when matchers/filters are active, defer writing to the | ||
| // output loop so only matched responses are persisted to disk. | ||
| if scanopts.StoreResponse || scanopts.StoreChain { | ||
| if r.options.OmitBody { | ||
| resp.Raw = strings.ReplaceAll(resp.Raw, string(resp.Data), "") | ||
| } | ||
| responsePath = fileutilz.AbsPathOrDefault(filepath.Join(responseBaseDir, domainResponseFile)) | ||
| // URL.EscapedString returns that can be used as filename | ||
| respRaw := resp.Raw | ||
| reqRaw := requestDump | ||
| if len(respRaw) > scanopts.MaxResponseBodySizeToSave { | ||
| respRaw = respRaw[:scanopts.MaxResponseBodySizeToSave] | ||
| } | ||
| data := reqRaw | ||
| if scanopts.StoreChain && resp.HasChain() { | ||
| data = append(data, append([]byte("\n"), []byte(resp.GetChain())...)...) | ||
| } | ||
| data = append(data, respRaw...) | ||
| data = append(data, []byte("\n\n\n")...) | ||
| data = append(data, []byte(fullURL)...) | ||
| _ = fileutil.CreateFolder(responseBaseDir) | ||
|
|
||
| basePath := strings.TrimSuffix(responsePath, ".txt") | ||
| var idx int | ||
| for idx = 0; ; idx++ { | ||
| targetPath := responsePath | ||
| if idx > 0 { | ||
| targetPath = fmt.Sprintf("%s_%d.txt", basePath, idx) | ||
| } | ||
| f, err := os.OpenFile(targetPath, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0644) | ||
| if err == nil { | ||
| _, writeErr := f.Write(data) | ||
| _ = f.Close() | ||
| if writeErr != nil { | ||
| gologger.Error().Msgf("Could not write to '%s': %s", targetPath, writeErr) | ||
| fileNameHash = hash | ||
|
|
||
| if !r.options.HasMatcherOrFilter() { | ||
| if r.options.OmitBody { | ||
| resp.Raw = strings.ReplaceAll(resp.Raw, string(resp.Data), "") | ||
| } | ||
| responsePath = fileutilz.AbsPathOrDefault(filepath.Join(responseBaseDir, domainResponseFile)) | ||
| // URL.EscapedString returns that can be used as filename | ||
| respRaw := resp.Raw | ||
| reqRaw := requestDump | ||
| if len(respRaw) > scanopts.MaxResponseBodySizeToSave { | ||
| respRaw = respRaw[:scanopts.MaxResponseBodySizeToSave] | ||
| } | ||
| data := reqRaw | ||
| if scanopts.StoreChain && resp.HasChain() { | ||
| data = append(data, append([]byte("\n"), []byte(resp.GetChain())...)...) | ||
| } | ||
| data = append(data, respRaw...) | ||
| data = append(data, []byte("\n\n\n")...) | ||
| data = append(data, []byte(fullURL)...) | ||
| _ = fileutil.CreateFolder(responseBaseDir) | ||
|
|
||
| basePath := strings.TrimSuffix(responsePath, ".txt") | ||
| var idx int | ||
| for idx = 0; ; idx++ { | ||
| targetPath := responsePath | ||
| if idx > 0 { | ||
| targetPath = fmt.Sprintf("%s_%d.txt", basePath, idx) | ||
| } | ||
| f, err := os.OpenFile(targetPath, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0644) | ||
| if err == nil { | ||
| _, writeErr := f.Write(data) | ||
| _ = f.Close() | ||
| if writeErr != nil { | ||
| gologger.Error().Msgf("Could not write to '%s': %s", targetPath, writeErr) | ||
| } | ||
| break | ||
| } | ||
| if !os.IsExist(err) { | ||
| gologger.Error().Msgf("Failed to create file '%s': %s", targetPath, err) | ||
| break | ||
| } | ||
| break | ||
| } | ||
| if !os.IsExist(err) { | ||
| gologger.Error().Msgf("Failed to create file '%s': %s", targetPath, err) | ||
| break | ||
| } | ||
| } | ||
|
|
||
| if idx == 0 { | ||
| fileNameHash = hash | ||
| } else { | ||
| fileNameHash = fmt.Sprintf("%s_%d", hash, idx) | ||
| if idx > 0 { | ||
| fileNameHash = fmt.Sprintf("%s_%d", hash, idx) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Persistence is now split between analyze and the output loop, causing double writes without matchers and a uniqueness regression with matchers.
The deferred persistence path (the output loop at lines 1303-1340) still writes the response file unconditionally for every matched result via plain os.WriteFile(responsePath, data, 0644) — it does not use O_EXCL and does not apply the _<idx> suffix logic added here. Concrete consequences of the split:
- Double write when no matcher/filter is active. Both this block and the output loop run, so the file freshly created here with
O_EXCLis then re-written (overwritten) by the output loop'sos.WriteFile, doubling I/O for every response. - Lost uniqueness when a matcher/filter is active. This branch is skipped, so
fileNameHashstays at the barehash(no_<idx>suffix). The hash issha1(method + ":" + URL.EscapedString()), which is identical across distinct results that share method+URL — e.g.-probe-all-ips(differentCustomIP, same URL string), thegoto retrypath after protocol fallback, or repeated probing. With matchers active those matched results all collide on the same<hash>.txtand silently overwrite each other. OmitBodyis a no-op in the deferred path. The output loop strips viastrings.ReplaceAll(resp.Raw, resp.ResponseBody, ""), butResult.ResponseBodyis only populated when-response/-base64-response/-mdc/-fdcare set (see lines 2180-2190). The branch here usesstring(resp.Data)and works correctly; the deferred branch leaves the body inresp.Rawon disk whenever-omit-bodyis combined with a matcher/filter without one of those flags.
Recommendation: consolidate persistence into a single path. The cleanest fix is to drop this analyze-side write entirely and move the O_EXCL + _<idx> uniqueness logic and the string(resp.Data)-based OmitBody handling into the output loop (which already runs after matcher/filter evaluation) so uniqueness, body omission, and "only matched" all work together. Updating Result.FileNameHash in the output loop after a successful create would keep downstream consumers (index.txt, callers) consistent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@runner/runner.go` around lines 2461 - 2512, Remove the analyze-side file
write block that runs when scanopts.StoreResponse or scanopts.StoreChain is true
(the code that creates response files using os.OpenFile with O_EXCL, builds data
from requestDump/resp.Raw/resp.GetChain(), and sets fileNameHash), and instead
implement the O_EXCL + incremental _<idx> filename creation, the
truncation/omit-body logic that uses resp.Data (resp.Raw =
strings.ReplaceAll(resp.Raw, string(resp.Data), "") when r.options.OmitBody),
and the StoreChain append behavior inside the existing output loop that
currently calls os.WriteFile(responsePath, data, 0644); after successfully
creating/writing the file there update Result.FileNameHash (fileNameHash) to
include the _<idx> suffix so downstream consumers see the exact filename — this
consolidates persistence to a single path, prevents double-writes, preserves
uniqueness when r.options.HasMatcherOrFilter() is true, and ensures omit-body
works consistently.
Proposed changes
Close #2418
Checklist
Summary by CodeRabbit
Improvements
Tests