fixing output writing #2462

Merged
dogancanbakir merged 2 commits into dev from 2418-store-response-only-matched on May 6, 2026

Conversation

Mzack9999 (Member) commented Mar 22, 2026

Proposed changes

Close #2418

Checklist

  • Pull request is created against the dev branch
  • All checks passed (lint, unit/integration/regression tests etc.) with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Summary by CodeRabbit

  • Improvements

    • Added matcher/filter detection and adjusted response persistence: when matchers or filters are active, response storage is deferred to the output loop rather than written immediately, preserving correct behavior and avoiding unnecessary on-disk writes.
  • Tests

    • Added tests covering matcher/filter detection and interactions with response storage, including scenarios with and without active filters.

@auto-assign auto-assign Bot requested a review from dogancanbakir March 22, 2026 12:35

neo-by-projectdiscovery-dev Bot commented Mar 22, 2026

Neo - PR Security Review

No security issues found


coderabbitai Bot commented Mar 22, 2026

Walkthrough

Added Options.HasMatcherOrFilter() to detect active matcher/filter settings. Adjusted runner response persistence to skip immediate on-disk writes when matchers/filters are active (deferring to the output loop). Added tests covering the new detection method and StoreResponse interaction.

Changes

Matcher detection + response persistence

| Layer / File(s) | Summary |
|---|---|
| Option predicate: runner/options.go | Added func (options *Options) HasMatcherOrFilter() bool to report when any matcher or filter option (string fields, parsed slices, numeric/boolean flags) is set. |
| Core persistence change: runner/runner.go | When `scanopts.StoreResponse |
| Tests: runner/runner_test.go | Added TestOptions_hasMatcherOrFilter, TestStoreResponse_withoutMatchersStoresAll, and TestStoreResponse_withMatcherSetsFlag to validate matcher/filter detection and interaction with StoreResponse flags. |
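
For readers who have not opened runner/options.go, a predicate of this kind might look roughly like the sketch below. The struct and its field names are hypothetical stand-ins chosen for illustration, not the actual httpx option names; only the method name HasMatcherOrFilter comes from this PR.

```go
package main

import "fmt"

// sketchOptions is a hypothetical, trimmed-down stand-in for runner.Options.
// The field names are illustrative and are NOT taken from the httpx source.
type sketchOptions struct {
	MatchStatusCode  string   // raw string flag, e.g. "200,302"
	FilterStatusCode string   // raw string flag
	MatchStrings     []string // parsed slice
	FilterStrings    []string // parsed slice
	MatchLinesCount  int      // numeric flag; 0 is treated as unset in this sketch
	FilterDuplicates bool     // boolean flag
}

// HasMatcherOrFilter reports whether any matcher or filter option is set.
func (o *sketchOptions) HasMatcherOrFilter() bool {
	return o.MatchStatusCode != "" ||
		o.FilterStatusCode != "" ||
		len(o.MatchStrings) > 0 ||
		len(o.FilterStrings) > 0 ||
		o.MatchLinesCount > 0 ||
		o.FilterDuplicates
}

func main() {
	fmt.Println((&sketchOptions{}).HasMatcherOrFilter())                       // false
	fmt.Println((&sketchOptions{MatchStatusCode: "200"}).HasMatcherOrFilter()) // true
}
```

One design consequence of a single predicate like this is that every newly added matcher or filter option has to be added to it as well, otherwise responses would again be written before matching.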

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibble through options, clever and spry,
I sniff which matchers wake and which lie.
I skip the clutter, save only the prize,
Soft-footed responses under moonlit skies. ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'fixing output writing' is vague and generic, lacking specificity about what output writing issue is being fixed or the main change. | Consider a more descriptive title such as 'Store responses only for matched requests' or 'Fix response storage to respect matchers and filters' to better convey the core objective. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | The PR successfully implements the coding requirement from issue #2418: httpx now stores only responses matching the specified matcher/filter instead of saving all responses. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to the linked issue objective: implementing matcher/filter detection, updating response persistence logic, and adding comprehensive test coverage for the new functionality. |

dogancanbakir (Member) commented

LGTM, just:

  • merge conflict

denisl commented Apr 16, 2026

I tested this PR locally and it works as expected on my side. I also checked edge cases and did not notice regressions. If helpful, I can add a small test case covering this behavior to help with review/merge confidence.

# Conflicts:
#	runner/runner_test.go

coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
runner/runner_test.go (1)

518-539: ⚡ Quick win

These tests don't actually verify storage behavior despite their names.

TestStoreResponse_withoutMatchersStoresAll and TestStoreResponse_withMatcherSetsFlag only assert the value of HasMatcherOrFilter() — they create an Options value, call ValidateOptions(), and check the flag. The runner is never instantiated, no responses are processed, and the contents of StoreResponseDir are never inspected. They cover the same surface as TestOptions_hasMatcherOrFilter would with two extra rows.

Given the regression being fixed (responses persisted regardless of -ms), a test that exercises the analyze/output pipeline and asserts on which files end up under StoreResponseDir/response/ for matched vs. unmatched responses would more credibly cover the bug. At minimum, rename these to reflect that they only assert flag detection (e.g., ..._FlagFalseWithoutMatchers/..._FlagTrueWithMatcher) so the names don't suggest behavior they don't verify.

Minimal rename diff
-func TestStoreResponse_withoutMatchersStoresAll(t *testing.T) {
+func TestStoreResponse_hasMatcherOrFilterFalseWithoutMatchers(t *testing.T) {
@@
-func TestStoreResponse_withMatcherSetsFlag(t *testing.T) {
+func TestStoreResponse_hasMatcherOrFilterTrueWithMatcher(t *testing.T) {

Want me to draft an integration-style test that runs analyze against a local httptest.Server and asserts on the files actually written under StoreResponseDir for matched vs. unmatched responses?

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@runner/runner_test.go` around lines 518 - 539, The two tests
TestStoreResponse_withoutMatchersStoresAll and
TestStoreResponse_withMatcherSetsFlag are misnamed because they only exercise
Options.ValidateOptions() and HasMatcherOrFilter() rather than verifying files
written to StoreResponseDir; rename them to e.g.
TestOptions_FlagFalseWithoutMatchers and TestOptions_FlagTrueWithMatcher (update
the test function names and any comments) to reflect they only assert the
matcher-flag behavior (references: Options, ValidateOptions, HasMatcherOrFilter,
StoreResponseDir); alternatively replace them with an integration-style test
that instantiates a runner, sends requests to an httptest.Server, runs
analyze/output, and asserts which files appear under StoreResponseDir/response/
for matched vs unmatched responses if you want full coverage.

📥 Commits

Reviewing files that changed from the base of the PR and between 3adf9ce and 98e6af0.

📒 Files selected for processing (2)
  • runner/runner.go
  • runner/runner_test.go

Comment thread runner/runner.go
Comment on lines +2461 to 2512
+	// store response — when matchers/filters are active, defer writing to the
+	// output loop so only matched responses are persisted to disk.
 	if scanopts.StoreResponse || scanopts.StoreChain {
-		if r.options.OmitBody {
-			resp.Raw = strings.ReplaceAll(resp.Raw, string(resp.Data), "")
-		}
-		responsePath = fileutilz.AbsPathOrDefault(filepath.Join(responseBaseDir, domainResponseFile))
-		// URL.EscapedString returns that can be used as filename
-		respRaw := resp.Raw
-		reqRaw := requestDump
-		if len(respRaw) > scanopts.MaxResponseBodySizeToSave {
-			respRaw = respRaw[:scanopts.MaxResponseBodySizeToSave]
-		}
-		data := reqRaw
-		if scanopts.StoreChain && resp.HasChain() {
-			data = append(data, append([]byte("\n"), []byte(resp.GetChain())...)...)
-		}
-		data = append(data, respRaw...)
-		data = append(data, []byte("\n\n\n")...)
-		data = append(data, []byte(fullURL)...)
-		_ = fileutil.CreateFolder(responseBaseDir)
-
-		basePath := strings.TrimSuffix(responsePath, ".txt")
-		var idx int
-		for idx = 0; ; idx++ {
-			targetPath := responsePath
-			if idx > 0 {
-				targetPath = fmt.Sprintf("%s_%d.txt", basePath, idx)
-			}
-			f, err := os.OpenFile(targetPath, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0644)
-			if err == nil {
-				_, writeErr := f.Write(data)
-				_ = f.Close()
-				if writeErr != nil {
-					gologger.Error().Msgf("Could not write to '%s': %s", targetPath, writeErr)
-				}
-				break
-			}
-			if !os.IsExist(err) {
-				gologger.Error().Msgf("Failed to create file '%s': %s", targetPath, err)
-				break
-			}
-		}
-
-		if idx == 0 {
-			fileNameHash = hash
-		} else {
-			fileNameHash = fmt.Sprintf("%s_%d", hash, idx)
-		}
+		fileNameHash = hash
+
+		if !r.options.HasMatcherOrFilter() {
+			if r.options.OmitBody {
+				resp.Raw = strings.ReplaceAll(resp.Raw, string(resp.Data), "")
+			}
+			responsePath = fileutilz.AbsPathOrDefault(filepath.Join(responseBaseDir, domainResponseFile))
+			// URL.EscapedString returns that can be used as filename
+			respRaw := resp.Raw
+			reqRaw := requestDump
+			if len(respRaw) > scanopts.MaxResponseBodySizeToSave {
+				respRaw = respRaw[:scanopts.MaxResponseBodySizeToSave]
+			}
+			data := reqRaw
+			if scanopts.StoreChain && resp.HasChain() {
+				data = append(data, append([]byte("\n"), []byte(resp.GetChain())...)...)
+			}
+			data = append(data, respRaw...)
+			data = append(data, []byte("\n\n\n")...)
+			data = append(data, []byte(fullURL)...)
+			_ = fileutil.CreateFolder(responseBaseDir)
+
+			basePath := strings.TrimSuffix(responsePath, ".txt")
+			var idx int
+			for idx = 0; ; idx++ {
+				targetPath := responsePath
+				if idx > 0 {
+					targetPath = fmt.Sprintf("%s_%d.txt", basePath, idx)
+				}
+				f, err := os.OpenFile(targetPath, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0644)
+				if err == nil {
+					_, writeErr := f.Write(data)
+					_ = f.Close()
+					if writeErr != nil {
+						gologger.Error().Msgf("Could not write to '%s': %s", targetPath, writeErr)
+					}
+					break
+				}
+				if !os.IsExist(err) {
+					gologger.Error().Msgf("Failed to create file '%s': %s", targetPath, err)
+					break
+				}
+			}
+
+			if idx > 0 {
+				fileNameHash = fmt.Sprintf("%s_%d", hash, idx)
+			}
+		}
 	}

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Persistence is now split between analyze and the output loop, causing double writes without matchers and a uniqueness regression with matchers.

The deferred persistence path (the output loop at lines 1303-1340) still writes the response file unconditionally for every matched result via plain os.WriteFile(responsePath, data, 0644) — it does not use O_EXCL and does not apply the _<idx> suffix logic added here. Concrete consequences of the split:

  • Double write when no matcher/filter is active. Both this block and the output loop run, so the file freshly created here with O_EXCL is then re-written (overwritten) by the output loop's os.WriteFile, doubling I/O for every response.
  • Lost uniqueness when a matcher/filter is active. This branch is skipped, so fileNameHash stays at the bare hash (no _<idx> suffix). The hash is sha1(method + ":" + URL.EscapedString()), which is identical across distinct results that share method+URL — e.g. -probe-all-ips (different CustomIP, same URL string), the goto retry path after protocol fallback, or repeated probing. With matchers active those matched results all collide on the same <hash>.txt and silently overwrite each other.
  • OmitBody is a no-op in the deferred path. The output loop strips via strings.ReplaceAll(resp.Raw, resp.ResponseBody, ""), but Result.ResponseBody is only populated when -response/-base64-response/-mdc/-fdc are set (see lines 2180-2190). The branch here uses string(resp.Data) and works correctly; the deferred branch leaves the body in resp.Raw on disk whenever -omit-body is combined with a matcher/filter without one of those flags.
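
Two of the bullets above can be checked in isolation. The sketch below assumes the hash is hex-encoded and uses a hypothetical probe struct in place of the real Result type; only the sha1(method + ":" + URL) formula and the strings.ReplaceAll call are taken from the review text.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"strings"
)

// probe is a hypothetical, simplified result record; the real Result type
// carries many more fields.
type probe struct {
	Method   string
	URL      string
	CustomIP string // differs per resolved IP but is not part of the hash input
}

// fileNameHash follows the formula quoted in the review:
// sha1(method + ":" + escaped URL). Hex encoding is an assumption here.
func fileNameHash(p probe) string {
	sum := sha1.Sum([]byte(p.Method + ":" + p.URL))
	return hex.EncodeToString(sum[:])
}

// omitBody mirrors the strings.ReplaceAll call used by the deferred path.
func omitBody(raw, responseBody string) string {
	return strings.ReplaceAll(raw, responseBody, "")
}

func main() {
	a := probe{Method: "GET", URL: "https://example.com/", CustomIP: "192.0.2.10"}
	b := probe{Method: "GET", URL: "https://example.com/", CustomIP: "192.0.2.11"}
	// Same method and URL, different IP: both results map to the same <hash>.txt.
	fmt.Println(fileNameHash(a) == fileNameHash(b)) // true

	raw := "HTTP/1.1 200 OK\r\n\r\nsecret body"
	// An unpopulated ResponseBody is the empty string, so nothing is stripped.
	fmt.Println(omitBody(raw, "") == raw) // true
	// With the body populated, the same call removes it as intended.
	fmt.Println(omitBody(raw, "secret body") == "HTTP/1.1 200 OK\r\n\r\n") // true
}
```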

Recommendation: consolidate persistence into a single path. The cleanest fix is to drop this analyze-side write entirely and move the O_EXCL + _<idx> uniqueness logic and the string(resp.Data)-based OmitBody handling into the output loop (which already runs after matcher/filter evaluation) so uniqueness, body omission, and "only matched" all work together. Updating Result.FileNameHash in the output loop after a successful create would keep downstream consumers (index.txt, callers) consistent.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@runner/runner.go` around lines 2461 - 2512, Remove the analyze-side file
write block that runs when scanopts.StoreResponse or scanopts.StoreChain is true
(the code that creates response files using os.OpenFile with O_EXCL, builds data
from requestDump/resp.Raw/resp.GetChain(), and sets fileNameHash), and instead
implement the O_EXCL + incremental _<idx> filename creation, the
truncation/omit-body logic that uses resp.Data (resp.Raw =
strings.ReplaceAll(resp.Raw, string(resp.Data), "") when r.options.OmitBody),
and the StoreChain append behavior inside the existing output loop that
currently calls os.WriteFile(responsePath, data, 0644); after successfully
creating/writing the file there update Result.FileNameHash (fileNameHash) to
include the _<idx> suffix so downstream consumers see the exact filename — this
consolidates persistence to a single path, prevents double-writes, preserves
uniqueness when r.options.HasMatcherOrFilter() is true, and ensures omit-body
works consistently.

@dogancanbakir dogancanbakir merged commit 31af5c0 into dev May 6, 2026
14 checks passed
@dogancanbakir dogancanbakir deleted the 2418-store-response-only-matched branch May 6, 2026 13:32

Development

Successfully merging this pull request may close these issues.

how to sr (only match)

3 participants