fixing output writing by Mzack9999 · Pull Request #2462 · projectdiscovery/httpx

Mzack9999 · 2026-03-22T12:35:49Z

Proposed changes

Close #2418

Checklist

Pull request is created against the dev branch
All checks passed (lint, unit/integration/regression tests etc.) with my changes
I have added tests that prove my fix is effective or that my feature works
I have added necessary documentation (if appropriate)

Summary by CodeRabbit

Improvements
- Added matcher/filter detection and adjusted response persistence: when matchers or filters are active, response storage is deferred to the output loop rather than written immediately, preserving correct behavior and avoiding unnecessary on-disk writes.
Tests
- Added tests covering matcher/filter detection and interactions with response storage, including scenarios with and without active filters.

neo-by-projectdiscovery-dev · 2026-03-22T12:36:06Z

Neo - PR Security Review

No security issues found


coderabbitai · 2026-03-22T12:36:07Z

Walkthrough

Added Options.HasMatcherOrFilter() to detect active matcher/filter settings. Adjusted runner response persistence to skip immediate on-disk writes when matchers/filters are active (deferring to the output loop). Added tests covering the new detection method and StoreResponse interaction.
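
To make the walkthrough concrete, here is a minimal sketch of what such a predicate and the resulting write decision might look like. The `Options` struct below is a trimmed-down, hypothetical stand-in: its field names and the `shouldWriteImmediately` helper are assumptions for illustration and are not taken from the httpx source.

```go
package main

import "fmt"

// Options is a hypothetical, reduced stand-in for the runner options.
// The field names are illustrative, not httpx's actual option fields.
type Options struct {
	MatchStatusCode  string   // raw flag value, e.g. "200,301"
	FilterString     []string // parsed slice of strings to filter out
	MatchLinesCount  []int    // parsed slice of numeric matchers
	FilterDuplicates bool     // boolean filter flag
}

// HasMatcherOrFilter reports whether any matcher or filter option is set.
func (o *Options) HasMatcherOrFilter() bool {
	return o.MatchStatusCode != "" ||
		len(o.FilterString) > 0 ||
		len(o.MatchLinesCount) > 0 ||
		o.FilterDuplicates
}

// shouldWriteImmediately mirrors the decision described above: write during
// analysis only when storing is enabled and no matcher/filter is active.
// Otherwise the write is left to the output loop.
func shouldWriteImmediately(o *Options, storeResponse bool) bool {
	return storeResponse && !o.HasMatcherOrFilter()
}

func main() {
	plain := &Options{}
	matched := &Options{MatchStatusCode: "200"}
	fmt.Println(shouldWriteImmediately(plain, true))   // no matcher: write right away
	fmt.Println(shouldWriteImmediately(matched, true)) // matcher set: defer
}
```

Keeping the check in one method means the analyze step and the output loop can ask the same question instead of each re-inspecting individual flags.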

Changes

Matcher detection + response persistence

Layer / File(s)	Summary
Option predicate `runner/options.go`	Added `func (options *Options) HasMatcherOrFilter() bool` to report when any matcher or filter option (string fields, parsed slices, numeric/boolean flags) is set.
Core persistence change `runner/runner.go`	When `scanopts.StoreResponse
Tests `runner/runner_test.go`	Added `TestOptions_hasMatcherOrFilter`, `TestStoreResponse_withoutMatchersStoresAll`, and `TestStoreResponse_withMatcherSetsFlag` to validate matcher/filter detection and interaction with StoreResponse flags.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibble through options, clever and spry,
I sniff which matchers wake and which lie.
I skip the clutter, save only the prize,
Soft-footed responses under moonlit skies. ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'fixing output writing' is vague and generic, lacking specificity about what output writing issue is being fixed or the main change.	Consider a more descriptive title such as 'Store responses only for matched requests' or 'Fix response storage to respect matchers and filters' to better convey the core objective.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check	✅ Passed	The PR successfully implements the coding requirement from issue `#2418`: httpx now stores only responses matching the specified matcher/filter instead of saving all responses.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to the linked issue objective: implementing matcher/filter detection, updating response persistence logic, and adding comprehensive test coverage for the new functionality.


dogancanbakir · 2026-04-03T11:48:54Z

LGTM, just:

merge conflict

denisl · 2026-04-16T18:12:57Z

I tested this PR locally and it works as expected on my side. I also checked edge cases and did not notice regressions. If helpful, I can add a small test case covering this behavior to help with review/merge confidence.

# Conflicts: # runner/runner_test.go

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

runner/runner_test.go (1)
518-539: ⚡ Quick win

These tests don't actually verify storage behavior despite their names.

TestStoreResponse_withoutMatchersStoresAll and TestStoreResponse_withMatcherSetsFlag only assert the value of HasMatcherOrFilter() — they create an Options value, call ValidateOptions(), and check the flag. The runner is never instantiated, no responses are processed, and the contents of StoreResponseDir are never inspected. They cover the same surface as TestOptions_hasMatcherOrFilter would with two extra rows.

Given the regression being fixed (responses persisted regardless of -ms), a test that exercises the analyze/output pipeline and asserts on which files end up under StoreResponseDir/response/ for matched vs. unmatched responses would more credibly cover the bug. At minimum, rename these to reflect that they only assert flag detection (e.g., ..._FlagFalseWithoutMatchers/..._FlagTrueWithMatcher) so the names don't suggest behavior they don't verify.
Minimal rename diff
-func TestStoreResponse_withoutMatchersStoresAll(t *testing.T) {
+func TestStoreResponse_hasMatcherOrFilterFalseWithoutMatchers(t *testing.T) {
@@
-func TestStoreResponse_withMatcherSetsFlag(t *testing.T) {
+func TestStoreResponse_hasMatcherOrFilterTrueWithMatcher(t *testing.T) {
Want me to draft an integration-style test that runs analyze against a local httptest.Server and asserts on the files actually written under StoreResponseDir for matched vs. unmatched responses?
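
As a rough illustration of the kind of integration-style check suggested here, the sketch below starts a local `httptest.Server`, applies a status-code matcher, and then inspects which files actually exist on disk. `probeAndStore` and `storedFiles` are hypothetical helpers written for this example; a real test would drive the httpx runner and inspect `StoreResponseDir` instead.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
)

// probeAndStore is a hypothetical stand-in for the analyze/output pipeline:
// it requests url and writes a file into dir only when match accepts the
// response status code.
func probeAndStore(url, dir, name string, match func(status int) bool) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if !match(resp.StatusCode) {
		return nil // unmatched responses are never persisted
	}
	return os.WriteFile(filepath.Join(dir, name), []byte(resp.Status), 0o644)
}

// storedFiles runs the scenario against a local test server and returns the
// names of the files that ended up in the storage directory.
func storedFiles() ([]string, error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/ok", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	mux.HandleFunc("/missing", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusNotFound)
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	dir, err := os.MkdirTemp("", "store-response-*")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	only200 := func(status int) bool { return status == http.StatusOK }
	for _, p := range []string{"ok", "missing"} {
		if err := probeAndStore(srv.URL+"/"+p, dir, p+".txt", only200); err != nil {
			return nil, err
		}
	}

	entries, err := os.ReadDir(dir) // entries come back sorted by name
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	return names, nil
}

func main() {
	names, err := storedFiles()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(names)
}
```

The point of asserting on the directory listing rather than on a flag is that it would catch the original regression directly: an unmatched response leaving a file behind.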
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@runner/runner_test.go` around lines 518 - 539, The two tests
TestStoreResponse_withoutMatchersStoresAll and
TestStoreResponse_withMatcherSetsFlag are misnamed because they only exercise
Options.ValidateOptions() and HasMatcherOrFilter() rather than verifying files
written to StoreResponseDir; rename them to e.g.
TestOptions_FlagFalseWithoutMatchers and TestOptions_FlagTrueWithMatcher (update
the test function names and any comments) to reflect they only assert the
matcher-flag behavior (references: Options, ValidateOptions, HasMatcherOrFilter,
StoreResponseDir); alternatively replace them with an integration-style test
that instantiates a runner, sends requests to an httptest.Server, runs
analyze/output, and asserts which files appear under StoreResponseDir/response/
for matched vs unmatched responses if you want full coverage.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@runner/runner.go`:
- Around line 2461-2512: Remove the analyze-side file write block that runs when
scanopts.StoreResponse or scanopts.StoreChain is true (the code that creates
response files using os.OpenFile with O_EXCL, builds data from
requestDump/resp.Raw/resp.GetChain(), and sets fileNameHash), and instead
implement the O_EXCL + incremental _<idx> filename creation, the
truncation/omit-body logic that uses resp.Data (resp.Raw =
strings.ReplaceAll(resp.Raw, string(resp.Data), "") when r.options.OmitBody),
and the StoreChain append behavior inside the existing output loop that
currently calls os.WriteFile(responsePath, data, 0644); after successfully
creating/writing the file there update Result.FileNameHash (fileNameHash) to
include the _<idx> suffix so downstream consumers see the exact filename — this
consolidates persistence to a single path, prevents double-writes, preserves
uniqueness when r.options.HasMatcherOrFilter() is true, and ensures omit-body
works consistently.



ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f70c0739-a36b-4fc6-b39a-4a0b207fb8f9

📥 Commits

Reviewing files that changed from the base of the PR and between 3adf9ce and 98e6af0.

📒 Files selected for processing (2)

runner/runner.go
runner/runner_test.go

coderabbitai · 2026-05-06T13:31:49Z

+	// store response — when matchers/filters are active, defer writing to the
+	// output loop so only matched responses are persisted to disk.
 	if scanopts.StoreResponse || scanopts.StoreChain {
-		if r.options.OmitBody {
-			resp.Raw = strings.ReplaceAll(resp.Raw, string(resp.Data), "")
-		}
-		responsePath = fileutilz.AbsPathOrDefault(filepath.Join(responseBaseDir, domainResponseFile))
-		// URL.EscapedString returns that can be used as filename
-		respRaw := resp.Raw
-		reqRaw := requestDump
-		if len(respRaw) > scanopts.MaxResponseBodySizeToSave {
-			respRaw = respRaw[:scanopts.MaxResponseBodySizeToSave]
-		}
-		data := reqRaw
-		if scanopts.StoreChain && resp.HasChain() {
-			data = append(data, append([]byte("\n"), []byte(resp.GetChain())...)...)
-		}
-		data = append(data, respRaw...)
-		data = append(data, []byte("\n\n\n")...)
-		data = append(data, []byte(fullURL)...)
-		_ = fileutil.CreateFolder(responseBaseDir)
-
-		basePath := strings.TrimSuffix(responsePath, ".txt")
-		var idx int
-		for idx = 0; ; idx++ {
-			targetPath := responsePath
-			if idx > 0 {
-				targetPath = fmt.Sprintf("%s_%d.txt", basePath, idx)
-			}
-			f, err := os.OpenFile(targetPath, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0644)
-			if err == nil {
-				_, writeErr := f.Write(data)
-				_ = f.Close()
-				if writeErr != nil {
-					gologger.Error().Msgf("Could not write to '%s': %s", targetPath, writeErr)
+		fileNameHash = hash
+
+		if !r.options.HasMatcherOrFilter() {
+			if r.options.OmitBody {
+				resp.Raw = strings.ReplaceAll(resp.Raw, string(resp.Data), "")
+			}
+			responsePath = fileutilz.AbsPathOrDefault(filepath.Join(responseBaseDir, domainResponseFile))
+			// URL.EscapedString returns that can be used as filename
+			respRaw := resp.Raw
+			reqRaw := requestDump
+			if len(respRaw) > scanopts.MaxResponseBodySizeToSave {
+				respRaw = respRaw[:scanopts.MaxResponseBodySizeToSave]
+			}
+			data := reqRaw
+			if scanopts.StoreChain && resp.HasChain() {
+				data = append(data, append([]byte("\n"), []byte(resp.GetChain())...)...)
+			}
+			data = append(data, respRaw...)
+			data = append(data, []byte("\n\n\n")...)
+			data = append(data, []byte(fullURL)...)
+			_ = fileutil.CreateFolder(responseBaseDir)
+
+			basePath := strings.TrimSuffix(responsePath, ".txt")
+			var idx int
+			for idx = 0; ; idx++ {
+				targetPath := responsePath
+				if idx > 0 {
+					targetPath = fmt.Sprintf("%s_%d.txt", basePath, idx)
+				}
+				f, err := os.OpenFile(targetPath, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0644)
+				if err == nil {
+					_, writeErr := f.Write(data)
+					_ = f.Close()
+					if writeErr != nil {
+						gologger.Error().Msgf("Could not write to '%s': %s", targetPath, writeErr)
+					}
+					break
+				}
+				if !os.IsExist(err) {
+					gologger.Error().Msgf("Failed to create file '%s': %s", targetPath, err)
+					break
 				}
-				break
-			}
-			if !os.IsExist(err) {
-				gologger.Error().Msgf("Failed to create file '%s': %s", targetPath, err)
-				break
 			}
-		}

-		if idx == 0 {
-			fileNameHash = hash
-		} else {
-			fileNameHash = fmt.Sprintf("%s_%d", hash, idx)
+			if idx > 0 {
+				fileNameHash = fmt.Sprintf("%s_%d", hash, idx)
+			}
 		}
 	}


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Persistence is now split between analyze and the output loop, causing double writes without matchers and a uniqueness regression with matchers.

The deferred persistence path (the output loop at lines 1303-1340) still writes the response file unconditionally for every matched result via plain os.WriteFile(responsePath, data, 0644) — it does not use O_EXCL and does not apply the _<idx> suffix logic added here. Concrete consequences of the split:

Double write when no matcher/filter is active. Both this block and the output loop run, so the file freshly created here with O_EXCL is then re-written (overwritten) by the output loop's os.WriteFile, doubling I/O for every response.

Lost uniqueness when a matcher/filter is active. This branch is skipped, so fileNameHash stays at the bare hash (no _<idx> suffix). The hash is sha1(method + ":" + URL.EscapedString()), which is identical across distinct results that share method+URL — e.g. -probe-all-ips (different CustomIP, same URL string), the goto retry path after protocol fallback, or repeated probing. With matchers active those matched results all collide on the same <hash>.txt and silently overwrite each other.

OmitBody is a no-op in the deferred path. The output loop strips via strings.ReplaceAll(resp.Raw, resp.ResponseBody, ""), but Result.ResponseBody is only populated when -response/-base64-response/-mdc/-fdc are set (see lines 2180-2190). The branch here uses string(resp.Data) and works correctly; the deferred branch leaves the body in resp.Raw on disk whenever -omit-body is combined with a matcher/filter without one of those flags.
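
The second and third consequences can be reproduced in isolation. The sketch below assumes the hashing scheme exactly as quoted above (SHA-1 over method, a colon, and the URL string); `fileHash` and `stripBody` are hypothetical helpers, not functions from the httpx code base.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"strings"
)

// fileHash follows the scheme quoted in the review: sha1 over
// method + ":" + URL. Name and signature are hypothetical.
func fileHash(method, url string) string {
	sum := sha1.Sum([]byte(method + ":" + url))
	return hex.EncodeToString(sum[:])
}

// stripBody removes body from raw, the way the deferred path is described.
func stripBody(raw, body string) string {
	return strings.ReplaceAll(raw, body, "")
}

func main() {
	// Two results probed through different IPs still share method and URL,
	// so they map to the same file name.
	a := fileHash("GET", "https://example.com/")
	b := fileHash("GET", "https://example.com/")
	fmt.Println("same hash:", a == b)

	raw := "HTTP/1.1 200 OK\r\n\r\nhello"
	// An unpopulated body field leaves the raw response untouched.
	fmt.Println("empty body strips nothing:", stripBody(raw, "") == raw)
	// A populated body field is removed as intended.
	fmt.Println("populated body stripped:", stripBody(raw, "hello") != raw)
}
```

Because the hash has no per-result component, uniqueness has to come from the file-creation step rather than from the name itself.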

Recommendation: consolidate persistence into a single path. The cleanest fix is to drop this analyze-side write entirely and move the O_EXCL + _<idx> uniqueness logic and the string(resp.Data)-based OmitBody handling into the output loop (which already runs after matcher/filter evaluation) so uniqueness, body omission, and "only matched" all work together. Updating Result.FileNameHash in the output loop after a successful create would keep downstream consumers (index.txt, callers) consistent.
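
A minimal sketch of the uniqueness logic as a single reusable helper, modeled on the `O_EXCL` loop in the diff above. `writeUnique`, `fileNameHash`, and `demo` are hypothetical names introduced for this example; where such a helper would live in the runner is left open.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// writeUnique creates responsePath exclusively. If the name is taken it tries
// <base>_1.txt, <base>_2.txt, ... and returns the index that was used.
func writeUnique(responsePath string, data []byte) (int, error) {
	basePath := strings.TrimSuffix(responsePath, ".txt")
	for idx := 0; ; idx++ {
		targetPath := responsePath
		if idx > 0 {
			targetPath = fmt.Sprintf("%s_%d.txt", basePath, idx)
		}
		f, err := os.OpenFile(targetPath, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o644)
		if errors.Is(err, fs.ErrExist) {
			continue // name taken, try the next suffix
		}
		if err != nil {
			return idx, err
		}
		if _, err := f.Write(data); err != nil {
			_ = f.Close()
			return idx, err
		}
		return idx, f.Close()
	}
}

// fileNameHash derives the name recorded for downstream consumers so that it
// matches the file that was actually created.
func fileNameHash(hash string, idx int) string {
	if idx == 0 {
		return hash
	}
	return fmt.Sprintf("%s_%d", hash, idx)
}

// demo writes three responses that share one hash and returns the recorded names.
func demo() ([]string, error) {
	dir, err := os.MkdirTemp("", "responses-*")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "abc123.txt")
	var names []string
	for i := 0; i < 3; i++ {
		idx, err := writeUnique(path, []byte("response"))
		if err != nil {
			return nil, err
		}
		names = append(names, fileNameHash("abc123", idx))
	}
	return names, nil
}

func main() {
	fmt.Println(demo())
}
```

Calling one helper from the output loop only would give a single write per stored response, and the returned index keeps the recorded file name in step with what is on disk.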


fixing output writing

3adf9ce

auto-assign Bot requested a review from dogancanbakir March 22, 2026 12:35

dogancanbakir approved these changes Apr 3, 2026

View reviewed changes

Merge remote-tracking branch 'origin/dev' into pr-2462

98e6af0

# Conflicts: # runner/runner_test.go

coderabbitai Bot reviewed May 6, 2026

View reviewed changes

dogancanbakir merged commit 31af5c0 into dev May 6, 2026
14 checks passed

dogancanbakir deleted the 2418-store-response-only-matched branch May 6, 2026 13:32

dogancanbakir merged 2 commits into dev from 2418-store-response-only-matched
