fix(config): watch vector config paths for sinks and transforms #25133

Open
powerumc wants to merge 3 commits into vectordotdev:master from powerumc:powerumc/fix-config-watcher-reload

Conversation

Contributor

@powerumc powerumc commented Apr 7, 2026

Summary

Fixed an issue where concurrent modifications to Vector configuration files and enrichment tables resulted in only enrichment table changes being reloaded.

When ComponentConfig.component_type is Sink or Transform, config_paths is empty. As a result, simultaneous changes to Vector config files and enrichment tables are not fully detected, and only enrichment table changes are picked up.

This PR fixes that by adding the Vector configuration file (or directory) to config_paths, and by checking whether the changed Vector configuration path lies under a directory listed in config_paths.
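
To make the failure mode concrete, here is a minimal sketch of the matching described above. The Component struct and the touched_components function are simplified, hypothetical stand-ins for Vector's ComponentConfig and its contains method, and the paths are illustrative; this is not the code in the PR.

```rust
use std::path::PathBuf;

/// Hypothetical, simplified stand-in for Vector's `ComponentConfig`.
struct Component {
    id: String,
    /// Files or directories whose modification is attributed to this component.
    config_paths: Vec<PathBuf>,
}

/// Returns the ids of components touched by any path in `changed`.
/// A component is touched on an exact path match, or when a changed file
/// sits directly inside one of its watched directories.
fn touched_components(components: &[Component], changed: &[PathBuf]) -> Vec<String> {
    components
        .iter()
        .filter(|c| {
            c.config_paths.iter().any(|watched| {
                changed
                    .iter()
                    .any(|p| p == watched || p.parent() == Some(watched.as_path()))
            })
        })
        .map(|c| c.id.clone())
        .collect()
}
```

With an empty config_paths a sink can never appear in the result, so a batch that touches both example.toml and a .csv file is attributed to the enrichment table alone. Once the config directory is listed for the sink, the same batch reports both components.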

Vector configuration

For testing, add a Vector configuration file example.toml under the test directory. This configuration is a simple setup that prints Hello, Vector! every 2 seconds.

[sources.example_source]
type = "exec"
command = ["echo", "Hello, Vector!"]
mode = "scheduled"
scheduled.exec_interval_secs = 2

[sinks.console1]
type = "console"
inputs = ["example_source"]
encoding.codec = "text"

[sinks.console2]
type = "console"
inputs = ["example_source"]
encoding.codec = "text"

[enrichment_tables.example1_csv]
type = "file"
file.path = "/path/to/test/example1.csv"
file.encoding.type = "csv"
file.encoding.include_headers = true
schema."allow_header_fields" = "string"

[enrichment_tables.example2_csv]
type = "file"
file.path = "/path/to/test/example2.csv"
file.encoding.type = "csv"
file.encoding.include_headers = true
schema."allow_header_fields" = "string"

In the same test directory, create two Enrichment Table (.csv) files: example1.csv and example2.csv.

name, value
aaa, 111

How did you test this PR?

Run Vector:

cargo run -- -C test/ --watch-config

In another terminal session, from the test directory, change the Vector configuration file and the Enrichment Table (.csv) files at the same time:

echo '' >> example.toml && echo '' >> example1.csv && echo '' >> example2.csv

You would expect a reload because the Vector configuration changed, but only the enrichment table files reload:

Hello, Vector!
Hello, Vector!
INFO vector::config::watcher: Configuration file changed.
INFO vector::config::watcher: Component [ComponentKey { id: "example1_csv" }, ComponentKey { id: "example2_csv" }] configuration changed.
INFO vector::config::watcher: Only enrichment tables have changed.
Hello, Vector!
Hello, Vector!

After this PR, the Vector configuration reloads as expected. (When the Vector configuration reloads, enrichment tables are loaded again as well.)

# echo '' >> example.toml && echo '' >> example1.csv && echo '' >> example2.csv

Hello, Vector!
Hello, Vector!
INFO vector::config::watcher: Configuration file changed.
INFO vector::config::watcher: Component [ComponentKey { id: "example1_csv" }, ComponentKey { id: "console1" }, ComponentKey { id: "console2" }, ComponentKey { id: "example2_csv" }] configuration changed.
INFO vector::topology::running: Reloading running topology with new configuration.
INFO vector::topology::running: Running healthchecks.
INFO vector::topology::builder: Internal log [Healthcheck passed.] has been suppressed 1 times.
INFO vector::topology::builder: Healthcheck passed.
INFO vector::topology::builder: Internal log [Healthcheck passed.] is being suppressed to avoid flooding.
INFO vector::topology::running: New configuration loaded successfully.
INFO vector: Vector has reloaded. path=[Dir("test")] internal_log_rate_limit=false
Hello, Vector!
Hello, Vector!

When only the enrichment tables change, only the enrichment tables reload, as expected:

# echo '' >> example1.csv && echo '' >> example2.csv

Hello, Vector!
Hello, Vector!
INFO vector::config::watcher: Configuration file changed.
INFO vector::config::watcher: Component [ComponentKey { id: "example1_csv" }, ComponentKey { id: "example2_csv" }] configuration changed.
INFO vector::config::watcher: Only enrichment tables have changed.
Hello, Vector!
Hello, Vector!

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

powerumc added 2 commits April 7, 2026 15:47
* Append vector `config_paths` to the list of watched files for transforms and sinks.
* Enhance `ComponentConfig::contains` to correctly match modified files against watched directory paths (e.g., when using `--config-dir`).
@powerumc powerumc changed the title from "fix(config): watch vector config paths for sinks and transforms with enrichment table" to "fix(config): watch vector config paths for sinks and transforms" Apr 7, 2026
@powerumc powerumc force-pushed the powerumc/fix-config-watcher-reload branch from d3bf255 to d089638 Compare April 7, 2026 11:54
@powerumc powerumc marked this pull request as ready for review April 7, 2026 11:56
@powerumc powerumc requested a review from a team as a code owner April 7, 2026 11:56

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d089638b6a

    && config_paths
        .iter()
        .filter(|p| Format::from_path(p).is_ok())
        .any(|p| p.parent() == Some(path.as_path()))

P1 Badge Match config-dir updates by ancestry, not immediate parent

In ComponentConfig::contains, directory-based config paths only match when p.parent() == path, so changes under nested files like --config-dir .../transforms/foo/bar.toml are not attributed to sink/transform components. Because load_from_dir supports component subdirectories (and recursive transform subdirectories), a concurrent enrichment-table edit can still produce changed_components containing only enrichment tables, which triggers ReloadEnrichmentTables and skips reloading the changed Vector config.
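
The difference between the two matching strategies can be sketched with the standard library alone. The function names and the /etc/vector paths in the usage note are hypothetical and chosen only for illustration.

```rust
use std::path::Path;

/// Matches only when `changed` sits directly inside `watched_dir`.
fn is_direct_child(changed: &Path, watched_dir: &Path) -> bool {
    changed.parent() == Some(watched_dir)
}

/// Matches when `watched_dir` is any proper ancestor of `changed`.
/// `Path::ancestors` yields the path itself first, so that entry is skipped.
fn is_descendant(changed: &Path, watched_dir: &Path) -> bool {
    changed.ancestors().skip(1).any(|a| a == watched_dir)
}
```

For a watched directory /etc/vector, a file at /etc/vector/transforms/foo/bar.toml fails the direct-child check but passes the ancestry check, which is the gap this finding describes.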

Contributor Author

Before this PR, Vector reloads even when only enrichment table files are modified within subdirectories (including nested subdirectories) of the config directory.

For example, given the following directory structure, modifying only the .csv files still triggers a reload:

test -- example.toml
     -- example1.csv
     -- example2.csv
     -- sink -- example2.toml

$ echo '' >> example1.csv && echo '' >> example2.csv

INFO vector::config::watcher: Configuration file changed.
INFO vector::topology::running: Reloading running topology with new configuration.
INFO vector::topology::running: Running healthchecks.
INFO vector::topology::running: New configuration loaded successfully.
INFO vector: Vector has reloaded. path=[Dir("test")] internal_log_rate_limit=false

I’m not sure whether this is intended behavior in Vector or not.
If this is intended, then changes in subdirectories (including nested ones) may not need to be checked in contains().

Member

IMO it's worth addressing this finding by doing the following:

  • Add a small helper in src/config/watcher.rs that checks whether a changed_path either exactly matches a watched config file or is a recognized config file under a watched config dir.
  • Call it before the "Only enrichment tables have changed" branch.
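
One possible shape for such a helper is sketched below. The names is_watched_config_change and has_config_extension are hypothetical, and recognizing config files by the extensions toml, yaml, yml, and json is an assumption made for this sketch; in Vector the check would presumably go through Format::from_path, as in the diff under review.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: does `changed` point at a Vector config file we watch?
/// `watched` holds the config files and config directories passed to Vector.
fn is_watched_config_change(changed: &Path, watched: &[PathBuf]) -> bool {
    watched.iter().any(|w| {
        // Exact match on a watched config file.
        changed == w.as_path()
            // Or a file with a config extension anywhere under a watched directory.
            || (changed.starts_with(w) && has_config_extension(changed))
    })
}

/// Assumption for this sketch: config files are recognized by extension.
fn has_config_extension(path: &Path) -> bool {
    matches!(
        path.extension().and_then(|e| e.to_str()),
        Some("toml" | "yaml" | "yml" | "json")
    )
}
```

Keeping the extension filter separate means a .csv enrichment file stored inside the config directory is not mistaken for a config change.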

And for testing coverage:

  • A test in which transforms/foo/bar.toml changes in the same window as an enrichment-table file, asserting that we get a ReloadFromDisk signal.
  • A test showing enrichment-only changes still use ReloadEnrichmentTables.
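
The two requested tests pin down a simple decision rule, sketched here in isolation. Signal, Changed, and pick_signal are hypothetical, simplified types for illustration; they only mirror the signal names used in this thread and are not Vector's actual definitions.

```rust
/// Hypothetical, simplified mirror of the two reload signals named in this thread.
#[derive(Debug, PartialEq)]
enum Signal {
    ReloadFromDisk,
    ReloadEnrichmentTables,
}

/// Illustrative classification of what a filesystem event touched.
#[derive(Clone, Copy, PartialEq)]
enum Changed {
    ConfigFile,
    EnrichmentTableFile,
}

/// Any config-file event in the window forces a full reload.
/// A window with only enrichment-table events reloads just the tables.
fn pick_signal(events: &[Changed]) -> Option<Signal> {
    if events.is_empty() {
        None
    } else if events.contains(&Changed::ConfigFile) {
        Some(Signal::ReloadFromDisk)
    } else {
        Some(Signal::ReloadEnrichmentTables)
    }
}
```

The first test corresponds to a window containing both kinds of event, the second to a window containing only EnrichmentTableFile events.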

For simplicity, I would defer discussion on whether config-file changes under --config-dir should go straight to ReloadFromDisk instead of piggybacking on component attribution.

Contributor Author

In my opinion, instead of adding a new helper in watcher.rs, it might be sufficient to adjust the condition in the existing ComponentConfig.contains() function as follows:

//.any(|p| p.parent() == Some(path.as_path()))
.any(|p| p.starts_with(path))

(I haven’t tested this yet.)

This way, we would only need to add tests for the contains() function.
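
For anyone writing those tests, a few properties of Path::starts_with are worth keeping in mind. The sketch below wraps it in a hypothetical is_under function; the paths in the usage note are illustrative.

```rust
use std::path::Path;

/// True when `changed` is `watched` itself or lies anywhere beneath it.
/// `Path::starts_with` compares whole path components, not raw string prefixes.
fn is_under(changed: &Path, watched: &Path) -> bool {
    changed.starts_with(watched)
}
```

A nested file such as /etc/vector/transforms/foo/bar.toml matches /etc/vector, while /etc/vector-old/a.toml does not, because the comparison is per component. The watched path also matches itself, and a relative path never matches an absolute one, so both sides need to be in the same form before comparing.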

Member

Let's take a step back and look at the broader approach. Please bear with me, since this is a complex area due to the number of possible scenarios, especially when a config dir contains multiple configs (let's stick to a single config file for now). That's why my first hunch was to fix the watcher design.

Bug in current implementation

Walk through the provided reproduction steps, starting from:

echo '' >> example.toml && echo '' >> example1.csv && echo '' >> example2.csv

This appends a newline to all three files, but example.toml's parsed config is unchanged - zero components changed.

After the PR, the log shows:

Component [ComponentKey { id: "example1_csv" }, ComponentKey { id: "console1" }, ComponentKey { id: "console2" }, ComponentKey { id: "example2_csv" }] configuration changed.

But console1 and console2 didn't actually change. This is a bug.

The current master behavior of sending ReloadEnrichmentTables is actually correct here - only enrichment data files changed, and the config didn't actually change.
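
The distinction between "the file was touched" and "a component changed" can be sketched as a comparison of parsed configs rather than file events. ParsedConfig and changed_components are hypothetical names, and representing each component's settings as a normalized string is an assumption made to keep the example self-contained.

```rust
use std::collections::BTreeMap;

/// Component id mapped to a normalized rendering of its parsed settings.
/// Hypothetical stand-in for a parsed Vector config.
type ParsedConfig = BTreeMap<String, String>;

/// Ids that were added, removed, or whose settings differ between `old` and `new`.
fn changed_components(old: &ParsedConfig, new: &ParsedConfig) -> Vec<String> {
    let mut ids: Vec<String> = old
        .keys()
        .chain(new.keys())
        .filter(|id| old.get(*id) != new.get(*id))
        .cloned()
        .collect();
    // An id whose settings changed is visited once per map, so drop duplicates.
    ids.sort();
    ids.dedup();
    ids
}
```

Appending a newline to example.toml yields an identical parsed config, so this comparison reports nothing, whereas renaming a sink reports both the old and the new id.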

Next steps

Change the trigger to:

sed -i '' 's/console2/console3/' test/example.toml && echo '' >> test/example1.csv

Does Vector reload the full config (picking up the sink change), or (bug) does it only send ReloadEnrichmentTables, so that the sink change is lost?

Member

pront commented Apr 7, 2026

Hi @powerumc and thank you for contributing this fix. I will take a look shortly.

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d089638b6a
