feat(flagd): extract evaluator into api, core, and testkit packages#377
feat(flagd): extract evaluator into api, core, and testkit packages#377
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a modular architecture for the flagd provider by extracting core evaluation logic and API definitions into separate tools packages. It replaces the internal FlagStore with a new FlagdCore implementation and adds a testkit for compliance testing. I have no feedback to provide on the changes.
cd14cf4 to
4cd2612
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #377 +/- ##
==========================================
+ Coverage 95.91% 96.09% +0.18%
==========================================
Files 30 42 +12
Lines 1517 1563 +46
==========================================
+ Hits 1455 1502 +47
+ Misses 62 61 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Split the flagd evaluation logic from the provider into three independent packages under tools/, mirroring the Java SDK contrib architecture (PRs #1696 and #1742): - openfeature-flagd-api: Evaluator Protocol defining the contract for flag evaluation implementations - openfeature-flagd-core: Reference implementation with FlagdCore class, targeting engine, and custom operators (fractional, sem_ver, starts_with, ends_with) - openfeature-flagd-api-testkit: Compliance test suite bundling gherkin feature files from the test-harness evaluator directory The provider's InProcessResolver now delegates to FlagdCore via an adapter pattern, keeping connector code (FileWatcher, GrpcWatcher) unchanged. Old provider modules (flags.py, targeting.py, custom_ops.py) are thin re-exports from the core package for backward compatibility. Also updates the test-harness submodule from v2.11.1 to v3.5.0. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
- Implement fractional v2 bucketing algorithm (unsigned hash, integer arithmetic with bit-shift instead of percentage-based float) - Add MAX_WEIGHT_SUM overflow guard - Add negative weight clamping (max(0, weight)) - Add explicit bool-as-weight rejection - Support non-string variant types (str|float|int|bool|None) - Extract _resolve_bucket_by helper - Bump mmh3 dependency to >=5.0.0,<6.0.0 - Drop Python 3.9: update requires-python to >=3.10 for all tools packages Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
ad8727a to
f34c06a
Compare
- Fix ruff violations: UP007 (modern type unions), N818 (rename FlagStoreException to FlagStoreError), FURB171 (simplify membership test), PERF401 (use list comprehension), S101 (allow assert in steps) - Add py.typed marker files for PEP 561 compliance - Revert protobuf evaluation.v2 imports/config back to v1 - Run ruff format on all affected files Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
- Add tools/openfeature-flagd-{api,core,api-testkit} to build matrix
- Replace project.scripts with poe tasks to match CI expectations
- Add poethepoet dev dependency to all tools packages
- Remove obsolete scripts.py files
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
flagd-core implements the v2 fractional bucketing algorithm, so v1 test expectations don't match. Deselect @fractional-v1 tagged tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
The testkit is a test library, not a test suite. CI's `poe cov` failed with "no data collected" because tests/ was empty. Add smoke tests to verify the testkit can be imported and returns valid data. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Fix mypy errors: add return types, parameter types, use ErrorCode enum instead of string, and cast Mapping to dict for indexed assignment. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces the openfeature-flagd-api and openfeature-flagd-core packages to provide a modular evaluator implementation for flagd. It also refactors the existing openfeature-provider-flagd to utilize these new core components. I have provided feedback to improve performance by allowing flag configuration to be passed as a dictionary, avoiding redundant serialization, and suggested fixes for regex escaping and exception handling in the core implementation.
- Accept str | dict in Evaluator protocol and FlagdCore, eliminating the dict->JSON->dict roundtrip in _FlagStoreAdapter - Fix ReferenceError handler: use exception instance, not class, and log flag.targeting instead of the function object - Escape evaluator names in $ref regex replacement (re.escape) - Fix backward-compat FlagStore to emit changed_keys, not all keys - Fix README import paths (was api.testkit, should be testkit) - Add content to flagd-core CHANGELOG.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request refactors the flagd provider by extracting its core evaluation logic into three new standalone packages: openfeature-flagd-api (protocol definition), openfeature-flagd-core (reference implementation), and openfeature-flagd-api-testkit (compliance suite). The InProcessResolver has been updated to use the new FlagdCore evaluator, and existing modules have been refactored to re-export logic for backward compatibility. Review feedback highlights a potential breaking change in the FlagStore.update method signature and suggests a correction for JSON parsing in the testkit utilities.
Replace vendored feature/flag files with a hatch build hook that copies them from the test-harness submodule's evaluator/ directory. Files are gitignored and generated fresh on each build via force_include. Also: - Update test-harness submodule to v3.5.0 (adds @fractional-v1/v2 tags) - Add fractional-v1 deselect to provider pytest.ini - Remove redundant flags.py re-export from testkit - Address review feedback (dict passthrough, ReferenceError fix, etc.) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
The hatch build hook includes files in sdist/wheel via force_include, but tests run from the source tree. Add a sync script that copies files from the test-harness submodule, called by poe before test/cov. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
flagd-core e2e tests depend on testkit feature files which are generated from the test-harness submodule, not checked in. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Explain why the replace('\"', '"') is needed — pytest-bdd preserves
backslash escapes from Gherkin table cells.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Simon Schrottner <simon.schrottner@gmail.com>
|
I'm reviewing this, will complete tomorrow. Thanks @aepfli |
|
|
||
| def update(self, flags_data: dict) -> None: | ||
| json_str = json.dumps(flags_data) | ||
| changed_keys = self.evaluator.set_flags_and_get_changed_keys(json_str) |
There was a problem hiding this comment.
set_flags_and_get_changed_keys already accepts dict, so this serializes to JSON only for FlagdCore to immediately deserialize it. Could pass flags_data directly and skip the round-trip.
| @@ -0,0 +1,6 @@ | |||
| class FlagStoreError(Exception): | |||
There was a problem hiding this comment.
Is this raised anywhere? I don't see it used in flagd-core. If it's part of the Evaluator contract, it should be documented (when should implementors raise it?). If not, consider removing it.
| def set_flags_and_get_changed_keys( | ||
| self, flag_configuration: str | dict[str, typing.Any] | ||
| ) -> list[str]: | ||
| with self._lock: |
There was a problem hiding this comment.
This JSON parsing block is duplicated from set_flags above (lines 71-75). Consider extracting a _parse_flag_configuration helper.
| default_value: T, | ||
| metadata: Mapping[str, float | int | str | bool], | ||
| reason: Reason, | ||
| ) -> FlagResolutionDetails: |
There was a problem hiding this comment.
nit: return type should be FlagResolutionDetails[T] (parameterized) for type safety.
|
This mirrors the Java api/core/testkit split, which is great. One thing worth thinking through early: this creates an inter-package dependency chain (provider -> core -> api), and that has ongoing management consequences. The provider now depends on The Some concrete suggestions:
None of this is necessarily blocking, but the release coordination overhead is real (we've felt it in Java), and it's easier to set up the guardrails now than to retrofit them after a few broken releases. |
|
I pulled this locally and did a careful review. I found a few issues worth fixing I think, but fundamentally I'm close to approving. Please consider this carefully... WDYT about going right to 1.0? Or going to 1.0 very quickly after merging? |
Summary
Mirrors the Java SDK contrib architecture (PR #1696, PR #1742) by extracting the flagd evaluation logic into three independent packages:
openfeature-flagd-api(tools/openfeature-flagd-api/):EvaluatorProtocol defining the contract for flag evaluation, so others can implement their own evaluatoropenfeature-flagd-core(tools/openfeature-flagd-core/): Reference implementation (FlagdCore) with targeting engine and custom operators (fractional v2, sem_ver, starts_with, ends_with)openfeature-flagd-api-testkit(tools/openfeature-flagd-api-testkit/): Compliance test suite bundling gherkin feature files from the test-harnessevaluator/directory, with pytest-bdd step definitions — installable as a package so custom evaluator implementations can run the same compliance suiteProvider refactoring
InProcessResolvernow delegates evaluation toFlagdCorevia an adapter patternflags.py,targeting.py,custom_ops.py) are thin re-exports from core for backward compatibilityFileWatcher,GrpcWatcher) remain unchangedFractional bucketing
flagd-coreimplements the v2 fractional algorithm (unsigned hash, integer arithmetic with(hash * totalWeight) >> 32)MAX_WEIGHT_SUMoverflow guard, negative weight clamping, explicit bool-as-weight rejectionCI & release
tools/*packages to the build workflow matrix (lint, mypy, tests on Python 3.10–3.14)py.typedmarker files for PEP 561 complianceproject.scriptswithpoethepoettasks to match CI conventionstools/*to UV workspace membersOther changes
evaluator/directory with gherkin feature files)Test plan
openfeature-flagd-apiunit tests: 10 passedopenfeature-flagd-api-testkitsmoke tests: 2 passed, mypy cleanopenfeature-flagd-coreunit tests: 27 passedopenfeature-flagd-coree2e (testkit compliance): 85 passed, 15 deselected (fractional-v1), 0 failuresHow to use the testkit (for custom evaluator implementations)
🤖 Generated with Claude Code