feat: Databricks Unity Catalog Materialization #6565
Draft
falloficaruss wants to merge 24 commits into
Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
…tion methods Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
…apply Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
…, fix operator precedence Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
…cks-specific SDKs Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
Clicking any image in a blog post now opens a fullscreen overlay with the image centered on a dark backdrop. Close with click or Escape. Signed-off-by: Francisco Javier Arceo <farceo@redhat.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Aniket Paluska <apaluska@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Aniket Paluskar <apaluska@redhat.com>
* feat: scaffold Aerospike online store
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* feat: implement Aerospike online_write_batch
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* feat: implement Aerospike online_read
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* feat: implement Aerospike update and teardown
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* test: add Aerospike unit and integration tests
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* feat: add async online_read/write and lifecycle hooks for Aerospike
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* docs: add Aerospike online store reference and tuning guide
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* fix: use bytearray keys and zip-based batch mapping for Aerospike reads
The Aerospike Python client mishandles bytes user keys (hashes only the first byte), collapsing all entities onto the same digest. Wrap keys in bytearray on write and read. Also pair BatchRecord responses with original input keys via zip rather than trusting br.key[2], which the client returns in a different representation on reads.
Add two integration tests: cross-FV Map CDT coexistence and update(tables_to_delete=...) background scan.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
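The key handling described in this commit can be sketched without a running server. The helper names below (`to_user_key`, `pair_results_with_keys`) are hypothetical and not taken from the PR; the sketch only illustrates wrapping the serialized key in a `bytearray` and pairing responses with the request keys by position.

```python
from typing import Any, Dict, List, Optional, Tuple


def to_user_key(serialized_key: bytes) -> bytearray:
    """Wrap a serialized entity key in bytearray before handing it to the client.

    Hypothetical helper: per the commit message, bytes keys collapse onto one
    digest, so both the write path and the read path use bytearray.
    """
    return bytearray(serialized_key)


def pair_results_with_keys(
    entity_keys: List[bytes],
    batch_results: List[Optional[Dict[str, Any]]],
) -> List[Tuple[bytes, Optional[Dict[str, Any]]]]:
    """Pair each response with the key that was sent, relying on request order.

    The key echoed back in the response is never inspected; position in the
    batch is the only link between request and response.
    """
    if len(entity_keys) != len(batch_results):
        raise ValueError("batch response length does not match request length")
    return list(zip(entity_keys, batch_results))
```

Pairing by position only works if the batch call preserves request order, which a later commit message in this PR states is the case for `batch_operate`.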
* docs: clarify Aerospike auth and TLS sections are Enterprise-only
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* feat: add aerospike to feast-operator supported online stores
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* fix(aerospike): project requested_features server-side and surface per-record errors
Two review blockers rolled into one commit because they share the same code path.
1. Server-side projection. online_read now builds a map_get_by_key_list op nested into the feature-view submap via cdt_ctx_map_key when requested_features is provided, instead of fetching the whole FV slot and filtering in Python. For wide feature views this ships only the requested columns over the wire. The response shape (flat [k,v,k,v] list vs. dict) is normalized through _normalize_projected_features.
2. Per-record error surfacing. Both batch_write and batch_operate only raise when the whole request is rejected; partial failures (single-partition timeout, replica quorum miss) are otherwise silent and present downstream as missing features. online_read now distinguishes RECORD_NOT_FOUND (2) and OP_NOT_APPLICABLE (26, = nested ctx miss when FV slot is absent) from transient errors, which are raised. online_write_batch inspects every per-record result code after the batch call.
Unit tests cover all four paths: projected read, not-found, op-not-applicable (nested ctx miss), and a simulated TIMEOUT that must raise. The docker-backed cross-FV and update() integration tests still pass, so server-side projection is verified end-to-end against a real Aerospike server.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
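A minimal sketch of the two behaviours described in this commit, assuming plain integers for result codes and plain Python containers for responses. The codes 2 and 26 come from the commit message; `classify_read_result`, `normalize_projected_features` (written here without the leading underscore) and `TransientRecordError` are illustrative stand-ins, not the PR's actual code.

```python
from typing import Any, Dict, List, Optional, Union

OK = 0                  # Aerospike success code
RECORD_NOT_FOUND = 2    # from the commit message
OP_NOT_APPLICABLE = 26  # from the commit message: nested ctx miss, FV slot absent


class TransientRecordError(RuntimeError):
    """Hypothetical error type for per-record failures that must not be silent."""


def classify_read_result(result_code: int) -> str:
    """Return 'ok' or 'missing'; raise for any other per-record result code."""
    if result_code == OK:
        return "ok"
    if result_code in (RECORD_NOT_FOUND, OP_NOT_APPLICABLE):
        return "missing"
    raise TransientRecordError(f"per-record failure, result code {result_code}")


def normalize_projected_features(
    raw: Union[None, Dict[str, Any], List[Any]],
) -> Optional[Dict[str, Any]]:
    """Normalize a projected response to a dict.

    Accepts either a dict or a flat [k, v, k, v, ...] list. An empty list
    becomes {} so an empty-but-present slot stays distinguishable from None.
    """
    if raw is None:
        return None
    if isinstance(raw, dict):
        return dict(raw)
    if len(raw) % 2 != 0:
        raise ValueError("flat key/value list must have even length")
    return dict(zip(raw[0::2], raw[1::2]))
```

Treating the two "missing" codes as ordinary absent features while raising on everything else keeps a partial batch failure from being mistaken for a feature that was never written.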
* feat(aerospike)!: rename total_timeout_ms -> batch_total_timeout_ms and add socket_timeout_ms
Review feedback: total_timeout_ms was ambiguous (users read it as a global/end-to-end timeout) and the timeout surface was missing socket_timeout, which is the per-attempt trigger that lets max_retries actually fire within the total budget.
* total_timeout_ms -> batch_total_timeout_ms. Now explicitly named after the Aerospike batch policy it maps to, matches read_timeout_ms / write_timeout_ms in framing (each targets one policy scope).
* Add socket_timeout_ms (optional). Applies uniformly to read, write and batch policies when set. Leaves the Aerospike client default in place when unset.
BREAKING CHANGE: total_timeout_ms is renamed to batch_total_timeout_ms. Config files using the old name must be updated. No default value change.
Docs updated (reference + perf-tuning guide) with a short explainer on the per-attempt vs total deadline distinction. Two new unit tests pin the policy wiring: socket_timeout_ms propagates to all three scopes, and is omitted (not injected as None) when unset.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
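The policy wiring pinned by the two unit tests can be sketched as follows. `build_policies` is a hypothetical function, and the dictionary keys `total_timeout` and `socket_timeout` are assumed to mirror the Aerospike Python client's policy names; the config field names come from the commit message.

```python
from typing import Dict, Optional


def build_policies(
    read_timeout_ms: int,
    write_timeout_ms: int,
    batch_total_timeout_ms: int,
    socket_timeout_ms: Optional[int] = None,
) -> Dict[str, Dict[str, int]]:
    """Build per-scope policy dicts from the store config values."""
    policies = {
        "read": {"total_timeout": read_timeout_ms},
        "write": {"total_timeout": write_timeout_ms},
        "batch": {"total_timeout": batch_total_timeout_ms},
    }
    # Only inject socket_timeout when the user set it; leaving the key out
    # keeps the client default rather than overriding it with None.
    if socket_timeout_ms is not None:
        for scope in policies.values():
            scope["socket_timeout"] = socket_timeout_ms
    return policies
```

With `socket_timeout_ms` smaller than the total timeout, a single slow attempt can time out early enough for a retry to fit inside the total budget, which is the per-attempt versus total distinction the commit describes.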
* refactor(aerospike): use MAP_KEY_ORDERED, KEY_DIGEST, and instance-scoped client
Cheap-win cleanups flagged in review, all touching the same small patch of write-path and lifecycle code.
* Map CDTs are now created with MAP_KEY_ORDERED. map_get_by_key / map_remove_by_key on an ordered map are O(log N) in the map size instead of O(N); matters on reads of wide feature views and on the update() background scan (which walks every record in the project's set).
* Writes drop POLICY_KEY_SEND and rely on the client default (POLICY_KEY_DIGEST). The serialized entity key is no longer stored alongside each record, saving per-record storage the read path never consumes (batch_operate preserves request order; results are paired back by zip in online_read).
* _client moves from a class attribute to an instance attribute (set in __init__). Previously two AerospikeOnlineStore instances could share the cached client through class state until one wrote self._client. With the instance attribute the state is always per-instance from construction.
* Drop MongoDB references from class docstrings and comments (they referred to how the storage layout was derived rather than documenting current behavior). Also rewrite the _build_batch_writes docstring to describe the policies applied on the write path.
Unit test assertions for the write-path record are updated: bw.policy is now None (client default applies) and map ops carry map_policy={'map_order': MAP_KEY_ORDERED}. All three docker-backed integration tests still pass end-to-end (cross-FV upsert, update() background scan, full feature-store round-trip), so the read/write shape survives the ordering and policy changes against a real server.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
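The class-attribute versus instance-attribute point is a general Python behaviour and can be shown in isolation. The two classes below are hypothetical stand-ins, not the real `AerospikeOnlineStore`.

```python
class ClassScopedStore:
    # Class attribute: every instance reads this same slot until it
    # assigns its own self._client.
    _client = None


class InstanceScopedStore:
    def __init__(self) -> None:
        # Instance attribute: each store owns its slot from construction.
        self._client = None


def demo() -> dict:
    """Show how class-level state leaks across instances."""
    a, b = ClassScopedStore(), ClassScopedStore()
    ClassScopedStore._client = "shared-connection"  # cached at class level
    leaked = a._client == b._client == "shared-connection"

    x, y = InstanceScopedStore(), InstanceScopedStore()
    x._client = "connection-x"
    isolated = y._client is None

    ClassScopedStore._client = None  # reset the class state after the demo
    return {"leaked": leaked, "isolated": isolated}
```

Calling `demo()` shows that both `ClassScopedStore` instances observe the value cached on the class, while setting `_client` on one `InstanceScopedStore` leaves the other untouched.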
* feat(aerospike): add per-FV namespace/set overrides and prewriting hook
Adds three configuration knobs to AerospikeOnlineStoreConfig:
- namespace_overrides: pin individual feature views to a different
Aerospike namespace (e.g. RAM-only vs. SSD-backed) without splitting
the project across stores.
- set_overrides: place a feature view in its own set so admin ops on
it (truncate, scan-based deletes during `feast apply`) do not touch
records of other views.
- prewriting_hook: import-string-resolved callable invoked once per
online_write_batch with the rows about to be written, returning the
rows that actually go on the wire. Resolved and cached on first use;
returning [] short-circuits the wire call.
Read, write, update and teardown paths all honour the per-FV ns/set
resolution. update() groups dropped feature views by their resolved
(ns, set) pair and issues one background scan per group. teardown()
truncates every unique (ns, set) pair the project may have written to,
including the store-level default.
Adds 22 unit tests for the new behaviour and updates 3 existing call
sites of _build_batch_writes for the new namespace= parameter. Adds a
sample hook module under examples/online_store/aerospike_overrides_and_hooks/
and corresponding sections in docs/reference/online-stores/aerospike.md.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
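A rough sketch of the hook resolution and the per-FV location grouping described in this commit. The `module:attribute` import-string format and the names `resolve_hook`, `rows_to_write` and `group_by_location` are assumptions for illustration; the PR's actual format and helpers may differ.

```python
import importlib
from collections import defaultdict
from functools import lru_cache
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple

Rows = List[Any]


@lru_cache(maxsize=None)
def resolve_hook(import_string: str) -> Callable[[Sequence[Any]], Rows]:
    """Resolve 'package.module:callable' once and cache the result."""
    module_name, _, attr = import_string.partition(":")
    if not module_name or not attr:
        raise ValueError("expected an import string of the form 'module:callable'")
    hook = getattr(importlib.import_module(module_name), attr)
    if not callable(hook):
        raise TypeError(f"{import_string} does not resolve to a callable")
    return hook


def rows_to_write(rows: Sequence[Any], hook_import_string: Optional[str]) -> Rows:
    """Apply the prewriting hook, if configured; [] means nothing is sent."""
    if hook_import_string is None:
        return list(rows)
    return list(resolve_hook(hook_import_string)(rows))


def group_by_location(
    feature_views: Sequence[str],
    default_namespace: str,
    default_set: str,
    namespace_overrides: Dict[str, str],
    set_overrides: Dict[str, str],
) -> Dict[Tuple[str, str], List[str]]:
    """Group feature views by their resolved (namespace, set) pair."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for name in feature_views:
        namespace = namespace_overrides.get(name, default_namespace)
        set_name = set_overrides.get(name, default_set)
        groups[(namespace, set_name)].append(name)
    return dict(groups)
```

Grouping first means an operation such as the background scan in `update()` runs once per distinct `(ns, set)` pair rather than once per dropped feature view.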
* test: update aerospike image tag
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* chore: sync README template and secrets baseline after master merge
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* chore: fix secrets baseline line number for v1 operator types
Adding aerospike to the feast-operator enum shifted the allowlisted
SecretRef entry in api/v1/featurestore_types.go by one line.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* docs: update aerospike docs
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* fix(aerospike): wire batch max_retries and fix empty projection handling
Copilot review feedback on PR feast-dev#6532:
- Add max_retries to the batch client policy (batch_operate/batch_write path)
- Treat empty projected feature maps as present FV slots (is not None)
- Return {} from _normalize_projected_features([]) instead of None
- Fix projection unit test mock/assertions
- Correct prewriting_hook config docstring
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* style(aerospike): format online_read docs assignment for ruff
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* chore: update pixi.lock for aerospike optional extra
Regenerate the v6 lockfile with Pixi v0.63.1 after adding the aerospike extra to pyproject.toml.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
* fix(aerospike): add client init lock and batch chunking
Guard lazy client creation with a lock to avoid connection leaks under concurrent first use, and chunk batch reads/writes by batch_max_records so large materializations stay under Aerospike server batch limits.
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
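The two fixes in this commit follow common patterns and can be sketched in general form. `LazyClient` and `chunked` are hypothetical names; `factory` stands in for whatever creates the real client, and `size` plays the role of `batch_max_records`.

```python
import threading
from typing import Callable, Generic, Iterator, Optional, Sequence, TypeVar

T = TypeVar("T")
C = TypeVar("C")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


class LazyClient(Generic[C]):
    """Create the client on first use, at most once, even under concurrency."""

    def __init__(self, factory: Callable[[], C]) -> None:
        self._factory = factory
        self._client: Optional[C] = None
        self._lock = threading.Lock()

    def get(self) -> C:
        # Fast path: no locking once the client exists.
        if self._client is None:
            with self._lock:
                # Re-check inside the lock so only one thread calls the factory.
                if self._client is None:
                    self._client = self._factory()
        return self._client
```

Without the second check inside the lock, two threads that both saw `None` could each create a client, and one of those connections would be leaked.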
---------
Signed-off-by: Valentyn Kahamlyk <valentin.kagamlyk@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Aniket Paluskar <apaluska@redhat.com>
* fix: Unblock nightly UI build
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
* ci: Add UI production build check
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
---------
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
2850a47 to 386599e
…nd spark utils Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
… into feat/databricks-uc-materialization
Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
… into feat/databricks-uc-materialization
Signed-off-by: Abhishek Shinde <norizzabhii@gmail.com>
What this PR does / why we need it:
The L3 PR adds UC-backed materialization: the ability to write materialized features back to Unity Catalog Delta tables during `feast materialize`.
When `feast materialize` runs, the compute engine nodes (`LocalOutputNode` in `local/nodes.py`, `SparkWriteNode` in `spark/nodes.py`) now call a new `write_uc_materialized_data()` hook after writing to the online/offline stores. This hook:
Which issue(s) this PR fixes:
Fixes #6499
Checks
git commit -s)
Testing Strategy
Misc