refactor(package): stable lakeflow_framework import namespace with bundled config/schemas#87
Open
rederik76 wants to merge 2 commits into
…rk package
- Introduce src/lakeflow_framework/ as a proper Python package with pyproject.toml (hatchling build); config/ and schemas/ bundled as package data
- Implement Strategy B (Workspace Files-first) resolver in config_resolver: load_framework_default_json resolves via Workspace Files → importlib.resources → local/config/ overlay; load_framework_schema returns an importlib.resources traversable for bundled JSON schemas
- Reduce src/*.py shims to thin re-exports from lakeflow_framework for backward compat
- Add contrib/ extension point with README and __init__ stub
- Add tests: test_package.py (import surface), test_strategy_b_resolver.py (resolver + schema)
- Update all internal imports across dataflow/, dataflow_spec_builder/, and support modules
- Add docs: ADR-0007 (package layout), ADR-0008 (Workspace Files-first resolver), deploy_wheel.rst, deploy_framework_overview.rst, contributor_contrib.rst; update all existing docs/ pages to reference lakeflow_framework imports
refactor(package): restructure src/ into an importable lakeflow_framework package
Summary
Restructures the framework's flat src/*.py layout into a proper lakeflow_framework Python package with a root pyproject.toml. The primary goal is a stable, globally unique import namespace (lakeflow_framework.*) and portable bundling of default config and JSON schemas as package data.

The default deployment model is unchanged: flat DAB (bundle) deploy remains the preferred and default path for all customers. You still clone the repo and run databricks bundle deploy, with framework.sourcePath pointing at the deployed src/ on Workspace Files. This PR additionally makes the package pip-installable as an optional distribution path for teams that manage Python dependencies via PyPI, a UC Volume, or an internal Artifactory feed, but no one is required to switch.

Two ADRs capture the design:
- ADR-0007: lakeflow_framework package layout, compat shims, packaging, deprecation timeline.
- ADR-0008: Workspace Files-first config/schema resolver (Strategy B).

Deployment modes
Flat DAB deploy stays the default; the wheel is an optional add-on for specific dependency-management needs:
- Flat DAB deploy (default): databricks bundle deploy; framework.sourcePath points at deployed src/; cluster reads modules and default config directly.
- Wheel (optional): pip install lakeflow-framework; defaults/schemas bundled in the wheel via importlib.resources. framework.sourcePath still set so src/local/config/ sparse overrides deep-merge on top.

What changed
Package restructure (primary change)
- src/lakeflow_framework/ package; all internal imports use absolute lakeflow_framework.* names (no bare imports remain inside the package). This removes the shadow-import risk where a customer bundle module could collide with a bare framework module name.
- config/default/** and schemas/** bundled as package data so defaults/schemas travel with the package regardless of deploy mode.

Config/schema resolver (Strategy B, ADR-0008)
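For orientation, the three-step precedence described in this section can be sketched roughly as follows. This is not the code in config_resolver.py: the function names, the `read_packaged` callback, and the `config/default/` and `local/config/` locations relative to `framework.sourcePath` are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def deep_merge(base: dict, overlay: dict) -> dict:
    """Return a new dict; overlay values win and nested dicts merge recursively."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_default_json(
    name: str,
    source_path: Optional[Path],
    read_packaged: Callable[[str], Optional[dict]],
) -> dict:
    """Hypothetical stand-in for the resolver described in this PR."""
    defaults = None

    # Step 1: Workspace Files under framework.sourcePath, if present.
    # (The config/default/ location under source_path is an assumption.)
    if source_path is not None:
        candidate = source_path / "config" / "default" / name
        if candidate.is_file():
            defaults = json.loads(candidate.read_text(encoding="utf-8"))

    # Step 2: package data. In an installed package this callback could be
    # built on importlib.resources.files(...); it is injected here so the
    # sketch runs without the package installed.
    if defaults is None:
        defaults = read_packaged(name)
    if defaults is None:
        raise FileNotFoundError(name)

    # Step 3: sparse overlay deep-merged on top, whichever step supplied defaults.
    if source_path is not None:
        overlay_file = source_path / "local" / "config" / name
        if overlay_file.is_file():
            overlay = json.loads(overlay_file.read_text(encoding="utf-8"))
            defaults = deep_merge(defaults, overlay)

    return defaults
```

Passing the package-data reader in as a callback keeps the precedence logic testable in isolation; the real resolver may structure this differently.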
- load_framework_default_json is the single resolver. Resolution order: (1) Workspace Files under framework.sourcePath if present → (2) importlib.resources package data → (3) src/local/config/ sparse overlay deep-merged on top (ADR-0006 behavior preserved).
- load_framework_schema returns an importlib.resources traversable for jsonschema validators; os.path.join(...) call sites migrated to the resolver.

Optional pip packaging (secondary)
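A minimal sketch of the dual VERSION lookup this section describes, assuming installed metadata is tried first and the VERSION file is the fallback. That ordering, the function name, and its parameters are assumptions; the PR only states that both mechanisms are used.

```python
from importlib import metadata
from pathlib import Path


def resolve_version(dist_name: str, version_file: Path) -> str:
    """Hypothetical helper: prefer installed distribution metadata, else read VERSION."""
    try:
        # Wheel install: the version is recorded in the distribution metadata.
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # Flat deploy: no installed distribution, so read the file directly.
        return version_file.read_text(encoding="utf-8").strip()
```

With a wheel installed, a call such as `resolve_version("lakeflow-framework", Path("VERSION"))` would take the first branch; on a flat deploy it would fall through to the file read.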
- pyproject.toml (setuptools build); VERSION is the single source of truth, resolved by both importlib.metadata (wheel) and direct file read (editable/flat deploy).
- Install targets: lakeflow-framework, [contrib] (currently an empty no-op), [all].
- contrib/ extension point (empty __init__.py + support-policy README.rst); no modules land here in this PR.

Backward compatibility
- src/*.py locations reduced to thin re-export shims (e.g. from lakeflow_framework.logger import *), kept until v1.0.0. Existing notebooks/bundles importing bare names keep working unchanged.

Tests
- tests/test_package.py: public import surface.
- tests/test_strategy_b_resolver.py: resolver precedence + schema resolution across deploy modes.

Docs
- deploy_framework_overview.rst (positions flat DAB as default, wheel as optional), deploy_wheel.rst, contributor_contrib.rst, contributor_dev_env.rst.
- Existing docs pages reference lakeflow_framework imports in new code; docs/conf.py reads release from VERSION.

Deprecation timeline
- This release: lakeflow_framework package introduced; bare src/ imports still work via shims; flat DAB deploy remains default.
- v1.0.0: src/*.py paths removed.

Diff footprint
~128 files changed (+3,741 / −1,979), mostly file moves (src/X.py → src/lakeflow_framework/X.py) plus import rewrites; net-new code concentrated in config_resolver.py, constants.py, __init__.py, contrib/, tests, and docs.

Test plan
- Flat DAB deploy with framework.sourcePath set; defaults/schemas load from Workspace Files (step 1); behavior identical to pre-v0.16.0.
- Bare imports (e.g. from constants import ...) still resolve via shims.
- src/local/config/ overlay deep-merges on top in all deploy modes.
- pytest tests/test_package.py tests/test_strategy_b_resolver.py passes; full suite green.
- pip install -e . and pip install -e ".[all]" succeed; lakeflow_framework.__version__ matches VERSION; python -m build wheel contains config/default/** and schemas/**; wheel install without framework.sourcePath loads defaults from package data (step 2).
- … VERSION.
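The wheel-contents item in the test plan could be checked with a small script along these lines. This is a sketch only: the helper name is hypothetical, and the `lakeflow_framework/config/default/` and `lakeflow_framework/schemas/` prefixes inside the wheel are an assumed layout, not something this PR confirms.

```python
import zipfile
from pathlib import Path
from typing import List, Sequence


def missing_package_data(wheel_path: Path, required_prefixes: Sequence[str]) -> List[str]:
    """Return the required path prefixes that no file in the wheel starts with."""
    # A wheel is a zip archive, so the standard zipfile module can list it.
    with zipfile.ZipFile(wheel_path) as wheel:
        names = wheel.namelist()
    return [
        prefix
        for prefix in required_prefixes
        if not any(name.startswith(prefix) for name in names)
    ]
```

An empty list means every expected prefix is present; for a wheel produced by python -m build, the wheel path would be whatever file lands under dist/.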