Fix/modelbuilder deploy datacacheconfig 5750 #5764

Open
aviruthen wants to merge 10 commits into aws:master from aviruthen:fix/modelbuilder-deploy-datacacheconfig-5750

@aviruthen
Collaborator

The issue requests exposing additional CreateInferenceComponent API parameters through ModelBuilder.deploy(), primarily DataCacheConfig, BaseInferenceComponentName, Container specification, and VariantName. The _deploy_core_endpoint method in model_builder.py builds InferenceComponentSpecification but does not pass through these parameters. The sagemaker.core.shapes module already has InferenceComponentDataCacheConfig and related shapes. The fix requires: (1) adding new optional parameters to the deploy() method and _deploy_core_endpoint(), (2) wiring those parameters into the InferenceComponentSpecification and InferenceComponent.create() call, and (3) making variant_name configurable instead of hardcoded to 'AllTraffic'. The deploy wrappers in model_builder_servers.py pass **kwargs through to _deploy_core_endpoint, so they require no changes.
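The wiring described above can be sketched roughly as follows. This is a standalone illustration, not the code in this PR: `build_inference_component_request`, `deploy_wrapper`, and their snake_case parameter names are hypothetical, and the request is built as a plain dict shaped like a CreateInferenceComponent request rather than with the `sagemaker.core.shapes` classes.

```python
from typing import Any, Dict, Optional

# Value the description says was previously hardcoded.
DEFAULT_VARIANT_NAME = "AllTraffic"


def build_inference_component_request(
    name: str,
    endpoint_name: str,
    model_name: Optional[str] = None,
    variant_name: Optional[str] = None,
    data_cache_config: Optional[Dict[str, Any]] = None,
    base_inference_component_name: Optional[str] = None,
    container: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build a CreateInferenceComponent-style request dict."""
    spec: Dict[str, Any] = {}
    # Only include optional fields the caller actually supplied, so that
    # existing callers produce the same request as before.
    if model_name is not None:
        spec["ModelName"] = model_name
    if container is not None:
        spec["Container"] = container
    if base_inference_component_name is not None:
        spec["BaseInferenceComponentName"] = base_inference_component_name
    if data_cache_config is not None:
        spec["DataCacheConfig"] = data_cache_config

    return {
        "InferenceComponentName": name,
        "EndpointName": endpoint_name,
        # Configurable, falling back to the old hardcoded default.
        "VariantName": variant_name or DEFAULT_VARIANT_NAME,
        "Specification": spec,
    }


def deploy_wrapper(name: str, endpoint_name: str, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical wrapper: forwards **kwargs untouched, so it needs no changes."""
    return build_inference_component_request(name, endpoint_name, **kwargs)


if __name__ == "__main__":
    request = deploy_wrapper(
        "ic-adapter",
        "my-endpoint",
        variant_name="Blue",
        base_inference_component_name="ic-base",
        data_cache_config={"EnableCaching": True},
    )
    print(request)
```

Because every new parameter defaults to `None` and is omitted from the request when unset, a caller that passes none of them gets the same request as before, including the `AllTraffic` variant name.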

Includes unit and integration tests.

2 participants