[Storage] Create Session Support for GET Requests#47264
[Storage] Create Session Support for GET Requests#47264weirongw23-msft wants to merge 22 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR introduces opt-in “session-based” authentication for eligible GET blob requests (notably downloads) by adding a pipeline policy that can acquire/cache a short-lived session credential per container and use it to sign subsequent GET requests. It also adds the generated CreateSession REST operation/models and a new recorded test covering session behavior.
Changes:
- Added
StorageSessionPolicy+ container-scoped session cache, and plumbeduse_session=Trueinto the sync pipeline creation path. - Added generated
create_sessionoperation + models/enums to support the service’s session creation API. - Added a recorded test and updated proxy sanitizers for session credentials.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| sdk/storage/azure-storage-blob/azure/storage/blob/_shared/policies.py | Adds sync session auth policy + cache and request signing logic. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_shared/base_client.py | Adds use_session kwarg handling and wires StorageSessionPolicy into the sync pipeline. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_shared/policies_async.py | Introduces an async session policy stub (currently unimplemented). |
| sdk/storage/azure-storage-blob/azure/storage/blob/_generated/operations/_container_operations.py | Adds generated create_session request builder + operation. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_generated/aio/operations/_container_operations.py | Adds generated async create_session operation. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_generated/models/_models_py3.py | Adds generated session models: CreateSessionConfiguration, CreateSessionResponse, SessionCredentials. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_generated/models/_azure_blob_storage_enums.py | Adds generated enum AuthenticationType. |
| sdk/storage/azure-storage-blob/azure/storage/blob/_generated/models/init.py | Re-exports new generated models/enums. |
| sdk/storage/azure-storage-blob/azure/apiview-properties.json | Updates API view surface list for new types/ops. |
| sdk/storage/azure-storage-blob/tests/conftest.py | Adds test-proxy sanitizers for session token/key. |
| sdk/storage/azure-storage-blob/tests/test_container.py | Adds recorded test validating bearer-vs-session auth behavior and per-container session usage. |
| sdk/storage/azure-storage-blob/CHANGELOG.md | Documents the new opt-in session authentication feature. |
|
|
||
| assert session1 != session2 | ||
|
|
||
| c1b1_actual = container1.download_blob(c1b1_name, raw_response_hook=make_capture("c1_download")).readall() |
jalauzon-msft
left a comment
There was a problem hiding this comment.
Some comments on the sync policy itself, did not look at much else yet
| if http_request.method != "GET": | ||
| return False | ||
| parsed = urlparse(http_request.url) | ||
| segments = [s for s in parsed.path.split("/") if s] |
There was a problem hiding this comment.
This is more of a meta point for URL parsing in the PR as a whole but I'm leaving here as an example. Azurite/emulators use URLs like 127.0.0.1/account/container which would break all the container parsing logic. Not sure if we care about that (probably eventually we will need to)
| return response # bearer was used; nothing session-related to react to | ||
|
|
||
| status = response.http_response.status_code | ||
| error_code = response.http_response.headers.get("x-ms-error-code", "") |
There was a problem hiding this comment.
nit: I think response might have the error code already parsed?
There was a problem hiding this comment.
It doesn't seem like there's a dedicated error_code property on response.http_response itself, but it's in the headers.
There was a problem hiding this comment.
What about response itself?
There was a problem hiding this comment.
Yeah I didn't see any attributes in response that has any error codes.
|
|
||
| _, session_pipeline = self._create_pipeline(credential, sdk_moniker=self._sdk_moniker, **sub_kwargs) | ||
| generated = AzureBlobStorage(container_url, self.api_version, base_url=container_url) | ||
| generated._client._pipeline = session_pipeline # pylint: disable=protected-access |
There was a problem hiding this comment.
The constructor of AzureBlobStorage can take a pipeline instead of doing it this way.
| StorageSessionPolicy( | ||
| account_name=self.account_name, | ||
| session_client_factory=_session_client_factory, | ||
| use_session=True, |
There was a problem hiding this comment.
This is always passed as True? That probably means we don't need it in the policy and if the policy is present, it does its thing?
| return response # bearer was used; nothing session-related to react to | ||
|
|
||
| status = response.http_response.status_code | ||
| error_code = response.http_response.headers.get("x-ms-error-code", "") |
There was a problem hiding this comment.
What about response itself?
| captured = {} | ||
|
|
||
| def make_capture(label): | ||
| def _hook(response): |
There was a problem hiding this comment.
Why hook inside a hook? You shouldn't need that. You may need to declare captured as nonlocal here though.
There was a problem hiding this comment.
It's a way for the hook to distinguish between requests we're making. For example, an upload should use bearer token and download might use sessions. The label helps distinguish that. We still make the checks through captured[label]. I don't think we have to make it nonlocal if we are mutating captured in place.
| for p in getattr(pipeline, "_impl_policies", []): | ||
| if type(p).__name__ == "StorageSessionPolicy": | ||
| return p | ||
| raise AssertionError("StorageSessionPolicy not found on the pipeline") |
There was a problem hiding this comment.
All these inline functions could be helpers, so they aren't duplicated? Even could be shared across sync and async. With the exception of maybe make_capture since it needs a local... but maybe there is something for that too.
| assert auth.startswith("Session ") | ||
| return auth[len("Session ") :].split(":", 1)[0] | ||
|
|
||
| def find_session_policy(pipeline): |
There was a problem hiding this comment.
This is pretty jank, but I'll allow it for now :D
No description provided.