-
Notifications
You must be signed in to change notification settings - Fork 61
Replace custom cache with CacheControl library #671
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
90339f3
ecbd49b
4e3a59e
7477e23
f9e40b9
7b9a1a6
a2655eb
9e0463b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,6 @@ | ||
| from __future__ import annotations | ||
|
|
||
| import logging | ||
| import os | ||
| import textwrap | ||
| import typing as t | ||
|
|
@@ -43,6 +44,18 @@ def set_color_mode(ctx: click.Context, param: str, value: str) -> None: | |
| }[value] | ||
|
|
||
|
|
||
| def configure_logging( | ||
| ctx: click.Context, param: click.Parameter, value: str | None | ||
| ) -> None: | ||
| if value is None: | ||
| return | ||
| level = getattr(logging, value.upper()) | ||
| logging.basicConfig( | ||
| level=level, | ||
| format="%(name)s [%(levelname)s]: %(message)s", | ||
| ) | ||
|
|
||
|
|
||
| def pretty_helptext_list(values: list[str] | tuple[str, ...]) -> str: | ||
| return textwrap.indent( | ||
| "\n".join( | ||
|
|
@@ -88,12 +101,21 @@ def pretty_helptext_list(values: list[str] | tuple[str, ...]) -> str: | |
| ) | ||
| @click.help_option("-h", "--help") | ||
| @click.version_option() | ||
| @click.option( | ||
| "--log-level", | ||
| hidden=True, | ||
| help="Set the log level for debug output (e.g., DEBUG, INFO, WARNING).", | ||
| type=click.Choice(["DEBUG", "INFO", "WARNING", "ERROR"], case_sensitive=False), | ||
| callback=configure_logging, | ||
| expose_value=False, | ||
| is_eager=True, | ||
| ) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't mind having an option for setting Python log-level, but I think we should make it I've already exposed possibly a couple too many knobs for logging and output control -- we have
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done — added |
||
| @click.option( | ||
| "--schemafile", | ||
| help=( | ||
| "The path to a file containing the JSON Schema to use or an " | ||
| "HTTP(S) URI for the schema. If a remote file is used, " | ||
| "it will be downloaded and cached locally based on mtime. " | ||
| "it will be downloaded and cached locally. " | ||
| "Use '-' for stdin." | ||
| ), | ||
| metavar="[PATH|URI]", | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes me nervous, and relates to my note above about bad data getting into the cache. Now the question I have is: how do we get bad data out of the cache?
There's only one bit of code which calls the CacheDownloader in practice.
In that code, the validation callback is "parse this data with our available parsers".
You can see it here, in the HttpSchemaReader.
Parsing is actually a dynamic problem! JSON5 parsers are used if available, but not if they aren't. And YAML parsing with
ruamel.yamlis generally stable, but if we switched toryamlor another implementation, that could change what's considered valid.Also
tomllibcan be used, andtomllibon Python 3.15+ will support TOML 1.1 while earlier versions support TOML 1.0 . 😵In general, I think we need to be prepared for the possibility that data which passed validation yesterday might not be valid today.
I'm not quite sure what to do on this front. We can remove this
from_cachecheck, but I think the underlying problem will remain, because we can't clear the cache imperatively in response to a failure.I think we may need the ability to do that. It would mean reaching "down the stack", to
session.adapters["https://"].cache.delete(). It's not documented publicly, but it seems too important to pass up.I'm also going to ask on the PyPA discord (I see some of the (past?) maintainers of
cachecontrolare people I know from over there) to see if anyone can confirm that this is okay to do.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I got two pieces of information from a quick exchange with one of the maintainers.
cachecontrolis maintained but not getting new features -- they're not recommending it necessarily for new workBaseCache.delete()is considered part of the public API, so if we hold a reference to the cache object, we can safely calldelete()on it as needed!(2) means that this has a pretty simple resolution. Use
cache.delete()whenever there's bad data in the cache. Yay! 😄(1) is a suggestion, but I think
cachecontrolis a pretty good fit here. And it's stable and battle-tested. If I come to regret the decision, I'll either try to get involved incachecontrolmaintenance (it's the kind of web programming I enjoy), or else swap out for something likerequests-cache.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'll continue to work this to handle your comments at least. but, i can also make a similar pr for requests-cache if you would like to be able to compare them. with this pr as reference i'm guessing it wouldn't be take much time. if nothing else, useful reference later in case a switch might happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented
cache.delete()as you suggested. On retry attempts, we now explicitly delete the cache entry usingCacheController.cache_url()to get the normalized key, then callcache.delete(). This replaces theno-cacheheader workaround.The
FileCacheinstance is stored as a@functools.cached_propertyonCacheDownloader(following the project's lazy initialization pattern) and passed to_get_request().a2655eb