Unify LRU memory-limiting caches into one generic cache#22613
    }

    #[test]
    fn test_entry_creation() {
Test is removed because ListFilesEntry is removed.
    fn put(&mut self, key: &K, value: V, now: Instant) -> Option<V> {
        let value_size = value.size();

        if value_size == 0 {
The list-files cache rejected values with size 0, and it needs that behavior to operate correctly.
Thank you for opening this pull request! Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch).
    DEFAULT_LIST_FILES_CACHE_MEMORY_LIMIT, DEFAULT_LIST_FILES_CACHE_TTL,
    };
    /// Calculates the number of bytes an [`ObjectMeta`] occupies in the heap.
    pub fn meta_heap_bytes(object_meta: &ObjectMeta) -> usize {
This is a leftover from the list-files cache, which did not use the DFHeapsize trait for heap-size accounting because the trait did not exist at the time. DFHeapsize cannot be implemented for ObjectMeta here because that would be an orphan implementation. Removing this function and using DFHeapsize instead would require one of the following changes:
- Add a dependency on object_store to the package commons, where DFHeapsize is located, and implement it for ObjectMeta. The downside is that this would spread the object_store dependency further around.
- Move the DFHeapsize trait into the execution package, where the cache is located, since it is not used anywhere else yet.
- Create a new crate/package for the heap-size estimation, add the object_store dependency there, and use it from there.
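A minimal sketch of the second option may make the orphan constraint concrete: Rust allows a trait implementation for a foreign type only when the trait itself is local to the implementing crate. Everything below is an assumption made for illustration: `HeapSize` is a hypothetical stand-in for DFHeapsize, and the `ObjectMeta` struct is a simplified placeholder for the real `object_store` type, not its actual definition.

```rust
/// Hypothetical local counterpart of the heap-size trait (option 2:
/// the trait lives in the same crate as the cache).
pub trait HeapSize {
    /// Bytes this value owns on the heap, excluding the value itself.
    fn heap_size(&self) -> usize;
}

/// Simplified placeholder for `object_store::ObjectMeta`; the real struct
/// uses a dedicated path type and also carries a timestamp.
pub struct ObjectMeta {
    pub location: String,
    pub size: u64,
    pub e_tag: Option<String>,
    pub version: Option<String>,
}

impl HeapSize for String {
    fn heap_size(&self) -> usize {
        // A String owns `capacity()` bytes on the heap.
        self.capacity()
    }
}

impl<T: HeapSize> HeapSize for Option<T> {
    fn heap_size(&self) -> usize {
        self.as_ref().map_or(0, |inner| inner.heap_size())
    }
}

// In a real crate layout this impl is accepted by the orphan rule only
// because `HeapSize` is defined in the same crate as the impl.
impl HeapSize for ObjectMeta {
    fn heap_size(&self) -> usize {
        // `size` is a plain integer and owns no heap memory.
        self.location.heap_size() + self.e_tag.heap_size() + self.version.heap_size()
    }
}
```

With a trait like this in place, a free function such as meta_heap_bytes would no longer be needed, since the cache could call the trait method on the value directly.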
🤖 Benchmarks (GKE) ran and completed, comparing generic-cache-1 (5920922) to 32a1fe5 (merge-base) using clickbench_partitioned, tpcds and tpch. The clickbench_partitioned benchmark was then run a second time on request ("run benchmark clickbench_partitioned") and completed as well.
@nuno-faria Do you have time to review this? Thanks.
Which issue does this PR close?
Rationale for this change
This PR introduces a new cache that merges the functionality of the file-metadata cache, the list-files cache and the file-statistics cache into one generic implementation. This removes a lot of redundant code.
What changes are included in this PR?
- Introduce a generic DefaultCache with LRU eviction, a memory limit and a TTL.
- Migrate all cache tests to use the new DefaultCache.
- Replace usages of the file-metadata cache, list-files cache and file-statistics cache implementations with the new generic version.
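To illustrate how LRU eviction, a memory limit and a TTL can combine in one generic type, here is a rough sketch. It is not the PR's DefaultCache: the names `SketchCache` and `SizeOf` are hypothetical, eviction uses a simple linear scan rather than a proper LRU list, and sizes are simplified to string lengths. It does mirror two behaviors mentioned in this conversation: zero-size values are rejected, and the key counts toward memory usage.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::{Duration, Instant};

/// Hypothetical size trait standing in for the PR's size accounting.
pub trait SizeOf {
    fn size(&self) -> usize;
}

impl SizeOf for String {
    fn size(&self) -> usize {
        self.len()
    }
}

struct Entry<V> {
    value: V,
    size: usize,          // key size + value size
    inserted_at: Instant, // used for the TTL check
    last_used: u64,       // logical clock for LRU ordering
}

pub struct SketchCache<K, V> {
    entries: HashMap<K, Entry<V>>,
    memory_limit: usize,
    memory_used: usize,
    ttl: Option<Duration>,
    tick: u64,
}

impl<K: Hash + Eq + Clone + SizeOf, V: Clone + SizeOf> SketchCache<K, V> {
    pub fn new(memory_limit: usize, ttl: Option<Duration>) -> Self {
        Self { entries: HashMap::new(), memory_limit, memory_used: 0, ttl, tick: 0 }
    }

    pub fn memory_used(&self) -> usize {
        self.memory_used
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    pub fn remove(&mut self, key: &K) -> Option<V> {
        let entry = self.entries.remove(key)?;
        self.memory_used -= entry.size;
        Some(entry.value)
    }

    /// Returns a clone of the value and marks the entry as most recently used.
    /// Entries older than the TTL are dropped and reported as missing.
    pub fn get(&mut self, key: &K, now: Instant) -> Option<V> {
        let expired = match (self.entries.get(key), self.ttl) {
            (None, _) => return None,
            (Some(_), None) => false,
            (Some(entry), Some(ttl)) => now.duration_since(entry.inserted_at) >= ttl,
        };
        if expired {
            self.remove(key);
            return None;
        }
        self.tick += 1;
        let tick = self.tick;
        let entry = self.entries.get_mut(key)?;
        entry.last_used = tick;
        Some(entry.value.clone())
    }

    /// Inserts a value and returns the previous value for the key, if any.
    /// Zero-size values and entries larger than the limit are rejected.
    pub fn put(&mut self, key: &K, value: V, now: Instant) -> Option<V> {
        let value_size = value.size();
        let size = key.size() + value_size;
        if value_size == 0 || size > self.memory_limit {
            return None;
        }
        let old = self.remove(key);
        // Evict least recently used entries until the new one fits (O(n) scan).
        while self.memory_used + size > self.memory_limit {
            let lru = self.entries.iter().min_by_key(|(_, e)| e.last_used).map(|(k, _)| k.clone());
            match lru {
                Some(k) => self.remove(&k),
                None => break,
            };
        }
        self.tick += 1;
        let entry = Entry { value, size, inserted_at: now, last_used: self.tick };
        self.entries.insert(key.clone(), entry);
        self.memory_used += size;
        old
    }
}
```

With a limit of 10 bytes, one-byte keys and four-byte values, two entries fill the cache; inserting a third evicts whichever of the first two was touched least recently. A production implementation would likely track recency in a linked structure instead of scanning all entries on eviction.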
Are these changes tested?
Yes. All previous cache tests have been migrated to the new cache and pass. They had to be slightly adapted because the new implementation also counts the cache key in its memory accounting, which not all previous implementations did. The tests remain in their original locations so that reviewers get a readable diff.
Once the tests are reviewed and approved, they should probably move to default_cache.rs, and the files file_statistics_cache.rs, file_metadata_cache.rs and list_files_cache.rs can be removed.
Are there any user-facing changes?
The traits FileStatisticsCache, ListFilesCache and FileMetadataCache are replaced with the types Cache<TableScopedPath, CachedFileMetadata>, Cache<TableScopedPath, CachedFileList> and Cache<Path, CachedFileMetadataEntry>.
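For downstream code, the practical effect is that a function that used to accept a cache trait object would instead name a concrete instantiation of the generic type. The sketch below only illustrates that shape: the `Cache` struct here is a bare stand-in without eviction or TTL, the fields of `TableScopedPath` and `CachedFileList` are invented placeholders, and the alias `ListFilesCacheSketch` and the helper `lookup_files` are hypothetical names that do not come from the PR.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Bare stand-in for the generic cache; eviction and TTL are omitted.
pub struct Cache<K, V> {
    entries: HashMap<K, V>,
}

impl<K: Hash + Eq + Clone, V: Clone> Cache<K, V> {
    pub fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    pub fn put(&mut self, key: &K, value: V) -> Option<V> {
        self.entries.insert(key.clone(), value)
    }

    pub fn get(&self, key: &K) -> Option<V> {
        self.entries.get(key).cloned()
    }
}

/// Placeholder key type; the fields are assumptions for this sketch.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct TableScopedPath {
    pub table: Option<String>,
    pub path: String,
}

/// Placeholder value type; the field is an assumption for this sketch.
#[derive(Clone, Debug, PartialEq)]
pub struct CachedFileList {
    pub files: Vec<String>,
}

/// Hypothetical alias: one instantiation of the generic type takes the
/// place of a dedicated list-files cache trait.
pub type ListFilesCacheSketch = Cache<TableScopedPath, CachedFileList>;

/// Callers name the concrete type instead of a trait object.
pub fn lookup_files(cache: &ListFilesCacheSketch, key: &TableScopedPath) -> Vec<String> {
    cache.get(key).map(|list| list.files).unwrap_or_default()
}
```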