-
Notifications
You must be signed in to change notification settings - Fork 141
hashmap: ensure hashmaps are reusable after hashmap_clear() #1911
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
In the series merged at bf0a430 (Merge branch 'en/strmap', 2020-11-21), strmap was built on top of hashmap and hashmap was extended in a few ways to support strmap and be more generally useful. One of the extensions was that hashmap_partial_clear() was introduced to allow reuse of the hashmap without freeing the table. Peff believed that it also made sense to introduce a hashmap_clear() which freed everything while allowing reuse. I added hashmap_clear(), but in doing so, overlooked the fact that for a hashmap to be reusable, it needs a defined cmpfn and data (the HASHMAP_INIT macro requires these fields as parameters, for example). So, if we want the hashmap to be reusable, we shouldn't zero out those fields. We probably also shouldn't zero out do_count_items. (We could zero out grow_at and shrink_at, but whether we zero those or not is irrelevant as they'll be automatically updated whenever a new entry is inserted.) Since clearing is associated with freeing map->table, and the only thing required for consistency after freeing map->table is zeroing tablesize and private_size, let's only zero those fields out. Signed-off-by: Elijah Newren <newren@gmail.com>
/submit |
Submitted as pull.1911.git.1745941663160.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Junio C Hamano wrote (reply to this): "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Elijah Newren <newren@gmail.com>
>
> In the series merged at bf0a430f70b5 (Merge branch 'en/strmap',
> 2020-11-21), strmap was built on top of hashmap and hashmap was extended
> in a few ways to support strmap and be more generally useful. One of
> the extensions was that hashmap_partial_clear() was introduced to allow
> reuse of the hashmap without freeing the table. Peff believed that it
> also made sense to introduce a hashmap_clear() which freed everything
> while allowing reuse.
>
> I added hashmap_clear(), but in doing so, overlooked the fact that for
> a hashmap to be reusable, it needs a defined cmpfn and data (the
> HASHMAP_INIT macro requires these fields as parameters, for example).
> So, if we want the hashmap to be reusable, we shouldn't zero out those
> fields. We probably also shouldn't zero out do_count_items. (We could
> zero out grow_at and shrink_at, but whether we zero those or not is
> irrelevant as they'll be automatically updated whenever a new entry is
> inserted.)
>
> Since clearing is associated with freeing map->table, and the only thing
> required for consistency after freeing map->table is zeroing tablesize
> and private_size, let's only zero those fields out.
Makes sense. Thanks for finding and fixing.
I do not think we want to patch all the way down to Git 2.30, ...
> diff --git a/hashmap.c b/hashmap.c
> index ee45ef00852..a711377853f 100644
> --- a/hashmap.c
> +++ b/hashmap.c
> @@ -205,8 +205,9 @@ void hashmap_clear_(struct hashmap *map, ssize_t entry_offset)
> return;
> if (entry_offset >= 0) /* called by hashmap_clear_and_free */
> free_individual_entries(map, entry_offset);
> - free(map->table);
> - memset(map, 0, sizeof(*map));
> + FREE_AND_NULL(map->table);
> + map->tablesize = 0;
> + map->private_size = 0;
> }
>
> struct hashmap_entry *hashmap_get(const struct hashmap *map,
>
> base-commit: f65182a99e545d2f2bc22e6c1c2da192133b16a3
... but this part of the code has been fairly quiet and the patch
applies very cleanly. So I'll apply on top of bf0a430f7 and merge
the result---anybody maintaining Git for their LTS distro can then
merge it to their favorite ancient maintenance track ;-)
Thanks.
|
This patch series was integrated into seen via git@07e5df4. |
This branch is now known as |
This patch series was integrated into seen via git@e2479b6. |
This patch series was integrated into next via git@b0cdbeb. |
This patch series was integrated into seen via git@2dc7231. |
This patch series was integrated into seen via git@328279b. |
There was a status update in the "New Topics" section about the branch hashmap API clean-up to ensure hashmap_clear() leaves a cleared map in a reusable state. Will merge to 'master'. source: <pull.1911.git.1745941663160.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@199eefe. |
There was a status update in the "Cooking" section about the branch hashmap API clean-up to ensure hashmap_clear() leaves a cleared map in a reusable state. Will merge to 'master'. source: <pull.1911.git.1745941663160.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@9bff440. |
There was a status update in the "Cooking" section about the branch hashmap API clean-up to ensure hashmap_clear() leaves a cleared map in a reusable state. Will merge to 'master'. source: <pull.1911.git.1745941663160.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@68501b5. |
This patch series was integrated into seen via git@aa079f6. |
This patch series was integrated into seen via git@4a4656d. |
This patch series was integrated into master via git@4a4656d. |
This patch series was integrated into next via git@4a4656d. |
Closed via 4a4656d. |
Ran into a NULL pointer dereference of cmpfn a few months ago when trying to reuse one of {strmap, strset, strintmap} (don't remember which one) after calling the relevant ${TYPE}_clear() variant, and tracked the NULL pointer back to hashmap_clear(). Turned out to not be relevant to those patches because I ended up not needing to reuse the map after all, but I kept a note to myself to send in a fix.
I was surprised this wasn't a bug we were already hitting somewhere, but I looked through the codebase and it appears that the only time we attempt to reuse a hashmap after clearing is when we specifically use hashmap_partial_clear(). So, this is just a latent bug waiting as a trap for someone.