Skip to content

hashmap: ensure hashmaps are reusable after hashmap_clear() #1911

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

newren
Copy link

@newren newren commented Apr 28, 2025

Ran into a NULL pointer dereference of cmpfn a few months ago when trying to reuse one of {strmap, strset, strintmap} (don't remember which one) after calling the relevant ${TYPE}_clear() variant, and tracked the NULL pointer back to hashmap_clear(). Turned out to not be relevant to those patches because I ended up not needing to reuse the map after all, but I kept a note to myself to send in a fix.

I was surprised this wasn't a bug we were already hitting somewhere, but I looked through the codebase and it appears that the only time we attempt to reuse a hashmap after clearing is when we specifically use hashmap_partial_clear(). So, this is just a latent bug waiting as a trap for someone.

In the series merged at bf0a430 (Merge branch 'en/strmap',
2020-11-21), strmap was built on top of hashmap and hashmap was extended
in a few ways to support strmap and be more generally useful.  One of
the extensions was that hashmap_partial_clear() was introduced to allow
reuse of the hashmap without freeing the table.  Peff believed that it
also made sense to introduce a hashmap_clear() which freed everything
while allowing reuse.

I added hashmap_clear(), but in doing so, overlooked the fact that for
a hashmap to be reusable, it needs a defined cmpfn and data (the
HASHMAP_INIT macro requires these fields as parameters, for example).
So, if we want the hashmap to be reusable, we shouldn't zero out those
fields.  We probably also shouldn't zero out do_count_items.  (We could
zero out grow_at and shrink_at, but whether we zero those or not is
irrelevant as they'll be automatically updated whenever a new entry is
inserted.)

Since clearing is associated with freeing map->table, and the only thing
required for consistency after freeing map->table is zeroing tablesize
and private_size, let's only zero those fields out.

Signed-off-by: Elijah Newren <newren@gmail.com>
@newren
Copy link
Author

newren commented Apr 29, 2025

/submit

Copy link

gitgitgadget bot commented Apr 29, 2025

Submitted as pull.1911.git.1745941663160.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1911/newren/fix-hashmap-clear-v1

To fetch this version to local tag pr-1911/newren/fix-hashmap-clear-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1911/newren/fix-hashmap-clear-v1

Copy link

gitgitgadget bot commented Apr 29, 2025

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Elijah Newren <newren@gmail.com>
>
> In the series merged at bf0a430f70b5 (Merge branch 'en/strmap',
> 2020-11-21), strmap was built on top of hashmap and hashmap was extended
> in a few ways to support strmap and be more generally useful.  One of
> the extensions was that hashmap_partial_clear() was introduced to allow
> reuse of the hashmap without freeing the table.  Peff believed that it
> also made sense to introduce a hashmap_clear() which freed everything
> while allowing reuse.
>
> I added hashmap_clear(), but in doing so, overlooked the fact that for
> a hashmap to be reusable, it needs a defined cmpfn and data (the
> HASHMAP_INIT macro requires these fields as parameters, for example).
> So, if we want the hashmap to be reusable, we shouldn't zero out those
> fields.  We probably also shouldn't zero out do_count_items.  (We could
> zero out grow_at and shrink_at, but whether we zero those or not is
> irrelevant as they'll be automatically updated whenever a new entry is
> inserted.)
>
> Since clearing is associated with freeing map->table, and the only thing
> required for consistency after freeing map->table is zeroing tablesize
> and private_size, let's only zero those fields out.

Makes sense.  Thanks for finding and fixing.

I do not think we want to patch all the way down to Git 2.30, ...

> diff --git a/hashmap.c b/hashmap.c
> index ee45ef00852..a711377853f 100644
> --- a/hashmap.c
> +++ b/hashmap.c
> @@ -205,8 +205,9 @@ void hashmap_clear_(struct hashmap *map, ssize_t entry_offset)
>  		return;
>  	if (entry_offset >= 0)  /* called by hashmap_clear_and_free */
>  		free_individual_entries(map, entry_offset);
> -	free(map->table);
> -	memset(map, 0, sizeof(*map));
> +	FREE_AND_NULL(map->table);
> +	map->tablesize = 0;
> +	map->private_size = 0;
>  }
>  
>  struct hashmap_entry *hashmap_get(const struct hashmap *map,
>
> base-commit: f65182a99e545d2f2bc22e6c1c2da192133b16a3

... but this part of the code has been fairly quiet and the patch
applies very cleanly.  So I'll apply on top of bf0a430f7 and merge
the result---anybody maintaining Git for their LTS distro can then
merge it to their favorite ancient maintenance track ;-)

Thanks.


Copy link

gitgitgadget bot commented Apr 30, 2025

This patch series was integrated into seen via git@07e5df4.

@gitgitgadget gitgitgadget bot added the seen label Apr 30, 2025
Copy link

gitgitgadget bot commented Apr 30, 2025

This branch is now known as en/hashmap-clear-fix.

Copy link

gitgitgadget bot commented Apr 30, 2025

This patch series was integrated into seen via git@e2479b6.

Copy link

gitgitgadget bot commented Apr 30, 2025

This patch series was integrated into next via git@b0cdbeb.

@gitgitgadget gitgitgadget bot added the next label Apr 30, 2025
Copy link

gitgitgadget bot commented May 1, 2025

This patch series was integrated into seen via git@2dc7231.

Copy link

gitgitgadget bot commented May 3, 2025

This patch series was integrated into seen via git@328279b.

Copy link

gitgitgadget bot commented May 3, 2025

There was a status update in the "New Topics" section about the branch en/hashmap-clear-fix on the Git mailing list:

hashmap API clean-up to ensure hashmap_clear() leaves a cleared map
in a reusable state.

Will merge to 'master'.
source: <pull.1911.git.1745941663160.gitgitgadget@gmail.com>

Copy link

gitgitgadget bot commented May 5, 2025

This patch series was integrated into seen via git@199eefe.

Copy link

gitgitgadget bot commented May 6, 2025

There was a status update in the "Cooking" section about the branch en/hashmap-clear-fix on the Git mailing list:

hashmap API clean-up to ensure hashmap_clear() leaves a cleared map
in a reusable state.

Will merge to 'master'.
source: <pull.1911.git.1745941663160.gitgitgadget@gmail.com>

Copy link

gitgitgadget bot commented May 6, 2025

This patch series was integrated into seen via git@9bff440.

Copy link

gitgitgadget bot commented May 8, 2025

There was a status update in the "Cooking" section about the branch en/hashmap-clear-fix on the Git mailing list:

hashmap API clean-up to ensure hashmap_clear() leaves a cleared map
in a reusable state.

Will merge to 'master'.
source: <pull.1911.git.1745941663160.gitgitgadget@gmail.com>

Copy link

gitgitgadget bot commented May 8, 2025

This patch series was integrated into seen via git@68501b5.

Copy link

gitgitgadget bot commented May 8, 2025

This patch series was integrated into seen via git@aa079f6.

Copy link

gitgitgadget bot commented May 8, 2025

This patch series was integrated into seen via git@4a4656d.

Copy link

gitgitgadget bot commented May 8, 2025

This patch series was integrated into master via git@4a4656d.

Copy link

gitgitgadget bot commented May 8, 2025

This patch series was integrated into next via git@4a4656d.

@gitgitgadget gitgitgadget bot added the master label May 8, 2025
Copy link

gitgitgadget bot commented May 8, 2025

Closed via 4a4656d.

@gitgitgadget gitgitgadget bot closed this May 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant