From 31d19899c7a0e08e8cf843ea1abe7504eb3a1d34 Mon Sep 17 00:00:00 2001 From: Kristoffer Haugsbakk Date: Mon, 27 Apr 2026 21:06:49 +0200 Subject: [PATCH 01/29] doc: log: fix --decorate description list 026f2e3b (doc: convert git-log to new documentation format, 2025-07-07) transformed the inline description of `--decorate` options to a description list: We also transform inline descriptions of possible values of option --decorate into a list, which is more readable and extensible. But a source code block was used instead of an open block. Signed-off-by: Kristoffer Haugsbakk Signed-off-by: Junio C Hamano --- Documentation/git-log.adoc | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/git-log.adoc b/Documentation/git-log.adoc index e304739c5e8011..1c95499060d149 100644 --- a/Documentation/git-log.adoc +++ b/Documentation/git-log.adoc @@ -36,14 +36,14 @@ OPTIONS Print out the ref names of any commits that are shown. Possible values are: + ----- +-- `short`;; the ref name prefixes `refs/heads/`, `refs/tags/` and `refs/remotes/` are not printed. `full`;; the full ref name (including prefix) is printed. `auto`:: if the output is going to a terminal, the ref names are shown as if `short` were given, otherwise no ref names are shown. ----- +-- + The option `--decorate` is short-hand for `--decorate=short`. Default to configuration value of `log.decorate` if configured, otherwise, `auto`. From b635fd0725dd74ae59a0467a3180624a8e9abdb0 Mon Sep 17 00:00:00 2001 From: Kristoffer Haugsbakk Date: Mon, 27 Apr 2026 21:06:50 +0200 Subject: [PATCH 02/29] doc: log: use the same delimiter in description list We must use the same delimiter since this is meant to be a flat list. Introducing a new legal delimiter like `::` makes an inner description list: ... full the full ref name ... auto if the output ...
Signed-off-by: Kristoffer Haugsbakk Signed-off-by: Junio C Hamano --- Documentation/git-log.adoc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/git-log.adoc b/Documentation/git-log.adoc index 1c95499060d149..fb3ac112839cf7 100644 --- a/Documentation/git-log.adoc +++ b/Documentation/git-log.adoc @@ -40,7 +40,7 @@ OPTIONS `short`;; the ref name prefixes `refs/heads/`, `refs/tags/` and `refs/remotes/` are not printed. `full`;; the full ref name (including prefix) is printed. -`auto`:: if the output is going to a terminal, the ref names +`auto`;; if the output is going to a terminal, the ref names are shown as if `short` were given, otherwise no ref names are shown. -- From 4a9e0972280d821990a672a11465f90cef60dfae Mon Sep 17 00:00:00 2001 From: Zakariyah Ali Date: Wed, 29 Apr 2026 11:36:06 +0100 Subject: [PATCH 03/29] t2000: consolidate second scenario into a single test block The second test scenario in t2000 consists of several fragmented test_expect_success blocks that handle data setup, tree writes, execution of git-checkout-index, and final state validation. Consolidate these nine separate blocks into a single self-contained test block. This follows the modern Git testing standard where setup, execution, and validation of a single logical scenario are kept together. As a result of this consolidation, the show_files() helper and its associated test_debug calls are no longer used and have been removed. This also removes a dependency on the non-portable 'find -ls' command. 
Helped-by: Karthik Nayak Helped-by: Junio C Hamano Signed-off-by: Zakariyah Ali Signed-off-by: Junio C Hamano --- t/t2000-conflict-when-checking-files-out.sh | 65 +++------------------ 1 file changed, 8 insertions(+), 57 deletions(-) diff --git a/t/t2000-conflict-when-checking-files-out.sh b/t/t2000-conflict-when-checking-files-out.sh index af199d81913f1e..7b613705498396 100755 --- a/t/t2000-conflict-when-checking-files-out.sh +++ b/t/t2000-conflict-when-checking-files-out.sh @@ -23,17 +23,6 @@ test_description='git conflicts when checking files out test.' . ./test-lib.sh -show_files() { - # show filesystem files, just [-dl] for type and name - find path? -ls | - sed -e 's/^[0-9]* * [0-9]* * \([-bcdl]\)[^ ]* *[0-9]* *[^ ]* *[^ ]* *[0-9]* [A-Z][a-z][a-z] [0-9][0-9] [^ ]* /fs: \1 /' - # what's in the cache, just mode and name - git ls-files --stage | - sed -e 's/^\([0-9]*\) [0-9a-f]* [0-3] /ca: \1 /' - # what's in the tree, just mode and name. - git ls-tree -r "$1" | - sed -e 's/^\([0-9]*\) [^ ]* [0-9a-f]* /tr: \1 /' -} test_expect_success 'prepare files path0 and path1/file1' ' date >path0 && @@ -83,59 +72,21 @@ test_expect_success SYMLINKS 'checkout-index -f twice with --prefix' ' # path path3 is occupied by a non-directory. With "-f" it should remove # the symlink path3 and create directory path3 and file path3/file1. 
-test_expect_success 'prepare path2/file0 and index' ' +test_expect_success 'checkout-index -f resolves symlink conflict on leading path' ' mkdir path2 && date >path2/file0 && - git update-index --add path2/file0 -' - -test_expect_success 'write tree with path2/file0' ' - tree1=$(git write-tree) -' - -test_debug 'show_files $tree1' - -test_expect_success 'prepare path3/file1 and index' ' + git update-index --add path2/file0 && + tree1=$(git write-tree) && mkdir path3 && date >path3/file1 && - git update-index --add path3/file1 -' - -test_expect_success 'write tree with path3/file1' ' - tree2=$(git write-tree) -' - -test_debug 'show_files $tree2' - -test_expect_success 'read previously written tree and checkout.' ' + git update-index --add path3/file1 && + tree2=$(git write-tree) && rm -fr path3 && git read-tree -m $tree1 && - git checkout-index -f -a -' - -test_debug 'show_files $tree1' - -test_expect_success 'add a symlink' ' - test_ln_s_add path2 path3 -' - -test_expect_success 'write tree with symlink path3' ' - tree3=$(git write-tree) -' - -test_debug 'show_files $tree3' - -# Morten says "Got that?" here. -# Test begins. - -test_expect_success 'read previously written tree and checkout.' ' + git checkout-index -f -a && + test_ln_s_add path2 path3 && git read-tree $tree2 && - git checkout-index -f -a -' - -test_debug 'show_files $tree2' - -test_expect_success 'checking out conflicting path with -f' ' + git checkout-index -f -a && test_path_is_dir_not_symlink path2 && test_path_is_dir_not_symlink path3 && test_path_is_file_not_symlink path2/file0 && From a81411253323208e1e8d3591247c27fefa8a2045 Mon Sep 17 00:00:00 2001 From: Phillip Wood Date: Mon, 4 May 2026 15:06:18 +0100 Subject: [PATCH 04/29] xdiff: reduce size of action arrays When the myers algorithm is selected the input files are pre-processed to remove any common prefix and suffix. 
Then any lines that appear only in one side of the diff are marked as changed and frequently occurring lines are marked as changed if they are adjacent to a changed line. This step requires a couple of temporary arrays. As the common prefix and suffix have already been removed, the arrays only need to be big enough to hold the lines between them, not the whole file. Reduce the size of the arrays and adjust the loops that use them accordingly while taking care to keep indexing the arrays in xdfile_t with absolute line numbers. Signed-off-by: Phillip Wood Signed-off-by: Junio C Hamano --- xdiff/xprepare.c | 31 +++++++++++++++++-------------- 1 file changed, 17 insertions(+), 14 deletions(-) diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c index beef711067b612..3b6bae0d1581b7 100644 --- a/xdiff/xprepare.c +++ b/xdiff/xprepare.c @@ -273,16 +273,19 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd uint8_t *action1 = NULL, *action2 = NULL; bool need_min = !!(cf->flags & XDF_NEED_MINIMAL); int ret = 0; + ptrdiff_t off = xdf1->dstart; + ptrdiff_t len1 = xdf1->dend - off + 1; + ptrdiff_t len2 = xdf2->dend - off + 1; /* * Create temporary arrays that will help us decide if * changed[i] should remain false, or become true. */ - if (!XDL_CALLOC_ARRAY(action1, xdf1->nrec + 1)) { + if (!XDL_CALLOC_ARRAY(action1, len1)) { ret = -1; goto cleanup; } - if (!XDL_CALLOC_ARRAY(action2, xdf2->nrec + 1)) { + if (!XDL_CALLOC_ARRAY(action2, len2)) { ret = -1; goto cleanup; } @@ -298,8 +301,8 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd if (mlim1 > XDL_MAX_EQLIMIT) mlim1 = XDL_MAX_EQLIMIT; } - for (i = xdf1->dstart; i <= xdf1->dend; i++) { - size_t mph1 = xdf1->recs[i].minimal_perfect_hash; + for (i = 0; i < len1; i++) { + size_t mph1 = xdf1->recs[i + off].minimal_perfect_hash; rcrec = cf->rcrecs[mph1]; nm = rcrec ?
rcrec->len2 : 0; if (nm == 0) @@ -318,8 +321,8 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd if (mlim2 > XDL_MAX_EQLIMIT) mlim2 = XDL_MAX_EQLIMIT; } - for (i = xdf2->dstart; i <= xdf2->dend; i++) { - size_t mph2 = xdf2->recs[i].minimal_perfect_hash; + for (i = 0; i < len2; i++) { + size_t mph2 = xdf2->recs[i + off].minimal_perfect_hash; rcrec = cf->rcrecs[mph2]; nm = rcrec ? rcrec->len1 : 0; if (nm == 0) @@ -335,42 +338,42 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd * false, or become true. */ xdf1->nreff = 0; - for (i = xdf1->dstart; i <= xdf1->dend; i++) { + for (i = 0; i < len1; i++) { uint8_t action = action1[i]; if (action == INVESTIGATE) { - if (!xdl_clean_mmatch(action1, i, xdf1->dstart, xdf1->dend)) + if (!xdl_clean_mmatch(action1, i, 0, len1 - 1)) action = KEEP; else action = DISCARD; } if (action == KEEP) { - xdf1->reference_index[xdf1->nreff++] = i; + xdf1->reference_index[xdf1->nreff++] = i + off; /* changed[i] remains false */ } else if (action == DISCARD) { - xdf1->changed[i] = true; + xdf1->changed[i + off] = true; } else { BUG("Illegal state for action"); } } xdf2->nreff = 0; - for (i = xdf2->dstart; i <= xdf2->dend; i++) { + for (i = 0; i < len2; i++) { uint8_t action = action2[i]; if (action == INVESTIGATE) { - if (!xdl_clean_mmatch(action2, i, xdf2->dstart, xdf2->dend)) + if (!xdl_clean_mmatch(action2, i, 0, len2 - 1)) action = KEEP; else action = DISCARD; } if (action == KEEP) { - xdf2->reference_index[xdf2->nreff++] = i; + xdf2->reference_index[xdf2->nreff++] = i + off; /* changed[i] remains false */ } else if (action == DISCARD) { - xdf2->changed[i] = true; + xdf2->changed[i + off] = true; } else { BUG("Illegal state for action"); } From 53d13887b8581d46dffc1f4ee2622c977b65ecb5 Mon Sep 17 00:00:00 2001 From: Phillip Wood Date: Mon, 4 May 2026 15:06:19 +0100 Subject: [PATCH 05/29] xdiff: cleanup xdl_clean_mmatch() Remove the "s" parameter as, since the last commit, 
this function is always called with s == 0. Also change parameter "e" to expect a length, rather than the index of the last line to simplify the caller. Signed-off-by: Phillip Wood Signed-off-by: Junio C Hamano --- xdiff/xprepare.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c index 3b6bae0d1581b7..81de412875abb4 100644 --- a/xdiff/xprepare.c +++ b/xdiff/xprepare.c @@ -197,8 +197,9 @@ void xdl_free_env(xdfenv_t *xe) { } -static bool xdl_clean_mmatch(uint8_t const *action, ptrdiff_t i, ptrdiff_t s, ptrdiff_t e) { +static bool xdl_clean_mmatch(uint8_t const *action, ptrdiff_t i, ptrdiff_t len) { ptrdiff_t r, rdis0, rpdis0, rdis1, rpdis1; + ptrdiff_t s = 0, e = len - 1; /* * Limits the window that is examined during the similar-lines @@ -342,7 +343,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd uint8_t action = action1[i]; if (action == INVESTIGATE) { - if (!xdl_clean_mmatch(action1, i, 0, len1 - 1)) + if (!xdl_clean_mmatch(action1, i, len1)) action = KEEP; else action = DISCARD; @@ -363,7 +364,7 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd uint8_t action = action2[i]; if (action == INVESTIGATE) { - if (!xdl_clean_mmatch(action2, i, 0, len2 - 1)) + if (!xdl_clean_mmatch(action2, i, len2)) action = KEEP; else action = DISCARD; From c8eb18f58607057a812654bdfca3e6b47bd0ffe4 Mon Sep 17 00:00:00 2001 From: Phillip Wood Date: Mon, 4 May 2026 15:06:20 +0100 Subject: [PATCH 06/29] xprepare: simplify error handling If either of the two allocations fail we want to take the same action so use a single if statement. This saves a few lines and makes it easier for the next commit to add a couple more allocations. 
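The single-condition error handling described above can be sketched in isolation. This is only an illustration under stated assumptions, not the xdiff code: `alloc_array()`, `with_two_arrays()` and `MAX_ELEMS` are hypothetical names, and the size cap exists solely so that a failing allocation can be triggered predictably.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cap so that a failing allocation can be shown deterministically. */
#define MAX_ELEMS 1024

/*
 * Hypothetical stand-in for an allocation macro such as XDL_CALLOC_ARRAY():
 * returns non-zero on success and zero on failure.
 */
static int alloc_array(uint8_t **p, size_t n)
{
	if (n == 0 || n > MAX_ELEMS)
		return 0;
	*p = calloc(n, sizeof(**p));
	return *p != NULL;
}

/*
 * Allocate two scratch arrays. `||` short-circuits, so the second
 * allocation is skipped when the first one fails, and both failures
 * share one error path. free(NULL) is a no-op, so the cleanup label
 * needs no extra checks.
 */
static int with_two_arrays(size_t len1, size_t len2)
{
	uint8_t *a1 = NULL, *a2 = NULL;
	int ret = 0;

	if (!alloc_array(&a1, len1) ||
	    !alloc_array(&a2, len2)) {
		ret = -1;
		goto cleanup;
	}

	/* Both arrays are valid from here on. */
	assert(a1 && a2);
	a1[0] = a2[0] = 1;

cleanup:
	free(a1);
	free(a2);
	return ret;
}
```

Adding a third or fourth allocation then only means adding another `||` operand, which is the property the next commit relies on.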
Signed-off-by: Phillip Wood Signed-off-by: Junio C Hamano --- xdiff/xprepare.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c index 81de412875abb4..7a29e5fc4748e2 100644 --- a/xdiff/xprepare.c +++ b/xdiff/xprepare.c @@ -282,11 +282,8 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd * Create temporary arrays that will help us decide if * changed[i] should remain false, or become true. */ - if (!XDL_CALLOC_ARRAY(action1, len1)) { - ret = -1; - goto cleanup; - } - if (!XDL_CALLOC_ARRAY(action2, len2)) { + if (!XDL_CALLOC_ARRAY(action1, len1) || + !XDL_CALLOC_ARRAY(action2, len2)) { ret = -1; goto cleanup; } From dca97e79bbf75f27602fe277344bfebebed82bb9 Mon Sep 17 00:00:00 2001 From: Phillip Wood Date: Mon, 4 May 2026 15:06:21 +0100 Subject: [PATCH 07/29] xdiff: reduce the size of array When the myers algorithm is selected the input files are pre-processed to remove any common prefix and suffix and any lines that appear in only one file. This requires a map to be created between the lines that are processed by the myers algorithm and the lines in the original file. That map does not include the common lines at the beginning and end of the files but the array is allocated to be the size of the whole file. Move the allocation into xdl_cleanup_records() where the map is populated and we know how big it needs to be. 
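As a rough sketch of the idea, the map only needs one slot per line between the common prefix and suffix, while the values stored in it remain absolute line numbers. The helper below is hypothetical (`build_index_map()` and its `keep` array are not part of xdiff) and assumes `dstart` and `dend` are inclusive bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: collect the absolute indices of the lines in
 * [dstart, dend] whose keep[] flag is set. The map is sized for the
 * trimmed range only, not for the whole file.
 *
 * Returns the map (caller frees) and stores the number of entries in
 * *nr_out. Returns NULL with *nr_out == -1 on allocation failure and
 * NULL with *nr_out == 0 for an empty range.
 */
static ptrdiff_t *build_index_map(const unsigned char *keep,
				  ptrdiff_t dstart, ptrdiff_t dend,
				  ptrdiff_t *nr_out)
{
	ptrdiff_t len = dend - dstart + 1;
	ptrdiff_t *map, i, nr = 0;

	assert(dstart >= 0);
	if (len <= 0) {
		*nr_out = 0;
		return NULL;
	}

	map = malloc((size_t)len * sizeof(*map));
	if (!map) {
		*nr_out = -1;
		return NULL;
	}

	/* i is relative to the trimmed range; i + dstart is absolute. */
	for (i = 0; i < len; i++)
		if (keep[i + dstart])
			map[nr++] = i + dstart;

	*nr_out = nr;
	return map;
}
```

For a seven-line input with `dstart = 2` and `dend = 4`, the map holds at most three entries regardless of how long the common prefix and suffix are.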
Signed-off-by: Phillip Wood Signed-off-by: Junio C Hamano --- xdiff/xprepare.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c index 7a29e5fc4748e2..11bada2608a7a4 100644 --- a/xdiff/xprepare.c +++ b/xdiff/xprepare.c @@ -171,12 +171,6 @@ static int xdl_prepare_ctx(unsigned int pass, mmfile_t *mf, long narec, xpparam_ if (!XDL_CALLOC_ARRAY(xdf->changed, xdf->nrec + 2)) goto abort; - if ((XDF_DIFF_ALG(xpp->flags) != XDF_PATIENCE_DIFF) && - (XDF_DIFF_ALG(xpp->flags) != XDF_HISTOGRAM_DIFF)) { - if (!XDL_ALLOC_ARRAY(xdf->reference_index, xdf->nrec + 1)) - goto abort; - } - xdf->changed += 1; xdf->nreff = 0; xdf->dstart = 0; @@ -283,7 +277,10 @@ static int xdl_cleanup_records(xdlclassifier_t *cf, xdfile_t *xdf1, xdfile_t *xd * changed[i] should remain false, or become true. */ if (!XDL_CALLOC_ARRAY(action1, len1) || - !XDL_CALLOC_ARRAY(action2, len2)) { + !XDL_CALLOC_ARRAY(action2, len2) || + !XDL_ALLOC_ARRAY(xdf1->reference_index, len1) || + !XDL_ALLOC_ARRAY(xdf2->reference_index, len2)) + { ret = -1; goto cleanup; } From 8a349a1d7970ed2c6fcac8ea0c1d641384ad082d Mon Sep 17 00:00:00 2001 From: Karthik Nayak Date: Mon, 4 May 2026 19:44:05 +0200 Subject: [PATCH 08/29] refs: remove unused typedef 'ref_transaction_commit_fn' The typedef 'ref_transaction_commit_fn' is not used anywhere in our code, let's remove it. 
Signed-off-by: Karthik Nayak Signed-off-by: Junio C Hamano --- refs/refs-internal.h | 4 ---- 1 file changed, 4 deletions(-) diff --git a/refs/refs-internal.h b/refs/refs-internal.h index d79e35fd269a6c..2d963cc4f4e201 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -421,10 +421,6 @@ typedef int ref_transaction_abort_fn(struct ref_store *refs, struct ref_transaction *transaction, struct strbuf *err); -typedef int ref_transaction_commit_fn(struct ref_store *refs, - struct ref_transaction *transaction, - struct strbuf *err); - typedef int optimize_fn(struct ref_store *ref_store, struct refs_optimize_opts *opts); From d194dffcfd3ad26105149f8e0fbd3b6537bf1986 Mon Sep 17 00:00:00 2001 From: Karthik Nayak Date: Mon, 4 May 2026 19:44:06 +0200 Subject: [PATCH 09/29] refs: introduce `ref_store_init_options` Reference backends are initiated via the `init()` function. When initiating the function, the backend is also provided flags which denote the access levels of the initiator. Create a new structure `ref_store_init_options` to house such options and move the access flags to this structure. This allows easier extension of providing further options to the backends. In the following commit, we'll also provide config around reflog creation to the backends via the same structure. 
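The benefit of an options structure can be sketched with a toy example. All names below (`store_init_options`, `store`, `files_store_init()`, `generic_init()`) are hypothetical and deliberately simpler than the refs API; the point is only that a new option becomes a new struct member rather than a new parameter on every backend's init function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical options bundle handed to every backend. */
struct store_init_options {
	unsigned int access_flags;
	bool log_updates; /* a later addition; no signature had to change */
};

/* Hypothetical backend state. */
struct store {
	unsigned int flags;
	bool log_updates;
};

typedef void store_init_fn(struct store *s,
			   const struct store_init_options *opts);

/* One backend: copies what it needs out of the options. */
static void files_store_init(struct store *s,
			     const struct store_init_options *opts)
{
	assert(s && opts);
	s->flags = opts->access_flags;
	s->log_updates = opts->log_updates;
}

/* Generic layer: fills in the options once and passes them down. */
static struct store generic_init(store_init_fn *init, unsigned int flags)
{
	struct store_init_options opts = {
		.access_flags = flags,
		.log_updates = true,
	};
	struct store s = { 0 };

	init(&s, &opts);
	return s;
}
```

Passing the structure by `const` pointer also makes it clear that backends read the options but do not own or modify them.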
Signed-off-by: Karthik Nayak Signed-off-by: Junio C Hamano --- refs.c | 6 +++++- refs/files-backend.c | 8 +++++--- refs/packed-backend.c | 4 ++-- refs/packed-backend.h | 3 ++- refs/refs-internal.h | 11 ++++++++++- refs/reftable-backend.c | 4 ++-- 6 files changed, 26 insertions(+), 10 deletions(-) diff --git a/refs.c b/refs.c index bfcb9c7ac3d38c..8992dd6ae865dd 100644 --- a/refs.c +++ b/refs.c @@ -2295,6 +2295,9 @@ static struct ref_store *ref_store_init(struct repository *repo, { const struct ref_storage_be *be; struct ref_store *refs; + struct ref_store_init_options opts = { + .access_flags = flags, + }; be = find_ref_storage_backend(format); if (!be) @@ -2304,7 +2307,8 @@ static struct ref_store *ref_store_init(struct repository *repo, * TODO Send in a 'struct worktree' instead of a 'gitdir', and * allow the backend to handle how it wants to deal with worktrees. */ - refs = be->init(repo, repo->ref_storage_payload, gitdir, flags); + refs = be->init(repo, repo->ref_storage_payload, gitdir, &opts); + return refs; } diff --git a/refs/files-backend.c b/refs/files-backend.c index b3b0c25f84e503..72afe62cee5967 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -108,7 +108,7 @@ static void clear_loose_ref_cache(struct files_ref_store *refs) static struct ref_store *files_ref_store_init(struct repository *repo, const char *payload, const char *gitdir, - unsigned int flags) + const struct ref_store_init_options *opts) { struct files_ref_store *refs = xcalloc(1, sizeof(*refs)); struct ref_store *ref_store = (struct ref_store *)refs; @@ -120,11 +120,13 @@ static struct ref_store *files_ref_store_init(struct repository *repo, &ref_common_dir); base_ref_store_init(ref_store, repo, refdir.buf, &refs_be_files); - refs->store_flags = flags; + refs->gitcommondir = strbuf_detach(&ref_common_dir, NULL); refs->packed_ref_store = - packed_ref_store_init(repo, NULL, refs->gitcommondir, flags); + packed_ref_store_init(repo, NULL, refs->gitcommondir, opts); + 
refs->store_flags = opts->access_flags; refs->log_all_ref_updates = repo_settings_get_log_all_ref_updates(repo); + repo_config_get_bool(repo, "core.prefersymlinkrefs", &refs->prefer_symlink_refs); chdir_notify_reparent("files-backend $GIT_DIR", &refs->base.gitdir); diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 23ed62984b765b..35a0f32e1cc45e 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -218,14 +218,14 @@ static size_t snapshot_hexsz(const struct snapshot *snapshot) struct ref_store *packed_ref_store_init(struct repository *repo, const char *payload UNUSED, const char *gitdir, - unsigned int store_flags) + const struct ref_store_init_options *opts) { struct packed_ref_store *refs = xcalloc(1, sizeof(*refs)); struct ref_store *ref_store = (struct ref_store *)refs; struct strbuf sb = STRBUF_INIT; base_ref_store_init(ref_store, repo, gitdir, &refs_be_packed); - refs->store_flags = store_flags; + refs->store_flags = opts->access_flags; strbuf_addf(&sb, "%s/packed-refs", gitdir); refs->path = strbuf_detach(&sb, NULL); diff --git a/refs/packed-backend.h b/refs/packed-backend.h index 2c2377a35653ec..1db48e801d63d0 100644 --- a/refs/packed-backend.h +++ b/refs/packed-backend.h @@ -3,6 +3,7 @@ struct repository; struct ref_transaction; +struct ref_store_init_options; /* * Support for storing references in a `packed-refs` file. @@ -16,7 +17,7 @@ struct ref_transaction; struct ref_store *packed_ref_store_init(struct repository *repo, const char *payload, const char *gitdir, - unsigned int store_flags); + const struct ref_store_init_options *options); /* * Lock the packed-refs file for writing. Flags is passed to diff --git a/refs/refs-internal.h b/refs/refs-internal.h index 2d963cc4f4e201..f49b3807bf3382 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -385,6 +385,15 @@ struct ref_store; REF_STORE_ODB | \ REF_STORE_MAIN) +/* + * Options for initializing the ref backend. 
All backend-agnostic information + * which backends required will be held here. + */ +struct ref_store_init_options { + /* The kind of operations that the ref_store is allowed to perform. */ + unsigned int access_flags; +}; + /* * Initialize the ref_store for the specified gitdir. These functions * should call base_ref_store_init() to initialize the shared part of @@ -393,7 +402,7 @@ struct ref_store; typedef struct ref_store *ref_store_init_fn(struct repository *repo, const char *payload, const char *gitdir, - unsigned int flags); + const struct ref_store_init_options *opts); /* * Release all memory and resources associated with the ref store. */ diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index daea30a5b4cad9..ad4ee2627c4418 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -369,7 +369,7 @@ static int reftable_be_config(const char *var, const char *value, static struct ref_store *reftable_be_init(struct repository *repo, const char *payload, const char *gitdir, - unsigned int store_flags) + const struct ref_store_init_options *opts) { struct reftable_ref_store *refs = xcalloc(1, sizeof(*refs)); struct strbuf ref_common_dir = STRBUF_INIT; @@ -386,8 +386,8 @@ static struct ref_store *reftable_be_init(struct repository *repo, base_ref_store_init(&refs->base, repo, refdir.buf, &refs_be_reftable); strmap_init(&refs->worktree_backends); - refs->store_flags = store_flags; refs->log_all_ref_updates = repo_settings_get_log_all_ref_updates(repo); + refs->store_flags = opts->access_flags; switch (repo->hash_algo->format_id) { case GIT_SHA1_FORMAT_ID: From cc42c88945753363d67d130c79640e3c682e1334 Mon Sep 17 00:00:00 2001 From: Karthik Nayak Date: Mon, 4 May 2026 19:44:07 +0200 Subject: [PATCH 10/29] refs: extract out reflog config to generic layer The reference backends need to know when to create reflog entries, this is dictated by the 'core.logallrefupdates' config. 
Instead of relying on the backends to call `repo_settings_get_log_all_ref_updates()` to obtain this config value, let's do this in the generic layer and pass down the value to the backends. Signed-off-by: Karthik Nayak Signed-off-by: Junio C Hamano --- refs.c | 1 + refs/files-backend.c | 2 +- refs/refs-internal.h | 6 ++++++ refs/reftable-backend.c | 2 +- 4 files changed, 9 insertions(+), 2 deletions(-) diff --git a/refs.c b/refs.c index 8992dd6ae865dd..6b506aeea30bae 100644 --- a/refs.c +++ b/refs.c @@ -2297,6 +2297,7 @@ static struct ref_store *ref_store_init(struct repository *repo, struct ref_store *refs; struct ref_store_init_options opts = { .access_flags = flags, + .log_all_ref_updates = repo_settings_get_log_all_ref_updates(repo), }; be = find_ref_storage_backend(format); diff --git a/refs/files-backend.c b/refs/files-backend.c index 72afe62cee5967..4b2faf477727b4 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -125,7 +125,7 @@ static struct ref_store *files_ref_store_init(struct repository *repo, refs->packed_ref_store = packed_ref_store_init(repo, NULL, refs->gitcommondir, opts); refs->store_flags = opts->access_flags; - refs->log_all_ref_updates = repo_settings_get_log_all_ref_updates(repo); + refs->log_all_ref_updates = opts->log_all_ref_updates; repo_config_get_bool(repo, "core.prefersymlinkrefs", &refs->prefer_symlink_refs); diff --git a/refs/refs-internal.h b/refs/refs-internal.h index f49b3807bf3382..d103387ebf1e92 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -392,6 +392,12 @@ struct ref_store; struct ref_store_init_options { /* The kind of operations that the ref_store is allowed to perform. */ unsigned int access_flags; + + /* + * Denotes under what conditions reflogs should be created when updating + * references. 
+ */ + enum log_refs_config log_all_ref_updates; }; /* diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index ad4ee2627c4418..93374d25c24d21 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -386,7 +386,7 @@ static struct ref_store *reftable_be_init(struct repository *repo, base_ref_store_init(&refs->base, repo, refdir.buf, &refs_be_reftable); strmap_init(&refs->worktree_backends); - refs->log_all_ref_updates = repo_settings_get_log_all_ref_updates(repo); + refs->log_all_ref_updates = opts->log_all_ref_updates; refs->store_flags = opts->access_flags; switch (repo->hash_algo->format_id) { From e99e98e600181ddf431b267c2887358b3556e45c Mon Sep 17 00:00:00 2001 From: Karthik Nayak Date: Mon, 4 May 2026 19:44:08 +0200 Subject: [PATCH 11/29] refs: return `ref_transaction_error` from `ref_transaction_update()` The `ref_transaction_update()` function is used to add updates to a given reference transaction. In the following commit, we'll add more validation to this function. As such, it would be beneficial if the function returns specific error types, so callers can differentiate between different errors. To facilitate this, return `enum ref_transaction_error` from the function and convert the existing '-1' returns to 'REF_TRANSACTION_ERROR_GENERIC'. Since this retains the existing behavior, no changes are made to any of the callers but this sets the necessary infrastructure for introduction of other errors.
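Why existing callers keep working can be shown with a small sketch. The enum and functions below are hypothetical stand-ins, not the refs API; the only assumption carried over is that success is zero and every error is non-zero, so a plain `if (fn(...))` check behaves as before while newer callers can inspect the specific value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes; zero means success so `if (err)` keeps working. */
enum update_error {
	UPDATE_OK = 0,
	UPDATE_ERROR_GENERIC = -1,
	UPDATE_ERROR_INVALID_VALUE = -2,
};

/* Hypothetical validation step that reports why it refused an update. */
static enum update_error queue_update(const char *name, const char *value)
{
	if (!name || !*name)
		return UPDATE_ERROR_GENERIC;
	if (!value || !*value)
		return UPDATE_ERROR_INVALID_VALUE;
	return UPDATE_OK;
}

/* Existing-style caller: treats every non-zero return as a failure. */
static int strict_caller(const char *name, const char *value)
{
	return queue_update(name, value) ? -1 : 0;
}

/*
 * Newer-style caller: gives up only on generic errors and counts the
 * specific ones as rejections it can report and move past.
 */
static int lenient_caller(const char *name, const char *value, int *rejected)
{
	enum update_error err = queue_update(name, value);

	assert(rejected);
	if (err == UPDATE_ERROR_GENERIC)
		return -1;
	if (err != UPDATE_OK)
		(*rejected)++;
	return 0;
}
```

The second caller is the shape that a batch mode needs: a specific rejection is recorded and processing continues, whereas a generic error still aborts.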
Signed-off-by: Karthik Nayak Signed-off-by: Junio C Hamano --- refs.c | 20 ++++++++++---------- refs.h | 16 ++++++++-------- 2 files changed, 18 insertions(+), 18 deletions(-) diff --git a/refs.c b/refs.c index 6b506aeea30bae..efa16b739d153b 100644 --- a/refs.c +++ b/refs.c @@ -1383,25 +1383,25 @@ static int transaction_refname_valid(const char *refname, return 1; } -int ref_transaction_update(struct ref_transaction *transaction, - const char *refname, - const struct object_id *new_oid, - const struct object_id *old_oid, - const char *new_target, - const char *old_target, - unsigned int flags, const char *msg, - struct strbuf *err) +enum ref_transaction_error ref_transaction_update(struct ref_transaction *transaction, + const char *refname, + const struct object_id *new_oid, + const struct object_id *old_oid, + const char *new_target, + const char *old_target, + unsigned int flags, const char *msg, + struct strbuf *err) { assert(err); if ((flags & REF_FORCE_CREATE_REFLOG) && (flags & REF_SKIP_CREATE_REFLOG)) { strbuf_addstr(err, _("refusing to force and skip creation of reflog")); - return -1; + return REF_TRANSACTION_ERROR_GENERIC; } if (!transaction_refname_valid(refname, new_oid, flags, err)) - return -1; + return REF_TRANSACTION_ERROR_GENERIC; if (flags & ~REF_TRANSACTION_UPDATE_ALLOWED_FLAGS) BUG("illegal flags 0x%x passed to ref_transaction_update()", flags); diff --git a/refs.h b/refs.h index d65de6ab5fe11b..71d5c186d044bb 100644 --- a/refs.h +++ b/refs.h @@ -905,14 +905,14 @@ struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs, * See the above comment "Reference transaction updates" for more * information. 
 */ -int ref_transaction_update(struct ref_transaction *transaction, - const char *refname, - const struct object_id *new_oid, - const struct object_id *old_oid, - const char *new_target, - const char *old_target, - unsigned int flags, const char *msg, - struct strbuf *err); +enum ref_transaction_error ref_transaction_update(struct ref_transaction *transaction, + const char *refname, + const struct object_id *new_oid, + const struct object_id *old_oid, + const char *new_target, + const char *old_target, + unsigned int flags, const char *msg, + struct strbuf *err); /* * Similar to `ref_transaction_update`, but this function is only for adding From 637989cdec77b95fb0743cf433873253cec5ae0f Mon Sep 17 00:00:00 2001 From: Karthik Nayak Date: Mon, 4 May 2026 19:44:09 +0200 Subject: [PATCH 12/29] update-ref: move `print_rejected_refs()` up The `print_rejected_refs()` function is used to print any rejected refs when using git-update-ref(1) with the '--batch-updates' option. In the following commit, we'll need to use this function in another place, so move the function up to avoid a separate forward declaration. Signed-off-by: Karthik Nayak Signed-off-by: Junio C Hamano --- builtin/update-ref.c | 45 ++++++++++++++++++++++---------------------- 1 file changed, 22 insertions(+), 23 deletions(-) diff --git a/builtin/update-ref.c b/builtin/update-ref.c index 2d68c40ecb7010..5259cc72267918 100644 --- a/builtin/update-ref.c +++ b/builtin/update-ref.c @@ -234,6 +234,28 @@ static int parse_next_oid(const char **next, const char *end, command, refname); } +static void print_rejected_refs(const char *refname, + const struct object_id *old_oid, + const struct object_id *new_oid, + const char *old_target, + const char *new_target, + enum ref_transaction_error err, + const char *details, + void *cb_data UNUSED) +{ + struct strbuf sb = STRBUF_INIT; + + if (details && *details) + error("%s", details); + + strbuf_addf(&sb, "rejected %s %s %s %s\n", refname, + new_oid ?
oid_to_hex(new_oid) : new_target, + old_oid ? oid_to_hex(old_oid) : old_target, + ref_transaction_error_msg(err)); + + fwrite(sb.buf, sb.len, 1, stdout); + strbuf_release(&sb); +} /* * The following five parse_cmd_*() functions parse the corresponding @@ -567,29 +589,6 @@ static void parse_cmd_abort(struct ref_transaction *transaction, report_ok("abort"); } -static void print_rejected_refs(const char *refname, - const struct object_id *old_oid, - const struct object_id *new_oid, - const char *old_target, - const char *new_target, - enum ref_transaction_error err, - const char *details, - void *cb_data UNUSED) -{ - struct strbuf sb = STRBUF_INIT; - - if (details && *details) - error("%s", details); - - strbuf_addf(&sb, "rejected %s %s %s %s\n", refname, - new_oid ? oid_to_hex(new_oid) : new_target, - old_oid ? oid_to_hex(old_oid) : old_target, - ref_transaction_error_msg(err)); - - fwrite(sb.buf, sb.len, 1, stdout); - strbuf_release(&sb); -} - static void parse_cmd_commit(struct ref_transaction *transaction, const char *next, const char *end UNUSED) { From e31a10418a4c2270651bab326f4715892db9c3ee Mon Sep 17 00:00:00 2001 From: Karthik Nayak Date: Mon, 4 May 2026 19:44:10 +0200 Subject: [PATCH 13/29] update-ref: handle rejections while adding updates When using git-update-ref(1) with the '--batch-updates' flag, updates rejected by the reference backend are displayed to the user while other updates are applied. This only applies during the commit phase of the transaction. In the following commits, we'll also extend `ref_transaction_update()` to reject updates before a transaction is prepared/committed. In preparation, modify the code in update-ref to also handle non-generic rejections from `ref_transaction_update()`. This involves propagating information to each of the commands on whether updates are allowed to be rejected, and also checking for rejections and only dying for generic failures. 
Errors encountered during updates will be shown to the user immediately unlike other errors encountered only when the transaction is prepared/committed. As the verification of object IDs and peeled tag objects will move into `ref_transaction_update()` in the following commit, this means that those errors will be shown to the user before other errors, this changes the order of errors, but the functionality remains the same. Signed-off-by: Karthik Nayak Signed-off-by: Junio C Hamano --- builtin/update-ref.c | 137 +++++++++++++++++++++++++++++++------------ 1 file changed, 98 insertions(+), 39 deletions(-) diff --git a/builtin/update-ref.c b/builtin/update-ref.c index 5259cc72267918..6355c3dd3e7fc5 100644 --- a/builtin/update-ref.c +++ b/builtin/update-ref.c @@ -25,6 +25,15 @@ static unsigned int default_flags; static unsigned create_reflog_flag; static const char *msg; +struct command_options { + /* + * Individual updates are allowed to fail without causing + * update-ref to exit. This is set when using the + * '--batch-updates' flag. + */ + bool allow_update_failures; +}; + /* * Parse one whitespace- or NUL-terminated, possibly C-quoted argument * and append the result to arg. Return a pointer to the terminator. @@ -257,6 +266,31 @@ static void print_rejected_refs(const char *refname, strbuf_release(&sb); } +/* + * Handle transaction errors. If we're using batches updates, we want to only + * die for generic errors and print the remaining to the user. 
+ */ +static void handle_ref_transaction_error(const char *refname, + struct object_id *new_oid, + struct object_id *old_oid, + const char *new_target, + const char *old_target, + enum ref_transaction_error tx_err, + struct strbuf *err, + struct command_options *opts) +{ + if (!tx_err) + return; + + if (tx_err != REF_TRANSACTION_ERROR_GENERIC && opts->allow_update_failures) { + print_rejected_refs(refname, old_oid, new_oid, old_target, + new_target, tx_err, err->buf, NULL); + return; + } + + die("%s", err->buf); +} + /* * The following five parse_cmd_*() functions parse the corresponding * command. In each case, next points at the character following the @@ -268,11 +302,13 @@ static void print_rejected_refs(const char *refname, */ static void parse_cmd_update(struct ref_transaction *transaction, - const char *next, const char *end) + const char *next, const char *end, + struct command_options *opts) { struct strbuf err = STRBUF_INIT; char *refname; struct object_id new_oid, old_oid; + enum ref_transaction_error tx_err; int have_old; refname = parse_refname(&next); @@ -289,12 +325,14 @@ static void parse_cmd_update(struct ref_transaction *transaction, if (*next != line_termination) die("update %s: extra input: %s", refname, next); - if (ref_transaction_update(transaction, refname, - &new_oid, have_old ? &old_oid : NULL, - NULL, NULL, - update_flags | create_reflog_flag, - msg, &err)) - die("%s", err.buf); + tx_err = ref_transaction_update(transaction, refname, + &new_oid, have_old ? &old_oid : NULL, + NULL, NULL, + update_flags | create_reflog_flag, + msg, &err); + handle_ref_transaction_error(refname, &new_oid, have_old ? 
&old_oid : NULL, + NULL, NULL, tx_err, &err, opts); + update_flags = default_flags; free(refname); @@ -302,9 +340,11 @@ static void parse_cmd_update(struct ref_transaction *transaction, } static void parse_cmd_symref_update(struct ref_transaction *transaction, - const char *next, const char *end UNUSED) + const char *next, const char *end UNUSED, + struct command_options *opts) { char *refname, *new_target, *old_arg; + enum ref_transaction_error tx_err; char *old_target = NULL; struct strbuf err = STRBUF_INIT; struct object_id old_oid; @@ -341,13 +381,15 @@ static void parse_cmd_symref_update(struct ref_transaction *transaction, if (*next != line_termination) die("symref-update %s: extra input: %s", refname, next); - if (ref_transaction_update(transaction, refname, NULL, - have_old_oid ? &old_oid : NULL, - new_target, - have_old_oid ? NULL : old_target, - update_flags | create_reflog_flag, - msg, &err)) - die("%s", err.buf); + tx_err = ref_transaction_update(transaction, refname, NULL, + have_old_oid ? &old_oid : NULL, + new_target, + have_old_oid ? NULL : old_target, + update_flags | create_reflog_flag, + msg, &err); + handle_ref_transaction_error(refname, NULL, have_old_oid ? &old_oid : NULL, + new_target, have_old_oid ? 
NULL : old_target, + tx_err, &err, opts); update_flags = default_flags; free(refname); @@ -358,11 +400,13 @@ static void parse_cmd_symref_update(struct ref_transaction *transaction, } static void parse_cmd_create(struct ref_transaction *transaction, - const char *next, const char *end) + const char *next, const char *end, + struct command_options *opts) { struct strbuf err = STRBUF_INIT; char *refname; struct object_id new_oid; + enum ref_transaction_error tx_err; refname = parse_refname(&next); if (!refname) @@ -377,22 +421,24 @@ static void parse_cmd_create(struct ref_transaction *transaction, if (*next != line_termination) die("create %s: extra input: %s", refname, next); - if (ref_transaction_create(transaction, refname, &new_oid, NULL, - update_flags | create_reflog_flag, - msg, &err)) - die("%s", err.buf); + tx_err = ref_transaction_create(transaction, refname, &new_oid, NULL, + update_flags | create_reflog_flag, + msg, &err); + handle_ref_transaction_error(refname, &new_oid, NULL, NULL, NULL, tx_err, + &err, opts); update_flags = default_flags; free(refname); strbuf_release(&err); } - static void parse_cmd_symref_create(struct ref_transaction *transaction, - const char *next, const char *end UNUSED) + const char *next, const char *end UNUSED, + struct command_options *opts) { struct strbuf err = STRBUF_INIT; char *refname, *new_target; + enum ref_transaction_error tx_err; refname = parse_refname(&next); if (!refname) @@ -405,10 +451,11 @@ static void parse_cmd_symref_create(struct ref_transaction *transaction, if (*next != line_termination) die("symref-create %s: extra input: %s", refname, next); - if (ref_transaction_create(transaction, refname, NULL, new_target, - update_flags | create_reflog_flag, - msg, &err)) - die("%s", err.buf); + tx_err = ref_transaction_create(transaction, refname, NULL, new_target, + update_flags | create_reflog_flag, + msg, &err); + handle_ref_transaction_error(refname, NULL, NULL, new_target, NULL, + tx_err, &err, opts); 
update_flags = default_flags; free(refname); @@ -417,7 +464,8 @@ static void parse_cmd_symref_create(struct ref_transaction *transaction, } static void parse_cmd_delete(struct ref_transaction *transaction, - const char *next, const char *end) + const char *next, const char *end, + struct command_options *opts UNUSED) { struct strbuf err = STRBUF_INIT; char *refname; @@ -450,9 +498,9 @@ static void parse_cmd_delete(struct ref_transaction *transaction, strbuf_release(&err); } - static void parse_cmd_symref_delete(struct ref_transaction *transaction, - const char *next, const char *end UNUSED) + const char *next, const char *end UNUSED, + struct command_options *opts UNUSED) { struct strbuf err = STRBUF_INIT; char *refname, *old_target; @@ -479,9 +527,9 @@ static void parse_cmd_symref_delete(struct ref_transaction *transaction, strbuf_release(&err); } - static void parse_cmd_verify(struct ref_transaction *transaction, - const char *next, const char *end) + const char *next, const char *end, + struct command_options *opts UNUSED) { struct strbuf err = STRBUF_INIT; char *refname; @@ -508,7 +556,8 @@ static void parse_cmd_verify(struct ref_transaction *transaction, } static void parse_cmd_symref_verify(struct ref_transaction *transaction, - const char *next, const char *end UNUSED) + const char *next, const char *end UNUSED, + struct command_options *opts UNUSED) { struct strbuf err = STRBUF_INIT; struct object_id old_oid; @@ -550,7 +599,8 @@ static void report_ok(const char *command) } static void parse_cmd_option(struct ref_transaction *transaction UNUSED, - const char *next, const char *end UNUSED) + const char *next, const char *end UNUSED, + struct command_options *opts UNUSED) { const char *rest; if (skip_prefix(next, "no-deref", &rest) && *rest == line_termination) @@ -560,7 +610,8 @@ static void parse_cmd_option(struct ref_transaction *transaction UNUSED, } static void parse_cmd_start(struct ref_transaction *transaction UNUSED, - const char *next, const char *end 
UNUSED) + const char *next, const char *end UNUSED, + struct command_options *opts UNUSED) { if (*next != line_termination) die("start: extra input: %s", next); @@ -568,7 +619,8 @@ static void parse_cmd_start(struct ref_transaction *transaction UNUSED, } static void parse_cmd_prepare(struct ref_transaction *transaction, - const char *next, const char *end UNUSED) + const char *next, const char *end UNUSED, + struct command_options *opts UNUSED) { struct strbuf error = STRBUF_INIT; if (*next != line_termination) @@ -579,7 +631,8 @@ static void parse_cmd_prepare(struct ref_transaction *transaction, } static void parse_cmd_abort(struct ref_transaction *transaction, - const char *next, const char *end UNUSED) + const char *next, const char *end UNUSED, + struct command_options *opts UNUSED) { struct strbuf error = STRBUF_INIT; if (*next != line_termination) @@ -590,7 +643,8 @@ static void parse_cmd_abort(struct ref_transaction *transaction, } static void parse_cmd_commit(struct ref_transaction *transaction, - const char *next, const char *end UNUSED) + const char *next, const char *end UNUSED, + struct command_options *opts UNUSED) { struct strbuf error = STRBUF_INIT; if (*next != line_termination) @@ -618,7 +672,8 @@ enum update_refs_state { static const struct parse_cmd { const char *prefix; - void (*fn)(struct ref_transaction *, const char *, const char *); + void (*fn)(struct ref_transaction *, const char *, const char *, + struct command_options *); unsigned args; enum update_refs_state state; } command[] = { @@ -644,6 +699,10 @@ static void update_refs_stdin(unsigned int flags) struct ref_transaction *transaction; int i, j; + struct command_options opts = { + .allow_update_failures = flags & REF_TRANSACTION_ALLOW_FAILURE, + }; + transaction = ref_store_transaction_begin(get_main_ref_store(the_repository), flags, &err); if (!transaction) @@ -721,7 +780,7 @@ static void update_refs_stdin(unsigned int flags) } cmd->fn(transaction, input.buf + strlen(cmd->prefix) + 
!!cmd->args, - input.buf + input.len); + input.buf + input.len, &opts); } switch (state) { From b32c23be3bf444ad8d56e8daee4a704ff8cae0ea Mon Sep 17 00:00:00 2001 From: Karthik Nayak Date: Mon, 4 May 2026 19:44:11 +0200 Subject: [PATCH 14/29] refs: move object parsing to the generic layer Regular reference updates made via reference transactions validate that the provided object ID exists in the object database, which is done by calling 'parse_object()'. This check is done independently by the backends, which leads to duplicated logic. Let's move this to the generic layer, ensuring the backends only have to care about reference storage and not about validation of the object IDs. With this also remove the 'REF_TRANSACTION_ERROR_INVALID_NEW_VALUE' error type as it's no longer used. Since we don't iterate over individual references in `ref_transaction_prepare()`, we add this check to `ref_transaction_update()`. This means that the validation is done as soon as an update is queued, without needing to prepare the transaction. It can be argued that this is preferable, since this validation has no dependency on the reference transaction being prepared. It must be noted that the change in behavior means that this error cannot be ignored even when batched updates are used, since it happens when the update is being added to the transaction. But since the caller gets specific error codes, they can either abort the transaction or continue adding other updates to the transaction. Modify 'builtin/receive-pack.c' to now capture the error type so that the error propagated to the client stays the same. Also remove two of the tests which validate batch-updates with invalid new_oid.
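The behavior change described above, validation at queue time instead of at prepare time, can be illustrated with a toy transaction. This is a hedged sketch under stated assumptions: `sketch_tx`, `sketch_tx_update()` and the two-entry "object database" are hypothetical and only mimic the shape of `ref_transaction_update()` returning a specific error code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_UPDATES 8

enum sketch_err {
	SKETCH_OK = 0,
	SKETCH_ERR_GENERIC,           /* e.g. the toy queue is full */
	SKETCH_ERR_INVALID_NEW_VALUE, /* new object ID does not exist */
};

struct sketch_tx {
	const char *refnames[SKETCH_MAX_UPDATES];
	size_t nr;
};

/* Hypothetical object database: only these two IDs "exist". */
static bool sketch_object_exists(const char *oid)
{
	return !strcmp(oid, "1111") || !strcmp(oid, "2222");
}

/* Validate at queue time; an invalid update never enters the transaction. */
static enum sketch_err sketch_tx_update(struct sketch_tx *tx,
					const char *refname,
					const char *new_oid)
{
	if (!sketch_object_exists(new_oid))
		return SKETCH_ERR_INVALID_NEW_VALUE;
	if (tx->nr >= SKETCH_MAX_UPDATES)
		return SKETCH_ERR_GENERIC;
	tx->refnames[tx->nr++] = refname;
	return SKETCH_OK;
}
```

Because the caller sees a specific error code for the rejected update, it can keep queueing the remaining updates or give up on the whole transaction, which is the choice the commit message leaves to callers.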
Signed-off-by: Karthik Nayak Signed-off-by: Junio C Hamano --- builtin/receive-pack.c | 22 +++++++++++++--------- refs.c | 18 ++++++++++++++++++ refs/files-backend.c | 28 ++-------------------------- refs/reftable-backend.c | 19 ------------------- t/t1400-update-ref.sh | 14 ++++++++++++++ 5 files changed, 47 insertions(+), 54 deletions(-) diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 878aa7f0ed9eb5..376e755e97e8ea 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -1641,8 +1641,8 @@ static const char *update(struct command *cmd, struct shallow_info *si) ret = NULL; /* good */ } strbuf_release(&err); - } - else { + } else { + enum ref_transaction_error tx_err; struct strbuf err = STRBUF_INIT; if (shallow_update && si->shallow_ref[cmd->index] && update_shallow_ref(cmd, si)) { @@ -1650,14 +1650,18 @@ static const char *update(struct command *cmd, struct shallow_info *si) goto out; } - if (ref_transaction_update(transaction, - namespaced_name, - new_oid, old_oid, - NULL, NULL, - 0, "push", - &err)) { + tx_err = ref_transaction_update(transaction, + namespaced_name, + new_oid, old_oid, + NULL, NULL, + 0, "push", + &err); + if (tx_err) { rp_error("%s", err.buf); - ret = "failed to update ref"; + if (tx_err == REF_TRANSACTION_ERROR_GENERIC) + ret = "failed to update ref"; + else + ret = ref_transaction_error_msg(tx_err); } else { ret = NULL; /* good */ } diff --git a/refs.c b/refs.c index efa16b739d153b..662a9e6f9e2048 100644 --- a/refs.c +++ b/refs.c @@ -1416,6 +1416,24 @@ enum ref_transaction_error ref_transaction_update(struct ref_transaction *transa flags |= (new_oid ? REF_HAVE_NEW : 0) | (old_oid ? REF_HAVE_OLD : 0); flags |= (new_target ? REF_HAVE_NEW : 0) | (old_target ? 
REF_HAVE_OLD : 0); + if ((flags & REF_HAVE_NEW) && !new_target && !is_null_oid(new_oid) && + !(flags & REF_SKIP_OID_VERIFICATION) && !(flags & REF_LOG_ONLY)) { + struct object *o = parse_object(transaction->ref_store->repo, new_oid); + + if (!o) { + strbuf_addf(err, + _("trying to write ref '%s' with nonexistent object %s"), + refname, oid_to_hex(new_oid)); + return REF_TRANSACTION_ERROR_INVALID_NEW_VALUE; + } + + if (o->type != OBJ_COMMIT && is_branch(refname)) { + strbuf_addf(err, _("trying to write non-commit object %s to branch '%s'"), + oid_to_hex(new_oid), refname); + return REF_TRANSACTION_ERROR_INVALID_NEW_VALUE; + } + } + ref_transaction_add_update(transaction, refname, flags, new_oid, old_oid, new_target, old_target, NULL, msg); diff --git a/refs/files-backend.c b/refs/files-backend.c index 4b2faf477727b4..f20f580fbc9601 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -19,7 +19,6 @@ #include "../iterator.h" #include "../dir-iterator.h" #include "../lockfile.h" -#include "../object.h" #include "../path.h" #include "../dir.h" #include "../chdir-notify.h" @@ -1589,7 +1588,6 @@ static int rename_tmp_log(struct files_ref_store *refs, const char *newrefname) static enum ref_transaction_error write_ref_to_lockfile(struct files_ref_store *refs, struct ref_lock *lock, const struct object_id *oid, - int skip_oid_verification, struct strbuf *err); static int commit_ref_update(struct files_ref_store *refs, struct ref_lock *lock, @@ -1737,7 +1735,7 @@ static int files_copy_or_rename_ref(struct ref_store *ref_store, } oidcpy(&lock->old_oid, &orig_oid); - if (write_ref_to_lockfile(refs, lock, &orig_oid, 0, &err) || + if (write_ref_to_lockfile(refs, lock, &orig_oid, &err) || commit_ref_update(refs, lock, &orig_oid, logmsg, 0, &err)) { error("unable to write current sha1 into %s: %s", newrefname, err.buf); strbuf_release(&err); @@ -1755,7 +1753,7 @@ static int files_copy_or_rename_ref(struct ref_store *ref_store, goto rollbacklog; } - if 
(write_ref_to_lockfile(refs, lock, &orig_oid, 0, &err) || + if (write_ref_to_lockfile(refs, lock, &orig_oid, &err) || commit_ref_update(refs, lock, &orig_oid, NULL, REF_SKIP_CREATE_REFLOG, &err)) { error("unable to write current sha1 into %s: %s", oldrefname, err.buf); strbuf_release(&err); @@ -1999,32 +1997,11 @@ static int files_log_ref_write(struct files_ref_store *refs, static enum ref_transaction_error write_ref_to_lockfile(struct files_ref_store *refs, struct ref_lock *lock, const struct object_id *oid, - int skip_oid_verification, struct strbuf *err) { static char term = '\n'; - struct object *o; int fd; - if (!skip_oid_verification) { - o = parse_object(refs->base.repo, oid); - if (!o) { - strbuf_addf( - err, - "trying to write ref '%s' with nonexistent object %s", - lock->ref_name, oid_to_hex(oid)); - unlock_ref(lock); - return REF_TRANSACTION_ERROR_INVALID_NEW_VALUE; - } - if (o->type != OBJ_COMMIT && is_branch(lock->ref_name)) { - strbuf_addf( - err, - "trying to write non-commit object %s to branch '%s'", - oid_to_hex(oid), lock->ref_name); - unlock_ref(lock); - return REF_TRANSACTION_ERROR_INVALID_NEW_VALUE; - } - } fd = get_lock_file_fd(&lock->lk); if (write_in_full(fd, oid_to_hex(oid), refs->base.repo->hash_algo->hexsz) < 0 || write_in_full(fd, &term, 1) < 0 || @@ -2828,7 +2805,6 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re } else { ret = write_ref_to_lockfile( refs, lock, &update->new_oid, - update->flags & REF_SKIP_OID_VERIFICATION, err); if (ret) { char *write_err = strbuf_detach(err, NULL); diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index 93374d25c24d21..444b0c24e56cd0 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -1081,25 +1081,6 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor return 0; } - /* Verify that the new object ID is valid. 
*/ - if ((u->flags & REF_HAVE_NEW) && !is_null_oid(&u->new_oid) && - !(u->flags & REF_SKIP_OID_VERIFICATION) && - !(u->flags & REF_LOG_ONLY)) { - struct object *o = parse_object(refs->base.repo, &u->new_oid); - if (!o) { - strbuf_addf(err, - _("trying to write ref '%s' with nonexistent object %s"), - u->refname, oid_to_hex(&u->new_oid)); - return REF_TRANSACTION_ERROR_INVALID_NEW_VALUE; - } - - if (o->type != OBJ_COMMIT && is_branch(u->refname)) { - strbuf_addf(err, _("trying to write non-commit object %s to branch '%s'"), - oid_to_hex(&u->new_oid), u->refname); - return REF_TRANSACTION_ERROR_INVALID_NEW_VALUE; - } - } - /* * When we update the reference that HEAD points to we enqueue * a second log-only update for HEAD so that its reflog is diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh index b2858a9061a23d..1015f335e31611 100755 --- a/t/t1400-update-ref.sh +++ b/t/t1400-update-ref.sh @@ -1196,6 +1196,20 @@ test_expect_success 'stdin -z create ref fails with empty new value' ' test_must_fail git rev-parse --verify -q $c ' +test_expect_success 'stdin -z create ref fails with non commit object' ' + printf $F "create $c" "$(test_oid 001)" >stdin && + test_must_fail git update-ref -z --stdin <stdin 2>err && + grep "fatal: trying to write ref ${SQ}$c${SQ} with nonexistent object" err && + test_must_fail git rev-parse --verify -q $c +' + +test_expect_success 'stdin -z update ref fails with non commit object' ' + printf $F "update $b" "$(test_oid 001)" "" >stdin && + test_must_fail git update-ref -z --stdin <stdin 2>err && + grep "fatal: trying to write ref ${SQ}$b${SQ} with nonexistent object" err && + test_must_fail git rev-parse --verify -q $c +' + test_expect_success 'stdin -z update ref works with right old value' ' printf $F "update $b" "$m~1" "$m" >stdin && git update-ref -z --stdin Date: Mon, 4 May 2026 19:44:12 +0200 Subject: [PATCH 15/29] refs: add peeled object ID to the `ref_update` struct Certain reference backends {packed, reftable} have the ability to also
store the peeled object ID for a reference pointing to a tag object. This has the added benefit that, when such references are retrieved, we also obtain the peeled object ID without having to use the ODB. To provide this functionality, each backend independently calls the ODB to obtain the peeled OID.

To move this functionality to the generic layer, there must be supporting infrastructure to pass in a peeled OID for reference updates. Add a `peeled` field to the `ref_update` structure and modify `ref_transaction_add_update()` to receive and copy this object ID to the `ref_update` structure. Finally, modify `ref_transaction_update()` to peel tag objects and pass the peeled OID to `ref_transaction_add_update()`.

Update all callers of these functions with the new function parameters. Callers which only add reflog updates only need to pass in NULL, since we don't store peeled OIDs for reflogs. Reference deletions also only need to pass in NULL. For others, pass along the peeled OID if available.

In a following commit, we'll modify the backends to use this peeled OID instead of parsing it themselves.
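The flag-plus-field pattern described above can be sketched on its own. This is an illustrative model with hypothetical `sketch_*` names, assuming a fixed 32-byte ID; it only mirrors the idea that the `peeled` field is meaningful when, and only when, the corresponding flag bit is set.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag; it mirrors the role of REF_HAVE_PEELED in the patch. */
#define SKETCH_HAVE_PEELED (1u << 15)

struct sketch_oid {
	unsigned char hash[32];
};

struct sketch_update {
	unsigned int flags;
	struct sketch_oid new_oid;
	struct sketch_oid peeled; /* only meaningful with SKETCH_HAVE_PEELED */
};

/* Copy the peeled ID only when the caller says it has one. */
static void sketch_add_update(struct sketch_update *u, unsigned int flags,
			      const struct sketch_oid *new_oid,
			      const struct sketch_oid *peeled)
{
	memset(u, 0, sizeof(*u));
	u->flags = flags;
	u->new_oid = *new_oid;
	if (flags & SKETCH_HAVE_PEELED)
		u->peeled = *peeled;
}

/* What a backend would store next to the ref: the peeled ID, or nothing. */
static const struct sketch_oid *sketch_peeled_or_null(const struct sketch_update *u)
{
	return (u->flags & SKETCH_HAVE_PEELED) ? &u->peeled : NULL;
}
```

Guarding the copy with the flag is what lets callers that have no peeled value, such as reflog-only updates and deletions, pass NULL safely.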
Signed-off-by: Karthik Nayak Signed-off-by: Junio C Hamano --- refs.c | 15 +++++++++++++-- refs/files-backend.c | 20 ++++++++++++-------- refs/refs-internal.h | 14 ++++++++++++++ refs/reftable-backend.c | 6 +++--- 4 files changed, 42 insertions(+), 13 deletions(-) diff --git a/refs.c b/refs.c index 662a9e6f9e2048..0648df2b6c7449 100644 --- a/refs.c +++ b/refs.c @@ -1307,6 +1307,7 @@ struct ref_update *ref_transaction_add_update( const char *refname, unsigned int flags, const struct object_id *new_oid, const struct object_id *old_oid, + const struct object_id *peeled, const char *new_target, const char *old_target, const char *committer_info, const char *msg) @@ -1339,6 +1340,8 @@ struct ref_update *ref_transaction_add_update( update->committer_info = xstrdup_or_null(committer_info); update->msg = normalize_reflog_message(msg); } + if (flags & REF_HAVE_PEELED) + oidcpy(&update->peeled, peeled); /* * This list is generally used by the backends to avoid duplicates. @@ -1392,6 +1395,8 @@ enum ref_transaction_error ref_transaction_update(struct ref_transaction *transa unsigned int flags, const char *msg, struct strbuf *err) { + struct object_id peeled; + assert(err); if ((flags & REF_FORCE_CREATE_REFLOG) && @@ -1432,10 +1437,16 @@ enum ref_transaction_error ref_transaction_update(struct ref_transaction *transa oid_to_hex(new_oid), refname); return REF_TRANSACTION_ERROR_INVALID_NEW_VALUE; } + + if (o->type == OBJ_TAG) { + if (!peel_object(transaction->ref_store->repo, new_oid, &peeled, + PEEL_OBJECT_VERIFY_TAGGED_OBJECT_TYPE)) + flags |= REF_HAVE_PEELED; + } } ref_transaction_add_update(transaction, refname, flags, - new_oid, old_oid, new_target, + new_oid, old_oid, &peeled, new_target, old_target, NULL, msg); return 0; @@ -1462,7 +1473,7 @@ int ref_transaction_update_reflog(struct ref_transaction *transaction, return -1; update = ref_transaction_add_update(transaction, refname, flags, - new_oid, old_oid, NULL, NULL, + new_oid, old_oid, NULL, NULL, NULL, committer_info, 
msg); update->index = index; diff --git a/refs/files-backend.c b/refs/files-backend.c index f20f580fbc9601..d0896d0e373302 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1325,7 +1325,8 @@ static void prune_ref(struct files_ref_store *refs, struct ref_to_prune *r) ref_transaction_add_update( transaction, r->name, REF_NO_DEREF | REF_HAVE_NEW | REF_HAVE_OLD | REF_IS_PRUNING, - null_oid(the_hash_algo), &r->oid, NULL, NULL, NULL, NULL); + null_oid(the_hash_algo), &r->oid, NULL, NULL, NULL, + NULL, NULL); if (ref_transaction_commit(transaction, &err)) goto cleanup; @@ -2468,7 +2469,7 @@ static enum ref_transaction_error split_head_update(struct ref_update *update, new_update = ref_transaction_add_update( transaction, "HEAD", update->flags | REF_LOG_ONLY | REF_NO_DEREF | REF_LOG_VIA_SPLIT, - &update->new_oid, &update->old_oid, + &update->new_oid, &update->old_oid, &update->peeled, NULL, NULL, update->committer_info, update->msg); new_update->parent_update = update; @@ -2530,8 +2531,8 @@ static enum ref_transaction_error split_symref_update(struct ref_update *update, transaction, referent, new_flags, update->new_target ? NULL : &update->new_oid, update->old_target ? 
NULL : &update->old_oid, - update->new_target, update->old_target, NULL, - update->msg); + &update->peeled, update->new_target, update->old_target, + NULL, update->msg); new_update->parent_update = update; @@ -2994,7 +2995,7 @@ static int files_transaction_prepare(struct ref_store *ref_store, ref_transaction_add_update( packed_transaction, update->refname, REF_HAVE_NEW | REF_NO_DEREF, - &update->new_oid, NULL, + &update->new_oid, NULL, NULL, NULL, NULL, NULL, NULL); } } @@ -3200,19 +3201,22 @@ static int files_transaction_finish_initial(struct files_ref_store *refs, if (update->flags & REF_LOG_ONLY) ref_transaction_add_update(loose_transaction, update->refname, update->flags, &update->new_oid, - &update->old_oid, NULL, NULL, + &update->old_oid, &update->peeled, + NULL, NULL, update->committer_info, update->msg); else ref_transaction_add_update(loose_transaction, update->refname, update->flags & ~REF_HAVE_OLD, update->new_target ? NULL : &update->new_oid, NULL, - update->new_target, NULL, update->committer_info, + &update->peeled, update->new_target, + NULL, update->committer_info, NULL); } else { ref_transaction_add_update(packed_transaction, update->refname, update->flags & ~REF_HAVE_OLD, &update->new_oid, &update->old_oid, - NULL, NULL, update->committer_info, NULL); + &update->peeled, NULL, NULL, + update->committer_info, NULL); } } diff --git a/refs/refs-internal.h b/refs/refs-internal.h index d103387ebf1e92..307dcb277b1f68 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -39,6 +39,13 @@ struct ref_transaction; */ #define REF_LOG_ONLY (1 << 7) +/* + * The reference contains a peeled object ID. This is used when the + * new_oid is pointing to a tag object and the reference backend + * wants to also store the peeled value for optimized retrieval. 
+ */ +#define REF_HAVE_PEELED (1 << 15) + /* * Return the length of time to retry acquiring a loose reference lock * before giving up, in milliseconds: @@ -92,6 +99,12 @@ struct ref_update { */ struct object_id old_oid; + /* + * If the new_oid points to a tag object, set this to the peeled + * object ID for optimized retrieval without needing to hit the odb. + */ + struct object_id peeled; + /* * If set, point the reference to this value. This can also be * used to convert regular references to become symbolic refs. @@ -169,6 +182,7 @@ struct ref_update *ref_transaction_add_update( const char *refname, unsigned int flags, const struct object_id *new_oid, const struct object_id *old_oid, + const struct object_id *peeled, const char *new_target, const char *old_target, const char *committer_info, const char *msg); diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index 444b0c24e56cd0..b0c010387d8d4e 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -1107,8 +1107,8 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor ref_transaction_add_update( transaction, "HEAD", u->flags | REF_LOG_ONLY | REF_NO_DEREF, - &u->new_oid, &u->old_oid, NULL, NULL, NULL, - u->msg); + &u->new_oid, &u->old_oid, &u->peeled, NULL, NULL, + NULL, u->msg); } ret = reftable_backend_read_ref(be, rewritten_ref, @@ -1194,7 +1194,7 @@ static enum ref_transaction_error prepare_single_update(struct reftable_ref_stor transaction, referent->buf, new_flags, u->new_target ? NULL : &u->new_oid, u->old_target ? NULL : &u->old_oid, - u->new_target, u->old_target, + &u->peeled, u->new_target, u->old_target, u->committer_info, u->msg); new_update->parent_update = u; From 3ab3d5077dc7048f9a0209e06f7de30971a303d7 Mon Sep 17 00:00:00 2001 From: Karthik Nayak Date: Mon, 4 May 2026 19:44:13 +0200 Subject: [PATCH 16/29] refs: use peeled tag values in reference backends The reference backends peel tag objects when storing references to them.
This provides optimized reads that avoid hitting the ODB. The previous commits ensure that the peeled value is now propagated via the generic layer, so modify the packed and reftable backends to use this value directly instead of calling `peel_object()` independently. Signed-off-by: Karthik Nayak Signed-off-by: Junio C Hamano --- refs/packed-backend.c | 6 ++---- refs/reftable-backend.c | 9 ++------- 2 files changed, 4 insertions(+), 11 deletions(-) diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 35a0f32e1cc45e..0acde48c452513 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -1531,13 +1531,11 @@ static enum ref_transaction_error write_with_updates(struct packed_ref_store *re */ i++; } else { - struct object_id peeled; - int peel_error = peel_object(refs->base.repo, &update->new_oid, - &peeled, PEEL_OBJECT_VERIFY_TAGGED_OBJECT_TYPE); + bool peeled = update->flags & REF_HAVE_PEELED; if (write_packed_entry(out, update->refname, &update->new_oid, - peel_error ? NULL : &peeled)) + peeled ?
&update->peeled : NULL)) goto write_error; i++; diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index b0c010387d8d4e..8b4ac2e6180559 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -12,7 +12,6 @@ #include "../hex.h" #include "../ident.h" #include "../iterator.h" -#include "../object.h" #include "../parse.h" #include "../path.h" #include "../refs.h" @@ -1584,17 +1583,13 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data goto done; } else if (u->flags & REF_HAVE_NEW) { struct reftable_ref_record ref = {0}; - struct object_id peeled; - int peel_error; ref.refname = (char *)u->refname; ref.update_index = ts; - peel_error = peel_object(arg->refs->base.repo, &u->new_oid, &peeled, - PEEL_OBJECT_VERIFY_TAGGED_OBJECT_TYPE); - if (!peel_error) { + if (u->flags & REF_HAVE_PEELED) { ref.value_type = REFTABLE_REF_VAL2; - memcpy(ref.value.val2.target_value, peeled.hash, GIT_MAX_RAWSZ); + memcpy(ref.value.val2.target_value, u->peeled.hash, GIT_MAX_RAWSZ); memcpy(ref.value.val2.value, u->new_oid.hash, GIT_MAX_RAWSZ); } else if (!is_null_oid(&u->new_oid)) { ref.value_type = REFTABLE_REF_VAL1; From 663d7abe07ea376c2657019a03297ae87037c993 Mon Sep 17 00:00:00 2001 From: Aliwoto Date: Tue, 5 May 2026 09:19:40 +0000 Subject: [PATCH 17/29] http: reject unsupported proxy URL schemes An explicit proxy URL with an unrecognized scheme such as htpp://127.0.0.1 is currently accepted. Git parses the URL, extracts the host part, and then passes only that host to libcurl. Because no proxy type is selected for the unknown scheme, Git leaves libcurl at its default HTTP proxy type, so the typo is silently treated as an HTTP proxy. Reject proxy URLs with explicit unsupported schemes instead of silently accepting them. Keep the existing host:port-without-scheme behavior unchanged. Implement the SOCKS proxy handling with a shared table-driven mapping. Add a regression test to cover the unsupported-scheme case. 
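The table-driven mapping described above can be sketched without libcurl. This is a minimal illustration under stated assumptions: the `sketch_*` enum and function are hypothetical stand-ins for the `CURLPROXY_*` constants and the helper in the patch, and the sketch treats `http` and `https` as supported alongside the SOCKS variants.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for libcurl's CURLPROXY_* constants. */
enum sketch_proxy_type {
	SKETCH_PROXY_HTTP,
	SKETCH_PROXY_HTTPS,
	SKETCH_PROXY_SOCKS4,
	SKETCH_PROXY_SOCKS4A,
	SKETCH_PROXY_SOCKS5,
	SKETCH_PROXY_SOCKS5H,
};

static const struct {
	const char *name;
	enum sketch_proxy_type type;
} sketch_schemes[] = {
	{ "http", SKETCH_PROXY_HTTP },
	{ "https", SKETCH_PROXY_HTTPS },
	{ "socks", SKETCH_PROXY_SOCKS4 },
	{ "socks4", SKETCH_PROXY_SOCKS4 },
	{ "socks4a", SKETCH_PROXY_SOCKS4A },
	{ "socks5", SKETCH_PROXY_SOCKS5 },
	{ "socks5h", SKETCH_PROXY_SOCKS5H },
};

/*
 * Return 0 and set *type for a supported scheme, -1 for an explicit but
 * unsupported one. A NULL scheme (bare host:port) defaults to HTTP.
 */
static int sketch_proxy_type_for(const char *scheme,
				 enum sketch_proxy_type *type)
{
	size_t i;

	if (!scheme) {
		*type = SKETCH_PROXY_HTTP;
		return 0;
	}
	for (i = 0; i < sizeof(sketch_schemes) / sizeof(sketch_schemes[0]); i++) {
		if (!strcmp(sketch_schemes[i].name, scheme)) {
			*type = sketch_schemes[i].type;
			return 0;
		}
	}
	return -1;
}
```

Exact matching with `strcmp()` is the important design choice here: the prefix checks being replaced would accept any string that merely starts with a known scheme, whereas a table lookup rejects typos such as `htpp` outright.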
Signed-off-by: Aliwoto Signed-off-by: Junio C Hamano --- http.c | 93 +++++++++++++++++++++++++++++++------------ t/t5564-http-proxy.sh | 6 +++ 2 files changed, 74 insertions(+), 25 deletions(-) diff --git a/http.c b/http.c index 67c9c6fc60673d..8e5a4d8bcf8eac 100644 --- a/http.c +++ b/http.c @@ -744,6 +744,69 @@ static int has_proxy_cert_password(void) return 1; } +static const struct socks_proxy_type { + const char *name; + long curlsym; +} socks_proxy_types[] = { + { "socks", CURLPROXY_SOCKS4 }, + { "socks4", CURLPROXY_SOCKS4 }, + { "socks4a", CURLPROXY_SOCKS4A }, + { "socks5", CURLPROXY_SOCKS5 }, + { "socks5h", CURLPROXY_SOCKS5_HOSTNAME }, +}; + +static const struct socks_proxy_type *find_socks_proxy_type(const char *protocol) +{ + int i; + + if (!protocol) + return NULL; + + for (i = 0; i < ARRAY_SIZE(socks_proxy_types); i++) { + if (!strcmp(socks_proxy_types[i].name, protocol)) + return &socks_proxy_types[i]; + } + + return NULL; +} + +static int is_socks_proxy_protocol(const char *protocol) +{ + return !!find_socks_proxy_type(protocol); +} + +static int set_curl_proxy_type(CURL *result, const char *protocol) +{ + const struct socks_proxy_type *socks_proxy_type; + + if (!protocol || !strcmp(protocol, "http")) + return 0; + + socks_proxy_type = find_socks_proxy_type(protocol); + if (socks_proxy_type) { + curl_easy_setopt(result, CURLOPT_PROXYTYPE, socks_proxy_type->curlsym); + return 0; + } + + if (!strcmp(protocol, "https")) { + curl_easy_setopt(result, CURLOPT_PROXYTYPE, (long)CURLPROXY_HTTPS); + + if (http_proxy_ssl_cert) + curl_easy_setopt(result, CURLOPT_PROXY_SSLCERT, + http_proxy_ssl_cert); + + if (http_proxy_ssl_key) + curl_easy_setopt(result, CURLOPT_PROXY_SSLKEY, + http_proxy_ssl_key); + + if (has_proxy_cert_password()) + curl_easy_setopt(result, CURLOPT_PROXY_KEYPASSWD, + proxy_cert_auth.password); + } + + return -1; +} + /* Return 1 if redactions have been made, 0 otherwise. 
*/ static int redact_sensitive_header(struct strbuf *header, size_t offset) { @@ -1214,30 +1277,6 @@ static CURL *get_curl_handle(void) } else if (curl_http_proxy) { struct strbuf proxy = STRBUF_INIT; - if (starts_with(curl_http_proxy, "socks5h")) - curl_easy_setopt(result, - CURLOPT_PROXYTYPE, (long)CURLPROXY_SOCKS5_HOSTNAME); - else if (starts_with(curl_http_proxy, "socks5")) - curl_easy_setopt(result, - CURLOPT_PROXYTYPE, (long)CURLPROXY_SOCKS5); - else if (starts_with(curl_http_proxy, "socks4a")) - curl_easy_setopt(result, - CURLOPT_PROXYTYPE, (long)CURLPROXY_SOCKS4A); - else if (starts_with(curl_http_proxy, "socks")) - curl_easy_setopt(result, - CURLOPT_PROXYTYPE, (long)CURLPROXY_SOCKS4); - else if (starts_with(curl_http_proxy, "https")) { - curl_easy_setopt(result, CURLOPT_PROXYTYPE, (long)CURLPROXY_HTTPS); - - if (http_proxy_ssl_cert) - curl_easy_setopt(result, CURLOPT_PROXY_SSLCERT, http_proxy_ssl_cert); - - if (http_proxy_ssl_key) - curl_easy_setopt(result, CURLOPT_PROXY_SSLKEY, http_proxy_ssl_key); - - if (has_proxy_cert_password()) - curl_easy_setopt(result, CURLOPT_PROXY_KEYPASSWD, proxy_cert_auth.password); - } if (strstr(curl_http_proxy, "://")) credential_from_url(&proxy_auth, curl_http_proxy); else { @@ -1247,6 +1286,10 @@ static CURL *get_curl_handle(void) strbuf_release(&url); } + if (set_curl_proxy_type(result, proxy_auth.protocol) < 0) + die("Invalid proxy URL '%s': unsupported proxy scheme '%s'", + curl_http_proxy, proxy_auth.protocol); + if (!proxy_auth.host) die("Invalid proxy URL '%s'", curl_http_proxy); @@ -1257,7 +1300,7 @@ static CURL *get_curl_handle(void) if (ver->version_num < 0x075400) die("libcurl 7.84 or later is required to support paths in proxy URLs"); - if (!starts_with(proxy_auth.protocol, "socks")) + if (!is_socks_proxy_protocol(proxy_auth.protocol)) die("Invalid proxy URL '%s': only SOCKS proxies support paths", curl_http_proxy); diff --git a/t/t5564-http-proxy.sh b/t/t5564-http-proxy.sh index 3bcbdef409b25f..5669ce37d805a4 
100755 --- a/t/t5564-http-proxy.sh +++ b/t/t5564-http-proxy.sh @@ -95,4 +95,10 @@ test_expect_success 'Unix socket requires localhost' - <<\EOT } EOT +test_expect_success 'unknown proxy scheme is rejected' ' + test_must_fail git clone -c http.proxy=htpp://127.0.0.1 \ + https://example.com/repo.git 2>err && + test_grep "unsupported proxy scheme '\''htpp'\''" err +' + test_done From a8f96968a96d5b0a90118402e81742d26c8347cb Mon Sep 17 00:00:00 2001 From: Matheus Afonso Martins Moreira Date: Sat, 2 May 2026 05:28:35 +0000 Subject: [PATCH 18/29] connect: rename enum protocol to url_scheme MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit RFC 1738 names the part of a URL before the colon a "scheme". connect.c calls it "protocol", which is more generic and collides with the unrelated enum protocol_version. Rename: enum protocol -> enum url_scheme PROTO_* -> URL_SCHEME_* prot_name -> url_scheme_name get_protocol -> url_get_scheme The local variables in parse_connect_url and git_connect are renamed accordingly, from protocol to scheme. No behavior change. The user-visible diagnostics and translated error messages are preserved: "Diag: protocol=..." "protocol '%s' is not supported" "unknown protocol" This rename also prepares for moving the scheme-detection functions to a shared header so that a future plumbing command can parse URLs using the same logic as the connect path. 
Suggested-by: Torsten Bögershausen Signed-off-by: Matheus Afonso Martins Moreira Signed-off-by: Junio C Hamano --- connect.c | 68 +++++++++++++++++++++++++++---------------------------- 1 file changed, 34 insertions(+), 34 deletions(-) diff --git a/connect.c b/connect.c index fcd35c5539a76e..46da89905ee7c6 100644 --- a/connect.c +++ b/connect.c @@ -700,11 +700,11 @@ int server_supports(const char *feature) return !!server_feature_value(feature, NULL); } -enum protocol { - PROTO_LOCAL = 1, - PROTO_FILE, - PROTO_SSH, - PROTO_GIT +enum url_scheme { + URL_SCHEME_LOCAL = 1, + URL_SCHEME_FILE, + URL_SCHEME_SSH, + URL_SCHEME_GIT }; int url_is_local_not_ssh(const char *url) @@ -715,33 +715,33 @@ int url_is_local_not_ssh(const char *url) (has_dos_drive_prefix(url) && is_valid_path(url)); } -static const char *prot_name(enum protocol protocol) +static const char *url_scheme_name(enum url_scheme scheme) { - switch (protocol) { - case PROTO_LOCAL: - case PROTO_FILE: + switch (scheme) { + case URL_SCHEME_LOCAL: + case URL_SCHEME_FILE: return "file"; - case PROTO_SSH: + case URL_SCHEME_SSH: return "ssh"; - case PROTO_GIT: + case URL_SCHEME_GIT: return "git"; default: return "unknown protocol"; } } -static enum protocol get_protocol(const char *name) +static enum url_scheme url_get_scheme(const char *name) { if (!strcmp(name, "ssh")) - return PROTO_SSH; + return URL_SCHEME_SSH; if (!strcmp(name, "git")) - return PROTO_GIT; + return URL_SCHEME_GIT; if (!strcmp(name, "git+ssh")) /* deprecated - do not use */ - return PROTO_SSH; + return URL_SCHEME_SSH; if (!strcmp(name, "ssh+git")) /* deprecated - do not use */ - return PROTO_SSH; + return URL_SCHEME_SSH; if (!strcmp(name, "file")) - return PROTO_FILE; + return URL_SCHEME_FILE; die(_("protocol '%s' is not supported"), name); } @@ -1083,14 +1083,14 @@ static char *get_port(char *host) * Extract protocol and relevant parts from the specified connection URL. * The caller must free() the returned strings. 
*/ -static enum protocol parse_connect_url(const char *url_orig, char **ret_host, - char **ret_path) +static enum url_scheme parse_connect_url(const char *url_orig, char **ret_host, + char **ret_path) { char *url; char *host, *path; char *end; int separator = '/'; - enum protocol protocol = PROTO_LOCAL; + enum url_scheme scheme = URL_SCHEME_LOCAL; if (is_url(url_orig)) url = url_decode(url_orig); @@ -1100,12 +1100,12 @@ static enum protocol parse_connect_url(const char *url_orig, char **ret_host, host = strstr(url, "://"); if (host) { *host = '\0'; - protocol = get_protocol(url); + scheme = url_get_scheme(url); host += 3; } else { host = url; if (!url_is_local_not_ssh(url)) { - protocol = PROTO_SSH; + scheme = URL_SCHEME_SSH; separator = ':'; } } @@ -1116,13 +1116,13 @@ static enum protocol parse_connect_url(const char *url_orig, char **ret_host, */ end = host_end(&host, 0); - if (protocol == PROTO_LOCAL) + if (scheme == URL_SCHEME_LOCAL) path = end; - else if (protocol == PROTO_FILE && *host != '/' && + else if (scheme == URL_SCHEME_FILE && *host != '/' && !has_dos_drive_prefix(host) && offset_1st_component(host - 2) > 1) path = host - 2; /* include the leading "//" */ - else if (protocol == PROTO_FILE && has_dos_drive_prefix(end)) + else if (scheme == URL_SCHEME_FILE && has_dos_drive_prefix(end)) path = end; /* "file://$(pwd)" may be "file://C:/projects/repo" */ else path = strchr(end, separator); @@ -1138,7 +1138,7 @@ static enum protocol parse_connect_url(const char *url_orig, char **ret_host, end = path; /* Need to \0 terminate host here */ if (separator == ':') path++; /* path starts after ':' */ - if (protocol == PROTO_GIT || protocol == PROTO_SSH) { + if (scheme == URL_SCHEME_GIT || scheme == URL_SCHEME_SSH) { if (path[1] == '~') path++; } @@ -1149,7 +1149,7 @@ static enum protocol parse_connect_url(const char *url_orig, char **ret_host, *ret_host = xstrdup(host); *ret_path = path; free(url); - return protocol; + return scheme; } static const char 
*get_ssh_command(void) @@ -1434,7 +1434,7 @@ struct child_process *git_connect(int fd[2], const char *url, { char *hostandport, *path; struct child_process *conn; - enum protocol protocol; + enum url_scheme scheme; enum protocol_version version = get_protocol_version_config(); /* @@ -1451,14 +1451,14 @@ struct child_process *git_connect(int fd[2], const char *url, */ signal(SIGCHLD, SIG_DFL); - protocol = parse_connect_url(url, &hostandport, &path); - if ((flags & CONNECT_DIAG_URL) && (protocol != PROTO_SSH)) { + scheme = parse_connect_url(url, &hostandport, &path); + if ((flags & CONNECT_DIAG_URL) && (scheme != URL_SCHEME_SSH)) { printf("Diag: url=%s\n", url ? url : "NULL"); - printf("Diag: protocol=%s\n", prot_name(protocol)); + printf("Diag: protocol=%s\n", url_scheme_name(scheme)); printf("Diag: hostandport=%s\n", hostandport ? hostandport : "NULL"); printf("Diag: path=%s\n", path ? path : "NULL"); conn = NULL; - } else if (protocol == PROTO_GIT) { + } else if (scheme == URL_SCHEME_GIT) { conn = git_connect_git(fd, hostandport, path, prog, version, flags); conn->trace2_child_class = "transport/git"; } else { @@ -1481,7 +1481,7 @@ struct child_process *git_connect(int fd[2], const char *url, conn->use_shell = 1; conn->in = conn->out = -1; - if (protocol == PROTO_SSH) { + if (scheme == URL_SCHEME_SSH) { char *ssh_host = hostandport; const char *port = NULL; transport_check_allowed("ssh"); @@ -1492,7 +1492,7 @@ struct child_process *git_connect(int fd[2], const char *url, if (flags & CONNECT_DIAG_URL) { printf("Diag: url=%s\n", url ? url : "NULL"); - printf("Diag: protocol=%s\n", prot_name(protocol)); + printf("Diag: protocol=%s\n", url_scheme_name(scheme)); printf("Diag: userandhost=%s\n", ssh_host ? ssh_host : "NULL"); printf("Diag: port=%s\n", port ? port : "NONE"); printf("Diag: path=%s\n", path ? 
path : "NULL"); From 51fcf73014f542f074a253add5867c24c82c854f Mon Sep 17 00:00:00 2001 From: Matheus Afonso Martins Moreira Date: Sat, 2 May 2026 05:28:36 +0000 Subject: [PATCH 19/29] url: move url_is_local_not_ssh to url.h Move url_is_local_not_ssh from connect.c/connect.h to url.c/url.h so that the new url_parse function in urlmatch.c, and any future code that needs to distinguish a local path from an scp style SSH URL, can reuse the heuristic without depending on connect.c. No behavior change. Signed-off-by: Matheus Afonso Martins Moreira Signed-off-by: Junio C Hamano --- connect.c | 8 -------- connect.h | 1 - remote.c | 1 + url.c | 8 ++++++++ url.h | 2 ++ 5 files changed, 11 insertions(+), 9 deletions(-) diff --git a/connect.c b/connect.c index 46da89905ee7c6..cb145de30e502c 100644 --- a/connect.c +++ b/connect.c @@ -707,14 +707,6 @@ enum url_scheme { URL_SCHEME_GIT }; -int url_is_local_not_ssh(const char *url) -{ - const char *colon = strchr(url, ':'); - const char *slash = strchr(url, '/'); - return !colon || (slash && slash < colon) || - (has_dos_drive_prefix(url) && is_valid_path(url)); -} - static const char *url_scheme_name(enum url_scheme scheme) { switch (scheme) { diff --git a/connect.h b/connect.h index 1645126c17f889..8d84f6656b1a2a 100644 --- a/connect.h +++ b/connect.h @@ -13,7 +13,6 @@ int git_connection_is_socket(struct child_process *conn); int server_supports(const char *feature); int parse_feature_request(const char *features, const char *feature); const char *server_feature_value(const char *feature, size_t *len_ret); -int url_is_local_not_ssh(const char *url); struct packet_reader; enum protocol_version discover_version(struct packet_reader *reader); diff --git a/remote.c b/remote.c index a664cd166aa3b9..24a8118d250c72 100644 --- a/remote.c +++ b/remote.c @@ -8,6 +8,7 @@ #include "gettext.h" #include "hex.h" #include "remote.h" +#include "url.h" #include "urlmatch.h" #include "refs.h" #include "refspec.h" diff --git a/url.c b/url.c index 
3ca5987e905d59..057576042af1be 100644 --- a/url.c +++ b/url.c @@ -132,3 +132,11 @@ void str_end_url_with_slash(const char *url, char **dest) free(*dest); *dest = strbuf_detach(&buf, NULL); } + +int url_is_local_not_ssh(const char *url) +{ + const char *colon = strchr(url, ':'); + const char *slash = strchr(url, '/'); + return !colon || (slash && slash < colon) || + (has_dos_drive_prefix(url) && is_valid_path(url)); +} diff --git a/url.h b/url.h index cd9140e9946b16..39d621312ffdcb 100644 --- a/url.h +++ b/url.h @@ -21,6 +21,8 @@ char *url_decode_parameter_value(const char **query); void end_url_with_slash(struct strbuf *buf, const char *url); void str_end_url_with_slash(const char *url, char **dest); +int url_is_local_not_ssh(const char *url); + /* * The set of unreserved characters as per STD66 (RFC3986) is * '[A-Za-z0-9-._~]'. These characters are safe to appear in URI From d48e36a8a23d931e869fbb3156fc95a5732cb061 Mon Sep 17 00:00:00 2001 From: Matheus Afonso Martins Moreira Date: Sat, 2 May 2026 05:28:37 +0000 Subject: [PATCH 20/29] url: move scheme detection to URL header/source Move enum url_scheme and url_get_scheme() from connect.c to url.h and url.c so that other code can identify a URL's scheme without depending on connect.c. No behavior change. url_get_scheme() still dies on an unrecognized scheme name, with the same translated message as before. url_scheme_name() stays in connect.c because it has no other callers.
Signed-off-by: Matheus Afonso Martins Moreira Signed-off-by: Junio C Hamano --- connect.c | 22 ---------------------- url.c | 16 ++++++++++++++++ url.h | 13 +++++++++++++ 3 files changed, 29 insertions(+), 22 deletions(-) diff --git a/connect.c b/connect.c index cb145de30e502c..1ac7acc6e881db 100644 --- a/connect.c +++ b/connect.c @@ -700,13 +700,6 @@ int server_supports(const char *feature) return !!server_feature_value(feature, NULL); } -enum url_scheme { - URL_SCHEME_LOCAL = 1, - URL_SCHEME_FILE, - URL_SCHEME_SSH, - URL_SCHEME_GIT -}; - static const char *url_scheme_name(enum url_scheme scheme) { switch (scheme) { @@ -722,21 +715,6 @@ static const char *url_scheme_name(enum url_scheme scheme) } } -static enum url_scheme url_get_scheme(const char *name) -{ - if (!strcmp(name, "ssh")) - return URL_SCHEME_SSH; - if (!strcmp(name, "git")) - return URL_SCHEME_GIT; - if (!strcmp(name, "git+ssh")) /* deprecated - do not use */ - return URL_SCHEME_SSH; - if (!strcmp(name, "ssh+git")) /* deprecated - do not use */ - return URL_SCHEME_SSH; - if (!strcmp(name, "file")) - return URL_SCHEME_FILE; - die(_("protocol '%s' is not supported"), name); -} - static char *host_end(char **hoststart, int removebrackets) { char *host = *hoststart; diff --git a/url.c b/url.c index 057576042af1be..300acf98feae03 100644 --- a/url.c +++ b/url.c @@ -1,4 +1,5 @@ #include "git-compat-util.h" +#include "gettext.h" #include "hex-ll.h" #include "strbuf.h" #include "url.h" @@ -140,3 +141,18 @@ int url_is_local_not_ssh(const char *url) return !colon || (slash && slash < colon) || (has_dos_drive_prefix(url) && is_valid_path(url)); } + +enum url_scheme url_get_scheme(const char *name) +{ + if (!strcmp(name, "ssh")) + return URL_SCHEME_SSH; + if (!strcmp(name, "git")) + return URL_SCHEME_GIT; + if (!strcmp(name, "git+ssh")) /* deprecated - do not use */ + return URL_SCHEME_SSH; + if (!strcmp(name, "ssh+git")) /* deprecated - do not use */ + return URL_SCHEME_SSH; + if (!strcmp(name, "file")) + return 
URL_SCHEME_FILE; + die(_("protocol '%s' is not supported"), name); +} diff --git a/url.h b/url.h index 39d621312ffdcb..24c8cd91d0f02e 100644 --- a/url.h +++ b/url.h @@ -23,6 +23,19 @@ void str_end_url_with_slash(const char *url, char **dest); int url_is_local_not_ssh(const char *url); +enum url_scheme { + URL_SCHEME_LOCAL = 1, + URL_SCHEME_FILE, + URL_SCHEME_SSH, + URL_SCHEME_GIT, +}; + +/* + * Identify the URL scheme by name. Dies if the name does not match + * any scheme that Git knows about. + */ +enum url_scheme url_get_scheme(const char *name); + /* * The set of unreserved characters as per STD66 (RFC3986) is * '[A-Za-z0-9-._~]'. These characters are safe to appear in URI From 46d6fb752e7d8550a3511eb370536d216ddb5b8f Mon Sep 17 00:00:00 2001 From: Matheus Afonso Martins Moreira Date: Sat, 2 May 2026 05:28:38 +0000 Subject: [PATCH 21/29] url: return URL_SCHEME_UNKNOWN instead of dying Enumerate a URL_SCHEME_UNKNOWN result with value 0. Have url_get_scheme() return it for unrecognized schemes instead of calling die() itself. Move the die() call to parse_connect_url() where url_get_scheme() is used. This lets url_get_scheme() be used from contexts that need to identify a URL's scheme without aborting the program. For example, a future plumbing command that validates URLs. No external behavior change. parse_connect_url() still dies with the same translated message for unrecognized schemes. 
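For illustration, here is a minimal standalone sketch of a caller that benefits from the non-dying behavior. The enum and url_get_scheme() are copied from this patch so the snippet compiles on its own; check_scheme() is a hypothetical helper and not part of this series.

```c
#include <stdio.h>
#include <string.h>

/* Mirrors the enum as it reads in url.h after this patch. */
enum url_scheme {
	URL_SCHEME_UNKNOWN = 0,
	URL_SCHEME_LOCAL,
	URL_SCHEME_FILE,
	URL_SCHEME_SSH,
	URL_SCHEME_GIT,
};

/* Standalone copy of url_get_scheme() as it reads after this patch. */
enum url_scheme url_get_scheme(const char *name)
{
	if (!strcmp(name, "ssh"))
		return URL_SCHEME_SSH;
	if (!strcmp(name, "git"))
		return URL_SCHEME_GIT;
	if (!strcmp(name, "git+ssh")) /* deprecated - do not use */
		return URL_SCHEME_SSH;
	if (!strcmp(name, "ssh+git")) /* deprecated - do not use */
		return URL_SCHEME_SSH;
	if (!strcmp(name, "file"))
		return URL_SCHEME_FILE;
	return URL_SCHEME_UNKNOWN;
}

/*
 * Hypothetical caller (an assumption for illustration): warn about
 * an unknown scheme and let the program continue, instead of die().
 * Returns 0 for a known scheme and -1 otherwise.
 */
int check_scheme(const char *name)
{
	if (url_get_scheme(name) == URL_SCHEME_UNKNOWN) {
		fprintf(stderr, "warning: unknown scheme '%s'\n", name);
		return -1;
	}
	return 0;
}
```

A validator built this way can inspect several inputs in one run and report every failure, which was not possible while url_get_scheme() aborted on the first unknown name.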
Signed-off-by: Matheus Afonso Martins Moreira Signed-off-by: Junio C Hamano --- connect.c | 2 ++ url.c | 3 +-- url.h | 7 ++++--- 3 files changed, 7 insertions(+), 5 deletions(-) diff --git a/connect.c b/connect.c index 1ac7acc6e881db..73d7a6b8d03afe 100644 --- a/connect.c +++ b/connect.c @@ -1071,6 +1071,8 @@ static enum url_scheme parse_connect_url(const char *url_orig, char **ret_host, if (host) { *host = '\0'; scheme = url_get_scheme(url); + if (scheme == URL_SCHEME_UNKNOWN) + die(_("protocol '%s' is not supported"), url); host += 3; } else { host = url; diff --git a/url.c b/url.c index 300acf98feae03..a59818278f49df 100644 --- a/url.c +++ b/url.c @@ -1,5 +1,4 @@ #include "git-compat-util.h" -#include "gettext.h" #include "hex-ll.h" #include "strbuf.h" #include "url.h" @@ -154,5 +153,5 @@ enum url_scheme url_get_scheme(const char *name) return URL_SCHEME_SSH; if (!strcmp(name, "file")) return URL_SCHEME_FILE; - die(_("protocol '%s' is not supported"), name); + return URL_SCHEME_UNKNOWN; } diff --git a/url.h b/url.h index 24c8cd91d0f02e..728952360586a9 100644 --- a/url.h +++ b/url.h @@ -24,15 +24,16 @@ void str_end_url_with_slash(const char *url, char **dest); int url_is_local_not_ssh(const char *url); enum url_scheme { - URL_SCHEME_LOCAL = 1, + URL_SCHEME_UNKNOWN = 0, + URL_SCHEME_LOCAL, URL_SCHEME_FILE, URL_SCHEME_SSH, URL_SCHEME_GIT, }; /* - * Identify the URL scheme by name. Dies if the name does not match - * any scheme that Git knows about. + * Identify the URL scheme by name. Returns URL_SCHEME_UNKNOWN + * if the name does not match any scheme that Git knows about. */ enum url_scheme url_get_scheme(const char *name); From 18a828171243b630bc7585c7bc8d85bb37125c01 Mon Sep 17 00:00:00 2001 From: Matheus Afonso Martins Moreira Date: Sat, 2 May 2026 05:28:39 +0000 Subject: [PATCH 22/29] urlmatch: define url_parse function Define url_parse, a general parsing function that supports all Git URLs including scp style URLs such as hostname:~user/repo. 
It is adapted from the algorithm in connect.c's parse_connect_url and reuses the shared enum url_scheme and url_get_scheme function that previous commits made available in url.h. The new parser and the connect path agree on scheme classification. url_parse has the same interface as url_normalize and uses the same data structures. Both functions accept the same URL forms with one deliberate exception. Bare local paths such as "/abs/path", "./rel" or "repo" are accepted by parse_connect_url as URL_SCHEME_LOCAL, but rejected by url_parse because url_normalize requires a URL with a scheme://host form. A consumer that wants to handle both URLs and local paths needs to dispatch on url_is_local_not_ssh before calling url_parse, just as the connect path does internally. The duplication with parse_connect_url is intentional. The two functions have different contracts: - parse_connect_url Calls die() on an unknown scheme and returns NUL-terminated host/path strings for the connect path - url_parse Returns NULL on failure while populating out_info->err, and exposes components as offset/length pairs into the normalized URL buffer, matching url_normalize. Reconciling both is possible, but not in the scope of the current patch set. 
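The dispatch described above might look roughly like the following standalone sketch. is_local_not_ssh() is a simplified copy of url_is_local_not_ssh() that leaves out the DOS drive-prefix clause (assumed here to matter only on platforms with drive letters); enum input_kind and classify_input() are hypothetical names used for illustration and are not part of this series.

```c
#include <string.h>

/*
 * Simplified copy of the url_is_local_not_ssh() heuristic: an input
 * is a local path if it has no ':' at all, or if a '/' appears
 * before the first ':'. The DOS drive-prefix clause is omitted.
 */
int is_local_not_ssh(const char *url)
{
	const char *colon = strchr(url, ':');
	const char *slash = strchr(url, '/');

	return !colon || (slash && slash < colon);
}

/* Hypothetical classification of a command-line argument. */
enum input_kind {
	INPUT_LOCAL_PATH, /* treat as a filesystem path */
	INPUT_URL,        /* scheme://... or scp style; pass to url_parse() */
};

/* Decide which code path to take before url_parse() is called. */
enum input_kind classify_input(const char *arg)
{
	return is_local_not_ssh(arg) ? INPUT_LOCAL_PATH : INPUT_URL;
}
```

With this split, "/abs/path", "./rel" and "repo" are routed to the local-path branch, while "host:path" and "ssh://host/repo" are handed to the URL parser.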
Signed-off-by: Matheus Afonso Martins Moreira Signed-off-by: Junio C Hamano --- t/unit-tests/u-urlmatch-normalization.c | 45 +++++++++ urlmatch.c | 127 ++++++++++++++++++++++++ urlmatch.h | 1 + 3 files changed, 173 insertions(+) diff --git a/t/unit-tests/u-urlmatch-normalization.c b/t/unit-tests/u-urlmatch-normalization.c index 39f6e1ba26f3e5..3595d893a2b5e6 100644 --- a/t/unit-tests/u-urlmatch-normalization.c +++ b/t/unit-tests/u-urlmatch-normalization.c @@ -245,3 +245,48 @@ void test_urlmatch_normalization__equivalents(void) compare_normalized_urls("https://@x.y/^/../abc", "httpS://@x.y:0443/abc", 1); compare_normalized_urls("https://@x.y/^/..", "httpS://@x.y:0443/", 1); } + +static void check_parsed_path(const char *url, const char *expected_path) +{ + struct url_info info; + char *parsed = url_parse(url, &info); + char *path; + + cl_assert(parsed != NULL); + path = xstrndup(parsed + info.path_off, info.path_len); + cl_assert_equal_s(path, expected_path); + free(path); + free(parsed); +} + +void test_urlmatch_normalization__parse_scp(void) +{ + check_parsed_path("host:path", "/path"); + check_parsed_path("user@host:path", "/path"); + check_parsed_path("host:~user/repo", "~user/repo"); + check_parsed_path("user@host:~user/repo", "~user/repo"); + check_parsed_path("[host]:src", "/src"); + check_parsed_path("[host:123]:src", "/src"); + check_parsed_path("[::1]:repo", "/repo"); + check_parsed_path("user@[::1]:repo", "/repo"); +} + +void test_urlmatch_normalization__parse_url_form(void) +{ + check_parsed_path("ssh://host/repo", "/repo"); + check_parsed_path("ssh://host/~user/repo", "~user/repo"); + check_parsed_path("git://host:9418/repo", "/repo"); + check_parsed_path("git://host/~user/repo", "~user/repo"); + check_parsed_path("ssh://[::1]:1234/repo", "/repo"); + check_parsed_path("http://[2001:db8::1]/repo", "/repo"); +} + +void test_urlmatch_normalization__parse_strips_query_and_fragment(void) +{ + check_parsed_path("ssh://host/~user/repo?q", "~user/repo"); + 
check_parsed_path("ssh://host/~user/repo#frag", "~user/repo"); + check_parsed_path("git://host/~user/repo?q", "~user/repo"); + check_parsed_path("user@host:~user/repo?q", "~user/repo"); + check_parsed_path("https://host/repo?q", "/repo"); + check_parsed_path("https://host/repo#frag", "/repo"); +} diff --git a/urlmatch.c b/urlmatch.c index eea8300489d79b..bf8cce6de9d8da 100644 --- a/urlmatch.c +++ b/urlmatch.c @@ -5,6 +5,7 @@ #include "hex-ll.h" #include "strbuf.h" #include "urlmatch.h" +#include "url.h" #define URL_ALPHA "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" #define URL_DIGIT "0123456789" @@ -440,6 +441,132 @@ char *url_normalize(const char *url, struct url_info *out_info) return url_normalize_1(url, out_info, 0); } +char *url_parse(const char *url_orig, struct url_info *out_info) +{ + struct strbuf url; + char *host, *separator; + char *detached, *normalized; + char *url_decoded; + enum url_scheme scheme = URL_SCHEME_LOCAL; + struct url_info local_info; + struct url_info *info = out_info ? out_info : &local_info; + bool scp_syntax = false; + + if (is_url(url_orig)) + url_decoded = url_decode(url_orig); + else + url_decoded = xstrdup(url_orig); + + strbuf_init(&url, strlen(url_decoded) + sizeof("ssh://")); + strbuf_addstr(&url, url_decoded); + free(url_decoded); + + host = strstr(url.buf, "://"); + if (host) { + /* + * Temporarily NUL-terminate the scheme name + * so we can pass it to url_get_scheme(), + * then restore the ':' so the buffer + * is intact for url_normalize() below. + */ + char saved = *host; + *host = '\0'; + scheme = url_get_scheme(url.buf); + *host = saved; + host += 3; + } else { + if (!url_is_local_not_ssh(url.buf)) { + scp_syntax = true; + scheme = URL_SCHEME_SSH; + strbuf_insertstr(&url, 0, "ssh://"); + host = url.buf + strlen("ssh://"); + } + } + + /* + * Path starts after ':' in scp style SSH URLs. + * + * The host portion can begin with an optional "user@", + * and the host itself can be wrapped in '[' ']' brackets. 
+ * The bracket form is git's legacy way of supporting: + * + * - IPv6 literals: [::1]:repo + * - host:port pairs in the short form: [myhost:123]:src + * - Plain hostnames that happen to need bracketing: [host]:path + * + * Treat '[' followed by 0 or 1 inner colons as the host:port + * or plain hostname form and strip the brackets so url_normalize + * sees host[:port] natively. Two or more inner colons mark an + * IPv6 literal: keep the brackets for url_normalize to recognize. + * + * The scp path separator is the ':' that follows the host part, + * and we must skip over user@ and any '[...]' before searching. + */ + if (scp_syntax) { + char *user_at; + char *host_start; + char *bracket_end; + + user_at = strchr(host, '@'); + host_start = user_at ? user_at + 1 : host; + + if (*host_start == '[') { + char *p; + int inner_colons; + + bracket_end = strchr(host_start, ']'); + inner_colons = 0; + for (p = host_start + 1; bracket_end && p < bracket_end; p++) + if (*p == ':') + inner_colons++; + + if (bracket_end && inner_colons <= 1) { + size_t close_off = bracket_end - url.buf; + size_t open_off = host_start - url.buf; + strbuf_remove(&url, close_off, 1); + strbuf_remove(&url, open_off, 1); + separator = url.buf + close_off - 1; + } else if (bracket_end) { + separator = strchr(bracket_end + 1, ':'); + } else { + separator = strchr(host_start, ':'); + } + } else { + separator = strchr(host_start, ':'); + } + + if (separator) { + if (separator[1] == '/') + strbuf_remove(&url, separator - url.buf, 1); + else + *separator = '/'; + } + } + + detached = strbuf_detach(&url, NULL); + normalized = url_normalize(detached, info); + free(detached); + + if (!normalized) + return NULL; + + /* + * Point path to ~ for URLs like this: + * + * ssh://host.xz/~user/repo + * git://host.xz/~user/repo + * host.xz:~user/repo + */ + if (scheme == URL_SCHEME_GIT || scheme == URL_SCHEME_SSH) { + if (normalized[info->path_off + 1] == '~') { + info->path_off++; + info->path_len--; + } + } + + 
return normalized; +} + static size_t url_match_prefix(const char *url, const char *url_prefix, size_t url_prefix_len) diff --git a/urlmatch.h b/urlmatch.h index 5ba85cea1396dd..6b3ce428582da3 100644 --- a/urlmatch.h +++ b/urlmatch.h @@ -35,6 +35,7 @@ struct url_info { }; char *url_normalize(const char *, struct url_info *); +char *url_parse(const char *, struct url_info *); struct urlmatch_item { size_t hostmatch_len; From 533eb14798d0e4e288401b90d4684730a3ed9266 Mon Sep 17 00:00:00 2001 From: Matheus Afonso Martins Moreira Date: Sat, 2 May 2026 05:28:40 +0000 Subject: [PATCH 23/29] builtin: create url-parse command Git commands can accept a rather wide variety of URL syntaxes. The range of accepted inputs might expand even more in the future. This makes the parsing of URL components difficult since standard URL parsers cannot be used. Extracting the components of a git URL would require implementing all the schemes that git itself supports, not to mention tracking its development continuously in case new URL schemes are added. The url-parse builtin command is designed to solve this problem by exposing git's native URL parsing facilities as a plumbing command. Other programs can then call upon git itself to parse the git URLs and extract their components. This should be quite useful for scripts.
Signed-off-by: Matheus Afonso Martins Moreira Signed-off-by: Junio C Hamano --- .gitignore | 1 + Makefile | 1 + builtin.h | 1 + builtin/url-parse.c | 135 ++++++++++++++++++++++++++++++++++++++++++++ command-list.txt | 1 + git.c | 1 + meson.build | 1 + 7 files changed, 141 insertions(+) create mode 100644 builtin/url-parse.c diff --git a/.gitignore b/.gitignore index 24635cf2d6f4a3..c5673daa6eb672 100644 --- a/.gitignore +++ b/.gitignore @@ -182,6 +182,7 @@ /git-update-server-info /git-upload-archive /git-upload-pack +/git-url-parse /git-var /git-verify-commit /git-verify-pack diff --git a/Makefile b/Makefile index cedc234173e377..1c757a1aa0bf97 100644 --- a/Makefile +++ b/Makefile @@ -1497,6 +1497,7 @@ BUILTIN_OBJS += builtin/update-ref.o BUILTIN_OBJS += builtin/update-server-info.o BUILTIN_OBJS += builtin/upload-archive.o BUILTIN_OBJS += builtin/upload-pack.o +BUILTIN_OBJS += builtin/url-parse.o BUILTIN_OBJS += builtin/var.o BUILTIN_OBJS += builtin/verify-commit.o BUILTIN_OBJS += builtin/verify-pack.o diff --git a/builtin.h b/builtin.h index 235c51f30e5380..c6f767299108cf 100644 --- a/builtin.h +++ b/builtin.h @@ -271,6 +271,7 @@ int cmd_update_server_info(int argc, const char **argv, const char *prefix, stru int cmd_upload_archive(int argc, const char **argv, const char *prefix, struct repository *repo); int cmd_upload_archive_writer(int argc, const char **argv, const char *prefix, struct repository *repo); int cmd_upload_pack(int argc, const char **argv, const char *prefix, struct repository *repo); +int cmd_url_parse(int argc, const char **argv, const char *prefix, struct repository *repo); int cmd_var(int argc, const char **argv, const char *prefix, struct repository *repo); int cmd_verify_commit(int argc, const char **argv, const char *prefix, struct repository *repo); int cmd_verify_tag(int argc, const char **argv, const char *prefix, struct repository *repo); diff --git a/builtin/url-parse.c b/builtin/url-parse.c new file mode 100644 index 
00000000000000..7e705538c04d3d --- /dev/null +++ b/builtin/url-parse.c @@ -0,0 +1,135 @@ +#include "builtin.h" +#include "gettext.h" +#include "parse-options.h" +#include "url.h" +#include "urlmatch.h" + +static const char * const builtin_url_parse_usage[] = { + N_("git url-parse [-c <component>] [--] ..."), + NULL +}; + +static char *component_arg; + +static struct option builtin_url_parse_options[] = { + OPT_STRING('c', "component", &component_arg, N_("component"), + N_("which URL component to extract")), + OPT_END(), +}; + +enum url_component { + URL_NONE = 0, + URL_SCHEME, + URL_USER, + URL_PASSWORD, + URL_HOST, + URL_PORT, + URL_PATH, +}; + +static void parse_or_die(const char *url, struct url_info *info) +{ + if (url_is_local_not_ssh(url)) { + if (*url == '/') + die("'%s' is not a URL; if you meant a local " + "repository, use 'file://%s'", url, url); + if (has_dos_drive_prefix(url)) + die("'%s' is not a URL; if you meant a local " + "repository, use 'file:///%s'", url, url); + die("'%s' is not a URL; if you meant a local repository, " + "use a 'file://' URL with an absolute path", url); + } + if (!url_parse(url, info)) + die("invalid git URL '%s': %s", url, info->err); +} + +static enum url_component get_component_or_die(const char *arg) +{ + if (!strcmp("path", arg)) + return URL_PATH; + if (!strcmp("host", arg)) + return URL_HOST; + if (!strcmp("scheme", arg)) + return URL_SCHEME; + if (!strcmp("user", arg)) + return URL_USER; + if (!strcmp("password", arg)) + return URL_PASSWORD; + if (!strcmp("port", arg)) + return URL_PORT; + die("invalid git URL component '%s'", arg); +} + +static char *extract_component(enum url_component component, + struct url_info *info) +{ + size_t offset, length; + + switch (component) { + case URL_SCHEME: + offset = 0; + length = info->scheme_len; + break; + case URL_USER: + offset = info->user_off; + length = info->user_len; + break; + case URL_PASSWORD: + offset = info->passwd_off; + length = info->passwd_len; + break; + case URL_HOST: +
offset = info->host_off; + length = info->host_len; + break; + case URL_PORT: + offset = info->port_off; + length = info->port_len; + break; + case URL_PATH: + offset = info->path_off; + length = info->path_len; + break; + case URL_NONE: + return NULL; + } + + return xstrndup(info->url + offset, length); +} + +int cmd_url_parse(int argc, + const char **argv, + const char *prefix, + struct repository *repo UNUSED) +{ + struct url_info info; + enum url_component selected = URL_NONE; + char *extracted; + int i; + + argc = parse_options(argc, argv, prefix, builtin_url_parse_options, + builtin_url_parse_usage, 0); + + if (argc == 0) + usage_with_options(builtin_url_parse_usage, + builtin_url_parse_options); + + if (component_arg) + selected = get_component_or_die(component_arg); + + for (i = 0; i < argc; i++) { + parse_or_die(argv[i], &info); + + if (selected != URL_NONE) { + extracted = extract_component(selected, &info); + if (extracted) { + puts(extracted); + free(extracted); + } + } + + free(info.url); + } + + return 0; +} diff --git a/command-list.txt b/command-list.txt index f9005cf45979f1..1ede48186f89f1 100644 --- a/command-list.txt +++ b/command-list.txt @@ -202,6 +202,7 @@ git-update-ref plumbingmanipulators git-update-server-info synchingrepositories git-upload-archive synchelpers git-upload-pack synchelpers +git-url-parse purehelpers git-var plumbinginterrogators git-verify-commit ancillaryinterrogators git-verify-pack plumbinginterrogators diff --git a/git.c b/git.c index 5a40eab8a26a66..a073eed9317afc 100644 --- a/git.c +++ b/git.c @@ -670,6 +670,7 @@ static struct cmd_struct commands[] = { { "upload-archive", cmd_upload_archive, NO_PARSEOPT }, { "upload-archive--writer", cmd_upload_archive_writer, NO_PARSEOPT }, { "upload-pack", cmd_upload_pack }, + { "url-parse", cmd_url_parse }, { "var", cmd_var, RUN_SETUP_GENTLY | NO_PARSEOPT }, { "verify-commit", cmd_verify_commit, RUN_SETUP }, { "verify-pack", cmd_verify_pack }, diff --git a/meson.build b/meson.build 
index 11488623bfd8f8..dc3cf68ee571f2 100644 --- a/meson.build +++ b/meson.build @@ -686,6 +686,7 @@ builtin_sources = [ 'builtin/update-server-info.c', 'builtin/upload-archive.c', 'builtin/upload-pack.c', + 'builtin/url-parse.c', 'builtin/var.c', 'builtin/verify-commit.c', 'builtin/verify-pack.c', From d1671b13dc3c5d87368bd09604540ad0a8ed33b5 Mon Sep 17 00:00:00 2001 From: Matheus Afonso Martins Moreira Date: Sat, 2 May 2026 05:28:41 +0000 Subject: [PATCH 24/29] doc: describe the url-parse builtin The new url-parse builtin validates git URLs and optionally extracts their components. Helped-by: Ghanshyam Thakkar Signed-off-by: Matheus Afonso Martins Moreira Signed-off-by: Junio C Hamano --- Documentation/git-url-parse.adoc | 80 ++++++++++++++++++++++++++++++++ Documentation/meson.build | 1 + 2 files changed, 81 insertions(+) create mode 100644 Documentation/git-url-parse.adoc diff --git a/Documentation/git-url-parse.adoc b/Documentation/git-url-parse.adoc new file mode 100644 index 00000000000000..9d0d93da4a24a2 --- /dev/null +++ b/Documentation/git-url-parse.adoc @@ -0,0 +1,80 @@ +git-url-parse(1) +================ + +NAME +---- +git-url-parse - Parse and extract git URL components + +SYNOPSIS +-------- +[synopsis] +git url-parse [-c <component>] [--] ... + +DESCRIPTION +----------- + +Git supports many ways to specify URLs, some of them non-standard. +For example, git supports the scp style [user@]host:[path] format. +This command eases interoperability with git URLs by enabling the +parsing and extraction of the components of all git URLs. + +Any syntactically valid URL is parsed, even if the scheme is not one +git supports for fetching or pushing. + +OPTIONS +------- + +`-c <component>`:: +`--component <component>`:: + Extract the _<component>_ component from the given Git URLs. + _<component>_ can be one of: + `scheme`, `user`, `password`, `host`, `port`, `path`. + +OUTPUT +------ + +When `--component` is given, the requested component of each URL +is printed on its own line, in the order the URLs were given.
If +the URL has no such component (for example, a port in a URL that +does not specify one), an empty line is printed in its place. + +When `--component` is not given, no output is produced. The exit +status is zero if every URL parses successfully and non-zero +otherwise, allowing the command to be used purely as a validator. + +EXAMPLES +-------- + +* Print the host name: ++ +------------ +$ git url-parse --component host https://example.com/user/repo +example.com +------------ + +* Print the path: ++ +------------ +$ git url-parse --component path https://example.com/user/repo +/user/repo +$ git url-parse --component path example.com:~user/repo +~user/repo +$ git url-parse --component path example.com:user/repo +/user/repo +------------ + +* Validate URLs without outputting anything: ++ +------------ +$ git url-parse https://example.com/user/repo example.com:~user/repo +------------ + +SEE ALSO +-------- +linkgit:git-clone[1], +linkgit:git-fetch[1], +linkgit:git-config[1] + +GIT +--- +Part of the linkgit:git[1] suite diff --git a/Documentation/meson.build b/Documentation/meson.build index d6365b888bbed3..32c8606a80045f 100644 --- a/Documentation/meson.build +++ b/Documentation/meson.build @@ -155,6 +155,7 @@ manpages = { 'git-update-server-info.adoc' : 1, 'git-upload-archive.adoc' : 1, 'git-upload-pack.adoc' : 1, + 'git-url-parse.adoc' : 1, 'git-var.adoc' : 1, 'git-verify-commit.adoc' : 1, 'git-verify-pack.adoc' : 1, From 0e2149cff1c37c5f9602d515d1d39d9701d15e24 Mon Sep 17 00:00:00 2001 From: Matheus Afonso Martins Moreira Date: Sat, 2 May 2026 05:28:42 +0000 Subject: [PATCH 25/29] t9904: add tests for the new url-parse builtin MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Test git URL parsing, validation and component extraction on all documented git URL schemes and syntaxes. 
Add IPv6 host coverage in URL form: ssh://[::1]/path ssh://user@[::1]:1234/path git://[::1]:9418/path http://[2001:db8::1]/path https://[2001:db8::1]/path In URL form the brackets are kept in the host component (RFC 3986 syntax for IPv6 literals). Also exercise the bracketed scp short forms that t5601-clone.sh covers via parse_connect_url: [host]:path [host:port]:path [::1]:repo user@[::1]:repo user@[host:port]:path In scp form, brackets are kept for IPv6 literals (two or more inner colons) and stripped for plain hostnames or host:port pairs. Suggested-by: Torsten Bögershausen Signed-off-by: Matheus Afonso Martins Moreira Signed-off-by: Junio C Hamano --- t/meson.build | 1 + t/t9904-url-parse.sh | 319 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 320 insertions(+) create mode 100755 t/t9904-url-parse.sh diff --git a/t/meson.build b/t/meson.build index 7528e5cda5fef0..41b389a4727298 100644 --- a/t/meson.build +++ b/t/meson.build @@ -1114,6 +1114,7 @@ integration_tests = [ 't9901-git-web--browse.sh', 't9902-completion.sh', 't9903-bash-prompt.sh', + 't9904-url-parse.sh', ] benchmarks = [ diff --git a/t/t9904-url-parse.sh b/t/t9904-url-parse.sh new file mode 100755 index 00000000000000..8a369d2040d8fd --- /dev/null +++ b/t/t9904-url-parse.sh @@ -0,0 +1,319 @@ +#!/bin/sh +# +# Copyright (c) 2024 Matheus Afonso Martins Moreira +# + +test_description='git url-parse tests' + +. 
./test-lib.sh + +test_expect_success 'git url-parse -- ssh syntax' ' + git url-parse "ssh://user@example.com:1234/repository/path" && + git url-parse "ssh://user@example.com/repository/path" && + git url-parse "ssh://example.com:1234/repository/path" && + git url-parse "ssh://example.com/repository/path" +' + +test_expect_success 'git url-parse -- git syntax' ' + git url-parse "git://example.com:1234/repository/path" && + git url-parse "git://example.com/repository/path" +' + +test_expect_success 'git url-parse -- http syntax' ' + git url-parse "https://example.com:1234/repository/path" && + git url-parse "https://example.com/repository/path" && + git url-parse "http://example.com:1234/repository/path" && + git url-parse "http://example.com/repository/path" +' + +test_expect_success 'git url-parse -- scp syntax' ' + git url-parse "user@example.com:/repository/path" && + git url-parse "example.com:/repository/path" +' + +test_expect_success 'git url-parse -- username expansion - ssh syntax' ' + git url-parse "ssh://user@example.com:1234/~user/repository" && + git url-parse "ssh://user@example.com/~user/repository" && + git url-parse "ssh://example.com:1234/~user/repository" && + git url-parse "ssh://example.com/~user/repository" +' + +test_expect_success 'git url-parse -- username expansion - git syntax' ' + git url-parse "git://example.com:1234/~user/repository" && + git url-parse "git://example.com/~user/repository" +' + +test_expect_success 'git url-parse -- username expansion - scp syntax' ' + git url-parse "user@example.com:~user/repository" && + git url-parse "example.com:~user/repository" +' + +test_expect_success 'git url-parse -- file urls' ' + git url-parse "file:///repository/path" && + git url-parse "file://" +' + +test_expect_success 'git url-parse -c scheme -- ssh syntax' ' + test ssh = "$(git url-parse -c scheme "ssh://user@example.com:1234/repository/path")" && + test ssh = "$(git url-parse -c scheme "ssh://user@example.com/repository/path")" && + 
test ssh = "$(git url-parse -c scheme "ssh://example.com:1234/repository/path")" && + test ssh = "$(git url-parse -c scheme "ssh://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c scheme -- git syntax' ' + test git = "$(git url-parse -c scheme "git://example.com:1234/repository/path")" && + test git = "$(git url-parse -c scheme "git://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c scheme -- http syntax' ' + test https = "$(git url-parse -c scheme "https://example.com:1234/repository/path")" && + test https = "$(git url-parse -c scheme "https://example.com/repository/path")" && + test http = "$(git url-parse -c scheme "http://example.com:1234/repository/path")" && + test http = "$(git url-parse -c scheme "http://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c scheme -- scp syntax' ' + test ssh = "$(git url-parse -c scheme "user@example.com:/repository/path")" && + test ssh = "$(git url-parse -c scheme "example.com:/repository/path")" +' + +test_expect_success 'git url-parse -c user -- ssh syntax' ' + test user = "$(git url-parse -c user "ssh://user@example.com:1234/repository/path")" && + test user = "$(git url-parse -c user "ssh://user@example.com/repository/path")" && + test "" = "$(git url-parse -c user "ssh://example.com:1234/repository/path")" && + test "" = "$(git url-parse -c user "ssh://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c user -- git syntax' ' + test "" = "$(git url-parse -c user "git://example.com:1234/repository/path")" && + test "" = "$(git url-parse -c user "git://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c user -- http syntax' ' + test "" = "$(git url-parse -c user "https://example.com:1234/repository/path")" && + test "" = "$(git url-parse -c user "https://example.com/repository/path")" && + test "" = "$(git url-parse -c user "http://example.com:1234/repository/path")" && + test "" = "$(git url-parse -c 
user "http://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c user -- scp syntax' ' + test user = "$(git url-parse -c user "user@example.com:/repository/path")" && + test "" = "$(git url-parse -c user "example.com:/repository/path")" +' + +test_expect_success 'git url-parse -c password -- http syntax' ' + test secret = "$(git url-parse -c password "https://user:secret@example.com:1234/repository/path")" && + test secret = "$(git url-parse -c password "http://user:secret@example.com/repository/path")" && + test "" = "$(git url-parse -c password "https://user@example.com/repository/path")" && + test "" = "$(git url-parse -c password "https://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c host -- ssh syntax' ' + test example.com = "$(git url-parse -c host "ssh://user@example.com:1234/repository/path")" && + test example.com = "$(git url-parse -c host "ssh://user@example.com/repository/path")" && + test example.com = "$(git url-parse -c host "ssh://example.com:1234/repository/path")" && + test example.com = "$(git url-parse -c host "ssh://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c host -- git syntax' ' + test example.com = "$(git url-parse -c host "git://example.com:1234/repository/path")" && + test example.com = "$(git url-parse -c host "git://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c host -- http syntax' ' + test example.com = "$(git url-parse -c host "https://example.com:1234/repository/path")" && + test example.com = "$(git url-parse -c host "https://example.com/repository/path")" && + test example.com = "$(git url-parse -c host "http://example.com:1234/repository/path")" && + test example.com = "$(git url-parse -c host "http://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c host -- scp syntax' ' + test example.com = "$(git url-parse -c host "user@example.com:/repository/path")" && + test example.com = "$(git url-parse -c 
host "example.com:/repository/path")" +' + +test_expect_success 'git url-parse -c port -- ssh syntax' ' + test 1234 = "$(git url-parse -c port "ssh://user@example.com:1234/repository/path")" && + test "" = "$(git url-parse -c port "ssh://user@example.com/repository/path")" && + test 1234 = "$(git url-parse -c port "ssh://example.com:1234/repository/path")" && + test "" = "$(git url-parse -c port "ssh://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c port -- git syntax' ' + test 1234 = "$(git url-parse -c port "git://example.com:1234/repository/path")" && + test "" = "$(git url-parse -c port "git://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c port -- http syntax' ' + test 1234 = "$(git url-parse -c port "https://example.com:1234/repository/path")" && + test "" = "$(git url-parse -c port "https://example.com/repository/path")" && + test 1234 = "$(git url-parse -c port "http://example.com:1234/repository/path")" && + test "" = "$(git url-parse -c port "http://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c port -- scp syntax' ' + test "" = "$(git url-parse -c port "user@example.com:/repository/path")" && + test "" = "$(git url-parse -c port "example.com:/repository/path")" +' + +test_expect_success 'git url-parse -c path -- ssh syntax' ' + test "/repository/path" = "$(git url-parse -c path "ssh://user@example.com:1234/repository/path")" && + test "/repository/path" = "$(git url-parse -c path "ssh://user@example.com/repository/path")" && + test "/repository/path" = "$(git url-parse -c path "ssh://example.com:1234/repository/path")" && + test "/repository/path" = "$(git url-parse -c path "ssh://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c path -- git syntax' ' + test "/repository/path" = "$(git url-parse -c path "git://example.com:1234/repository/path")" && + test "/repository/path" = "$(git url-parse -c path "git://example.com/repository/path")" +' + 
+test_expect_success 'git url-parse -c path -- http syntax' ' + test "/repository/path" = "$(git url-parse -c path "https://example.com:1234/repository/path")" && + test "/repository/path" = "$(git url-parse -c path "https://example.com/repository/path")" && + test "/repository/path" = "$(git url-parse -c path "http://example.com:1234/repository/path")" && + test "/repository/path" = "$(git url-parse -c path "http://example.com/repository/path")" +' + +test_expect_success 'git url-parse -c path -- scp syntax' ' + test "/repository/path" = "$(git url-parse -c path "user@example.com:/repository/path")" && + test "/repository/path" = "$(git url-parse -c path "example.com:/repository/path")" +' + +test_expect_success 'git url-parse -c path -- username expansion - ssh syntax' ' + test "~user/repository" = "$(git url-parse -c path "ssh://user@example.com:1234/~user/repository")" && + test "~user/repository" = "$(git url-parse -c path "ssh://user@example.com/~user/repository")" && + test "~user/repository" = "$(git url-parse -c path "ssh://example.com:1234/~user/repository")" && + test "~user/repository" = "$(git url-parse -c path "ssh://example.com/~user/repository")" +' + +test_expect_success 'git url-parse -c path -- username expansion - git syntax' ' + test "~user/repository" = "$(git url-parse -c path "git://example.com:1234/~user/repository")" && + test "~user/repository" = "$(git url-parse -c path "git://example.com/~user/repository")" +' + +test_expect_success 'git url-parse -c path -- username expansion - scp syntax' ' + test "~user/repository" = "$(git url-parse -c path "user@example.com:~user/repository")" && + test "~user/repository" = "$(git url-parse -c path "example.com:~user/repository")" +' + +test_expect_success 'git url-parse -c path -- username expansion strips query and fragment' ' + test "~user/repository" = "$(git url-parse -c path "ssh://example.com/~user/repository?query")" && + test "~user/repository" = "$(git url-parse -c path 
"ssh://example.com/~user/repository#fragment")" && + test "~user/repository" = "$(git url-parse -c path "git://example.com/~user/repository?query")" && + test "~user/repository" = "$(git url-parse -c path "user@example.com:~user/repository?query")" +' + +test_expect_success 'git url-parse -- ssh syntax with IPv6' ' + git url-parse "ssh://user@[::1]:1234/repository/path" && + git url-parse "ssh://user@[::1]/repository/path" && + git url-parse "ssh://[::1]:1234/repository/path" && + git url-parse "ssh://[::1]/repository/path" && + git url-parse "ssh://[2001:db8::1]/repository/path" +' + +test_expect_success 'git url-parse -- git syntax with IPv6' ' + git url-parse "git://[::1]:9418/repository/path" && + git url-parse "git://[::1]/repository/path" +' + +test_expect_success 'git url-parse -- http syntax with IPv6' ' + git url-parse "https://[::1]:1234/repository/path" && + git url-parse "https://[::1]/repository/path" && + git url-parse "http://[2001:db8::1]/repository/path" +' + +test_expect_success 'git url-parse -c host -- IPv6 in URL form' ' + test "[::1]" = "$(git url-parse -c host "ssh://user@[::1]:1234/repository/path")" && + test "[::1]" = "$(git url-parse -c host "ssh://[::1]/repository/path")" && + test "[2001:db8::1]" = "$(git url-parse -c host "ssh://[2001:db8::1]/repository/path")" && + test "[::1]" = "$(git url-parse -c host "git://[::1]/repository/path")" && + test "[2001:db8::1]" = "$(git url-parse -c host "https://[2001:db8::1]/repository/path")" +' + +test_expect_success 'git url-parse -c port -- IPv6 in URL form' ' + test 1234 = "$(git url-parse -c port "ssh://user@[::1]:1234/repository/path")" && + test "" = "$(git url-parse -c port "ssh://[::1]/repository/path")" && + test 9418 = "$(git url-parse -c port "git://[::1]:9418/repository/path")" +' + +test_expect_success 'git url-parse -- scp syntax with IPv6' ' + git url-parse "[::1]:repository/path" && + git url-parse "user@[::1]:repository/path" && + git url-parse "[2001:db8::1]:repo" +' + 
+test_expect_success 'git url-parse -- scp syntax with bracketed hostname' ' + git url-parse "[myhost]:src" && + git url-parse "user@[myhost]:src" +' + +test_expect_success 'git url-parse -- scp syntax with bracketed host:port' ' + git url-parse "[myhost:123]:src" && + git url-parse "user@[myhost:123]:src" +' + +test_expect_success 'git url-parse -c host -- scp+IPv6' ' + test "[::1]" = "$(git url-parse -c host "[::1]:repository/path")" && + test "[::1]" = "$(git url-parse -c host "user@[::1]:repository/path")" && + test "[2001:db8::1]" = "$(git url-parse -c host "[2001:db8::1]:repo")" +' + +test_expect_success 'git url-parse -c path -- scp+IPv6' ' + test "/repository/path" = "$(git url-parse -c path "[::1]:/repository/path")" && + test "/repository/path" = "$(git url-parse -c path "[::1]:repository/path")" && + test "/repo" = "$(git url-parse -c path "[2001:db8::1]:repo")" +' + +test_expect_success 'git url-parse -c host,port,path -- scp [host:port]:src' ' + test myhost = "$(git url-parse -c host "[myhost:123]:src")" && + test 123 = "$(git url-parse -c port "[myhost:123]:src")" && + test "/src" = "$(git url-parse -c path "[myhost:123]:src")" +' + +test_expect_success 'git url-parse -c host,path -- scp [host]:src' ' + test myhost = "$(git url-parse -c host "[myhost]:src")" && + test "/src" = "$(git url-parse -c path "[myhost]:src")" +' + +test_expect_success 'git url-parse -c user -- scp with user@ and brackets' ' + test user = "$(git url-parse -c user "user@[::1]:repo")" && + test user = "$(git url-parse -c user "user@[myhost:123]:src")" && + test user = "$(git url-parse -c user "user@[myhost]:src")" +' + +test_expect_success 'git url-parse -- scp+IPv6 with username expansion' ' + test "~user/repo" = "$(git url-parse -c path "[::1]:~user/repo")" && + test "~user/repo" = "$(git url-parse -c path "user@[::1]:~user/repo")" +' + +test_expect_success 'git url-parse fails on invalid URL' ' + test_must_fail git url-parse "not a url" +' + +test_expect_success 'git 
url-parse helpful error for absolute local path' ' + test_must_fail git url-parse "/abs/path" 2>err && + test_grep "is not a URL" err && + test_grep "file:///" err +' + +test_expect_success 'git url-parse helpful error for relative local path' ' + test_must_fail git url-parse "./rel" 2>err && + test_grep "is not a URL" err && + test_grep "absolute path" err +' + +test_expect_success 'git url-parse fails on unknown -c component name' ' + test_must_fail git url-parse -c bogus "https://example.com/repo" +' + +test_expect_success 'git url-parse fails on URL missing host' ' + test_must_fail git url-parse "https://" +' + +test_expect_success 'git url-parse with no URL prints usage' ' + test_must_fail git url-parse 2>err && + test_grep "usage:" err +' + +test_done From 5ba82911bccc12d5ce2ccad98db935c9a0780cbe Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Mon, 11 May 2026 08:51:15 +0900 Subject: [PATCH 26/29] ci: enable EXPENSIVE for contributor builds Earlier, we enabled EXPENSIVE tests for pushes to integration branches. As we didn't have any CI jobs that run these tests, this was a step in the right direction. It however is an ineffective and inefficient use of the maintainer time, which does not scale, to allow contributors to send changes that are less tested at the list, only to force the maintainer to notice breakages caused by their changes but only after these changes are mixed with changes from other contributors. The problematic topic needs to be isolated by bisecting, and it historically has been done by the maintainer alone. It is far better to let the problem be identified early, preferably before the problematic code leaves the hands of the original developer. In order for it to happen, the test coverage of the contributor tests must be at least as wide as the coverage of the integration tests. Enable expensive tests for CI jobs triggered by pull requests. This will make each contributor take care of their own, which scales much better.
Keep the expensive tests also enabled for the pushes of integration branches, as that is the only place we can notice problems stemming from mismerges and inter-topic interactions, even if the topics from the contributors in isolation all pass these tests. Signed-off-by: Junio C Hamano --- ci/lib.sh | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/ci/lib.sh b/ci/lib.sh index a671994bdf511f..4ca3ecef2c2e24 100755 --- a/ci/lib.sh +++ b/ci/lib.sh @@ -314,11 +314,13 @@ export DEFAULT_TEST_TARGET=prove export GIT_TEST_CLONE_2GB=true export SKIP_DASHED_BUILT_INS=YesPlease -# Enable expensive tests on push builds to integration branches, but -# not on PR builds where the extra time is not justified for every -# iteration. +# In order to give maximum test coverage to contributor builds, +# preferrably even before the changes consume public review bandwidth, +# enable "expensive" tests for PR events. +# In order to catch bugs introduced at integration time by mismerges, +# enable the long tests for pushes to the integration branches as well. case "$GITHUB_EVENT_NAME,$CI_BRANCH" in -push,*next*|push,*master*|push,*main*|push,*maint*) +pull_request,*|push,*next*|push,*master*|push,*main*|push,*maint*) export GIT_TEST_LONG=YesPlease ;; esac From 2431f5e0e5cd415a3ab1bddb435b692536cc95e8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Samo=20Poga=C4=8Dnik?= Date: Mon, 11 May 2026 21:20:42 +0200 Subject: [PATCH 27/29] shallow: fix relative deepen on non-shallow repositories MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The commit "3ef68ff40e (shallow: handling fetch relative-deepen, 2026-02-15)" introduced a bug where using --deepen= on a non-shallow repository incorrectly treated the value as an absolute depth, resulting in a shallow fetch and truncated history. This patch prevents any modification when a relative deepen is requested on a non-shallow repository.
A test is added to ensure that history is not changed when --deepen is used on a non-shallow repository. Reported-by: Owen Stephens Signed-off-by: Samo Pogačnik Signed-off-by: Junio C Hamano --- shallow.c | 6 +++++- t/t5537-fetch-shallow.sh | 10 ++++++++++ 2 files changed, 15 insertions(+), 1 deletion(-) diff --git a/shallow.c b/shallow.c index a8ad92e303d24d..610ff3d13bf17e 100644 --- a/shallow.c +++ b/shallow.c @@ -245,7 +245,11 @@ struct commit_list *get_shallow_commits(struct object_array *heads, int depth, int shallow_flag, int not_shallow_flag) { if (shallows && deepen_relative) { - depth += get_shallows_depth(heads, shallows); + int cur_shallow_depth = get_shallows_depth(heads, shallows); + if (cur_shallow_depth) + depth += cur_shallow_depth; + else + return NULL; } return get_shallows_or_depth(heads, NULL, NULL, depth, shallow_flag, not_shallow_flag); diff --git a/t/t5537-fetch-shallow.sh b/t/t5537-fetch-shallow.sh index 6588ce62264331..9982dd2aa6d499 100755 --- a/t/t5537-fetch-shallow.sh +++ b/t/t5537-fetch-shallow.sh @@ -251,6 +251,16 @@ test_expect_success '.git/shallow is edited by repack' ' origin "+refs/heads/*:refs/remotes/origin/*" ' +test_expect_success 'fetch --deepen does not truncate' ' + git clone --no-local .git full-clone && + git -C full-clone rev-parse --is-shallow-repository >expect && + git -C full-clone log --oneline >>expect && + git -C full-clone fetch --deepen=1 && + git -C full-clone rev-parse --is-shallow-repository >actual && + git -C full-clone log --oneline >>actual && + test_cmp expect actual +' + . 
"$TEST_DIRECTORY"/lib-httpd.sh start_httpd From fb999778ccbc8d6761fc9a3034e02cc01130152e Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Thu, 21 May 2026 12:05:44 +0900 Subject: [PATCH 28/29] The 6th batch Signed-off-by: Junio C Hamano --- Documentation/RelNotes/2.55.0.adoc | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/Documentation/RelNotes/2.55.0.adoc b/Documentation/RelNotes/2.55.0.adoc index 1bd482699e4fce..61e5bc003f1ae4 100644 --- a/Documentation/RelNotes/2.55.0.adoc +++ b/Documentation/RelNotes/2.55.0.adoc @@ -27,6 +27,12 @@ UI, Workflows & Features * "git history" learned "fixup" command. + * The internal URL parsing logic has been made accessible via a new + subcommand "git url-parse". + + * Misspelt proxy URL (e.g., httt://...) did not trigger any warning + or failure, which has been corrected. + Performance, Internal Implementation, Development Support etc. -------------------------------------------------------------- @@ -41,6 +47,15 @@ Performance, Internal Implementation, Development Support etc. * Use a larger buffer size in the code paths to ingest pack stream. + * Refactor service routines in the ref subsystem backends. + + * Shrink wasted memory in Myers diff that does not account for common + prefix and suffix removal. + + * Enable expensive tests to catch topics that may cause breakages on + integration branches closer to their origin in the contributor PR + builds. + Fixes since v2.54 ----------------- @@ -136,6 +151,10 @@ Fixes since v2.54 * Further update to the i18n alias support to avoid regressions. (merge 21186cf9bb jh/alias-i18n-fixes later to maint). + * "git fetch --deepen=" in a full clone truncated the history to + commits deep, which has been corrected to be a no-op instead. + (merge 2431f5e0e5 sp/shallow-deepen-on-non-shallow-repo-fix later to maint). + * Other code cleanup, docfix, build fix, etc. (merge 80f4b802e9 ja/doc-difftool-synopsis-style later to maint). 
(merge b96490241e jc/doc-timestamps-in-stat later to maint). @@ -145,3 +164,5 @@ Fixes since v2.54 (merge 8547908eb3 pw/rename-to-get-current-worktree later to maint). (merge 890229b3f3 sg/t6112-unwanted-tilde-expansion-fix later to maint). (merge ab9753e7bc kh/doc-restore-double-underscores-fix later to maint). + (merge 4a9e097228 za/t2000-modernise-more later to maint). + (merge b635fd0725 kh/doc-log-decorate-list later to maint). From a89346e34a937f001e5d397ee62224e3e9852040 Mon Sep 17 00:00:00 2001 From: Junio C Hamano Date: Thu, 21 May 2026 12:40:38 +0900 Subject: [PATCH 29/29] Start preparing for 2.54.1 Mostly build and CI related updates taken from the 'master' front are included in here. We still need to grab a couple more topics once they graduate to 'master', namely jk/apply-leakfix jk/commit-sign-overflow-fix Signed-off-by: Junio C Hamano --- Documentation/RelNotes/2.54.1.adoc | 29 +++++++++++++++++++++++++++++ RelNotes | 2 +- 2 files changed, 30 insertions(+), 1 deletion(-) create mode 100644 Documentation/RelNotes/2.54.1.adoc diff --git a/Documentation/RelNotes/2.54.1.adoc b/Documentation/RelNotes/2.54.1.adoc new file mode 100644 index 00000000000000..f561dbba54d029 --- /dev/null +++ b/Documentation/RelNotes/2.54.1.adoc @@ -0,0 +1,29 @@ +Git v2.54.1 Release Notes +========================= + +This release is primarily to merge fixes accumulated on the 'master' +front to prepare for 2.55 release that are still relevant to 2.54.x +maintenance track. + +Fixes since v2.54 +----------------- + + * Headers from glibc 2.43 when used with clang does not allow + disabling C11 language features, causing build failures.. + + * Revert a recent change that introduced a regression to help mksh users. + + * Update various GitHub Actions versions. + + * Avoid hitting the pathname limit for socks proxy socket during the + test. + + * To help Windows 10 installations, avoid removing files whose + contents are still mmap()'ed. 
+ + * Stop using unmaintained custom allocator in Windows build which was + the last user of the code. + + * Further update to the i18n alias support to avoid regressions. + +Also contains minor documentation updates and code clean-ups. diff --git a/RelNotes b/RelNotes index 84ef7387adb35c..a192bf35384599 120000 --- a/RelNotes +++ b/RelNotes @@ -1 +1 @@ -Documentation/RelNotes/2.54.0.adoc \ No newline at end of file +Documentation/RelNotes/2.54.1.adoc \ No newline at end of file