5 changes: 5 additions & 0 deletions Documentation/config/diff.adoc
@@ -218,6 +218,11 @@ endif::git-diff[]
Set this option to `true` to make the diff driver cache the text

Junio C Hamano wrote on the Git mailing list (how to reply to this email):

"Michael Montalbo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +struct diff_subprocess {
> +	struct subprocess_entry subprocess;
> +	unsigned int supported_capabilities;
> +};
> +
> +static int subprocess_map_initialized;
> +static struct hashmap subprocess_map;

Can we avoid introducing new global variables like these?  Would
"struct userdiff_driver" or "struct diff_options" be a good place to
hang this hashmap, perhaps?

> +static int send_file_content(int fd, const char *buf, long size)
> +{
> +	int ret;
> +
> +	if (size > 0)
> +		ret = write_packetized_from_buf_no_flush(buf, size, fd);
> +	else
> +		ret = 0;

Shouldn't "size == -24" be flagged as an invalid input?

> +	if (ret)
> +		return ret;
> +	return packet_flush_gently(fd);
> +}

> +static int parse_hunk_line(const char *line, struct xdl_hunk *hunk)
> +{
> +...
> +}

This gives a silent error diagnosis, which is good for a lower level
helper.

> +int diff_process_get_hunks(struct userdiff_driver *drv,
> +			   const char *path,
> +			   const char *old_buf, long old_size,
> +			   const char *new_buf, long new_size,
> +			   struct xdl_hunk **hunks_out,
> +			   size_t *nr_hunks_out)
> +{
> +	struct diff_subprocess *backend;
> +	struct child_process *process;
> +	int fd_in, fd_out;
> +	struct strbuf status = STRBUF_INIT;
> +	struct xdl_hunk *hunks = NULL;
> +	struct xdl_hunk hunk;
> +	size_t nr_hunks = 0, alloc_hunks = 0;
> +	int len;
> +	char *line;
> +
> +	if (!drv || !drv->process)
> +		return -1;

A driver that does not define process is not an error; it is
perfectly normal in the current world order where nobody has such an
external process and even if this patch lands, external processes
are optional.  So here "return -1" does not mean an error, and
silent return is perfectly fine.

> +	backend = find_or_start_process(drv->process);
> +	if (!backend)
> +		return -1;

This is probably an error; the user specified drv->process, we
either tried to find or start the process and failed.  Isn't it an
event that deserves to be reported in an error message?

> +	if (!(backend->supported_capabilities & CAP_HUNKS))
> +		return -1;

Backend started, but the "hunks" feature is not supported.  Perhaps
in a year or two, this external process protocol may have become so
popular that it gained more capabilities, possibly making get_hunks
obsolete.  We may be looking at such an external process that uses
other capabilities but not this one.  This is not an error, so
silent return is perfectly fine.

> +	process = subprocess_get_child_process(&backend->subprocess);
> +	fd_in = process->in;
> +	fd_out = process->out;
> +
> +	/* Send request */
> +	if (packet_write_fmt_gently(fd_in, "command=hunks\n") ||
> +	    packet_write_fmt_gently(fd_in, "pathname=%s\n", path) ||
> +	    packet_flush_gently(fd_in))
> +		goto error;
> +
> +	/* Send old file content */
> +	if (send_file_content(fd_in, old_buf, old_size))
> +		goto error;
> +
> +	/* Send new file content */
> +	if (send_file_content(fd_in, new_buf, new_size))
> +		goto error;
> +
> +	/* Read hunks until flush packet */
> +	while ((len = packet_read_line_gently(fd_out, NULL, &line)) >= 0 &&
> +	       line) {
> +		if (parse_hunk_line(line, &hunk) < 0)
> +			goto error;
> +		ALLOC_GROW(hunks, nr_hunks + 1, alloc_hunks);
> +		hunks[nr_hunks++] = hunk;
> +	}
> +	if (len < 0)
> +		goto error;
> +
> +	/* Read status */
> +	if (subprocess_read_status(fd_out, &status))
> +		goto error;
> +
> +	if (strcmp(status.buf, "success")) {
> +		if (!strcmp(status.buf, "abort"))
> +			backend->supported_capabilities &= ~CAP_HUNKS;
> +		goto error;
> +	}
> +
> +	*hunks_out = hunks;
> +	*nr_hunks_out = nr_hunks;
> +	strbuf_release(&status);
> +	return 0;
> +
> +error:

All exceptions that lead here look like events that should be
reported to the end-user.

> +	free(hunks);
> +	strbuf_release(&status);
> +	return -1;
> +}

> +/*
> + * Query a diff process for hunks describing the changes
> + * between old_buf and new_buf.
> + *
> + * The backend is a long-running subprocess configured via
> + * diff.<driver>.process.  It receives file content via
> + * pkt-line and returns hunks with 1-based line numbers.
> + *
> + * On success, sets *hunks_out and *nr_hunks_out to a newly allocated
> + * array (caller must free) and returns 0.
> + *
> + * On failure, returns -1.  The caller should fall back to the
> + * builtin diff algorithm.
> + */

I do not agree with this.  If it is a failure, the user should fix
the external process (or disable it).  It shouldn't be hidden behind a
fallback.  As I left comments, in this round of implementation,
there are conditions that return -1 for something that is not an
error (i.e., not configured, or process not supporting the
particular capability) *and* in those cases the caller should fall
back as if nothing happened.  But in some error cases, the caller
shouldn't hide them.
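
As an illustration of the distinction drawn in this review, a bare 0/-1 return could be replaced by a three-way result. The following is a standalone sketch, not code from the patch or from the reviewer; `sketch_outcome`, `classify_outcome`, and the parameter names are hypothetical, and the classification simply restates the review comments above.

```c
#include <stdbool.h>

/* Hypothetical three-way result; names are illustrative only. */
enum sketch_outcome {
	SKETCH_NOT_APPLICABLE, /* not configured, or capability missing: fall back silently */
	SKETCH_OK,             /* hunks were obtained from the external process */
	SKETCH_ERROR           /* a configured process failed: report it to the user */
};

/*
 * Classify one request from four observations:
 *   process_cmd - the configured command, or NULL when none is set
 *   started     - whether the process could be found or started
 *   has_hunks   - whether it advertised the "hunks" capability
 *   exchange_ok - whether the request/response exchange succeeded
 */
static enum sketch_outcome classify_outcome(const char *process_cmd,
					    bool started, bool has_hunks,
					    bool exchange_ok)
{
	if (!process_cmd)
		return SKETCH_NOT_APPLICABLE; /* nothing configured: not an error */
	if (!started)
		return SKETCH_ERROR;          /* user asked for a process we cannot run */
	if (!has_hunks)
		return SKETCH_NOT_APPLICABLE; /* process exists but offers other features */
	return exchange_ok ? SKETCH_OK : SKETCH_ERROR;
}
```

With such a result, a caller would fall back quietly only on `SKETCH_NOT_APPLICABLE` and would surface `SKETCH_ERROR` to the user instead of hiding it.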


Michael Montalbo wrote on the Git mailing list (how to reply to this email):

On Mon, May 25, 2026 at 6:56 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Michael Montalbo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > +struct diff_subprocess {
> > +     struct subprocess_entry subprocess;
> > +     unsigned int supported_capabilities;
> > +};
> > +
> > +static int subprocess_map_initialized;
> > +static struct hashmap subprocess_map;
>
> Can we avoid introducing new global variables like these?  Would
> "struct userdiff_driver" or "struct diff_options" be a good place to
> hang this hashmap, perhaps?
>

Will clean this up.

> > +static int send_file_content(int fd, const char *buf, long size)
> > +{
> > +     int ret;
> > +
> > +     if (size > 0)
> > +             ret = write_packetized_from_buf_no_flush(buf, size, fd);
> > +     else
> > +             ret = 0;
>
> Shouldn't "size == -24" be flagged as an invalid input?
>

Will fix and do a broader audit of input validation and bounds checking.

> > +     if (ret)
> > +             return ret;
> > +     return packet_flush_gently(fd);
> > +}
>
> > +static int parse_hunk_line(const char *line, struct xdl_hunk *hunk)
> > +{
> > +...
> > +}
>
> This gives a silent error diagnosis, which is good for a lower level
> helper.
>
> > +int diff_process_get_hunks(struct userdiff_driver *drv,
> > +                        const char *path,
> > +                        const char *old_buf, long old_size,
> > +                        const char *new_buf, long new_size,
> > +                        struct xdl_hunk **hunks_out,
> > +                        size_t *nr_hunks_out)
> > +{
> > +     struct diff_subprocess *backend;
> > +     struct child_process *process;
> > +     int fd_in, fd_out;
> > +     struct strbuf status = STRBUF_INIT;
> > +     struct xdl_hunk *hunks = NULL;
> > +     struct xdl_hunk hunk;
> > +     size_t nr_hunks = 0, alloc_hunks = 0;
> > +     int len;
> > +     char *line;
> > +
> > +     if (!drv || !drv->process)
> > +             return -1;
>
> A driver that does not define process is not an error; it is
> perfectly normal in the current world order where nobody has such an
> external process and even if this patch lands, external processes
> are optional.  So here "return -1" does not mean an error, and
> silent return is perfectly fine.
>
> > +     backend = find_or_start_process(drv->process);
> > +     if (!backend)
> > +             return -1;
>
> This is probably an error; the user specified drv->process, we
> either tried to find or start the process and failed.  Isn't it an
> event that deserves to be reported in an error message?
>
> > +     if (!(backend->supported_capabilities & CAP_HUNKS))
> > +             return -1;
>
> Backend started, but the "hunks" feature is not supported.  Perhaps
> in a year or two, this external process protocol may have become so
> popular that it gained more capabilities, possibly making get_hunks
> obsolete.  We may be looking at such an external process that uses
> other capabilities but not this one.  This is not an error, so
> silent return is perfectly fine.
>
> > +     process = subprocess_get_child_process(&backend->subprocess);
> > +     fd_in = process->in;
> > +     fd_out = process->out;
> > +
> > +     /* Send request */
> > +     if (packet_write_fmt_gently(fd_in, "command=hunks\n") ||
> > +         packet_write_fmt_gently(fd_in, "pathname=%s\n", path) ||
> > +         packet_flush_gently(fd_in))
> > +             goto error;
> > +
> > +     /* Send old file content */
> > +     if (send_file_content(fd_in, old_buf, old_size))
> > +             goto error;
> > +
> > +     /* Send new file content */
> > +     if (send_file_content(fd_in, new_buf, new_size))
> > +             goto error;
> > +
> > +     /* Read hunks until flush packet */
> > +     while ((len = packet_read_line_gently(fd_out, NULL, &line)) >= 0 &&
> > +            line) {
> > +             if (parse_hunk_line(line, &hunk) < 0)
> > +                     goto error;
> > +             ALLOC_GROW(hunks, nr_hunks + 1, alloc_hunks);
> > +             hunks[nr_hunks++] = hunk;
> > +     }
> > +     if (len < 0)
> > +             goto error;
> > +
> > +     /* Read status */
> > +     if (subprocess_read_status(fd_out, &status))
> > +             goto error;
> > +
> > +     if (strcmp(status.buf, "success")) {
> > +             if (!strcmp(status.buf, "abort"))
> > +                     backend->supported_capabilities &= ~CAP_HUNKS;
> > +             goto error;
> > +     }
> > +
> > +     *hunks_out = hunks;
> > +     *nr_hunks_out = nr_hunks;
> > +     strbuf_release(&status);
> > +     return 0;
> > +
> > +error:
>
> All exceptions that lead here look like events that should be
> reported to the end-user.
>

Agreed on all points. I will restructure things so errors are flagged when
appropriate (i.e., user specified a process but one was not found / couldn't
start and exceptions) and non-errors are treated as they should be.

> > +     free(hunks);
> > +     strbuf_release(&status);
> > +     return -1;
> > +}
>
> > +/*
> > + * Query a diff process for hunks describing the changes
> > + * between old_buf and new_buf.
> > + *
> > + * The backend is a long-running subprocess configured via
> > + * diff.<driver>.process.  It receives file content via
> > + * pkt-line and returns hunks with 1-based line numbers.
> > + *
> > + * On success, sets *hunks_out and *nr_hunks_out to a newly allocated
> > + * array (caller must free) and returns 0.
> > + *
> > + * On failure, returns -1.  The caller should fall back to the
> > + * builtin diff algorithm.
> > + */
>
> I do not agree with this.  If it is a failure, the user should fix
> the external process (or disable it).  It shouldn't be hidden behind a
> fallback.  As I left comments, in this round of implementation,
> there are conditions that return -1 for something that is not an
> error (i.e., not configured, or process not supporting the
> particular capability) *and* in those cases the caller should fall
> back as if nothing happened.  But in some error cases, the caller
> shouldn't hide them.

Will address in a follow-up.

Thank you for the feedback!


Junio C Hamano wrote on the Git mailing list (how to reply to this email):

"Michael Montalbo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Zero hunks with status=success means the tool considers the
> files equivalent.  Git skips diff output for that file.

Is "zero hunk" a common word or some random string you invented?  If
the latter, which is what I am assuming it to be, you should define what
it means at/before the first use.  Here in the proposed log message,
and ...

>
> Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
> ---
>  Documentation/config/diff.adoc   |   8 +
>  Documentation/gitattributes.adoc |  40 ++++
>  Makefile                         |   1 +
>  diff-process.c                   | 206 +++++++++++++++++++
>  diff-process.h                   |  28 +++
>  diff.c                           |  23 +++
>  t/.gitattributes                 |   1 +
>  t/t4080-diff-process.sh          | 338 +++++++++++++++++++++++++++++++
>  8 files changed, 645 insertions(+)
>  create mode 100644 diff-process.c
>  create mode 100644 diff-process.h
>  create mode 100755 t/t4080-diff-process.sh
>
> diff --git a/Documentation/config/diff.adoc b/Documentation/config/diff.adoc
> index 1135a62a0a..4ab5f60df6 100644
> --- a/Documentation/config/diff.adoc
> +++ b/Documentation/config/diff.adoc
> @@ -218,6 +218,14 @@ endif::git-diff[]
>  	Set this option to `true` to make the diff driver cache the text
>  	conversion outputs.  See linkgit:gitattributes[5] for details.
>  
> +`diff.<driver>.process`::
> +	The command to run as a long-running diff process.
> +	The tool communicates via the pkt-line protocol and returns
> +	hunks that are fed into Git's diff and blame pipelines.
> +	If the tool returns zero hunks, the file is treated as
> +	unchanged for both diff output and blame attribution.
> +	See linkgit:gitattributes[5] for details.

... also here.

I do not know if you mean "the tool returns no hunks" (there is no
"hunk <old_start> <old_count> <new_start> <new_count>" line passed
from the tool over the protocol) or "the tool returns zero-hunk"
(there is a special "zero-hunk" message to signal this particular
condition sent over the protocol), and this description does not
quite help disambiguate between the two.

If the former, then avoid "zero hunks" as it sounds like a noun with
special meaning.  Yes, we can say "tool returns one hunk", "tool
returns 31 hunks", etc., so "tool returns zero hunks" may logically
be correct, but "when the tool returns no hunks with status=success"
is much less confusing, I think.


Michael Montalbo wrote on the Git mailing list (how to reply to this email):

On Mon, May 25, 2026 at 7:26 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Michael Montalbo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > Zero hunks with status=success means the tool considers the
> > files equivalent.  Git skips diff output for that file.
>
> Is "zero hunk" a common word or some random string you invented?  If
> the latter, which is what I am assuming it to be, you should define what
> it means at/before the first use.  Here in the proposed log message,
> and ...
>
> >
> > Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
> > ---
> >  Documentation/config/diff.adoc   |   8 +
> >  Documentation/gitattributes.adoc |  40 ++++
> >  Makefile                         |   1 +
> >  diff-process.c                   | 206 +++++++++++++++++++
> >  diff-process.h                   |  28 +++
> >  diff.c                           |  23 +++
> >  t/.gitattributes                 |   1 +
> >  t/t4080-diff-process.sh          | 338 +++++++++++++++++++++++++++++++
> >  8 files changed, 645 insertions(+)
> >  create mode 100644 diff-process.c
> >  create mode 100644 diff-process.h
> >  create mode 100755 t/t4080-diff-process.sh
> >
> > diff --git a/Documentation/config/diff.adoc b/Documentation/config/diff.adoc
> > index 1135a62a0a..4ab5f60df6 100644
> > --- a/Documentation/config/diff.adoc
> > +++ b/Documentation/config/diff.adoc
> > @@ -218,6 +218,14 @@ endif::git-diff[]
> >       Set this option to `true` to make the diff driver cache the text
> >       conversion outputs.  See linkgit:gitattributes[5] for details.
> >
> > +`diff.<driver>.process`::
> > +     The command to run as a long-running diff process.
> > +     The tool communicates via the pkt-line protocol and returns
> > +     hunks that are fed into Git's diff and blame pipelines.
> > +     If the tool returns zero hunks, the file is treated as
> > +     unchanged for both diff output and blame attribution.
> > +     See linkgit:gitattributes[5] for details.
>
> ... also here.
>
> I do not know if you mean "the tool returns no hunks" (there is no
> "hunk <old_start> <old_count> <new_start> <new_count>" line passed
> from the tool over the protocol) or "the tool returns zero-hunk"
> (there is a special "zero-hunk" message to signal this particular
> condition sent over the protocol), and this description does not
> quite help disambiguate between the two.
>
> If the former, then avoid "zero hunks" as it sounds like a noun with
> special meaning.  Yes, we can say "tool returns one hunk", "tool
> returns 31 hunks", etc., so "tool returns zero hunks" may logically
> be correct, but "when the tool returns no hunks with status=success"
> is much less confusing, I think.

Yes, "zero hunks" was my own invention and I see why it's confusing. Will
update the messaging to use "no hunks" instead and do a broader sweep of
the documentation to clarify the protocol and expected tool behavior.

conversion outputs. See linkgit:gitattributes[5] for details.

`diff.<driver>.process`::
The command to run as a long-running diff process that
provides hunks to Git's diff pipeline.
See linkgit:gitattributes[5] for details.

`diff.indentHeuristic`::
Set this option to `false` to disable the default heuristics
that shift diff hunk boundaries to make patches easier to read.
3 changes: 3 additions & 0 deletions Documentation/diff-algorithm-option.adoc
@@ -18,3 +18,6 @@
For instance, if you configured the `diff.algorithm` variable to a
non-default value and want to use the default one, then you
have to use `--diff-algorithm=default` option.
+
If you explicitly choose a diff algorithm, it also bypasses
`diff.<driver>.process` (see linkgit:gitattributes[5]).
4 changes: 3 additions & 1 deletion Documentation/diff-options.adoc
@@ -825,7 +825,9 @@ endif::git-format-patch[]
 	to use this option with linkgit:git-log[1] and friends.
 
 `--no-ext-diff`::
-	Disallow external diff drivers.
+	Disallow external diff helpers, including
+	`diff.<driver>.command` and `diff.<driver>.process`
+	(see linkgit:gitattributes[5]).
 
 `--textconv`::
 `--no-textconv`::
139 changes: 139 additions & 0 deletions Documentation/gitattributes.adoc
@@ -821,6 +821,145 @@ NOTE: If `diff.<name>.command` is defined for path with the
(see above), and adding `diff.<name>.algorithm` has no effect, as the
algorithm is not passed to the external diff driver.

Using an external diff process
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

If `diff.<name>.process` is defined, Git sends the old and new file
content to an external tool and receives back a list of changed
regions (pairs of line ranges in the old and new file). Git uses
these instead of its builtin diff algorithm, but still controls
all output formatting, so features like word diff, function context,
color, and blame work normally. This is achieved by using the
long-running process protocol (described in
Documentation/technical/long-running-process-protocol.adoc).
Unlike `diff.<name>.command`, which replaces Git's output entirely,
the diff process feeds results back into the standard pipeline.

First, in `.gitattributes`, assign the `diff` attribute for paths.

------------------------
*.c diff=cdiff
------------------------

Then, define a "diff.<name>.process" configuration to specify
the diff process command.

----------------------------------------------------------------
[diff "cdiff"]
process = /path/to/diff-process-tool
----------------------------------------------------------------

When Git encounters the first file that needs to be diffed, it starts
the process and performs the handshake. In the handshake, the welcome
message sent by Git is "git-diff-client", only version 1 is supported,
and the supported capability is "hunks" (the changed regions
described below).

For each file, Git sends a list of "key=value" pairs terminated with
a flush packet, followed by the old and new file content as packetized
data, each terminated with a flush packet. The pathname is relative
to the repository root. When `diff.<name>.textconv` is also set,
the tool receives the textconv-transformed content rather than the
raw blob. Git does not send binary files to the diff process.

-----------------------
packet: git> command=hunks
packet: git> pathname=path/file.c
packet: git> 0000
packet: git> OLD_CONTENT
packet: git> 0000
packet: git> NEW_CONTENT
packet: git> 0000
-----------------------
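
The request above travels as pkt-lines: each packet starts with a four-digit hexadecimal length that counts the four prefix bytes themselves, and `0000` is the flush packet. A minimal sketch of that framing in standalone C follows. `pkt_line_encode` and the constant names are hypothetical, this is not Git's own pkt-line code, and rejecting empty payloads is a choice made for the sketch.

```c
#include <stdio.h>
#include <string.h>

#define PKT_HEADER_LEN 4
#define PKT_MAX_PAYLOAD 65516 /* 65520-byte packet minus the 4-byte header */

/* A flush packet is the literal four bytes "0000". */
static const char PKT_FLUSH[] = "0000";

/*
 * Encode `len` bytes of `payload` as one pkt-line into `out`.
 * Returns the number of packet bytes written, or 0 if the payload is
 * empty, too large, or does not fit (one extra byte is needed for the
 * terminating NUL that this sketch appends for convenience).
 */
static size_t pkt_line_encode(char *out, size_t out_size,
			      const char *payload, size_t len)
{
	size_t total = len + PKT_HEADER_LEN;

	if (len == 0 || len > PKT_MAX_PAYLOAD || total + 1 > out_size)
		return 0;
	/* The length prefix includes its own four bytes. */
	snprintf(out, PKT_HEADER_LEN + 1, "%04zx", total);
	memcpy(out + PKT_HEADER_LEN, payload, len);
	out[total] = '\0'; /* not part of the packet */
	return total;
}
```

Under these assumptions, `command=hunks\n` (14 bytes) becomes the 18-byte packet `0012command=hunks\n`. Content larger than one packet would have to be split across several packets before the closing flush.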

The tool is expected to respond with zero or more hunk lines,
a flush packet, and a status packet terminated with a flush packet.
Each hunk line has the form:

`hunk <old_start> <old_count> <new_start> <new_count>`

where `<old_start>` and `<old_count>` identify a range of lines in
the old file, and `<new_start>` and `<new_count>` identify the
replacement range in the new file. Start values are 1-based and
counts are non-negative. Ranges must not extend beyond the end of
the file. For example, `hunk 3 2 3 4` means that 2 lines starting
at line 3 in the old file were replaced by 4 lines starting at
line 3 in the new file. An `<old_count>` of 0 means no lines were
removed (pure insertion); a `<new_count>` of 0 means no lines were
added (pure deletion).
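
A tool author or reviewer may find it useful to see how strictly such a line can be parsed. The sketch below is standalone C under stated assumptions: fields are separated by exactly one space, values are unsigned decimal, and an optional trailing newline is accepted. `struct hunk_range`, `parse_field`, and `parse_hunk` are hypothetical names and do not reproduce the patch's `parse_hunk_line`.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the four fields of one hunk line. */
struct hunk_range {
	long old_start, old_count, new_start, new_count;
};

/* Parse " <digits>" at *p; advance *p past it. Returns 0 on success. */
static int parse_field(const char **p, long *out)
{
	char *end;

	if (**p != ' ')
		return -1;
	(*p)++;
	if (**p < '0' || **p > '9')
		return -1; /* rejects signs, extra spaces, and empty fields */
	errno = 0;
	*out = strtol(*p, &end, 10);
	if (errno == ERANGE)
		return -1;
	*p = end;
	return 0;
}

/* Parse "hunk <old_start> <old_count> <new_start> <new_count>". */
static int parse_hunk(const char *line, struct hunk_range *h)
{
	const char *p = line;

	if (strncmp(p, "hunk", 4) != 0)
		return -1;
	p += 4;
	if (parse_field(&p, &h->old_start) || parse_field(&p, &h->old_count) ||
	    parse_field(&p, &h->new_start) || parse_field(&p, &h->new_count))
		return -1;
	if (*p == '\n')
		p++;
	if (*p != '\0')
		return -1; /* trailing garbage */
	if (h->old_start < 1 || h->new_start < 1)
		return -1; /* start values are 1-based */
	return 0;
}
```

Like a low-level helper, the parser reports failure only through its return value and leaves any user-facing message to its caller.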

Lines are delimited by newlines. A file `"foo\nbar\n"` and a
file `"foo\nbar"` both have 2 lines.
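
The line-counting rule above can be sketched as follows. This is standalone C with a hypothetical function name, and treating an empty buffer as zero lines is an assumption the text does not state.

```c
#include <stddef.h>

/*
 * Count lines in a buffer: every newline ends a line, and a final
 * run of bytes without a trailing newline counts as one more line.
 */
static size_t count_lines(const char *buf, size_t size)
{
	size_t lines = 0;

	for (size_t i = 0; i < size; i++)
		if (buf[i] == '\n')
			lines++;
	if (size > 0 && buf[size - 1] != '\n')
		lines++; /* last line has no newline terminator */
	return lines;
}
```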

Hunks must be listed in order and must not overlap. Any line
not covered by a hunk is treated as unchanged, so the total
number of unchanged lines must be the same on both sides.
For example, if the old file has 10 lines and the hunks cover
4 of them (`old_count` values summing to 4), then 6 old lines
are unchanged. The new file must also have exactly 6 lines
not covered by hunks, so the `new_count` values must sum to
`new_file_lines - 6`.
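
The ordering, bounds, and unchanged-line rules can be checked in a single pass. The sketch below is standalone C with hypothetical names and is not Git's validation code. It rests on two assumptions beyond the text: a hunk with a count of 0 uses the position at which the next unchanged line would sit as its start, and each unchanged run between hunks is compared rather than only the totals.

```c
#include <stddef.h>

/* Hypothetical holder for the four fields of one hunk line. */
struct hunk_range {
	long old_start, old_count, new_start, new_count;
};

/*
 * Return 1 if the hunks are in order, do not overlap, stay inside
 * both files, and leave equally long unchanged runs on each side;
 * return 0 otherwise. Line counts are assumed to be non-negative.
 */
static int hunks_are_consistent(const struct hunk_range *h, size_t nr,
				long old_lines, long new_lines)
{
	long old_next = 1, new_next = 1; /* first line not yet accounted for */

	for (size_t i = 0; i < nr; i++) {
		if (h[i].old_count < 0 || h[i].new_count < 0)
			return 0;
		if (h[i].old_start < old_next || h[i].new_start < new_next)
			return 0; /* out of order or overlapping */
		if (h[i].old_start - old_next != h[i].new_start - new_next)
			return 0; /* unchanged run before the hunk differs */
		if (h[i].old_count > old_lines - (h[i].old_start - 1) ||
		    h[i].new_count > new_lines - (h[i].new_start - 1))
			return 0; /* range extends past the end of a file */
		old_next = h[i].old_start + h[i].old_count;
		new_next = h[i].new_start + h[i].new_count;
	}
	/* The unchanged tail after the last hunk must match as well. */
	return old_lines - (old_next - 1) == new_lines - (new_next - 1);
}
```

For the arithmetic in the paragraph above, a single hunk `hunk 3 4 3 1` against a 10-line old file leaves 6 unchanged lines, so the check passes only when the new file has 7 lines.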

-----------------------
packet: git< hunk 1 3 1 5
packet: git< hunk 10 2 12 2
packet: git< 0000
packet: git< status=success
packet: git< 0000
-----------------------

If the tool responds with hunks and "success", Git marks those lines
as changed and feeds them into the standard diff pipeline. Patch
output features (word diff, function context, color) work normally.
Note that `--stat` and other summary formats use their own diff path
and are not affected by the diff process.

If no hunk lines precede the flush, followed by "success", Git
treats the files as having no changes: `git diff` produces no output
and `git blame` skips the commit, attributing lines to earlier commits.

-----------------------
packet: git< 0000
packet: git< status=success
packet: git< 0000
-----------------------

If the tool returns invalid hunks (out of bounds, overlapping), Git
silently falls back to the builtin diff algorithm.

In case the tool cannot or does not want to process the content,
it is expected to respond with an "error" status. Git warns and
falls back to the builtin diff algorithm for this file. The tool
remains available for subsequent files.

-----------------------
packet: git< 0000
packet: git< status=error
packet: git< 0000
-----------------------

In case the tool cannot or does not want to process the content as
well as any future content for the lifetime of the Git process, it
is expected to respond with an "abort" status. Git silently falls
back to the builtin diff algorithm for this file and does not send
further requests to the tool.

-----------------------
packet: git< 0000
packet: git< status=abort
packet: git< 0000
-----------------------

If the tool dies during the communication or does not adhere to the
protocol then Git will stop the process and fall back to the builtin
diff algorithm. Git warns once and does not restart the process for
subsequent files.
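
The status handling described in this section can be summarized as a small mapping. The sketch below is standalone C with hypothetical names; it takes the value that follows `status=`, and treating any other value as a protocol failure is an assumption, not something the text states.

```c
#include <string.h>

/* Hypothetical actions corresponding to the documented statuses. */
enum status_action {
	ACTION_USE_HUNKS,             /* "success": feed hunks to the diff pipeline */
	ACTION_WARN_AND_FALL_BACK,    /* "error": warn, use builtin diff, keep the tool */
	ACTION_DISABLE_AND_FALL_BACK, /* "abort": fall back silently, send no more requests */
	ACTION_PROTOCOL_FAILURE       /* anything else (assumed): stop the process, fall back */
};

/* Map the value after "status=" to the action described in the text. */
static enum status_action action_for_status(const char *status)
{
	if (!strcmp(status, "success"))
		return ACTION_USE_HUNKS;
	if (!strcmp(status, "error"))
		return ACTION_WARN_AND_FALL_BACK;
	if (!strcmp(status, "abort"))
		return ACTION_DISABLE_AND_FALL_BACK;
	return ACTION_PROTOCOL_FAILURE;
}
```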

Tools should ignore unknown keys in the per-file request to remain
forward-compatible. Future versions of Git may send additional
`command=` values; tools that receive an unrecognized command should
respond with `status=error` rather than terminating.

Defining a custom hunk-header
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

1 change: 1 addition & 0 deletions Makefile
@@ -1142,6 +1142,7 @@ LIB_OBJS += diff-delta.o
LIB_OBJS += diff-merges.o
LIB_OBJS += diff-lib.o
LIB_OBJS += diff-no-index.o
LIB_OBJS += diff-process.o
LIB_OBJS += diff.o
LIB_OBJS += diffcore-break.o
LIB_OBJS += diffcore-delta.o
40 changes: 31 additions & 9 deletions blame.c
@@ -19,6 +19,8 @@
 #include "tag.h"
 #include "trace2.h"
 #include "blame.h"
+#include "diff-process.h"
+#include "xdiff-interface.h"
 #include "alloc.h"
 #include "commit-slab.h"
 #include "bloom.h"
@@ -314,17 +316,25 @@ static struct commit *fake_working_tree_commit(struct repository *r,
 
 
 
-static int diff_hunks(mmfile_t *file_a, mmfile_t *file_b,
-		      xdl_emit_hunk_consume_func_t hunk_func, void *cb_data, int xdl_opts)
+static int diff_hunks_xpp(mmfile_t *file_a, mmfile_t *file_b,
+			  xdl_emit_hunk_consume_func_t hunk_func,
+			  void *cb_data, xpparam_t *xpp)
 {
-	xpparam_t xpp = {0};
 	xdemitconf_t xecfg = {0};
 	xdemitcb_t ecb = {NULL};
 
-	xpp.flags = xdl_opts;
 	xecfg.hunk_func = hunk_func;
 	ecb.priv = cb_data;
-	return xdi_diff(file_a, file_b, &xpp, &xecfg, &ecb);
+	return xdi_diff(file_a, file_b, xpp, &xecfg, &ecb);
 }
+
+static int diff_hunks(mmfile_t *file_a, mmfile_t *file_b,
+		      xdl_emit_hunk_consume_func_t hunk_func, void *cb_data, int xdl_opts)
+{
+	xpparam_t xpp = {0};
+
+	xpp.flags = xdl_opts;
+	return diff_hunks_xpp(file_a, file_b, hunk_func, cb_data, &xpp);
+}
 
 static const char *get_next_line(const char *start, const char *end)
@@ -1943,6 +1953,7 @@ static void pass_blame_to_parent(struct blame_scoreboard *sb,
 				 struct blame_origin *parent, int ignore_diffs)
 {
 	mmfile_t file_p, file_o;
+	xpparam_t xpp = {0};
 	struct blame_chunk_cb_data d;
 	struct blame_entry *newdest = NULL;
 
@@ -1961,10 +1972,21 @@ static void pass_blame_to_parent(struct blame_scoreboard *sb,
 		    &sb->num_read_blob, ignore_diffs);
 	sb->num_get_patch++;
 
-	if (diff_hunks(&file_p, &file_o, blame_chunk_cb, &d, sb->xdl_opts))
-		die("unable to generate diff (%s -> %s)",
-		    oid_to_hex(&parent->commit->object.oid),
-		    oid_to_hex(&target->commit->object.oid));
+	xpp.flags = sb->xdl_opts;
+	/*
+	 * If the diff process considers the files equivalent,
+	 * skip the diff so blame looks past this commit.
+	 */
+	if (diff_process_fill_hunks(&sb->revs->diffopt, target->path,
+				    &file_p, &file_o, &xpp)
+	    != DIFF_PROCESS_EQUIVALENT) {
+		if (diff_hunks_xpp(&file_p, &file_o, blame_chunk_cb,
+				   &d, &xpp))
+			die("unable to generate diff (%s -> %s)",
+			    oid_to_hex(&parent->commit->object.oid),
+			    oid_to_hex(&target->commit->object.oid));
+	}
+	free(xpp.external_hunks);
 	/* The rest are the same as the parent */
 	blame_chunk(&d.dstq, &d.srcq, INT_MAX, d.offset, INT_MAX, 0,
 		    parent, target, 0);
7 changes: 7 additions & 0 deletions builtin/log.c
@@ -2213,6 +2213,13 @@ int cmd_format_patch(int argc,
if (argc > 1)
die(_("unrecognized argument: %s"), argv[1]);

/*
* Disable diff.<driver>.process so that patches generated by
* format-patch are always based on the builtin diff algorithm
* and can be applied reliably.
*/
rev.diffopt.flags.no_diff_process = 1;

if (rev.diffopt.output_format & DIFF_FORMAT_NAME)
die(_("--name-only does not make sense"));
if (rev.diffopt.output_format & DIFF_FORMAT_NAME_STATUS)