Comparing changes

base repository: git/git
base: 7f13e5c8c744ec8da268b6f774d16f2ea729f48e
...
head repository: git/git
compare: e417277ae99687b576e48cb477a7a50241ea0096
  • 1 commit
  • 3 files changed
  • 1 contributor

Commits on Mar 3, 2026

  1. xdiff: re-diff shifted change groups when using histogram algorithm

    After a diff algorithm has run, the compaction phase
    (xdl_change_compact()) shifts and merges change groups to produce a
    cleaner output. However, this shifting can create a new matched group
    in which both sides contain identical lines. The result is a
    wrong-looking diff that contains redundant lines which are the same in
    both files.
    
    Fix this by detecting the situation and re-diffing the text on each
    side to find the matching lines, using the fall-back Myers diff. Only do
    this for histogram diff, as it is the only algorithm where this is
    relevant. An example and more details follow.
    
    For example, consider the two files below:
    
        file1:
            A
    
            A
            A
            A
    
            A
            A
            A
    
        file2:
            A
    
            A
            x
            A
    
            A
            A
            A
    
    When using Myers diff, the algorithm finds that only the "x" has been
    changed, and produces a final diff result (these are line diffs, shown
    in word-diff syntax for ease of presentation):
    
            A A[-A-]{+x+}A AAA
    
    When using histogram diff, the algorithm first discovers the LCS "A
    AAA", which it uses as anchor, then produces an intermediate diff:
    
            {+A Ax+}A AAA[- AAA-]
    
    This is a longer diff than the Myers one, but it's still self-consistent.
    However, the compaction phase attempts to shift the first file's diff
    group upwards (note that this shift crosses the anchor that histogram
    had used), leading to the final results for histogram diff:
    
            [-A AA-]{+A Ax+}A AAA
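
The upward shift described above can be reproduced with a small sketch. It assumes a simplified sliding rule: a group may move up by one line whenever the line just above it equals its last line. The helper name `slide_up` is hypothetical, and the real xdl_change_compact() also merges groups and applies further heuristics that are not modeled here.

```python
def slide_up(lines, start, end):
    """Slide the half-open group [start, end) upward as far as possible.

    Simplified rule (an assumption, not the xdiff code): the group can move
    up one line when the line just above it equals the group's last line.
    """
    while start > 0 and lines[start - 1] == lines[end - 1]:
        start -= 1
        end -= 1
    return start, end


# file1 from the example; "" stands for the blank line.
file1 = ["A", "", "A", "A", "A", "", "A", "A", "A"]

# The intermediate histogram diff removes the last four lines: [- AAA-]
print(slide_up(file1, 5, 9))  # (0, 4)
print(file1[0:4])             # ['A', '', 'A', 'A'], i.e. "A AA"
```

Under this rule the removed group slides past the anchor all the way to the top of the file, which matches the `[-A AA-]` group in the final histogram output.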
    
    This is a technically correct patch but looks clearly redundant to a
    human as the first 3 lines should not be in the diff.
    
    The fix detects that a shift has produced a newly matched group, and
    re-diffs the "A AA" and "A Ax" parts, which re-marks "A A" as
    unchanged. This yields the now-correct histogram diff:
    
            A A[-A-]{+x+}A AAA
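
As an illustration of the re-diff step, the sketch below takes the compacted change group and diffs its two sides against each other. It uses Python's `difflib.SequenceMatcher` as a stand-in for the fall-back Myers diff; that is an assumption made for brevity, and it is not the algorithm xdiff uses. The helpers `apply_script` and `rediff_group` are hypothetical names.

```python
import difflib

# Files from the example; "" stands for the blank line.
file1 = ["A", "", "A", "A", "A", "", "A", "A", "A"]
file2 = ["A", "", "A", "x", "A", "", "A", "A", "A"]


def apply_script(script):
    """Rebuild (old, new) line lists from a list of (op, line) pairs."""
    old = [line for op, line in script if op != "+"]
    new = [line for op, line in script if op != "-"]
    return old, new


def rediff_group(removed, added):
    """Re-diff one change group; lines matching on both sides become context."""
    script = []
    matcher = difflib.SequenceMatcher(None, removed, added, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            script += [(" ", line) for line in removed[i1:i2]]
        else:
            script += [("-", line) for line in removed[i1:i2]]
            script += [("+", line) for line in added[j1:j2]]
    return script


tail = [(" ", line) for line in file1[4:]]

# Compacted histogram result: [-A AA-]{+A Ax+}A AAA
compacted = ([("-", line) for line in file1[:4]]
             + [("+", line) for line in file2[:4]]
             + tail)
assert apply_script(compacted) == (file1, file2)  # valid, but redundant

# Re-diffing the group keeps "A A" as context and leaves a one-line change.
fixed = rediff_group(file1[:4], file2[:4]) + tail
assert apply_script(fixed) == (file1, file2)
changed = [(op, line) for op, line in fixed if op != " "]
print(changed)  # [('-', 'A'), ('+', 'x')]
```

Both scripts turn file1 into file2; only the re-diffed one marks the shared leading lines as unchanged.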
    
    This issue does not apply to the Myers diff algorithm, as it already
    generates a minimal diff, which means a shift cannot result in a smaller
    diff output (the default Myers diff in xdiff is not guaranteed to be
    minimal for performance reasons, but it typically does a good enough
    job).
    
    It also does not apply to patience diff, because that algorithm uses
    only unique lines as anchors for its splits, and falls back to Myers
    diff within each split. Shifting requires both ends of a group to have
    the same lines, and therefore cannot cross the unique-line boundaries
    established by the patience algorithm. In contrast, histogram diff uses
    non-unique lines as anchors, and therefore shifting can cross over them.
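
The argument about unique lines can be checked with a small sketch. It assumes the same simplified sliding rule (a group moves up only when the line above it equals its last line) and uses hypothetical helper names and a made-up pair of inputs. It is not the patience implementation in xdiff.

```python
from collections import Counter


def unique_common_lines(old, new):
    """Lines that occur exactly once in each file (patience-style anchors)."""
    old_counts, new_counts = Counter(old), Counter(new)
    return [line for line in old
            if old_counts[line] == 1 and new_counts[line] == 1]


def slide_up(lines, start, end):
    """Slide the half-open group [start, end) up while the line above it
    equals the group's last line (simplified rule, an assumption)."""
    while start > 0 and lines[start - 1] == lines[end - 1]:
        start -= 1
        end -= 1
    return start, end


# Hypothetical inputs: "key" is unique in both files, "A" is not.
old = ["A", "key", "A", "A"]
new = ["A", "key", "A"]

print(unique_common_lines(old, new))  # ['key']

# The removed last line can slide up one step, but not across "key":
print(slide_up(old, 3, 4))  # (2, 3)
```

To slide across "key", the group would need a second "key" as its last line, which a unique line rules out. In the commit's example neither file has a line that is unique in both, so the histogram anchor offers no such barrier.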
    
    This issue is rare in a normal repository. Below is a table of
    repositories (`git log --no-merges -p --histogram -1000`), showing how
    many times a re-diff was done and how many times it resulted in finding
    matching lines (therefore addressing this issue) with the fix. In
    general, fewer than 1% of diffs exhibit this offending behavior:
    
    | Repo (1k commits)  | Re-diff | Found matching lines |
    |--------------------|---------|----------------------|
    | llvm-project       |  45     | 11                   |
    | vim                | 110     |  9                   |
    | git                |  18     |  2                   |
    | WebKit             | 168     |  1                   |
    | ripgrep            |  22     |  1                   |
    | cpython            |  32     |  0                   |
    | vscode             |  13     |  0                   |
    
    Signed-off-by: Yee Cheng Chin <ychin.git@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ychin authored and gitster committed Mar 3, 2026
    e417277