mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-10-31 14:27:54 +01:00

488 lines

15 KiB

C

Raw Normal View History

[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`/*`
			`* Copyright (C) 2005 Junio C Hamano`
			`*/`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00			`#ifndef DIFF_H`
			`#define DIFF_H`

tree/diff header cleanup. Introduce tree-walk.[ch] and move "struct tree_desc" and associated functions from various places. Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and move it to cache.h. This macro returns the canonicalized st_mode value in the host byte order for files, symlinks and directories -- to be compared with a tree_desc entry. create_ce_mode(mode) in cache.h is similar but is intended to be used for index entries (so it does not work for directories) and returns the value in the network byte order. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-30 08:55:43 +02:00			`#include "tree-walk.h"`
move struct pathspec and related functions to pathspec.[ch] Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-07-14 10:35:25 +02:00			`#include "pathspec.h"`
diff: convert struct combine_diff_path to object_id Also, convert a constant to GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-03-14 00:39:33 +01:00			`#include "object.h"`
diffcore: add a pickaxe option to find a specific blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) One might be tempted to extend git-describe to also work with blobs, such that `git describe <blob-id>` gives a description as '<commit-ish>:<path>'. This was implemented at [2]; as seen by the sheer number of responses (>110), it turns out this is tricky to get right. The hard part to get right is picking the correct 'commit-ish' as that could be the commit that (re-)introduced the blob or the blob that removed the blob; the blob could exist in different branches. Junio hinted at a different approach of solving this problem, which this patch implements. Teach the diff machinery another flag for restricting the information to what is shown. For example: $ ./git log --oneline --find-object=v2.0.0:Makefile b2feb64309 Revert the whole "ask curl-config" topic for now 47fbfded53 i18n: only extract comments marked with "TRANSLATORS:" we observe that the Makefile as shipped with 2.0 was appeared in v1.9.2-471-g47fbfded53 and in v2.0.0-rc1-5-gb2feb6430b. The reason why these commits both occur prior to v2.0.0 are evil merges that are not found using this new mechanism. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob [2] https://public-inbox.org/git/20171028004419.10139-1-sbeller@google.com/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-01-04 23:50:42 +01:00			`#include "oidset.h"`
Make the "struct tree_desc" operations available to others We have operations to "extract" and "update" a "struct tree_desc", but we only used them in tree-diff.c and they were static to that file. But other tree traversal functions can use them to their advantage Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-31 23:10:56 +01:00
Log message printout cleanups On Sun, 16 Apr 2006, Junio C Hamano wrote: > > In the mid-term, I am hoping we can drop the generate_header() > callchain _and_ the custom code that formats commit log in-core, > found in cmd_log_wc(). Ok, this was nastier than expected, just because the dependencies between the different log-printing stuff were absolutely _everywhere_, but here's a patch that does exactly that. The patch is not very easy to read, and the "--patch-with-stat" thing is still broken (it does not call the "show_log()" thing properly for merges). That's not a new bug. In the new world order it _should_ do something like if (rev->logopt) show_log(rev, rev->logopt, "---\n"); but it doesn't. I haven't looked at the --with-stat logic, so I left it alone. That said, this patch removes more lines than it adds, and in particular, the "cmd_log_wc()" loop is now a very clean: while ((commit = get_revision(rev)) != NULL) { log_tree_commit(rev, commit); free(commit->buffer); commit->buffer = NULL; } so it doesn't get much prettier than this. All the complexity is entirely hidden in log-tree.c, and any code that needs to flush the log literally just needs to do the "if (rev->logopt) show_log(...)" incantation. I had to make the combined_diff() logic take a "struct rev_info" instead of just a "struct diff_options", but that part is pretty clean. This does change "git whatchanged" from using "diff-tree" as the commit descriptor to "commit", and I changed one of the tests to reflect that new reality. Otherwise everything still passes, and my other tests look fine too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-17 20:59:32 +02:00			`struct rev_info;`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`struct diff_options;`
diff: support custom callbacks for output Users can request the DIFF_FORMAT_CALLBACK output format to get a callback consisting of the whole diff_queue_struct. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-07 08:35:42 +02:00			`struct diff_queue_struct;`
Add a prefix output callback to diff output The callback can be used to add some prefix string to each line of diff output. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-26 09:08:02 +02:00			`struct strbuf;`
textconv: make the API public The textconv functionality allows one to convert a file into text before running diff. But this functionality can be useful to other features such as blame. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-07 17:23:36 +02:00			`struct diff_filespec;`
			`struct userdiff_driver;`
Rename sha1_array to oid_array Since this structure handles an array of object IDs, rename it to struct oid_array. Also rename the accessor functions and the initialization constant. This commit was produced mechanically by providing non-Documentation files to the following Perl one-liners: perl -pi -E 's/struct sha1_array/struct oid_array/g' perl -pi -E 's/\bsha1_array_/oid_array_/g' perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g' Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-03-31 03:40:00 +02:00			`struct oid_array;`
pass struct commit to diff_tree_combined_merge() Instead of passing the hash of a commit and then searching that same commit in the single caller, simply pass the commit directly. Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-12-17 11:20:07 +01:00			`struct commit;`
tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Previously diff_tree(), which is now named ll_diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch hurts performance badly for generating combined diffs - in "combine-diff: optimize combine_diff_path sets intersection" I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could skip recursing into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(T,P1...Pn) calculation scheme ------------------------------- D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn) (regarding resulting paths set) D(T,Pj) - diff between T..Pj D(T,P1...Pn) - combined diff from T to parents P1,...,Pn We start from all trees, which are sorted, and compare their entries in lock-step: T P1 Pn - - - \|t\| \|p1\| \|pn\| \|-\| \|--\| ... \|--\| imin = argmin(p1...pn) \| \| \| \| \| \| \|-\| \|--\| \|--\| \|.\| \|. \| \|. \| . . . . . . at any time there could be 3 cases: 1) t < p[imin]; 2) t > p[imin]; 3) t = p[imin]. Schematic deduction of what every case means, and what to do, follows: 1) t < p[imin] -> ∀j t ∉ Pj -> "+t" ∈ D(T,Pj) -> D += "+t"; t↓ 2) t > p[imin] 2.1) ∃j: pj > p[imin] -> "-p[imin]" ∉ D(T,Pj) -> D += ø; ∀ pi=p[imin] pi↓ 2.2) ∀i pi = p[imin] -> pi ∉ T -> "-pi" ∈ D(T,Pi) -> D += "-p[imin]"; ∀i pi↓ 3) t = p[imin] 3.1) ∃j: pj > p[imin] -> "+t" ∈ D(T,Pj) -> only pi=p[imin] remains to investigate 3.2) pi = p[imin] -> investigate δ(t,pi) \| \| v 3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø -> ⎧δ(t,pi) - if pi=p[imin] -> D += ⎨ ⎩"+t" - if pi>p[imin] in any case t↓ ∀ pi=p[imin] pi↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme ------------------------- A B - - \|a\| \|b\| a < b -> a ∉ B -> D(A,B) += +a a↓ \|-\| \|-\| a > b -> b ∉ A -> D(A,B) += -b b↓ \| \| \| \| a = b -> investigate δ(a,b) a↓ b↓ \|-\| \|-\| \|.\| \|.\| . . . . ~~~~~~~~ This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff D(A,B) is by definition the same as combined diff D(A,[B]), so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff. What we do is as follows: 1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it. 2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. (see emit_diff() which is now named emit_diff_first_parent_only()) 3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES)) to work, that exporting have to be happening not in bulk, but incrementally, one diff path at a time. For such consumers, there is a new callback in diff_options introduced: ->pathchange(opt, struct combine_diff_path ) which, if set to !NULL, is called for every generated path. (see new compat ll_diff_tree_sha1() wrapper around new paths generator for setup) 4) The paths generation itself, is reworked from previous ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation scheme" provided above: On the start we allocate [nparent] arrays in place what was earlier just for one parent tree. then we just generalize loops, and comparison according to the algorithm. Some notes(): 1) alloca(), for small arrays, is used for "runs not slower for nparent=1 case than before" goal - if we change it to xmalloc()/free() the timings get ~1% worse. For alloca() we use just-introduced xalloca/xalloca_free compatibility wrappers, so it should not be a portability problem. 2) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance. 3) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We use efficient hand-made realloc-style path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown. 4) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure. Also - we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it. The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources. The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal. ~~~~~~~~ Timings for # without -c, i.e. testing only nparent=1 case `git log --raw --no-abbrev --no-renames` before and after the patch are as follows: navy.git linux.git v3.10..v3.11 before 0.611s 1.889s after 0.619s 1.907s slowdown 1.3% 0.9% This timings show we did no harm to usual diff(tree1,tree2) generation. From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch ("tree-diff: reuse base str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case, net timings stays approximately the same. The output also stayed the same. (*) If we revert 1)-4) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule. For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there. P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with `git log --reverse --raw --no-abbrev --no-renames -c` to extract log of what was created/changed when, as a result building a map {} sha1 -> in which commit (and date) a content was added that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. some valid sha1 would be absent from it. That case was my initial motivation for combined diffs speedup. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-06 23:46:26 +02:00			`struct combine_diff_path;`
diff.c: reduce implicit dependency on the_index diff and textconv code has so widespread use that it's hard to simply update their api and all call sites at once because it would result in a big patch. For now reduce the_index references to two places: diff_setup() and fill_textconv(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-09-21 17:57:19 +02:00			`struct repository;`
tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Previously diff_tree(), which is now named ll_diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch hurts performance badly for generating combined diffs - in "combine-diff: optimize combine_diff_path sets intersection" I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could skip recursing into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(T,P1...Pn) calculation scheme ------------------------------- D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn) (regarding resulting paths set) D(T,Pj) - diff between T..Pj D(T,P1...Pn) - combined diff from T to parents P1,...,Pn We start from all trees, which are sorted, and compare their entries in lock-step: T P1 Pn - - - \|t\| \|p1\| \|pn\| \|-\| \|--\| ... \|--\| imin = argmin(p1...pn) \| \| \| \| \| \| \|-\| \|--\| \|--\| \|.\| \|. \| \|. \| . . . . . . at any time there could be 3 cases: 1) t < p[imin]; 2) t > p[imin]; 3) t = p[imin]. Schematic deduction of what every case means, and what to do, follows: 1) t < p[imin] -> ∀j t ∉ Pj -> "+t" ∈ D(T,Pj) -> D += "+t"; t↓ 2) t > p[imin] 2.1) ∃j: pj > p[imin] -> "-p[imin]" ∉ D(T,Pj) -> D += ø; ∀ pi=p[imin] pi↓ 2.2) ∀i pi = p[imin] -> pi ∉ T -> "-pi" ∈ D(T,Pi) -> D += "-p[imin]"; ∀i pi↓ 3) t = p[imin] 3.1) ∃j: pj > p[imin] -> "+t" ∈ D(T,Pj) -> only pi=p[imin] remains to investigate 3.2) pi = p[imin] -> investigate δ(t,pi) \| \| v 3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø -> ⎧δ(t,pi) - if pi=p[imin] -> D += ⎨ ⎩"+t" - if pi>p[imin] in any case t↓ ∀ pi=p[imin] pi↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme ------------------------- A B - - \|a\| \|b\| a < b -> a ∉ B -> D(A,B) += +a a↓ \|-\| \|-\| a > b -> b ∉ A -> D(A,B) += -b b↓ \| \| \| \| a = b -> investigate δ(a,b) a↓ b↓ \|-\| \|-\| \|.\| \|.\| . . . . ~~~~~~~~ This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff D(A,B) is by definition the same as combined diff D(A,[B]), so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff. What we do is as follows: 1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it. 2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. (see emit_diff() which is now named emit_diff_first_parent_only()) 3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES)) to work, that exporting have to be happening not in bulk, but incrementally, one diff path at a time. For such consumers, there is a new callback in diff_options introduced: ->pathchange(opt, struct combine_diff_path ) which, if set to !NULL, is called for every generated path. (see new compat ll_diff_tree_sha1() wrapper around new paths generator for setup) 4) The paths generation itself, is reworked from previous ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation scheme" provided above: On the start we allocate [nparent] arrays in place what was earlier just for one parent tree. then we just generalize loops, and comparison according to the algorithm. Some notes(): 1) alloca(), for small arrays, is used for "runs not slower for nparent=1 case than before" goal - if we change it to xmalloc()/free() the timings get ~1% worse. For alloca() we use just-introduced xalloca/xalloca_free compatibility wrappers, so it should not be a portability problem. 2) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance. 3) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We use efficient hand-made realloc-style path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown. 4) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure. Also - we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it. The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources. The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal. ~~~~~~~~ Timings for # without -c, i.e. testing only nparent=1 case `git log --raw --no-abbrev --no-renames` before and after the patch are as follows: navy.git linux.git v3.10..v3.11 before 0.611s 1.889s after 0.619s 1.907s slowdown 1.3% 0.9% This timings show we did no harm to usual diff(tree1,tree2) generation. From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch ("tree-diff: reuse base str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case, net timings stays approximately the same. The output also stayed the same. (*) If we revert 1)-4) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule. For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there. P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with `git log --reverse --raw --no-abbrev --no-renames -c` to extract log of what was created/changed when, as a result building a map {} sha1 -> in which commit (and date) a content was added that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. some valid sha1 would be absent from it. That case was my initial motivation for combined diffs speedup. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-06 23:46:26 +02:00
			`typedef int (pathchange_fn_t)(struct diff_options options,`
			`struct combine_diff_path *path);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
			`typedef void (change_fn_t)(struct diff_options options,`
			`unsigned old_mode, unsigned new_mode,`
diff: convert diff_change to struct object_id Convert diff_change to take a struct object_id. In addition convert the function pointer type 'change_fn_t' to also take a struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-30 19:30:49 +02:00			`const struct object_id *old_oid,`
			`const struct object_id *new_oid,`
			`int old_oid_valid, int new_oid_valid,`
Performance optimization for detection of modified submodules In the worst case is_submodule_modified() got called three times for each submodule. The information we got from scanning the whole submodule tree the first time can be reused instead. New parameters have been added to diff_change() and diff_addremove(), the information is stored in a new member of struct diff_filespec. Its value is then reused instead of calling is_submodule_modified() again. When no explicit "-dirty" is needed in the output the call to is_submodule_modified() is not necessary when the submodules HEAD already disagrees with the ref of the superproject, as this alone marks it as modified. To achieve that, get_stat_data() got an extra argument. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-18 21:26:18 +01:00			`const char *fullpath,`
			`unsigned old_dirty_submodule, unsigned new_dirty_submodule);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
			`typedef void (add_remove_fn_t)(struct diff_options options,`
			`int addremove, unsigned mode,`
diff: convert diff_addremove to struct object_id Convert diff_addremove to take a struct object_id. In addtion convert the function pointer type 'add_remove_fn_t' to also take a struct object_id. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-30 19:30:47 +02:00			`const struct object_id *oid,`
			`int oid_valid,`
Performance optimization for detection of modified submodules In the worst case is_submodule_modified() got called three times for each submodule. The information we got from scanning the whole submodule tree the first time can be reused instead. New parameters have been added to diff_change() and diff_addremove(), the information is stored in a new member of struct diff_filespec. Its value is then reused instead of calling is_submodule_modified() again. When no explicit "-dirty" is needed in the output the call to is_submodule_modified() is not necessary when the submodules HEAD already disagrees with the ref of the superproject, as this alone marks it as modified. To achieve that, get_stat_data() got an extra argument. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-18 21:26:18 +01:00			`const char *fullpath, unsigned dirty_submodule);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
diff: support custom callbacks for output Users can request the DIFF_FORMAT_CALLBACK output format to get a callback consisting of the whole diff_queue_struct. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-07 08:35:42 +02:00			`typedef void (diff_format_fn_t)(struct diff_queue_struct q,`
			`struct diff_options options, void data);`

Add a prefix output callback to diff output The callback can be used to add some prefix string to each line of diff output. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-26 09:08:02 +02:00			`typedef struct strbuf (diff_prefix_fn_t)(struct diff_options opt, void data);`

Merge with_raw, with_stat and summary variables to output_format DIFF_FORMAT_* are now bit-flags instead of enumerated values. Signed-off-by: Timo Hirvonen <tihirvon@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-24 19:21:53 +02:00			`#define DIFF_FORMAT_RAW 0x0001`
			`#define DIFF_FORMAT_DIFFSTAT 0x0002`
diff --numstat [jc: with documentation from Jakub] Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-12 12:01:00 +02:00			`#define DIFF_FORMAT_NUMSTAT 0x0004`
			`#define DIFF_FORMAT_SUMMARY 0x0008`
			`#define DIFF_FORMAT_PATCH 0x0010`
make commit message a little more consistent and conforting It is nicer to let the user know when a commit succeeded all the time, not only the first time. Also the commit sha1 is much more useful than the tree sha1 in this case. This patch also introduces a -q switch to supress this message as well as the summary of created/deleted files. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-15 05:15:44 +01:00			`#define DIFF_FORMAT_SHORTSTAT 0x0020`
Add "--dirstat" for some directory statistics This adds a new form of overview diffstat output, doing something that I have occasionally ended up doing manually (and badly, because it's actually pretty nasty to do), and that I think is very useful for an project like the kernel that has a fairly deep and well-separated directory structure with semantic meaning. What I mean by that is that it's often interesting to see exactly which sub-directories are impacted by a patch, and to what degree - even if you don't perhaps care so much about the individual files themselves. What makes the concept more interesting is that the "impact" is often hierarchical: in the kernel, for example, something could either have a very localized impact to "fs/ext3/" and then it's interesting to see that such a patch changes mostly that subdirectory, but you could have another patch that changes some generic VFS-layer issue which affects _many_ subdirectories that are all under "fs/", but none - or perhaps just a couple of them - of the individual filesystems are interesting in themselves. So what commonly happens is that you may have big changes in a specific sub-subdirectory, but still also significant separate changes to the subdirectory leading up to that - maybe you have significant VFS-level changes, but also changes under that VFS layer in the NFS-specific directories, for example. In that case, you do want the low-level parts that are significant to show up, but then the insignificant ones should show up as under the more generic top-level directory. This patch shows all of that with "--dirstat". The output can be either something simple like commit 81772fe... Author: Thomas Gleixner <tglx@linutronix.de> Date: Sun Feb 10 23:57:36 2008 +0100 x86: remove over noisy debug printk pageattr-test.c contains a noisy debug printk that people reported. The condition under which it prints (randomly tapping into a mem_map[] hole and not being able to c_p_a() there) is valid behavior and not interesting to report. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 100.0% arch/x86/mm/ or something much more complex like commit e231c2e... Author: David Howells <dhowells@redhat.com> Date: Thu Feb 7 00:15:26 2008 -0800 Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p) 20.5% crypto/ 7.6% fs/afs/ 7.6% fs/fuse/ 7.6% fs/gfs2/ 5.1% fs/jffs2/ 5.1% fs/nfs/ 5.1% fs/nfsd/ 7.6% fs/reiserfs/ 15.3% fs/ 7.6% net/rxrpc/ 10.2% security/keys/ where that latter example is an example of significant work in some individual fs// subdirectories (like the patches to reiserfs accounting for 7.6% of the whole), but then discounting those individual filesystems, there's also 15.3% other "random" things that weren't worth reporting on their oen left over under fs/ in general (either in that directory itself, or in subdirectories of fs/ that didn't have enough changes to be reported individually). I'd like to stress that the "15.3% fs/" mentioned above is the stuff that is under fs/ but that was _not_ significant enough to report on its own. So the above does _not_ mean that 15.3% of the work was under fs/ per se, because that 15.3% does not* include the already-reported 7.6% of afs, 7.6% of fuse etc. If you want to enable "cumulative" directory statistics, you can use the "--cumulative" flag, which adds up percentages recursively even when they have been already reported for a sub-directory. That cumulative output is disabled if all of the changes in one subdirectory come from a deeper subdirectory, to avoid repeating subdirectories all the way to the root. For an example of the cumulative reporting, the above commit becomes commit e231c2e... Author: David Howells <dhowells@redhat.com> Date: Thu Feb 7 00:15:26 2008 -0800 Convert ERR_PTR(PTR_ERR(p)) instances to ERR_CAST(p) 20.5% crypto/ 7.6% fs/afs/ 7.6% fs/fuse/ 7.6% fs/gfs2/ 5.1% fs/jffs2/ 5.1% fs/nfs/ 5.1% fs/nfsd/ 7.6% fs/reiserfs/ 61.5% fs/ 7.6% net/rxrpc/ 10.2% security/keys/ in which the commit percentages now obviously add up to much more than 100%: now the changes that were already reported for the sub-directories under fs/ are then cumulatively included in the whole percentage of fs/ (ie now shows 61.5% as opposed to the 15.3% without the cumulative reporting). The default reporting limit has been arbitrarily set at 3%, which seems to be a pretty good cut-off, but you can specify the cut-off manually by giving it as an option parameter (eg "--dirstat=5" makes the cut-off be at 5% instead) NOTE! The percentages are purely about the total lines added and removed, not anything smarter (or dumber) than that. Also note that you should not generally expect things to add up to 100%: not only does it round down, we don't report leftover scraps (they add up to the top-level change count, but we don't even bother reporting that, it only reports subdirectories). Quite frankly, as a top-level manager this is really convenient for me, but it's going to be very boring for git itself since there are few subdirectories. Also, don't expect things to make tons of sense if you combine this with "-M" and there are cross-directory renames etc. But even for git itself, you can get some fun statistics. Try out git log --dirstat and see the occasional mentions of things like Documentation/, git-gui/, gitweb/ and gitk-git/. Or try out something like git diff --dirstat v1.5.0..v1.5.4 which does kind of git an overview that shows something. But in general, the output is more exciting for big projects with deeper structure, and doing a git diff --dirstat v2.6.24..v2.6.25-rc1 on the kernel is what I actually wrote this for! Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-12 22:26:31 +01:00			`#define DIFF_FORMAT_DIRSTAT 0x0040`
Merge with_raw, with_stat and summary variables to output_format DIFF_FORMAT_* are now bit-flags instead of enumerated values. Signed-off-by: Timo Hirvonen <tihirvon@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-24 19:21:53 +02:00
			`/* These override all above */`
diff --numstat [jc: with documentation from Jakub] Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-12 12:01:00 +02:00			`#define DIFF_FORMAT_NAME 0x0100`
			`#define DIFF_FORMAT_NAME_STATUS 0x0200`
			`#define DIFF_FORMAT_CHECKDIFF 0x0400`
Merge with_raw, with_stat and summary variables to output_format DIFF_FORMAT_* are now bit-flags instead of enumerated values. Signed-off-by: Timo Hirvonen <tihirvon@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-24 19:21:53 +02:00
			`/* Same as output_format = 0 but we know that -s flag was given`
			`* and we should not give default value to output_format.`
			`*/`
diff --numstat [jc: with documentation from Jakub] Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-12 12:01:00 +02:00			`#define DIFF_FORMAT_NO_OUTPUT 0x0800`
Merge with_raw, with_stat and summary variables to output_format DIFF_FORMAT_* are now bit-flags instead of enumerated values. Signed-off-by: Timo Hirvonen <tihirvon@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-24 19:21:53 +02:00
diff --numstat [jc: with documentation from Jakub] Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-12 12:01:00 +02:00			`#define DIFF_FORMAT_CALLBACK 0x1000`
diff: support custom callbacks for output Users can request the DIFF_FORMAT_CALLBACK output format to get a callback consisting of the whole diff_queue_struct. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-07 08:35:42 +02:00
diff: convert flags to be stored in bitfields We cannot add many more flags to the diff machinery due to the limitations of the number of flags that can be stored in a single unsigned int. In order to allow for more flags to be added to the diff machinery in the future this patch converts the flags to be stored in bitfields in 'struct diff_flags'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-10-31 19:19:05 +01:00			`#define DIFF_FLAGS_INIT { 0 }`
			`struct diff_flags {`
diff: make struct diff_flags members lowercase Now that the flags stored in struct diff_flags are being accessed directly and not through macros, change all struct members from being uppercase to lowercase. This conversion is done using the following semantic patch: @@ expression E; @@ - E.RECURSIVE + E.recursive @@ expression E; @@ - E.TREE_IN_RECURSIVE + E.tree_in_recursive @@ expression E; @@ - E.BINARY + E.binary @@ expression E; @@ - E.TEXT + E.text @@ expression E; @@ - E.FULL_INDEX + E.full_index @@ expression E; @@ - E.SILENT_ON_REMOVE + E.silent_on_remove @@ expression E; @@ - E.FIND_COPIES_HARDER + E.find_copies_harder @@ expression E; @@ - E.FOLLOW_RENAMES + E.follow_renames @@ expression E; @@ - E.RENAME_EMPTY + E.rename_empty @@ expression E; @@ - E.HAS_CHANGES + E.has_changes @@ expression E; @@ - E.QUICK + E.quick @@ expression E; @@ - E.NO_INDEX + E.no_index @@ expression E; @@ - E.ALLOW_EXTERNAL + E.allow_external @@ expression E; @@ - E.EXIT_WITH_STATUS + E.exit_with_status @@ expression E; @@ - E.REVERSE_DIFF + E.reverse_diff @@ expression E; @@ - E.CHECK_FAILED + E.check_failed @@ expression E; @@ - E.RELATIVE_NAME + E.relative_name @@ expression E; @@ - E.IGNORE_SUBMODULES + E.ignore_submodules @@ expression E; @@ - E.DIRSTAT_CUMULATIVE + E.dirstat_cumulative @@ expression E; @@ - E.DIRSTAT_BY_FILE + E.dirstat_by_file @@ expression E; @@ - E.ALLOW_TEXTCONV + E.allow_textconv @@ expression E; @@ - E.TEXTCONV_SET_VIA_CMDLINE + E.textconv_set_via_cmdline @@ expression E; @@ - E.DIFF_FROM_CONTENTS + E.diff_from_contents @@ expression E; @@ - E.DIRTY_SUBMODULES + E.dirty_submodules @@ expression E; @@ - E.IGNORE_UNTRACKED_IN_SUBMODULES + E.ignore_untracked_in_submodules @@ expression E; @@ - E.IGNORE_DIRTY_SUBMODULES + E.ignore_dirty_submodules @@ expression E; @@ - E.OVERRIDE_SUBMODULE_CONFIG + E.override_submodule_config @@ expression E; @@ - E.DIRSTAT_BY_LINE + E.dirstat_by_line @@ expression E; @@ - E.FUNCCONTEXT + E.funccontext @@ expression E; @@ - E.PICKAXE_IGNORE_CASE + E.pickaxe_ignore_case @@ expression E; @@ - E.DEFAULT_FOLLOW_RENAMES + E.default_follow_renames Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-10-31 19:19:11 +01:00			`unsigned recursive:1;`
			`unsigned tree_in_recursive:1;`
			`unsigned binary:1;`
			`unsigned text:1;`
			`unsigned full_index:1;`
			`unsigned silent_on_remove:1;`
			`unsigned find_copies_harder:1;`
			`unsigned follow_renames:1;`
			`unsigned rename_empty:1;`
			`unsigned has_changes:1;`
			`unsigned quick:1;`
			`unsigned no_index:1;`
			`unsigned allow_external:1;`
			`unsigned exit_with_status:1;`
			`unsigned reverse_diff:1;`
			`unsigned check_failed:1;`
			`unsigned relative_name:1;`
			`unsigned ignore_submodules:1;`
			`unsigned dirstat_cumulative:1;`
			`unsigned dirstat_by_file:1;`
			`unsigned allow_textconv:1;`
			`unsigned textconv_set_via_cmdline:1;`
			`unsigned diff_from_contents:1;`
			`unsigned dirty_submodules:1;`
			`unsigned ignore_untracked_in_submodules:1;`
			`unsigned ignore_dirty_submodules:1;`
			`unsigned override_submodule_config:1;`
			`unsigned dirstat_by_line:1;`
			`unsigned funccontext:1;`
			`unsigned default_follow_renames:1;`
diff: add --compact-summary Certain information is currently shown with --summary, but when used in combination with --stat it's a bit hard to read since info of the same file is in two places (--stat and --summary). On top of that, commits that add or remove files double the number of display lines, which could be a lot if you add or remove a lot of files. --compact-summary embeds most of --summary back in --stat in the little space between the file name part and the graph line, e.g. with commit 0433d533f1: Documentation/merge-config.txt \| 4 + builtin/merge.c \| 2 + ...-pull-verify-signatures.sh (new +x) \| 81 ++++++++++++++ t/t7612-merge-verify-signatures.sh \| 45 ++++++++ 4 files changed, 132 insertions(+) It helps both condensing information and saving some text space. What's new in diffstat is: - A new 0644 file is shown as (new) - A new 0755 file is shown as (new +x) - A new symlink is shown as (new +l) - A deleted file is shown as (gone) - A mode change adding executable bit is shown as (mode +x) - A mode change removing it is shown as (mode -x) Note that --compact-summary does not contain all the information --summary provides. Rewrite percentage is not shown but it could be added later, like R50% or C20%. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-02-24 15:09:59 +01:00			`unsigned stat_with_summary:1;`
range-diff: suppress the diff headers When showing the diff between corresponding patches of the two branch versions, we have to make up a fake filename to run the diff machinery. That filename does not carry any meaningful information, hence tbdiff suppresses it. So we should, too. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-08-13 13:33:11 +02:00			`unsigned suppress_diff_headers:1;`
diff: add an internal option to dual-color diffs of diffs When diffing diffs, it can be quite daunting to figure out what the heck is going on, as there are nested +/- signs. Let's make this easier by adding a flag in diff_options that allows color-coding the outer diff sign with inverted colors, so that the preimage and postimage is colored like the diff it is. Of course, this really only makes sense when the preimage and postimage are diffs. So let's not expose this flag via a command-line option for now. This is a feature that was invented by git-tbdiff, and it will be used by `git range-diff` in the next commit, by offering it via a new option: `--dual-color`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-08-13 13:33:20 +02:00			`unsigned dual_color_diffed_diffs:1;`
diff: convert flags to be stored in bitfields We cannot add many more flags to the diff machinery due to the limitations of the number of flags that can be stored in a single unsigned int. In order to allow for more flags to be added to the diff machinery in the future this patch converts the flags to be stored in bitfields in 'struct diff_flags'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-10-31 19:19:05 +01:00			`};`

			`static inline void diff_flags_or(struct diff_flags *a,`
			`const struct diff_flags *b)`
			`{`
			`char tmp_a = (char )a;`
			`const char tmp_b = (const char )b;`
			`int i;`

			`for (i = 0; i < sizeof(struct diff_flags); i++)`
			`tmp_a[i] \|= tmp_b[i];`
			`}`

Use DIFF_XDL_SET/DIFF_OPT_SET instead of raw bit-masking Signed-off-by: Keith Cascio <keith@cs.ucla.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-17 04:26:49 +01:00			`#define DIFF_XDL_TST(opts, flag) ((opts)->xdl_opts & XDF_##flag)`
			`#define DIFF_XDL_SET(opts, flag) ((opts)->xdl_opts \|= XDF_##flag)`
			`#define DIFF_XDL_CLR(opts, flag) ((opts)->xdl_opts &= ~XDF_##flag)`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00
xdiff: PATIENCE/HISTOGRAM are not independent option bits Because the default Myers, patience and histogram algorithms cannot be in effect at the same time, XDL_PATIENCE_DIFF and XDL_HISTOGRAM_DIFF are not independent bits. Instead of wasting one bit per algorithm, define a few macros to access the few bits they occupy and update the code that access them. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-02-20 00:36:55 +01:00			`#define DIFF_WITH_ALG(opts, flag) (((opts)->xdl_opts & ~XDF_DIFF_ALGORITHM_MASK) \| XDF_##flag)`

diff: add --word-diff option that generalizes --color-words This teaches the --color-words engine a more general interface that supports two new modes: * --word-diff=plain, inspired by the 'wdiff' utility (most similar to 'wdiff -n <old> <new>'): uses delimiters [-removed-] and {+added+} * --word-diff=porcelain, which generates an ad-hoc machine readable format: - each diff unit is prefixed by [-+ ] and terminated by newline as in unified diff - newlines in the input are output as a line consisting only of a tilde '~' Both of these formats still support color if it is enabled, using it to highlight the differences. --color-words becomes a synonym for --word-diff=color, which is the color-only format. Also adds some compatibility/convenience options. Thanks to Junio C Hamano and Miles Bader for good ideas. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-04-14 17:59:06 +02:00			`enum diff_words_type {`
			`DIFF_WORDS_NONE = 0,`
			`DIFF_WORDS_PORCELAIN,`
			`DIFF_WORDS_PLAIN,`
			`DIFF_WORDS_COLOR`
			`};`

diff: prepare for additional submodule formats A future patch will add a new format for displaying the difference of a submodule. Make it easier by changing how we store the current selected format. Replace the DIFF_OPT flag with an enumeration, as each format will be mutually exclusive. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-09-01 01:27:21 +02:00			`enum diff_submodule_format {`
			`DIFF_SUBMODULE_SHORT = 0,`
diff: teach diff to display submodule difference with an inline diff Teach git-diff and friends a new format for displaying the difference of a submodule. The new format is an inline diff of the contents of the submodule between the commit range of the update. This allows the user to see the actual code change caused by a submodule update. Add tests for the new format and option. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-09-01 01:27:25 +02:00			`DIFF_SUBMODULE_LOG,`
			`DIFF_SUBMODULE_INLINE_DIFF`
diff: prepare for additional submodule formats A future patch will add a new format for displaying the difference of a submodule. Make it easier by changing how we store the current selected format. Replace the DIFF_OPT flag with an enumeration, as each format will be mutually exclusive. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-09-01 01:27:21 +02:00			`};`

Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`struct diff_options {`
			`const char *orderfile;`
			`const char *pickaxe;`
git-pickaxe: rename detection optimization The idea is that we are interested in renaming into only one path, so we do not care about renames that happen elsewhere. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-02 09:02:11 +01:00			`const char *single_follow;`
Teach diff machinery to display other prefixes than "a/" and "b/" With the new options "--src-prefix=<prefix>", "--dst-prefix=<prefix>" and "--no-prefix", you can now control the path prefixes of the diff machinery. These used to by hardwired to "a/" for the source prefix and "b/" for the destination prefix. Initial patch by Pascal Obry. Sane option names suggested by Linus. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-18 20:32:14 +01:00			`const char a_prefix, b_prefix;`
graph: add support for --line-prefix on all graph-aware output Add an extension to git-diff and git-log (and any other graph-aware displayable output) such that "--line-prefix=<string>" will print the additional line-prefix on every line of output. To make this work, we have to fix a few bugs in the graph API that force graph_show_commit_msg to be used only when you have a valid graph. Additionally, we extend the default_diff_output_prefix handler to work even when no graph is enabled. This is somewhat of a hack on top of the graph API, but I think it should be acceptable here. This will be used by a future extension of submodule display which displays the submodule diff as the actual diff between the pre and post commit in the submodule project. Add some tests for both git-log and git-diff to ensure that the prefix is honored correctly. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-09-01 01:27:20 +02:00			`const char *line_prefix;`
			`size_t line_prefix_length;`
diff: convert flags to be stored in bitfields We cannot add many more flags to the diff machinery due to the limitations of the number of flags that can be stored in a single unsigned int. In order to allow for more flags to be added to the diff machinery in the future this patch converts the flags to be stored in bitfields in 'struct diff_flags'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-10-31 19:19:05 +01:00			`struct diff_flags flags;`
diff: preparse --diff-filter string argument Instead of running strchr() on the list of status characters over and over again, parse the --diff-filter option into bitfields and use the bits to see if the change to the filepair matches the status requested. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-07-18 00:05:46 +02:00
			`/* diff-filter bits */`
			`unsigned int filter;`

diff: refactor COLOR_DIFF from a flag into an int This lets us store more than just a bit flag for whether we want color; we can also store whether we want automatic colors. This can be useful for making the automatic-color decision closer to the point of use. This mostly just involves replacing DIFF_OPT_* calls with manipulations of the flag. The biggest exception is that calls to DIFF_OPT_TST must check for "o->use_color > 0", which lets an "unknown" value (i.e., the default) stay at "no color". In the previous code, a value of "-1" was not propagated at all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-18 07:03:12 +02:00			`int use_color;`
git diff: support "-U" and "--unified" options properly We used to parse "-U" and "--unified" as part of the GIT_DIFF_OPTS environment variable, but strangely enough we would _not_ parse them as part of the normal diff command line (where we only accepted "-u"). This adds parsing of -U and --unified, both with an optional numeric argument. So now you can just say git diff --unified=5 to get a unified diff with a five-line context, instead of having to do something silly like GIT_DIFF_OPTS="--unified=5" git diff -u (that silly format does continue to still work, of course). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-13 22:23:48 +02:00			`int context;`
diff: add option to show context between close hunks Merge two hunks if there is only the specified number of otherwise unshown context between them. For --inter-hunk-context=1, the resulting patch has the same number of lines but shows uninterrupted context instead of a context header line in between. Patches generated with this option are easier to read but are also more likely to conflict if the file to be patched contains other changes. This patch keeps the default for this option at 0. It is intended to just make the feature available in order to see its advantages and downsides. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-12-28 19:45:32 +01:00			`int interhunkcontext;`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`int break_opt;`
			`int detect_rename;`
git diff -D: omit the preimage of deletes When reviewing a patch while concentrating primarily on the text after then change, wading through pages of deleted text involves a cognitive burden. Introduce the -D option that omits the preimage text from the patch output for deleted files. When used with -B (represent total rewrite as a single wholesale deletion followed by a single wholesale addition), the preimage text is also omitted. To prevent such a patch from being applied by mistake, the output is designed not to be usable by "git apply" (or GNU "patch"); it is strictly for human consumption. It of course is possible to "apply" such a patch by hand, as a human can read the intention out of such a patch. It however is impossible to apply such a patch even manually in reverse, as the whole point of this option is to omit the information necessary to do so from the output. Initial request by Mart Sõmermaa, documentation and tests helped by Michael J Gruber. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-01 01:11:55 +01:00			`int irreversible_delete;`
git-diff: squelch "empty" diffs After starting to edit a working tree file but later when your edit ends up identical to the original (this can also happen when you ran a wholesale regexp replace with something like "perl -i" that does not actually modify many of the paths), "git diff" between the index and the working tree outputs many "empty" diffs that show "diff --git" headers and nothing else, because these paths are stat-dirty. While it was a way to warn the user that the earlier action of the user made the index ineffective as an optimization mechanism, it was felt too loud for the purpose of warning even to experienced users, and also resulted in confusing people new to git. This replaces the "empty" diffs with a single warning message at the end. Having many such paths hurts performance, and you can run "git-update-index --refresh" to update the lstat(2) information recorded in the index in such a case. "git-status" does so as a side effect, and that is more familiar to the end-user, so we recommend it to them. The change affects only "git diff" that outputs patch text, because that is where the annoyance of too many "empty" diff is most strongly felt, and because the warning message can be safely ignored by downstream tools without getting mistaken as part of the patch. For the low-level "git diff-files" and "git diff-index", the traditional behaviour is retained. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-03 22:33:31 +02:00			`int skip_stat_unmatch;`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`int line_termination;`
			`int output_format;`
diff.h: make pickaxe_opts an unsigned bit field This variable is used as a bit field[1], and as we are about to add more fields, indicate its usage as a bit field by making it unsigned. [1] containing the bits #define DIFF_PICKAXE_ALL 1 #define DIFF_PICKAXE_REGEX 2 #define DIFF_PICKAXE_KIND_S 4 #define DIFF_PICKAXE_KIND_G 8 Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-01-04 23:50:39 +01:00			`unsigned pickaxe_opts;`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`int rename_score;`
Diff: -l<num> to limit rename/copy detection. When many paths are modified, rename detection takes a lot of time. The new option -l<num> can be used to disable rename detection when more than <num> paths are possibly created as renames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:18:27 +02:00			`int rename_limit;`
merge: improve inexact rename limit warning The warning is generated deep in the diffcore code, which means that it will come first, followed possibly by a spew of conflicts, making it hard to see. Instead, let's have diffcore pass back the information about how big the rename limit would needed to have been, and then the caller can provide a more appropriate message (and at a more appropriate time). No refactoring of other non-merge callers is necessary, because nobody else was even using the warn_on_rename_limit feature. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-02-19 11:20:51 +01:00			`int needed_rename_limit;`
diffcore-rename: fall back to -C when -C -C busts the rename limit When there are too many paths in the project, the number of rename source candidates "git diff -C -C" finds will exceed the rename detection limit, and no inexact rename detection is performed. We however could fall back to "git diff -C" if the number of modified paths is sufficiently small. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-01-06 22:50:06 +01:00			`int degraded_cc_to_c;`
add inexact rename detection progress infrastructure We might spend many seconds doing inexact rename detection with no output. It's nice to let the user know that something is actually happening. This patch adds the infrastructure, but no callers actually turn on progress reporting. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-02-20 10:51:16 +01:00			`int show_rename_progress;`
Allow specifying --dirstat cut-off percentage as a floating point number Only the first digit after the decimal point is kept, as the dirstat calculations all happen in permille. Selftests verifying floating-point percentage input has been added. Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-04-29 11:36:20 +02:00			`int dirstat_permille;`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`int setup;`
diff: --abbrev option When I show transcripts to explain how something works, I often find myself hand-editing the diff-raw output to shorten various object names in the output. This adds --abbrev option to the diff family, which shortens diff-raw output and diff-tree commit id headers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-14 02:21:41 +01:00			`int abbrev;`
diff-lib: allow ita entries treated as "not yet exist in index" When comparing the index and the working tree to show which paths are new, and comparing the tree recorded in the HEAD and the index to see if committing the contents recorded in the index would result in an empty commit, we would want the former comparison to say "these are new paths" and the latter to say "there is no change" for paths that are marked as intent-to-add. We made a similar attempt at d95d728a ("diff-lib.c: adjust position of i-t-a entries in diff", 2015-03-16), which redefined the semantics of these two comparison modes globally, which was a disaster and had to be reverted at 78cc1a54 ("Revert "diff-lib.c: adjust position of i-t-a entries in diff"", 2015-06-23). To make sure we do not repeat the same mistake, introduce a new internal diffopt option so that this different semantics can be asked for only by callers that ask it, while making sure other unaudited callers will get the same comparison result. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-24 12:42:19 +02:00			`int ita_invisible_in_index;`
diff.c: --ws-error-highlight=<kind> option Traditionally, we only cared about whitespace breakages introduced in new lines. Some people want to paint whitespace breakages on old lines, too. When they see a whitespace breakage on a new line, they can spot the same kind of whitespace breakage on the corresponding old line and want to say "Ah, those breakages are there but they were inherited from the original, so let's not touch them for now." Introduce `--ws-error-highlight=<kind>` option, that lets them pass a comma separated list of `old`, `new`, and `context` to specify what lines to highlight whitespace errors on. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-26 19:11:28 +02:00			`/* white-space error highlighting */`
diff.c: migrate emit_line_checked to use emit_diff_symbol Add a new flags field to emit_diff_symbol, that will be used by context lines for: * white space rules that are applicable (The first 12 bits) Take a note in cahe.c as well, when this ws rules are extended we have to fix the bits in the flags field. * how the rules are evaluated (actually this double encodes the sign of the line, but the code is easier to keep this way, bits 13,14,15) * if the line a blank line at EOF (bit 16) The check if new lines need to be marked up as extra lines at the end of file, is now done unconditionally. That should be ok, as 'new_blank_line_at_eof' has a quick early return. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-30 02:06:53 +02:00			`#define WSEH_NEW (1<<12)`
			`#define WSEH_CONTEXT (1<<13)`
			`#define WSEH_OLD (1<<14)`
diff.c: --ws-error-highlight=<kind> option Traditionally, we only cared about whitespace breakages introduced in new lines. Some people want to paint whitespace breakages on old lines, too. When they see a whitespace breakage on a new line, they can spot the same kind of whitespace breakage on the corresponding old line and want to say "Ah, those breakages are there but they were inherited from the original, so let's not touch them for now." Introduce `--ws-error-highlight=<kind>` option, that lets them pass a comma separated list of `old`, `new`, and `context` to specify what lines to highlight whitespace errors on. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-26 19:11:28 +02:00			`unsigned ws_error_highlight;`
diff --relative: output paths as relative to the current subdirectory This adds --relative option to the diff family. When you start from a subdirectory: $ git diff --relative shows only the diff that is inside your current subdirectory, and without $prefix part. People who usually live in subdirectories may like it. There are a few things I should also mention about the change: - This works not just with diff but also works with the log family of commands, but the history pruning is not affected. In other words, if you go to a subdirectory, you can say: $ git log --relative -p but it will show the log message even for commits that do not touch the current directory. You can limit it by giving pathspec yourself: $ git log --relative -p . This originally was not a conscious design choice, but we have a way to affect diff pathspec and pruning pathspec independently. IOW "git log --full-diff -p ." tells it to prune history to commits that affect the current subdirectory but show the changes with full context. I think it makes more sense to leave pruning independent from --relative than the obvious alternative of always pruning with the current subdirectory, which would break the symmetry. - Because this works also with the log family, you could format-patch a single change, limiting the effect to your subdirectory, like so: $ cd gitk-git $ git format-patch -1 --relative 911f1eb But because that is a special purpose usage, this option will never become the default, with or without repository or user preference configuration. The risk of producing a partial patch and sending it out by mistake is too great if we did so. - This is inherently incompatible with --no-index, which is a bolted-on hack that does not have much to do with git itself. I didn't bother checking and erroring out on the combined use of the options, but probably I should. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-12 23:26:02 +01:00			`const char *prefix;`
			`int prefix_length;`
fmt-patch: Support --attach This patch touches a couple of files, because it adds options to print a custom text just after the subject of a commit, and just after the diffstat. [jc: made "many dashes" used as the boundary leader into a single variable, to reduce the possibility of later tweaks to miscount the number of dashes to break it.] Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 15:40:29 +02:00			`const char *stat_sep;`
Teach diff about -b and -w flags This adds -b (--ignore-space-change) and -w (--ignore-all-space) flags to diff. The main part of the patch is teaching libxdiff about it. [jc: renamed xdl_line_match() to xdl_recmatch() since the former is used for different purposes in xpatchi.c which is in the parts of the upstream source we do not use.] Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-14 17:40:23 +02:00			`long xdl_opts;`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
diff: support anchoring line(s) Teach diff a new algorithm, one that attempts to prevent user-specified lines from appearing as a deletion or addition in the end result. The end user can use this by specifying "--anchored=<text>" one or more times when using Git commands like "diff" and "show". Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-11-27 20:47:47 +01:00			`/* see Documentation/diff-options.txt */`
			`char **anchors;`
			`size_t anchors_nr, anchors_alloc;`

diff --stat: allow custom diffstat output width. This adds two parameters to "diff --stat". . --stat-width=72 tells that the page should fit on 72-column output. . --stat-name-width=30 tells that the filename part is limited to 30 columns. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-27 03:53:02 +02:00			`int stat_width;`
			`int stat_name_width;`
diff --stat: enable limiting of the graph part A new option --stat-graph-width=<width> can be used to limit the width of the graph part even is more space is available. Up to <width> columns will be used for the graph. If commits changing a lot of lines are displayed in a wide terminal window (200 or more columns), and the +- graph uses the full width, the output can be hard to comfortably scan with a horizontal movement of human eyes. Messages wrapped to about 80 columns would be interspersed with very long +- lines. It makes sense to limit the width of the graph part to a fixed value (e.g. 70 columns), even if more columns are available. Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-01 13:26:45 +01:00			`int stat_graph_width;`
diff: introduce --stat-lines to limit the stat lines Often one is interested in the full --stat output only for commits which change a few files, but not others, because larger restructuring gives a --stat which fills a few screens. Introduce a new option --stat-count=<count> which limits the --stat output to the first <count> lines, followed by a "..." line. It can also be given as the third parameter in --stat=<width>,<name-width>,<count>. Also, the unstuck form is supported analogous to the other two stat parameters. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-27 14:36:41 +02:00			`int stat_count;`
color-words: take an optional regular expression describing words In some applications, words are not delimited by white space. To allow for that, you can specify a regular expression describing what makes a word with git diff --color-words='[A-Za-z0-9]+' Note that words cannot contain newline characters. As suggested by Thomas Rast, the words are the exact matches of the regular expression. Note that a regular expression beginning with a '^' will match only a word at the beginning of the hunk, not a word at the beginning of a line, and is probably not what you want. This commit contains a quoting fix by Thomas Rast. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-17 17:29:45 +01:00			`const char *word_regex;`
diff: add --word-diff option that generalizes --color-words This teaches the --color-words engine a more general interface that supports two new modes: * --word-diff=plain, inspired by the 'wdiff' utility (most similar to 'wdiff -n <old> <new>'): uses delimiters [-removed-] and {+added+} * --word-diff=porcelain, which generates an ad-hoc machine readable format: - each diff unit is prefixed by [-+ ] and terminated by newline as in unified diff - newlines in the input are output as a line consisting only of a tilde '~' Both of these formats still support color if it is enabled, using it to highlight the differences. --color-words becomes a synonym for --word-diff=color, which is the color-only format. Also adds some compatibility/convenience options. Thanks to Junio C Hamano and Miles Bader for good ideas. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-04-14 17:59:06 +02:00			`enum diff_words_type word_diff;`
diff: prepare for additional submodule formats A future patch will add a new format for displaying the difference of a submodule. Make it easier by changing how we store the current selected format. Replace the DIFF_OPT flag with an enumeration, as each format will be mutually exclusive. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-09-01 01:27:21 +02:00			`enum diff_submodule_format submodule_format;`
diff --stat: allow custom diffstat output width. This adds two parameters to "diff --stat". . --stat-width=72 tells that the page should fit on 72-column output. . --stat-name-width=30 tells that the filename part is limited to 30 columns. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-27 03:53:02 +02:00
diffcore: add a pickaxe option to find a specific blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) One might be tempted to extend git-describe to also work with blobs, such that `git describe <blob-id>` gives a description as '<commit-ish>:<path>'. This was implemented at [2]; as seen by the sheer number of responses (>110), it turns out this is tricky to get right. The hard part to get right is picking the correct 'commit-ish' as that could be the commit that (re-)introduced the blob or the blob that removed the blob; the blob could exist in different branches. Junio hinted at a different approach of solving this problem, which this patch implements. Teach the diff machinery another flag for restricting the information to what is shown. For example: $ ./git log --oneline --find-object=v2.0.0:Makefile b2feb64309 Revert the whole "ask curl-config" topic for now 47fbfded53 i18n: only extract comments marked with "TRANSLATORS:" we observe that the Makefile as shipped with 2.0 was appeared in v1.9.2-471-g47fbfded53 and in v2.0.0-rc1-5-gb2feb6430b. The reason why these commits both occur prior to v2.0.0 are evil merges that are not found using this new mechanism. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob [2] https://public-inbox.org/git/20171028004419.10139-1-sbeller@google.com/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-01-04 23:50:42 +01:00			`struct oidset *objfind;`

diff --no-index: also imitate the exit status of diff(1) diff sets the exit status to 0 when no changes were found, to 1 when changes were found, and 2 means error. We imitate this to be able to use "git diff" in the test scripts. (Actually, keeping in line with the rest of git, -1 is returned on error, which corresponds to an exit status 255). To find out if the diff is not empty, a member called "found_changes" was introduced in struct diff_options, which is set in builtin_diff() and fn_out_consume(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:34:54 +01:00			`/* this is set by diffcore for DIFF_FORMAT_PATCH */`
			`int found_changes;`

diff --follow: do call diffcore_std() as necessary Usually, diff frontends populate the output queue with filepairs without any rename information and call diffcore_std() to sort the renames out. When --follow is in effect, however, diff-tree family of frontend has a hack that looks like this: diff-tree frontend -> diff_tree_sha1() . populate diff_queued_diff . if --follow is in effect and there is only one change that creates the target path, then -> try_to_follow_renames() -> diff_tree_sha1() with no pathspec but with -C -> diffcore_std() to find renames . if rename is found, tweak diff_queued_diff and put a single filepair that records the found rename there -> diffcore_std() . tweak elements on diff_queued_diff by - rename detection - path ordering - pickaxe filtering We need to skip parts of the second call to diffcore_std() that is related to rename detection, and do so only when try_to_follow_renames() did find a rename. Earlier 1da6175 (Make diffcore_std only can run once before a diff_flush, 2010-05-06) tried to deal with this issue incorrectly; it unconditionally disabled any second call to diffcore_std(). This hopefully fixes the breakage. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-08-13 21:17:45 +02:00			`/* to support internal diff recursion by --follow hack*/`
			`int found_follow;`

diff_opt: track whether flags have been set explicitly The diff_opt infrastructure sets flags based on defaults and command line options. It is impossible to tell whether a flag has been set as a default or on explicit request. Update the structure so that this detection is possible: * Add an extra "opt->touched_flags" that keeps track of all the fields that have been touched by DIFF_OPT_SET and DIFF_OPT_CLR. * You may continue setting the default values to the flags, like commands in the "log" family do in cmd_log_init_defaults(), but after you finished setting the defaults, you clear the touched_flags field; * And then you let the usual callchain call diff_opt_parse(), allowing the opt->flags be set or unset, while keeping track of which bits the user touched; * There is an optional callback "opt->set_default" that is called at the very beginning to let you inspect touched_flags and update opt->flags appropriately, before the remainder of the diffcore machinery is set up, taking the opt->flags value into account. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-05-10 17:10:11 +02:00			`void (set_default)(struct diff_options );`

Write diff output to a file in struct diff_options Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-03-10 03:43:39 +01:00			`FILE *file;`
			`int close_file;`

diff.c: add --output-indicator-{new, old, context} This will prove useful in range-diff in a later patch as we will be able to differentiate between adding a new file (that line is starting with +++ and then the file name) and regular new lines. It could also be useful for experimentation in new patch formats, i.e. we could teach git to emit moved lines with lines other than +/-. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-08-17 22:43:52 +02:00			`#define OUTPUT_INDICATOR_NEW 0`
			`#define OUTPUT_INDICATOR_OLD 1`
			`#define OUTPUT_INDICATOR_CONTEXT 2`
			`char output_indicators[3];`

Convert struct diff_options to use struct pathspec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:38 +01:00			`struct pathspec pathspec;`
tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Previously diff_tree(), which is now named ll_diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch hurts performance badly for generating combined diffs - in "combine-diff: optimize combine_diff_path sets intersection" I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could skip recursing into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(T,P1...Pn) calculation scheme ------------------------------- D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn) (regarding resulting paths set) D(T,Pj) - diff between T..Pj D(T,P1...Pn) - combined diff from T to parents P1,...,Pn We start from all trees, which are sorted, and compare their entries in lock-step: T P1 Pn - - - \|t\| \|p1\| \|pn\| \|-\| \|--\| ... \|--\| imin = argmin(p1...pn) \| \| \| \| \| \| \|-\| \|--\| \|--\| \|.\| \|. \| \|. \| . . . . . . at any time there could be 3 cases: 1) t < p[imin]; 2) t > p[imin]; 3) t = p[imin]. Schematic deduction of what every case means, and what to do, follows: 1) t < p[imin] -> ∀j t ∉ Pj -> "+t" ∈ D(T,Pj) -> D += "+t"; t↓ 2) t > p[imin] 2.1) ∃j: pj > p[imin] -> "-p[imin]" ∉ D(T,Pj) -> D += ø; ∀ pi=p[imin] pi↓ 2.2) ∀i pi = p[imin] -> pi ∉ T -> "-pi" ∈ D(T,Pi) -> D += "-p[imin]"; ∀i pi↓ 3) t = p[imin] 3.1) ∃j: pj > p[imin] -> "+t" ∈ D(T,Pj) -> only pi=p[imin] remains to investigate 3.2) pi = p[imin] -> investigate δ(t,pi) \| \| v 3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø -> ⎧δ(t,pi) - if pi=p[imin] -> D += ⎨ ⎩"+t" - if pi>p[imin] in any case t↓ ∀ pi=p[imin] pi↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme ------------------------- A B - - \|a\| \|b\| a < b -> a ∉ B -> D(A,B) += +a a↓ \|-\| \|-\| a > b -> b ∉ A -> D(A,B) += -b b↓ \| \| \| \| a = b -> investigate δ(a,b) a↓ b↓ \|-\| \|-\| \|.\| \|.\| . . . . ~~~~~~~~ This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff D(A,B) is by definition the same as combined diff D(A,[B]), so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff. What we do is as follows: 1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it. 2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. (see emit_diff() which is now named emit_diff_first_parent_only()) 3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES)) to work, that exporting have to be happening not in bulk, but incrementally, one diff path at a time. For such consumers, there is a new callback in diff_options introduced: ->pathchange(opt, struct combine_diff_path ) which, if set to !NULL, is called for every generated path. (see new compat ll_diff_tree_sha1() wrapper around new paths generator for setup) 4) The paths generation itself, is reworked from previous ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation scheme" provided above: On the start we allocate [nparent] arrays in place what was earlier just for one parent tree. then we just generalize loops, and comparison according to the algorithm. Some notes(): 1) alloca(), for small arrays, is used for "runs not slower for nparent=1 case than before" goal - if we change it to xmalloc()/free() the timings get ~1% worse. For alloca() we use just-introduced xalloca/xalloca_free compatibility wrappers, so it should not be a portability problem. 2) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance. 3) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We use efficient hand-made realloc-style path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown. 4) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure. Also - we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it. The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources. The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal. ~~~~~~~~ Timings for # without -c, i.e. testing only nparent=1 case `git log --raw --no-abbrev --no-renames` before and after the patch are as follows: navy.git linux.git v3.10..v3.11 before 0.611s 1.889s after 0.619s 1.907s slowdown 1.3% 0.9% This timings show we did no harm to usual diff(tree1,tree2) generation. From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch ("tree-diff: reuse base str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case, net timings stays approximately the same. The output also stayed the same. (*) If we revert 1)-4) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule. For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there. P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with `git log --reverse --raw --no-abbrev --no-renames -c` to extract log of what was created/changed when, as a result building a map {} sha1 -> in which commit (and date) a content was added that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. some valid sha1 would be absent from it. That case was my initial motivation for combined diffs speedup. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-06 23:46:26 +02:00			`pathchange_fn_t pathchange;`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`change_fn_t change;`
			`add_remove_fn_t add_remove;`
revision: quit pruning diff more quickly when possible When the revision traversal machinery is given a pathspec, we must compute the parent-diff for each commit to determine which ones are TREESAME. We set the QUICK diff flag to avoid looking at more entries than we need; we really just care whether there are any changes at all. But there is one case where we want to know a bit more: if --remove-empty is set, we care about finding cases where the change consists only of added entries (in which case we may prune the parent in try_to_simplify_commit()). To cover that case, our file_add_remove() callback does not quit the diff upon seeing an added entry; it keeps looking for other types of entries. But this means when --remove-empty is not set (and it is not by default), we compute more of the diff than is necessary. You can see this in a pathological case where a commit adds a very large number of entries, and we limit based on a broad pathspec. E.g.: perl -e ' chomp(my $blob = `git hash-object -w --stdin </dev/null`); for my $a (1..1000) { for my $b (1..1000) { print "100644 $blob\t$a/$b\n"; } } ' \| git update-index --index-info git commit -qm add git rev-list HEAD -- . This case takes about 100ms now, but after this patch only needs 6ms. That's not a huge improvement, but it's easy to get and it protects us against even more pathological cases (e.g., going from 1 million to 10 million files would take ten times as long with the current code, but not increase at all after this patch). This is reported to minorly speed-up pathspec limiting in real world repositories (like the 100-million-file Windows repository), but probably won't make a noticeable difference outside of pathological setups. This patch actually covers the case without --remove-empty, and the case where we see only deletions. See the in-code comment for details. Note that we have to add a new member to the diff_options struct so that our callback can see the value of revs->remove_empty_trees. This callback parameter could be passed to the "add_remove" and "change" callbacks, but there's not much point. They already receive the diff_options struct, and doing it this way avoids having to update the function signature of the other callbacks (arguably the format_callback and output_prefix functions could benefit from the same simplification). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-10-13 17:27:45 +02:00			`void *change_fn_data;`
diff: support custom callbacks for output Users can request the DIFF_FORMAT_CALLBACK output format to get a callback consisting of the whole diff_queue_struct. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-07 08:35:42 +02:00			`diff_format_fn_t format_callback;`
			`void *format_callback_data;`
Add a prefix output callback to diff output The callback can be used to add some prefix string to each line of diff output. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-26 09:08:02 +02:00			`diff_prefix_fn_t output_prefix;`
			`void *output_prefix_data;`
difftool: display the number of files in the diff queue in the prompt When --prompt option is set, git-difftool displays a prompt for each modified file to be viewed in an external diff program. At that point, it could be useful to display a counter and the total number of files in the diff queue. Below is the current difftool prompt for the first of 5 modified files: Viewing: 'diff.c' Launch 'vimdiff' [Y/n]: Consider the modified prompt: Viewing (1/5): 'diff.c' Launch 'vimdiff' [Y/n]: The current GIT_EXTERNAL_DIFF mechanism does not tell the number of paths in the diff queue nor the current counter. To make this "counter/total" info available for GIT_EXTERNAL_DIFF programs without breaking existing ones by doing the following: - Keep track of the number of paths shown so far in diff_options; - Export two new environment variables from run_external_diff() to show the total number of paths (from diff_queue_struct) and the current value of the counter (from diff_options); and - Update git-difftool--helper to use these two environment variables. Signed-off-by: Zoltan Klinger <zoltan.klinger@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-12-06 00:38:46 +01:00
			`int diff_path_counter;`
diff.c: buffer all output if asked to Introduce a new option 'emitted_symbols' in the struct diff_options which controls whether all output is buffered up until all output is available. It is set internally in diff.c when necessary. We'll have a new struct 'emitted_string' in diff.c which will be used to buffer each line. The emitted_string will duplicate the memory of the line to buffer as that is easiest to reason about for now. In a future patch we may want to decrease the memory usage by not duplicating all output for buffering but rather we may want to store offsets into the file or in case of hunk descriptions such as the similarity score, we could just store the relevant number and reproduce the text later on. This approach was chosen as a first step because it is quite simple compared to the alternative with less memory footprint. emit_diff_symbol factors out the emission part and depending on the diff_options->emitted_symbols the emission will be performed directly when calling emit_diff_symbol or after the whole process is done, i.e. by buffering we have add the possibility for a second pass over the whole output before doing the actual output. In 6440d34 (2012-03-14, diff: tweak a _copy_ of diff_options with word-diff) we introduced a duplicate diff options struct for word emissions as we may have different regex settings in there. When buffering the output, we need to operate on just one buffer, so we have to copy back the emissions of the word buffer into the main buffer. Unconditionally enable output via buffer in this patch as it yields a great opportunity for testing, i.e. all the diff tests from the test suite pass without having reordering issues (i.e. only parts of the output got buffered, and we forgot to buffer other parts). The test suite passes, which gives confidence that we converted all functions to use emit_string for output. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-30 02:07:06 +02:00
			`struct emitted_diff_symbols *emitted_symbols;`
diff.c: color moved lines differently When a patch consists mostly of moving blocks of code around, it can be quite tedious to ensure that the blocks are moved verbatim, and not undesirably modified in the move. To that end, color blocks that are moved within the same patch differently. For example (OM, del, add, and NM are different colors): [OM] -void sensitive_stuff(void) [OM] -{ [OM] - if (!is_authorized_user()) [OM] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OM] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NM] + sensitive_stuff(spanning, [NM] + multiple, [NM] + lines); [NM] +} However adjacent blocks may be problematic. For example, in this potentially malicious patch, the swapping of blocks can be spotted: [OM] -void sensitive_stuff(void) [OM] -{ [OMA] - if (!is_authorized_user()) [OMA] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OMA] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NMA] + sensitive_stuff(spanning, [NMA] + multiple, [NMA] + lines); [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NMA] +} If the moved code is larger, it is easier to hide some permutation in the code, which is why some alternative coloring is needed. This patch implements the first mode: * basic alternating 'Zebra' mode This conveys all information needed to the user. Defer customization to later patches. First I implemented an alternative design, which would try to fingerprint a line by its neighbors to detect if we are in a block or at the boundary. This idea iss error prone as it inspected each line and its neighboring lines to determine if the line was (a) moved and (b) if was deep inside a hunk by having matching neighboring lines. This is unreliable as the we can construct hunks which have equal neighbors that just exceed the number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter as a line, that is permutated to AXYZCXYZBXYZD..'). Instead this provides a dynamic programming greedy algorithm that finds the largest moved hunk and then has several modes on highlighting bounds. A note on the options '--submodule=diff' and '--color-words/--word-diff': In the conversion to use emit_line in the prior patches both submodules as well as word diff output carefully chose to call emit_line with sign=0. All output with sign=0 is ignored for move detection purposes in this patch, such that no weird looking output will be generated for these cases. This leads to another thought: We could pass on '--color-moved' to submodules such that they color up moved lines for themselves. If we'd do so only line moves within a repository boundary are marked up. It is useful to have moved lines colored, but there are annoying corner cases, such as a single line moved, that is very common. For example in a typical patch of C code, we have closing braces that end statement blocks or functions. While it is technically true that these lines are moved as they show up elsewhere, it is harmful for the review as the reviewers attention is drawn to such a minor side annoyance. For now let's have a simple solution of hardcoding the number of moved lines to be at least 3 before coloring them. Note, that the length is applied across all blocks to find the 'lonely' blocks that pollute new code, but do not interfere with a permutated block where each permutation has less lines than 3. Helped-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-30 22:53:07 +02:00			`enum {`
			`COLOR_MOVED_NO = 0,`
diff.c: color moved lines differently, plain mode Add the 'plain' mode for move detection of code. This omits the checking for adjacent blocks, so it is not as useful. If you have a lot of the same blocks moved in the same patch, the 'Zebra' would end up slow as it is O(n^2) (n is number of same blocks). So this may be useful there and is generally easy to add. Instead be very literal at the move detection, do not skip over short blocks here. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-30 22:53:08 +02:00			`COLOR_MOVED_PLAIN = 1,`
diff.c: add a blocks mode for moved code detection The new "blocks" mode provides a middle ground between plain and zebra. It is as intuitive (few colors) as plain, but still has the requirement for a minimum of lines/characters to count a block as moved. Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> (https://public-inbox.org/git/87o9j0uljo.fsf@evledraar.gmail.com/) Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-07-17 01:05:39 +02:00			`COLOR_MOVED_BLOCKS = 2,`
			`COLOR_MOVED_ZEBRA = 3,`
			`COLOR_MOVED_ZEBRA_DIM = 4,`
diff.c: color moved lines differently When a patch consists mostly of moving blocks of code around, it can be quite tedious to ensure that the blocks are moved verbatim, and not undesirably modified in the move. To that end, color blocks that are moved within the same patch differently. For example (OM, del, add, and NM are different colors): [OM] -void sensitive_stuff(void) [OM] -{ [OM] - if (!is_authorized_user()) [OM] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OM] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NM] + sensitive_stuff(spanning, [NM] + multiple, [NM] + lines); [NM] +} However adjacent blocks may be problematic. For example, in this potentially malicious patch, the swapping of blocks can be spotted: [OM] -void sensitive_stuff(void) [OM] -{ [OMA] - if (!is_authorized_user()) [OMA] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OMA] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NMA] + sensitive_stuff(spanning, [NMA] + multiple, [NMA] + lines); [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NMA] +} If the moved code is larger, it is easier to hide some permutation in the code, which is why some alternative coloring is needed. This patch implements the first mode: * basic alternating 'Zebra' mode This conveys all information needed to the user. Defer customization to later patches. First I implemented an alternative design, which would try to fingerprint a line by its neighbors to detect if we are in a block or at the boundary. This idea iss error prone as it inspected each line and its neighboring lines to determine if the line was (a) moved and (b) if was deep inside a hunk by having matching neighboring lines. This is unreliable as the we can construct hunks which have equal neighbors that just exceed the number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter as a line, that is permutated to AXYZCXYZBXYZD..'). Instead this provides a dynamic programming greedy algorithm that finds the largest moved hunk and then has several modes on highlighting bounds. A note on the options '--submodule=diff' and '--color-words/--word-diff': In the conversion to use emit_line in the prior patches both submodules as well as word diff output carefully chose to call emit_line with sign=0. All output with sign=0 is ignored for move detection purposes in this patch, such that no weird looking output will be generated for these cases. This leads to another thought: We could pass on '--color-moved' to submodules such that they color up moved lines for themselves. If we'd do so only line moves within a repository boundary are marked up. It is useful to have moved lines colored, but there are annoying corner cases, such as a single line moved, that is very common. For example in a typical patch of C code, we have closing braces that end statement blocks or functions. While it is technically true that these lines are moved as they show up elsewhere, it is harmful for the review as the reviewers attention is drawn to such a minor side annoyance. For now let's have a simple solution of hardcoding the number of moved lines to be at least 3 before coloring them. Note, that the length is applied across all blocks to find the 'lonely' blocks that pollute new code, but do not interfere with a permutated block where each permutation has less lines than 3. Helped-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-30 22:53:07 +02:00			`} color_moved;`
			`#define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA`
diff: define block by number of alphanumeric chars The existing behavior of diff --color-moved=zebra does not define the minimum size of a block at all, instead relying on a heuristic applied later to filter out sets of adjacent moved lines that are shorter than 3 lines long. This can be confusing, because a block could thus be colored as moved at the source but not at the destination (or vice versa), depending on its neighbors. Instead, teach diff that the minimum size of a block is 20 alphanumeric characters, the same heuristic used by "git blame". This allows diff to still exclude uninteresting lines appearing on their own (such as those solely consisting of one or a few closing braces), as was the intention of the adjacent-moved-line heuristic. This requires a change in some tests in that some of their lines are no longer considered to be part of a block, because they are too short. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-08-16 03:27:39 +02:00			`#define COLOR_MOVED_MIN_ALNUM_COUNT 20`
diff.c: add white space mode to move detection that allows indent changes The option of --color-moved has proven to be useful as observed on the mailing list. However when refactoring sometimes the indentation changes, for example when partitioning a functions into smaller helper functions the code usually mostly moved around except for a decrease in indentation. To just review the moved code ignoring the change in indentation, a mode to ignore spaces in the move detection as implemented in a previous patch would be enough. However the whole move coloring as motivated in commit 2e2d5ac (diff.c: color moved lines differently, 2017-06-30), brought up the notion of the reviewer being able to trust the move of a "block". As there are languages such as python, which depend on proper relative indentation for the control flow of the program, ignoring any white space change in a block would not uphold the promises of 2e2d5ac that allows reviewers to pay less attention to the inside of a block, as inside the reviewer wants to assume the same program flow. This new mode of white space ignorance will take this into account and will only allow the same white space changes per line in each block. This patch even allows only for the same change at the beginning of the lines. As this is a white space mode, it is made exclusive to other white space modes in the move detection. This patch brings some challenges, related to the detection of blocks. We need a wide net to catch the possible moved lines, but then need to narrow down to check if the blocks are still intact. Consider this example (ignoring block sizes): - A - B - C + A + B + C At the beginning of a block when checking if there is a counterpart for A, we have to ignore all space changes. However at the following lines we have to check if the indent change stayed the same. Checking if the indentation change did stay the same, is done by computing the indentation change by the difference in line length, and then assume the change is only in the beginning of the longer line, the common tail is the same. That is why the test contains lines like: - <TAB> A ... + A <TAB> ... As the first line starting a block is caught using a compare function that ignores white spaces unlike the rest of the block, where the white space delta is taken into account for the comparison, we also have to think about the following situation: - A - B - A - B + A + B + A + B When checking if the first A (both in the + and - lines) is a start of a block, we have to check all 'A' and record all the white space deltas such that we can find the example above to be just one block that is indented. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-07-18 21:31:55 +02:00
			`/* XDF_WHITESPACE_FLAGS regarding block detection are set at 2, 3, 4 */`
			`#define COLOR_MOVED_WS_ALLOW_INDENTATION_CHANGE (1<<5)`
diff: align move detection error handling with other options This changes the error handling for the options --color-moved-ws and --color-moved-ws to be like the rest of the options. Move the die() call out of parse_color_moved_ws into the parsing of command line options. As the function returns a bit field, change its signature to return an unsigned instead of an int; add a new bit to signal errors. Once the error is signaled, we discard the other bits, such that it doesn't matter if the error bit overlaps with any other bit. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-11-13 22:33:57 +01:00			`#define COLOR_MOVED_WS_ERROR (1<<0)`
			`unsigned color_moved_ws_handling;`
diff.c: reduce implicit dependency on the_index diff and textconv code has so widespread use that it's hard to simply update their api and all call sites at once because it would result in a big patch. For now reduce the_index references to two places: diff_setup() and fill_textconv(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-09-21 17:57:19 +02:00
			`struct repository *repo;`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`};`

submodule.c: migrate diff output to use emit_diff_symbol As the submodule process is no longer attached to the same file pointer 'o->file' as the superprojects process, there is a different result in color.c::check_auto_color. That is why we need to pass coloring explicitly, such that the submodule coloring decision will be made by the child process processing the submodule. Only DIFF_SYMBOL_SUBMODULE_PIPETHROUGH contains color, the other symbols are for embedding the submodule output into the superprojects output. Remove the colors from the function signatures, as all the coloring decisions will be made either inside the child process or the final emit_diff_symbol, but not in the functions driving the submodule diff. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-30 02:07:00 +02:00			`void diff_emit_submodule_del(struct diff_options o, const char line);`
			`void diff_emit_submodule_add(struct diff_options o, const char line);`
			`void diff_emit_submodule_untracked(struct diff_options o, const char path);`
			`void diff_emit_submodule_modified(struct diff_options o, const char path);`
			`void diff_emit_submodule_header(struct diff_options o, const char header);`
			`void diff_emit_submodule_error(struct diff_options o, const char err);`
			`void diff_emit_submodule_pipethrough(struct diff_options *o,`
			`const char *line, int len);`

Colorize 'commit' lines in log ui When paging through the output of git-whatchanged, the color cues help to visually navigate within a diff. However, it is difficult to notice when a new commit starts, because the commit and log are shown in the "normal" color. This patch colorizes the 'commit' line, customizable through diff.colors.commit and defaulting to yellow. As a side effect, some of the diff color engine (slot enum, get_color) has become accessible outside of diff.c. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-23 11:24:18 +02:00			`enum color_diff {`
			`DIFF_RESET = 0,`
diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXT The latter is a much more descriptive name (and we support "color.diff.context" now). This also updates the name of any local variables which were used to store the color. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-27 22:48:46 +02:00			`DIFF_CONTEXT = 1,`
Colorize 'commit' lines in log ui When paging through the output of git-whatchanged, the color cues help to visually navigate within a diff. However, it is difficult to notice when a new commit starts, because the commit and log are shown in the "normal" color. This patch colorizes the 'commit' line, customizable through diff.colors.commit and defaulting to yellow. As a side effect, some of the diff color engine (slot enum, get_color) has become accessible outside of diff.c. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-23 11:24:18 +02:00			`DIFF_METAINFO = 2,`
			`DIFF_FRAGINFO = 3,`
			`DIFF_FILE_OLD = 4,`
			`DIFF_FILE_NEW = 5,`
			`DIFF_COMMIT = 6,`
diff.c: second war on whitespace. This adds DIFF_WHITESPACE color class (default = reverse red) to colored diff output to let you catch common whitespace errors. - trailing whitespaces at the end of line - a space followed by a tab in the indent Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-23 07:48:39 +02:00			`DIFF_WHITESPACE = 7,`
diff.c: color moved lines differently When a patch consists mostly of moving blocks of code around, it can be quite tedious to ensure that the blocks are moved verbatim, and not undesirably modified in the move. To that end, color blocks that are moved within the same patch differently. For example (OM, del, add, and NM are different colors): [OM] -void sensitive_stuff(void) [OM] -{ [OM] - if (!is_authorized_user()) [OM] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OM] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NM] + sensitive_stuff(spanning, [NM] + multiple, [NM] + lines); [NM] +} However adjacent blocks may be problematic. For example, in this potentially malicious patch, the swapping of blocks can be spotted: [OM] -void sensitive_stuff(void) [OM] -{ [OMA] - if (!is_authorized_user()) [OMA] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OMA] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NMA] + sensitive_stuff(spanning, [NMA] + multiple, [NMA] + lines); [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NMA] +} If the moved code is larger, it is easier to hide some permutation in the code, which is why some alternative coloring is needed. This patch implements the first mode: * basic alternating 'Zebra' mode This conveys all information needed to the user. Defer customization to later patches. First I implemented an alternative design, which would try to fingerprint a line by its neighbors to detect if we are in a block or at the boundary. This idea iss error prone as it inspected each line and its neighboring lines to determine if the line was (a) moved and (b) if was deep inside a hunk by having matching neighboring lines. This is unreliable as the we can construct hunks which have equal neighbors that just exceed the number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter as a line, that is permutated to AXYZCXYZBXYZD..'). Instead this provides a dynamic programming greedy algorithm that finds the largest moved hunk and then has several modes on highlighting bounds. A note on the options '--submodule=diff' and '--color-words/--word-diff': In the conversion to use emit_line in the prior patches both submodules as well as word diff output carefully chose to call emit_line with sign=0. All output with sign=0 is ignored for move detection purposes in this patch, such that no weird looking output will be generated for these cases. This leads to another thought: We could pass on '--color-moved' to submodules such that they color up moved lines for themselves. If we'd do so only line moves within a repository boundary are marked up. It is useful to have moved lines colored, but there are annoying corner cases, such as a single line moved, that is very common. For example in a typical patch of C code, we have closing braces that end statement blocks or functions. While it is technically true that these lines are moved as they show up elsewhere, it is harmful for the review as the reviewers attention is drawn to such a minor side annoyance. For now let's have a simple solution of hardcoding the number of moved lines to be at least 3 before coloring them. Note, that the length is applied across all blocks to find the 'lonely' blocks that pollute new code, but do not interfere with a permutated block where each permutation has less lines than 3. Helped-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-30 22:53:07 +02:00			`DIFF_FUNCINFO = 8,`
			`DIFF_FILE_OLD_MOVED = 9,`
			`DIFF_FILE_OLD_MOVED_ALT = 10,`
diff.c: add dimming to moved line detection Any lines inside a moved block of code are not interesting. Boundaries of blocks are only interesting if they are next to another block of moved code. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-06-30 22:53:09 +02:00			`DIFF_FILE_OLD_MOVED_DIM = 11,`
			`DIFF_FILE_OLD_MOVED_ALT_DIM = 12,`
			`DIFF_FILE_NEW_MOVED = 13,`
			`DIFF_FILE_NEW_MOVED_ALT = 14,`
			`DIFF_FILE_NEW_MOVED_DIM = 15,`
range-diff: use dim/bold cues to improve dual color mode It is a confusing thing to look at a diff of diffs. All too easy is it to mix up whether the -/+ markers refer to the "inner" or the "outer" diff, i.e. whether a `+` indicates that a line was added by either the old or the new diff (or both), or whether the new diff does something different than the old diff. To make things easier to process for normal developers, we introduced the dual color mode which colors the lines according to the commit diff, i.e. lines that are added by a commit (whether old, new, or both) are colored in green. In non-dual color mode, the lines would be colored according to the outer diff: if the old commit added a line, it would be colored red (because that line addition is only present in the first commit range that was specified on the command-line, i.e. the "old" commit, but not in the second commit range, i.e. the "new" commit). However, this dual color mode is still not making things clear enough, as we are looking at two levels of diffs, and we still only pick a color according to one of them (the outer diff marker is colored differently, of course, but in particular with deep indentation, it is easy to lose track of that outer diff marker's background color). Therefore, let's add another dimension to the mix. Still use green/red/normal according to the commit diffs, but now also dim the lines that were only in the old commit, and use bold face for the lines that are only in the new commit. That way, it is much easier not to lose track of, say, when we are looking at a line that was added in the previous iteration of a patch series but the new iteration adds a slightly different version: the obsolete change will be dimmed, the current version of the patch will be bold. At least this developer has a much easier time reading the range-diffs that way. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-08-13 13:33:32 +02:00			`DIFF_FILE_NEW_MOVED_ALT_DIM = 16,`
			`DIFF_CONTEXT_DIM = 17,`
			`DIFF_FILE_OLD_DIM = 18,`
			`DIFF_FILE_NEW_DIM = 19,`
			`DIFF_CONTEXT_BOLD = 20,`
			`DIFF_FILE_OLD_BOLD = 21,`
			`DIFF_FILE_NEW_BOLD = 22,`
Colorize 'commit' lines in log ui When paging through the output of git-whatchanged, the color cues help to visually navigate within a diff. However, it is difficult to notice when a new commit starts, because the commit and log are shown in the "normal" color. This patch colorizes the 'commit' line, customizable through diff.colors.commit and defaulting to yellow. As a side effect, some of the diff color engine (slot enum, get_color) has become accessible outside of diff.c. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-23 11:24:18 +02:00			`};`
			`const char *diff_get_color(int diff_use_color, enum color_diff ix);`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`#define diff_get_color_opt(o, ix) \`
diff: refactor COLOR_DIFF from a flag into an int This lets us store more than just a bit flag for whether we want color; we can also store whether we want automatic colors. This can be useful for making the automatic-color decision closer to the point of use. This mostly just involves replacing DIFF_OPT_* calls with manipulations of the flag. The biggest exception is that calls to DIFF_OPT_TST must check for "o->use_color > 0", which lets an "unknown" value (i.e., the default) stay at "no color". In the previous code, a value of "-1" was not propagated at all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-08-18 07:03:12 +02:00			`diff_get_color((o)->use_color, ix)`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00
Colorize 'commit' lines in log ui When paging through the output of git-whatchanged, the color cues help to visually navigate within a diff. However, it is difficult to notice when a new commit starts, because the commit and log are shown in the "normal" color. This patch colorizes the 'commit' line, customizable through diff.colors.commit and defaulting to yellow. As a side effect, some of the diff color engine (slot enum, get_color) has become accessible outside of diff.c. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-23 11:24:18 +02:00
diff: add diff_line_prefix function This is a helper function to call the diff output_prefix function and return its value as a C string, allowing us to greatly simplify everywhere that needs to get the output prefix. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-02-07 21:15:26 +01:00			`const char diff_line_prefix(struct diff_options );`


fmt-patch: Support --attach This patch touches a couple of files, because it adds options to print a custom text just after the subject of a commit, and just after the diffstat. [jc: made "many dashes" used as the boundary leader into a single variable, to reduce the possibility of later tweaks to miscount the number of dashes to break it.] Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 15:40:29 +02:00			`extern const char mime_boundary_leader[];`

diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`struct combine_diff_path *diff_tree_paths(`
tree-diff: convert diff_tree_paths to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-30 19:31:06 +02:00			`struct combine_diff_path p, const struct object_id oid,`
			`const struct object_id **parents_oid, int nparent,`
tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Previously diff_tree(), which is now named ll_diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch hurts performance badly for generating combined diffs - in "combine-diff: optimize combine_diff_path sets intersection" I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could skip recursing into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(T,P1...Pn) calculation scheme ------------------------------- D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn) (regarding resulting paths set) D(T,Pj) - diff between T..Pj D(T,P1...Pn) - combined diff from T to parents P1,...,Pn We start from all trees, which are sorted, and compare their entries in lock-step: T P1 Pn - - - \|t\| \|p1\| \|pn\| \|-\| \|--\| ... \|--\| imin = argmin(p1...pn) \| \| \| \| \| \| \|-\| \|--\| \|--\| \|.\| \|. \| \|. \| . . . . . . at any time there could be 3 cases: 1) t < p[imin]; 2) t > p[imin]; 3) t = p[imin]. Schematic deduction of what every case means, and what to do, follows: 1) t < p[imin] -> ∀j t ∉ Pj -> "+t" ∈ D(T,Pj) -> D += "+t"; t↓ 2) t > p[imin] 2.1) ∃j: pj > p[imin] -> "-p[imin]" ∉ D(T,Pj) -> D += ø; ∀ pi=p[imin] pi↓ 2.2) ∀i pi = p[imin] -> pi ∉ T -> "-pi" ∈ D(T,Pi) -> D += "-p[imin]"; ∀i pi↓ 3) t = p[imin] 3.1) ∃j: pj > p[imin] -> "+t" ∈ D(T,Pj) -> only pi=p[imin] remains to investigate 3.2) pi = p[imin] -> investigate δ(t,pi) \| \| v 3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø -> ⎧δ(t,pi) - if pi=p[imin] -> D += ⎨ ⎩"+t" - if pi>p[imin] in any case t↓ ∀ pi=p[imin] pi↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme ------------------------- A B - - \|a\| \|b\| a < b -> a ∉ B -> D(A,B) += +a a↓ \|-\| \|-\| a > b -> b ∉ A -> D(A,B) += -b b↓ \| \| \| \| a = b -> investigate δ(a,b) a↓ b↓ \|-\| \|-\| \|.\| \|.\| . . . . ~~~~~~~~ This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff D(A,B) is by definition the same as combined diff D(A,[B]), so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff. What we do is as follows: 1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it. 2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. (see emit_diff() which is now named emit_diff_first_parent_only()) 3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES)) to work, that exporting have to be happening not in bulk, but incrementally, one diff path at a time. For such consumers, there is a new callback in diff_options introduced: ->pathchange(opt, struct combine_diff_path ) which, if set to !NULL, is called for every generated path. (see new compat ll_diff_tree_sha1() wrapper around new paths generator for setup) 4) The paths generation itself, is reworked from previous ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation scheme" provided above: On the start we allocate [nparent] arrays in place what was earlier just for one parent tree. then we just generalize loops, and comparison according to the algorithm. Some notes(): 1) alloca(), for small arrays, is used for "runs not slower for nparent=1 case than before" goal - if we change it to xmalloc()/free() the timings get ~1% worse. For alloca() we use just-introduced xalloca/xalloca_free compatibility wrappers, so it should not be a portability problem. 2) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance. 3) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We use efficient hand-made realloc-style path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown. 4) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure. Also - we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it. The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources. The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal. ~~~~~~~~ Timings for # without -c, i.e. testing only nparent=1 case `git log --raw --no-abbrev --no-renames` before and after the patch are as follows: navy.git linux.git v3.10..v3.11 before 0.611s 1.889s after 0.619s 1.907s slowdown 1.3% 0.9% This timings show we did no harm to usual diff(tree1,tree2) generation. From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch ("tree-diff: reuse base str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case, net timings stays approximately the same. The output also stayed the same. (*) If we revert 1)-4) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule. For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there. P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with `git log --reverse --raw --no-abbrev --no-renames -c` to extract log of what was created/changed when, as a result building a map {} sha1 -> in which commit (and date) a content was added that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. some valid sha1 would be absent from it. That case was my initial motivation for combined diffs speedup. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-06 23:46:26 +02:00			`struct strbuf base, struct diff_options opt);`
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`int diff_tree_oid(const struct object_id *old_oid,`
			`const struct object_id *new_oid,`
			`const char base, struct diff_options opt);`
			`int diff_root_tree_oid(const struct object_id new_oid, const char base,`
			`struct diff_options *opt);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
diff-files: -c and --cc options. This ports the "combined diff" to diff-files so that differences to the working tree files since stage 2 and stage 3 are shown the same way as combined diff output from diff-tree for the merge commit would be shown if the current working tree files are committed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-28 09:03:38 +01:00			`struct combine_diff_path {`
			`struct combine_diff_path *next;`
			`char *path;`
combine-diff: show mode changes as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-06 21:53:07 +01:00			`unsigned int mode;`
diff: convert struct combine_diff_path to object_id Also, convert a constant to GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-03-14 00:39:33 +01:00			`struct object_id oid;`
combine-diff: show mode changes as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-06 21:53:07 +01:00			`struct combine_diff_parent {`
combine-diff: Record diff status a bit more faithfully This shows "new file mode XXXX" and "deleted file mode XXXX" lines like two-way diff-patch output does, by checking the status from each parent. The diff-raw output for combined diff is made a bit uglier by showing diff status letters with each parent. While most of the case you would see "MM" in the output, an Evil Merge that touches a path that was added by inheriting from one parent is possible and it would be shown like these: $ git-diff-tree --abbrev -c HEAD 2d7ca89675eb8888b0b88a91102f096d4471f09f ::000000 000000 100644 0000000... 0000000... 31dd686... AA b ::000000 100644 100644 0000000... 6c884ae... c6d4fa8... AM d ::100644 100644 100644 4f7cbe7... f8c295c... 19d5d80... RR e Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-10 11:30:52 +01:00			`char status;`
combine-diff: show mode changes as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-06 21:53:07 +01:00			`unsigned int mode;`
diff: convert struct combine_diff_path to object_id Also, convert a constant to GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-03-14 00:39:33 +01:00			`struct object_id oid;`
combine-diff: show mode changes as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-06 21:53:07 +01:00			`} parent[FLEX_ARRAY];`
diff-files: -c and --cc options. This ports the "combined diff" to diff-files so that differences to the working tree files since stage 2 and stage 3 are shown the same way as combined diff output from diff-tree for the merge commit would be shown if the current working tree files are committed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-28 09:03:38 +01:00			`};`
combine-diff: show mode changes as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-06 21:53:07 +01:00			`#define combine_diff_path_size(n, l) \`
tree-diff: catch integer overflow in combine_diff_path allocation A combine_diff_path struct has two "flex" members allocated alongside the struct: a string to hold the pathname, and an array of parent pointers. We use an "int" to compute this, meaning we may easily overflow it if the pathname is extremely long. We can fix this by using size_t, and checking for overflow with the st_add helper. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-02-19 12:21:30 +01:00			`st_add4(sizeof(struct combine_diff_path), (l), 1, \`
			`st_mult(sizeof(struct combine_diff_parent), (n)))`
diff-files: -c and --cc options. This ports the "combined diff" to diff-files so that differences to the working tree files since stage 2 and stage 3 are shown the same way as combined diff output from diff-tree for the merge commit would be shown if the current working tree files are committed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-28 09:03:38 +01:00
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`void show_combined_diff(struct combine_diff_path *elem, int num_parent,`
			`int dense, struct rev_info *);`
diff-files: -c and --cc options. This ports the "combined diff" to diff-files so that differences to the working tree files since stage 2 and stage 3 are shown the same way as combined diff output from diff-tree for the merge commit would be shown if the current working tree files are committed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-28 09:03:38 +01:00
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`void diff_tree_combined(const struct object_id oid, const struct oid_array parents, int dense, struct rev_info *rev);`
built-in diff: assorted updates. "git diff(n)" without --base, --ours, etc. defaults to --cc, which usually is the same as -p unless you are in the middle of a conflicted merge, just like the shell script version. "git diff(n) blobA blobB path" complains and dies. "git diff(n) tree0 tree1 tree2...treeN" does combined diff that shows a merge of tree1..treeN to result in tree0. Giving "-c" option to any command that defaults to "--cc" turns off dense-combined flag. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-29 10:24:49 +02:00
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`void diff_tree_combined_merge(const struct commit commit, int dense, struct rev_info rev);`
diff-tree -c: show a merge commit a bit more sensibly. A new option '-c' to diff-tree changes the way a merge commit is displayed when generating a patch output. It shows a "combined diff" (hence the option letter 'c'), which looks like this: $ git-diff-tree --pretty -c -p fec9ebf1 \| head -n 18 diff-tree fec9ebf... (from parents) Merge: 0620db3... 8a263ae... Author: Junio C Hamano <junkio@cox.net> Date: Sun Jan 15 22:25:35 2006 -0800 Merge fixes up to GIT 1.1.3 diff --combined describe.c @@@ +98,7 @@@ return (a_date > b_date) ? -1 : (a_date == b_date) ? 0 : 1; } - static void describe(char arg) - static void describe(struct commit cmit, int last_one) ++ static void describe(char arg, int last_one) { + unsigned char sha1[20]; + struct commit cmit; There are a few things to note about this feature: - The '-c' option implies '-p'. It also implies '-m' halfway in the sense that "interesting" merges are shown, but not all merges. - When a blob matches one of the parents, we do not show a diff for that path at all. For a merge commit, this option shows paths with real file-level merge (aka "interesting things"). - As a concequence of the above, an "uninteresting" merge is not shown at all. You can use '-m' in addition to '-c' to show the commit log for such a merge, but there will be no combined diff output. - Unlike "gitk", the output is monochrome. A '-' character in the nth column means the line is from the nth parent and does not appear in the merge result (i.e. removed from that parent's version). A '+' character in the nth column means the line appears in the merge result, and the nth parent does not have that line (i.e. added by the merge itself or inherited from another parent). The above example output shows that the function signature was changed from either parents (hence two "-" lines and a "++" line), and "unsigned char sha1[20]", prefixed by a " +", was inherited from the first parent. The code as sent to the list was buggy in few corner cases, which I have fixed since then. It does not bother to keep track of and show the line numbers from parent commits, which it probably should. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-24 10:22:04 +01:00
diff: vary default prefix depending on what are compared With a new configuration "diff.mnemonicprefix", "git diff" shows the differences between various combinations of preimage and postimage trees with prefixes different from the standard "a/" and "b/". Hopefully this will make the distinction stand out for some people. "git diff" compares the (i)ndex and the (w)ork tree; "git diff HEAD" compares a (c)ommit and the (w)ork tree; "git diff --cached" compares a (c)ommit and the (i)ndex; "git-diff HEAD:file1 file2" compares an (o)bject and a (w)ork tree entity; "git diff --no-index a b" compares two non-git things (1) and (2). Because these mnemonics now have meanings, they are swapped when reverse diff is in effect and this feature is enabled. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-19 05:08:09 +02:00			`void diff_set_mnemonic_prefix(struct diff_options options, const char a, const char *b);`

diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`int diff_can_quit_early(struct diff_options *);`
diff: futureproof "stop feeding the backend early" logic Refactor the "do not stop feeding the backend early" logic into a small helper function and use it in both run_diff_files() and diff_tree() that has the stop-early optimization. We may later add other types of diffcore transformation that require to look at the whole result like diff-filter does, and having the logic in a single place is essential for longer term maintainability. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-31 18:14:17 +02:00
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`void diff_addremove(struct diff_options *,`
			`int addremove,`
			`unsigned mode,`
			`const struct object_id *oid,`
			`int oid_valid,`
			`const char *fullpath, unsigned dirty_submodule);`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`void diff_change(struct diff_options *,`
			`unsigned mode1, unsigned mode2,`
			`const struct object_id *old_oid,`
			`const struct object_id *new_oid,`
			`int old_oid_valid, int new_oid_valid,`
			`const char *fullpath,`
			`unsigned dirty_submodule1, unsigned dirty_submodule2);`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`struct diff_filepair diff_unmerge(struct diff_options , const char *path);`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00
[PATCH] Clean up diff_setup() to make it more extensible. This changes the argument of diff_setup() from an integer that says if we are feeding reversed diff to a bitmask, so that later global options can be added more easily. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:54:37 +02:00			`#define DIFF_SETUP_REVERSE 1`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`#define DIFF_SETUP_USE_SIZE_CACHE 4`
[PATCH] diff: Fix docs and add -O to diff-helper. This patch updates diff documentation and usage strings: - clarify the semantics of -R. It is not "output in reverse"; rather, it is "I will feed diff backwards". Semantically they are different when -C is involved. - describe -O in usage strings of diff-* brothers. It was implemented, documented but not described in usage text. Also it adds -O to diff-helper. Like -S (and unlike -M/-C/-B), this option can work on sanitized diff-raw output produced by the diff-* brothers. While we are at it, the call it makes to diffcore is cleaned up to use the diffcore_std() like everybody else, and the declaration for the low level diffcore routines are moved from diff.h (public) to diffcore.h (private between diff.c and diffcore backends). Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:36:43 +02:00
diff: parse separate options like -S foo Change the option parsing logic in revision.c to accept separate forms like `-S foo' in addition to `-Sfoo'. The rest of git already accepted this form, but revision.c still used its own option parsing. Short options affected are -S<string>, -l<num> and -O<orderfile>, for which an empty string wouldn't make sense, hence -<option> <arg> isn't ambiguous. This patch does not handle --stat-name-width and --stat-width, which are special-cases where diff_long_opt do not apply. They are handled in a separate patch to ease review. Original patch by Matthieu Moy, plus refactoring by Jonathan Nieder. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-08-05 10:22:52 +02:00			`/*`
Use the word 'stuck' instead of 'sticked' The past participle of 'stick' is 'stuck'. Signed-off-by: Nicolas Vigier <boklm@mars-attacks.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-10-31 12:08:28 +01:00			`* Poor man's alternative to parse-option, to allow both stuck form`
diff: parse separate options like -S foo Change the option parsing logic in revision.c to accept separate forms like `-S foo' in addition to `-Sfoo'. The rest of git already accepted this form, but revision.c still used its own option parsing. Short options affected are -S<string>, -l<num> and -O<orderfile>, for which an empty string wouldn't make sense, hence -<option> <arg> isn't ambiguous. This patch does not handle --stat-name-width and --stat-width, which are special-cases where diff_long_opt do not apply. They are handled in a separate patch to ease review. Original patch by Matthieu Moy, plus refactoring by Jonathan Nieder. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-08-05 10:22:52 +02:00			`* (--option=value) and separate form (--option value).`
			`*/`
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`int parse_long_opt(const char opt, const char *argv,`
			`const char **optarg);`

			`int git_diff_basic_config(const char var, const char value, void *cb);`
			`int git_diff_heuristic_config(const char var, const char value, void *cb);`
			`void init_diff_ui_defaults(void);`
			`int git_diff_ui_config(const char var, const char value, void *cb);`
diff.c: remove implicit dependency on the_index A new variant repo_diff_setup() is added that takes 'struct repository *' and diff_setup() becomes a thin macro around it that is protected by NO_THE_REPOSITORY_COMPATIBILITY_MACROS, similar to NO_THE_INDEX_.... The plan is these macros will always be defined for all library files and the macros are only accessible in builtin/ Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-09-21 17:57:24 +02:00			`#ifndef NO_THE_REPOSITORY_COMPATIBILITY_MACROS`
			`#define diff_setup(diffopts) repo_diff_setup(the_repository, diffopts)`
			`#endif`
			`void repo_diff_setup(struct repository , struct diff_options );`
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`int diff_opt_parse(struct diff_options , const char , int, const char );`
			`void diff_setup_done(struct diff_options *);`
			`int git_config_rename(const char var, const char value);`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00
[PATCH] Diffcore updates. This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 19:04:37 +02:00			`#define DIFF_DETECT_RENAME 1`
			`#define DIFF_DETECT_COPY 2`

[PATCH] Add --pickaxe-all to diff-* brothers. When --pickaxe-all is given in addition to -S, pickaxe shows the entire diffs contained in the changeset, not just the diffs for the filepair that touched the sought-after string. This is useful to see the changes in context. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:55:28 +02:00			`#define DIFF_PICKAXE_ALL 1`
Support for pickaxe matching regular expressions git-diff-* --pickaxe-regex will change the -S pickaxe to match POSIX extended regular expressions instead of fixed strings. The regex.h library is a rather stupid interface and I like pcre too, but with any luck it will be everywhere we will want to run Git on, it being POSIX.2 and all. I'm not sure if we can expect platforms like AIX to conform to POSIX.2 or if win32 has regex.h. We might add a flag to Makefile if there is a portability trouble potential. Signed-off-by: Petr Baudis <pasky@suse.cz> 2006-03-29 02:16:33 +02:00			`#define DIFF_PICKAXE_REGEX 2`
[PATCH] Add -B flag to diff-* brothers. A new diffcore transformation, diffcore-break.c, is introduced. When the -B flag is given, a patch that represents a complete rewrite is broken into a deletion followed by a creation. This makes it easier to review such a complete rewrite patch. The -B flag takes the same syntax as the -M and -C flags to specify the minimum amount of non-source material the resulting file needs to have to be considered a complete rewrite, and defaults to 99% if not specified. As the new test t4008-diff-break-rewrite.sh demonstrates, if a file is a complete rewrite, it is broken into a delete/create pair, which can further be subjected to the usual rename detection if -M or -C is used. For example, if file0 gets completely rewritten to make it as if it were rather based on file1 which itself disappeared, the following happens: The original change looks like this: file0 --> file0' (quite different from file0) file1 --> /dev/null After diffcore-break runs, it would become this: file0 --> /dev/null /dev/null --> file0' file1 --> /dev/null Then diffcore-rename matches them up: file1 --> file0' The internal score values are finer grained now. Earlier maximum of 10000 has been raised to 60000; there is no user visible changes but there is no reason to waste available bits. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-30 09:08:37 +02:00
git log/diff: add -G<regexp> that greps in the patch text Teach "-G<regexp>" that is similar to "-S<regexp> --pickaxe-regexp" to the "git diff" family of commands. This limits the diff queue to filepairs whose patch text actually has an added or a deleted line that matches the given regexp. Unlike "-S<regexp>", changing other parts of the line that has a substring that matches the given regexp IS counted as a change, as such a change would appear as one deletion followed by one addition in a patch text. Unlike -S (pickaxe) that is intended to be used to quickly detect a commit that changes the number of occurrences of hits between the preimage and the postimage to serve as a part of larger toolchain, this is meant to be used as the top-level Porcelain feature. The implementation unfortunately has to run "diff" twice if you are running "log" family of commands to produce patches in the final output (e.g. "git log -p" or "git format-patch"). I think we _could_ cache the result in-core if we wanted to, but that would require larger surgery to the diffcore machinery (i.e. adding an extra pointer in the filepair structure to keep a pointer to a strbuf around, stuff the textual diff to the strbuf inside diffgrep_consume(), and make use of it in later stages when it is available) and it may not be worth it. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-08-23 19:17:03 +02:00			`#define DIFF_PICKAXE_KIND_S 4 /* traditional plumbing counter */`
			`#define DIFF_PICKAXE_KIND_G 8 /* grep in the patch */`
diffcore: add a pickaxe option to find a specific blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) One might be tempted to extend git-describe to also work with blobs, such that `git describe <blob-id>` gives a description as '<commit-ish>:<path>'. This was implemented at [2]; as seen by the sheer number of responses (>110), it turns out this is tricky to get right. The hard part to get right is picking the correct 'commit-ish' as that could be the commit that (re-)introduced the blob or the blob that removed the blob; the blob could exist in different branches. Junio hinted at a different approach of solving this problem, which this patch implements. Teach the diff machinery another flag for restricting the information to what is shown. For example: $ ./git log --oneline --find-object=v2.0.0:Makefile b2feb64309 Revert the whole "ask curl-config" topic for now 47fbfded53 i18n: only extract comments marked with "TRANSLATORS:" we observe that the Makefile as shipped with 2.0 was appeared in v1.9.2-471-g47fbfded53 and in v2.0.0-rc1-5-gb2feb6430b. The reason why these commits both occur prior to v2.0.0 are evil merges that are not found using this new mechanism. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob [2] https://public-inbox.org/git/20171028004419.10139-1-sbeller@google.com/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-01-04 23:50:42 +01:00			`#define DIFF_PICKAXE_KIND_OBJFIND 16 /* specific object IDs */`
git log/diff: add -G<regexp> that greps in the patch text Teach "-G<regexp>" that is similar to "-S<regexp> --pickaxe-regexp" to the "git diff" family of commands. This limits the diff queue to filepairs whose patch text actually has an added or a deleted line that matches the given regexp. Unlike "-S<regexp>", changing other parts of the line that has a substring that matches the given regexp IS counted as a change, as such a change would appear as one deletion followed by one addition in a patch text. Unlike -S (pickaxe) that is intended to be used to quickly detect a commit that changes the number of occurrences of hits between the preimage and the postimage to serve as a part of larger toolchain, this is meant to be used as the top-level Porcelain feature. The implementation unfortunately has to run "diff" twice if you are running "log" family of commands to produce patches in the final output (e.g. "git log -p" or "git format-patch"). I think we _could_ cache the result in-core if we wanted to, but that would require larger surgery to the diffcore machinery (i.e. adding an extra pointer in the filepair structure to keep a pointer to a strbuf around, stuff the textual diff to the strbuf inside diffgrep_consume(), and make use of it in later stages when it is available) and it may not be worth it. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-08-23 19:17:03 +02:00
diffcore: add a pickaxe option to find a specific blob Sometimes users are given a hash of an object and they want to identify it further (ex.: Use verify-pack to find the largest blobs, but what are these? or [1]) One might be tempted to extend git-describe to also work with blobs, such that `git describe <blob-id>` gives a description as '<commit-ish>:<path>'. This was implemented at [2]; as seen by the sheer number of responses (>110), it turns out this is tricky to get right. The hard part to get right is picking the correct 'commit-ish' as that could be the commit that (re-)introduced the blob or the blob that removed the blob; the blob could exist in different branches. Junio hinted at a different approach of solving this problem, which this patch implements. Teach the diff machinery another flag for restricting the information to what is shown. For example: $ ./git log --oneline --find-object=v2.0.0:Makefile b2feb64309 Revert the whole "ask curl-config" topic for now 47fbfded53 i18n: only extract comments marked with "TRANSLATORS:" we observe that the Makefile as shipped with 2.0 was appeared in v1.9.2-471-g47fbfded53 and in v2.0.0-rc1-5-gb2feb6430b. The reason why these commits both occur prior to v2.0.0 are evil merges that are not found using this new mechanism. [1] https://stackoverflow.com/questions/223678/which-commit-has-this-blob [2] https://public-inbox.org/git/20171028004419.10139-1-sbeller@google.com/ Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-01-04 23:50:42 +01:00			`#define DIFF_PICKAXE_KINDS_MASK (DIFF_PICKAXE_KIND_S \| \`
			`DIFF_PICKAXE_KIND_G \| \`
			`DIFF_PICKAXE_KIND_OBJFIND)`
diff: introduce DIFF_PICKAXE_KINDS_MASK Currently the check whether to perform pickaxing is done via checking `diffopt->pickaxe`, which contains the command line argument that we want to pickaxe for. Soon we'll introduce a new type of pickaxing, that will not store anything in the `.pickaxe` field, so let's migrate the check to be dependent on pickaxe_opts. It is not enough to just replace the check for pickaxe by pickaxe_opts, because flags might be set, but pickaxing was not requested ('-i'). To cope with that, introduce a mask to check only for the bits indicating the modes of operation. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-01-04 23:50:41 +01:00
diff: migrate diff_flags.pickaxe_ignore_case to a pickaxe_opts bit Currently flags for pickaxing are found in different places. Unify the flags into the `pickaxe_opts` field, which will contain any pickaxe related flags. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-01-04 23:50:40 +01:00			`#define DIFF_PICKAXE_IGNORE_CASE 32`
git log/diff: add -G<regexp> that greps in the patch text Teach "-G<regexp>" that is similar to "-S<regexp> --pickaxe-regexp" to the "git diff" family of commands. This limits the diff queue to filepairs whose patch text actually has an added or a deleted line that matches the given regexp. Unlike "-S<regexp>", changing other parts of the line that has a substring that matches the given regexp IS counted as a change, as such a change would appear as one deletion followed by one addition in a patch text. Unlike -S (pickaxe) that is intended to be used to quickly detect a commit that changes the number of occurrences of hits between the preimage and the postimage to serve as a part of larger toolchain, this is meant to be used as the top-level Porcelain feature. The implementation unfortunately has to run "diff" twice if you are running "log" family of commands to produce patches in the final output (e.g. "git log -p" or "git format-patch"). I think we _could_ cache the result in-core if we wanted to, but that would require larger surgery to the diffcore machinery (i.e. adding an extra pointer in the filepair structure to keep a pointer to a strbuf around, stuff the textual diff to the strbuf inside diffgrep_consume(), and make use of it in later stages when it is available) and it may not be worth it. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-08-23 19:17:03 +02:00
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`void diffcore_std(struct diff_options *);`
			`void diffcore_fix_diff_index(struct diff_options *);`
[PATCH] Add --diff-filter= output restriction to diff-* family. This is a halfway between debugging aid and a helper to write an ultra-smart merge scripts. The new option takes a string that consists of a list of "status" letters, and limits the diff output to only those classes of changes, with two exceptions: - A broken pair (aka "complete rewrite"), does not match D (deleted) or N (created). Use B to look for them. - The letter "A" in the diff-filter string does not match anything itself, but causes the entire diff that contains selected patches to be output (this behaviour is similar to that of --pickaxe-all for the -S option). For example, $ git-rev-list HEAD \| git-diff-tree --stdin -s -v -B -C --diff-filter=BCR shows a list of commits that have complete rewrite, copy, or rename. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-12 05:57:13 +02:00
[PATCH] Clean up diff option descriptions. I got tired of maintaining almost duplicated descriptions in diff-* brothers, both in usage string and documentation. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-13 21:52:35 +02:00			`#define COMMON_DIFF_OPTIONS_HELP \`
			`"\ncommon diff options:\n" \`
Diff: -l<num> to limit rename/copy detection. When many paths are modified, rename detection takes a lot of time. The new option -l<num> can be used to disable rename detection when more than <num> paths are possibly created as renames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:18:27 +02:00			`" -z output diff-raw with lines terminated with NUL.\n" \`
			`" -p output patch format.\n" \`
			`" -u synonym for -p.\n" \`
Document --patch-with-raw Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 13:22:17 +02:00			`" --patch-with-raw\n" \`
			`" output both a patch and the diff-raw format.\n" \`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`" --stat show diffstat instead of patch.\n" \`
diff --numstat [jc: with documentation from Jakub] Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-12 12:01:00 +02:00			`" --numstat show numeric diffstat instead of patch.\n" \`
diff-options: add --patch-with-stat With this option, git prepends a diffstat in front of the patch. Since I really, really do not know what a diffstat of a combined diff ("merge diff") should look like, the diffstat is not generated for these. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-15 13:41:18 +02:00			`" --patch-with-stat\n" \`
			`" output a patch and prepend its diffstat.\n" \`
Diff: -l<num> to limit rename/copy detection. When many paths are modified, rename detection takes a lot of time. The new option -l<num> can be used to disable rename detection when more than <num> paths are possibly created as renames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:18:27 +02:00			`" --name-only show only names of changed files.\n" \`
Diff: --name-status output format. The new output format shows only the status letter and paths. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:20:06 +02:00			`" --name-status show names and status of changed files.\n" \`
diff: --abbrev option When I show transcripts to explain how something works, I often find myself hand-editing the diff-raw output to shorten various object names in the output. This adds --abbrev option to the diff family, which shortens diff-raw output and diff-tree commit id headers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-14 02:21:41 +01:00			`" --full-index show full object name on index lines.\n" \`
diff --abbrev: document --abbrev=<n> form. It was implemented there but was not advertised. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-18 11:03:15 +01:00			`" --abbrev=<n> abbreviate object names in diff-tree header and diff-raw.\n" \`
Diff: -l<num> to limit rename/copy detection. When many paths are modified, rename detection takes a lot of time. The new option -l<num> can be used to disable rename detection when more than <num> paths are possibly created as renames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:18:27 +02:00			`" -R swap input file pairs.\n" \`
			`" -B detect complete rewrites.\n" \`
			`" -M detect renames.\n" \`
			`" -C detect copies.\n" \`
[PATCH] Clean up diff option descriptions. I got tired of maintaining almost duplicated descriptions in diff-* brothers, both in usage string and documentation. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-13 21:52:35 +02:00			`" --find-copies-harder\n" \`
Diff: -l<num> to limit rename/copy detection. When many paths are modified, rename detection takes a lot of time. The new option -l<num> can be used to disable rename detection when more than <num> paths are possibly created as renames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:18:27 +02:00			`" try unchanged files as candidate for copy detection.\n" \`
			`" -l<n> limit rename attempts up to <n> paths.\n" \`
			`" -O<file> reorder diffs according to the <file>.\n" \`
			`" -S<string> find filepair whose only one side contains the string.\n" \`
[PATCH] Clean up diff option descriptions. I got tired of maintaining almost duplicated descriptions in diff-* brothers, both in usage string and documentation. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-13 21:52:35 +02:00			`" --pickaxe-all\n" \`
Add -a and --text to common diff options help Signed-off-by: Stephan Feder <sf@b-i-t.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-07 15:57:08 +02:00			`" show all files diff when -S is used and hit is found.\n" \`
			`" -a --text treat all files as text.\n"`
[PATCH] Clean up diff option descriptions. I got tired of maintaining almost duplicated descriptions in diff-* brothers, both in usage string and documentation. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-13 21:52:35 +02:00
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`int diff_queue_is_empty(void);`
			`void diff_flush(struct diff_options*);`
			`void diff_warn_rename_limit(const char *varname, int needed, int degraded_cc);`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`/* diff-raw status letters */`
diff-raw: Use 'A' instead of 'N' for added files. This actually changes the diff-raw status letter from N to A for added files. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 23:31:19 +02:00			`#define DIFF_STATUS_ADDED 'A'`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`#define DIFF_STATUS_COPIED 'C'`
			`#define DIFF_STATUS_DELETED 'D'`
			`#define DIFF_STATUS_MODIFIED 'M'`
			`#define DIFF_STATUS_RENAMED 'R'`
			`#define DIFF_STATUS_TYPE_CHANGED 'T'`
			`#define DIFF_STATUS_UNKNOWN 'X'`
			`#define DIFF_STATUS_UNMERGED 'U'`

			`/* these are not diff-raw status letters proper, but used by`
			`* diffcore-filter insn to specify additional restrictions.`
			`*/`
Fix diff-filter All-Or-None mark. When we updated the marker for new files from 'N' to 'A', we forgot to notice that the letter is already taken by the All-Or-None mark. Change the All-Or-None marker to '' to resolve this conflict. git-diff-tree -r --diff-filter='R' -M shows all the changes (not just renames) that are contained in commits that have renames, in comparison with: git-diff-tree -r --diff-filter='R' -M shows the same set of changes but the diff output are limited only to renaming changes. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-05 02:44:17 +02:00			`#define DIFF_STATUS_FILTER_AON '*'`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`#define DIFF_STATUS_FILTER_BROKEN 'B'`

diff_unique_abbrev: rename to diff_aligned_abbrev The word "align" describes how the function actually differs from find_unique_abbrev, and will make it less confusing when we add more diff-specific abbrevation functions that do not do this alignment. Since this is a globally available function, let's also move its descriptive comment to the header file, where we typically document function interfaces. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-20 08:19:43 +02:00			`/*`
			`* This is different from find_unique_abbrev() in that`
			`* it stuffs the result with dots for alignment.`
			`*/`
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`const char diff_aligned_abbrev(const struct object_id sha1, int);`
diff: --abbrev option When I show transcripts to explain how something works, I often find myself hand-editing the diff-raw output to shorten various object names in the output. This adds --abbrev option to the diff family, which shortens diff-raw output and diff-tree commit id headers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-14 02:21:41 +01:00
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`/* do not report anything on removed paths */`
			`#define DIFF_SILENT_ON_REMOVED 01`
git-add: make the entry stat-clean after re-adding the same contents Earlier in commit 0781b8a9b2fe760fc4ed519a3a26e4b9bd6ccffe (add_file_to_index: skip rehashing if the cached stat already matches), add_file_to_index() were taught not to re-add the path if it already matches the index. The change meant well, but was not executed quite right. It used ie_modified() to see if the file on the work tree is really different from the index, and skipped adding the contents if the function says "not modified". This was wrong. There are three possible comparison results between the index and the file in the work tree: - with lstat(2) we _know_ they are different. E.g. if the length or the owner in the cached stat information is different from the length we just obtained from lstat(2), we can tell the file is modified without looking at the actual contents. - with lstat(2) we _know_ they are the same. The same length, the same owner, the same everything (but this has a twist, as described below). - we cannot tell from lstat(2) information alone and need to go to the filesystem to actually compare. The last case arises from what we call 'racy git' situation, that can be caused with this sequence: $ echo hello >file $ git add file $ echo aeiou >file ;# the same length If the second "echo" is done within the same filesystem timestamp granularity as the first "echo", then the timestamp recorded by "git add" and the timestamp we get from lstat(2) will be the same, and we can mistakenly say the file is not modified. The path is called 'racily clean'. We need to reliably detect racily clean paths are in fact modified. To solve this problem, when we write out the index, we mark the index entry that has the same timestamp as the index file itself (that is the time from the point of view of the filesystem) to tell any later code that does the lstat(2) comparison not to trust the cached stat info, and ie_modified() then actually goes to the filesystem to compare the contents for such a path. That's all good, but it should not be used for this "git add" optimization, as the goal of "git add" is to actually update the path in the index and make it stat-clean. With the false optimization, we did _not_ cause any data loss (after all, what we failed to do was only to update the cached stat information), but it made the following sequence leave the file stat dirty: $ echo hello >file $ git add file $ echo hello >file ;# the same contents $ git add file The solution is not to use ie_modified() which goes to the filesystem to see if it is really clean, but instead use ie_match_stat() with "assume racily clean paths are dirty" option, to force re-adding of such a path. There was another problem with "git add -u". The codepath shares the same issue when adding the paths that are found to be modified, but in addition, it asked "git diff-files" machinery run_diff_files() function (which is "git diff-files") to list the paths that are modified. But "git diff-files" machinery uses the same ie_modified() call so that it does not report racily clean _and_ actually clean paths as modified, which is not what we want. The patch allows the callers of run_diff_files() to pass the same "assume racily clean paths are dirty" option, and makes "git-add -u" codepath to use that option, to discover and re-add racily clean _and_ actually clean paths. We could further optimize on top of this patch to differentiate the case where the path really needs re-adding (i.e. the content of the racily clean entry was indeed different) and the case where only the cached stat information needs to be refreshed (i.e. the racily clean entry was actually clean), but I do not think it is worth it. This patch applies to maint and all the way up. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 03:22:52 +01:00			`/* report racily-clean paths as modified */`
			`#define DIFF_RACY_IS_MODIFIED 02`
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`int run_diff_files(struct rev_info *revs, unsigned int option);`
			`int run_diff_index(struct rev_info *revs, int cached);`
Libify diff-index. The second installment to libify diff brothers. The pathname arguments are checked more strictly than before because we now use the revision.c::setup_revisions() infrastructure. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 11:43:00 +02:00
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`int do_diff_cache(const struct object_id , struct diff_options );`
			`int diff_flush_patch_id(struct diff_options , struct object_id , int);`
add diff_flush_patch_id() to calculate the patch id Call it like this: unsigned char id[20]; if (diff_flush_patch_id(diff_options, id)) printf("And the patch id is: %s\n", sha1_to_hex(id)); Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-25 03:51:08 +02:00
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`int diff_result_code(struct diff_options *, int);`
diff --check: minor fixups There is no reason --exit-code and --check-diff must be mutually exclusive, so assign different bits to different results and allow them to be returned from the command. Introduce diff_result_code() to factor out the common code to decide final status code based on diffopt settings and use it everywhere. Update tests to match the above fix. Turning pager off when "diff --check" is used is a regression. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-14 08:40:27 +01:00
diff.c: remove implicit dependency on the_index A new variant repo_diff_setup() is added that takes 'struct repository *' and diff_setup() becomes a thin macro around it that is protected by NO_THE_REPOSITORY_COMPATIBILITY_MACROS, similar to NO_THE_INDEX_.... The plan is these macros will always be defined for all library files and the macros are only accessible in builtin/ Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-09-21 17:57:24 +02:00			`void diff_no_index(struct repository , struct rev_info , int, const char **);`
"git diff": do not ignore index without --no-index Even if "foo" and/or "bar" does not exist in index, "git diff foo bar" should not change behaviour drastically from "git diff foo bar baz" or "git diff foo". A feature that "sometimes works and is handy" is an unreliable cute hack. "git diff foo bar" outside a git repository continues to work as a more colourful alternative to "diff -u" as before. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-24 07:28:56 +02:00
diff-lib.c: remove the_repository references Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-11-10 06:49:04 +01:00			`int index_differs_from(struct repository r, const char def,`
			`const struct diff_flags *flags,`
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`int ita_invisible_in_index);`
Generalize and libify index_is_dirty() to index_differs_from(...) index_is_dirty() in builtin-revert.c checks if the index is dirty. This patch generalizes this function to check if the index differs from a revision, i.e. the former index_is_dirty() behavior can now be achieved by index_differs_from("HEAD", 0). The second argument "diff_flags" allows to set further diff option flags like DIFF_OPT_IGNORE_SUBMODULES. See DIFF_OPT_* macros in diff.h for a list. index_differs_from() seems to be useful for more than builtin-revert.c, so it is moved into diff-lib.c and also used in builtin-commit.c. Yet to mention: - "rev.abbrev = 0;" can be safely removed. This has no impact on performance or functioning of neither setup_revisions() nor run_diff_index(). - rev.pending.objects is free()d because this fixes a leak. (Also see 295dd2ad "Fix memory leak in traverse_commit_list") Mentored-by: Daniel Barkalow <barkalow@iabervon.org> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Stephan Beyer <s-beyer@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-10 15:30:35 +01:00
diff: clarify textconv interface The memory allocation scheme for the textconv interface is a bit tricky, and not well documented. It was originally designed as an internal part of diff.c (matching fill_mmfile), but gradually was made public. Refactoring it is difficult, but we can at least improve the situation by documenting the intended flow and enforcing it with an in-code assertion. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-02-22 19:28:54 +01:00			`/*`
			`* Fill the contents of the filespec "df", respecting any textconv defined by`
			`* its userdiff driver. The "driver" parameter must come from a`
			`* previous call to get_textconv(), and therefore should either be NULL or have`
			`* textconv enabled.`
			`*`
			`* Note that the memory ownership of the resulting buffer depends on whether`
			`* the driver field is NULL. If it is, then the memory belongs to the filespec`
			`* struct. If it is non-NULL, then "outbuf" points to a newly allocated buffer`
			`* that should be freed by the caller.`
			`*/`
diff.c: remove the_index dependency in textconv() functions Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-09-21 17:57:22 +02:00			`size_t fill_textconv(struct repository *r,`
			`struct userdiff_driver *driver,`
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`struct diff_filespec *df,`
			`char **outbuf);`
textconv: make the API public The textconv functionality allows one to convert a file into text before running diff. But this functionality can be useful to other features such as blame. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-07 17:23:36 +02:00
diff: clarify textconv interface The memory allocation scheme for the textconv interface is a bit tricky, and not well documented. It was originally designed as an internal part of diff.c (matching fill_mmfile), but gradually was made public. Refactoring it is difficult, but we can at least improve the situation by documenting the intended flow and enforcing it with an in-code assertion. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-02-22 19:28:54 +01:00			`/*`
			`* Look up the userdiff driver for the given filespec, and return it if`
			`* and only if it has textconv enabled (otherwise return NULL). The result`
			`* can be passed to fill_textconv().`
			`*/`
notes-cache.c: remove the_repository references Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-11-10 06:49:06 +01:00			`struct userdiff_driver get_textconv(struct repository r,`
userdiff.c: remove implicit dependency on the_index [jc: squashed in missing forward decl in userdiff.h found by Ramsay] Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-09-21 17:57:33 +02:00			`struct diff_filespec *one);`
textconv: make the API public The textconv functionality allows one to convert a file into text before running diff. But this functionality can be useful to other features such as blame. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-07 17:23:36 +02:00
blame: move textconv_object with related functions textconv_object is used in places other than blame.c and should be moved to a more appropriate location. Other textconv related functions are located in diff.c so that seems as good a place as any. Signed-off-by: Jeff Smith <whydoubt@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-24 07:15:10 +02:00			`/*`
			`* Prepare diff_filespec and convert it using diff textconv API`
			`* if the textconv driver exists.`
			`* Return 1 if the conversion succeeds, 0 otherwise.`
			`*/`
diff.c: remove the_index dependency in textconv() functions Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-09-21 17:57:22 +02:00			`int textconv_object(struct repository *repo,`
			`const char *path,`
			`unsigned mode,`
			`const struct object_id *oid, int oid_valid,`
			`char *buf, unsigned long buf_size);`
blame: move textconv_object with related functions textconv_object is used in places other than blame.c and should be moved to a more appropriate location. Other textconv related functions are located in diff.c so that seems as good a place as any. Signed-off-by: Jeff Smith <whydoubt@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-24 07:15:10 +02:00
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`int parse_rename_score(const char **cp_p);`
merge-recursive: option to specify rename threshold The recursive merge strategy turns on rename detection but leaves the rename threshold at the default. Add a strategy option to allow the user to specify a rename threshold to use. Signed-off-by: Kevin Ballard <kevin@sb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-09-28 01:58:25 +02:00
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`long parse_algorithm_value(const char *value);`
diff: Introduce --diff-algorithm command line option Since command line options have higher priority than config file variables and taking previous commit into account, we need a way how to specify myers algorithm on command line. However, inventing `--myers` is not the right answer. We need far more general option, and that is `--diff-algorithm`. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-16 08:51:58 +01:00
diff.h: remove extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2018-06-30 11:20:26 +02:00			`void print_stat_summary(FILE *fp, int files,`
			`int insertions, int deletions);`
			`void setup_diff_pager(struct diff_options *);`
Use correct grammar in diffstat summary line "git diff --stat" and "git apply --stat" now learn to print the line "%d files changed, %d insertions(+), %d deletions(-)" in singular form whenever applicable. "0 insertions" and "0 deletions" are also omitted unless they are both zero. This matches how versions of "diffstat" that are not prehistoric produced their output, and also makes this line translatable. [jc: with help from Thomas Dickey in archaeology of "diffstat"] [jc: squashed Jonathan's updates to illustrations in tutorials and a test] Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-02-01 13:55:07 +01:00
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00			`#endif /* DIFF_H */`