mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-18 15:04:49 +01:00

323 lines

8.6 KiB

C

Raw Normal View History

Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`/*`
			`* Helper functions for tree diff generation`
			`*/`
			`#include "cache.h"`
			`#include "diff.h"`
Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00			`#include "diffcore.h"`
Use blob_, commit_, tag_, and tree_type throughout. This replaces occurences of "blob", "commit", "tag", and "tree", where they're really used as type specifiers, which we already have defined global constants for. Signed-off-by: Peter Eriksen <s022018@student.dtu.dk> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-02 14:44:09 +02:00			`#include "tree.h"`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`static void show_entry(struct diff_options opt, const char prefix,`
			`struct tree_desc desc, struct strbuf base);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`static int compare_tree_entry(struct tree_desc t1, struct tree_desc t2,`
			`struct strbuf base, struct diff_options opt)`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`{`
			`unsigned mode1, mode2;`
			`const char path1, path2;`
			`const unsigned char sha1, sha2;`
			`int cmp, pathlen1, pathlen2;`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`int old_baselen = base->len;`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
Make the "struct tree_desc" operations available to others We have operations to "extract" and "update" a "struct tree_desc", but we only used them in tree-diff.c and they were static to that file. But other tree traversal functions can use them to their advantage Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-31 23:10:56 +01:00			`sha1 = tree_entry_extract(t1, &path1, &mode1);`
			`sha2 = tree_entry_extract(t2, &path2, &mode2);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
tree-walk.c: do not leak internal structure in tree_entry_len() tree_entry_len() does not simply take two random arguments and return a tree length. The two pointers must point to a tree item structure, or struct name_entry. Passing random pointers will return incorrect value. Force callers to pass struct name_entry instead of two pointers (with hope that they don't manually construct struct name_entry themselves) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:09 +02:00			`pathlen1 = tree_entry_len(&t1->entry);`
			`pathlen2 = tree_entry_len(&t2->entry);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`cmp = base_name_compare(path1, pathlen1, mode1, path2, pathlen2, mode2);`
			`if (cmp < 0) {`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`show_entry(opt, "-", t1, base);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`return -1;`
			`}`
			`if (cmp > 0) {`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`show_entry(opt, "+", t2, base);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`return 1;`
			`}`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`if (!DIFF_OPT_TST(opt, FIND_COPIES_HARDER) && !hashcmp(sha1, sha2) && mode1 == mode2)`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`return 0;`

			`/*`
			`* If the filemode has changed to/from a directory from/to a regular`
			`* file, we need to consider it a remove and an add.`
			`*/`
			`if (S_ISDIR(mode1) != S_ISDIR(mode2)) {`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`show_entry(opt, "-", t1, base);`
			`show_entry(opt, "+", t2, base);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`return 0;`
			`}`

diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`strbuf_add(base, path1, pathlen1);`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`if (DIFF_OPT_TST(opt, RECURSIVE) && S_ISDIR(mode1)) {`
Fix buffer overflow in git diff If PATH_MAX on your system is smaller than a path stored, it may cause buffer overflow and stack corruption in diff_addremove() and diff_change() functions when running git-diff Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-16 16:54:02 +02:00			`if (DIFF_OPT_TST(opt, TREE_IN_RECURSIVE)) {`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`opt->change(opt, mode1, mode2,`
diff: do not use null sha1 as a sentinel value The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-28 17:03:01 +02:00			`sha1, sha2, 1, 1, base->buf, 0, 0);`
Fix buffer overflow in git diff If PATH_MAX on your system is smaller than a path stored, it may cause buffer overflow and stack corruption in diff_addremove() and diff_change() functions when running git-diff Signed-off-by: Dmitry Potapov <dpotapov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-16 16:54:02 +02:00			`}`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`strbuf_addch(base, '/');`
Remove unused variables Noticed by gcc 4.6.0. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-22 13:50:08 +01:00			`diff_tree_sha1(sha1, sha2, base->buf, opt);`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`} else {`
diff: do not use null sha1 as a sentinel value The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-28 17:03:01 +02:00			`opt->change(opt, mode1, mode2, sha1, sha2, 1, 1, base->buf, 0, 0);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`}`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`strbuf_setlen(base, old_baselen);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`return 0;`
			`}`

			`/* A whole sub-tree went away or appeared */`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`static void show_tree(struct diff_options opt, const char prefix,`
			`struct tree_desc desc, struct strbuf base)`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`{`
tree_entry_interesting(): give meaningful names to return values It is a basic code hygiene to avoid magic constants that are unnamed. Besides, this helps extending the value later on for "interesting, but cannot decide if the entry truely matches yet" (ie. prefix matches) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:10 +02:00			`enum interesting match = entry_not_interesting;`
Improve tree_entry_interesting() handling code t_e_i() can return -1 or 2 to early shortcut a search. Current code may use up to two variables to handle it. One for saving return value from t_e_i temporarily, one for saving return code 2. The second variable is not needed. If we make sure the first variable does not change until the next t_e_i() call, then we can do something like this: int ret = 0; while (...) { if (ret != 2) { ret = t_e_i(); if (ret < 0) /* no longer interesting / break; if (ret == 0) / skip this round / continue; } / ret > 0, interesting */ } Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-25 10:34:20 +01:00			`for (; desc->size; update_tree_entry(desc)) {`
tree_entry_interesting(): give meaningful names to return values It is a basic code hygiene to avoid magic constants that are unnamed. Besides, this helps extending the value later on for "interesting, but cannot decide if the entry truely matches yet" (ie. prefix matches) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:10 +02:00			`if (match != all_entries_interesting) {`
Improve tree_entry_interesting() handling code t_e_i() can return -1 or 2 to early shortcut a search. Current code may use up to two variables to handle it. One for saving return value from t_e_i temporarily, one for saving return code 2. The second variable is not needed. If we make sure the first variable does not change until the next t_e_i() call, then we can do something like this: int ret = 0; while (...) { if (ret != 2) { ret = t_e_i(); if (ret < 0) /* no longer interesting / break; if (ret == 0) / skip this round / continue; } / ret > 0, interesting */ } Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-25 10:34:20 +01:00			`match = tree_entry_interesting(&desc->entry, base, 0,`
			`&opt->pathspec);`
tree_entry_interesting(): give meaningful names to return values It is a basic code hygiene to avoid magic constants that are unnamed. Besides, this helps extending the value later on for "interesting, but cannot decide if the entry truely matches yet" (ie. prefix matches) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:10 +02:00			`if (match == all_entries_not_interesting)`
Improve tree_entry_interesting() handling code t_e_i() can return -1 or 2 to early shortcut a search. Current code may use up to two variables to handle it. One for saving return value from t_e_i temporarily, one for saving return code 2. The second variable is not needed. If we make sure the first variable does not change until the next t_e_i() call, then we can do something like this: int ret = 0; while (...) { if (ret != 2) { ret = t_e_i(); if (ret < 0) /* no longer interesting / break; if (ret == 0) / skip this round / continue; } / ret > 0, interesting */ } Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-25 10:34:20 +01:00			`break;`
tree_entry_interesting(): give meaningful names to return values It is a basic code hygiene to avoid magic constants that are unnamed. Besides, this helps extending the value later on for "interesting, but cannot decide if the entry truely matches yet" (ie. prefix matches) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:10 +02:00			`if (match == entry_not_interesting)`
Improve tree_entry_interesting() handling code t_e_i() can return -1 or 2 to early shortcut a search. Current code may use up to two variables to handle it. One for saving return value from t_e_i temporarily, one for saving return code 2. The second variable is not needed. If we make sure the first variable does not change until the next t_e_i() call, then we can do something like this: int ret = 0; while (...) { if (ret != 2) { ret = t_e_i(); if (ret < 0) /* no longer interesting / break; if (ret == 0) / skip this round / continue; } / ret > 0, interesting */ } Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-25 10:34:20 +01:00			`continue;`
tree_entry_interesting(): allow it to say "everything is interesting" In addition to optimizing pathspecs that would never match, which was done earlier, this optimizes pathspecs that would always match (e.g. "arch/" while the traversal is already in "arch/i386/" hierarchy). This patch makes the worst case slightly more palatable, while improving average case. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-22 01:00:27 +01:00			`}`
Improve tree_entry_interesting() handling code t_e_i() can return -1 or 2 to early shortcut a search. Current code may use up to two variables to handle it. One for saving return value from t_e_i temporarily, one for saving return code 2. The second variable is not needed. If we make sure the first variable does not change until the next t_e_i() call, then we can do something like this: int ret = 0; while (...) { if (ret != 2) { ret = t_e_i(); if (ret < 0) /* no longer interesting / break; if (ret == 0) / skip this round / continue; } / ret > 0, interesting */ } Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-25 10:34:20 +01:00			`show_entry(opt, prefix, desc, base);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`}`
			`}`

			`/* A file entry went away or appeared */`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`static void show_entry(struct diff_options opt, const char prefix,`
			`struct tree_desc desc, struct strbuf base)`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`{`
			`unsigned mode;`
			`const char *path;`
Make the "struct tree_desc" operations available to others We have operations to "extract" and "update" a "struct tree_desc", but we only used them in tree-diff.c and they were static to that file. But other tree traversal functions can use them to their advantage Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-31 23:10:56 +01:00			`const unsigned char *sha1 = tree_entry_extract(desc, &path, &mode);`
tree-walk.c: do not leak internal structure in tree_entry_len() tree_entry_len() does not simply take two random arguments and return a tree length. The two pointers must point to a tree item structure, or struct name_entry. Passing random pointers will return incorrect value. Force callers to pass struct name_entry instead of two pointers (with hope that they don't manually construct struct name_entry themselves) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:09 +02:00			`int pathlen = tree_entry_len(&desc->entry);`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`int old_baselen = base->len;`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`strbuf_add(base, path, pathlen);`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`if (DIFF_OPT_TST(opt, RECURSIVE) && S_ISDIR(mode)) {`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`enum object_type type;`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`struct tree_desc inner;`
			`void *tree;`
Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 18:08:25 +01:00			`unsigned long size;`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 18:08:25 +01:00			`tree = read_sha1_file(sha1, &type, &size);`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`if (!tree \|\| type != OBJ_TREE)`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`die("corrupt tree sha %s", sha1_to_hex(sha1));`

diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`if (DIFF_OPT_TST(opt, TREE_IN_RECURSIVE))`
diff: do not use null sha1 as a sentinel value The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-28 17:03:01 +02:00			`opt->add_remove(opt, *prefix, mode, sha1, 1, base->buf, 0);`
diff-tree -r -t: include added/removed directories in the output We used to include only the modified and typechanged directories in the ouptut, but for consistency's sake, we should also include added and removed ones as well. This makes the output more consistent, but it may break existing scripts that expect to see the current output which has long been the established behaviour. Signed-off-by: Nick Edelen <sirnot@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-14 02:06:09 +02:00
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`strbuf_addch(base, '/');`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`init_tree_desc(&inner, tree, size);`
			`show_tree(opt, prefix, &inner, base);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`free(tree);`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`} else`
diff: do not use null sha1 as a sentinel value The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-28 17:03:01 +02:00			`opt->add_remove(opt, prefix[0], mode, sha1, 1, base->buf, 0);`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00
			`strbuf_setlen(base, old_baselen);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`}`

diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`static void skip_uninteresting(struct tree_desc t, struct strbuf base,`
tree_entry_interesting(): give meaningful names to return values It is a basic code hygiene to avoid magic constants that are unnamed. Besides, this helps extending the value later on for "interesting, but cannot decide if the entry truely matches yet" (ie. prefix matches) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:10 +02:00			`struct diff_options *opt,`
			`enum interesting *match)`
Set up for better tree diff optimizations This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch really does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninterestign crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had another function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable. The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 23:18:30 +01:00			`{`
			`while (t->size) {`
Improve tree_entry_interesting() handling code t_e_i() can return -1 or 2 to early shortcut a search. Current code may use up to two variables to handle it. One for saving return value from t_e_i temporarily, one for saving return code 2. The second variable is not needed. If we make sure the first variable does not change until the next t_e_i() call, then we can do something like this: int ret = 0; while (...) { if (ret != 2) { ret = t_e_i(); if (ret < 0) /* no longer interesting / break; if (ret == 0) / skip this round / continue; } / ret > 0, interesting */ } Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-25 10:34:20 +01:00			`*match = tree_entry_interesting(&t->entry, base, 0, &opt->pathspec);`
			`if (*match) {`
tree_entry_interesting(): give meaningful names to return values It is a basic code hygiene to avoid magic constants that are unnamed. Besides, this helps extending the value later on for "interesting, but cannot decide if the entry truely matches yet" (ie. prefix matches) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:10 +02:00			`if (*match == all_entries_not_interesting)`
Improve tree_entry_interesting() handling code t_e_i() can return -1 or 2 to early shortcut a search. Current code may use up to two variables to handle it. One for saving return value from t_e_i temporarily, one for saving return code 2. The second variable is not needed. If we make sure the first variable does not change until the next t_e_i() call, then we can do something like this: int ret = 0; while (...) { if (ret != 2) { ret = t_e_i(); if (ret < 0) /* no longer interesting / break; if (ret == 0) / skip this round / continue; } / ret > 0, interesting */ } Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-25 10:34:20 +01:00			`t->size = 0;`
			`break;`
Set up for better tree diff optimizations This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch really does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninterestign crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had another function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable. The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 23:18:30 +01:00			`}`
Improve tree_entry_interesting() handling code t_e_i() can return -1 or 2 to early shortcut a search. Current code may use up to two variables to handle it. One for saving return value from t_e_i temporarily, one for saving return code 2. The second variable is not needed. If we make sure the first variable does not change until the next t_e_i() call, then we can do something like this: int ret = 0; while (...) { if (ret != 2) { ret = t_e_i(); if (ret < 0) /* no longer interesting / break; if (ret == 0) / skip this round / continue; } / ret > 0, interesting */ } Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-25 10:34:20 +01:00			`update_tree_entry(t);`
Set up for better tree diff optimizations This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch really does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninterestign crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had another function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable. The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 23:18:30 +01:00			`}`
			`}`

diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`int diff_tree(struct tree_desc t1, struct tree_desc t2,`
			`const char base_str, struct diff_options opt)`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`{`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`struct strbuf base;`
			`int baselen = strlen(base_str);`
tree_entry_interesting(): give meaningful names to return values It is a basic code hygiene to avoid magic constants that are unnamed. Besides, this helps extending the value later on for "interesting, but cannot decide if the entry truely matches yet" (ie. prefix matches) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:10 +02:00			`enum interesting t1_match = entry_not_interesting;`
			`enum interesting t2_match = entry_not_interesting;`
Avoid unnecessary strlen() calls This is a micro-optimization that grew out of the mailing list discussion about "strlen()" showing up in profiles. We used to pass regular C strings around to the low-level tree walking routines, and while this worked fine, it meant that we needed to call strlen() on strings that the caller always actually knew the size of anyway. So pass the length of the string down wih the string, and avoid unnecessary calls to strlen(). Also, when extracting a pathname from a tree entry, use "tree_entry_len()" instead of strlen(), since the length of the pathname is directly calculable from the decoded tree entry itself without having to actually do another strlen(). This shaves off another ~5-10% from some loads that are very tree intensive (notably doing commit filtering by a pathspec). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>" Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 04:06:24 +01:00
tree_entry_interesting(): support depth limit This is needed to replace pathspec_matches() in builtin/grep.c. max_depth == -1 means infinite depth. Depth limit is only effective when pathspec.recursive == 1. When pathspec.recursive == 0, the behavior depends on match functions: non-recursive for tree_entry_interesting() and recursive for match_pathspec{,_depth} Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:44 +01:00			`/* Enable recursion indefinitely */`
			`opt->pathspec.recursive = DIFF_OPT_TST(opt, RECURSIVE);`
			`opt->pathspec.max_depth = -1;`

diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`strbuf_init(&base, PATH_MAX);`
			`strbuf_add(&base, base_str, baselen);`

Set up for better tree diff optimizations This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch really does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninterestign crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had another function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable. The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 23:18:30 +01:00			`for (;;) {`
diff: futureproof "stop feeding the backend early" logic Refactor the "do not stop feeding the backend early" logic into a small helper function and use it in both run_diff_files() and diff_tree() that has the stop-early optimization. We may later add other types of diffcore transformation that require to look at the whole result like diff-filter does, and having the logic in a single place is essential for longer term maintainability. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-31 18:14:17 +02:00			`if (diff_can_quit_early(opt))`
Teach --quiet to diff backends. This teaches git-diff-files, git-diff-index and git-diff-tree backends to exit early under --quiet option. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-14 19:12:51 +01:00			`break;`
Convert struct diff_options to use struct pathspec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:38 +01:00			`if (opt->pathspec.nr) {`
Improve tree_entry_interesting() handling code t_e_i() can return -1 or 2 to early shortcut a search. Current code may use up to two variables to handle it. One for saving return value from t_e_i temporarily, one for saving return code 2. The second variable is not needed. If we make sure the first variable does not change until the next t_e_i() call, then we can do something like this: int ret = 0; while (...) { if (ret != 2) { ret = t_e_i(); if (ret < 0) /* no longer interesting / break; if (ret == 0) / skip this round / continue; } / ret > 0, interesting */ } Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-25 10:34:20 +01:00			`skip_uninteresting(t1, &base, opt, &t1_match);`
			`skip_uninteresting(t2, &base, opt, &t2_match);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`}`
			`if (!t1->size) {`
Set up for better tree diff optimizations This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch really does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninterestign crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had another function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable. The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 23:18:30 +01:00			`if (!t2->size)`
			`break;`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`show_entry(opt, "+", t2, &base);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`update_tree_entry(t2);`
			`continue;`
			`}`
			`if (!t2->size) {`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`show_entry(opt, "-", t1, &base);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`update_tree_entry(t1);`
			`continue;`
			`}`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00			`switch (compare_tree_entry(t1, t2, &base, opt)) {`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`case -1:`
			`update_tree_entry(t1);`
			`continue;`
			`case 0:`
			`update_tree_entry(t1);`
			`/* Fallthrough */`
			`case 1:`
			`update_tree_entry(t2);`
			`continue;`
			`}`
'git foo' program identifies itself without dash in die() messages This is a mechanical conversion of all '*.c' files with: s/((?:die\|error\|warning)\("git)-(\S+:)/$1 $2/; The result was manually inspected and no false positive was found. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-31 18:39:19 +02:00			`die("git diff-tree: internal error");`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`}`
diff-tree: convert base+baselen to writable strbuf In traversing trees, a full path is splitted into two parts: base directory and entry. They are however quite often concatenated whenever a full path is needed. Current code allocates a new buffer, do two memcpy(), use it, then release. Instead this patch turns "base" to a writable, extendable buffer. When a concatenation is needed, the callee only needs to append "entry" to base, use it, then truncate the entry out again. "base" must remain unchanged before and after entering a function. This avoids quite a bit of malloc() and memcpy(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:42 +01:00
			`strbuf_release(&base);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`return 0;`
			`}`

Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00			`/*`
			`* Does it look like the resulting diff might be due to a rename?`
			`* - single entry`
			`* - not a valid previous file`
			`*/`
			`static inline int diff_might_be_rename(void)`
			`{`
			`return diff_queued_diff.nr == 1 &&`
			`!DIFF_FILE_VALID(diff_queued_diff.queue[0]->one);`
			`}`

			`static void try_to_follow_renames(struct tree_desc t1, struct tree_desc t2, const char base, struct diff_options opt)`
			`{`
			`struct diff_options diff_opts;`
Fix up "git log --follow" a bit.. This fixes "git log --follow" to hopefully not leak memory any more, and also cleans it up a bit to look more like some of the other functions that use "diff_queued_diff" (by not using it directly as a global in the code, but by instead just taking a pointer to the diff queue and using that). As to "diff_queued_diff", I think it would be better off not as a global at all, but as being just an entry in the "struct diff_options" structure, but that's a separate issue, and there may be some subtle reason for why it's currently a global. Anyway, no real changes. Instead of having a magical first entry in the diff-queue, we now end up just keeping the diff-queue clean, and keeping our "preferred" file pairing in an internal "choice" variable. That makes it easy to switch the choice around when we find a better one. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-21 19:22:59 +02:00			`struct diff_queue_struct *q = &diff_queued_diff;`
			`struct diff_filepair *choice;`
			`const char *paths[1];`
Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00			`int i;`

Fix up "git log --follow" a bit.. This fixes "git log --follow" to hopefully not leak memory any more, and also cleans it up a bit to look more like some of the other functions that use "diff_queued_diff" (by not using it directly as a global in the code, but by instead just taking a pointer to the diff queue and using that). As to "diff_queued_diff", I think it would be better off not as a global at all, but as being just an entry in the "struct diff_options" structure, but that's a separate issue, and there may be some subtle reason for why it's currently a global. Anyway, no real changes. Instead of having a magical first entry in the diff-queue, we now end up just keeping the diff-queue clean, and keeping our "preferred" file pairing in an internal "choice" variable. That makes it easy to switch the choice around when we find a better one. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-21 19:22:59 +02:00			`/* Remove the file creation entry from the diff queue, and remember it */`
			`choice = q->queue[0];`
			`q->nr = 0;`

Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00			`diff_setup(&diff_opts);`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`DIFF_OPT_SET(&diff_opts, RECURSIVE);`
Make git log --follow find copies among unmodified files. 'git log --follow <path>' don't track copies from unmodified files, and this patch fix it. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-07 06:52:29 +02:00			`DIFF_OPT_SET(&diff_opts, FIND_COPIES_HARDER);`
Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00			`diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;`
Convert struct diff_options to use struct pathspec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:38 +01:00			`diff_opts.single_follow = opt->pathspec.raw[0];`
Fix diffcore-break total breakage Ok, so on the kernel list, some people noticed that "git log --follow" doesn't work too well with some files in the x86 merge, because a lot of files got renamed in very special ways. In particular, there was a pattern of doing single commits with renames that looked basically like - rename "filename.h" -> "filename_64.h" - create new "filename.c" that includes "filename_32.h" or "filename_64.h" depending on whether we're 32-bit or 64-bit. which was preparatory for smushing the two trees together. Now, there's two issues here: - "filename.c" remained. Yes, it was a rename, but there was a new file created with the old name in the same commit. This was important, because we wanted each commit to compile properly, so that it was bisectable, so splitting the rename into one commit and the "create helper file" into another was not an option. So we need to break associations where the contents change too much. Fine. We have the -B flag for that. When we break things up, then the rename detection will be able to figure out whether there are better alternatives. - "git log --follow" didn't with with -B. Now, the second case was really simple: we use a different "diffopt" structure for the rename detection than the basic one (which we use for showing the diffs). So that second case is trivially fixed by a trivial one-liner that just copies the break_opt values from the "real" diffopts to the one used for rename following. So now "git log -B --follow" works fine: diff --git a/tree-diff.c b/tree-diff.c index 26bdbdd..7c261fd 100644 --- a/tree-diff.c +++ b/tree-diff.c @@ -319,6 +319,7 @@ static void try_to_follow_renames(struct tree_desc t1, struct tree_desc t2, co diff_opts.detect_rename = DIFF_DETECT_RENAME; diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT; diff_opts.single_follow = opt->paths[0]; + diff_opts.break_opt = opt->break_opt; paths[0] = NULL; diff_tree_setup_paths(paths, &diff_opts); if (diff_setup_done(&diff_opts) < 0) however, the end result does not work. Because our diffcore-break.c logic is totally bogus! In particular: - it used to do if (base_size < MINIMUM_BREAK_SIZE) return 0; /* we do not break too small filepair / which basically says "don't bother to break small files". But that "base_size" is the smaller* of the two sizes, which means that if some large file was rewritten into one that just includes another file, we would look at the (small) result, and decide that it's smaller than the break size, so it cannot be worth it to break it up! Even if the other side was ten times bigger and looked nothing like the samell file! That's clearly bogus. I replaced "base_size" with "max_size", so that we compare the bigger of the filepair with the break size. - It calculated a "merge_score", which was the score needed to merge it back together if nothing else wanted it. But even if it was so different that we would never want to merge it back, we wouldn't consider it a break! That makes no sense. So I added if (merge_score_p > break_score) return 1; to make it clear that if we wouldn't want to merge it at the end, it was definitely* a break. - It compared the whole "extent of damage", counting all inserts and deletes, but it based this score on the "base_size", and generated the damage score with delta_size = src_removed + literal_added; damage_score = delta_size * MAX_SCORE / base_size; but that makes no sense either, since quite often, this will result in a number that is bigger than MAX_SCORE! Why? Because base_size is (again) the smaller of the two files we compare, and when you start out from a small file and add a lot (or start out from a large file and remove a lot), the base_size is going to be much smaller than the damage! Again, the fix was to replace "base_size" with "max_size", at which point the damage actually becomes a sane percentage of the whole. With these changes in place, not only does "git log -B --follow" work for the case that triggered this in the first place, ie now git log -B --follow arch/x86/kernel/vmlinux_64.lds.S actually gives reasonable results. But I also wanted to verify it in general, by doing a full-history git log --stat -B -C on my kernel tree with the old code and the new code. There's some tweaking to be done, but generally, the new code generates much better results wrt breaking up files (and then finding better rename candidates). Here's a few examples of the "--stat" output: - This: include/asm-x86/Kbuild \| 2 - include/asm-x86/debugreg.h \| 79 +++++++++++++++++++++++++++++++++++------ include/asm-x86/debugreg_32.h \| 64 --------------------------------- include/asm-x86/debugreg_64.h \| 65 --------------------------------- 4 files changed, 68 insertions(+), 142 deletions(-) Becomes: include/asm-x86/Kbuild \| 2 - include/asm-x86/{debugreg_64.h => debugreg.h} \| 9 +++- include/asm-x86/debugreg_32.h \| 64 ------------------------- 3 files changed, 7 insertions(+), 68 deletions(-) - This: include/asm-x86/bug.h \| 41 +++++++++++++++++++++++++++++++++++++++-- include/asm-x86/bug_32.h \| 37 ------------------------------------- include/asm-x86/bug_64.h \| 34 ---------------------------------- 3 files changed, 39 insertions(+), 73 deletions(-) Becomes include/asm-x86/{bug_64.h => bug.h} \| 20 +++++++++++++----- include/asm-x86/bug_32.h \| 37 ----------------------------------- 2 files changed, 14 insertions(+), 43 deletions(-) Now, in some other cases, it does actually turn a rename into a real "delete+create" pair, and then the diff is usually bigger, so truth in advertizing: it doesn't always generate a nicer diff. But for what -B was meant for, I think this is a big improvement, and I suspect those cases where it generates a bigger diff are tweakable. So I think this diff fixes a real bug, but we might still want to tweak the default values and perhaps the exact rules for when a break happens. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-20 21:31:31 +02:00			`diff_opts.break_opt = opt->break_opt;`
use custom rename score during --follow If you provide a custom rename score on the command line, like: git log -M50 --follow foo.c it is completely ignored, and there is no way to --follow with a looser rename score. Instead, let's use the same rename score that will be used for generating diffs. This is convenient, and mirrors what we do with the break-score. You can see an example of it being useful in git.git: $ git log --oneline --summary --follow \ Documentation/technical/api-string-list.txt 86d4b52 string-list: Add API to remove an item from an unsorted list 1d2f80f string_list: Fix argument order for string_list_append e242148 string-list: add unsorted_string_list_lookup() 0dda1d1 Fix two leftovers from path_list->string_list c455c87 Rename path_list to string_list create mode 100644 Documentation/technical/api-string-list.txt $ git log --oneline --summary -M40 --follow \ Documentation/technical/api-string-list.txt 86d4b52 string-list: Add API to remove an item from an unsorted list 1d2f80f string_list: Fix argument order for string_list_append e242148 string-list: add unsorted_string_list_lookup() 0dda1d1 Fix two leftovers from path_list->string_list c455c87 Rename path_list to string_list rename Documentation/technical/{api-path-list.txt => api-string-list.txt} (47%) 328a475 path-list documentation: document all functions and data structures 530e741 Start preparing the API documents. create mode 100644 Documentation/technical/api-path-list.txt You could have two separate rename scores, one for following and one for diff. But almost nobody is going to want that, and it would just be unnecessarily confusing. Besides which, we re-use the diff results from try_to_follow_renames for the actual diff output, which means having them as separate scores is actively wrong. E.g., with the current code, you get: $ git log --oneline --diff-filter=R --name-status \ -M90 --follow git.spec.in 27dedf0 GIT 0.99.9j aka 1.0rc3 R084 git-core.spec.in git.spec.in f85639c Rename the RPM from "git" to "git-core" R098 git.spec.in git-core.spec.in The first one should not be considered a rename by the -M score we gave, but we print it anyway, since we blindly re-use the diff information from the follow (which uses the default score). So this could also be considered simply a bug-fix, as with the current code "-M" is completely ignored when using "--follow". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-12-16 12:27:50 +01:00			`diff_opts.rename_score = opt->rename_score;`
Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00			`paths[0] = NULL;`
			`diff_tree_setup_paths(paths, &diff_opts);`
diff_setup_done(): return void diff_setup_done() has historically returned an error code, but lost the last nonzero return in 943d5b7 (allow diff.renamelimit to be set regardless of -M/-C, 2006-08-09). The callers were in a pretty confused state: some actually checked for the return code, and some did not. Let it return void, and patch all callers to take this into account. This conveniently also gets rid of a handful of different(!) error messages that could never be triggered anyway. Note that the function can still die(). Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-03 14:16:24 +02:00			`diff_setup_done(&diff_opts);`
Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00			`diff_tree(t1, t2, base, &diff_opts);`
			`diffcore_std(&diff_opts);`
Fix small memory leaks induced by diff_tree_setup_paths Run diff_tree_release_paths in the appropriate places, and add a test to avoid NULL dereference. Better safe than sorry. Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-11 22:59:55 +01:00			`diff_tree_release_paths(&diff_opts);`
Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00
Fix up "git log --follow" a bit.. This fixes "git log --follow" to hopefully not leak memory any more, and also cleans it up a bit to look more like some of the other functions that use "diff_queued_diff" (by not using it directly as a global in the code, but by instead just taking a pointer to the diff queue and using that). As to "diff_queued_diff", I think it would be better off not as a global at all, but as being just an entry in the "struct diff_options" structure, but that's a separate issue, and there may be some subtle reason for why it's currently a global. Anyway, no real changes. Instead of having a magical first entry in the diff-queue, we now end up just keeping the diff-queue clean, and keeping our "preferred" file pairing in an internal "choice" variable. That makes it easy to switch the choice around when we find a better one. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-21 19:22:59 +02:00			`/* Go through the new set of filepairing, and see if we find a more interesting one */`
diff --follow: do call diffcore_std() as necessary Usually, diff frontends populate the output queue with filepairs without any rename information and call diffcore_std() to sort the renames out. When --follow is in effect, however, diff-tree family of frontend has a hack that looks like this: diff-tree frontend -> diff_tree_sha1() . populate diff_queued_diff . if --follow is in effect and there is only one change that creates the target path, then -> try_to_follow_renames() -> diff_tree_sha1() with no pathspec but with -C -> diffcore_std() to find renames . if rename is found, tweak diff_queued_diff and put a single filepair that records the found rename there -> diffcore_std() . tweak elements on diff_queued_diff by - rename detection - path ordering - pickaxe filtering We need to skip parts of the second call to diffcore_std() that is related to rename detection, and do so only when try_to_follow_renames() did find a rename. Earlier 1da6175 (Make diffcore_std only can run once before a diff_flush, 2010-05-06) tried to deal with this issue incorrectly; it unconditionally disabled any second call to diffcore_std(). This hopefully fixes the breakage. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-08-13 21:17:45 +02:00			`opt->found_follow = 0;`
Fix up "git log --follow" a bit.. This fixes "git log --follow" to hopefully not leak memory any more, and also cleans it up a bit to look more like some of the other functions that use "diff_queued_diff" (by not using it directly as a global in the code, but by instead just taking a pointer to the diff queue and using that). As to "diff_queued_diff", I think it would be better off not as a global at all, but as being just an entry in the "struct diff_options" structure, but that's a separate issue, and there may be some subtle reason for why it's currently a global. Anyway, no real changes. Instead of having a magical first entry in the diff-queue, we now end up just keeping the diff-queue clean, and keeping our "preferred" file pairing in an internal "choice" variable. That makes it easy to switch the choice around when we find a better one. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-21 19:22:59 +02:00			`for (i = 0; i < q->nr; i++) {`
			`struct diff_filepair *p = q->queue[i];`
Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00
			`/*`
			`* Found a source? Not only do we use that for the new`
Fix up "git log --follow" a bit.. This fixes "git log --follow" to hopefully not leak memory any more, and also cleans it up a bit to look more like some of the other functions that use "diff_queued_diff" (by not using it directly as a global in the code, but by instead just taking a pointer to the diff queue and using that). As to "diff_queued_diff", I think it would be better off not as a global at all, but as being just an entry in the "struct diff_options" structure, but that's a separate issue, and there may be some subtle reason for why it's currently a global. Anyway, no real changes. Instead of having a magical first entry in the diff-queue, we now end up just keeping the diff-queue clean, and keeping our "preferred" file pairing in an internal "choice" variable. That makes it easy to switch the choice around when we find a better one. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-21 19:22:59 +02:00			`* diff_queued_diff, we will also use that as the path in`
Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00			`* the future!`
			`*/`
Convert struct diff_options to use struct pathspec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:38 +01:00			`if ((p->status == 'R' \|\| p->status == 'C') &&`
			`!strcmp(p->two->path, opt->pathspec.raw[0])) {`
Fix up "git log --follow" a bit.. This fixes "git log --follow" to hopefully not leak memory any more, and also cleans it up a bit to look more like some of the other functions that use "diff_queued_diff" (by not using it directly as a global in the code, but by instead just taking a pointer to the diff queue and using that). As to "diff_queued_diff", I think it would be better off not as a global at all, but as being just an entry in the "struct diff_options" structure, but that's a separate issue, and there may be some subtle reason for why it's currently a global. Anyway, no real changes. Instead of having a magical first entry in the diff-queue, we now end up just keeping the diff-queue clean, and keeping our "preferred" file pairing in an internal "choice" variable. That makes it easy to switch the choice around when we find a better one. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-21 19:22:59 +02:00			`/* Switch the file-pairs around */`
			`q->queue[i] = choice;`
			`choice = p;`

			`/* Update the path we use from now on.. */`
Fix small memory leaks induced by diff_tree_setup_paths Run diff_tree_release_paths in the appropriate places, and add a test to avoid NULL dereference. Better safe than sorry. Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-11 22:59:55 +01:00			`diff_tree_release_paths(opt);`
Convert struct diff_options to use struct pathspec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:38 +01:00			`opt->pathspec.raw[0] = xstrdup(p->one->path);`
			`diff_tree_setup_paths(opt->pathspec.raw, opt);`
diff --follow: do call diffcore_std() as necessary Usually, diff frontends populate the output queue with filepairs without any rename information and call diffcore_std() to sort the renames out. When --follow is in effect, however, diff-tree family of frontend has a hack that looks like this: diff-tree frontend -> diff_tree_sha1() . populate diff_queued_diff . if --follow is in effect and there is only one change that creates the target path, then -> try_to_follow_renames() -> diff_tree_sha1() with no pathspec but with -C -> diffcore_std() to find renames . if rename is found, tweak diff_queued_diff and put a single filepair that records the found rename there -> diffcore_std() . tweak elements on diff_queued_diff by - rename detection - path ordering - pickaxe filtering We need to skip parts of the second call to diffcore_std() that is related to rename detection, and do so only when try_to_follow_renames() did find a rename. Earlier 1da6175 (Make diffcore_std only can run once before a diff_flush, 2010-05-06) tried to deal with this issue incorrectly; it unconditionally disabled any second call to diffcore_std(). This hopefully fixes the breakage. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-08-13 21:17:45 +02:00
			`/*`
			`* The caller expects us to return a set of vanilla`
			`* filepairs to let a later call to diffcore_std()`
			`* it makes to sort the renames out (among other`
			`* things), but we already have found renames`
			`* ourselves; signal diffcore_std() not to muck with`
			`* rename information.`
			`*/`
			`opt->found_follow = 1;`
Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00			`break;`
			`}`
			`}`

			`/*`
Fix typos / spelling in comments Signed-off-by: Mike Ralphson <mike@abacus.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-04-17 20:13:30 +02:00			`* Then, discard all the non-relevant file pairs...`
Fix up "git log --follow" a bit.. This fixes "git log --follow" to hopefully not leak memory any more, and also cleans it up a bit to look more like some of the other functions that use "diff_queued_diff" (by not using it directly as a global in the code, but by instead just taking a pointer to the diff queue and using that). As to "diff_queued_diff", I think it would be better off not as a global at all, but as being just an entry in the "struct diff_options" structure, but that's a separate issue, and there may be some subtle reason for why it's currently a global. Anyway, no real changes. Instead of having a magical first entry in the diff-queue, we now end up just keeping the diff-queue clean, and keeping our "preferred" file pairing in an internal "choice" variable. That makes it easy to switch the choice around when we find a better one. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-21 19:22:59 +02:00			`*/`
			`for (i = 0; i < q->nr; i++) {`
			`struct diff_filepair *p = q->queue[i];`
			`diff_free_filepair(p);`
			`}`

			`/*`
			`* .. and re-instate the one we want (which might be either the`
			`* original one, or the rename/copy we found)`
Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00			`*/`
Fix up "git log --follow" a bit.. This fixes "git log --follow" to hopefully not leak memory any more, and also cleans it up a bit to look more like some of the other functions that use "diff_queued_diff" (by not using it directly as a global in the code, but by instead just taking a pointer to the diff queue and using that). As to "diff_queued_diff", I think it would be better off not as a global at all, but as being just an entry in the "struct diff_options" structure, but that's a separate issue, and there may be some subtle reason for why it's currently a global. Anyway, no real changes. Instead of having a magical first entry in the diff-queue, we now end up just keeping the diff-queue clean, and keeping our "preferred" file pairing in an internal "choice" variable. That makes it easy to switch the choice around when we find a better one. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-21 19:22:59 +02:00			`q->queue[0] = choice;`
			`q->nr = 1;`
Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00			`}`

Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`int diff_tree_sha1(const unsigned char old, const unsigned char new, const char base, struct diff_options opt)`
			`{`
			`void tree1, tree2;`
			`struct tree_desc t1, t2;`
Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 18:08:25 +01:00			`unsigned long size1, size2;`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`int retval;`

Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 18:08:25 +01:00			`tree1 = read_object_with_reference(old, tree_type, &size1, NULL);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`if (!tree1)`
			`die("unable to read source tree (%s)", sha1_to_hex(old));`
Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 18:08:25 +01:00			`tree2 = read_object_with_reference(new, tree_type, &size2, NULL);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`if (!tree2)`
			`die("unable to read destination tree (%s)", sha1_to_hex(new));`
Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 18:08:25 +01:00			`init_tree_desc(&t1, tree1, size1);`
			`init_tree_desc(&t2, tree2, size2);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`retval = diff_tree(&t1, &t2, base, opt);`
diff --follow: do not waste cycles while recursing The "--follow" logic is called from diff_tree_sha1() function, but the input trees to diff_tree_sha1() are not necessarily the top-level trees (compare_tree_entry() calls it while it recursively descends into subtrees). When a newly created path lives in somewhere deep in the source hierarchy, e.g. "platform/", but the rename source is in a totally different place in the destination hierarchy, e.g. "lang-api/src/com/...", running "try_to_find_renames()" while base is set to "platform/" is a wasted call. We only need to run the rename following at the very top level. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> 2010-08-13 19:36:01 +02:00			`if (!*base && DIFF_OPT_TST(opt, FOLLOW_RENAMES) && diff_might_be_rename()) {`
Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00			`init_tree_desc(&t1, tree1, size1);`
			`init_tree_desc(&t2, tree2, size2);`
			`try_to_follow_renames(&t1, &t2, base, opt);`
			`}`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`free(tree1);`
			`free(tree2);`
			`return retval;`
			`}`

Make git-cherry handle root trees This patch on top of 'next' makes built-in git-cherry handle root commits. It moves the static function log-tree.c::diff_root_tree() to tree-diff.c and makes it more similar to diff_tree_sha1() by shuffling around arguments and factoring out the call to log_tree_diff_flush(). Consequently the name is changed to diff_root_tree_sha1(). It is a version of diff_tree_sha1() that compares the empty tree (= root tree) against a single 'real' tree. This function is then used in get_patch_id() to compute patch IDs for initial commits instead of SEGFAULTing, as the current code does if confronted with parentless commits. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-26 18:52:39 +02:00			`int diff_root_tree_sha1(const unsigned char new, const char base, struct diff_options *opt)`
			`{`
			`int retval;`
			`void *tree;`
Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 18:08:25 +01:00			`unsigned long size;`
Make git-cherry handle root trees This patch on top of 'next' makes built-in git-cherry handle root commits. It moves the static function log-tree.c::diff_root_tree() to tree-diff.c and makes it more similar to diff_tree_sha1() by shuffling around arguments and factoring out the call to log_tree_diff_flush(). Consequently the name is changed to diff_root_tree_sha1(). It is a version of diff_tree_sha1() that compares the empty tree (= root tree) against a single 'real' tree. This function is then used in get_patch_id() to compute patch IDs for initial commits instead of SEGFAULTing, as the current code does if confronted with parentless commits. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-26 18:52:39 +02:00			`struct tree_desc empty, real;`

Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 18:08:25 +01:00			`tree = read_object_with_reference(new, tree_type, &size, NULL);`
Make git-cherry handle root trees This patch on top of 'next' makes built-in git-cherry handle root commits. It moves the static function log-tree.c::diff_root_tree() to tree-diff.c and makes it more similar to diff_tree_sha1() by shuffling around arguments and factoring out the call to log_tree_diff_flush(). Consequently the name is changed to diff_root_tree_sha1(). It is a version of diff_tree_sha1() that compares the empty tree (= root tree) against a single 'real' tree. This function is then used in get_patch_id() to compute patch IDs for initial commits instead of SEGFAULTing, as the current code does if confronted with parentless commits. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-26 18:52:39 +02:00			`if (!tree)`
			`die("unable to read root tree (%s)", sha1_to_hex(new));`
Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 18:08:25 +01:00			`init_tree_desc(&real, tree, size);`
Make git-cherry handle root trees This patch on top of 'next' makes built-in git-cherry handle root commits. It moves the static function log-tree.c::diff_root_tree() to tree-diff.c and makes it more similar to diff_tree_sha1() by shuffling around arguments and factoring out the call to log_tree_diff_flush(). Consequently the name is changed to diff_root_tree_sha1(). It is a version of diff_tree_sha1() that compares the empty tree (= root tree) against a single 'real' tree. This function is then used in get_patch_id() to compute patch IDs for initial commits instead of SEGFAULTing, as the current code does if confronted with parentless commits. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-26 18:52:39 +02:00
Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 18:08:25 +01:00			`init_tree_desc(&empty, "", 0);`
Make git-cherry handle root trees This patch on top of 'next' makes built-in git-cherry handle root commits. It moves the static function log-tree.c::diff_root_tree() to tree-diff.c and makes it more similar to diff_tree_sha1() by shuffling around arguments and factoring out the call to log_tree_diff_flush(). Consequently the name is changed to diff_root_tree_sha1(). It is a version of diff_tree_sha1() that compares the empty tree (= root tree) against a single 'real' tree. This function is then used in get_patch_id() to compute patch IDs for initial commits instead of SEGFAULTing, as the current code does if confronted with parentless commits. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-26 18:52:39 +02:00			`retval = diff_tree(&empty, &real, base, opt);`
			`free(tree);`
			`return retval;`
			`}`

tree-diff: do not assume we use only one pathspec The way tree-diff was set up assumed we would use only one set of pathspec during the entire life of the program. Move the pathspec related static variables out to diff_options structure so that we can filter commits with one set of paths while show the actual diffs using different set of paths. I suspect this breaks blame.c, and makes "git log paths..." to default to the --full-diff, the latter of which is dealt with the next commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 01:39:11 +02:00			`void diff_tree_release_paths(struct diff_options *opt)`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`{`
Convert struct diff_options to use struct pathspec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:38 +01:00			`free_pathspec(&opt->pathspec);`
tree-diff: do not assume we use only one pathspec The way tree-diff was set up assumed we would use only one set of pathspec during the entire life of the program. Move the pathspec related static variables out to diff_options structure so that we can filter commits with one set of paths while show the actual diffs using different set of paths. I suspect this breaks blame.c, and makes "git log paths..." to default to the --full-diff, the latter of which is dealt with the next commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 01:39:11 +02:00			`}`

			`void diff_tree_setup_paths(const char *p, struct diff_options opt)`
			`{`
Convert struct diff_options to use struct pathspec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:38 +01:00			`init_pathspec(&opt->pathspec, p);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`}`