
mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-16 06:03:44 +01:00

443 lines

11 KiB

C


Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`/*`
			`* Helper functions for tree diff generation`
			`*/`
			`#include "cache.h"`
			`#include "diff.h"`
Finally implement "git log --follow" Ok, I've really held off doing this too damn long, because I'm lazy, and I was always hoping that somebody else would do it. But no, people keep asking for it, but nobody actually did anything, so I decided I might as well bite the bullet, and instead of telling people they could add a "--follow" flag to "git log" to do what they want to do, I decided that it looks like I just have to do it for them.. The code wasn't actually that complicated, in that the diffstat for this patch literally says "70 insertions(+), 1 deletions(-)", but I will have to admit that in order to get to this fairly simple patch, you did have to know and understand the internal git diff generation machinery pretty well, and had to really be able to follow how commit generation interacts with generating patches and generating the log. So I suspect that while I was right that it wasn't that hard, I might have been expecting too much of random people - this patch does seem to be firmly in the core "Linus or Junio" territory. To make a long story short: I'm sorry for it taking so long until I just did it. I'm not going to guarantee that this works for everybody, but you really can just look at the patch, and after the appropriate appreciative noises ("Ooh, aah") over how clever I am, you can then just notice that the code itself isn't really that complicated. All the real new code is in the new "try_to_follow_renames()" function. It really isn't rocket science: we notice that the pathname we were looking at went away, so we start a full tree diff and try to see if we can instead make that pathname be a rename or a copy from some other previous pathname. And if we can, we just continue, except we show that particular diff, and ever after we use the _previous_ pathname. One thing to look out for: the "rename detection" is considered to be a singular event in the _linear_ "git log" output! 
That's what people want to do, but I just wanted to point out that this patch is not carrying around a "commit,pathname" kind of pair and it's not going to be able to notice the file coming from multiple different files in earlier history. IOW, if you use "git log --follow", then you get the stupid CVS/SVN kind of "files have single identities" kind of semantics, and git log will just pick the identity based on the normal move/copy heuristics _as_if_ the history could be linearized. Put another way: I think the model is broken, but given the broken model, I think this patch does just about as well as you can do. If you have merges with the same "file" having different filenames over the two branches, git will just end up picking _one_ of the pathnames at the point where the newer one goes away. It never looks at multiple pathnames in parallel. And if you understood all that, you probably didn't need it explained, and if you didn't understand the above blathering, it doesn't really mtter to you. What matters to you is that you can now do git log -p --follow builtin-rev-list.c and it will find the point where the old "rev-list.c" got renamed to "builtin-rev-list.c" and show it as such. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-19 23:22:46 +02:00			`#include "diffcore.h"`
Use blob_, commit_, tag_, and tree_type throughout. This replaces occurences of "blob", "commit", "tag", and "tree", where they're really used as type specifiers, which we already have defined global constants for. Signed-off-by: Peter Eriksen <s022018@student.dtu.dk> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-02 14:44:09 +02:00			`#include "tree.h"`
Avoid unnecessary strlen() calls This is a micro-optimization that grew out of the mailing list discussion about "strlen()" showing up in profiles. We used to pass regular C strings around to the low-level tree walking routines, and while this worked fine, it meant that we needed to call strlen() on strings that the caller always actually knew the size of anyway. So pass the length of the string down wih the string, and avoid unnecessary calls to strlen(). Also, when extracting a pathname from a tree entry, use "tree_entry_len()" instead of strlen(), since the length of the pathname is directly calculable from the decoded tree entry itself without having to actually do another strlen(). This shaves off another ~5-10% from some loads that are very tree intensive (notably doing commit filtering by a pathspec). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>" Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 04:06:24 +01:00			`static char *malloc_base(const char *base, int baselen, const char *path, int pathlen)`
			`{`
			`char *newbase = xmalloc(baselen + pathlen + 2);`
			`memcpy(newbase, base, baselen);`
			`memcpy(newbase + baselen, path, pathlen);`
			`memcpy(newbase + baselen + pathlen, "/", 2);`
			`return newbase;`
			`}`
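The following is a minimal, standalone sketch of the length-counted path join that `malloc_base()` performs. The name `join_base` is hypothetical, and plain `malloc()` stands in for git's `xmalloc()`, so this version can return NULL where the original would die on allocation failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for malloc_base(): builds "<base><path>/" from two
 * length-counted strings, so neither input needs to be NUL-terminated.
 */
static char *join_base(const char *base, int baselen,
		       const char *path, int pathlen)
{
	char *newbase;

	assert(baselen >= 0 && pathlen >= 0);
	/* +2: one byte for the trailing '/', one for the terminating NUL */
	newbase = malloc((size_t)baselen + (size_t)pathlen + 2);
	if (!newbase)
		return NULL;
	memcpy(newbase, base, (size_t)baselen);
	memcpy(newbase + baselen, path, (size_t)pathlen);
	/* copying 2 bytes from "/" writes the slash and the NUL together */
	memcpy(newbase + baselen + pathlen, "/", 2);
	return newbase;
}
```

Calling `join_base("arch/", 5, "i386", 4)` would yield a freshly allocated `"arch/i386/"` that the caller must `free()`, which mirrors how the recursive case in `compare_tree_entry()` frees `newbase` after descending.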

Make show_entry void Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-14 22:39:27 +02:00			`static void show_entry(struct diff_options *opt, const char *prefix, struct tree_desc *desc,`
			`const char *base, int baselen);`
			`static int compare_tree_entry(struct tree_desc *t1, struct tree_desc *t2, const char *base, int baselen, struct diff_options *opt)`
			`{`
			`unsigned mode1, mode2;`
			`const char *path1, *path2;`
			`const unsigned char *sha1, *sha2;`
			`int cmp, pathlen1, pathlen2;`

Make the "struct tree_desc" operations available to others We have operations to "extract" and "update" a "struct tree_desc", but we only used them in tree-diff.c and they were static to that file. But other tree traversal functions can use them to their advantage Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-31 23:10:56 +01:00			`sha1 = tree_entry_extract(t1, &path1, &mode1);`
			`sha2 = tree_entry_extract(t2, &path2, &mode2);`
			`pathlen1 = tree_entry_len(path1, sha1);`
			`pathlen2 = tree_entry_len(path2, sha2);`
			`cmp = base_name_compare(path1, pathlen1, mode1, path2, pathlen2, mode2);`
			`if (cmp < 0) {`
			`show_entry(opt, "-", t1, base, baselen);`
			`return -1;`
			`}`
			`if (cmp > 0) {`
			`show_entry(opt, "+", t2, base, baselen);`
			`return 1;`
			`}`
Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length. Introduces global inline: hashcmp(const unsigned char *sha1, const unsigned char *sha2) Uses memcmp for comparison and returns the result based on the length of the hash name (a future runtime decision). Acked-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-17 20:54:57 +02:00			`if (!opt->find_copies_harder && !hashcmp(sha1, sha2) && mode1 == mode2)`
			`return 0;`

			`/*`
			`* If the filemode has changed to/from a directory from/to a regular`
			`* file, we need to consider it a remove and an add.`
			`*/`
			`if (S_ISDIR(mode1) != S_ISDIR(mode2)) {`
			`show_entry(opt, "-", t1, base, baselen);`
			`show_entry(opt, "+", t2, base, baselen);`
			`return 0;`
			`}`

			`if (opt->recursive && S_ISDIR(mode1)) {`
			`int retval;`
			`char *newbase = malloc_base(base, baselen, path1, pathlen1);`
			`if (opt->tree_in_recursive)`
			`opt->change(opt, mode1, mode2,`
			`sha1, sha2, base, path1);`
			`retval = diff_tree_sha1(sha1, sha2, newbase, opt);`
			`free(newbase);`
			`return retval;`
			`}`

			`opt->change(opt, mode1, mode2, sha1, sha2, base, path1);`
			`return 0;`
			`}`
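The three-way result of `compare_tree_entry()` (negative, positive, zero) is the usual shape for a lockstep walk over two sorted lists. The sketch below illustrates that idea with a simplified, hypothetical `struct entry` and `diff_sorted()`; it orders names with plain `strcmp()` rather than git's mode-aware `base_name_compare()`, and it is not git's actual `diff_tree()` loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified tree entry: a name plus an integer content id. */
struct entry {
	const char *name;
	int id;
};

/*
 * Walk two name-sorted entry lists in lockstep and write one status
 * character per reported difference into out[]:
 *   '-' entry only in a, '+' entry only in b, 'M' same name, different id.
 * Returns the number of characters written; out must hold na + nb of them.
 */
static size_t diff_sorted(const struct entry *a, size_t na,
			  const struct entry *b, size_t nb, char *out)
{
	size_t i = 0, j = 0, n = 0;

	assert(out != NULL);
	while (i < na || j < nb) {
		int cmp;

		if (i == na)
			cmp = 1;	/* a exhausted: the rest of b was added */
		else if (j == nb)
			cmp = -1;	/* b exhausted: the rest of a was removed */
		else
			cmp = strcmp(a[i].name, b[j].name);

		if (cmp < 0) {
			out[n++] = '-';	/* advance only the first list */
			i++;
		} else if (cmp > 0) {
			out[n++] = '+';	/* advance only the second list */
			j++;
		} else {
			if (a[i].id != b[j].id)
				out[n++] = 'M';
			i++;		/* same name: advance both lists */
			j++;
		}
	}
	return n;
}
```

Because both inputs are sorted, each entry is visited once, so the walk is linear in the combined number of entries.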

Set up for better tree diff optimizations This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch really does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninterestign crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had another function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable. 
The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 23:18:30 +01:00			`/*`
			`* Is a tree entry interesting given the pathspec we have?`
			`*`
			`* Return:`
tree_entry_interesting(): allow it to say "everything is interesting" In addition to optimizing pathspecs that would never match, which was done earlier, this optimizes pathspecs that would always match (e.g. "arch/" while the traversal is already in "arch/i386/" hierarchy). This patch makes the worst case slightly more palatable, while improving average case. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-22 01:00:27 +01:00			`* - 2 for "yes, and all subsequent entries will be"`
			`* - 1 for yes`
			`* - zero for no`
			`* - negative for "no, and no subsequent entries will be either"`
			`*/`
			`static int tree_entry_interesting(struct tree_desc *desc, const char *base, int baselen, struct diff_options *opt)`
			`{`
			`const char *path;`
Avoid unnecessary strlen() calls This is a micro-optimization that grew out of the mailing list discussion about "strlen()" showing up in profiles. We used to pass regular C strings around to the low-level tree walking routines, and while this worked fine, it meant that we needed to call strlen() on strings that the caller always actually knew the size of anyway. So pass the length of the string down wih the string, and avoid unnecessary calls to strlen(). Also, when extracting a pathname from a tree entry, use "tree_entry_len()" instead of strlen(), since the length of the pathname is directly calculable from the decoded tree entry itself without having to actually do another strlen(). This shaves off another ~5-10% from some loads that are very tree intensive (notably doing commit filtering by a pathspec). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>" Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 04:06:24 +01:00			`const unsigned char *sha1;`
			`unsigned mode;`
			`int i;`
			`int pathlen;`
Teach tree_entry_interesting() that the tree entries are sorted. When we are looking at a tree entry with pathspecs, if all the pathspecs sort strictly earlier than the entry we are currently looking at, there is no way later entries in the same tree would match our pathspecs, because the entries are sorted. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 17:51:47 +01:00			`int never_interesting = -1;`
tree-diff: do not assume we use only one pathspec The way tree-diff was set up assumed we would use only one set of pathspec during the entire life of the program. Move the pathspec related static variables out to diff_options structure so that we can filter commits with one set of paths while show the actual diffs using different set of paths. I suspect this breaks blame.c, and makes "git log paths..." to default to the --full-diff, the latter of which is dealt with the next commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 01:39:11 +02:00			`if (!opt->nr_paths)`
			`return 1;`

			`sha1 = tree_entry_extract(desc, &path, &mode);`

			`pathlen = tree_entry_len(path, sha1);`

			`for (i = 0; i < opt->nr_paths; i++) {`
			`const char *match = opt->paths[i];`
			`int matchlen = opt->pathlens[i];`
tree-diff: avoid strncmp() If we already know that some of the pathspecs can match later entries in the tree we are looking at, we do not have to do more expensive strncmp() upfront before comparing the length of the match pattern and the path, as a path longer than the match pattern will not match it, and a path shorter than the match pattern will match only if the path is a directory-component wise prefix of the match pattern. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 20:34:46 +01:00			`int m = -1; /* signals that we haven't called strncmp() */`
			`if (baselen >= matchlen) {`
			`/* If it doesn't match, move along... */`
			`if (strncmp(base, match, matchlen))`
			`continue;`

tree_entry_interesting(): allow it to say "everything is interesting" In addition to optimizing pathspecs that would never match, which was done earlier, this optimizes pathspecs that would always match (e.g. "arch/" while the traversal is already in "arch/i386/" hierarchy). This patch makes the worst case slightly more palatable, while improving average case. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-22 01:00:27 +01:00			`/*`
			`* The base is a subdirectory of a path which`
			`* was specified, so all of them are interesting.`
			`*/`
			`return 2;`
			`}`

			`/* Does the base match? */`
			`if (strncmp(base, match, baselen))`
			`continue;`

			`match += baselen;`
			`matchlen -= baselen;`

			`if (never_interesting) {`
			`/*`
			`* We have not seen any match that sorts later`
			`* than the current path.`
			`*/`

			`/*`
			`* Does match sort strictly earlier than path`
			`* with their common parts?`
			`*/`
			`m = strncmp(match, path,`
			`(matchlen < pathlen) ? matchlen : pathlen);`
			`if (m < 0)`
			`continue;`

			`/*`
			`* If we come here even once, that means there is at`
			`* least one pathspec that would sort equal to or`
			`* later than the path we are currently looking at.`
			`* In other words, if we have never reached this point`
			`* after iterating all pathspecs, it means all`
			`* pathspecs are either outside of base, or inside the`
			`* base but sorts strictly earlier than the current`
			`* one. In either case, they will never match the`
			`* subsequent entries. In such a case, we initialized`
			`* the variable to -1 and that is what will be`
			`* returned, allowing the caller to terminate early.`
			`*/`
			`never_interesting = 0;`
			`}`

			`if (pathlen > matchlen)`
			`continue;`

			`if (matchlen > pathlen) {`
			`if (match[pathlen] != '/')`
			`continue;`
			`if (!S_ISDIR(mode))`
			`continue;`
			`}`

			`if (m == -1)`
			`/*`
			`* we cheated and did not do strncmp(), so we do`
			`* that here.`
			`*/`
			`m = strncmp(match, path, pathlen);`

			`/*`
			`* If common part matched earlier then it is a hit,`
			`* because we rejected the case where path is not a`
			`* leading directory and is shorter than match.`
			`*/`
			`if (!m)`
			`return 1;`
			`}`
			`return never_interesting; /* No matches */`
			`}`
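
The early-exit rule in `tree_entry_interesting()` relies on tree entries being sorted: once every pathspec sorts strictly before the current entry, no later entry can match. The following is a minimal sketch of that idea only, not git's implementation. It assumes flat names compared with `strcmp()` and ignores the base directory and the directory-prefix cases; `entry_interesting()` and `count_interesting()` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for tree_entry_interesting():
 * flat names only, no base directory, no directory-prefix handling.
 * Returns 1 if `name` equals one of the pathspecs, 0 if it does not
 * but a later entry still might, and -1 if every pathspec sorts
 * strictly before `name`, so nothing later in a sorted tree can match.
 */
static int entry_interesting(const char *name,
			     const char *const *specs, int nr_specs)
{
	int never_interesting = -1;
	int i;

	assert(name != NULL);
	if (!nr_specs)
		return 1; /* no pathspec limits: everything is interesting */

	for (i = 0; i < nr_specs; i++) {
		int m = strcmp(specs[i], name);

		if (m < 0)
			continue; /* this pathspec sorts earlier than the entry */
		never_interesting = 0; /* a later entry could still match */
		if (!m)
			return 1;
	}
	return never_interesting;
}

/* Count matches in a sorted entry list, stopping at the first -1. */
static int count_interesting(const char *const *entries, int nr_entries,
			     const char *const *specs, int nr_specs,
			     int *visited)
{
	int hits = 0;
	int i;

	*visited = 0;
	for (i = 0; i < nr_entries; i++) {
		int show = entry_interesting(entries[i], specs, nr_specs);

		(*visited)++;
		if (show < 0)
			break; /* sorted input: nothing further can match */
		if (show)
			hits++;
	}
	return hits;
}
```

With the sorted entries `Makefile`, `diff.c`, `tree-diff.c`, `zlib.c` and the single pathspec `diff.c`, the loop in this sketch stops at `tree-diff.c` and never looks at `zlib.c`.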

			`/* A whole sub-tree went away or appeared */`
			`static void show_tree(struct diff_options *opt, const char *prefix, struct tree_desc *desc, const char *base, int baselen)`
			`{`
			`int all_interesting = 0;`
			`while (desc->size) {`
			`int show;`

			`if (all_interesting)`
			`show = 1;`
			`else {`
			`show = tree_entry_interesting(desc, base, baselen,`
			`opt);`
			`if (show == 2)`
			`all_interesting = 1;`
			`}`
Set up for better tree diff optimizations This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch really does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninteresting crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had another function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable.
The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 23:18:30 +01:00			`if (show < 0)`
			`break;`
			`if (show)`
			`show_entry(opt, prefix, desc, base, baselen);`
			`update_tree_entry(desc);`
			`}`
			`}`
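
The loop in `show_tree()` treats the return value of `tree_entry_interesting()` as four cases: negative means stop walking, 0 means skip, 1 means show, and 2 means show this and every remaining entry without asking again. The sketch below isolates that caller-side protocol under stated assumptions: the `ENTRY_*` names, the `classify_fn` callback type, `walk_entries()` and `from_table()` are all hypothetical and exist only to make the control flow testable on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the four return values used above. */
enum { ENTRY_NEVER = -1, ENTRY_NO = 0, ENTRY_YES = 1, ENTRY_ALL = 2 };

/* Hypothetical callback: classifies entry number i. */
typedef int (*classify_fn)(int i, void *data);

/*
 * Walk nr_entries entries and return how many were shown.
 * *calls counts how often the classifier was actually consulted.
 */
static int walk_entries(int nr_entries, classify_fn classify, void *data,
			int *calls)
{
	int all_interesting = 0;
	int shown = 0;
	int i;

	assert(classify != NULL);
	*calls = 0;
	for (i = 0; i < nr_entries; i++) {
		int show;

		if (all_interesting)
			show = ENTRY_YES; /* latched: no need to ask again */
		else {
			show = classify(i, data);
			(*calls)++;
			if (show == ENTRY_ALL)
				all_interesting = 1;
		}
		if (show < 0)
			break; /* nothing further can be interesting */
		if (show)
			shown++;
	}
	return shown;
}

/* Example classifier that reads precomputed verdicts from an int array. */
static int from_table(int i, void *data)
{
	const int *table = data;

	return table[i];
}
```

In this sketch a first verdict of `ENTRY_ALL` means the classifier is consulted once and every entry is shown, while an `ENTRY_NEVER` verdict ends the walk before the remaining entries are examined.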

			`/* A file entry went away or appeared */`
Make show_entry void Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-14 22:39:27 +02:00			`static void show_entry(struct diff_options *opt, const char *prefix, struct tree_desc *desc,`
			`const char *base, int baselen)`
			`{`
			`unsigned mode;`
			`const char *path;`
Make the "struct tree_desc" operations available to others We have operations to "extract" and "update" a "struct tree_desc", but we only used them in tree-diff.c and they were static to that file. But other tree traversal functions can use them to their advantage Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-31 23:10:56 +01:00			`const unsigned char *sha1 = tree_entry_extract(desc, &path, &mode);`
			`if (opt->recursive && S_ISDIR(mode)) {`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`enum object_type type;`
Avoid unnecessary strlen() calls This is a micro-optimization that grew out of the mailing list discussion about "strlen()" showing up in profiles. We used to pass regular C strings around to the low-level tree walking routines, and while this worked fine, it meant that we needed to call strlen() on strings that the caller always actually knew the size of anyway. So pass the length of the string down wih the string, and avoid unnecessary calls to strlen(). Also, when extracting a pathname from a tree entry, use "tree_entry_len()" instead of strlen(), since the length of the pathname is directly calculable from the decoded tree entry itself without having to actually do another strlen(). This shaves off another ~5-10% from some loads that are very tree intensive (notably doing commit filtering by a pathspec). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>" Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 04:06:24 +01:00			`int pathlen = tree_entry_len(path, sha1);`
			`char *newbase = malloc_base(base, baselen, path, pathlen);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`struct tree_desc inner;`
			`void *tree;`
Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 18:08:25 +01:00			`unsigned long size;`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 18:08:25 +01:00			`tree = read_sha1_file(sha1, &type, &size);`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`if (!tree || type != OBJ_TREE)`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`die("corrupt tree sha %s", sha1_to_hex(sha1));`

Initialize tree descriptors with a helper function rather than by hand. This removes slightly more lines than it adds, but the real reason for doing this is that future optimizations will require more setup of the tree descriptor, and so we want to do it in one place. Also renamed the "desc.buf" field to "desc.buffer" just to trigger compiler errors for old-style manual initializations, making sure I didn't miss anything. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-21 18:08:25 +01:00			`init_tree_desc(&inner, tree, size);`
Avoid unnecessary strlen() calls This is a micro-optimization that grew out of the mailing list discussion about "strlen()" showing up in profiles. We used to pass regular C strings around to the low-level tree walking routines, and while this worked fine, it meant that we needed to call strlen() on strings that the caller always actually knew the size of anyway. So pass the length of the string down wih the string, and avoid unnecessary calls to strlen(). Also, when extracting a pathname from a tree entry, use "tree_entry_len()" instead of strlen(), since the length of the pathname is directly calculable from the decoded tree entry itself without having to actually do another strlen(). This shaves off another ~5-10% from some loads that are very tree intensive (notably doing commit filtering by a pathspec). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>" Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 04:06:24 +01:00			`show_tree(opt, prefix, &inner, newbase, baselen + 1 + pathlen);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
			`free(tree);`
			`free(newbase);`
Make show_entry void Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-14 22:39:27 +02:00			`} else {`
			`opt->add_remove(opt, prefix[0], mode, sha1, base, path);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`}`
			`}`
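The recursive branch above passes `baselen + 1 + pathlen` to `show_tree()`, which suggests the new base is the old base, the entry name, and one `/` separator. A minimal sketch of that length-aware join follows. It assumes nothing about git's real `malloc_base()`, whose body is not shown here; `join_base` is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not git's malloc_base()): build "<base><path>/"
 * from two strings whose lengths the caller already knows, so no
 * strlen() call is needed. Returns NULL on allocation failure; the
 * caller frees the result.
 */
static char *join_base(const char *base, size_t baselen,
		       const char *path, size_t pathlen)
{
	size_t len = baselen + pathlen;
	char *newbase;

	assert(base && path);
	newbase = malloc(len + 2);	/* room for '/' and the terminating NUL */
	if (!newbase)
		return NULL;
	memcpy(newbase, base, baselen);
	memcpy(newbase + baselen, path, pathlen);
	newbase[len] = '/';
	newbase[len + 1] = '\0';
	return newbase;
}
```

With `base = "arch/"` (length 5) and `path = "i386"` (length 4), the result has length 5 + 1 + 4, matching the length expression used in the call to `show_tree()`.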

Set up for better tree diff optimizations This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch really does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninterestign crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had another function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable. 
The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 23:18:30 +01:00			`static void skip_uninteresting(struct tree_desc *t, const char *base, int baselen, struct diff_options *opt)`
			`{`
tree_entry_interesting(): allow it to say "everything is interesting" In addition to optimizing pathspecs that would never match, which was done earlier, this optimizes pathspecs that would always match (e.g. "arch/" while the traversal is already in "arch/i386/" hierarchy). This patch makes the worst case slightly more palatable, while improving average case. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-22 01:00:27 +01:00			`int all_interesting = 0;`
Set up for better tree diff optimizations This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch really does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninterestign crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had another function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable. 
The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 23:18:30 +01:00			`while (t->size) {`
tree_entry_interesting(): allow it to say "everything is interesting" In addition to optimizing pathspecs that would never match, which was done earlier, this optimizes pathspecs that would always match (e.g. "arch/" while the traversal is already in "arch/i386/" hierarchy). This patch makes the worst case slightly more palatable, while improving average case. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-22 01:00:27 +01:00			`int show;`

			`if (all_interesting)`
			`show = 1;`
			`else {`
			`show = tree_entry_interesting(t, base, baselen, opt);`
			`if (show == 2)`
			`all_interesting = 1;`
			`}`
Set up for better tree diff optimizations This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch really does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninterestign crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had another function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable. 
The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 23:18:30 +01:00			`if (!show) {`
			`update_tree_entry(t);`
			`continue;`
			`}`
			`/* Skip it all? */`
			`if (show < 0)`
			`t->size = 0;`
			`return;`
			`}`
			`}`
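The loop in `skip_uninteresting()` depends on the return convention visible above: `0` means skip this entry, a negative value means nothing later in the tree can match, and a positive value means stop at this entry. The toy below sketches that convention over a sorted array of names. It is an illustration under stated assumptions, not git's implementation: `toy_interesting` and `toy_skip` are hypothetical, the "pathspec" is a single exact name, ordering is plain `strcmp()`, and the `2` ("everything is interesting") case is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for tree_entry_interesting().
 *   < 0: names are sorted, so no later entry can match
 *     0: this entry does not match, a later one still might
 *     1: this entry matches
 */
static int toy_interesting(const char *name, const char *want)
{
	int cmp = strcmp(name, want);

	if (cmp > 0)
		return -1;	/* already past the wanted name */
	return cmp == 0;
}

/*
 * Advance *pos past uninteresting entries, in the style of
 * skip_uninteresting(). Returns how many entries were examined.
 * On return, *pos == n means nothing interesting is left.
 */
static size_t toy_skip(const char *const *names, size_t n,
		       size_t *pos, const char *want)
{
	size_t examined = 0;

	assert(names && pos && want);
	while (*pos < n) {
		int show = toy_interesting(names[*pos], want);

		examined++;
		if (!show) {
			(*pos)++;	/* like update_tree_entry(t) */
			continue;
		}
		if (show < 0)
			*pos = n;	/* like t->size = 0: skip it all */
		break;
	}
	return examined;
}
```

Looking for `"c"` in `{"a", "b", "d", "e", "f"}` examines only three entries: the negative result at `"d"` ends the walk without visiting `"e"` or `"f"`, which is the short-circuit the commit message describes.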

Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`int diff_tree(struct tree_desc *t1, struct tree_desc *t2, const char *base, struct diff_options *opt)`
			`{`
Avoid unnecessary strlen() calls This is a micro-optimization that grew out of the mailing list discussion about "strlen()" showing up in profiles. We used to pass regular C strings around to the low-level tree walking routines, and while this worked fine, it meant that we needed to call strlen() on strings that the caller always actually knew the size of anyway. So pass the length of the string down wih the string, and avoid unnecessary calls to strlen(). Also, when extracting a pathname from a tree entry, use "tree_entry_len()" instead of strlen(), since the length of the pathname is directly calculable from the decoded tree entry itself without having to actually do another strlen(). This shaves off another ~5-10% from some loads that are very tree intensive (notably doing commit filtering by a pathspec). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>" Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 04:06:24 +01:00			`int baselen = strlen(base);`

Set up for better tree diff optimizations This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch really does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninterestign crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had another function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable. 
The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 23:18:30 +01:00			`for (;;) {`
Teach --quiet to diff backends. This teaches git-diff-files, git-diff-index and git-diff-tree backends to exit early under --quiet option. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-14 19:12:51 +01:00			`if (opt->quiet && opt->has_changes)`
			`break;`
Set up for better tree diff optimizations This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch really does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninterestign crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had another function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable. 
The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 23:18:30 +01:00			`if (opt->nr_paths) {`
			`skip_uninteresting(t1, base, baselen, opt);`
			`skip_uninteresting(t2, base, baselen, opt);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`}`
			`if (!t1->size) {`
Set up for better tree diff optimizations This is mainly just a cleanup patch, and sets up for later changes where the tree-diff.c "interesting()" function can return more than just a yes/no value. In particular, it should be quite possible to say "no subsequent entries in this tree can possibly be interesting any more", and thus allow the callers to short-circuit the tree entirely. In fact, changing the callers to do so is trivial, and is really all this patch really does, because changing "interesting()" itself to say that nothing further is going to be interesting is definitely more complicated, considering that we may have arbitrary pathspecs. But in cleaning up the callers, this actually fixes a potential small performance issue in diff_tree(): if the second tree has a lot of uninterestign crud in it, we would keep on doing the "is it interesting?" check on the first tree for each uninteresting entry in the second one. The answer is obviously not going to change, so that was just not helping. The new code is clearer and simpler and avoids this issue entirely. I also renamed "interesting()" to "tree_entry_interesting()", because I got frustrated by the fact that - we actually had another function called "interesting()" in another file, and I couldn't tell from the profiles which one was the one that mattered more. - when rewriting it to return a ternary value, you can't just do if (interesting(...)) ... any more, but want to assign the return value to a local variable. 
The name of choice for that variable would normally be "interesting", so I just wanted to make the function name be more specific, and avoid that whole issue (even though I then didn't choose that name for either of the users, just to avoid confusion in the patch itself ;) In other words, this doesn't really change anything, but I think it's a good thing to do, and if somebody comes along and writes the logic for "yeah, none of the pathspecs you have are interesting", we now support that trivially. It could easily be a meaningful optimization for things like "blame", where there's just one pathspec, and stopping when you've seen it would allow you to avoid about 50% of the tree traversals on average. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 23:18:30 +01:00			`if (!t2->size)`
			`break;`
Avoid unnecessary strlen() calls This is a micro-optimization that grew out of the mailing list discussion about "strlen()" showing up in profiles. We used to pass regular C strings around to the low-level tree walking routines, and while this worked fine, it meant that we needed to call strlen() on strings that the caller always actually knew the size of anyway. So pass the length of the string down wih the string, and avoid unnecessary calls to strlen(). Also, when extracting a pathname from a tree entry, use "tree_entry_len()" instead of strlen(), since the length of the pathname is directly calculable from the decoded tree entry itself without having to actually do another strlen(). This shaves off another ~5-10% from some loads that are very tree intensive (notably doing commit filtering by a pathspec). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>" Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-18 04:06:24 +01:00			`show_entry(opt, "+", t2, base, baselen);`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00			`update_tree_entry(t2);`
			`continue;`
			`}`
			`if (!t2->size) {`
			`show_entry(opt, "-", t1, base, baselen);`
			`update_tree_entry(t1);`
			`continue;`
			`}`
			`switch (compare_tree_entry(t1, t2, base, baselen, opt)) {`
			`case -1:`
			`update_tree_entry(t1);`
			`continue;`
			`case 0:`
			`update_tree_entry(t1);`
			`/* Fallthrough */`
			`case 1:`
			`update_tree_entry(t2);`
			`continue;`
			`}`
			`die("git-diff-tree: internal error");`
			`}`
			`return 0;`
			`}`

			`/*`
			`* Does it look like the resulting diff might be due to a rename?`
			`* - single entry`
			`* - not a valid previous file`
			`*/`
			`static inline int diff_might_be_rename(void)`
			`{`
			`return diff_queued_diff.nr == 1 &&`
			`!DIFF_FILE_VALID(diff_queued_diff.queue[0]->one);`
			`}`

			`static void try_to_follow_renames(struct tree_desc *t1, struct tree_desc *t2, const char *base, struct diff_options *opt)`
			`{`
			`struct diff_options diff_opts;`
Fix up "git log --follow" a bit.. This fixes "git log --follow" to hopefully not leak memory any more, and also cleans it up a bit to look more like some of the other functions that use "diff_queued_diff" (by not using it directly as a global in the code, but by instead just taking a pointer to the diff queue and using that). As to "diff_queued_diff", I think it would be better off not as a global at all, but as being just an entry in the "struct diff_options" structure, but that's a separate issue, and there may be some subtle reason for why it's currently a global. Anyway, no real changes. Instead of having a magical first entry in the diff-queue, we now end up just keeping the diff-queue clean, and keeping our "preferred" file pairing in an internal "choice" variable. That makes it easy to switch the choice around when we find a better one. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-21 19:22:59 +02:00			`struct diff_queue_struct *q = &diff_queued_diff;`
			`struct diff_filepair *choice;`
			`const char *paths[1];`
			`int i;`

			`/* Remove the file creation entry from the diff queue, and remember it */`
			`choice = q->queue[0];`
			`q->nr = 0;`

			`diff_setup(&diff_opts);`
			`diff_opts.recursive = 1;`
			`diff_opts.detect_rename = DIFF_DETECT_RENAME;`
			`diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;`
			`diff_opts.single_follow = opt->paths[0];`
			`paths[0] = NULL;`
			`diff_tree_setup_paths(paths, &diff_opts);`
			`if (diff_setup_done(&diff_opts) < 0)`
			`die("unable to set up diff options to follow renames");`
			`diff_tree(t1, t2, base, &diff_opts);`
			`diffcore_std(&diff_opts);`

			`/* Go through the new set of filepairs, and see if we find a more interesting one */`
			`for (i = 0; i < q->nr; i++) {`
			`struct diff_filepair *p = q->queue[i];`
			`/*`
			`* Found a source? Not only do we use that for the new`
			`* diff_queued_diff, we will also use that as the path in`
			`* the future!`
			`*/`
			`if ((p->status == 'R' || p->status == 'C') && !strcmp(p->two->path, opt->paths[0])) {`
			`/* Switch the file-pairs around */`
			`q->queue[i] = choice;`
			`choice = p;`

			`/* Update the path we use from now on.. */`
			`opt->paths[0] = xstrdup(p->one->path);`
			`diff_tree_setup_paths(opt->paths, opt);`
			`break;`
			`}`
			`}`

			`/*`
			`* Then, discard all the non-relevant file pairs...`
			`*/`
			`for (i = 0; i < q->nr; i++) {`
			`struct diff_filepair *p = q->queue[i];`
			`diff_free_filepair(p);`
			`}`

			`/*`
			`* .. and re-instate the one we want (which might be either the`
			`* original one, or the rename/copy we found)`
			`*/`
			`q->queue[0] = choice;`
			`q->nr = 1;`
			`}`
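
The tail of `try_to_follow_renames()` above throws away the queued file pairs and leaves a single preferred pair in the queue. The sketch below illustrates that "keep only the choice" step in isolation. It is not git's code: `struct pair`, `struct pair_queue`, and `keep_only()` are hypothetical, simplified stand-ins for git's file-pair and diff-queue types, and the check that skips freeing `choice` is an assumption added so the sketch works whether or not the chosen pair is still in the queue.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for git's file pair and diff queue. */
struct pair {
	const char *old_path;
	const char *new_path;
};

struct pair_queue {
	struct pair **queue;	/* array with room for at least one entry */
	int nr;			/* number of entries currently in use */
};

/*
 * Free every queued pair except `choice`, then re-instate `choice` as the
 * only entry. `choice` may be one of the queued pairs or a pair that was
 * kept outside the queue.
 */
static void keep_only(struct pair_queue *q, struct pair *choice)
{
	int i;

	assert(choice != NULL);
	for (i = 0; i < q->nr; i++) {
		if (q->queue[i] != choice)
			free(q->queue[i]);
	}
	q->queue[0] = choice;
	q->nr = 1;
}
```

After the call, the queue holds exactly one pair, so whatever consumes it next sees only the original path or the rename/copy that was picked.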

			`int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const char *base, struct diff_options *opt)`
			`{`
			`void *tree1, *tree2;`
			`struct tree_desc t1, t2;`
			`unsigned long size1, size2;`
			`int retval;`

			`tree1 = read_object_with_reference(old, tree_type, &size1, NULL);`
			`if (!tree1)`
			`die("unable to read source tree (%s)", sha1_to_hex(old));`
			`tree2 = read_object_with_reference(new, tree_type, &size2, NULL);`
			`if (!tree2)`
			`die("unable to read destination tree (%s)", sha1_to_hex(new));`
			`init_tree_desc(&t1, tree1, size1);`
			`init_tree_desc(&t2, tree2, size2);`
			`retval = diff_tree(&t1, &t2, base, opt);`
			`if (opt->follow_renames && diff_might_be_rename()) {`
			`init_tree_desc(&t1, tree1, size1);`
			`init_tree_desc(&t2, tree2, size2);`
			`try_to_follow_renames(&t1, &t2, base, opt);`
			`}`
			`free(tree1);`
			`free(tree2);`
			`return retval;`
			`}`

Make git-cherry handle root trees This patch on top of 'next' makes built-in git-cherry handle root commits. It moves the static function log-tree.c::diff_root_tree() to tree-diff.c and makes it more similar to diff_tree_sha1() by shuffling around arguments and factoring out the call to log_tree_diff_flush(). Consequently the name is changed to diff_root_tree_sha1(). It is a version of diff_tree_sha1() that compares the empty tree (= root tree) against a single 'real' tree. This function is then used in get_patch_id() to compute patch IDs for initial commits instead of SEGFAULTing, as the current code does if confronted with parentless commits. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-26 18:52:39 +02:00			`int diff_root_tree_sha1(const unsigned char *new, const char *base, struct diff_options *opt)`
			`{`
			`int retval;`
			`void *tree;`
			`unsigned long size;`
			`struct tree_desc empty, real;`

			`tree = read_object_with_reference(new, tree_type, &size, NULL);`
			`if (!tree)`
			`die("unable to read root tree (%s)", sha1_to_hex(new));`
			`init_tree_desc(&real, tree, size);`

			`init_tree_desc(&empty, "", 0);`
			`retval = diff_tree(&empty, &real, base, opt);`
			`free(tree);`
			`return retval;`
			`}`

			`static int count_paths(const char **paths)`
			`{`
			`int i = 0;`
			`while (*paths++)`
			`i++;`
			`return i;`
			`}`

			`void diff_tree_release_paths(struct diff_options *opt)`
			`{`
			`free(opt->pathlens);`
			`}`

			`void diff_tree_setup_paths(const char **p, struct diff_options *opt)`
			`{`
			`opt->nr_paths = 0;`
			`opt->pathlens = NULL;`
			`opt->paths = NULL;`

			`if (p) {`
			`int i;`

			`opt->paths = p;`
			`opt->nr_paths = count_paths(p);`
			`if (opt->nr_paths == 0) {`
			`opt->pathlens = NULL;`
avoid asking ?alloc() for zero bytes. Avoid asking for zero bytes when that change simplifies overall logic. Later we would change the wrapper to ask for 1 byte on platforms that return NULL for zero byte request. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-26 21:34:56 +01:00			`return;`
			`}`
			`opt->pathlens = xmalloc(opt->nr_paths * sizeof(int));`
			`for (i=0; i < opt->nr_paths; i++)`
			`opt->pathlens[i] = strlen(p[i]);`
			`}`
			`}`
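
The setup/release pair above can be tried outside git with a small stand-alone sketch. It assumes a hypothetical `struct path_filter` in place of the pathspec fields of `struct diff_options`, and it uses plain `malloc()` with an `assert()` where git calls `xmalloc()`. The names `path_filter_setup()` and `path_filter_release()` are illustrative only and are not part of git.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the pathspec fields of struct diff_options. */
struct path_filter {
	const char **paths;	/* borrowed, NULL-terminated; owned by the caller */
	int *pathlens;		/* owned here; one cached strlen() per path */
	int nr_paths;
};

/* Count the entries in a NULL-terminated array of strings. */
static int count_paths(const char **paths)
{
	int i = 0;
	while (*paths++)
		i++;
	return i;
}

/* Record the paths and cache their lengths; NULL or an empty list caches nothing. */
static void path_filter_setup(struct path_filter *f, const char **p)
{
	int i;

	f->paths = NULL;
	f->pathlens = NULL;
	f->nr_paths = 0;
	if (!p)
		return;

	f->paths = p;
	f->nr_paths = count_paths(p);
	if (f->nr_paths == 0)
		return;		/* never ask the allocator for zero bytes */

	f->pathlens = malloc(f->nr_paths * sizeof(int));
	assert(f->pathlens != NULL);	/* git's xmalloc() would die() instead */
	for (i = 0; i < f->nr_paths; i++)
		f->pathlens[i] = (int)strlen(p[i]);
}

/* Free only what setup allocated; the path strings belong to the caller. */
static void path_filter_release(struct path_filter *f)
{
	free(f->pathlens);
	f->pathlens = NULL;
}
```

Caching the lengths once means a per-entry path comparison does not need to call `strlen()` on every pathspec again, and keeping the state in a struct rather than in file-level statics lets one program hold several independent path sets.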