mirror of https://github.com/git/git.git synced 2024-11-05 16:52:59 +01:00

957 lines

22 KiB

C

Add "rev-list" program that uses the new time-based commit listing. This is probably what you'd want to see for "git log". 2005-04-24 04:04:40 +02:00			`#include "cache.h"`
upload-pack: Do not choke on too many heads request. Cloning from a repository with more than 256 refs (heads and tags included) will choke, because upload-pack has a built-in limit of feeding not more than MAX_NEEDS (currently 256) heads to underlying git-rev-list. This is a problem when cloning a repository with many tags, like http://www.linux-mips.org/pub/scm/linux.git, which has 290+ tags. This commit introduces a new flag, --all, to git-rev-list, to include all refs in the repository. Updated upload-pack detects requests that ask more than MAX_NEEDS refs, and sends everything back instead. We may probably want to tweak the definitions of MAX_NEEDS and MAX_HAS, but that is a separate topic. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-05 23:49:54 +02:00			`#include "refs.h"`
Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`#include "tag.h"`
Add "rev-list" program that uses the new time-based commit listing. This is probably what you'd want to see for "git log". 2005-04-24 04:04:40 +02:00			`#include "commit.h"`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`#include "tree.h"`
			`#include "blob.h"`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`#include "epoch.h"`
Teach git-rev-list to follow just a specified set of files This is the first cut at a git-rev-list that knows to ignore commits that don't change a certain file (or set of files). NOTE! For now it only prunes _merge_ commits, and follows the parent where there are no differences in the set of files specified. In the long run, I'd like to make it re-write the straight-line history too, but for now the merge simplification is much more fundamentally important (the rewriting of straight-line history is largely a separate simplification phase, but the merge simplification needs to happen early if we want to optimize away unnecessary commit parsing). If all parents of a merge change some of the files, the merge is left as is, so the end result is in no way guaranteed to be a linear history, but it will often be a lot /more/ linear than the full tree, since it prunes out parents that didn't matter for that set of files. As an example from the current kernel: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9885 [torvalds@g5 linux]$ git-rev-list HEAD -- Makefile \| wc -l 4084 [torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb \| wc -l 5206 and you can also use 'gitk' to more visually see the pruning of the history tree, with something like gitk -- drivers/usb showing a simplified history that tries to follow the first parent in a merge that is the parent that fully defines drivers/usb/. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:25:09 +02:00			`#include "diff.h"`
Add "rev-list" program that uses the new time-based commit listing. This is probably what you'd want to see for "git log". 2005-04-24 04:04:40 +02:00
git-rev-list: use proper lazy reachability analysis This means that you can give a beginning/end pair to git-rev-list, and it will show all entries that are reachable from the beginning but not the end. For example git-rev-list v2.6.12-rc5 v2.6.12-rc4 shows all commits that are in -rc5 but are not in -rc4. 2005-05-31 03:46:32 +02:00			`#define SEEN (1u << 0)`
			`#define INTERESTING (1u << 1)`
git-rev-list: add "--bisect" flag to find the "halfway" point This is useful for doing binary searching for problems. You start with a known good and known bad point, and you then test the "halfway" point in between: git-rev-list --bisect bad ^good and you test that. If that one tests good, you now still have a known bad case, but two known good points, and you can bisect again: git-rev-list --bisect bad ^good1 ^good2 and test that point. If that point is bad, you now use that as your known-bad starting point: git-rev-list --bisect newbad ^good1 ^good2 and basically at every iteration you shrink your list of commits by half: you're binary searching for the point where the troubles started, even though there isn't a nice linear ordering. 2005-06-18 07:54:50 +02:00			`#define COUNTED (1u << 2)`
Remove insane overlapping bit ranges from epoch.c ..and move the DUPCHECK to rev-list.c since both the merge-order and the upcoming topo-sort get confused by dups. 2005-07-06 18:56:16 +02:00			`#define SHOWN (1u << 3)`
git-rev-list: add "--dense" flag This is what the recent git-rev-list changes have all been gearing up for. When we use a path filter to git-rev-list, the new "--dense" flag asks git-rev-list to compress the history so that it _only_ contains commits that change files in the path filter. It also rewrites the parent information so that tools like "gitk" will see the result as a dense history tree. For example, on the current kernel archive: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9904 [torvalds@g5 linux]$ git-rev-list HEAD -- kernel \| wc -l 5442 [torvalds@g5 linux]$ git-rev-list --dense HEAD -- kernel \| wc -l 356 which shows that while we have almost ten thousand commits, we can prune down the work to slightly more than half by only following the merges that are interesting. But further, we can then compress the history to just 356 entries that actually make changes to the kernel subdirectory. To see this in action, try something like gitk --dense -- gitk to see just the history that affects gitk. Or, to show that true parallel development still remains parallel, do gitk --dense -- daemon.c which shows some parallel commits in the current git tree. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-22 01:40:54 +02:00			`#define TREECHANGE (1u << 4)`
rev-list: omit duplicated parents. Showing the same parent more than once for a commit does not make much sense downstream, so stop it. This can happen with an incorrectly made merge commit that merges the same parent twice, but can happen in an otherwise sane development history while squishing the history by taking into account only commits that touch specified paths. For example, $ git rev-list --max-count=1 --parents addafaf -- rev-list.c would have to show this commit ancestry graph: .---o---. / \ .------o---. / 93b74bc \ ------o---o-----o---o-----o addafaf d8f6b34 \ / .---o---o---. \ / .---*---. 3815f42 where 5 independent development tracks, only two of which have changes in the specified paths since they forked. The last change for the other three development tracks was done by the same commit before they forked, and we were showing that three times. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-30 00:24:42 +01:00			`#define TMP_MARK (1u << 5) /* for isolated cases; clean after use */`
git-rev-list: use proper lazy reachability analysis This means that you can give a beginning/end pair to git-rev-list, and it will show all entries that are reachable from the beginning but not the end. For example git-rev-list v2.6.12-rc5 v2.6.12-rc4 shows all commits that are in -rc5 but are not in -rc4. 2005-05-31 03:46:32 +02:00
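The flag macros above are single-bit masks that get OR-ed into an object's `flags` word, as `show_commit()` does with `commit->object.flags |= SHOWN`. The following is a minimal sketch of that pattern, not git's own code: `struct toy_object` and the three helper functions are hypothetical stand-ins introduced only for illustration.

```c
/* Same single-bit masks as in the listing. */
#define SEEN        (1u << 0)
#define INTERESTING (1u << 1)
#define COUNTED     (1u << 2)
#define SHOWN       (1u << 3)
#define TREECHANGE  (1u << 4)
#define TMP_MARK    (1u << 5) /* for isolated cases; clean after use */

/* Hypothetical stand-in for an object header; only the flags word matters here. */
struct toy_object {
	unsigned int flags;
};

/* Set every bit in mask without disturbing the other flags. */
static void set_flag(struct toy_object *obj, unsigned int mask)
{
	obj->flags |= mask;
}

/* Clear every bit in mask, e.g. to drop TMP_MARK after use. */
static void clear_flag(struct toy_object *obj, unsigned int mask)
{
	obj->flags &= ~mask;
}

/* Return 1 if any bit in mask is set, 0 otherwise. */
static int has_flag(const struct toy_object *obj, unsigned int mask)
{
	return (obj->flags & mask) != 0;
}
```

Because each macro occupies its own bit, a temporary mark such as `TMP_MARK` can be set and later cleared without touching `SEEN` or `SHOWN` on the same object.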
git-rev-list: add "end" commit and "--header" flag The "end" commit is just faking it right now, it's sorting things purely by date, so this is _not_ a reachability analysis. Some day. The "--header" flag causes the commit message to be printed out, with a NUL character separator after it for parseability. This allows you to do things like use "grep -z" to grep for certain authors etc. 2005-05-26 03:29:09 +02:00			`static const char rev_list_usage[] =`
Update usage string and documentation for git-rev-list. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-30 10:03:45 +01:00			`"git-rev-list [OPTION] <commit-id>... [ -- paths... ]\n"`
			`" limiting output:\n"`
			`" --max-count=nr\n"`
			`" --max-age=epoch\n"`
			`" --min-age=epoch\n"`
			`" --sparse\n"`
			`" --no-merges\n"`
rev-list --remove-empty: add minimum help and doc entry. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-27 10:39:24 +01:00			`" --remove-empty\n"`
Update usage string and documentation for git-rev-list. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-30 10:03:45 +01:00			`" --all\n"`
			`" ordering output:\n"`
			`" --merge-order [ --show-breaks ]\n"`
			`" --topo-order\n"`
			`" formatting output:\n"`
			`" --parents\n"`
			`" --objects\n"`
			`" --unpacked\n"`
			`" --header \| --pretty\n"`
rev-list: default to abbreviate merge parent names under --pretty. When we prettyprint commit log messages, merge parent names were often very long and there was no way to abbreviate it. This changes them to be abbreviated by default, and non-default abbreviations can be specified with --no-abbrev or --abbrev=<n> options. Note that this affects only the prettyprinted parent names. The output from --show-parents is meant for machine consumption and is not affected by this flag. 2006-02-10 20:56:42 +01:00			`" --abbrev=nr \| --no-abbrev\n"`
Update usage string and documentation for git-rev-list. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-30 10:03:45 +01:00			`" special purpose:\n"`
			`" --bisect"`
			`;`
git-rev-list: add "end" commit and "--header" flag The "end" commit is just faking it right now, it's sorting things purely by date, so this is _not_ a reachability analysis. Some day. The "--header" flag causes the commit message to be printed out, with a NUL character separator after it for parseability. This allows you to do things like use "grep -z" to grep for certain authors etc. 2005-05-26 03:29:09 +02:00
git-rev-list: make --dense the default (and introduce "--sparse") This actually does three things: - make "--dense" the default for git-rev-list. Since dense is a no-op if no filenames are given, this doesn't actually change any historical behaviour, but it's logically the right default (if we want to prune on filenames, do it fully. The sparse "merge-only" thing may be useful, but it's not what you'd normally expect) - make "git-rev-parse" show the default revision control before it shows any pathnames. This was a real bug, but nobody would ever have noticed, because the default thing tends to only make sense for git-rev-list, and git-rev-list didn't use to take pathnames. - it changes "git-rev-list" to match the other commands that take a mix of revisions and filenames - it no longer requires the "--" before filenames (although you still need to do it if a filename could be confused with a revision name, eg "gitk" in the git archive) This all just makes for much more pleasant and obvious usage. Just doing a gitk t/ does the obvious thing: it will show the history as it concerns the "t/" subdirectory. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-26 00:24:55 +02:00			`static int dense = 1;`
"git rev-list --unpacked" shows only unpacked commits More infrastructure to do efficient incremental packs. 2005-07-03 22:29:54 +02:00			`static int unpacked = 0;`
git-rev-list: add "--bisect" flag to find the "halfway" point This is useful for doing binary searching for problems. You start with a known good and known bad point, and you then test the "halfway" point in between: git-rev-list --bisect bad ^good and you test that. If that one tests good, you now still have a known bad case, but two known good points, and you can bisect again: git-rev-list --bisect bad ^good1 ^good2 and test that point. If that point is bad, you now use that as your known-bad starting point: git-rev-list --bisect newbad ^good1 ^good2 and basically at every iteration you shrink your list of commits by half: you're binary searching for the point where the troubles started, even though there isn't a nice linear ordering. 2005-06-18 07:54:50 +02:00			`static int bisect_list = 0;`
Prepare git-rev-list for tracking tag objects too We want to be able to just say "give a difference between these objects", rather than limiting it to commits only. This isn't there yet, but it sets things up to be a bit easier. 2005-06-29 19:40:14 +02:00			`static int tag_objects = 0;`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`static int tree_objects = 0;`
			`static int blob_objects = 0;`
git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`static int verbose_header = 0;`
rev-list: default to abbreviate merge parent names under --pretty. When we prettyprint commit log messages, merge parent names were often very long and there was no way to abbreviate it. This changes them to be abbreviated by default, and non-default abbreviations can be specified with --no-abbrev or --abbrev=<n> options. Note that this affects only the prettyprinted parent names. The output from --show-parents is meant for machine consumption and is not affected by this flag. 2006-02-10 20:56:42 +01:00			`static int abbrev = DEFAULT_ABBREV;`
git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`static int show_parents = 0;`
			`static int hdr_termination = 0;`
[PATCH] Fix "prefix" mixup in git-rev-list Recent changes in git have broken cg-log. git-rev-list no longer prints "commit" in front of commit hashes. It turns out a local "prefix" variable in main() shadows a file-scoped "prefix" variable. The patch removed the local "prefix" variable since its value is never used (in the intended way, that is). The call to setup_git_directory() is kept since it has useful side effects. The file-scoped "prefix" variable is renamed to "commit_prefix" just in case someone reintroduces "prefix" to hold the return value of setup_git_directory(). Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-24 23:58:42 +02:00			`static const char *commit_prefix = "";`
git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`static unsigned long max_age = -1;`
			`static unsigned long min_age = -1;`
			`static int max_count = -1;`
pretty_print_commit: add different formats You can ask to print out "raw" format (full headers, full body), "medium" format (author and date, full body) or "short" format (author only, condensed body). Use "git-rev-list --pretty=short HEAD \| less -S" for an example. 2005-06-05 18:02:03 +02:00			`static enum cmit_fmt commit_format = CMIT_FMT_RAW;`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`static int merge_order = 0;`
			`static int show_breaks = 0;`
[PATCH] Fix for --merge-order, --max-age interaction issue This patch fixes a problem reported by Paul Mackerras regarding the interaction of the --merge-order and --max-age switches of git-rev-list. This patch applies to the current Linus HEAD. A cleaner fix for the same problem in my current HEAD will follow later. With this change, --merge-order produces the same result as no --merge-order on the linux-2.6 git repository, to wit: $> git-rev-list --max-age=1116330140 bcfff0b471a60df350338bcd727fc9b8a6aa54b2 \| wc -l 655 $> git-rev-list --merge-order --max-age=1116330140 bcfff0b471a60df350338bcd727fc9b8a6aa54b2 \| wc -l 655 Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-20 04:29:41 +02:00			`static int stop_traversal = 0;`
Add "--topo-order" flag to use new topological sort 2005-07-06 19:25:04 +02:00			`static int topo_order = 0;`
[PATCH] add --no-merges flag to suppress display of merge commits As requested by Junio (who suggested --single-parents-only, but this could forget a no-parent root). Also, adds a few missing options to the usage string. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-08 11:37:21 +02:00			`static int no_merges = 0;`
Teach git-rev-list to follow just a specified set of files This is the first cut at a git-rev-list that knows to ignore commits that don't change a certain file (or set of files). NOTE! For now it only prunes _merge_ commits, and follows the parent where there are no differences in the set of files specified. In the long run, I'd like to make it re-write the straight-line history too, but for now the merge simplification is much more fundamentally important (the rewriting of straight-line history is largely a separate simplification phase, but the merge simplification needs to happen early if we want to optimize away unnecessary commit parsing). If all parents of a merge change some of the files, the merge is left as is, so the end result is in no way guaranteed to be a linear history, but it will often be a lot /more/ linear than the full tree, since it prunes out parents that didn't matter for that set of files. As an example from the current kernel: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9885 [torvalds@g5 linux]$ git-rev-list HEAD -- Makefile \| wc -l 4084 [torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb \| wc -l 5206 and you can also use 'gitk' to more visually see the pruning of the history tree, with something like gitk -- drivers/usb showing a simplified history that tries to follow the first parent in a merge that is the parent that fully defines drivers/usb/. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:25:09 +02:00			`static const char **paths = NULL;`
rev-list: stop when the file disappears The one thing I've considered doing (I really should) is to add a "stop when you don't find the file" option to "git-rev-list". This patch does some of the work towards that: it removes the "parent" thing when the file disappears, so a "git annotate" could do something like git-rev-list --remove-empty --parents HEAD -- "$filename" and it would get a good graph that stops when the filename disappears (it's not perfect though: it won't remove all the uninteresting commits). It also simplifies the logic of finding tree differences a bit, at the cost of making it a tad less efficient. The old logic was two-phase: it would first simplify _only_ merges tree as it traversed the tree, and then simplify the linear parts of the remainder independently. That was pretty optimal from an efficiency standpoint because it avoids doing any comparisons that we can see are unnecessary, but it made it much harder to understand than it really needed to be. The new logic is a lot more straightforward, and compares the trees as it traverses the graph (ie everything is a single phase). That makes it much easier to stop graph traversal at any point where a file disappears. As an example, let's say that you have a git repository that has had a file called "A" some time in the past. That file gets renamed to B, and then gets renamed back again to A. The old "git-rev-list" would show two commits: the commit that renames B to A (because it changes A) _and_ as its parent the commit that renames A to B (because it changes A). With the new --remove-empty flag, git-rev-list will show just the commit that renames B to A as the "root" commit, and stop traversal there (because that's what you want for "annotate" - you want to stop there, and for every "root" commit you then separately see if it really is a new file, or if the path's history disappeared because it was renamed from some other file). 
With this patch, you should be able to basically do a "poor mans 'git annotate'" with a fairly simple loop: push("HEAD", "$filename") while (revision,filename = pop()) { for each i in $(git-rev-list --parents --remove-empty $revision -- "$filename") pseudo-parents($i) = git-rev-list parents for that line if (pseudo-parents($i) is non-empty) { show diff of $i against pseudo-parents continue } /* See if the _real_ parents of $i had a rename */ parent($i) = real-parent($i) if (find-rename in $parent($i)->$i) push $parent($i), "old-name" } which should be doable in perl or something (doing stacks in shell is just too painful to be worth it, so I'm not going to do this). Anybody want to try? Linus 2006-01-18 23:47:30 +01:00			`static int remove_empty_trees = 0;`
git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00
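The limits `max_age` and `min_age` above are declared `unsigned long` but initialised to `-1`. In C that conversion is well defined and yields `ULONG_MAX`, so `-1` can serve as a "no limit set" sentinel. The sketch below illustrates the idiom. `age_limit_is_set` and `older_than_max_age` are hypothetical helper names, and the comparison they perform is an assumption made for illustration, since this excerpt does not show how the file actually consumes these variables.

```c
#include <limits.h>

/* Copy of the declaration from the listing: -1 converts to ULONG_MAX. */
static unsigned long max_age = -1;

/* Hypothetical helper: a limit counts as "set" unless it still holds the sentinel. */
static int age_limit_is_set(unsigned long limit)
{
	return limit != (unsigned long)-1;
}

/*
 * Hypothetical helper: returns 1 only when a limit is set and the
 * commit date (seconds since the epoch) is strictly older than it.
 */
static int older_than_max_age(unsigned long commit_date)
{
	return age_limit_is_set(max_age) && commit_date < max_age;
}
```

With the sentinel in place, no date is reported as too old. After an assignment such as `max_age = 1116330140UL;` (the epoch value used with `--max-age` in one of the commit messages above), only dates before that value are.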
			`static void show_commit(struct commit *commit)`
			`{`
[PATCH] Prevent git-rev-list without --merge-order producing duplicates in output If b is reachable from a, then: git-rev-list a b argument would print one of the commits twice. This patch fixes that problem. A previous problem fixed it for the --merge-order switch. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-20 04:29:38 +02:00			`commit->object.flags \|= SHOWN;`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`if (show_breaks) {`
[PATCH] Fix "prefix" mixup in git-rev-list Recent changes in git have broken cg-log. git-rev-list no longer prints "commit" in front of commit hashes. It turns out a local "prefix" variable in main() shadows a file-scoped "prefix" variable. The patch removed the local "prefix" variable since its value is never used (in the intended way, that is). The call to setup_git_directory() is kept since it has useful side effects. The file-scoped "prefix" variable is renamed to "commit_prefix" just in case someone reintroduces "prefix" to hold the return value of setup_git_directory(). Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-24 23:58:42 +02:00			`commit_prefix = "\| ";`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`if (commit->object.flags & DISCONTINUITY) {`
[PATCH] Fix "prefix" mixup in git-rev-list Recent changes in git have broken cg-log. git-rev-list no longer prints "commit" in front of commit hashes. It turn out a local "prefix" variable in main() shadows a file-scoped "prefix" variable. The patch removed the local "prefix" variable since its value is never used (in the intended way, that is). The call to setup_git_directory() is kept since it has useful side effects. The file-scoped "prefix" variable is renamed to "commit_prefix" just in case someone reintroduces "prefix" to hold the return value of setup_git_directory(). Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-24 23:58:42 +02:00			`commit_prefix = "^ ";`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`} else if (commit->object.flags & BOUNDARY) {`
[PATCH] Fix "prefix" mixup in git-rev-list Recent changes in git have broken cg-log. git-rev-list no longer prints "commit" in front of commit hashes. It turn out a local "prefix" variable in main() shadows a file-scoped "prefix" variable. The patch removed the local "prefix" variable since its value is never used (in the intended way, that is). The call to setup_git_directory() is kept since it has useful side effects. The file-scoped "prefix" variable is renamed to "commit_prefix" just in case someone reintroduces "prefix" to hold the return value of setup_git_directory(). Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-24 23:58:42 +02:00			`commit_prefix = "= ";`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`}`
			`}`
[PATCH] Fix "prefix" mixup in git-rev-list Recent changes in git have broken cg-log. git-rev-list no longer prints "commit" in front of commit hashes. It turn out a local "prefix" variable in main() shadows a file-scoped "prefix" variable. The patch removed the local "prefix" variable since its value is never used (in the intended way, that is). The call to setup_git_directory() is kept since it has useful side effects. The file-scoped "prefix" variable is renamed to "commit_prefix" just in case someone reintroduces "prefix" to hold the return value of setup_git_directory(). Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-24 23:58:42 +02:00			`printf("%s%s", commit_prefix, sha1_to_hex(commit->object.sha1));`
git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`if (show_parents) {`
			`struct commit_list *parents = commit->parents;`
			`while (parents) {`
rev-list: omit duplicated parents. Showing the same parent more than once for a commit does not make much sense downstream, so stop it. This can happen with an incorrectly made merge commit that merges the same parent twice, but can happen in an otherwise sane development history while squishing the history by taking into account only commits that touch specified paths. For example, $ git rev-list --max-count=1 --parents addafaf -- rev-list.c would have to show this commit ancestry graph: .---o---. / \ .------o---. / 93b74bc \ ------o---o-----o---o-----o addafaf d8f6b34 \ / .---o---o---. \ / .---*---. 3815f42 where 5 independent development tracks, only two of which have changes in the specified paths since they forked. The last change for the other three development tracks was done by the same commit before they forked, and we were showing that three times. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-30 00:24:42 +01:00			`struct object *o = &(parents->item->object);`
git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`parents = parents->next;`
rev-list: omit duplicated parents. Showing the same parent more than once for a commit does not make much sense downstream, so stop it. This can happen with an incorrectly made merge commit that merges the same parent twice, but can happen in an otherwise sane development history while squishing the history by taking into account only commits that touch specified paths. For example, $ git rev-list --max-count=1 --parents addafaf -- rev-list.c would have to show this commit ancestry graph: .---o---. / \ .------o---. / 93b74bc \ ------o---o-----o---o-----o addafaf d8f6b34 \ / .---o---o---. \ / .---*---. 3815f42 where 5 independent development tracks, only two of which have changes in the specified paths since they forked. The last change for the other three development tracks was done by the same commit before they forked, and we were showing that three times. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-30 00:24:42 +01:00			`if (o->flags & TMP_MARK)`
			`continue;`
			`printf(" %s", sha1_to_hex(o->sha1));`
			`o->flags |= TMP_MARK;`
git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`}`
rev-list: omit duplicated parents. Showing the same parent more than once for a commit does not make much sense downstream, so stop it. This can happen with an incorrectly made merge commit that merges the same parent twice, but can happen in an otherwise sane development history while squishing the history by taking into account only commits that touch specified paths. For example, $ git rev-list --max-count=1 --parents addafaf -- rev-list.c would have to show this commit ancestry graph: .---o---. / \ .------o---. / 93b74bc \ ------o---o-----o---o-----o addafaf d8f6b34 \ / .---o---o---. \ / .---*---. 3815f42 where 5 independent development tracks, only two of which have changes in the specified paths since they forked. The last change for the other three development tracks was done by the same commit before they forked, and we were showing that three times. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-30 00:24:42 +01:00			`/* TMP_MARK is a general purpose flag that can`
			`* be used locally, but the user should clean`
			`* things up after it is done with them.`
			`*/`
			`for (parents = commit->parents;`
			`parents;`
			`parents = parents->next)`
			`parents->item->object.flags &= ~TMP_MARK;`
git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`}`
Introduce --pretty=oneline format. This introduces --pretty=oneline to git-rev-tree and git-rev-list commands to show only the first line of the commit message, without frills. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-09 07:15:40 +02:00			`if (commit_format == CMIT_FMT_ONELINE)`
			`putchar(' ');`
			`else`
			`putchar('\n');`

git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`if (verbose_header) {`
pretty_print_commit: add different formats You can ask to print out "raw" format (full headers, full body), "medium" format (author and date, full body) or "short" format (author only, condensed body). Use "git-rev-list --pretty=short HEAD \| less -S" for an example. 2005-06-05 18:02:03 +02:00			`static char pretty_header[16384];`
rev-list: default to abbreviate merge parent names under --pretty. When we prettyprint commit log messages, merge parent names were often very long and there was no way to abbreviate it. This changes them to be abbreviated by default, and non-default abbreviations can be specified with --no-abbrev or --abbrev=<n> options. Note that this affects only the prettyprinted parent names. The output from --show-parents is meant for machine consumption and is not affected by this flag. 2006-02-10 20:56:42 +01:00			`pretty_print_commit(commit_format, commit, ~0, pretty_header, sizeof(pretty_header), abbrev);`
pretty_print_commit: add different formats You can ask to print out "raw" format (full headers, full body), "medium" format (author and date, full body) or "short" format (author only, condensed body). Use "git-rev-list --pretty=short HEAD \| less -S" for an example. 2005-06-05 18:02:03 +02:00			`printf("%s%c", pretty_header, hdr_termination);`
Make rev-list flush the stdio buffers after each rev. We'd rather get the revisions in a slow but timely manner than have to wait for them. 2005-07-05 01:36:48 +02:00			`}`
			`fflush(stdout);`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`}`
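
The duplicate-parent suppression in the function above relies on a mark-then-clear pattern: set `TMP_MARK` on each parent as it is printed, skip any parent already marked, then clear the flag on every parent so it is free for the next user. The sketch below isolates that pattern. The `struct node`, `struct node_list`, and `print_unique_parents` names are hypothetical stand-ins for illustration, not git's own types, and a plain string takes the place of the SHA-1.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-ins for git's object and commit_list types. */
#define TMP_MARK (1u << 0)

struct node {
	unsigned flags;
	const char *name;
};

struct node_list {
	struct node *item;
	struct node_list *next;
};

/*
 * Write each distinct parent name once to `out`, then clear the marks.
 * Returns the number of names written.
 */
static int print_unique_parents(FILE *out, struct node_list *parents)
{
	struct node_list *p;
	int shown = 0;

	for (p = parents; p; p = p->next) {
		struct node *o = p->item;

		if (o->flags & TMP_MARK)
			continue;	/* already printed for this commit */
		fprintf(out, " %s", o->name);
		o->flags |= TMP_MARK;
		shown++;
	}

	/* Clean up so the flag can be reused by other local code. */
	for (p = parents; p; p = p->next)
		p->item->flags &= ~TMP_MARK;

	return shown;
}
```

Because the mark lives on the object itself, the check needs no extra memory and no nested loop over the parent list. The cost is the second pass that clears the flag.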

git-rev-list: fix "--dense" flag Right now --dense will _always_ show the root commit. I didn't do the logic that does the diff against an empty tree. I was lazy. This patch does that. The first round was incorrect but this patch is even slightly tested, and might do a better job. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-25 20:50:46 +02:00			`static int rewrite_one(struct commit **pp)`
git-rev-list: add "--dense" flag This is what the recent git-rev-list changes have all been gearing up for. When we use a path filter to git-rev-list, the new "--dense" flag asks git-rev-list to compress the history so that it _only_ contains commits that change files in the path filter. It also rewrites the parent information so that tools like "gitk" will see the result as a dense history tree. For example, on the current kernel archive: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9904 [torvalds@g5 linux]$ git-rev-list HEAD -- kernel \| wc -l 5442 [torvalds@g5 linux]$ git-rev-list --dense HEAD -- kernel \| wc -l 356 which shows that while we have almost ten thousand commits, we can prune down the work to slightly more than half by only following the merges that are interesting. But further, we can then compress the history to just 356 entries that actually make changes to the kernel subdirectory. To see this in action, try something like gitk --dense -- gitk to see just the history that affects gitk. Or, to show that true parallel development still remains parallel, do gitk --dense -- daemon.c which shows some parallel commits in the current git tree. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-22 01:40:54 +02:00			`{`
			`for (;;) {`
			`struct commit *p = *pp;`
			`if (p->object.flags & (TREECHANGE | UNINTERESTING))`
git-rev-list: fix "--dense" flag Right now --dense will _always_ show the root commit. I didn't do the logic that does the diff against an empty tree. I was lazy. This patch does that. The first round was incorrect but this patch is even slightly tested, and might do a better job. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-25 20:50:46 +02:00			`return 0;`
			`if (!p->parents)`
			`return -1;`
git-rev-list: add "--dense" flag This is what the recent git-rev-list changes have all been gearing up for. When we use a path filter to git-rev-list, the new "--dense" flag asks git-rev-list to compress the history so that it _only_ contains commits that change files in the path filter. It also rewrites the parent information so that tools like "gitk" will see the result as a dense history tree. For example, on the current kernel archive: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9904 [torvalds@g5 linux]$ git-rev-list HEAD -- kernel \| wc -l 5442 [torvalds@g5 linux]$ git-rev-list --dense HEAD -- kernel \| wc -l 356 which shows that while we have almost ten thousand commits, we can prune down the work to slightly more than half by only following the merges that are interesting. But further, we can then compress the history to just 356 entries that actually make changes to the kernel subdirectory. To see this in action, try something like gitk --dense -- gitk to see just the history that affects gitk. Or, to show that true parallel development still remains parallel, do gitk --dense -- daemon.c which shows some parallel commits in the current git tree. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-22 01:40:54 +02:00			`*pp = p->parents->item;`
			`}`
			`}`

			`static void rewrite_parents(struct commit *commit)`
			`{`
git-rev-list: fix "--dense" flag Right now --dense will _always_ show the root commit. I didn't do the logic that does the diff against an empty tree. I was lazy. This patch does that. The first round was incorrect but this patch is even slightly tested, and might do a better job. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-25 20:50:46 +02:00			`struct commit_list **pp = &commit->parents;`
			`while (*pp) {`
			`struct commit_list *parent = *pp;`
			`if (rewrite_one(&parent->item) < 0) {`
			`*pp = parent->next;`
			`continue;`
			`}`
			`pp = &parent->next;`
git-rev-list: add "--dense" flag This is what the recent git-rev-list changes have all been gearing up for. When we use a path filter to git-rev-list, the new "--dense" flag asks git-rev-list to compress the history so that it _only_ contains commits that change files in the path filter. It also rewrites the parent information so that tools like "gitk" will see the result as a dense history tree. For example, on the current kernel archive: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9904 [torvalds@g5 linux]$ git-rev-list HEAD -- kernel \| wc -l 5442 [torvalds@g5 linux]$ git-rev-list --dense HEAD -- kernel \| wc -l 356 which shows that while we have almost ten thousand commits, we can prune down the work to slightly more than half by only following the merges that are interesting. But further, we can then compress the history to just 356 entries that actually make changes to the kernel subdirectory. To see this in action, try something like gitk --dense -- gitk to see just the history that affects gitk. Or, to show that true parallel development still remains parallel, do gitk --dense -- daemon.c which shows some parallel commits in the current git tree. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-22 01:40:54 +02:00			`}`
			`}`
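
The two functions above can be tried in isolation. `rewrite_one` follows first parents until it reaches a commit flagged `TREECHANGE` or `UNINTERESTING`, and `rewrite_parents` uses a pointer-to-pointer walk to unlink any parent entry whose chain ends at a root without reaching such a commit. The sketch below is a simplified, self-contained version: it assumes `flags` sits directly in `struct commit` rather than in an embedded `object`, and the flag values are arbitrary placeholders.

```c
#include <stddef.h>

/* Placeholder flag values; the real definitions live elsewhere in git. */
#define TREECHANGE    (1u << 0)
#define UNINTERESTING (1u << 1)

struct commit_list;

/* Simplified commit: flags are inlined instead of living in an object. */
struct commit {
	unsigned flags;
	struct commit_list *parents;
};

struct commit_list {
	struct commit *item;
	struct commit_list *next;
};

/* Advance *pp along first parents to the nearest relevant commit. */
static int rewrite_one(struct commit **pp)
{
	for (;;) {
		struct commit *p = *pp;

		if (p->flags & (TREECHANGE | UNINTERESTING))
			return 0;	/* keep this one as the parent */
		if (!p->parents)
			return -1;	/* hit a root with nothing relevant */
		*pp = p->parents->item;
	}
}

/* Rewrite every parent entry; drop entries that lead nowhere. */
static void rewrite_parents(struct commit *commit)
{
	struct commit_list **pp = &commit->parents;

	while (*pp) {
		struct commit_list *parent = *pp;

		if (rewrite_one(&parent->item) < 0) {
			*pp = parent->next;	/* unlink without a prev pointer */
			continue;
		}
		pp = &parent->next;
	}
}
```

Holding `pp` as the address of the link, rather than of the node, lets the head and interior entries be removed by the same assignment. As in the listing, the unlinked list node is not freed.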

[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`static int filter_commit(struct commit * commit)`
			`{`
[PATCH] Tidy up - slight simplification of rev-list.c This patch implements a small tidy up of rev-list.c to reduce (but not eliminate) the amount of ugliness associated with the merge_order flag. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-06 18:39:34 +02:00			`if (stop_traversal && (commit->object.flags & BOUNDARY))`
[PATCH] Fix for --merge-order, --max-age interaction issue This patch fixes a problem reported by Paul Mackerras regarding the interaction of the --merge-order and --max-age switches of git-rev-list. This patch applies to the current Linus HEAD. A cleaner fix for the same problem in my current HEAD will follow later. With this change, --merge-order produces the same result as no --merge-order on the linux-2.6 git repository, to wit: $> git-rev-list --max-age=1116330140 bcfff0b471a60df350338bcd727fc9b8a6aa54b2 \| wc -l 655 $> git-rev-list --merge-order --max-age=1116330140 bcfff0b471a60df350338bcd727fc9b8a6aa54b2 \| wc -l 655 Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-20 04:29:41 +02:00			`return STOP;`
[PATCH] Prevent git-rev-list without --merge-order producing duplicates in output If b is reachable from a, then: git-rev-list a b argument would print one of the commits twice. This patch fixes that problem. A previous problem fixed it for the --merge-order switch. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-20 04:29:38 +02:00			`if (commit->object.flags & (UNINTERESTING|SHOWN))`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`return CONTINUE;`
			`if (min_age != -1 && (commit->date > min_age))`
			`return CONTINUE;`
[PATCH] Fix for --merge-order, --max-age interaction issue This patch fixes a problem reported by Paul Mackerras regarding the interaction of the --merge-order and --max-age switches of git-rev-list. This patch applies to the current Linus HEAD. A cleaner fix for the same problem in my current HEAD will follow later. With this change, --merge-order produces the same result as no --merge-order on the linux-2.6 git repository, to wit: $> git-rev-list --max-age=1116330140 bcfff0b471a60df350338bcd727fc9b8a6aa54b2 \| wc -l 655 $> git-rev-list --merge-order --max-age=1116330140 bcfff0b471a60df350338bcd727fc9b8a6aa54b2 \| wc -l 655 Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-20 04:29:41 +02:00			`if (max_age != -1 && (commit->date < max_age)) {`
[PATCH] Tidy up - slight simplification of rev-list.c This patch implements a small tidy up of rev-list.c to reduce (but not eliminate) the amount of ugliness associated with the merge_order flag. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-06 18:39:34 +02:00			`stop_traversal=1;`
Make time-based commit filtering work with topological ordering. The trick is to consider the time-based filtering a limiter, the same way we do for release ranges. That means that the time-based filtering runs _before_ the topological sorting, which makes it meaningful again. It also simplifies the code logic. This makes "gitk" useful with time ranges. [ Second version: --merge-order now unaffected by the re-org ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 02:55:46 +02:00			`return CONTINUE;`
[PATCH] Fix for --merge-order, --max-age interaction issue This patch fixes a problem reported by Paul Mackerras regarding the interaction of the --merge-order and --max-age switches of git-rev-list. This patch applies to the current Linus HEAD. A cleaner fix for the same problem in my current HEAD will follow later. With this change, --merge-order produces the same result as no --merge-order on the linux-2.6 git repository, to wit: $> git-rev-list --max-age=1116330140 bcfff0b471a60df350338bcd727fc9b8a6aa54b2 \| wc -l 655 $> git-rev-list --merge-order --max-age=1116330140 bcfff0b471a60df350338bcd727fc9b8a6aa54b2 \| wc -l 655 Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-20 04:29:41 +02:00			`}`
[PATCH] add --no-merges flag to suppress display of merge commits As requested by Junio (who suggested --single-parents-only, but this could forget a no-parent root). Also, adds a few missing options to the usage string. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-08 11:37:21 +02:00			`if (no_merges && (commit->parents && commit->parents->next))`
			`return CONTINUE;`
git-rev-list: add "--dense" flag This is what the recent git-rev-list changes have all been gearing up for. When we use a path filter to git-rev-list, the new "--dense" flag asks git-rev-list to compress the history so that it _only_ contains commits that change files in the path filter. It also rewrites the parent information so that tools like "gitk" will see the result as a dense history tree. For example, on the current kernel archive: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9904 [torvalds@g5 linux]$ git-rev-list HEAD -- kernel \| wc -l 5442 [torvalds@g5 linux]$ git-rev-list --dense HEAD -- kernel \| wc -l 356 which shows that while we have almost ten thousand commits, we can prune down the work to slightly more than half by only following the merges that are interesting. But further, we can then compress the history to just 356 entries that actually make changes to the kernel subdirectory. To see this in action, try something like gitk --dense -- gitk to see just the history that affects gitk. Or, to show that true parallel development still remains parallel, do gitk --dense -- daemon.c which shows some parallel commits in the current git tree. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-22 01:40:54 +02:00			`if (paths && dense) {`
			`if (!(commit->object.flags & TREECHANGE))`
			`return CONTINUE;`
			`rewrite_parents(commit);`
			`}`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`return DO;`
			`}`

			`static int process_commit(struct commit * commit)`
			`{`
			`int action=filter_commit(commit);`

			`if (action == STOP) {`
			`return STOP;`
			`}`

			`if (action == CONTINUE) {`
			`return CONTINUE;`
git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`}`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00
max-count in terms of intersection When a path designation is given, max-count counts the number of commits therein (intersection), not globally. This avoids the case where in case path has been inactive for the last N commits, --max-count=N and path designation at git-rev-list is given, would give no commits. Signed-off-by: Luben Tuikov <ltuikov@yahoo.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-18 22:29:04 +01:00			`if (max_count != -1 && !max_count--)`
			`return STOP;`

[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`show_commit(commit);`

			`return CONTINUE;`
git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`}`
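The `max_count` test in `process_commit()` treats `-1` as "no limit" and otherwise counts down once per commit that passes the filter. The following is a minimal sketch of that counting idiom on its own. The names `remaining`, `take_commit`, `count_accepted`, `EX_STOP` and `EX_CONTINUE` are hypothetical and are not part of rev-list.c.

```c
/* Hypothetical action codes, standing in for the STOP/CONTINUE values above. */
enum { EX_STOP, EX_CONTINUE };

/* -1 means "no limit"; any other value is the number of commits still allowed. */
static int remaining = -1;

/* Same test as in process_commit(): stop once the budget has reached zero. */
static int take_commit(void)
{
	if (remaining != -1 && !remaining--)
		return EX_STOP;
	return EX_CONTINUE;
}

/*
 * Call take_commit() until it reports EX_STOP or `attempts` candidates
 * have been offered, and return how many were accepted.
 */
static int count_accepted(int limit, int attempts)
{
	int shown = 0;

	remaining = limit;
	while (attempts-- > 0 && take_commit() == EX_CONTINUE)
		shown++;
	return shown;
}
```

Because the decrement happens only after the filter has accepted a commit, the limit counts commits that are actually shown rather than commits that are merely visited. Note that the caller is expected to stop iterating on the first stop result, as `show_commit_list()` does.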

Ooh. Make git-rev-list --object associate a name with objects. The name isn't unique, it's just the first name that object is reached through, so it's really nothing more than a hint. 2005-06-27 00:26:05 +02:00			`static struct object_list **add_object(struct object *obj, struct object_list **p, const char *name)`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`{`
			`struct object_list *entry = xmalloc(sizeof(*entry));`
			`entry->item = obj;`
Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`entry->next = *p;`
Ooh. Make git-rev-list --object associate a name with objects. The name isn't unique, it's just the first name that object is reached through, so it's really nothing more than a hint. 2005-06-27 00:26:05 +02:00			`entry->name = name;`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`*p = entry;`
			`return &entry->next;`
			`}`
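`add_object()` links the new entry in at `*p` and returns the address of that entry's `next` field, so a caller that keeps passing the returned pointer back in appends to the tail without walking the list. Below is a small self-contained sketch of the same idiom. `struct name_list`, `append_name` and `count_names` are hypothetical stand-ins, not git's own types or functions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for git's struct object_list. */
struct name_list {
	const char *name;
	struct name_list *next;
};

/*
 * Link a new entry in at *p and return the address of its next pointer.
 * Feeding the return value into the following call appends in order.
 */
static struct name_list **append_name(struct name_list **p, const char *name)
{
	struct name_list *entry = malloc(sizeof(*entry));

	if (!entry) {
		perror("malloc");
		exit(1);
	}
	entry->name = name;
	entry->next = *p;
	*p = entry;
	return &entry->next;
}

/* Count the entries; used only to inspect the result. */
static int count_names(const struct name_list *list)
{
	int n = 0;

	for (; list; list = list->next)
		n++;
	return n;
}
```

A caller would start with `struct name_list *head = NULL, **p = &head;` and then write `p = append_name(p, "a");` for each item, which mirrors how `show_commit_list()` threads `p` through `process_tree()` and `add_object()`.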

Ooh. Make git-rev-list --object associate a name with objects. The name isn't unique, it's just the first name that object is reached through, so it's really nothing more than a hint. 2005-06-27 00:26:05 +02:00			`static struct object_list **process_blob(struct blob *blob, struct object_list **p, const char *name)`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`{`
			`struct object *obj = &blob->object;`

			`if (!blob_objects)`
			`return p;`
			`if (obj->flags & (UNINTERESTING | SEEN))`
			`return p;`
			`obj->flags |= SEEN;`
Ooh. Make git-rev-list --object associate a name with objects. The name isn't unique, it's just the first name that object is reached through, so it's really nothing more than a hint. 2005-06-27 00:26:05 +02:00			`return add_object(obj, p, name);`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`}`

Ooh. Make git-rev-list --object associate a name with objects. The name isn't unique, it's just the first name that object is reached through, so it's really nothing more than a hint. 2005-06-27 00:26:05 +02:00			`static struct object_list **process_tree(struct tree *tree, struct object_list **p, const char *name)`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`{`
			`struct object *obj = &tree->object;`
			`struct tree_entry_list *entry;`

			`if (!tree_objects)`
			`return p;`
			`if (obj->flags & (UNINTERESTING | SEEN))`
			`return p;`
			`if (parse_tree(tree) < 0)`
			`die("bad tree object %s", sha1_to_hex(obj->sha1));`
			`obj->flags |= SEEN;`
Ooh. Make git-rev-list --object associate a name with objects. The name isn't unique, it's just the first name that object is reached through, so it's really nothing more than a hint. 2005-06-27 00:26:05 +02:00			`p = add_object(obj, p, name);`
[PATCH] Improve git-rev-list memory usage further This avoids keeping tree entries around, and free's them as it traverses the list. This avoids building up a huge memory footprint just for these small but very common allocations. Before: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 11.65user 0.38system 0:12.65elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+42934minor)pagefaults 0swaps 59124 After: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 12.28user 0.29system 0:12.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+26718minor)pagefaults 0swaps 59124 Note how the minor fault numbers - which ends up being how many pages we needed to map - go down from 42934 (167 MB) to 26718 (104 MB). That is: Before: 42934 minor pagefaults After: 26718 minor pagefaults This is all in _addition_ to the previous fixes. It used to be ~48,000 pagefaults. That's still a honking big memory footprint, but it's about half of what it was just a day or two ago (and this is the object list for a pretty big update - almost 60,000 objects. Smaller updates need less memory). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-16 23:32:48 +02:00			`entry = tree->entries;`
			`tree->entries = NULL;`
			`while (entry) {`
			`struct tree_entry_list *next = entry->next;`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`if (entry->directory)`
Ooh. Make git-rev-list --object associate a name with objects. The name isn't unique, it's just the first name that object is reached through, so it's really nothing more than a hint. 2005-06-27 00:26:05 +02:00			`p = process_tree(entry->item.tree, p, entry->name);`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`else`
Ooh. Make git-rev-list --object associate a name with objects. The name isn't unique, it's just the first name that object is reached through, so it's really nothing more than a hint. 2005-06-27 00:26:05 +02:00			`p = process_blob(entry->item.blob, p, entry->name);`
[PATCH] Improve git-rev-list memory usage further This avoids keeping tree entries around, and free's them as it traverses the list. This avoids building up a huge memory footprint just for these small but very common allocations. Before: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 11.65user 0.38system 0:12.65elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+42934minor)pagefaults 0swaps 59124 After: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 12.28user 0.29system 0:12.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+26718minor)pagefaults 0swaps 59124 Note how the minor fault numbers - which ends up being how many pages we needed to map - go down from 42934 (167 MB) to 26718 (104 MB). That is: Before: 42934 minor pagefaults After: 26718 minor pagefaults This is all in _addition_ to the previous fixes. It used to be ~48,000 pagefaults. That's still a honking big memory footprint, but it's about half of what it was just a day or two ago (and this is the object list for a pretty big update - almost 60,000 objects. Smaller updates need less memory). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-16 23:32:48 +02:00			`free(entry);`
			`entry = next;`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`}`
			`return p;`
			`}`

Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`static struct object_list *pending_objects = NULL;`

git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`static void show_commit_list(struct commit_list *list)`
			`{`
Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`struct object_list *objects = NULL, **p = &objects, *pending;`
git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`while (list) {`
			`struct commit *commit = pop_most_recent_commit(&list, SEEN);`

Ooh. Make git-rev-list --object associate a name with objects. The name isn't unique, it's just the first name that object is reached through, so it's really nothing more than a hint. 2005-06-27 00:26:05 +02:00			`p = process_tree(commit->tree, p, "");`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`if (process_commit(commit) == STOP)`
git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`break;`
			`}`
Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`for (pending = pending_objects; pending; pending = pending->next) {`
			`struct object *obj = pending->item;`
			`const char *name = pending->name;`
			`if (obj->flags & (UNINTERESTING | SEEN))`
			`continue;`
			`if (obj->type == tag_type) {`
			`obj->flags |= SEEN;`
			`p = add_object(obj, p, name);`
			`continue;`
			`}`
			`if (obj->type == tree_type) {`
			`p = process_tree((struct tree *)obj, p, name);`
			`continue;`
			`}`
			`if (obj->type == blob_type) {`
			`p = process_blob((struct blob *)obj, p, name);`
			`continue;`
			`}`
			`die("unknown pending object %s (%s)", sha1_to_hex(obj->sha1), name);`
			`}`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`while (objects) {`
Fix minor DOS in rev-list. A carefully crafted pathname can be used to disrupt downstream git-pack-objects that uses 'git-rev-list --objects' output. Prevent this. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-03 02:29:21 +02:00			`/* An object with name "foo\n0000000000000000000000000000000000000000"`
			`* can be used to confuse downstream git-pack-objects very badly.`
			`*/`
			`const char *ep = strchr(objects->name, '\n');`
			`if (ep) {`
			`printf("%s %.*s\n", sha1_to_hex(objects->item->sha1),`
			`(int) (ep - objects->name),`
			`objects->name);`
			`}`
			`else`
			`printf("%s %s\n", sha1_to_hex(objects->item->sha1), objects->name);`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`objects = objects->next;`
			`}`
			`}`
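The output loop above guards against object names that contain a newline by printing only the part of the name before the first `'\n'`, using the `%.*s` precision form of `printf`. Here is a minimal sketch of that truncation, written to a buffer so the result can be inspected. `format_object_line` is a hypothetical helper and the hex string is passed in as plain text, whereas the real code calls `sha1_to_hex()`.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format "<hex> <name>\n" into buf, cutting the name at its first
 * newline so a crafted name cannot inject an extra output line.
 * Returns whatever snprintf() returns.
 */
static int format_object_line(char *buf, size_t size,
			      const char *hex, const char *name)
{
	const char *ep = strchr(name, '\n');

	if (ep)
		/* Precision limits the name to the bytes before the newline. */
		return snprintf(buf, size, "%s %.*s\n",
				hex, (int)(ep - name), name);
	return snprintf(buf, size, "%s %s\n", hex, name);
}
```

With a name such as `"foo\n0000"`, only `foo` reaches the output line, so a downstream reader that parses one object per line never sees the injected second line.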

			`static void mark_blob_uninteresting(struct blob *blob)`
			`{`
			`if (!blob_objects)`
			`return;`
			`if (blob->object.flags & UNINTERESTING)`
			`return;`
			`blob->object.flags |= UNINTERESTING;`
			`}`

			`static void mark_tree_uninteresting(struct tree *tree)`
			`{`
			`struct object *obj = &tree->object;`
			`struct tree_entry_list *entry;`

			`if (!tree_objects)`
			`return;`
			`if (obj->flags & UNINTERESTING)`
			`return;`
			`obj->flags |= UNINTERESTING;`
git-rev-list: allow missing objects when the parent is marked UNINTERESTING We still want the "top-most" uninteresting object to exist, so that we know that we have reached it. 2005-07-11 00:09:46 +02:00			`if (!has_sha1_file(obj->sha1))`
			`return;`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`if (parse_tree(tree) < 0)`
			`die("bad tree %s", sha1_to_hex(obj->sha1));`
			`entry = tree->entries;`
[PATCH] Improve git-rev-list memory usage further This avoids keeping tree entries around, and free's them as it traverses the list. This avoids building up a huge memory footprint just for these small but very common allocations. Before: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 11.65user 0.38system 0:12.65elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+42934minor)pagefaults 0swaps 59124 After: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 12.28user 0.29system 0:12.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+26718minor)pagefaults 0swaps 59124 Note how the minor fault numbers - which ends up being how many pages we needed to map - go down from 42934 (167 MB) to 26718 (104 MB). That is: Before: 42934 minor pagefaults After: 26718 minor pagefaults This is all in _addition_ to the previous fixes. It used to be ~48,000 pagefaults. That's still a honking big memory footprint, but it's about half of what it was just a day or two ago (and this is the object list for a pretty big update - almost 60,000 objects. Smaller updates need less memory). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-16 23:32:48 +02:00			`tree->entries = NULL;`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`while (entry) {`
[PATCH] Improve git-rev-list memory usage further This avoids keeping tree entries around, and free's them as it traverses the list. This avoids building up a huge memory footprint just for these small but very common allocations. Before: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 11.65user 0.38system 0:12.65elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+42934minor)pagefaults 0swaps 59124 After: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 12.28user 0.29system 0:12.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+26718minor)pagefaults 0swaps 59124 Note how the minor fault numbers - which ends up being how many pages we needed to map - go down from 42934 (167 MB) to 26718 (104 MB). That is: Before: 42934 minor pagefaults After: 26718 minor pagefaults This is all in _addition_ to the previous fixes. It used to be ~48,000 pagefaults. That's still a honking big memory footprint, but it's about half of what it was just a day or two ago (and this is the object list for a pretty big update - almost 60,000 objects. Smaller updates need less memory). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-16 23:32:48 +02:00			`struct tree_entry_list *next = entry->next;`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`if (entry->directory)`
			`mark_tree_uninteresting(entry->item.tree);`
			`else`
			`mark_blob_uninteresting(entry->item.blob);`
[PATCH] Improve git-rev-list memory usage further This avoids keeping tree entries around, and free's them as it traverses the list. This avoids building up a huge memory footprint just for these small but very common allocations. Before: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 11.65user 0.38system 0:12.65elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+42934minor)pagefaults 0swaps 59124 After: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 12.28user 0.29system 0:12.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+26718minor)pagefaults 0swaps 59124 Note how the minor fault numbers - which ends up being how many pages we needed to map - go down from 42934 (167 MB) to 26718 (104 MB). That is: Before: 42934 minor pagefaults After: 26718 minor pagefaults This is all in _addition_ to the previous fixes. It used to be ~48,000 pagefaults. That's still a honking big memory footprint, but it's about half of what it was just a day or two ago (and this is the object list for a pretty big update - almost 60,000 objects. Smaller updates need less memory). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-16 23:32:48 +02:00			`free(entry);`
			`entry = next;`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`}`
git-rev-list: factor out the commit printing from "main()" Functions that do many things are bad. We should basically just parse the arguments in main(). We're not quite there yet, but it's a step in the right direction. 2005-06-02 18:19:53 +02:00			`}`
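The loop in `mark_tree_uninteresting()` frees each tree entry as soon as it has been visited, which is what the memory-usage commit describes. A minimal sketch of that free-while-traversing pattern follows, assuming a hypothetical `struct entry` list rather than git's `struct tree_entry_list`; the helper names are illustrative only.

```c
#include <stdlib.h>

/* Hypothetical list node, standing in for git's tree entry list. */
struct entry {
	int value;
	struct entry *next;
};

/* Prepend a node; returns the new head (or the old head if malloc fails). */
static struct entry *push_entry(struct entry *head, int value)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return head;
	e->value = value;
	e->next = head;
	return e;
}

/* Visit every node once and release it right away. */
static int consume_entries(struct entry *entry)
{
	int sum = 0;

	while (entry) {
		/* Save the successor before the node is freed. */
		struct entry *next = entry->next;

		sum += entry->value;
		free(entry);
		entry = next;
	}
	return sum;
}
```

The key step is reading `entry->next` before calling `free(entry)`. The original code also sets `tree->entries = NULL` first, so no stale pointer to the freed list remains on the tree.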

git-rev-list: use proper lazy reachability analysis This mean sthat you can give a beginning/end pair to git-rev-list, and it will show all entries that are reachable from the beginning but not the end. For example git-rev-list v2.6.12-rc5 v2.6.12-rc4 shows all commits that are in -rc5 but are not in -rc4. 2005-05-31 03:46:32 +02:00			`static void mark_parents_uninteresting(struct commit *commit)`
			`{`
			`struct commit_list *parents = commit->parents;`

			`while (parents) {`
			`struct commit *commit = parents->item;`
			`commit->object.flags |= UNINTERESTING;`
git-rev-list: allow missing objects when the parent is marked UNINTERESTING We still want the "top-most" uninteresting object to exist, so that we know that we have reached it. 2005-07-11 00:09:46 +02:00
[PATCH] Fix interesting git-rev-list corner case This corner-case was triggered by a kernel commit that was not in date order, due to a misconfigured time zone that made the commit appear three hours older than it was. That caused git-rev-list to traverse the commit tree in a non-obvious order, and made it parse several of the _parents_ of the misplaced commit before it actually parsed the commit itself. That's fine, but it meant that the grandparents of the commit didn't get marked uninteresting, because they had been reached through an "interesting" branch. The reason was that "mark_parents_uninteresting()" (which is supposed to mark all existing parents as being uninteresting - duh) didn't actually traverse more than one level down the parent chain. NORMALLY this is fine, since with the date-based traversal order, grandparents won't ever even have been looked at before their parents (so traversing the chain down isn't needed, because the next time around when we pick out the parent we'll mark _its_ parents uninteresting), but since we'd gotten out of order, we'd already seen the parent and thus never got around to mark the grandparents. Anyway, the fix is simple. Just traverse parent chains recursively. Normally the chain won't even exist (since the parent hasn't been parsed yet), so this is not actually going to trigger except in this strange corner-case. Add a comment to the simple one-liner, since this was a bit subtle, and I had to really think things through to understand how it could happen. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-30 00:50:30 +02:00			`/*`
			`* Normally we haven't parsed the parent`
			`* yet, so we won't have a parent of a parent`
			`* here. However, it may turn out that we've`
			`* reached this commit some other way (where it`
			`* wasn't uninteresting), in which case we need`
			`* to mark its parents recursively too..`
			`*/`
			`if (commit->parents)`
			`mark_parents_uninteresting(commit);`

git-rev-list: allow missing objects when the parent is marked UNINTERESTING We still want the "top-most" uninteresting object to exist, so that we know that we have reached it. 2005-07-11 00:09:46 +02:00			`/*`
			`* A missing commit is ok iff its parent is marked`
			`* uninteresting.`
			`*`
			`* We just mark such a thing parsed, so that when`
			`* it is popped next time around, we won't be trying`
			`* to parse it and get an error.`
			`*/`
			`if (!has_sha1_file(commit->object.sha1))`
			`commit->object.parsed = 1;`
git-rev-list: use proper lazy reachability analysis This mean sthat you can give a beginning/end pair to git-rev-list, and it will show all entries that are reachable from the beginning but not the end. For example git-rev-list v2.6.12-rc5 v2.6.12-rc4 shows all commits that are in -rc5 but are not in -rc4. 2005-05-31 03:46:32 +02:00			`parents = parents->next;`
			`}`
			`}`

Be more aggressive about marking trees uninteresting We'll mark all the trees at the edges (as deep as we had to go to realize that we have all the commits needed) as uninteresting. Otherwise we'll occasionally list a lot of objects that were actually available at the edge in a commit that we just never ended up parsing because we could determine early that we had all relevant commits. NOTE! The object listing is still just a _heuristic_. It's guaranteed to list a superset of the actual new objects, but there might be the occasional old object in the list, just because the commit that referenced it was much further back in the history. For example, let's say that a recent commit is a revert of part of the tree to much older state: since we didn't walk _that_ far back in the commit history tree to list the commits necessary, git-rev-tree will never have marked the old objects uninteresting, and we'll end up listing them as "new". That's ok. 2005-07-23 19:01:49 +02:00			`static int everybody_uninteresting(struct commit_list *orig)`
git-rev-list: use proper lazy reachability analysis This mean sthat you can give a beginning/end pair to git-rev-list, and it will show all entries that are reachable from the beginning but not the end. For example git-rev-list v2.6.12-rc5 v2.6.12-rc4 shows all commits that are in -rc5 but are not in -rc4. 2005-05-31 03:46:32 +02:00			`{`
Be more aggressive about marking trees uninteresting We'll mark all the trees at the edges (as deep as we had to go to realize that we have all the commits needed) as uninteresting. Otherwise we'll occasionally list a lot of objects that were actually available at the edge in a commit that we just never ended up parsing because we could determine early that we had all relevant commits. NOTE! The object listing is still just a _heuristic_. It's guaranteed to list a superset of the actual new objects, but there might be the occasional old object in the list, just because the commit that referenced it was much further back in the history. For example, let's say that a recent commit is a revert of part of the tree to much older state: since we didn't walk _that_ far back in the commit history tree to list the commits necessary, git-rev-tree will never have marked the old objects uninteresting, and we'll end up listing them as "new". That's ok. 2005-07-23 19:01:49 +02:00			`struct commit_list *list = orig;`
git-rev-list: use proper lazy reachability analysis This mean sthat you can give a beginning/end pair to git-rev-list, and it will show all entries that are reachable from the beginning but not the end. For example git-rev-list v2.6.12-rc5 v2.6.12-rc4 shows all commits that are in -rc5 but are not in -rc4. 2005-05-31 03:46:32 +02:00			`while (list) {`
			`struct commit *commit = list->item;`
			`list = list->next;`
			`if (commit->object.flags & UNINTERESTING)`
			`continue;`
			`return 0;`
			`}`
			`return 1;`
			`}`

git-rev-list: add "--bisect" flag to find the "halfway" point This is useful for doing binary searching for problems. You start with a known good and known bad point, and you then test the "halfway" point in between: git-rev-list --bisect bad ^good and you test that. If that one tests good, you now still have a known bad case, but two known good points, and you can bisect again: git-rev-list --bisect bad ^good1 ^good2 and test that point. If that point is bad, you now use that as your known-bad starting point: git-rev-list --bisect newbad ^good1 ^good2 and basically at every iteration you shrink your list of commits by half: you're binary searching for the point where the troubles started, even though there isn't a nice linear ordering. 2005-06-18 07:54:50 +02:00			`/*`
			`* This is a truly stupid algorithm, but it's only`
			`* used for bisection, and we just don't care enough.`
			`*`
			`* We care just barely enough to avoid recursing for`
			`* non-merge entries.`
			`*/`
			`static int count_distance(struct commit_list *entry)`
			`{`
			`int nr = 0;`

			`while (entry) {`
			`struct commit *commit = entry->item;`
			`struct commit_list *p;`

			`if (commit->object.flags & (UNINTERESTING | COUNTED))`
			`break;`
bisect: limit the searchspace by pathspecs It was surprisingly easy to do. git bisect start <pathspec> followed by all the normal "git bisect good/bad" stuff. Almost totally untested, and I guarantee that if your pathnames have spaces in them (or your GIT_DIR has spaces in it) this won't work. I don't know how to fix that, my shell programming isn't good enough. This involves small changes to make "git-rev-list --bisect" work in the presense of a pathspec limiter, and then truly trivial (and that's the broken part) changes to make "git bisect" save away and use the pathspec. I tried one bisection, and a "git bisect visualize", and it all looked correct. But hey, don't be surprised if it has problems. Linus Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-27 20:32:03 +01:00			`if (!paths || (commit->object.flags & TREECHANGE))`
			`nr++;`
git-rev-list: add "--bisect" flag to find the "halfway" point This is useful for doing binary searching for problems. You start with a known good and known bad point, and you then test the "halfway" point in between: git-rev-list --bisect bad ^good and you test that. If that one tests good, you now still have a known bad case, but two known good points, and you can bisect again: git-rev-list --bisect bad ^good1 ^good2 and test that point. If that point is bad, you now use that as your known-bad starting point: git-rev-list --bisect newbad ^good1 ^good2 and basically at every iteration you shrink your list of commits by half: you're binary searching for the point where the troubles started, even though there isn't a nice linear ordering. 2005-06-18 07:54:50 +02:00			`commit->object.flags |= COUNTED;`
			`p = commit->parents;`
			`entry = p;`
			`if (p) {`
			`p = p->next;`
			`while (p) {`
			`nr += count_distance(p);`
			`p = p->next;`
			`}`
			`}`
			`}`
bisect: limit the searchspace by pathspecs It was surprisingly easy to do. git bisect start <pathspec> followed by all the normal "git bisect good/bad" stuff. Almost totally untested, and I guarantee that if your pathnames have spaces in them (or your GIT_DIR has spaces in it) this won't work. I don't know how to fix that, my shell programming isn't good enough. This involves small changes to make "git-rev-list --bisect" work in the presense of a pathspec limiter, and then truly trivial (and that's the broken part) changes to make "git bisect" save away and use the pathspec. I tried one bisection, and a "git bisect visualize", and it all looked correct. But hey, don't be surprised if it has problems. Linus Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-27 20:32:03 +01:00
git-rev-list: add "--bisect" flag to find the "halfway" point This is useful for doing binary searching for problems. You start with a known good and known bad point, and you then test the "halfway" point in between: git-rev-list --bisect bad ^good and you test that. If that one tests good, you now still have a known bad case, but two known good points, and you can bisect again: git-rev-list --bisect bad ^good1 ^good2 and test that point. If that point is bad, you now use that as your known-bad starting point: git-rev-list --bisect newbad ^good1 ^good2 and basically at every iteration you shrink your list of commits by half: you're binary searching for the point where the troubles started, even though there isn't a nice linear ordering. 2005-06-18 07:54:50 +02:00			`return nr;`
			`}`
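`count_distance()` follows the first parent in a loop and recurses only for the extra parents of a merge, using the `COUNTED` flag so that shared history is counted once. A minimal sketch of the same idea follows, assuming a hypothetical two-parent `struct node` instead of git's `struct commit` and leaving out the `UNINTERESTING` and pathspec checks.

```c
#include <stddef.h>

#define COUNTED 1u	/* illustrative flag value, not git's */

/* Hypothetical commit stand-in with at most two parents. */
struct node {
	unsigned flags;
	struct node *parent[2];
};

/* Count nodes reachable from n that have not been counted yet. */
static int count_reachable(struct node *n)
{
	int nr = 0;

	while (n) {
		if (n->flags & COUNTED)
			break;
		n->flags |= COUNTED;
		nr++;
		/* Recurse only for the second parent of a merge. */
		if (n->parent[1])
			nr += count_reachable(n->parent[1]);
		/* Follow the first parent iteratively. */
		n = n->parent[0];
	}
	return nr;
}
```

On a diamond-shaped history (a merge of two commits that share one root) the first call returns 4, because the shared root is counted once. A second call on the same merge returns 0 because the flags are still set, which is why the original pairs every `count_distance()` call with `clear_distance()`.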

Avoid warning about function without return. Strangely, this warning only shows up when not compiling with "-O2", which is why I didn't see it originally. 2005-06-19 05:02:49 +02:00			`static void clear_distance(struct commit_list *list)`
git-rev-list: add "--bisect" flag to find the "halfway" point This is useful for doing binary searching for problems. You start with a known good and known bad point, and you then test the "halfway" point in between: git-rev-list --bisect bad ^good and you test that. If that one tests good, you now still have a known bad case, but two known good points, and you can bisect again: git-rev-list --bisect bad ^good1 ^good2 and test that point. If that point is bad, you now use that as your known-bad starting point: git-rev-list --bisect newbad ^good1 ^good2 and basically at every iteration you shrink your list of commits by half: you're binary searching for the point where the troubles started, even though there isn't a nice linear ordering. 2005-06-18 07:54:50 +02:00			`{`
			`while (list) {`
			`struct commit *commit = list->item;`
			`commit->object.flags &= ~COUNTED;`
			`list = list->next;`
			`}`
			`}`

			`static struct commit_list *find_bisection(struct commit_list *list)`
			`{`
			`int nr, closest;`
			`struct commit_list *p, *best;`

			`nr = 0;`
			`p = list;`
			`while (p) {`
bisect: limit the searchspace by pathspecs It was surprisingly easy to do. git bisect start <pathspec> followed by all the normal "git bisect good/bad" stuff. Almost totally untested, and I guarantee that if your pathnames have spaces in them (or your GIT_DIR has spaces in it) this won't work. I don't know how to fix that, my shell programming isn't good enough. This involves small changes to make "git-rev-list --bisect" work in the presense of a pathspec limiter, and then truly trivial (and that's the broken part) changes to make "git bisect" save away and use the pathspec. I tried one bisection, and a "git bisect visualize", and it all looked correct. But hey, don't be surprised if it has problems. Linus Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-27 20:32:03 +01:00			`if (!paths || (p->item->object.flags & TREECHANGE))`
			`nr++;`
git-rev-list: add "--bisect" flag to find the "halfway" point This is useful for doing binary searching for problems. You start with a known good and known bad point, and you then test the "halfway" point in between: git-rev-list --bisect bad ^good and you test that. If that one tests good, you now still have a known bad case, but two known good points, and you can bisect again: git-rev-list --bisect bad ^good1 ^good2 and test that point. If that point is bad, you now use that as your known-bad starting point: git-rev-list --bisect newbad ^good1 ^good2 and basically at every iteration you shrink your list of commits by half: you're binary searching for the point where the troubles started, even though there isn't a nice linear ordering. 2005-06-18 07:54:50 +02:00			`p = p->next;`
			`}`
			`closest = 0;`
			`best = list;`

bisect: limit the searchspace by pathspecs It was surprisingly easy to do. git bisect start <pathspec> followed by all the normal "git bisect good/bad" stuff. Almost totally untested, and I guarantee that if your pathnames have spaces in them (or your GIT_DIR has spaces in it) this won't work. I don't know how to fix that, my shell programming isn't good enough. This involves small changes to make "git-rev-list --bisect" work in the presense of a pathspec limiter, and then truly trivial (and that's the broken part) changes to make "git bisect" save away and use the pathspec. I tried one bisection, and a "git bisect visualize", and it all looked correct. But hey, don't be surprised if it has problems. Linus Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-27 20:32:03 +01:00			`for (p = list; p; p = p->next) {`
			`int distance;`

			`if (paths && !(p->item->object.flags & TREECHANGE))`
			`continue;`

			`distance = count_distance(p);`
git-rev-list: add "--bisect" flag to find the "halfway" point This is useful for doing binary searching for problems. You start with a known good and known bad point, and you then test the "halfway" point in between: git-rev-list --bisect bad ^good and you test that. If that one tests good, you now still have a known bad case, but two known good points, and you can bisect again: git-rev-list --bisect bad ^good1 ^good2 and test that point. If that point is bad, you now use that as your known-bad starting point: git-rev-list --bisect newbad ^good1 ^good2 and basically at every iteration you shrink your list of commits by half: you're binary searching for the point where the troubles started, even though there isn't a nice linear ordering. 2005-06-18 07:54:50 +02:00			`clear_distance(list);`
			`if (nr - distance < distance)`
			`distance = nr - distance;`
			`if (distance > closest) {`
			`best = p;`
			`closest = distance;`
			`}`
			`}`
			`if (best)`
			`best->next = NULL;`
			`return best;`
			`}`
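The selection rule in `find_bisection()` keeps the candidate whose reachable count is closest to half of the total, scoring each one as `min(distance, nr - distance)`. A minimal sketch of that scoring step alone follows, assuming the per-commit counts are already available in a plain array. The function name `pick_bisection` and the array interface are hypothetical and replace the linked-list walk of the original.

```c
#include <stddef.h>

/*
 * reach[i] is assumed to hold the number of candidate commits reachable
 * from commit i (itself included); total is the size of the whole set.
 * Returns the index whose split is closest to half and half.
 */
static size_t pick_bisection(const int *reach, size_t n, int total)
{
	size_t i, best = 0;
	int closest = 0;

	for (i = 0; i < n; i++) {
		int distance = reach[i];

		/* Score a commit by the smaller side of the split. */
		if (total - distance < distance)
			distance = total - distance;
		if (distance > closest) {
			best = i;
			closest = distance;
		}
	}
	return best;
}
```

For a linear history of five commits with counts `{5, 4, 3, 2, 1}`, the scores are `0, 1, 2, 2, 1`, so the function returns index 2. The strict `>` comparison keeps the first of two equally good candidates, as in the original loop.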

[PATCH] Re-organize "git-rev-list --objects" logic The logic to calculate the full object list used to be very inter-twined with the logic that looked up the commits. For no good reason - it's actually a lot simpler to just do that logic as a separate pass. This improves performance a bit, and uses slightly less memory in my tests, but more importantly it makes the code simpler to work with and follow what it does. The performance win is less than I had hoped for, but I get: Before: [torvalds@g5 linux]$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 13.64user 0.42system 0:14.13elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+47947minor)pagefaults 0swaps 58945 After: [torvalds@g5 linux]$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 11.80user 0.36system 0:12.16elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+42684minor)pagefaults 0swaps 58945 ie it improved by 2 seconds, and took a 5000+ fewer pages (hey, that's 20MB out of 174MB to go). And got the same number of objects (in theory, the more expensive one might find some more shared objects to avoid. In practice it obviously doesn't). I know how to make it use _lots_ less memory, which will probably speed it up. But that's for another time, and I'd prefer to see this go in first. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-16 00:14:29 +02:00			`static void mark_edges_uninteresting(struct commit_list *list)`
			`{`
			`for ( ; list; list = list->next) {`
			`struct commit_list *parents = list->item->parents;`

			`for ( ; parents; parents = parents->next) {`
			`struct commit *commit = parents->item;`
			`if (commit->object.flags & UNINTERESTING)`
			`mark_tree_uninteresting(commit->tree);`
			`}`
			`}`
			`}`

rev-list: stop when the file disappears The one thing I've considered doing (I really should) is to add a "stop when you don't find the file" option to "git-rev-list". This patch does some of the work towards that: it removes the "parent" thing when the file disappears, so a "git annotate" could do do something like git-rev-list --remove-empty --parents HEAD -- "$filename" and it would get a good graph that stops when the filename disappears (it's not perfect though: it won't remove all the unintersting commits). It also simplifies the logic of finding tree differences a bit, at the cost of making it a tad less efficient. The old logic was two-phase: it would first simplify _only_ merges tree as it traversed the tree, and then simplify the linear parts of the remainder independently. That was pretty optimal from an efficiency standpoint because it avoids doing any comparisons that we can see are unnecessary, but it made it much harder to understand than it really needed to be. The new logic is a lot more straightforward, and compares the trees as it traverses the graph (ie everything is a single phase). That makes it much easier to stop graph traversal at any point where a file disappears. As an example, let's say that you have a git repository that has had a file called "A" some time in the past. That file gets renamed to B, and then gets renamed back again to A. The old "git-rev-list" would show two commits: the commit that renames B to A (because it changes A) _and_ as its parent the commit that renames A to B (because it changes A). With the new --remove-empty flag, git-rev-list will show just the commit that renames B to A as the "root" commit, and stop traversal there (because that's what you want for "annotate" - you want to stop there, and for every "root" commit you then separately see if it really is a new file, or if the paths history disappeared because it was renamed from some other file). 
With this patch, you should be able to basically do a "poor mans 'git annotate'" with a fairly simple loop: push("HEAD", "$filename") while (revision,filename = pop()) { for each i in $(git-rev-list --parents --remove-empty $revision -- "$filename") pseudo-parents($i) = git-rev-list parents for that line if (pseudo-parents($i) is non-empty) { show diff of $i against pseudo-parents continue } /* See if the _real_ parents of $i had a rename */ parent($i) = real-parent($i) if (find-rename in $parent($i)->$i) push $parent($i), "old-name" } which should be doable in perl or something (doing stacks in shell is just too painful to be worth it, so I'm not going to do this). Anybody want to try? Linus 2006-01-18 23:47:30 +01:00			`#define TREE_SAME 0`
			`#define TREE_NEW 1`
			`#define TREE_DIFFERENT 2`
			`static int tree_difference = TREE_SAME;`
Teach git-rev-list to follow just a specified set of files This is the first cut at a git-rev-list that knows to ignore commits that don't change a certain file (or set of files). NOTE! For now it only prunes _merge_ commits, and follows the parent where there are no differences in the set of files specified. In the long run, I'd like to make it re-write the straight-line history too, but for now the merge simplification is much more fundamentally important (the rewriting of straight-line history is largely a separate simplification phase, but the merge simplification needs to happen early if we want to optimize away unnecessary commit parsing). If all parents of a merge change some of the files, the merge is left as is, so the end result is in no way guaranteed to be a linear history, but it will often be a lot /more/ linear than the full tree, since it prunes out parents that didn't matter for that set of files. As an example from the current kernel: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9885 [torvalds@g5 linux]$ git-rev-list HEAD -- Makefile \| wc -l 4084 [torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb \| wc -l 5206 and you can also use 'gitk' to more visually see the pruning of the history tree, with something like gitk -- drivers/usb showing a simplified history that tries to follow the first parent in a merge that is the parent that fully defines drivers/usb/. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:25:09 +02:00
			`static void file_add_remove(struct diff_options *options,`
			`int addremove, unsigned mode,`
			`const unsigned char *sha1,`
			`const char *base, const char *path)`
			`{`
			`int diff = TREE_DIFFERENT;`

			`/*`
			`* Is it an add of a new file? It means that`
			`* the old tree didn't have it at all, so we`
			`* will turn "TREE_SAME" -> "TREE_NEW", but`
			`* leave any "TREE_DIFFERENT" alone (and if`
			`* it already was "TREE_NEW", we'll keep it`
			`* "TREE_NEW" of course).`
			`*/`
			`if (addremove == '+') {`
			`diff = tree_difference;`
			`if (diff != TREE_SAME)`
			`return;`
			`diff = TREE_NEW;`
			`}`
			`tree_difference = diff;`
			`}`
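
The comment in `file_add_remove` describes a small state machine over `TREE_SAME`, `TREE_NEW` and `TREE_DIFFERENT`. The following is a minimal standalone sketch of those transitions, not part of rev-list.c: `next_on_add_remove`, `next_on_change` and `simulate` are hypothetical helper names, and the global `tree_difference` is replaced by an explicit state argument so the rules can be exercised in isolation.

```c
#include <assert.h>

/* Same three states as the #defines in the listing. */
enum { TREE_SAME = 0, TREE_NEW = 1, TREE_DIFFERENT = 2 };

/*
 * Hypothetical pure version of the add/remove callback: takes the
 * current state and returns the next one instead of writing a global.
 */
static int next_on_add_remove(int state, int addremove)
{
	/* An add only upgrades TREE_SAME to TREE_NEW; other states stick. */
	if (addremove == '+')
		return state == TREE_SAME ? TREE_NEW : state;
	/* Any removal means the trees differ. */
	return TREE_DIFFERENT;
}

/* Hypothetical pure version of the change callback. */
static int next_on_change(int state)
{
	(void)state;	/* a modification always wins */
	return TREE_DIFFERENT;
}

/*
 * Fold a string of events into a final state: '+' is an add,
 * '-' a removal, 'M' a modification. Other characters are ignored.
 */
static int simulate(const char *events)
{
	int state = TREE_SAME;

	for (; *events; events++) {
		if (*events == '+' || *events == '-')
			state = next_on_add_remove(state, *events);
		else if (*events == 'M')
			state = next_on_change(state);
	}
	return state;
}
```

Under these assumptions, a diff that reports only additions ends in `TREE_NEW`, while a single removal or modification anywhere in the sequence ends in `TREE_DIFFERENT`, regardless of order.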

			`static void file_change(struct diff_options *options,`
			`unsigned old_mode, unsigned new_mode,`
			`const unsigned char *old_sha1,`
			`const unsigned char *new_sha1,`
			`const char *base, const char *path)`
			`{`
			`tree_difference = TREE_DIFFERENT;`
			`}`

			`static struct diff_options diff_opt = {`
			`.recursive = 1,`
			`.add_remove = file_add_remove,`
			`.change = file_change,`
			`};`

			`static int compare_tree(struct tree *t1, struct tree *t2)`
git-rev-list: add "--dense" flag This is what the recent git-rev-list changes have all been gearing up for. When we use a path filter to git-rev-list, the new "--dense" flag asks git-rev-list to compress the history so that it _only_ contains commits that change files in the path filter. It also rewrites the parent information so that tools like "gitk" will see the result as a dense history tree. For example, on the current kernel archive: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9904 [torvalds@g5 linux]$ git-rev-list HEAD -- kernel \| wc -l 5442 [torvalds@g5 linux]$ git-rev-list --dense HEAD -- kernel \| wc -l 356 which shows that while we have almost ten thousand commits, we can prune down the work to slightly more than half by only following the merges that are interesting. But further, we can then compress the history to just 356 entries that actually make changes to the kernel subdirectory. To see this in action, try something like gitk --dense -- gitk to see just the history that affects gitk. Or, to show that true parallel development still remains parallel, do gitk --dense -- daemon.c which shows some parallel commits in the current git tree. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-22 01:40:54 +02:00			`{`
			`if (!t1)`
			`return TREE_NEW;`
			`if (!t2)`
			`return TREE_DIFFERENT;`
			`tree_difference = TREE_SAME;`
			`if (diff_tree_sha1(t1->object.sha1, t2->object.sha1, "", &diff_opt) < 0)`
			`return TREE_DIFFERENT;`
			`return tree_difference;`
			`}`

git-rev-list: fix "--dense" flag Right now --dense will _always_ show the root commit. I didn't do the logic that does the diff against an empty tree. I was lazy. This patch does that. The first round was incorrect but this patch is even slightly tested, and might do a better job. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-25 20:50:46 +02:00			`static int same_tree_as_empty(struct tree *t1)`
			`{`
			`int retval;`
			`void *tree;`
			`struct tree_desc empty, real;`

			`if (!t1)`
			`return 0;`

			`tree = read_object_with_reference(t1->object.sha1, "tree", &real.size, NULL);`
			`if (!tree)`
			`return 0;`
			`real.buf = tree;`

			`empty.buf = "";`
			`empty.size = 0;`

			`tree_difference = 0;`
			`retval = diff_tree(&empty, &real, "", &diff_opt);`
			`free(tree);`

			`return retval >= 0 && !tree_difference;`
			`}`

			`static void try_to_simplify_commit(struct commit *commit)`
			`{`
			`struct commit_list **pp, *parent;`

			`if (!commit->tree)`
			`return;`

			`if (!commit->parents) {`
			`if (!same_tree_as_empty(commit->tree))`
			`commit->object.flags |= TREECHANGE;`
			`return;`
			`}`

			`pp = &commit->parents;`
			`while ((parent = *pp) != NULL) {`
			`struct commit *p = parent->item;`

			`if (p->object.flags & UNINTERESTING) {`
			`pp = &parent->next;`
			`continue;`
			`}`

			`parse_commit(p);`
			`switch (compare_tree(p->tree, commit->tree)) {`
			`case TREE_SAME:`
			`parent->next = NULL;`
			`commit->parents = parent;`
			`return;`

			`case TREE_NEW:`
			`if (remove_empty_trees && same_tree_as_empty(p->tree)) {`
			`*pp = parent->next;`
			`continue;`
			`}`
			`/* fallthrough */`
			`case TREE_DIFFERENT:`
			`pp = &parent->next;`
			`continue;`
			`}`
			`die("bad tree compare for commit %s", sha1_to_hex(commit->object.sha1));`
			`}`
			`commit->object.flags |= TREECHANGE;`
			`}`
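
The parent-pruning loop in `try_to_simplify_commit` relies on a pointer-to-pointer walk over a singly linked list. The following is a minimal sketch of that idiom only, not git's code: `struct node`, `enum cmp`, and `simplify` are hypothetical names, and the precomputed `result` field stands in for the outcome of `compare_tree()` on each parent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three tree comparison outcomes. */
enum cmp { SAME, NEW_EMPTY, DIFFERENT };

struct node {
	enum cmp result;    /* pretend result of comparing this parent's tree */
	int id;
	struct node *next;
};

/*
 * Simplify a parent list in place:
 *  - a SAME parent becomes the only parent and the walk stops;
 *  - a NEW_EMPTY parent is unlinked from the list;
 *  - a DIFFERENT parent is kept.
 * Returns 1 if no SAME parent was found (the caller would then
 * flag the commit as changing the paths), 0 otherwise.
 */
static int simplify(struct node **head)
{
	struct node **pp, *parent;

	assert(head != NULL);
	pp = head;
	while ((parent = *pp) != NULL) {
		switch (parent->result) {
		case SAME:
			parent->next = NULL;
			*head = parent;
			return 0;
		case NEW_EMPTY:
			/* pp still points at the link to patch, so no "prev" is needed */
			*pp = parent->next;
			continue;
		case DIFFERENT:
			pp = &parent->next;
			continue;
		}
	}
	return 1;
}
```

Because `pp` always holds the address of the link that leads to the current element, unlinking works the same way for the first parent as for any later one, which is presumably why the listing avoids a separate "previous" pointer.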

			`static void add_parents_to_list(struct commit *commit, struct commit_list **list)`
			`{`
			`struct commit_list *parent = commit->parents;`

			`/*`
			`* If the commit is uninteresting, don't try to`
			`* prune parents - we want the maximal uninteresting`
			`* set.`
			`*`
			`* Normally we haven't parsed the parent`
			`* yet, so we won't have a parent of a parent`
			`* here. However, it may turn out that we've`
			`* reached this commit some other way (where it`
			`* wasn't uninteresting), in which case we need`
			`* to mark its parents recursively too..`
			`*/`
			`if (commit->object.flags & UNINTERESTING) {`
			`while (parent) {`
			`struct commit *p = parent->item;`
			`parent = parent->next;`
			`parse_commit(p);`
			`p->object.flags |= UNINTERESTING;`
			`if (p->parents)`
			`mark_parents_uninteresting(p);`
			`if (p->object.flags & SEEN)`
			`continue;`
			`p->object.flags |= SEEN;`
			`insert_by_date(p, list);`
			`}`
			`return;`
			`}`

			`/*`
			`* Ok, the commit wasn't uninteresting. Try to`
			`* simplify the commit history and find the parent`
			`* that has no differences in the path set if one exists.`
Teach git-rev-list to follow just a specified set of files This is the first cut at a git-rev-list that knows to ignore commits that don't change a certain file (or set of files). NOTE! For now it only prunes _merge_ commits, and follows the parent where there are no differences in the set of files specified. In the long run, I'd like to make it re-write the straight-line history too, but for now the merge simplification is much more fundamentally important (the rewriting of straight-line history is largely a separate simplification phase, but the merge simplification needs to happen early if we want to optimize away unnecessary commit parsing). If all parents of a merge change some of the files, the merge is left as is, so the end result is in no way guaranteed to be a linear history, but it will often be a lot /more/ linear than the full tree, since it prunes out parents that didn't matter for that set of files. As an example from the current kernel: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9885 [torvalds@g5 linux]$ git-rev-list HEAD -- Makefile \| wc -l 4084 [torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb \| wc -l 5206 and you can also use 'gitk' to more visually see the pruning of the history tree, with something like gitk -- drivers/usb showing a simplified history that tries to follow the first parent in a merge that is the parent that fully defines drivers/usb/. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:25:09 +02:00			`*/`
rev-list: stop when the file disappears The one thing I've considered doing (I really should) is to add a "stop when you don't find the file" option to "git-rev-list". This patch does some of the work towards that: it removes the "parent" thing when the file disappears, so a "git annotate" could do do something like git-rev-list --remove-empty --parents HEAD -- "$filename" and it would get a good graph that stops when the filename disappears (it's not perfect though: it won't remove all the unintersting commits). It also simplifies the logic of finding tree differences a bit, at the cost of making it a tad less efficient. The old logic was two-phase: it would first simplify _only_ merges tree as it traversed the tree, and then simplify the linear parts of the remainder independently. That was pretty optimal from an efficiency standpoint because it avoids doing any comparisons that we can see are unnecessary, but it made it much harder to understand than it really needed to be. The new logic is a lot more straightforward, and compares the trees as it traverses the graph (ie everything is a single phase). That makes it much easier to stop graph traversal at any point where a file disappears. As an example, let's say that you have a git repository that has had a file called "A" some time in the past. That file gets renamed to B, and then gets renamed back again to A. The old "git-rev-list" would show two commits: the commit that renames B to A (because it changes A) _and_ as its parent the commit that renames A to B (because it changes A). With the new --remove-empty flag, git-rev-list will show just the commit that renames B to A as the "root" commit, and stop traversal there (because that's what you want for "annotate" - you want to stop there, and for every "root" commit you then separately see if it really is a new file, or if the paths history disappeared because it was renamed from some other file). 
With this patch, you should be able to basically do a "poor mans 'git annotate'" with a fairly simple loop: push("HEAD", "$filename") while (revision,filename = pop()) { for each i in $(git-rev-list --parents --remove-empty $revision -- "$filename") pseudo-parents($i) = git-rev-list parents for that line if (pseudo-parents($i) is non-empty) { show diff of $i against pseudo-parents continue } /* See if the _real_ parents of $i had a rename */ parent($i) = real-parent($i) if (find-rename in $parent($i)->$i) push $parent($i), "old-name" } which should be doable in perl or something (doing stacks in shell is just too painful to be worth it, so I'm not going to do this). Anybody want to try? Linus 2006-01-18 23:47:30 +01:00			`if (paths)`
			`try_to_simplify_commit(commit);`
Teach git-rev-list to follow just a specified set of files This is the first cut at a git-rev-list that knows to ignore commits that don't change a certain file (or set of files). NOTE! For now it only prunes _merge_ commits, and follows the parent where there are no differences in the set of files specified. In the long run, I'd like to make it re-write the straight-line history too, but for now the merge simplification is much more fundamentally important (the rewriting of straight-line history is largely a separate simplification phase, but the merge simplification needs to happen early if we want to optimize away unnecessary commit parsing). If all parents of a merge change some of the files, the merge is left as is, so the end result is in no way guaranteed to be a linear history, but it will often be a lot /more/ linear than the full tree, since it prunes out parents that didn't matter for that set of files. As an example from the current kernel: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9885 [torvalds@g5 linux]$ git-rev-list HEAD -- Makefile \| wc -l 4084 [torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb \| wc -l 5206 and you can also use 'gitk' to more visually see the pruning of the history tree, with something like gitk -- drivers/usb showing a simplified history that tries to follow the first parent in a merge that is the parent that fully defines drivers/usb/. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:25:09 +02:00
rev-list: stop when the file disappears The one thing I've considered doing (I really should) is to add a "stop when you don't find the file" option to "git-rev-list". This patch does some of the work towards that: it removes the "parent" thing when the file disappears, so a "git annotate" could do do something like git-rev-list --remove-empty --parents HEAD -- "$filename" and it would get a good graph that stops when the filename disappears (it's not perfect though: it won't remove all the unintersting commits). It also simplifies the logic of finding tree differences a bit, at the cost of making it a tad less efficient. The old logic was two-phase: it would first simplify _only_ merges tree as it traversed the tree, and then simplify the linear parts of the remainder independently. That was pretty optimal from an efficiency standpoint because it avoids doing any comparisons that we can see are unnecessary, but it made it much harder to understand than it really needed to be. The new logic is a lot more straightforward, and compares the trees as it traverses the graph (ie everything is a single phase). That makes it much easier to stop graph traversal at any point where a file disappears. As an example, let's say that you have a git repository that has had a file called "A" some time in the past. That file gets renamed to B, and then gets renamed back again to A. The old "git-rev-list" would show two commits: the commit that renames B to A (because it changes A) _and_ as its parent the commit that renames A to B (because it changes A). With the new --remove-empty flag, git-rev-list will show just the commit that renames B to A as the "root" commit, and stop traversal there (because that's what you want for "annotate" - you want to stop there, and for every "root" commit you then separately see if it really is a new file, or if the paths history disappeared because it was renamed from some other file). 
With this patch, you should be able to basically do a "poor mans 'git annotate'" with a fairly simple loop: push("HEAD", "$filename") while (revision,filename = pop()) { for each i in $(git-rev-list --parents --remove-empty $revision -- "$filename") pseudo-parents($i) = git-rev-list parents for that line if (pseudo-parents($i) is non-empty) { show diff of $i against pseudo-parents continue } /* See if the _real_ parents of $i had a rename */ parent($i) = real-parent($i) if (find-rename in $parent($i)->$i) push $parent($i), "old-name" } which should be doable in perl or something (doing stacks in shell is just too painful to be worth it, so I'm not going to do this). Anybody want to try? Linus 2006-01-18 23:47:30 +01:00			`parent = commit->parents;`
Teach git-rev-list to follow just a specified set of files This is the first cut at a git-rev-list that knows to ignore commits that don't change a certain file (or set of files). NOTE! For now it only prunes _merge_ commits, and follows the parent where there are no differences in the set of files specified. In the long run, I'd like to make it re-write the straight-line history too, but for now the merge simplification is much more fundamentally important (the rewriting of straight-line history is largely a separate simplification phase, but the merge simplification needs to happen early if we want to optimize away unnecessary commit parsing). If all parents of a merge change some of the files, the merge is left as is, so the end result is in no way guaranteed to be a linear history, but it will often be a lot /more/ linear than the full tree, since it prunes out parents that didn't matter for that set of files. As an example from the current kernel: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9885 [torvalds@g5 linux]$ git-rev-list HEAD -- Makefile \| wc -l 4084 [torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb \| wc -l 5206 and you can also use 'gitk' to more visually see the pruning of the history tree, with something like gitk -- drivers/usb showing a simplified history that tries to follow the first parent in a merge that is the parent that fully defines drivers/usb/. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:25:09 +02:00			`while (parent) {`
			`struct commit *p = parent->item;`

			`parent = parent->next;`

			`parse_commit(p);`
			`if (p->object.flags & SEEN)`
			`continue;`
			`p->object.flags |= SEEN;`
			`insert_by_date(p, list);`
			`}`
			`}`
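
The parent walk above queues each parent at most once (guarded by `SEEN`) and keeps the work list ordered by commit date. A minimal sketch of that pattern follows, assuming simplified stand-in types: `toy_commit`, `toy_list`, `toy_insert_by_date` and `toy_queue_once` are hypothetical names for illustration and are not the definitions git uses.

```c
#include <assert.h>
#include <stdlib.h>

#define SEEN 1u /* hypothetical flag bit, mirroring the listing's SEEN */

/* Simplified stand-ins for the listing's commit and commit_list. */
struct toy_commit {
	unsigned long date;
	unsigned int flags;
};

struct toy_list {
	struct toy_commit *item;
	struct toy_list *next;
};

/* Insert item so that the list stays sorted newest-first. */
static void toy_insert_by_date(struct toy_commit *item, struct toy_list **list)
{
	struct toy_list **pp = list;
	struct toy_list *node;

	assert(item != NULL);

	/* Skip every entry that is strictly newer than the new item. */
	while (*pp && (*pp)->item->date > item->date)
		pp = &(*pp)->next;

	node = malloc(sizeof(*node));
	if (!node)
		abort();
	node->item = item;
	node->next = *pp;
	*pp = node;
}

/* Queue a commit only once: returns 1 if queued, 0 if it was already SEEN. */
static int toy_queue_once(struct toy_commit *c, struct toy_list **list)
{
	if (c->flags & SEEN)
		return 0;
	c->flags |= SEEN;
	toy_insert_by_date(c, list);
	return 1;
}
```

Keeping the flag on the commit itself, rather than searching the list, makes the duplicate check constant-time, which matters when a merge-heavy history reaches the same ancestor through many paths.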

Fix sparse warnings. Mainly making a lot of local functions and variables be marked "static", but there was a "zero as NULL" warning in there too. 2005-07-03 19:10:45 +02:00			`static struct commit_list *limit_list(struct commit_list *list)`
git-rev-list: split out commit limiting from main() too. Ok, now I'm happier. 2005-06-02 18:25:44 +02:00			`{`
			`struct commit_list *newlist = NULL;`
			`struct commit_list **p = &newlist;`
Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`while (list) {`
Teach git-rev-list to follow just a specified set of files This is the first cut at a git-rev-list that knows to ignore commits that don't change a certain file (or set of files). NOTE! For now it only prunes _merge_ commits, and follows the parent where there are no differences in the set of files specified. In the long run, I'd like to make it re-write the straight-line history too, but for now the merge simplification is much more fundamentally important (the rewriting of straight-line history is largely a separate simplification phase, but the merge simplification needs to happen early if we want to optimize away unnecessary commit parsing). If all parents of a merge change some of the files, the merge is left as is, so the end result is in no way guaranteed to be a linear history, but it will often be a lot /more/ linear than the full tree, since it prunes out parents that didn't matter for that set of files. As an example from the current kernel: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9885 [torvalds@g5 linux]$ git-rev-list HEAD -- Makefile \| wc -l 4084 [torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb \| wc -l 5206 and you can also use 'gitk' to more visually see the pruning of the history tree, with something like gitk -- drivers/usb showing a simplified history that tries to follow the first parent in a merge that is the parent that fully defines drivers/usb/. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:25:09 +02:00			`struct commit_list *entry = list;`
			`struct commit *commit = list->item;`
git-rev-list: split out commit limiting from main() too. Ok, now I'm happier. 2005-06-02 18:25:44 +02:00			`struct object *obj = &commit->object;`

Teach git-rev-list to follow just a specified set of files This is the first cut at a git-rev-list that knows to ignore commits that don't change a certain file (or set of files). NOTE! For now it only prunes _merge_ commits, and follows the parent where there are no differences in the set of files specified. In the long run, I'd like to make it re-write the straight-line history too, but for now the merge simplification is much more fundamentally important (the rewriting of straight-line history is largely a separate simplification phase, but the merge simplification needs to happen early if we want to optimize away unnecessary commit parsing). If all parents of a merge change some of the files, the merge is left as is, so the end result is in no way guaranteed to be a linear history, but it will often be a lot /more/ linear than the full tree, since it prunes out parents that didn't matter for that set of files. As an example from the current kernel: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9885 [torvalds@g5 linux]$ git-rev-list HEAD -- Makefile \| wc -l 4084 [torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb \| wc -l 5206 and you can also use 'gitk' to more visually see the pruning of the history tree, with something like gitk -- drivers/usb showing a simplified history that tries to follow the first parent in a merge that is the parent that fully defines drivers/usb/. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:25:09 +02:00			`list = list->next;`
			`free(entry);`

Make time-based commit filtering work with topological ordering. The trick is to consider the time-based filtering a limiter, the same way we do for release ranges. That means that the time-based filtering runs _before_ the topological sorting, which makes it meaningful again. It also simplifies the code logic. This makes "gitk" useful with time ranges. [ Second version: --merge-order now unaffected by the re-org ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 02:55:46 +02:00			`if (max_age != -1 && (commit->date < max_age))`
			`obj->flags |= UNINTERESTING;`
"git rev-list --unpacked" shows only unpacked commits More infrastructure to do efficient incremental packs. 2005-07-03 22:29:54 +02:00			`if (unpacked && has_sha1_pack(obj->sha1))`
			`obj->flags |= UNINTERESTING;`
Teach git-rev-list to follow just a specified set of files This is the first cut at a git-rev-list that knows to ignore commits that don't change a certain file (or set of files). NOTE! For now it only prunes _merge_ commits, and follows the parent where there are no differences in the set of files specified. In the long run, I'd like to make it re-write the straight-line history too, but for now the merge simplification is much more fundamentally important (the rewriting of straight-line history is largely a separate simplification phase, but the merge simplification needs to happen early if we want to optimize away unnecessary commit parsing). If all parents of a merge change some of the files, the merge is left as is, so the end result is in no way guaranteed to be a linear history, but it will often be a lot /more/ linear than the full tree, since it prunes out parents that didn't matter for that set of files. As an example from the current kernel: [torvalds@g5 linux]$ git-rev-list HEAD \| wc -l 9885 [torvalds@g5 linux]$ git-rev-list HEAD -- Makefile \| wc -l 4084 [torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb \| wc -l 5206 and you can also use 'gitk' to more visually see the pruning of the history tree, with something like gitk -- drivers/usb showing a simplified history that tries to follow the first parent in a merge that is the parent that fully defines drivers/usb/. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:25:09 +02:00			`add_parents_to_list(commit, &list);`
git-rev-list: allow arbitrary head selections, use git-rev-tree syntax This makes git-rev-list use the same command line syntax to mark the commits as git-rev-tree does, and instead of just allowing a start and end commit, it allows an arbitrary list of "interesting" and "uninteresting" commits. For example, imagine that you had three branches (a, b and c) that you are interested in, but you don't want to see stuff that already exists in another persons three releases (x, y and z). You can do git-rev-list a b c ^x ^y ^z (order doesn't matter, btw - feel free to put the uninteresting ones first or otherwise swithc them around), and it will show all the commits that are reachable from a/b/c but not reachable from x/y/z. The old syntax "git-rev-list start end" would not be written as "git-rev-list start ^end", or "git-rev-list ^end start". There's no limit to the number of heads you can specify (unlike git-rev-tree, which can handle a maximum of 16 heads). 2005-06-04 23:38:28 +02:00			`if (obj->flags & UNINTERESTING) {`
git-rev-list: split out commit limiting from main() too. Ok, now I'm happier. 2005-06-02 18:25:44 +02:00			`mark_parents_uninteresting(commit);`
			`if (everybody_uninteresting(list))`
			`break;`
			`continue;`
			`}`
Make time-based commit filtering work with topological ordering. The trick is to consider the time-based filtering a limiter, the same way we do for release ranges. That means that the time-based filtering runs _before_ the topological sorting, which makes it meaningful again. It also simplifies the code logic. This makes "gitk" useful with time ranges. [ Second version: --merge-order now unaffected by the re-org ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 02:55:46 +02:00			`if (min_age != -1 && (commit->date > min_age))`
			`continue;`
git-rev-list: split out commit limiting from main() too. Ok, now I'm happier. 2005-06-02 18:25:44 +02:00			`p = &commit_list_insert(commit, p)->next;`
Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`}`
[PATCH] Re-organize "git-rev-list --objects" logic The logic to calculate the full object list used to be very inter-twined with the logic that looked up the commits. For no good reason - it's actually a lot simpler to just do that logic as a separate pass. This improves performance a bit, and uses slightly less memory in my tests, but more importantly it makes the code simpler to work with and follow what it does. The performance win is less than I had hoped for, but I get: Before: [torvalds@g5 linux]$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 13.64user 0.42system 0:14.13elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+47947minor)pagefaults 0swaps 58945 After: [torvalds@g5 linux]$ /usr/bin/time git-rev-list --objects v2.6.12..HEAD \| wc -l 11.80user 0.36system 0:12.16elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+42684minor)pagefaults 0swaps 58945 ie it improved by 2 seconds, and took a 5000+ fewer pages (hey, that's 20MB out of 174MB to go). And got the same number of objects (in theory, the more expensive one might find some more shared objects to avoid. In practice it obviously doesn't). I know how to make it use _lots_ less memory, which will probably speed it up. But that's for another time, and I'd prefer to see this go in first. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-16 00:14:29 +02:00			`if (tree_objects)`
			`mark_edges_uninteresting(newlist);`
git-rev-list: add "--bisect" flag to find the "halfway" point This is useful for doing binary searching for problems. You start with a known good and known bad point, and you then test the "halfway" point in between: git-rev-list --bisect bad ^good and you test that. If that one tests good, you now still have a known bad case, but two known good points, and you can bisect again: git-rev-list --bisect bad ^good1 ^good2 and test that point. If that point is bad, you now use that as your known-bad starting point: git-rev-list --bisect newbad ^good1 ^good2 and basically at every iteration you shrink your list of commits by half: you're binary searching for the point where the troubles started, even though there isn't a nice linear ordering. 2005-06-18 07:54:50 +02:00			`if (bisect_list)`
			`newlist = find_bisection(newlist);`
git-rev-list: split out commit limiting from main() too. Ok, now I'm happier. 2005-06-02 18:25:44 +02:00			`return newlist;`
			`}`
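
`limit_list()` consumes its input list and builds `newlist` in the same order by keeping a pointer to the tail's `next` slot (`p = &commit_list_insert(commit, p)->next`). The sketch below isolates that idiom together with a single date cutoff comparable to the `min_age` test. It is an illustration under simplified assumptions: `toy_node`, `toy_insert` and `toy_limit` are hypothetical names, and the real function additionally propagates `UNINTERESTING` to parents and can stop early.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_node {
	unsigned long date;
	struct toy_node *next;
};

/* Push a new node in front of *list_p and return it (the role commit_list_insert plays above). */
static struct toy_node *toy_insert(unsigned long date, struct toy_node **list_p)
{
	struct toy_node *node = malloc(sizeof(*node));

	if (!node)
		abort();
	node->date = date;
	node->next = *list_p;
	*list_p = node;
	return node;
}

/*
 * Consume `list` and return a new list holding only the entries whose
 * date is not newer than `newest_allowed`, in their original order.
 */
static struct toy_node *toy_limit(struct toy_node *list, unsigned long newest_allowed)
{
	struct toy_node *newlist = NULL;
	struct toy_node **tail = &newlist; /* always points at the final NULL link */

	while (list) {
		struct toy_node *entry = list;
		unsigned long date = entry->date;

		list = list->next;
		free(entry);

		if (date > newest_allowed)
			continue; /* too new: drop it */
		/* Inserting at the tail slot appends; then advance the slot. */
		tail = &toy_insert(date, tail)->next;
	}
	assert(*tail == NULL);
	return newlist;
}
```

Because `tail` always addresses the last `next` pointer, appending needs no special case for the empty list and no second pass to reverse the result.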

Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`static void add_pending_object(struct object *obj, const char *name)`
			`{`
			`add_object(obj, &pending_objects, name);`
			`}`

git-rev-list: do not forget non-commit refs What happens is that the new logic decides that if it can't look up a commit reference (ie "get_commit_reference()" returns NULL), the thing must be a pathname. Fair enough. But wrong. The thing is, it may be a perfectly fine ref that _isn't_ a commit. In git, you have a tag that points to your PGP key, and in the kernel, I have a tag that points to a tree (and a direct ref that points to that tree too, for that matter). So the rule is (as for all the other programs that mix revs and pathnames) not that we only accept commit references, but _any_ valid object ref. If the object then isn't a commit ref, git-rev-list will either ignore it, or add it to the list of non-commit objects (if using "--objects"). The solution is to move the "get_sha1()" out of get_commit_reference(), and into the callers. In fact, we already _have_ the SHA1 in the case of the handle_all() loop, since for_each_ref() will have done it for us, so this is the correct thing to do anyway. This patch (on top of the original one) does exactly that. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-26 17:18:13 +02:00			`static struct commit *get_commit_reference(const char *name, const unsigned char *sha1, unsigned int flags)`
Prepare git-rev-list for tracking tag objects too We want to be able to just say "give a difference between these objects", rather than limiting it to commits only. This isn't there yet, but it sets things up to be a bit easier. 2005-06-29 19:40:14 +02:00			`{`
Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`struct object *object;`
Prepare git-rev-list for tracking tag objects too We want to be able to just say "give a difference between these objects", rather than limiting it to commits only. This isn't there yet, but it sets things up to be a bit easier. 2005-06-29 19:40:14 +02:00
Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`object = parse_object(sha1);`
			`if (!object)`
			`die("bad object %s", name);`

			`/*`
			`* Tag object? Look what it points to..`
			`*/`
[PATCH] Dereference tag repeatedly until we get a non-tag. When we allow a tag object in place of a commit object, we only dereferenced the given tag once, which causes a tag that points at a tag that points at a commit to be rejected. Instead, dereference tag repeatedly until we get a non-tag. This patch makes change to two functions: - commit.c::lookup_commit_reference() is used by merge-base, rev-tree and rev-parse to convert user supplied SHA1 to that of a commit. - rev-list uses its own get_commit_reference() to do the same. Dereferencing tags this way helps both of these uses. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-11 08:55:56 +02:00			`while (object->type == tag_type) {`
Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`struct tag *tag = (struct tag *) object;`
			`object->flags |= flags;`
			`if (tag_objects && !(object->flags & UNINTERESTING))`
			`add_pending_object(object, tag->tag);`
[PATCH] Dereference tag repeatedly until we get a non-tag. When we allow a tag object in place of a commit object, we only dereferenced the given tag once, which causes a tag that points at a tag that points at a commit to be rejected. Instead, dereference tag repeatedly until we get a non-tag. This patch makes change to two functions: - commit.c::lookup_commit_reference() is used by merge-base, rev-tree and rev-parse to convert user supplied SHA1 to that of a commit. - rev-list uses its own get_commit_reference() to do the same. Dereferencing tags this way helps both of these uses. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-11 08:55:56 +02:00			`object = parse_object(tag->tagged->sha1);`
[PATCH] git-rev-list: avoid crash on broken repository When following tags, check for parse_object() success and error out properly instead of segfaulting. Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-19 20:28:35 +02:00			`if (!object)`
			`die("bad object %s", sha1_to_hex(tag->tagged->sha1));`
Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`}`

			`/*`
			`* Commit object? Just return it, we'll do all the complex`
			`* reachability crud.`
			`*/`
			`if (object->type == commit_type) {`
			`struct commit *commit = (struct commit *)object;`
			`object->flags |= flags;`
			`if (parse_commit(commit) < 0)`
			`die("unable to parse commit %s", name);`
git-rev-list: allow missing objects when the parent is marked UNINTERESTING We still want the "top-most" uninteresting object to exist, so that we know that we have reached it. 2005-07-11 00:09:46 +02:00			`if (flags & UNINTERESTING)`
			`mark_parents_uninteresting(commit);`
Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`return commit;`
			`}`

			`/*`
			`* Tree object? Either mark it uninteresting, or add it`
			`* to the list of objects to look at later..`
			`*/`
			`if (object->type == tree_type) {`
			`struct tree *tree = (struct tree *)object;`
			`if (!tree_objects)`
Add "--all" flag to rev-parse that shows all refs And make git-rev-list just silently ignore non-commit refs if we're not asking for all objects. 2005-07-03 22:07:52 +02:00			`return NULL;`
Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`if (flags & UNINTERESTING) {`
			`mark_tree_uninteresting(tree);`
			`return NULL;`
			`}`
			`add_pending_object(object, "");`
			`return NULL;`
			`}`

			`/*`
			`* Blob object? You know the drill by now..`
			`*/`
			`if (object->type == blob_type) {`
			`struct blob *blob = (struct blob *)object;`
			`if (!blob_objects)`
Add "--all" flag to rev-parse that shows all refs And make git-rev-list just silently ignore non-commit refs if we're not asking for all objects. 2005-07-03 22:07:52 +02:00			`return NULL;`
Teach git-rev-list about non-commit objects Now you can give git-rev-list tags, trees and blobs, and it will do the proper reachability for them all. Knock wood. Of course, you need the "--objects" flag to do anything but plain commits. 2005-06-29 20:30:24 +02:00			`if (flags & UNINTERESTING) {`
			`mark_blob_uninteresting(blob);`
			`return NULL;`
			`}`
			`add_pending_object(object, "");`
			`return NULL;`
			`}`
			`die("%s is unknown object", name);`
Prepare git-rev-list for tracking tag objects too We want to be able to just say "give a difference between these objects", rather than limiting it to commits only. This isn't there yet, but it sets things up to be a bit easier. 2005-06-29 19:40:14 +02:00			`}`
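
The tag loop in `get_commit_reference()` keeps following `tag->tagged` until it reaches something that is not a tag, so a tag pointing at a tag pointing at a commit still resolves. A minimal sketch of that peeling step follows, assuming a toy object model: `toy_object`, `toy_type` and `toy_peel` are hypothetical names, and where the real code calls `die()` on a missing target this sketch returns `NULL` so the caller can decide.

```c
#include <assert.h>
#include <stddef.h>

enum toy_type { TOY_COMMIT, TOY_TREE, TOY_BLOB, TOY_TAG };

struct toy_object {
	enum toy_type type;
	unsigned int flags;
	struct toy_object *tagged; /* target; only meaningful for TOY_TAG */
};

/*
 * Follow tag -> target links until a non-tag object is reached.
 * Every tag visited on the way gets `flags` OR-ed in; the final
 * object is returned untouched. Returns NULL if a tag has no target.
 */
static struct toy_object *toy_peel(struct toy_object *obj, unsigned int flags)
{
	while (obj && obj->type == TOY_TAG) {
		obj->flags |= flags;
		obj = obj->tagged;
	}
	return obj;
}
```

After peeling, the caller can dispatch on the resulting type the way the listing does for commits, trees and blobs.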

Teach rev-list since..til notation. The King Penguin says: Now, for extra bonus points, maybe you should make "git-rev-list" also understand the "rev..rev" format (which you can't do with just the get_sha1() interface, since it expands into more). The faithful servant makes it so. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-04 11:31:15 +02:00			`static void handle_one_commit(struct commit *com, struct commit_list **lst)`
			`{`
			`if (!com || com->object.flags & SEEN)`
			`return;`
			`com->object.flags \|= SEEN;`
			`commit_list_insert(com, lst);`
			`}`

upload-pack: Do not choke on too many heads request. Cloning from a repository with more than 256 refs (heads and tags included) will choke, because upload-pack has a built-in limit of feeding not more than MAX_NEEDS (currently 256) heads to underlying git-rev-list. This is a problem when cloning a repository with many tags, like http://www.linux-mips.org/pub/scm/linux.git, which has 290+ tags. This commit introduces a new flag, --all, to git-rev-list, to include all refs in the repository. Updated upload-pack detects requests that ask more than MAX_NEEDS refs, and sends everything back instead. We may probably want to tweak the definitions of MAX_NEEDS and MAX_HAS, but that is a separate topic. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-05 23:49:54 +02:00			`/* for_each_ref() callback does not allow user data -- Yuck. */`
			`static struct commit_list **global_lst;`

			`static int include_one_commit(const char *path, const unsigned char *sha1)`
			`{`
git-rev-list: do not forget non-commit refs What happens is that the new logic decides that if it can't look up a commit reference (ie "get_commit_reference()" returns NULL), the thing must be a pathname. Fair enough. But wrong. The thing is, it may be a perfectly fine ref that _isn't_ a commit. In git, you have a tag that points to your PGP key, and in the kernel, I have a tag that points to a tree (and a direct ref that points to that tree too, for that matter). So the rule is (as for all the other programs that mix revs and pathnames) not that we only accept commit references, but _any_ valid object ref. If the object then isn't a commit ref, git-rev-list will either ignore it, or add it to the list of non-commit objects (if using "--objects"). The solution is to move the "get_sha1()" out of get_commit_reference(), and into the callers. In fact, we already _have_ the SHA1 in the case of the handle_all() loop, since for_each_ref() will have done it for us, so this is the correct thing to do anyway. This patch (on top of the original one) does exactly that. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-26 17:18:13 +02:00			`struct commit *com = get_commit_reference(path, sha1, 0);`
upload-pack: Do not choke on too many heads request. Cloning from a repository with more than 256 refs (heads and tags included) will choke, because upload-pack has a built-in limit of feeding not more than MAX_NEEDS (currently 256) heads to underlying git-rev-list. This is a problem when cloning a repository with many tags, like http://www.linux-mips.org/pub/scm/linux.git, which has 290+ tags. This commit introduces a new flag, --all, to git-rev-list, to include all refs in the repository. Updated upload-pack detects requests that ask more than MAX_NEEDS refs, and sends everything back instead. We may probably want to tweak the definitions of MAX_NEEDS and MAX_HAS, but that is a separate topic. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-05 23:49:54 +02:00			`handle_one_commit(com, global_lst);`
			`return 0;`
			`}`

			`static void handle_all(struct commit_list **lst)`
			`{`
			`global_lst = lst;`
			`for_each_ref(include_one_commit);`
			`global_lst = NULL;`
			`}`
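`handle_all()` works around a callback signature that carries no user-data pointer by parking the destination in a file-scope variable for the duration of the walk. The sketch below shows the same pattern in isolation; `demo_for_each_ref()`, its callback type, and the ref names are all hypothetical and are not git's real `for_each_ref()` interface.

```c
#include <stddef.h>

/* Hypothetical callback type with no user-data argument. */
typedef int (*demo_ref_fn)(const char *name);

/* Hypothetical fixed set of refs to iterate over. */
static const char *const demo_refs[] = {
	"refs/heads/master",
	"refs/tags/v1.0",
};

/* Call fn for each ref; stop early if fn returns non-zero. */
static int demo_for_each_ref(demo_ref_fn fn)
{
	size_t i;
	for (i = 0; i < sizeof(demo_refs) / sizeof(demo_refs[0]); i++) {
		int ret = fn(demo_refs[i]);
		if (ret)
			return ret;
	}
	return 0;
}

/* File-scope slot that smuggles state into the callback. */
static size_t *demo_counter;

static int demo_count_one(const char *name)
{
	(void)name;
	(*demo_counter)++;
	return 0;
}

/* Set the slot, run the walk, then clear the slot again. */
static size_t demo_count_refs(void)
{
	size_t n = 0;
	demo_counter = &n;
	demo_for_each_ref(demo_count_one);
	demo_counter = NULL;
	return n;
}
```

Resetting the pointer to `NULL` afterwards, as `handle_all()` does with `global_lst`, makes a stray later call fail loudly instead of writing through a stale pointer. The pattern is neither reentrant nor thread-safe, which is presumably what the "Yuck" in the original comment refers to.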
Teach rev-list since..til notation. The King Penguin says: Now, for extra bonus points, maybe you should make "git-rev-list" also understand the "rev..rev" format (which you can't do with just the get_sha1() interface, since it expands into more). The faithful servant makes it so. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-04 11:31:15 +02:00
Teach git-rev-list to follow just a specified set of files This is the first cut at a git-rev-list that knows to ignore commits that don't change a certain file (or set of files). NOTE! For now it only prunes _merge_ commits, and follows the parent where there are no differences in the set of files specified. In the long run, I'd like to make it re-write the straight-line history too, but for now the merge simplification is much more fundamentally important (the rewriting of straight-line history is largely a separate simplification phase, but the merge simplification needs to happen early if we want to optimize away unnecessary commit parsing). If all parents of a merge change some of the files, the merge is left as is, so the end result is in no way guaranteed to be a linear history, but it will often be a lot /more/ linear than the full tree, since it prunes out parents that didn't matter for that set of files. As an example from the current kernel: [torvalds@g5 linux]$ git-rev-list HEAD | wc -l 9885 [torvalds@g5 linux]$ git-rev-list HEAD -- Makefile | wc -l 4084 [torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb | wc -l 5206 and you can also use 'gitk' to more visually see the pruning of the history tree, with something like gitk -- drivers/usb showing a simplified history that tries to follow the first parent in a merge that is the parent that fully defines drivers/usb/. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:25:09 +02:00			`int main(int argc, const char **argv)`
Add "rev-list" program that uses the new time-based commit listing. This is probably what you'd want to see for "git log". 2005-04-24 04:04:40 +02:00			`{`
Teach git-rev-list to follow just a specified set of files This is the first cut at a git-rev-list that knows to ignore commits that don't change a certain file (or set of files). NOTE! For now it only prunes _merge_ commits, and follows the parent where there are no differences in the set of files specified. In the long run, I'd like to make it re-write the straight-line history too, but for now the merge simplification is much more fundamentally important (the rewriting of straight-line history is largely a separate simplification phase, but the merge simplification needs to happen early if we want to optimize away unnecessary commit parsing). If all parents of a merge change some of the files, the merge is left as is, so the end result is in no way guaranteed to be a linear history, but it will often be a lot /more/ linear than the full tree, since it prunes out parents that didn't matter for that set of files. As an example from the current kernel: [torvalds@g5 linux]$ git-rev-list HEAD | wc -l 9885 [torvalds@g5 linux]$ git-rev-list HEAD -- Makefile | wc -l 4084 [torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb | wc -l 5206 and you can also use 'gitk' to more visually see the pruning of the history tree, with something like gitk -- drivers/usb showing a simplified history that tries to follow the first parent in a merge that is the parent that fully defines drivers/usb/. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:25:09 +02:00			`const char *prefix = setup_git_directory();`
Add "rev-list" program that uses the new time-based commit listing. This is probably what you'd want to see for "git log". 2005-04-24 04:04:40 +02:00			`struct commit_list *list = NULL;`
git-rev-list: allow arbitrary head selections, use git-rev-tree syntax This makes git-rev-list use the same command line syntax to mark the commits as git-rev-tree does, and instead of just allowing a start and end commit, it allows an arbitrary list of "interesting" and "uninteresting" commits. For example, imagine that you had three branches (a, b and c) that you are interested in, but you don't want to see stuff that already exists in another persons three releases (x, y and z). You can do git-rev-list a b c ^x ^y ^z (order doesn't matter, btw - feel free to put the uninteresting ones first or otherwise swithc them around), and it will show all the commits that are reachable from a/b/c but not reachable from x/y/z. The old syntax "git-rev-list start end" would not be written as "git-rev-list start ^end", or "git-rev-list ^end start". There's no limit to the number of heads you can specify (unlike git-rev-tree, which can handle a maximum of 16 heads). 2005-06-04 23:38:28 +02:00			`int i, limited = 0;`
Add "rev-list" program that uses the new time-based commit listing. This is probably what you'd want to see for "git log". 2005-04-24 04:04:40 +02:00
[PATCH] control/limit output of git-rev-list gitweb.cgi's default view is the log of the last day and git-rev-list can stop crawling the whole repo if we have all our data to display in the browser. Also the rss-feed query needs only the last 20 items. This will speeds up these queries dramatically. usage: rev-list [OPTION] commit-id --max-count=nr --max-age=epoch --min-age=epoch Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-06 10:00:11 +02:00			`for (i = 1 ; i < argc; i++) {`
git-rev-list: allow arbitrary head selections, use git-rev-tree syntax This makes git-rev-list use the same command line syntax to mark the commits as git-rev-tree does, and instead of just allowing a start and end commit, it allows an arbitrary list of "interesting" and "uninteresting" commits. For example, imagine that you had three branches (a, b and c) that you are interested in, but you don't want to see stuff that already exists in another persons three releases (x, y and z). You can do git-rev-list a b c ^x ^y ^z (order doesn't matter, btw - feel free to put the uninteresting ones first or otherwise swithc them around), and it will show all the commits that are reachable from a/b/c but not reachable from x/y/z. The old syntax "git-rev-list start end" would not be written as "git-rev-list start ^end", or "git-rev-list ^end start". There's no limit to the number of heads you can specify (unlike git-rev-tree, which can handle a maximum of 16 heads). 2005-06-04 23:38:28 +02:00			`int flags;`
Teach git-rev-list to follow just a specified set of files This is the first cut at a git-rev-list that knows to ignore commits that don't change a certain file (or set of files). NOTE! For now it only prunes _merge_ commits, and follows the parent where there are no differences in the set of files specified. In the long run, I'd like to make it re-write the straight-line history too, but for now the merge simplification is much more fundamentally important (the rewriting of straight-line history is largely a separate simplification phase, but the merge simplification needs to happen early if we want to optimize away unnecessary commit parsing). If all parents of a merge change some of the files, the merge is left as is, so the end result is in no way guaranteed to be a linear history, but it will often be a lot /more/ linear than the full tree, since it prunes out parents that didn't matter for that set of files. As an example from the current kernel: [torvalds@g5 linux]$ git-rev-list HEAD | wc -l 9885 [torvalds@g5 linux]$ git-rev-list HEAD -- Makefile | wc -l 4084 [torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb | wc -l 5206 and you can also use 'gitk' to more visually see the pruning of the history tree, with something like gitk -- drivers/usb showing a simplified history that tries to follow the first parent in a merge that is the parent that fully defines drivers/usb/. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:25:09 +02:00			`const char *arg = argv[i];`
Teach rev-list since..til notation. The King Penguin says: Now, for extra bonus points, maybe you should make "git-rev-list" also understand the "rev..rev" format (which you can't do with just the get_sha1() interface, since it expands into more). The faithful servant makes it so. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-04 11:31:15 +02:00			`char *dotdot;`
git-rev-list: allow arbitrary head selections, use git-rev-tree syntax This makes git-rev-list use the same command line syntax to mark the commits as git-rev-tree does, and instead of just allowing a start and end commit, it allows an arbitrary list of "interesting" and "uninteresting" commits. For example, imagine that you had three branches (a, b and c) that you are interested in, but you don't want to see stuff that already exists in another persons three releases (x, y and z). You can do git-rev-list a b c ^x ^y ^z (order doesn't matter, btw - feel free to put the uninteresting ones first or otherwise swithc them around), and it will show all the commits that are reachable from a/b/c but not reachable from x/y/z. The old syntax "git-rev-list start end" would not be written as "git-rev-list start ^end", or "git-rev-list ^end start". There's no limit to the number of heads you can specify (unlike git-rev-tree, which can handle a maximum of 16 heads). 2005-06-04 23:38:28 +02:00			`struct commit *commit;`
git-rev-list: do not forget non-commit refs What happens is that the new logic decides that if it can't look up a commit reference (ie "get_commit_reference()" returns NULL), the thing must be a pathname. Fair enough. But wrong. The thing is, it may be a perfectly fine ref that _isn't_ a commit. In git, you have a tag that points to your PGP key, and in the kernel, I have a tag that points to a tree (and a direct ref that points to that tree too, for that matter). So the rule is (as for all the other programs that mix revs and pathnames) not that we only accept commit references, but _any_ valid object ref. If the object then isn't a commit ref, git-rev-list will either ignore it, or add it to the list of non-commit objects (if using "--objects"). The solution is to move the "get_sha1()" out of get_commit_reference(), and into the callers. In fact, we already _have_ the SHA1 in the case of the handle_all() loop, since for_each_ref() will have done it for us, so this is the correct thing to do anyway. This patch (on top of the original one) does exactly that. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-26 17:18:13 +02:00			`unsigned char sha1[20];`
[PATCH] control/limit output of git-rev-list gitweb.cgi's default view is the log of the last day and git-rev-list can stop crawling the whole repo if we have all our data to display in the browser. Also the rss-feed query needs only the last 20 items. This will speeds up these queries dramatically. usage: rev-list [OPTION] commit-id --max-count=nr --max-age=epoch --min-age=epoch Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-06 10:00:11 +02:00
rev-list: allow -<n> as shorthand for --max-count=<n> This builds on top of the previous one. Traditionally, head(1) and tail(1) allow their line limits to be parsed this way. Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-30 01:28:02 +01:00			`/* accept -<digit>, like traditional "head" */`
			`if ((*arg == '-') && isdigit(arg[1])) {`
			`max_count = atoi(arg + 1);`
			`continue;`
			`}`
rev-list: allow -n<n> as shorthand for --max-count=<n> Both -n<n> and -n <n> are supported. POSIX versions of head(1) and tail(1) allow their line limits to be parsed this way. I find --max-count to be a commonly used option, and also similar in spirit to head/tail, so I decided to make life easier on my worn out (and lazy :) fingers with this patch. Signed-off-by: Eric Wong <normalperson@yhbt.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-30 01:26:40 +01:00			`if (!strcmp(arg, "-n")) {`
			`if (++i >= argc)`
			`die("-n requires an argument");`
			`max_count = atoi(argv[i]);`
			`continue;`
			`}`
			`if (!strncmp(arg,"-n",2)) {`
			`max_count = atoi(arg + 2);`
			`continue;`
			`}`
[PATCH] control/limit output of git-rev-list gitweb.cgi's default view is the log of the last day and git-rev-list can stop crawling the whole repo if we have all our data to display in the browser. Also the rss-feed query needs only the last 20 items. This will speeds up these queries dramatically. usage: rev-list [OPTION] commit-id --max-count=nr --max-age=epoch --min-age=epoch Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-06 10:00:11 +02:00			`if (!strncmp(arg, "--max-count=", 12)) {`
			`max_count = atoi(arg + 12);`
git-rev-list: add "end" commit and "--header" flag The "end" commit is just faking it right now, it's sorting things purely by date, so this is _not_ a reachability analysis. Some day. The "--header" flag causes the commit message to be printed out, with a NUL character separator after it for parseability. This allows you to do things like use "grep -z" to grep for certain authors etc. 2005-05-26 03:29:09 +02:00			`continue;`
			`}`
			`if (!strncmp(arg, "--max-age=", 10)) {`
[PATCH] control/limit output of git-rev-list gitweb.cgi's default view is the log of the last day and git-rev-list can stop crawling the whole repo if we have all our data to display in the browser. Also the rss-feed query needs only the last 20 items. This will speeds up these queries dramatically. usage: rev-list [OPTION] commit-id --max-count=nr --max-age=epoch --min-age=epoch Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-06 10:00:11 +02:00			`max_age = atoi(arg + 10);`
Make time-based commit filtering work with topological ordering. The trick is to consider the time-based filtering a limiter, the same way we do for release ranges. That means that the time-based filtering runs _before_ the topological sorting, which makes it meaningful again. It also simplifies the code logic. This makes "gitk" useful with time ranges. [ Second version: --merge-order now unaffected by the re-org ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 02:55:46 +02:00			`limited = 1;`
git-rev-list: add "end" commit and "--header" flag The "end" commit is just faking it right now, it's sorting things purely by date, so this is _not_ a reachability analysis. Some day. The "--header" flag causes the commit message to be printed out, with a NUL character separator after it for parseability. This allows you to do things like use "grep -z" to grep for certain authors etc. 2005-05-26 03:29:09 +02:00			`continue;`
			`}`
			`if (!strncmp(arg, "--min-age=", 10)) {`
[PATCH] control/limit output of git-rev-list gitweb.cgi's default view is the log of the last day and git-rev-list can stop crawling the whole repo if we have all our data to display in the browser. Also the rss-feed query needs only the last 20 items. This will speeds up these queries dramatically. usage: rev-list [OPTION] commit-id --max-count=nr --max-age=epoch --min-age=epoch Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-06 10:00:11 +02:00			`min_age = atoi(arg + 10);`
Make time-based commit filtering work with topological ordering. The trick is to consider the time-based filtering a limiter, the same way we do for release ranges. That means that the time-based filtering runs _before_ the topological sorting, which makes it meaningful again. It also simplifies the code logic. This makes "gitk" useful with time ranges. [ Second version: --merge-order now unaffected by the re-org ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 02:55:46 +02:00			`limited = 1;`
git-rev-list: add "end" commit and "--header" flag The "end" commit is just faking it right now, it's sorting things purely by date, so this is _not_ a reachability analysis. Some day. The "--header" flag causes the commit message to be printed out, with a NUL character separator after it for parseability. This allows you to do things like use "grep -z" to grep for certain authors etc. 2005-05-26 03:29:09 +02:00			`continue;`
[PATCH] control/limit output of git-rev-list gitweb.cgi's default view is the log of the last day and git-rev-list can stop crawling the whole repo if we have all our data to display in the browser. Also the rss-feed query needs only the last 20 items. This will speeds up these queries dramatically. usage: rev-list [OPTION] commit-id --max-count=nr --max-age=epoch --min-age=epoch Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-06 10:00:11 +02:00			`}`
git-rev-list: add "end" commit and "--header" flag The "end" commit is just faking it right now, it's sorting things purely by date, so this is _not_ a reachability analysis. Some day. The "--header" flag causes the commit message to be printed out, with a NUL character separator after it for parseability. This allows you to do things like use "grep -z" to grep for certain authors etc. 2005-05-26 03:29:09 +02:00			`if (!strcmp(arg, "--header")) {`
			`verbose_header = 1;`
			`continue;`
			`}`
rev-list: default to abbreviate merge parent names under --pretty. When we prettyprint commit log messages, merge parent names were often very long and there was no way to abbreviate it. This changes them to be abbreviated by default, and non-default abbreviations can be specified with --no-abbrev or --abbrev=<n> options. Note that this affects only the prettyprinted parent names. The output from --show-parents is meant for machine consumption and is not affected by this flag. 2006-02-10 20:56:42 +01:00			`if (!strcmp(arg, "--no-abbrev")) {`
			`abbrev = 0;`
			`continue;`
			`}`
			`if (!strncmp(arg, "--abbrev=", 9)) {`
			`abbrev = strtoul(arg + 9, NULL, 10);`
			`if (abbrev && abbrev < MINIMUM_ABBREV)`
			`abbrev = MINIMUM_ABBREV;`
			`else if (40 < abbrev)`
			`abbrev = 40;`
			`continue;`
			`}`
pretty_print_commit: add different formats You can ask to print out "raw" format (full headers, full body), "medium" format (author and date, full body) or "short" format (author only, condensed body). Use "git-rev-list --pretty=short HEAD | less -S" for an example. 2005-06-05 18:02:03 +02:00			`if (!strncmp(arg, "--pretty", 8)) {`
			`commit_format = get_commit_format(arg+8);`
git-rev-list: add "--pretty" command line option That pretty-prints the resulting commit messages, so git-rev-list --pretty HEAD v2.6.12-rc5 | less -S basically ends up being a log of the changes between -rc5 and current head. It uses the pretty-printing helper function I just extracted from diff-tree.c. 2005-06-01 17:42:22 +02:00			`verbose_header = 1;`
			`hdr_termination = '\n';`
Introduce --pretty=oneline format. This introduces --pretty=oneline to git-rev-tree and git-rev-list commands to show only the first line of the commit message, without frills. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-09 07:15:40 +02:00			`if (commit_format == CMIT_FMT_ONELINE)`
[PATCH] Fix "prefix" mixup in git-rev-list Recent changes in git have broken cg-log. git-rev-list no longer prints "commit" in front of commit hashes. It turn out a local "prefix" variable in main() shadows a file-scoped "prefix" variable. The patch removed the local "prefix" variable since its value is never used (in the intended way, that is). The call to setup_git_directory() is kept since it has useful side effects. The file-scoped "prefix" variable is renamed to "commit_prefix" just in case someone reintroduces "prefix" to hold the return value of setup_git_directory(). Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-24 23:58:42 +02:00			`commit_prefix = "";`
Introduce --pretty=oneline format. This introduces --pretty=oneline to git-rev-tree and git-rev-list commands to show only the first line of the commit message, without frills. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-09 07:15:40 +02:00			`else`
[PATCH] Fix "prefix" mixup in git-rev-list Recent changes in git have broken cg-log. git-rev-list no longer prints "commit" in front of commit hashes. It turn out a local "prefix" variable in main() shadows a file-scoped "prefix" variable. The patch removed the local "prefix" variable since its value is never used (in the intended way, that is). The call to setup_git_directory() is kept since it has useful side effects. The file-scoped "prefix" variable is renamed to "commit_prefix" just in case someone reintroduces "prefix" to hold the return value of setup_git_directory(). Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-24 23:58:42 +02:00			`commit_prefix = "commit ";`
git-rev-list: add "--pretty" command line option That pretty-prints the resulting commit messages, so git-rev-list --pretty HEAD v2.6.12-rc5 \| less -S basically ends up being a log of the changes between -rc5 and current head. It uses the pretty-printing helper function I just extracted from diff-tree.c. 2005-06-01 17:42:22 +02:00			`continue;`
			`}`
[PATCH] add --no-merges flag to suppress display of merge commits As requested by Junio (who suggested --single-parents-only, but this could forget a no-parent root). Also, adds a few missing options to the usage string. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-08 11:37:21 +02:00			`if (!strncmp(arg, "--no-merges", 11)) {`
			`no_merges = 1;`
			`continue;`
			`}`
git-rev-list: add "--parents" command line flag It makes rev-list show the list of parents, the same way git-rev-tree does (but without the expense). 2005-05-31 04:30:07 +02:00			`if (!strcmp(arg, "--parents")) {`
			`show_parents = 1;`
			`continue;`
			`}`
git-rev-list: add "--bisect" flag to find the "halfway" point This is useful for doing binary searching for problems. You start with a known good and known bad point, and you then test the "halfway" point in between: git-rev-list --bisect bad ^good and you test that. If that one tests good, you now still have a known bad case, but two known good points, and you can bisect again: git-rev-list --bisect bad ^good1 ^good2 and test that point. If that point is bad, you now use that as your known-bad starting point: git-rev-list --bisect newbad ^good1 ^good2 and basically at every iteration you shrink your list of commits by half: you're binary searching for the point where the troubles started, even though there isn't a nice linear ordering. 2005-06-18 07:54:50 +02:00			`if (!strcmp(arg, "--bisect")) {`
			`bisect_list = 1;`
			`continue;`
			`}`
upload-pack: Do not choke on too many heads request. Cloning from a repository with more than 256 refs (heads and tags included) will choke, because upload-pack has a built-in limit of feeding not more than MAX_NEEDS (currently 256) heads to underlying git-rev-list. This is a problem when cloning a repository with many tags, like http://www.linux-mips.org/pub/scm/linux.git, which has 290+ tags. This commit introduces a new flag, --all, to git-rev-list, to include all refs in the repository. Updated upload-pack detects requests that ask more than MAX_NEEDS refs, and sends everything back instead. We may probably want to tweak the definitions of MAX_NEEDS and MAX_HAS, but that is a separate topic. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-05 23:49:54 +02:00			`if (!strcmp(arg, "--all")) {`
			`handle_all(&list);`
			`continue;`
			`}`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`if (!strcmp(arg, "--objects")) {`
Prepare git-rev-list for tracking tag objects too We want to be able to just say "give a difference between these objects", rather than limiting it to commits only. This isn't there yet, but it sets things up to be a bit easier. 2005-06-29 19:40:14 +02:00			`tag_objects = 1;`
git-rev-list: add option to list all objects (not just commits) When you do git-rev-list --objects $(git-rev-parse HEAD^..HEAD) it now lists not only the "commit difference" between the parent of HEAD and HEAD itself (which is normally just the parent, but in the case of a merge will be all the newly merged commits), but also all the new tree and blob objects that weren't in the original. NOTE! It doesn't walk all the way to the root, so it doesn't do a full object search in the full old history. Instead, it will only look as far back in the history as it needs to resolve the commits. Thus, if the commit reverts a blob (or tree) back to a state much further back in history, we may end up listing some blobs (or trees) as "new" even though they exist further back. Regardless, the list of objects will be a superset (usually exact) list of objects needed to go from the beginning commit to ending commit. As a particularly obvious special case, git-rev-list --objects HEAD will end up listing every single object that is reachable from the HEAD commit. Side note: the objects are sorted by "recency", with commits first. 2005-06-25 07:56:58 +02:00			`tree_objects = 1;`
			`blob_objects = 1;`
			`continue;`
			`}`
"git rev-list --unpacked" shows only unpacked commits More infrastructure to do efficient incremental packs. 2005-07-03 22:29:54 +02:00			`if (!strcmp(arg, "--unpacked")) {`
			`unpacked = 1;`
			`limited = 1;`
			`continue;`
			`}`
Remove unnecessary usage of strncmp() in git-rev-list arg parsing. Not only is it unnecessary, it incorrectly allows extraneous characters at the end of the argument. Junio noticed the --merge-order thing, and Jon points out that if we fix that one, we should fix --show-breaks too. 2005-07-05 21:12:50 +02:00			`if (!strcmp(arg, "--merge-order")) {`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`merge_order = 1;`
			`continue;`
			`}`
Remove unnecessary usage of strncmp() in git-rev-list arg parsing. Not only is it unnecessary, it incorrectly allows extraneous characters at the end of the argument. Junio noticed the --merge-order thing, and Jon points out that if we fix that one, we should fix --show-breaks too. 2005-07-05 21:12:50 +02:00			`if (!strcmp(arg, "--show-breaks")) {`
			`show_breaks = 1;`
			`continue;`
			`}`
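The strncmp() fix described in the annotation above comes down to prefix matching versus exact matching: a length-limited comparison accepts any argument that merely starts with the flag name. The following is a minimal sketch of that difference, not code from rev-list.c; the helper names `flag_matches_prefix` and `flag_matches_exact` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: prefix match, as a strncmp()-based parser would do. */
int flag_matches_prefix(const char *arg, const char *flag)
{
	assert(arg && flag);
	return strncmp(arg, flag, strlen(flag)) == 0;
}

/* Hypothetical helper: exact match, as the strcmp() calls above do. */
int flag_matches_exact(const char *arg, const char *flag)
{
	assert(arg && flag);
	return strcmp(arg, flag) == 0;
}
```

With these helpers, an argument such as `--show-breaksXYZ` passes the prefix check but is rejected by the exact check, which is the stricter behaviour the commit message asks for.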
Add "--topo-order" flag to use new topological sort 2005-07-06 19:25:04 +02:00			`if (!strcmp(arg, "--topo-order")) {`
			`topo_order = 1;`
Make sure we generate the whole commit list before trying to sort it topologically This was my cherry-picking merge bug. But topo-order still shows strange behaviour with multiple heads, so keep gitk using --merge-order for now. 2005-07-06 19:51:43 +02:00			`limited = 1;`
			`continue;`
			`}`
git-rev-list: add "--dense" flag This is what the recent git-rev-list changes have all been gearing up for. When we use a path filter to git-rev-list, the new "--dense" flag asks git-rev-list to compress the history so that it _only_ contains commits that change files in the path filter. It also rewrites the parent information so that tools like "gitk" will see the result as a dense history tree. For example, on the current kernel archive: [torvalds@g5 linux]$ git-rev-list HEAD | wc -l 9904 [torvalds@g5 linux]$ git-rev-list HEAD -- kernel | wc -l 5442 [torvalds@g5 linux]$ git-rev-list --dense HEAD -- kernel | wc -l 356 which shows that while we have almost ten thousand commits, we can prune down the work to slightly more than half by only following the merges that are interesting. But further, we can then compress the history to just 356 entries that actually make changes to the kernel subdirectory. To see this in action, try something like gitk --dense -- gitk to see just the history that affects gitk. Or, to show that true parallel development still remains parallel, do gitk --dense -- daemon.c which shows some parallel commits in the current git tree. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-22 01:40:54 +02:00			`if (!strcmp(arg, "--dense")) {`
			`dense = 1;`
			`continue;`
			`}`
git-rev-list: make --dense the default (and introduce "--sparse") This actually does three things: - make "--dense" the default for git-rev-list. Since dense is a no-op if no filenames are given, this doesn't actually change any historical behaviour, but it's logically the right default (if we want to prune on filenames, do it fully. The sparse "merge-only" thing may be useful, but it's not what you'd normally expect) - make "git-rev-parse" show the default revision control before it shows any pathnames. This was a real bug, but nobody would ever have noticed, because the default thing tends to only make sense for git-rev-list, and git-rev-list didn't use to take pathnames. - it changes "git-rev-list" to match the other commands that take a mix of revisions and filenames - it no longer requires the "--" before filenames (although you still need to do it if a filename could be confused with a revision name, eg "gitk" in the git archive) This all just makes for much more pleasant and obvious usage. Just doing a gitk t/ does the obvious thing: it will show the history as it concerns the "t/" subdirectory. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-26 00:24:55 +02:00			`if (!strcmp(arg, "--sparse")) {`
			`dense = 0;`
			`continue;`
			`}`
rev-list: stop when the file disappears The one thing I've considered doing (I really should) is to add a "stop when you don't find the file" option to "git-rev-list". This patch does some of the work towards that: it removes the "parent" thing when the file disappears, so a "git annotate" could do something like git-rev-list --remove-empty --parents HEAD -- "$filename" and it would get a good graph that stops when the filename disappears (it's not perfect though: it won't remove all the uninteresting commits). It also simplifies the logic of finding tree differences a bit, at the cost of making it a tad less efficient. The old logic was two-phase: it would first simplify _only_ merges tree as it traversed the tree, and then simplify the linear parts of the remainder independently. That was pretty optimal from an efficiency standpoint because it avoids doing any comparisons that we can see are unnecessary, but it made it much harder to understand than it really needed to be. The new logic is a lot more straightforward, and compares the trees as it traverses the graph (ie everything is a single phase). That makes it much easier to stop graph traversal at any point where a file disappears. As an example, let's say that you have a git repository that has had a file called "A" some time in the past. That file gets renamed to B, and then gets renamed back again to A. The old "git-rev-list" would show two commits: the commit that renames B to A (because it changes A) _and_ as its parent the commit that renames A to B (because it changes A). With the new --remove-empty flag, git-rev-list will show just the commit that renames B to A as the "root" commit, and stop traversal there (because that's what you want for "annotate" - you want to stop there, and for every "root" commit you then separately see if it really is a new file, or if the path's history disappeared because it was renamed from some other file). 
With this patch, you should be able to basically do a "poor man's 'git annotate'" with a fairly simple loop: push("HEAD", "$filename") while (revision,filename = pop()) { for each i in $(git-rev-list --parents --remove-empty $revision -- "$filename") pseudo-parents($i) = git-rev-list parents for that line if (pseudo-parents($i) is non-empty) { show diff of $i against pseudo-parents continue } /* See if the _real_ parents of $i had a rename */ parent($i) = real-parent($i) if (find-rename in $parent($i)->$i) push $parent($i), "old-name" } which should be doable in perl or something (doing stacks in shell is just too painful to be worth it, so I'm not going to do this). Anybody want to try? Linus 2006-01-18 23:47:30 +01:00			`if (!strcmp(arg, "--remove-empty")) {`
			`remove_empty_trees = 1;`
			`continue;`
			`}`
Teach git-rev-list to follow just a specified set of files This is the first cut at a git-rev-list that knows to ignore commits that don't change a certain file (or set of files). NOTE! For now it only prunes _merge_ commits, and follows the parent where there are no differences in the set of files specified. In the long run, I'd like to make it re-write the straight-line history too, but for now the merge simplification is much more fundamentally important (the rewriting of straight-line history is largely a separate simplification phase, but the merge simplification needs to happen early if we want to optimize away unnecessary commit parsing). If all parents of a merge change some of the files, the merge is left as is, so the end result is in no way guaranteed to be a linear history, but it will often be a lot /more/ linear than the full tree, since it prunes out parents that didn't matter for that set of files. As an example from the current kernel: [torvalds@g5 linux]$ git-rev-list HEAD | wc -l 9885 [torvalds@g5 linux]$ git-rev-list HEAD -- Makefile | wc -l 4084 [torvalds@g5 linux]$ git-rev-list HEAD -- drivers/usb | wc -l 5206 and you can also use 'gitk' to more visually see the pruning of the history tree, with something like gitk -- drivers/usb showing a simplified history that tries to follow the first parent in a merge that is the parent that fully defines drivers/usb/. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:25:09 +02:00			`if (!strcmp(arg, "--")) {`
			`i++;`
			`break;`
			`}`
git-rev-list: add "end" commit and "--header" flag The "end" commit is just faking it right now, it's sorting things purely by date, so this is _not_ a reachability analysis. Some day. The "--header" flag causes the commit message to be printed out, with a NUL character separator after it for parseability. This allows you to do things like use "grep -z" to grep for certain authors etc. 2005-05-26 03:29:09 +02:00
Teach rev-list since..til notation. The King Penguin says: Now, for extra bonus points, maybe you should make "git-rev-list" also understand the "rev..rev" format (which you can't do with just the get_sha1() interface, since it expands into more). The faithful servant makes it so. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-04 11:31:15 +02:00			`if (show_breaks && !merge_order)`
			`usage(rev_list_usage);`

git-rev-list: allow arbitrary head selections, use git-rev-tree syntax This makes git-rev-list use the same command line syntax to mark the commits as git-rev-tree does, and instead of just allowing a start and end commit, it allows an arbitrary list of "interesting" and "uninteresting" commits. For example, imagine that you had three branches (a, b and c) that you are interested in, but you don't want to see stuff that already exists in another person's three releases (x, y and z). You can do git-rev-list a b c ^x ^y ^z (order doesn't matter, btw - feel free to put the uninteresting ones first or otherwise switch them around), and it will show all the commits that are reachable from a/b/c but not reachable from x/y/z. The old syntax "git-rev-list start end" would now be written as "git-rev-list start ^end", or "git-rev-list ^end start". There's no limit to the number of heads you can specify (unlike git-rev-tree, which can handle a maximum of 16 heads). 2005-06-04 23:38:28 +02:00			`flags = 0;`
			`dotdot = strstr(arg, "..");`
			`if (dotdot) {`
git-rev-list: do not forget non-commit refs What happens is that the new logic decides that if it can't look up a commit reference (ie "get_commit_reference()" returns NULL), the thing must be a pathname. Fair enough. But wrong. The thing is, it may be a perfectly fine ref that _isn't_ a commit. In git, you have a tag that points to your PGP key, and in the kernel, I have a tag that points to a tree (and a direct ref that points to that tree too, for that matter). So the rule is (as for all the other programs that mix revs and pathnames) not that we only accept commit references, but _any_ valid object ref. If the object then isn't a commit ref, git-rev-list will either ignore it, or add it to the list of non-commit objects (if using "--objects"). The solution is to move the "get_sha1()" out of get_commit_reference(), and into the callers. In fact, we already _have_ the SHA1 in the case of the handle_all() loop, since for_each_ref() will have done it for us, so this is the correct thing to do anyway. This patch (on top of the original one) does exactly that. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-26 17:18:13 +02:00			`unsigned char from_sha1[20];`
			`char *next = dotdot + 2;`
			`*dotdot = 0;`
[PATCH] Fix "git-rev-list" revision range parsing There were two bugs in there: - if the range didn't end up working, we restored the '.' character in the wrong place. - an empty end-of-range should be interpreted as HEAD. See rev-parse.c for the reference implementation of this. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-17 02:53:19 +02:00			`if (!*next)`
			`next = "HEAD";`
			`if (!get_sha1(arg, from_sha1) && !get_sha1(next, sha1)) {`
			`struct commit *exclude;`
			`struct commit *include;`

			`exclude = get_commit_reference(arg, from_sha1, UNINTERESTING);`
			`include = get_commit_reference(next, sha1, 0);`
			`if (!exclude || !include)`
			`die("Invalid revision range %s..%s", arg, next);`
			`limited = 1;`
			`handle_one_commit(exclude, &list);`
			`handle_one_commit(include, &list);`
			`continue;`
			`}`
			`*dotdot = '.';`
			`}`
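The range handling above splits the argument in place at the first "..", treats an empty right-hand side as HEAD, and puts the '.' back if the two halves do not resolve. The following is a minimal sketch of just that string handling, with no object lookup; `split_range` and `restore_range` are hypothetical names that do not appear in rev-list.c.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not in rev-list.c): split "from..to" in place.
 * Returns 1 and sets *from and *to when arg contains "..", otherwise
 * returns 0 and leaves arg untouched. An empty right-hand side
 * defaults to "HEAD", mirroring the check on `next` above.
 */
int split_range(char *arg, const char **from, const char **to)
{
	char *dotdot;

	assert(arg && from && to);
	dotdot = strstr(arg, "..");
	if (!dotdot)
		return 0;
	*dotdot = 0;                 /* terminate the left-hand side */
	*from = arg;
	*to = dotdot[2] ? dotdot + 2 : "HEAD";
	return 1;
}

/* Undo split_range() so arg can be retried as a plain revision or path. */
void restore_range(char *arg)
{
	assert(arg);
	arg[strlen(arg)] = '.';      /* the NUL replaced the first '.' */
}
```

Restoring at `strlen(arg)` writes the '.' back at the position where the terminator was placed, which is the spot the "restored the '.' character in the wrong place" fix is concerned with. The sketch does not reject an empty left-hand side; in the code above that case falls through because the lookup of the left half fails.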
			`if (*arg == '^') {`
			`flags = UNINTERESTING;`
			`arg++;`
			`limited = 1;`
			`}`
Make git-rev-list and git-rev-parse argument parsing stricter If you pass it a filename without the "--" marker to separate it from revision information and flags, we now require that the file in question actually exists. This makes mis-typed revision information not be silently just considered a strange filename. With the "--" marker, you can continue to pass in filenames that do not actually exists - useful for querying what happened to a file that you no longer have in the repository. [ All scripts should use the "--" format regardless, to make things unambiguous. So this change should not affect any existing tools ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-25 23:00:37 +01:00			`if (get_sha1(arg, sha1) < 0) {`
			`struct stat st;`
			`if (lstat(arg, &st) < 0)`
			`die("'%s': %s", arg, strerror(errno));`
			`break;`
			`}`
			`commit = get_commit_reference(arg, sha1, flags);`
			`handle_one_commit(commit, &list);`
[PATCH] control/limit output of git-rev-list gitweb.cgi's default view is the log of the last day and git-rev-list can stop crawling the whole repo if we have all our data to display in the browser. Also the rss-feed query needs only the last 20 items. This will speeds up these queries dramatically. usage: rev-list [OPTION] commit-id --max-count=nr --max-age=epoch --min-age=epoch Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-06 10:00:11 +02:00			`}`
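The stricter parsing in the loop above means that an argument which does not resolve as an object name is only accepted as a pathname if it exists on disk; otherwise the program dies with the lstat() error. A minimal sketch of that existence check follows, assuming a POSIX system; the helper name `looks_like_path` is hypothetical and is not part of rev-list.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if arg names something that exists
 * on disk (lstat() does not follow a final symlink), 0 otherwise.
 */
int looks_like_path(const char *arg)
{
	struct stat st;

	assert(arg);
	return lstat(arg, &st) == 0;
}
```

A caller in the style of the loop above would stop revision parsing when the check succeeds and report `strerror(errno)` when it fails. Arguments given after "--" skip this check entirely, which is why the commit message recommends that scripts always use the "--" form.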

rev-list --objects: fix object list without commit. Earlier, "rev-list --objects <sha1>" for an object chain that does not have any commit failed with a usage message. This fixes "send-pack remote $tag" where tag points at a non-commit (e.g. a blob). Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 01:16:49 +01:00			`if (!list &&`
			`(!(tag_objects||tree_objects||blob_objects) && !pending_objects))`
			`usage(rev_list_usage);`

			`paths = get_pathspec(prefix, argv + i);`
			`if (paths) {`
			`limited = 1;`
			`diff_tree_setup_paths(paths);`
			`}`

[PATCH] Avoid wasting memory in git-rev-list As pointed out on the list, git-rev-list can use a lot of memory. One low-hanging fruit is to free the commit buffer for commits that we parse. By default, parse_commit() will save away the buffer, since a lot of cases do want it, and re-reading it continually would be unnecessary. However, in many cases the buffer isn't actually necessary and saving it just wastes memory. We could just free the buffer ourselves, but especially in git-rev-list, we actually end up using the helper functions that automatically add parent commits to the commit lists, so we don't actually control the commit parsing directly. Instead, just make this behaviour of "parse_commit()" a global flag. Maybe this is a bit tasteless, but it's very simple, and it makes a noticeable difference in memory usage. Before the change: [torvalds@g5 linux]$ /usr/bin/time git-rev-list v2.6.12..HEAD > /dev/null 0.26user 0.02system 0:00.28elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+3714minor)pagefaults 0swaps after the change: [torvalds@g5 linux]$ /usr/bin/time git-rev-list v2.6.12..HEAD > /dev/null 0.26user 0.00system 0:00.27elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+2433minor)pagefaults 0swaps note how the minor faults have decreased from 3714 pages to 2433 pages. That's all due to the fewer anonymous pages allocated to hold the comment buffers and their metadata. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-15 23:43:17 +02:00			`save_commit_buffer = verbose_header;`
[PATCH] Avoid building object ref lists when not needed The object parsing code builds a generic "this object references that object" because doing a full connectivity check for fsck requires it. However, nothing else really needs it, and it's quite expensive for git-rev-list that can have tons of objects in flight. So, exactly like the commit buffer save thing, add a global flag to disable it, and use it in git-rev-list. Before: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l 12.28user 0.29system 0:12.57elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+26718minor)pagefaults 0swaps 59124 After this change: $ /usr/bin/time git-rev-list --objects v2.6.12..HEAD | wc -l 10.33user 0.18system 0:10.54elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+18509minor)pagefaults 0swaps 59124 and note how the number of pages touched by git-rev-list for this particular object list has shrunk from 26,718 (104 MB) to 18,509 (72 MB). Calculating the total object difference between two git revisions is still clearly the most expensive git operation (both in memory and CPU time), but it's now less than 40% of what it used to be. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-16 23:55:33 +02:00			`track_object_refs = 0;`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. This patch linearises the GIT commit history graph into merge order which is defined by invariants specified in Documentation/git-rev-list.txt. The linearisation produced by this patch is superior in an objective sense to that produced by the existing git-rev-list implementation in that the linearisation produced is guaranteed to have the minimum number of discontinuities, where a discontinuity is defined as an adjacent pair of commits in the output list which are not related in a direct child-parent relationship. With this patch a graph like this: a4 --- \| \ \ \| b4 \| \|/ \| \| a3 \| \| \| \| \| a2 \| \| \| \| c3 \| \| \| \| \| c2 \| b3 \| \| \| /\| \| b2 \| \| \| c1 \| \| / \| b1 a1 \| \| \| a0 \| \| / root Sorts like this: = a4 \| c3 \| c2 \| c1 ^ b4 \| b3 \| b2 \| b1 ^ a3 \| a2 \| a1 \| a0 = root Instead of this: = a4 \| c3 ^ b4 \| a3 ^ c2 ^ b3 ^ a2 ^ b2 ^ c1 ^ a1 ^ b1 ^ a0 = root A test script, t/t6000-rev-list.sh, includes a test which demonstrates that the linearisation produced by --merge-order has less discontinuities than the linearisation produced by git-rev-list without the --merge-order flag specified. To see this, do the following: cd t ./t6000-rev-list.sh cd trash cat actual-default-order cat actual-merge-order The existing behaviour of git-rev-list is preserved, by default. To obtain the modified behaviour, specify --merge-order or --merge-order --show-breaks on the command line. This version of the patch has been tested on the git repository and also on the linux-2.6 repository and has reasonable performance on both - ~50-100% slower than the original algorithm. This version of the patch has incorporated a functional equivalent of the Linus' output limiting algorithm into the merge-order algorithm itself. This operates per the notes associated with Linus' commit 337cb3fb8da45f10fe9a0c3cf571600f55ead2ce. 
This version has incorporated Linus' feedback regarding proposed changes to rev-list.c. (see: [PATCH] Factor out filtering in rev-list.c) This version has improved the way sort_first_epoch marks commits as uninteresting. For more details about this change, refer to Documentation/git-rev-list.txt and http://blackcubes.dyndns.org/epoch/. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 17:39:40 +02:00			`if (!merge_order) {`
[PATCH] Ensure list insertion method does not depend on position of --merge-order argument This change ensures that git-rev-list --merge-order produces the same result irrespective of what position the --merge-order argument appears in the argument list. Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-07 02:59:13 +02:00			`sort_by_date(&list);`
Optimize common case of git-rev-list I took a look at webgit, and it looks like at least for the "projects" page, the most common operation ends up being basically git-rev-list --header --parents --max-count=1 HEAD Now, the thing is, the way "git-rev-list" works, it always keeps on popping the parents and parsing them in order to build the list of parents, and it turns out that even though we just want a single commit, git-rev-list will invariably look up _three_ generations of commits. It will parse: - the commit we want (it obviously needs this) - its parent(s) as part of the "pop_most_recent_commit()" logic - it will then pop one of the parents before it notices that it doesn't need any more - and as part of popping the parent, it will parse the grandparent (again due to "pop_most_recent_commit()"). Now, I've strace'd it, and it really is pretty efficient on the whole, but if things aren't nicely cached, and with long-latency IO, doing those two extra objects (at a minimum - if the parent is a merge it will be more) is just wasted time, and potentially a lot of it. So here's a quick special-case for the trivial case of "just one commit, and no date-limits or other special rules". Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-19 03:29:17 +02:00			`if (list && !limited && max_count == 1 &&`
			`!tag_objects && !tree_objects && !blob_objects) {`
			`show_commit(list->item);`
			`return 0;`
			`}`
[PATCH] Tidy up some rev-list-related stuff This patch tidies up the git-rev-list documentation and epoch.c, which are in severe clash with the unwritten coding style now, and quite unreadable. It also fixes up compile failures with older compilers due to variable declarations after code. The patch mostly wraps lines before or on the 80th column, removes plenty of superfluous empty lines and changes comments from // to /* */. Signed-off-by: Petr Baudis <pasky@ucw.cz> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-08 22:59:43 +02:00			`if (limited)`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. 2005-06-06 17:39:40 +02:00			`list = limit_list(list);`
Add "--topo-order" flag to use new topological sort 2005-07-06 19:25:04 +02:00			`if (topo_order)`
			`sort_in_topological_order(&list);`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. 2005-06-06 17:39:40 +02:00			`show_commit_list(list);`
			`} else {`
[PATCH] Support for NO_OPENSSL Support for completely OpenSSL-less builds. FSF considers distributing GPL binaries with OpenSSL linked in as a legal problem so this is trouble e.g. for Debian, or some people might not want to install OpenSSL anyway. If you make NO_OPENSSL=1 you get completely OpenSSL-less build, disabling --merge-order and using Mozilla's SHA1 implementation. Ported from Cogito. Signed-off-by: Petr Baudis <pasky@ucw.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-29 17:50:51 +02:00			`#ifndef NO_OPENSSL`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. 2005-06-06 17:39:40 +02:00			`if (sort_list_in_merge_order(list, &process_commit)) {`
[PATCH] Support for NO_OPENSSL 2005-07-29 17:50:51 +02:00			`die("merge order sort failed\n");`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. 2005-06-06 17:39:40 +02:00			`}`
[PATCH] Support for NO_OPENSSL 2005-07-29 17:50:51 +02:00			`#else`
			`die("merge order sort unsupported, OpenSSL not linked");`
			`#endif`
[PATCH] Modify git-rev-list to linearise the commit history in merge order. 2005-06-06 17:39:40 +02:00			`}`
git-rev-list: use proper lazy reachability analysis This means that you can give a beginning/end pair to git-rev-list, and it will show all entries that are reachable from the beginning but not the end. For example git-rev-list v2.6.12-rc5 v2.6.12-rc4 shows all commits that are in -rc5 but are not in -rc4. 2005-05-31 03:46:32 +02:00
Add "rev-list" program that uses the new time-based commit listing. This is probably what you'd want to see for "git log". 2005-04-24 04:04:40 +02:00			`return 0;`
			`}`