mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-16 06:03:44 +01:00

2811 lines

75 KiB

C

Raw Normal View History

git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`/*`
Make git blame's date output format configurable, like git log Add the following: - git config value blame.date that expects one of the git log date formats (e.g. relative,local,default,iso,...); - git blame command line option --date expects one of the git log date formats; - documentation in blame-options.txt; - git blame uses the appropriate date.c functions and enums to make sense of the date format and provide appropriate data; git blame continues to line up the output columns by padding the date column up to the max width of the chosen date format. The date format for git blame without both blame.date and --date continues to be ISO for backwards compatibility. git annotate ignores the date format specifiers and continues to uses the ISO format, as before. Signed-off-by: Eugene Letuchy <eugene@facebook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-20 23:51:11 +01:00			`* Blame`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`*`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`* Copyright (c) 2006, 2014 by its authors`
			`* See COPYING for licensing conditions`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`*/`

			`#include "cache.h"`
			`#include "builtin.h"`
			`#include "blob.h"`
			`#include "commit.h"`
			`#include "tag.h"`
			`#include "tree-walk.h"`
			`#include "diff.h"`
			`#include "diffcore.h"`
			`#include "revision.h"`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`#include "quote.h"`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`#include "xdiff-interface.h"`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`#include "cache-tree.h"`
Rename path_list to string_list The name path_list was correct for the first usage of that data structure, but it really is a general-purpose string list. $ perl -i -pe 's/path-list/string-list/g' $(git grep -l path-list) $ perl -i -pe 's/path_list/string_list/g' $(git grep -l path_list) $ git mv path-list.h string-list.h $ git mv path-list.c string-list.c $ perl -i -pe 's/has_path/has_string/g' $(git grep -l has_path) $ perl -i -pe 's/path/string/g' string-list.[ch] $ git mv Documentation/technical/api-path-list.txt \ Documentation/technical/api-string-list.txt $ perl -i -pe 's/strdup_paths/strdup_strings/g' $(git grep -l strdup_paths) ... and then fix all users of string-list to access the member "string" instead of "path". Documentation/technical/api-string-list.txt needed some rewrapping, too. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-21 20:03:49 +02:00			`#include "string-list.h"`
Apply mailmap in git-blame output. This makes git-blame to use the same mailmap used by git-shortlog. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-27 09:42:15 +02:00			`#include "mailmap.h"`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`#include "mergesort.h"`
git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00			`#include "parse-options.h"`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`#include "prio-queue.h"`
builtin-blame.c: Use utf8_strwidth for author's names git blame misaligns output if a author's name has a differing display width and strlen; for instance, an accented Latin letter that takes two bytes to encode will cause the rest of the line to be shifted to the left by one. To fix this, use utf8_strwidth instead of strlen (and compute the padding ourselves, since printf doesn't know about UTF-8). Signed-off-by: Geoffrey Thomas <geofft@mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-30 10:41:29 +01:00			`#include "utf8.h"`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`#include "userdiff.h"`
Refactor parse_loc We want to use the same style of -L n,m argument for 'git log -L' as for git-blame. Refactor the argument parsing of the range arguments from builtin/blame.c to the (new) file that will hold the 'git log -L' logic. To accommodate different data structures in blame and log -L, the file contents are abstracted away; parse_range_arg takes a callback that it uses to get the contents of a line of the (notional) file. The new test is for a case that made me pause during debugging: the 'blame -L with invalid end' test was the only one that noticed an outright failure to parse the end at all. So make a more explicit test for that. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-03-28 17:47:30 +01:00			`#include "line-range.h"`
blame: accept multiple -L ranges git-blame accepts only a single -L option or none. Clients requiring blame information for multiple disjoint ranges are therefore forced either to invoke git-blame multiple times, once for each range, or only once with no -L option to cover the entire file, both of which can be costly. Teach git-blame to accept multiple -L ranges. Overlapping and out-of-order ranges are accepted. In this patch, the X in -LX,Y is absolute (for instance, /RE/ patterns search from line 1), and Y is relative to X. Follow-up patches provide more flexibility over how X is anchored. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:38 +02:00			`#include "line-log.h"`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
i18n: blame: mark parseopt strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-20 14:31:54 +02:00			`static char blame_usage[] = N_("git blame [options] [rev-opts] [rev] [--] file");`
git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00
			`static const char *blame_opt_usage[] = {`
			`blame_usage,`
			`"",`
i18n: blame: mark parseopt strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-20 14:31:54 +02:00			`N_("[rev-opts] are documented in git-rev-list(1)"),`
git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00			`NULL`
			`};`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
			`static int longest_file;`
			`static int longest_author;`
			`static int max_orig_digits;`
			`static int max_digits;`
git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 23:51:12 +02:00			`static int max_score_digits;`
blame: -b (blame.blankboundary) and --root (blame.showroot) When blame.blankboundary is set (or -b option is given), commit object names are blanked out in the "human readable" output format for boundary commits. When blame.showroot is not set (or --root is not given), the root commits are treated as boundary commits. The code still attributes the lines to them, but with -b their object names are not shown. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-18 23:04:38 +01:00			`static int show_root;`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`static int reverse;`
blame: -b (blame.blankboundary) and --root (blame.showroot) When blame.blankboundary is set (or -b option is given), commit object names are blanked out in the "human readable" output format for boundary commits. When blame.showroot is not set (or --root is not given), the root commits are treated as boundary commits. The code still attributes the lines to them, but with -b their object names are not shown. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-18 23:04:38 +01:00			`static int blank_boundary;`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`static int incremental;`
git diff too slow for a file Ever since the xdiff library had been introduced to git, all its callers have used the flag XDF_NEED_MINIMAL. It makes sure that the smallest possible diff is produced, but that takes quite some time if there are lots of differences that can be expressed in multiple ways. This flag makes a difference for only 0.1% of the non-merge commits in the git repo of Linux, both in terms of diff size and execution time. The patches there are mostly nice and small. SungHyun Nam however reported a case in a different repo where a diff took more than 20 times longer to generate with XDF_NEED_MINIMAL than without. Rebasing became really slow. This patch removes this flag from all callers. The default of xdiff is saner because it has minimal to no impact in the normal case of small diffs and doesn't incur that much of a speed penalty for large ones. A follow-up patch may introduce a command line option to set the flag if the user needs it, similar to GNU diff's -d/--minimal. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-02 15:04:41 +02:00			`static int xdl_opts;`
blame: add --abbrev command line option and make it honor core.abbrev If user sets config.abbrev option, use it as if --abbrev was given. This is the default value and user can override different abbrev length by specifying the --abbrev=N command line option. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-04-06 04:20:50 +02:00			`static int abbrev = -1;`
blame: pay attention to --no-follow If you know your history did not have renames, or if you care only about the history after a large rename that happened some time ago, "git blame --no-follow $path" is a way to tell the command not to bother about renames. When you use -C, the lines that came from the renamed file will still be found without the whole-file rename detection, so it is not all that interesting either way, though. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-21 22:52:25 +02:00			`static int no_whole_file_rename;`
Make git blame's date output format configurable, like git log Add the following: - git config value blame.date that expects one of the git log date formats (e.g. relative,local,default,iso,...); - git blame command line option --date expects one of the git log date formats; - documentation in blame-options.txt; - git blame uses the appropriate date.c functions and enums to make sense of the date format and provide appropriate data; git blame continues to line up the output columns by padding the date column up to the max width of the chosen date format. The date format for git blame without both blame.date and --date continues to be ISO for backwards compatibility. git annotate ignores the date format specifiers and continues to uses the ISO format, as before. Signed-off-by: Eugene Letuchy <eugene@facebook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-20 23:51:11 +01:00
			`static enum date_mode blame_date_mode = DATE_ISO8601;`
			`static size_t blame_date_width;`

Rename path_list to string_list The name path_list was correct for the first usage of that data structure, but it really is a general-purpose string list. $ perl -i -pe 's/path-list/string-list/g' $(git grep -l path-list) $ perl -i -pe 's/path_list/string_list/g' $(git grep -l path_list) $ git mv path-list.h string-list.h $ git mv path-list.c string-list.c $ perl -i -pe 's/has_path/has_string/g' $(git grep -l has_path) $ perl -i -pe 's/path/string/g' string-list.[ch] $ git mv Documentation/technical/api-path-list.txt \ Documentation/technical/api-string-list.txt $ perl -i -pe 's/strdup_paths/strdup_strings/g' $(git grep -l strdup_paths) ... and then fix all users of string-list to access the member "string" instead of "path". Documentation/technical/api-string-list.txt needed some rewrapping, too. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-21 20:03:49 +02:00			`static struct string_list mailmap;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`#ifndef DEBUG`
			`#define DEBUG 0`
			`#endif`

git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00			`/* stats */`
			`static int num_read_blob;`
			`static int num_get_patch;`
			`static int num_commits;`

git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`#define PICKAXE_BLAME_MOVE 01`
git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:50:17 +02:00			`#define PICKAXE_BLAME_COPY 02`
			`#define PICKAXE_BLAME_COPY_HARDER 04`
blame: -C -C -C When you do this, existing "blame -C -C" would not find that the latter half of the file2 came from the existing file1: ... both file1 and file2 are tracked ... $ cat file1 >>file2 $ git add file1 file2 $ git commit This is because we avoid the expensive find-copies-harder code that makes unchanged file (in this case, file1) as a candidate for copy & paste source when annotating an existing file (file2). The third -C now allows it. However, this obviously makes the process very expensive. We've actually seen this patch before, but I dismissed it because it covers such a narrow (and arguably stupid) corner case. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-06 06:18:57 +02:00			`#define PICKAXE_BLAME_COPY_HARDEST 010`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00
git-pickaxe: introduce heuristics to avoid "trivial" chunks This adds scoring logic to blame_entry to prevent blames on very trivial chunks (e.g. lots of empty lines, indent followed by a closing brace) from being passed down to unrelated lines in the parent. The current heuristics are quite simple and may need to be tweaked later, but we need to start somewhere. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 00:37:12 +02:00			`/*`
			`* blame for a blame_entry with score lower than these thresholds`
			`* is not passed to the parent using move/copy logic.`
			`*/`
			`static unsigned blame_move_score;`
			`static unsigned blame_copy_score;`
			`#define BLAME_DEFAULT_MOVE_SCORE 20`
			`#define BLAME_DEFAULT_COPY_SCORE 40`

object.h: centralize object flag allocation While the field "flags" is mainly used by the revision walker, it is also used in many other places. Centralize the whole flag allocation to one place for a better overview (and easier to move flags if we have too). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-03-25 14:23:26 +01:00			`/* Remember to update object flag allocation in object.h */`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`#define METAINFO_SHOWN (1u<<12)`
			`#define MORE_THAN_ONE_PATH (1u<<13)`

			`/*`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`* One blob in a commit that is being suspected`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`*/`
			`struct origin {`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`int refcnt;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`/* Record preceding blame record for this blob */`
blame: show "previous" information in --porcelain/--incremental format When the final blame is laid for a line to a <commit, path> pair, it also gives a "previous" information to --porcelain and --incremental output format. It gives the parent commit of the blamed commit, _and_ a path in that parent commit that corresponds to the blamed path --- in short, it is the origin that would have been blamed (or passed blame through) for the line _if_ the blamed commit did not change that line. This unfortunately makes sanity checking of refcount quite complex, so I ripped it out for now. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-06-05 07:58:40 +02:00			`struct origin *previous;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			/* origins are put in a list linked via `next' hanging off the
			`* corresponding commit's util field in order to make finding`
			`* them fast. The presence in this chain does not count`
			`* towards the origin's reference count. It is tempting to`
			`* let it count as long as the commit is pending examination,`
			`* but even under circumstances where the commit will be`
			`* present multiple times in the priority queue of unexamined`
			`* commits, processing the first instance will not leave any`
			`* work requiring the origin data for the second instance. An`
			`* interspersed commit changing that would have to be`
			`* preexisting with a different ancestry and with the same`
			`* commit date in order to wedge itself between two instances`
			`* of the same commit in the priority queue _and_ produce`
			`* blame entries relevant for it. While we don't want to let`
			`* us get tripped up by this case, it certainly does not seem`
			`* worth optimizing for.`
			`*/`
			`struct origin *next;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`struct commit *commit;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			/* `suspects' contains blame entries that may be attributed to
			`* this origin's commit or to parent commits. When a commit`
			`* is being processed, all suspects will be moved, either by`
			`* assigning them to an origin in a different commit, or by`
			`* shipping them to the scoreboard's ent list because they`
			`* cannot be attributed to a different commit.`
			`*/`
			`struct blame_entry *suspects;`
git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00			`mmfile_t file;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`unsigned char blob_sha1[20];`
blame,cat-file --textconv: Don't assume mode is ``S_IFREF \| 0664'' We need to get the correct mode when blame reads the source from the working tree, the index, or trees. This allows us to omit running textconv filters on symbolic links. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Reviewed-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-09-29 13:35:24 +02:00			`unsigned mode;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`/* guilty gets set when shipping any suspects to the final`
			`* blame list instead of other commits`
			`*/`
			`char guilty;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`char path[FLEX_ARRAY];`
			`};`

blame: factor out helper for calling xdi_diff() Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-09 22:23:39 +02:00			`static int diff_hunks(mmfile_t file_a, mmfile_t file_b, long ctxlen,`
			`xdl_emit_hunk_consume_func_t hunk_func, void *cb_data)`
			`{`
			`xpparam_t xpp = {0};`
			`xdemitconf_t xecfg = {0};`
builtin/blame.c: Fix a "Using plain integer as NULL pointer" warning Plain gcc may not but sparse catches and complains about this sort of stuff. Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-13 23:41:16 +02:00			`xdemitcb_t ecb = {NULL};`
blame: factor out helper for calling xdi_diff() Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-09 22:23:39 +02:00
			`xpp.flags = xdl_opts;`
			`xecfg.ctxlen = ctxlen;`
			`xecfg.hunk_func = hunk_func;`
			`ecb.priv = cb_data;`
			`return xdi_diff(file_a, file_b, &xpp, &xecfg, &ecb);`
			`}`

textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`/*`
			`* Prepare diff_filespec and convert it using diff textconv API`
			`* if the textconv driver exists.`
			`* Return 1 if the conversion succeeds, 0 otherwise.`
			`*/`
textconv: support for cat_file Make the textconv_object function public, and add --textconv option to cat-file to perform conversion on blob objects. Using --textconv implies that we are working on a blob. As files drivers need to be initialized, a new config is required in addition to git_default_config. Therefore git_cat_file_config() is introduced Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 17:50:28 +02:00			`int textconv_object(const char *path,`
blame,cat-file --textconv: Don't assume mode is ``S_IFREF \| 0664'' We need to get the correct mode when blame reads the source from the working tree, the index, or trees. This allows us to omit running textconv filters on symbolic links. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Reviewed-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-09-29 13:35:24 +02:00			`unsigned mode,`
textconv: support for cat_file Make the textconv_object function public, and add --textconv option to cat-file to perform conversion on blob objects. Using --textconv implies that we are working on a blob. As files drivers need to be initialized, a new config is required in addition to git_default_config. Therefore git_cat_file_config() is introduced Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 17:50:28 +02:00			`const unsigned char *sha1,`
diff: do not use null sha1 as a sentinel value The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-28 17:03:01 +02:00			`int sha1_valid,`
textconv: support for cat_file Make the textconv_object function public, and add --textconv option to cat-file to perform conversion on blob objects. Using --textconv implies that we are working on a blob. As files drivers need to be initialized, a new config is required in addition to git_default_config. Therefore git_cat_file_config() is introduced Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 17:50:28 +02:00			`char **buf,`
			`unsigned long *buf_size)`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`{`
			`struct diff_filespec *df;`
			`struct userdiff_driver *textconv;`

			`df = alloc_filespec(path);`
diff: do not use null sha1 as a sentinel value The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-28 17:03:01 +02:00			`fill_filespec(df, sha1, sha1_valid, mode);`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`textconv = get_textconv(df);`
			`if (!textconv) {`
			`free_filespec(df);`
			`return 0;`
			`}`

			`*buf_size = fill_textconv(textconv, df, buf);`
			`free_filespec(df);`
			`return 1;`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Given an origin, prepare mmfile_t structure to be used by the`
			`* diff machinery`
			`*/`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`static void fill_origin_blob(struct diff_options *opt,`
			`struct origin o, mmfile_t file)`
git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00			`{`
			`if (!o->file.ptr) {`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`enum object_type type;`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`unsigned long file_size;`

git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00			`num_read_blob++;`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`if (DIFF_OPT_TST(opt, ALLOW_TEXTCONV) &&`
diff: do not use null sha1 as a sentinel value The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-28 17:03:01 +02:00			`textconv_object(o->path, o->mode, o->blob_sha1, 1, &file->ptr, &file_size))`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`;`
			`else`
			`file->ptr = read_sha1_file(o->blob_sha1, &type, &file_size);`
			`file->size = file_size;`

blame: check return value from read_sha1_file() Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-25 10:26:20 +02:00			`if (!file->ptr)`
			`die("Cannot read blob %s for path %s",`
			`sha1_to_hex(o->blob_sha1),`
			`o->path);`
git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00			`o->file = *file;`
			`}`
			`else`
			`*file = o->file;`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Origin is refcounted and usually we keep the blob contents to be`
			`* reused.`
			`*/`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`static inline struct origin origin_incref(struct origin o)`
			`{`
			`if (o)`
			`o->refcnt++;`
			`return o;`
			`}`

			`static void origin_decref(struct origin *o)`
			`{`
			`if (o && --o->refcnt <= 0) {`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`struct origin p, l = NULL;`
blame: show "previous" information in --porcelain/--incremental format When the final blame is laid for a line to a <commit, path> pair, it also gives a "previous" information to --porcelain and --incremental output format. It gives the parent commit of the blamed commit, _and_ a path in that parent commit that corresponds to the blamed path --- in short, it is the origin that would have been blamed (or passed blame through) for the line _if_ the blamed commit did not change that line. This unfortunately makes sanity checking of refcount quite complex, so I ripped it out for now. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-06-05 07:58:40 +02:00			`if (o->previous)`
			`origin_decref(o->previous);`
Avoid unnecessary "if-before-free" tests. This change removes all obvious useless if-before-free tests. E.g., it replaces code like this: if (some_expression) free (some_expression); with the now-equivalent: free (some_expression); It is equivalent not just because POSIX has required free(NULL) to work for a long time, but simply because it has worked for so long that no reasonable porting target fails the test. Here's some evidence from nearly 1.5 years ago: http://www.winehq.org/pipermail/wine-patches/2006-October/031544.html FYI, the change below was prepared by running the following: git ls-files -z \| xargs -0 \ perl -0x3b -pi -e \ 's/\bif\s\(\s(\S+?)(?:\s!=\sNULL)?\s\)\s+(free\s\(\s\1\s\))/$2/s' Note however, that it doesn't handle brace-enclosed blocks like "if (x) { free (x); }". But that's ok, since there were none like that in git sources. Beware: if you do use the above snippet, note that it can produce syntactically invalid C code. That happens when the affected "if"-statement has a matching "else". E.g., it would transform this if (x) free (x); else foo (); into this: free (x); else foo (); There were none of those here, either. If you're interested in automating detection of the useless tests, you might like the useless-if-before-free script in gnulib: [it does detect brace-enclosed free statements, and has a --name=S option to make it detect free-like functions with different names] http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=build-aux/useless-if-before-free Addendum: Remove one more (in imap-send.c), spotted by Jean-Luc Herren <jlh@gmx.ch>. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-31 18:26:32 +01:00			`free(o->file.ptr);`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`/* Should be present exactly once in commit chain */`
			`for (p = o->commit->util; p; l = p, p = p->next) {`
			`if (p == o) {`
			`if (l)`
			`l->next = p->next;`
			`else`
			`o->commit->util = p->next;`
			`free(o);`
			`return;`
			`}`
			`}`
			`die("internal error in blame::origin_decref");`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`}`
			`}`

blame: drop blob data after passing blame to the parent We used to keep the blob data for each origin that has any remaining line in the result, but this will get very costly with a huge file that has a deep history. This patch releases the blob after we ran diff between the child rev and its parents. When passing blame from a parent to its parent (i.e. the grandparent), the blob data for the parent may need to be read again, but it should be relatively cheap, thanks to delta-base cache. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-12 01:05:50 +01:00			`static void drop_origin_blob(struct origin *o)`
			`{`
			`if (o->file.ptr) {`
			`free(o->file.ptr);`
			`o->file.ptr = NULL;`
			`}`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Each group of lines is described by a blame_entry; it can be split`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`* as we pass blame to the parents. They are arranged in linked lists`
			* kept as `suspects' of some unprocessed origin, or entered (when the
			`* blame origin has been finalized) into the scoreboard structure.`
			`* While the scoreboard structure is only sorted at the end of`
			`* processing (according to final image line number), the lists`
			`* attached to an origin are sorted by the target line number.`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`*/`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`struct blame_entry {`
			`struct blame_entry *next;`

			`/* the first line of this group in the final image;`
			`* internally all line numbers are 0 based.`
			`*/`
			`int lno;`

			`/* how many lines this group has */`
			`int num_lines;`

			`/* the commit that introduced this group into the final image */`
			`struct origin *suspect;`

			`/* the line number of the first line of this group in the`
			`* suspect's file; internally all line numbers are 0 based.`
			`*/`
			`int s_lno;`
git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 23:51:12 +02:00
			`/* how significant this entry is -- cached to avoid`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`* scanning the lines over and over.`
git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 23:51:12 +02:00			`*/`
			`unsigned score;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`};`

blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`/*`
			`* Any merge of blames happens on lists of blames that arrived via`
			`* different parents in a single suspect. In this case, we want to`
			`* sort according to the suspect line numbers as opposed to the final`
			`* image line numbers. The function body is somewhat longish because`
			`* it avoids unnecessary writes.`
			`*/`

			`static struct blame_entry blame_merge(struct blame_entry list1,`
			`struct blame_entry *list2)`
			`{`
			`struct blame_entry p1 = list1, p2 = list2,`
			`**tail = &list1;`

			`if (!p1)`
			`return p2;`
			`if (!p2)`
			`return p1;`

			`if (p1->s_lno <= p2->s_lno) {`
			`do {`
			`tail = &p1->next;`
			`if ((p1 = *tail) == NULL) {`
			`*tail = p2;`
			`return list1;`
			`}`
			`} while (p1->s_lno <= p2->s_lno);`
			`}`
			`for (;;) {`
			`*tail = p2;`
			`do {`
			`tail = &p2->next;`
			`if ((p2 = *tail) == NULL) {`
			`*tail = p1;`
			`return list1;`
			`}`
			`} while (p1->s_lno > p2->s_lno);`
			`*tail = p1;`
			`do {`
			`tail = &p1->next;`
			`if ((p1 = *tail) == NULL) {`
			`*tail = p2;`
			`return list1;`
			`}`
			`} while (p1->s_lno <= p2->s_lno);`
			`}`
			`}`

			`static void get_next_blame(const void p)`
			`{`
			`return ((struct blame_entry *)p)->next;`
			`}`

			`static void set_next_blame(void p1, void p2)`
			`{`
			`((struct blame_entry *)p1)->next = p2;`
			`}`

			`/*`
			`* Final image line numbers are all different, so we don't need a`
			`* three-way comparison here.`
			`*/`

			`static int compare_blame_final(const void p1, const void p2)`
			`{`
			`return ((struct blame_entry )p1)->lno > ((struct blame_entry )p2)->lno`
			`? 1 : -1;`
			`}`

			`static int compare_blame_suspect(const void p1, const void p2)`
			`{`
			`const struct blame_entry s1 = p1, s2 = p2;`
			`/*`
			`* to allow for collating suspects, we sort according to the`
			`* respective pointer value as the primary sorting criterion.`
			`* The actual relation is pretty unimportant as long as it`
			`* establishes a total order. Comparing as integers gives us`
			`* that.`
			`*/`
			`if (s1->suspect != s2->suspect)`
			`return (intptr_t)s1->suspect > (intptr_t)s2->suspect ? 1 : -1;`
			`if (s1->s_lno == s2->s_lno)`
			`return 0;`
			`return s1->s_lno > s2->s_lno ? 1 : -1;`
			`}`

			`static struct blame_entry blame_sort(struct blame_entry head,`
			`int (compare_fn)(const void , const void *))`
			`{`
			`return llist_mergesort (head, get_next_blame, set_next_blame, compare_fn);`
			`}`

			`static int compare_commits_by_reverse_commit_date(const void *a,`
			`const void *b,`
			`void *c)`
			`{`
			`return -compare_commits_by_commit_date(a, b, c);`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* The current state of the blame assignment.`
			`*/`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`struct scoreboard {`
			`/* the final commit (i.e. where we started digging from) */`
			`struct commit *final;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`/* Priority queue for commits with unassigned blame records */`
			`struct prio_queue commits;`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`struct rev_info *revs;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`const char *path;`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* The contents in the final image.`
			`* Used by many functions to obtain contents of the nth line,`
			`* indexed with scoreboard.lineno[blame_entry.lno].`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`*/`
			`const char *final_buf;`
			`unsigned long final_buf_size;`

			`/* linked list of blames */`
			`struct blame_entry *ent;`

git-pickaxe: do not keep commit buffer. We need the commit buffer data while generating the final result, but until then we do not need them. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 08:49:31 +02:00			`/* look-up a line in the final buffer */`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`int num_lines;`
			`int *lineno;`
			`};`

git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`static void sanity_check_refcnt(struct scoreboard *);`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* If two blame entries that are next to each other came from`
			`* contiguous lines in the same origin (i.e. <commit, path> pair),`
			`* merge them together.`
			`*/`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`static void coalesce(struct scoreboard *sb)`
			`{`
			`struct blame_entry ent, next;`

			`for (ent = sb->ent; ent && (next = ent->next); ent = next) {`
builtin/blame.c: eliminate same_suspect() Since the origin pointers are "interned" and reference-counted, comparing the pointers rather than the content is enough. The only uninterned origins are cached values kept in commit->util, but same_suspect is not called on them. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-01-22 01:20:15 +01:00			`if (ent->suspect == next->suspect &&`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`ent->s_lno + ent->num_lines == next->s_lno) {`
			`ent->num_lines += next->num_lines;`
			`ent->next = next->next;`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`origin_decref(next->suspect);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`free(next);`
git-pickaxe: do not confuse two origins that are the same. It used to be that we can compare the address of the origin structure to determine if they are the same because they are always registered with scoreboard. After introduction of the loop to try finding the best split, that is not true anymore. The current code has rather serious leaks with origin structure, but more importantly it gets confused when two origins that points at the same commit and same path. We might eventually have to refcount and gc origin, but let's fix the correctness issue first. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 09:41:38 +02:00			`ent->score = 0;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`next = ent; /* again */`
			`}`
			`}`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00
			`if (DEBUG) /* sanity */`
			`sanity_check_refcnt(sb);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`/*`
			`* Merge the given sorted list of blames into a preexisting origin.`
			`* If there were no previous blames to that commit, it is entered into`
			`* the commit priority queue of the score board.`
			`*/`

			`static void queue_blames(struct scoreboard sb, struct origin porigin,`
			`struct blame_entry *sorted)`
			`{`
			`if (porigin->suspects)`
			`porigin->suspects = blame_merge(porigin->suspects, sorted);`
			`else {`
			`struct origin *o;`
			`for (o = porigin->commit->util; o; o = o->next) {`
			`if (o->suspects) {`
			`porigin->suspects = sorted;`
			`return;`
			`}`
			`}`
			`porigin->suspects = sorted;`
			`prio_queue_put(&sb->commits, porigin->commit);`
			`}`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Given a commit and a path in it, create a new origin structure.`
			`* The callers that add blame to the scoreboard should use`
			`* get_origin() to obtain shared, refcounted copy instead of calling`
			`* this function directly.`
			`*/`
git-pickaxe: fix origin refcounting When we introduced the cached origin per commit, we gave up proper garbage collecting because it meant that commits hold onto their cached copy. There is no need to do so. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 04:18:50 +01:00			`static struct origin make_origin(struct commit commit, const char *path)`
			`{`
			`struct origin *o;`
			`o = xcalloc(1, sizeof(*o) + strlen(path) + 1);`
			`o->commit = commit;`
			`o->refcnt = 1;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`o->next = commit->util;`
			`commit->util = o;`
git-pickaxe: fix origin refcounting When we introduced the cached origin per commit, we gave up proper garbage collecting because it meant that commits hold onto their cached copy. There is no need to do so. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 04:18:50 +01:00			`strcpy(o->path, path);`
			`return o;`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Locate an existing origin or create a new one.`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`* This moves the origin to front position in the commit util list.`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`*/`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`static struct origin get_origin(struct scoreboard sb,`
			`struct commit *commit,`
			`const char *path)`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`{`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`struct origin o, l;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`for (o = commit->util, l = NULL; o; l = o, o = o->next) {`
			`if (!strcmp(o->path, path)) {`
			`/* bump to front */`
			`if (l) {`
			`l->next = o->next;`
			`o->next = commit->util;`
			`commit->util = o;`
			`}`
			`return origin_incref(o);`
			`}`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
git-pickaxe: fix origin refcounting When we introduced the cached origin per commit, we gave up proper garbage collecting because it meant that commits hold onto their cached copy. There is no need to do so. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 04:18:50 +01:00			`return make_origin(commit, path);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Fill the blob_sha1 field of an origin if it hasn't, so that later`
			`* call to fill_origin_blob() can use it to locate the data. blob_sha1`
			`* for an origin is also used to pass the blame for the entire file to`
			`* the parent to detect the case where a child's blob is identical to`
			`* that of its parent's.`
blame,cat-file --textconv: Don't assume mode is ``S_IFREF \| 0664'' We need to get the correct mode when blame reads the source from the working tree, the index, or trees. This allows us to omit running textconv filters on symbolic links. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Reviewed-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-09-29 13:35:24 +02:00			`*`
			`* This also fills origin->mode for corresponding tree path.`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`*/`
blame,cat-file --textconv: Don't assume mode is ``S_IFREF \| 0664'' We need to get the correct mode when blame reads the source from the working tree, the index, or trees. This allows us to omit running textconv filters on symbolic links. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Reviewed-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-09-29 13:35:24 +02:00			`static int fill_blob_sha1_and_mode(struct origin *origin)`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`{`
			`if (!is_null_sha1(origin->blob_sha1))`
			`return 0;`
			`if (get_tree_entry(origin->commit->object.sha1,`
			`origin->path,`
blame,cat-file --textconv: Don't assume mode is ``S_IFREF \| 0664'' We need to get the correct mode when blame reads the source from the working tree, the index, or trees. This allows us to omit running textconv filters on symbolic links. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Reviewed-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-09-29 13:35:24 +02:00			`origin->blob_sha1, &origin->mode))`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`goto error_out;`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`if (sha1_object_info(origin->blob_sha1, NULL) != OBJ_BLOB)`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`goto error_out;`
			`return 0;`
			`error_out:`
			`hashclr(origin->blob_sha1);`
blame,cat-file --textconv: Don't assume mode is ``S_IFREF \| 0664'' We need to get the correct mode when blame reads the source from the working tree, the index, or trees. This allows us to omit running textconv filters on symbolic links. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Reviewed-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-09-29 13:35:24 +02:00			`origin->mode = S_IFINVALID;`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`return -1;`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* We have an origin -- check if the same path exists in the`
			`* parent and return an origin structure to represent it.`
			`*/`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`static struct origin find_origin(struct scoreboard sb,`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`struct commit *parent,`
			`struct origin *origin)`
			`{`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`struct origin *porigin;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`struct diff_options diff_opts;`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`const char *paths[2];`

blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`/* First check any existing origins */`
			`for (porigin = parent->util; porigin; porigin = porigin->next)`
			`if (!strcmp(porigin->path, origin->path)) {`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* The same path between origin and its parent`
			`* without renaming -- the most common case.`
			`*/`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`return origin_incref (porigin);`
git-pickaxe: fix origin refcounting When we introduced the cached origin per commit, we gave up proper garbage collecting because it meant that commits hold onto their cached copy. There is no need to do so. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 04:18:50 +01:00			`}`
git-pickaxe: cache one already found path per commit. Depending on how bushy the commit DAG is, this saves calls to the internal diff-tree for fork-point commits. For example, annotating Makefile in the kernel repository saves about a third of such diff-tree calls. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-31 10:00:01 +01:00
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`/* See if the origin->path is different between parent`
			`* and origin first. Most of the time they are the`
			`* same and diff-tree is fairly efficient about this.`
			`*/`
			`diff_setup(&diff_opts);`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`DIFF_OPT_SET(&diff_opts, RECURSIVE);`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`diff_opts.detect_rename = 0;`
			`diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;`
			`paths[0] = origin->path;`
			`paths[1] = NULL;`

pathspec: stop ---pathspecs impact on internal parse_pathspec() uses Normally parse_pathspec() is used on command line arguments where it can do fancy thing like parsing magic on each argument or adding magic for all pathspecs based on ---pathspecs options. There's another use of parse_pathspec(), where pathspec is needed, but the input is known to be pure paths. In this case we usually don't want ---pathspecs to interfere. And we definitely do not want to parse magic in these paths, regardless of --literal-pathspecs. Add new flag PATHSPEC_LITERAL_PATH for this purpose. When it's set, ---pathspecs are ignored, no magic is parsed. And if the caller allows PATHSPEC_LITERAL (i.e. the next calls can take literal magic), then PATHSPEC_LITERAL will be set. This fixes cases where git chokes when GIT__PATHSPECS are set because parse_pathspec() indicates it won't take any magic. But GIT__PATHSPECS add them anyway. These are export GIT_LITERAL_PATHSPECS=1 git blame -- something git log --follow something git log --merge "git ls-files --with-tree=path" (aka parse_pathspec() in overlay_tree_on_cache()) is safe because the input is empty, and producing one pathspec due to PATHSPEC_PREFER_CWD does not take any magic into account. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-10-26 04:09:20 +02:00			`parse_pathspec(&diff_opts.pathspec,`
			`PATHSPEC_ALL_MAGIC & ~PATHSPEC_LITERAL,`
			`PATHSPEC_LITERAL_PATH, "", paths);`
diff_setup_done(): return void diff_setup_done() has historically returned an error code, but lost the last nonzero return in 943d5b7 (allow diff.renamelimit to be set regardless of -M/-C, 2006-08-09). The callers were in a pretty confused state: some actually checked for the return code, and some did not. Let it return void, and patch all callers to take this into account. This conveniently also gets rid of a handful of different(!) error messages that could never be triggered anyway. Note that the function can still die(). Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-03 14:16:24 +02:00			`diff_setup_done(&diff_opts);`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00
			`if (is_null_sha1(origin->commit->object.sha1))`
			`do_diff_cache(parent->tree->object.sha1, &diff_opts);`
			`else`
			`diff_tree_sha1(parent->tree->object.sha1,`
			`origin->commit->tree->object.sha1,`
			`"", &diff_opts);`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`diffcore_std(&diff_opts);`

			`if (!diff_queued_diff.nr) {`
			`/* The path is the same as parent */`
			`porigin = get_origin(sb, parent, origin->path);`
			`hashcpy(porigin->blob_sha1, origin->blob_sha1);`
blame,cat-file --textconv: Don't assume mode is ``S_IFREF \| 0664'' We need to get the correct mode when blame reads the source from the working tree, the index, or trees. This allows us to omit running textconv filters on symbolic links. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Reviewed-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-09-29 13:35:24 +02:00			`porigin->mode = origin->mode;`
blame: correctly handle a path that used to be a directory When trying to see if the same path exists in the parent, we ran "diff-tree" with pathspec set to the path we are interested in with the parent, and expect either to have exactly one resulting filepair (either "changed from the parent", "created when there was none") or nothing (when there is no change from the parent). If the path used to be a directory, however, we will also see unbounded number of entries that talk about the files that used to exist underneath the directory in question. Correctly pick only the entry that describes the path we are interested in in such a case (namely, the creation of the path as a regular file). Noticed by Ben Willard. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-03 09:43:22 +02:00			`} else {`
			`/*`
			`* Since origin->path is a pathspec, if the parent`
			`* commit had it as a directory, we will see a whole`
			`* bunch of deletion of files in the directory that we`
			`* do not care about.`
			`*/`
			`int i;`
			`struct diff_filepair *p = NULL;`
			`for (i = 0; i < diff_queued_diff.nr; i++) {`
			`const char *name;`
			`p = diff_queued_diff.queue[i];`
			`name = p->one->path ? p->one->path : p->two->path;`
			`if (!strcmp(name, origin->path))`
			`break;`
			`}`
			`if (!p)`
			`die("internal error in blame::find_origin");`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`switch (p->status) {`
			`default:`
git-pickaxe: retire pickaxe Just make it take over blame's place. Documentation and command have all stopped mentioning "git-pickaxe". The built-in synonym is left in the command table, so you can still say "git pickaxe", but it probably is a good idea to retire it as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-09 03:47:54 +01:00			`die("internal error in blame::find_origin (%c)",`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`p->status);`
			`case 'M':`
			`porigin = get_origin(sb, parent, origin->path);`
			`hashcpy(porigin->blob_sha1, p->one->sha1);`
blame,cat-file --textconv: Don't assume mode is ``S_IFREF \| 0664'' We need to get the correct mode when blame reads the source from the working tree, the index, or trees. This allows us to omit running textconv filters on symbolic links. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Reviewed-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-09-29 13:35:24 +02:00			`porigin->mode = p->one->mode;`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`break;`
			`case 'A':`
			`case 'T':`
			`/* Did not exist in parent, or type changed */`
			`break;`
			`}`
			`}`
			`diff_flush(&diff_opts);`
remove diff_tree_{setup,release}_paths Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-07-14 10:35:58 +02:00			`free_pathspec(&diff_opts.pathspec);`
git-pickaxe: split find_origin() into find_rename() and find_origin(). When a merge adds a new file from the second parent, the earlier code tried to find renames in the first parent before noticing that the vertion from the second parent was added without modification. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-31 02:17:41 +01:00			`return porigin;`
			`}`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* We have an origin -- find the path that corresponds to it in its`
			`* parent and return an origin structure to represent it.`
			`*/`
git-pickaxe: split find_origin() into find_rename() and find_origin(). When a merge adds a new file from the second parent, the earlier code tried to find renames in the first parent before noticing that the vertion from the second parent was added without modification. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-31 02:17:41 +01:00			`static struct origin find_rename(struct scoreboard sb,`
			`struct commit *parent,`
			`struct origin *origin)`
			`{`
			`struct origin *porigin = NULL;`
			`struct diff_options diff_opts;`
			`int i;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
			`diff_setup(&diff_opts);`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`DIFF_OPT_SET(&diff_opts, RECURSIVE);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`diff_opts.detect_rename = DIFF_DETECT_RENAME;`
			`diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;`
git-pickaxe: rename detection optimization The idea is that we are interested in renaming into only one path, so we do not care about renames that happen elsewhere. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-02 09:02:11 +01:00			`diff_opts.single_follow = origin->path;`
diff_setup_done(): return void diff_setup_done() has historically returned an error code, but lost the last nonzero return in 943d5b7 (allow diff.renamelimit to be set regardless of -M/-C, 2006-08-09). The callers were in a pretty confused state: some actually checked for the return code, and some did not. Let it return void, and patch all callers to take this into account. This conveniently also gets rid of a handful of different(!) error messages that could never be triggered anyway. Note that the function can still die(). Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-03 14:16:24 +02:00			`diff_setup_done(&diff_opts);`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00
			`if (is_null_sha1(origin->commit->object.sha1))`
			`do_diff_cache(parent->tree->object.sha1, &diff_opts);`
			`else`
			`diff_tree_sha1(parent->tree->object.sha1,`
			`origin->commit->tree->object.sha1,`
			`"", &diff_opts);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`diffcore_std(&diff_opts);`

			`for (i = 0; i < diff_queued_diff.nr; i++) {`
			`struct diff_filepair *p = diff_queued_diff.queue[i];`
git-pickaxe: do not keep commit buffer. We need the commit buffer data while generating the final result, but until then we do not need them. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 08:49:31 +02:00			`if ((p->status == 'R' \|\| p->status == 'C') &&`
git-pickaxe: get rid of wasteful find_origin(). After finding out which path in the parent to scan to pass blames, using get_tree_entry() to extract the blob information again was quite wasteful, since diff-tree already gave us that information. Separate the function to create an origin out as get_origin(). You'll never know what is more efficient unless you try and/or think hard. I somehow thought that extracting one known path out of commit's tree is cheaper than running a diff-tree for the current path between the commit and its parent, but it is not the case. In real, non-toy projects, most commits do not touch the path you are interested in, and if the path is a few levels away from the toplevel, whole-subdirectory comparison logic diff-tree allows us to skip opening lower subdirectories. This commit rewrites find_origin() function to use a single-path diff-tree to see if the parent has the same blob as the current suspect, which is cheaper than extracting the blob information using get_tree_entry() and comparing it with what the current suspect has. This shaves about 6% overhead when annotating kernel/sched.c in the Linux kernel repository on my machine. The saving rises to 25% for arch/i386/kernel/Makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 11:56:33 +02:00			`!strcmp(p->two->path, origin->path)) {`
			`porigin = get_origin(sb, parent, p->one->path);`
			`hashcpy(porigin->blob_sha1, p->one->sha1);`
blame,cat-file --textconv: Don't assume mode is ``S_IFREF \| 0664'' We need to get the correct mode when blame reads the source from the working tree, the index, or trees. This allows us to omit running textconv filters on symbolic links. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Reviewed-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-09-29 13:35:24 +02:00			`porigin->mode = p->one->mode;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`break;`
			`}`
			`}`
			`diff_flush(&diff_opts);`
remove diff_tree_{setup,release}_paths Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-07-14 10:35:58 +02:00			`free_pathspec(&diff_opts.pathspec);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`return porigin;`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`* Append a new blame entry to a given output queue.`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`*/`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`static void add_blame_entry(struct blame_entry **queue, struct blame_entry e)`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`{`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`origin_incref(e->suspect);`

blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`e->next = **queue;`
			`**queue = e;`
			`*queue = &e->next;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* src typically is on-stack; we want to copy the information in it to`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`* a malloced blame_entry that gets added to the given queue. The`
			`* origin of dst loses a refcnt.`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`*/`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`static void dup_entry(struct blame_entry ***queue,`
			`struct blame_entry dst, struct blame_entry src)`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`{`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`origin_incref(src->suspect);`
			`origin_decref(dst->suspect);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`memcpy(dst, src, sizeof(*src));`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`dst->next = **queue;`
			`**queue = dst;`
			`*queue = &dst->next;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

Refactor parse_loc We want to use the same style of -L n,m argument for 'git log -L' as for git-blame. Refactor the argument parsing of the range arguments from builtin/blame.c to the (new) file that will hold the 'git log -L' logic. To accommodate different data structures in blame and log -L, the file contents are abstracted away; parse_range_arg takes a callback that it uses to get the contents of a line of the (notional) file. The new test is for a case that made me pause during debugging: the 'blame -L with invalid end' test was the only one that noticed an outright failure to parse the end at all. So make a more explicit test for that. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-03-28 17:47:30 +01:00			`static const char nth_line(struct scoreboard sb, long lno)`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`{`
			`return sb->final_buf + sb->lineno[lno];`
			`}`

Refactor parse_loc We want to use the same style of -L n,m argument for 'git log -L' as for git-blame. Refactor the argument parsing of the range arguments from builtin/blame.c to the (new) file that will hold the 'git log -L' logic. To accommodate different data structures in blame and log -L, the file contents are abstracted away; parse_range_arg takes a callback that it uses to get the contents of a line of the (notional) file. The new test is for a case that made me pause during debugging: the 'blame -L with invalid end' test was the only one that noticed an outright failure to parse the end at all. So make a more explicit test for that. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-03-28 17:47:30 +01:00			`static const char nth_line_cb(void data, long lno)`
			`{`
			`return nth_line((struct scoreboard *)data, lno);`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* It is known that lines between tlno to same came from parent, and e`
			`* has an overlap with that range. it also is known that parent's`
			`* line plno corresponds to e's line tlno.`
			`*`
			`* <---- e ----->`
			`* <------>`
			`* <------------>`
			`* <------------>`
			`* <------------------>`
			`*`
			`* Split e into potentially three parts; before this chunk, the chunk`
			`* to be blamed for the parent, and after that portion.`
			`*/`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`static void split_overlap(struct blame_entry *split,`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`struct blame_entry *e,`
			`int tlno, int plno, int same,`
			`struct origin *parent)`
			`{`
			`int chunk_end_lno;`
			`memset(split, 0, sizeof(struct blame_entry [3]));`

			`if (e->s_lno < tlno) {`
			`/* there is a pre-chunk part not blamed on parent */`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`split[0].suspect = origin_incref(e->suspect);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`split[0].lno = e->lno;`
			`split[0].s_lno = e->s_lno;`
			`split[0].num_lines = tlno - e->s_lno;`
			`split[1].lno = e->lno + tlno - e->s_lno;`
			`split[1].s_lno = plno;`
			`}`
			`else {`
			`split[1].lno = e->lno;`
			`split[1].s_lno = plno + (e->s_lno - tlno);`
			`}`

			`if (same < e->s_lno + e->num_lines) {`
			`/* there is a post-chunk part not blamed on parent */`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`split[2].suspect = origin_incref(e->suspect);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`split[2].lno = e->lno + (same - e->s_lno);`
			`split[2].s_lno = e->s_lno + (same - e->s_lno);`
			`split[2].num_lines = e->s_lno + e->num_lines - same;`
			`chunk_end_lno = split[2].lno;`
			`}`
			`else`
			`chunk_end_lno = e->lno + e->num_lines;`
			`split[1].num_lines = chunk_end_lno - split[1].lno;`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* if it turns out there is nothing to blame the parent for,`
			`* forget about the splitting. !split[1].suspect signals this.`
			`*/`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`if (split[1].num_lines < 1)`
			`return;`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`split[1].suspect = origin_incref(parent);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* split_overlap() divided an existing blame e into up to three parts`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`* in split. Any assigned blame is moved to queue to`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`* reflect the split.`
			`*/`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`static void split_blame(struct blame_entry ***blamed,`
			`struct blame_entry ***unblamed,`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`struct blame_entry *split,`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`struct blame_entry *e)`
			`{`
			`struct blame_entry *new_entry;`

			`if (split[0].suspect && split[2].suspect) {`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/* The first part (reuse storage for the existing entry e) */`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`dup_entry(unblamed, e, &split[0]);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/* The last part -- me */`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`new_entry = xmalloc(sizeof(*new_entry));`
			`memcpy(new_entry, &(split[2]), sizeof(struct blame_entry));`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`add_blame_entry(unblamed, new_entry);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/* ... and the middle part -- parent */`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`new_entry = xmalloc(sizeof(*new_entry));`
			`memcpy(new_entry, &(split[1]), sizeof(struct blame_entry));`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`add_blame_entry(blamed, new_entry);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
			`else if (!split[0].suspect && !split[2].suspect)`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* The parent covers the entire area; reuse storage for`
			`* e and replace it with the parent.`
			`*/`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`dup_entry(blamed, e, &split[1]);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`else if (split[0].suspect) {`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/* me and then parent */`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`dup_entry(unblamed, e, &split[0]);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
			`new_entry = xmalloc(sizeof(*new_entry));`
			`memcpy(new_entry, &(split[1]), sizeof(struct blame_entry));`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`add_blame_entry(blamed, new_entry);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
			`else {`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/* parent and then me */`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`dup_entry(blamed, e, &split[1]);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
			`new_entry = xmalloc(sizeof(*new_entry));`
			`memcpy(new_entry, &(split[2]), sizeof(struct blame_entry));`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`add_blame_entry(unblamed, new_entry);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* After splitting the blame, the origins used by the`
			`* on-stack blame_entry should lose one refcnt each.`
			`*/`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`static void decref_split(struct blame_entry *split)`
			`{`
			`int i;`

			`for (i = 0; i < 3; i++)`
			`origin_decref(split[i].suspect);`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`* reverse_blame reverses the list given in head, appending tail.`
			`* That allows us to build lists in reverse order, then reverse them`
			`* afterwards. This can be faster than building the list in proper`
			`* order right away. The reason is that building in proper order`
			`* requires writing a link in the _previous_ element, while building`
			`* in reverse order just requires placing the list head into the`
			`* _current_ element.`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`*/`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`static struct blame_entry reverse_blame(struct blame_entry head,`
			`struct blame_entry *tail)`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`{`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`while (head) {`
			`struct blame_entry *next = head->next;`
			`head->next = tail;`
			`tail = head;`
			`head = next;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`return tail;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Process one hunk from the patch between the current suspect for`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`* blame_entry e and its parent. This first blames any unfinished`
			`* entries before the chunk (which is where target and parent start`
			`* differing) on the parent, and then splits blame entries at the`
			`* start and at the end of the difference region. Since use of -M and`
			`* -C options may lead to overlapping/duplicate source line number`
			`* ranges, all we can rely on from sorting/merging is the order of the`
			`* first suspect line number.`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`*/`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`static void blame_chunk(struct blame_entry *dstq, struct blame_entry *srcq,`
			`int tlno, int offset, int same,`
			`struct origin *parent)`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`{`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`struct blame_entry e = *srcq;`
			`struct blame_entry samep = NULL, diffp = NULL;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`while (e && e->s_lno < tlno) {`
			`struct blame_entry *next = e->next;`
			`/*`
			`* current record starts before differing portion. If`
			`* it reaches into it, we need to split it up and`
			`* examine the second part separately.`
			`*/`
			`if (e->s_lno + e->num_lines > tlno) {`
			`/* Move second half to a new record */`
			`int len = tlno - e->s_lno;`
			`struct blame_entry *n = xcalloc(1, sizeof (struct blame_entry));`
			`n->suspect = e->suspect;`
			`n->lno = e->lno + len;`
			`n->s_lno = e->s_lno + len;`
			`n->num_lines = e->num_lines - len;`
			`e->num_lines = len;`
			`e->score = 0;`
			`/* Push new record to diffp */`
			`n->next = diffp;`
			`diffp = n;`
			`} else`
			`origin_decref(e->suspect);`
			`/* Pass blame for everything before the differing`
			`* chunk to the parent */`
			`e->suspect = origin_incref(parent);`
			`e->s_lno += offset;`
			`e->next = samep;`
			`samep = e;`
			`e = next;`
			`}`
			`/*`
			`* As we don't know how much of a common stretch after this`
			`* diff will occur, the currently blamed parts are all that we`
			`* can assign to the parent for now.`
			`*/`

			`if (samep) {`
			`dstq = reverse_blame(samep, dstq);`
			`*dstq = &samep->next;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`/*`
			`* Prepend the split off portions: everything after e starts`
			`* after the blameable portion.`
			`*/`
			`e = reverse_blame(diffp, e);`

			`/*`
			`* Now retain records on the target while parts are different`
			`* from the parent.`
			`*/`
			`samep = NULL;`
			`diffp = NULL;`
			`while (e && e->s_lno < same) {`
			`struct blame_entry *next = e->next;`

			`/*`
			`* If current record extends into sameness, need to split.`
			`*/`
			`if (e->s_lno + e->num_lines > same) {`
			`/*`
			`* Move second half to a new record to be`
			`* processed by later chunks`
			`*/`
			`int len = same - e->s_lno;`
			`struct blame_entry *n = xcalloc(1, sizeof (struct blame_entry));`
			`n->suspect = origin_incref(e->suspect);`
			`n->lno = e->lno + len;`
			`n->s_lno = e->s_lno + len;`
			`n->num_lines = e->num_lines - len;`
			`e->num_lines = len;`
			`e->score = 0;`
			`/* Push new record to samep */`
			`n->next = samep;`
			`samep = n;`
			`}`
			`e->next = diffp;`
			`diffp = e;`
			`e = next;`
			`}`
			`**srcq = reverse_blame(diffp, reverse_blame(samep, e));`
			`/* Move across elements that are in the unblamable portion */`
			`if (diffp)`
			`*srcq = &diffp->next;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

blame: use xdi_diff_hunks(), get rid of struct patch Based on a patch by Brian Downing, this replaces the struct patch based code for blame passing with calls to xdi_diff_hunks(). This way we avoid generating and then parsing patches; we only let the interesting infos be passed to our callbacks instead. This makes blame a bit faster: $ blame="./git blame -M -C -C -p --incremental v1.6.0" # master $ /usr/bin/time $blame Makefile >/dev/null 1.38user 0.14system 0:01.52elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12226minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.66user 0.13system 0:01.80elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12262minor)pagefaults 0swaps # this patch series $ /usr/bin/time $blame Makefile >/dev/null 1.27user 0.12system 0:01.40elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+11836minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.52user 0.12system 0:01.70elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12052minor)pagefaults 0swaps Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-25 15:31:36 +02:00			`struct blame_chunk_cb_data {`
			`struct origin *parent;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`long offset;`
			`struct blame_entry **dstq;`
			`struct blame_entry **srcq;`
blame: use xdi_diff_hunks(), get rid of struct patch Based on a patch by Brian Downing, this replaces the struct patch based code for blame passing with calls to xdi_diff_hunks(). This way we avoid generating and then parsing patches; we only let the interesting infos be passed to our callbacks instead. This makes blame a bit faster: $ blame="./git blame -M -C -C -p --incremental v1.6.0" # master $ /usr/bin/time $blame Makefile >/dev/null 1.38user 0.14system 0:01.52elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12226minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.66user 0.13system 0:01.80elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12262minor)pagefaults 0swaps # this patch series $ /usr/bin/time $blame Makefile >/dev/null 1.27user 0.12system 0:01.40elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+11836minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.52user 0.12system 0:01.70elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12052minor)pagefaults 0swaps Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-25 15:31:36 +02:00			`};`

blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`/* diff chunks are from parent to target */`
blame: use hunk_func(), part 1 Use blame_chunk_cb() directly as hunk_func() callback, without detour through xdi_diff_hunks(). Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-09 22:21:56 +02:00			`static int blame_chunk_cb(long start_a, long count_a,`
			`long start_b, long count_b, void *data)`
blame: use xdi_diff_hunks(), get rid of struct patch Based on a patch by Brian Downing, this replaces the struct patch based code for blame passing with calls to xdi_diff_hunks(). This way we avoid generating and then parsing patches; we only let the interesting infos be passed to our callbacks instead. This makes blame a bit faster: $ blame="./git blame -M -C -C -p --incremental v1.6.0" # master $ /usr/bin/time $blame Makefile >/dev/null 1.38user 0.14system 0:01.52elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12226minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.66user 0.13system 0:01.80elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12262minor)pagefaults 0swaps # this patch series $ /usr/bin/time $blame Makefile >/dev/null 1.27user 0.12system 0:01.40elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+11836minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.52user 0.12system 0:01.70elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12052minor)pagefaults 0swaps Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-25 15:31:36 +02:00			`{`
			`struct blame_chunk_cb_data *d = data;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`if (start_a - start_b != d->offset)`
			`die("internal error in blame::blame_chunk_cb");`
			`blame_chunk(&d->dstq, &d->srcq, start_b, start_a - start_b,`
			`start_b + count_b, d->parent);`
			`d->offset = start_a + count_a - (start_b + count_b);`
blame: use hunk_func(), part 1 Use blame_chunk_cb() directly as hunk_func() callback, without detour through xdi_diff_hunks(). Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-09 22:21:56 +02:00			`return 0;`
blame: use xdi_diff_hunks(), get rid of struct patch Based on a patch by Brian Downing, this replaces the struct patch based code for blame passing with calls to xdi_diff_hunks(). This way we avoid generating and then parsing patches; we only let the interesting infos be passed to our callbacks instead. This makes blame a bit faster: $ blame="./git blame -M -C -C -p --incremental v1.6.0" # master $ /usr/bin/time $blame Makefile >/dev/null 1.38user 0.14system 0:01.52elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12226minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.66user 0.13system 0:01.80elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12262minor)pagefaults 0swaps # this patch series $ /usr/bin/time $blame Makefile >/dev/null 1.27user 0.12system 0:01.40elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+11836minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.52user 0.12system 0:01.70elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12052minor)pagefaults 0swaps Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-25 15:31:36 +02:00			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* We are looking at the origin 'target' and aiming to pass blame`
			`* for the lines it is suspected to its parent. Run diff to find`
			`* which lines came from parent and pass blame for them.`
			`*/`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`static void pass_blame_to_parent(struct scoreboard *sb,`
			`struct origin *target,`
			`struct origin *parent)`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`{`
blame: inline get_patch() Inline get_patch() to its only call site as a preparation for getting rid of struct patch. Also we don't need to check the ptr members because fill_origin_blob() already did, and the caller didn't check for NULL anyway, so drop the test. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-25 15:30:22 +02:00			`mmfile_t file_p, file_o;`
Rewrite dynamic structure initializations to runtime assignment Unfortunately, there are still plenty of production systems with vendor compilers that choke unless all compound declarations can be determined statically at compile time, for example hpux10.20 (I can provide a comprehensive list of our supported platforms that exhibit this problem if necessary). This patch simply breaks apart any compound declarations with dynamic initialisation expressions, and moves the initialisation until after the last declaration in the same block, in all the places necessary to have the offending compilers accept the code. Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-14 11:31:33 +02:00			`struct blame_chunk_cb_data d;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`struct blame_entry *newdest = NULL;`
blame: use hunk_func(), part 1 Use blame_chunk_cb() directly as hunk_func() callback, without detour through xdi_diff_hunks(). Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-09 22:21:56 +02:00
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`if (!target->suspects)`
			`return; /* nothing remains for this target */`

			`d.parent = parent;`
			`d.offset = 0;`
			`d.dstq = &newdest; d.srcq = &target->suspects;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`fill_origin_blob(&sb->revs->diffopt, parent, &file_p);`
			`fill_origin_blob(&sb->revs->diffopt, target, &file_o);`
blame: inline get_patch() Inline get_patch() to its only call site as a preparation for getting rid of struct patch. Also we don't need to check the ptr members because fill_origin_blob() already did, and the caller didn't check for NULL anyway, so drop the test. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-25 15:30:22 +02:00			`num_get_patch++;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
blame: factor out helper for calling xdi_diff() Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-09 22:23:39 +02:00			`diff_hunks(&file_p, &file_o, 0, blame_chunk_cb, &d);`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`/* The rest are the same as the parent */`
			`blame_chunk(&d.dstq, &d.srcq, INT_MAX, d.offset, INT_MAX, parent);`
			`*d.dstq = NULL;`
			`queue_blames(sb, parent, newdest);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`return;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* The lines in blame_entry after splitting blames many times can become`
			`* very small and trivial, and at some point it becomes pointless to`
			`* blame the parents. E.g. "\t\t}\n\t}\n\n" appears everywhere in any`
			`* ordinary C program, and it is not worth to say it was copied from`
			`* totally unrelated file in the parent.`
			`*`
			`* Compute how trivial the lines in the blame_entry are.`
			`*/`
git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 23:51:12 +02:00			`static unsigned ent_score(struct scoreboard sb, struct blame_entry e)`
			`{`
			`unsigned score;`
			`const char cp, ep;`

			`if (e->score)`
			`return e->score;`

git-pickaxe: do not keep commit buffer. We need the commit buffer data while generating the final result, but until then we do not need them. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 08:49:31 +02:00			`score = 1;`
git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 23:51:12 +02:00			`cp = nth_line(sb, e->lno);`
			`ep = nth_line(sb, e->lno + e->num_lines);`
			`while (cp < ep) {`
			`unsigned ch = ((unsigned char )cp);`
			`if (isalnum(ch))`
			`score++;`
			`cp++;`
			`}`
			`e->score = score;`
			`return score;`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* best_so_far[] and this[] are both a split of an existing blame_entry`
			`* that passes blame to the parent. Maintain best_so_far the best split`
			`* so far, by comparing this and best_so_far and copying this into`
			`* bst_so_far as needed.`
			`*/`
git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 23:51:12 +02:00			`static void copy_split_if_better(struct scoreboard *sb,`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`struct blame_entry *best_so_far,`
			`struct blame_entry *this)`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`{`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`int i;`

git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`if (!this[1].suspect)`
			`return;`
git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 23:51:12 +02:00			`if (best_so_far[1].suspect) {`
			`if (ent_score(sb, &this[1]) < ent_score(sb, &best_so_far[1]))`
			`return;`
			`}`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00
			`for (i = 0; i < 3; i++)`
			`origin_incref(this[i].suspect);`
			`decref_split(best_so_far);`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`memcpy(best_so_far, this, sizeof(struct blame_entry [3]));`
			`}`

blame: Notice a wholesale incorporation of an existing file. The -C option to blame tries to find a section of a preimage file by running diff against the lines whose origin is still unknown, and excluding the different parts. The code however did not cover the case where the tail part of the section matched, which we handle for the normal non-move/copy codepath. This breakage was most visible when preimage file matches in its entirety and failed to pass blame in such a case. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-05 18:13:26 +02:00			`/*`
			`* We are looking at a part of the final image represented by`
			`* ent (tlno and same are offset by ent->s_lno).`
			`* tlno is where we are looking at in the final image.`
			`* up to (but not including) same match preimage.`
			`* plno is where we are looking at in the preimage.`
			`*`
			`* <-------------- final image ---------------------->`
			`* <------ent------>`
			`* ^tlno ^same`
			`* <---------preimage----->`
			`* ^plno`
			`*`
			`* All line numbers are 0-based.`
			`*/`
			`static void handle_split(struct scoreboard *sb,`
			`struct blame_entry *ent,`
			`int tlno, int plno, int same,`
			`struct origin *parent,`
			`struct blame_entry *split)`
			`{`
			`if (ent->num_lines <= tlno)`
			`return;`
			`if (tlno < same) {`
			`struct blame_entry this[3];`
			`tlno += ent->s_lno;`
			`same += ent->s_lno;`
			`split_overlap(this, ent, tlno, plno, same, parent);`
			`copy_split_if_better(sb, split, this);`
			`decref_split(this);`
			`}`
			`}`

blame: use xdi_diff_hunks(), get rid of struct patch Based on a patch by Brian Downing, this replaces the struct patch based code for blame passing with calls to xdi_diff_hunks(). This way we avoid generating and then parsing patches; we only let the interesting infos be passed to our callbacks instead. This makes blame a bit faster: $ blame="./git blame -M -C -C -p --incremental v1.6.0" # master $ /usr/bin/time $blame Makefile >/dev/null 1.38user 0.14system 0:01.52elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12226minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.66user 0.13system 0:01.80elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12262minor)pagefaults 0swaps # this patch series $ /usr/bin/time $blame Makefile >/dev/null 1.27user 0.12system 0:01.40elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+11836minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.52user 0.12system 0:01.70elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12052minor)pagefaults 0swaps Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-25 15:31:36 +02:00			`struct handle_split_cb_data {`
			`struct scoreboard *sb;`
			`struct blame_entry *ent;`
			`struct origin *parent;`
			`struct blame_entry *split;`
			`long plno;`
			`long tlno;`
			`};`

blame: use hunk_func(), part 2 Use handle_split_cb() directly as hunk_func() callback, without going through xdi_diff_hunks(). Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-09 22:22:47 +02:00			`static int handle_split_cb(long start_a, long count_a,`
			`long start_b, long count_b, void *data)`
blame: use xdi_diff_hunks(), get rid of struct patch Based on a patch by Brian Downing, this replaces the struct patch based code for blame passing with calls to xdi_diff_hunks(). This way we avoid generating and then parsing patches; we only let the interesting infos be passed to our callbacks instead. This makes blame a bit faster: $ blame="./git blame -M -C -C -p --incremental v1.6.0" # master $ /usr/bin/time $blame Makefile >/dev/null 1.38user 0.14system 0:01.52elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12226minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.66user 0.13system 0:01.80elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12262minor)pagefaults 0swaps # this patch series $ /usr/bin/time $blame Makefile >/dev/null 1.27user 0.12system 0:01.40elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+11836minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.52user 0.12system 0:01.70elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12052minor)pagefaults 0swaps Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-25 15:31:36 +02:00			`{`
			`struct handle_split_cb_data *d = data;`
blame: use hunk_func(), part 2 Use handle_split_cb() directly as hunk_func() callback, without going through xdi_diff_hunks(). Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-09 22:22:47 +02:00			`handle_split(d->sb, d->ent, d->tlno, d->plno, start_b, d->parent,`
			`d->split);`
			`d->plno = start_a + count_a;`
			`d->tlno = start_b + count_b;`
			`return 0;`
blame: use xdi_diff_hunks(), get rid of struct patch Based on a patch by Brian Downing, this replaces the struct patch based code for blame passing with calls to xdi_diff_hunks(). This way we avoid generating and then parsing patches; we only let the interesting infos be passed to our callbacks instead. This makes blame a bit faster: $ blame="./git blame -M -C -C -p --incremental v1.6.0" # master $ /usr/bin/time $blame Makefile >/dev/null 1.38user 0.14system 0:01.52elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12226minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.66user 0.13system 0:01.80elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12262minor)pagefaults 0swaps # this patch series $ /usr/bin/time $blame Makefile >/dev/null 1.27user 0.12system 0:01.40elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+11836minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.52user 0.12system 0:01.70elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12052minor)pagefaults 0swaps Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-25 15:31:36 +02:00			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Find the lines from parent that are the same as ent so that`
			`* we can pass blames to it. file_p has the blob contents for`
			`* the parent.`
			`*/`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`static void find_copy_in_blob(struct scoreboard *sb,`
			`struct blame_entry *ent,`
			`struct origin *parent,`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`struct blame_entry *split,`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`mmfile_t *file_p)`
			`{`
			`const char *cp;`
			`mmfile_t file_o;`
Rewrite dynamic structure initializations to runtime assignment Unfortunately, there are still plenty of production systems with vendor compilers that choke unless all compound declarations can be determined statically at compile time, for example hpux10.20 (I can provide a comprehensive list of our supported platforms that exhibit this problem if necessary). This patch simply breaks apart any compound declarations with dynamic initialisation expressions, and moves the initialisation until after the last declaration in the same block, in all the places necessary to have the offending compilers accept the code. Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-14 11:31:33 +02:00			`struct handle_split_cb_data d;`
blame: use hunk_func(), part 2 Use handle_split_cb() directly as hunk_func() callback, without going through xdi_diff_hunks(). Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-09 22:22:47 +02:00
Rewrite dynamic structure initializations to runtime assignment Unfortunately, there are still plenty of production systems with vendor compilers that choke unless all compound declarations can be determined statically at compile time, for example hpux10.20 (I can provide a comprehensive list of our supported platforms that exhibit this problem if necessary). This patch simply breaks apart any compound declarations with dynamic initialisation expressions, and moves the initialisation until after the last declaration in the same block, in all the places necessary to have the offending compilers accept the code. Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-14 11:31:33 +02:00			`memset(&d, 0, sizeof(d));`
			`d.sb = sb; d.ent = ent; d.parent = parent; d.split = split;`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Prepare mmfile that contains only the lines in ent.`
			`*/`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`cp = nth_line(sb, ent->lno);`
Fix a bunch of pointer declarations (codestyle) Essentially; s/type* /type */ as per the coding guidelines. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-01 11:06:36 +02:00			`file_o.ptr = (char *) cp;`
builtin/blame.c::find_copy_in_blob: no need to scan for region end The region end can be looked up just like its beginning. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-02-22 17:02:47 +01:00			`file_o.size = nth_line(sb, ent->lno + ent->num_lines) - cp;`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00
blame: Notice a wholesale incorporation of an existing file. The -C option to blame tries to find a section of a preimage file by running diff against the lines whose origin is still unknown, and excluding the different parts. The code however did not cover the case where the tail part of the section matched, which we handle for the normal non-move/copy codepath. This breakage was most visible when preimage file matches in its entirety and failed to pass blame in such a case. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-05 18:13:26 +02:00			`/*`
			`* file_o is a part of final image we are annotating.`
			`* file_p partially may match that image.`
			`*/`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`memset(split, 0, sizeof(struct blame_entry [3]));`
blame: factor out helper for calling xdi_diff() Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-09 22:23:39 +02:00			`diff_hunks(file_p, &file_o, 1, handle_split_cb, &d);`
blame: Notice a wholesale incorporation of an existing file. The -C option to blame tries to find a section of a preimage file by running diff against the lines whose origin is still unknown, and excluding the different parts. The code however did not cover the case where the tail part of the section matched, which we handle for the normal non-move/copy codepath. This breakage was most visible when preimage file matches in its entirety and failed to pass blame in such a case. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-05 18:13:26 +02:00			`/* remainder, if any, all match the preimage */`
blame: use xdi_diff_hunks(), get rid of struct patch Based on a patch by Brian Downing, this replaces the struct patch based code for blame passing with calls to xdi_diff_hunks(). This way we avoid generating and then parsing patches; we only let the interesting infos be passed to our callbacks instead. This makes blame a bit faster: $ blame="./git blame -M -C -C -p --incremental v1.6.0" # master $ /usr/bin/time $blame Makefile >/dev/null 1.38user 0.14system 0:01.52elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12226minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.66user 0.13system 0:01.80elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12262minor)pagefaults 0swaps # this patch series $ /usr/bin/time $blame Makefile >/dev/null 1.27user 0.12system 0:01.40elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+11836minor)pagefaults 0swaps $ /usr/bin/time $blame cache.h >/dev/null 1.52user 0.12system 0:01.70elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+12052minor)pagefaults 0swaps Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-25 15:31:36 +02:00			`handle_split(sb, ent, d.tlno, d.plno, ent->num_lines, parent, split);`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`}`

blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`/* Move all blame entries from list *source that have a score smaller`
			`* than score_min to the front of list *small.`
			`* Returns a pointer to the link pointing to the old head of the small list.`
			`*/`

			`static struct blame_entry *filter_small(struct scoreboard sb,`
			`struct blame_entry **small,`
			`struct blame_entry **source,`
			`unsigned score_min)`
			`{`
			`struct blame_entry p = source;`
			`struct blame_entry oldsmall = small;`
			`while (p) {`
			`if (ent_score(sb, p) <= score_min) {`
			`*small = p;`
			`small = &p->next;`
			`p = *small;`
			`} else {`
			`*source = p;`
			`source = &p->next;`
			`p = *source;`
			`}`
			`}`
			`*small = oldsmall;`
			`*source = NULL;`
			`return small;`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* See if lines currently target is suspected for can be attributed to`
			`* parent.`
			`*/`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`static void find_move_in_parent(struct scoreboard *sb,`
			`struct blame_entry ***blamed,`
			`struct blame_entry **toosmall,`
			`struct origin *target,`
			`struct origin *parent)`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`{`
git-pickaxe: do not confuse two origins that are the same. It used to be that we can compare the address of the origin structure to determine if they are the same because they are always registered with scoreboard. After introduction of the loop to try finding the best split, that is not true anymore. The current code has rather serious leaks with origin structure, but more importantly it gets confused when two origins that points at the same commit and same path. We might eventually have to refcount and gc origin, but let's fix the correctness issue first. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 09:41:38 +02:00			`struct blame_entry *e, split[3];`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`struct blame_entry *unblamed = target->suspects;`
			`struct blame_entry *leftover = NULL;`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`mmfile_t file_p;`

blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`if (!unblamed)`
			`return; /* nothing remains for this target */`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`fill_origin_blob(&sb->revs->diffopt, parent, &file_p);`
git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00			`if (!file_p.ptr)`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`return;`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`/* At each iteration, unblamed has a NULL-terminated list of`
			`* entries that have not yet been tested for blame. leftover`
			`* contains the reversed list of entries that have been tested`
			`* without being assignable to the parent.`
			`*/`
			`do {`
			`struct blame_entry **unblamedtail = &unblamed;`
			`struct blame_entry *next;`
			`for (e = unblamed; e; e = next) {`
			`next = e->next;`
git-pickaxe: re-scan the blob after making progress with -M Otherwise we would miss copied lines that are contained in the parts before or after the part that we find after splitting the blame_entry (i.e. split[0] and split[2]). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-04 21:37:02 +01:00			`find_copy_in_blob(sb, e, parent, split, &file_p);`
			`if (split[1].suspect &&`
			`blame_move_score < ent_score(sb, &split[1])) {`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`split_blame(blamed, &unblamedtail, split, e);`
			`} else {`
			`e->next = leftover;`
			`leftover = e;`
git-pickaxe: re-scan the blob after making progress with -M Otherwise we would miss copied lines that are contained in the parts before or after the part that we find after splitting the blame_entry (i.e. split[0] and split[2]). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-04 21:37:02 +01:00			`}`
			`decref_split(split);`
			`}`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`*unblamedtail = NULL;`
			`toosmall = filter_small(sb, toosmall, &unblamed, blame_move_score);`
			`} while (unblamed);`
			`target->suspects = reverse_blame(leftover, NULL);`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`}`

git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00			`struct blame_list {`
			`struct blame_entry *ent;`
			`struct blame_entry split[3];`
			`};`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Count the number of entries the target is suspected for,`
			`* and prepare a list of entry and the best split.`
			`*/`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`static struct blame_list setup_blame_list(struct blame_entry unblamed,`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00			`int *num_ents_p)`
			`{`
			`struct blame_entry *e;`
			`int num_ents, i;`
			`struct blame_list *blame_list = NULL;`

blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`for (e = unblamed, num_ents = 0; e; e = e->next)`
			`num_ents++;`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00			`if (num_ents) {`
			`blame_list = xcalloc(num_ents, sizeof(struct blame_list));`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`for (e = unblamed, i = 0; e; e = e->next)`
			`blame_list[i++].ent = e;`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00			`}`
			`*num_ents_p = num_ents;`
			`return blame_list;`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* For lines target is suspected for, see if we can find code movement`
			`* across file boundary from the parent commit. porigin is the path`
			`* in the parent we already tried.`
			`*/`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`static void find_copy_in_parent(struct scoreboard *sb,`
			`struct blame_entry ***blamed,`
			`struct blame_entry **toosmall,`
			`struct origin *target,`
			`struct commit *parent,`
			`struct origin *porigin,`
			`int opt)`
git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:50:17 +02:00			`{`
			`struct diff_options diff_opts;`
git-pickaxe: swap comparison loop used for -C When assigning blames for code movements across file boundaries, we used to iterate over blame entries (i.e. groups of lines to be blamed) in the outer loop and compared each entry with paths in the parent commit in an inner loop. This meant that we opened the blob data from each path number of times. Reorganize the loop so that we read the same path only once, and compare it against all relevant blame entries. This should perform better, but seems to give mixed results, though. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 12:30:53 +02:00			`int i, j;`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00			`struct blame_list *blame_list;`
git-pickaxe: swap comparison loop used for -C When assigning blames for code movements across file boundaries, we used to iterate over blame entries (i.e. groups of lines to be blamed) in the outer loop and compared each entry with paths in the parent commit in an inner loop. This meant that we opened the blob data from each path number of times. Reorganize the loop so that we read the same path only once, and compare it against all relevant blame entries. This should perform better, but seems to give mixed results, though. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 12:30:53 +02:00			`int num_ents;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`struct blame_entry *unblamed = target->suspects;`
			`struct blame_entry *leftover = NULL;`
git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:50:17 +02:00
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`if (!unblamed)`
			`return; /* nothing remains for this target */`
git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:50:17 +02:00
			`diff_setup(&diff_opts);`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`DIFF_OPT_SET(&diff_opts, RECURSIVE);`
git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:50:17 +02:00			`diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;`

diff_setup_done(): return void diff_setup_done() has historically returned an error code, but lost the last nonzero return in 943d5b7 (allow diff.renamelimit to be set regardless of -M/-C, 2006-08-09). The callers were in a pretty confused state: some actually checked for the return code, and some did not. Let it return void, and patch all callers to take this into account. This conveniently also gets rid of a handful of different(!) error messages that could never be triggered anyway. Note that the function can still die(). Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-03 14:16:24 +02:00			`diff_setup_done(&diff_opts);`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00
			`/* Try "find copies harder" on new path if requested;`
			`* we do not want to use diffcore_rename() actually to`
			`* match things up; find_copies_harder is set only to`
			`* force diff_tree_sha1() to feed all filepairs to diff_queue,`
			`* and this code needs to be after diff_setup_done(), which`
			`* usually makes find-copies-harder imply copy detection.`
			`*/`
blame: -C -C -C When you do this, existing "blame -C -C" would not find that the latter half of the file2 came from the existing file1: ... both file1 and file2 are tracked ... $ cat file1 >>file2 $ git add file1 file2 $ git commit This is because we avoid the expensive find-copies-harder code that makes unchanged file (in this case, file1) as a candidate for copy & paste source when annotating an existing file (file2). The third -C now allows it. However, this obviously makes the process very expensive. We've actually seen this patch before, but I dismissed it because it covers such a narrow (and arguably stupid) corner case. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-06 06:18:57 +02:00			`if ((opt & PICKAXE_BLAME_COPY_HARDEST)`
			`\|\| ((opt & PICKAXE_BLAME_COPY_HARDER)`
			`&& (!porigin \|\| strcmp(target->path, porigin->path))))`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`DIFF_OPT_SET(&diff_opts, FIND_COPIES_HARDER);`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`if (is_null_sha1(target->commit->object.sha1))`
			`do_diff_cache(parent->tree->object.sha1, &diff_opts);`
			`else`
			`diff_tree_sha1(parent->tree->object.sha1,`
			`target->commit->tree->object.sha1,`
			`"", &diff_opts);`
git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:50:17 +02:00
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`if (!DIFF_OPT_TST(&diff_opts, FIND_COPIES_HARDER))`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00			`diffcore_std(&diff_opts);`
git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:50:17 +02:00
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`do {`
			`struct blame_entry **unblamedtail = &unblamed;`
			`blame_list = setup_blame_list(unblamed, &num_ents);`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00
			`for (i = 0; i < diff_queued_diff.nr; i++) {`
			`struct diff_filepair *p = diff_queued_diff.queue[i];`
			`struct origin *norigin;`
			`mmfile_t file_p;`
			`struct blame_entry this[3];`

			`if (!DIFF_FILE_VALID(p->one))`
			`continue; /* does not exist in parent */`
builtin-blame: Fix blame -C -C with submodules. When performing copy detection, git-blame tries to read gitlinks as blobs, which causes it to die. This patch adds a check to skip them. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2008-10-03 18:23:50 +02:00			`if (S_ISGITLINK(p->one->mode))`
			`continue; /* ignore git links */`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00			`if (porigin && !strcmp(p->one->path, porigin->path))`
			`/* find_move already dealt with this path */`
			`continue;`

			`norigin = get_origin(sb, parent, p->one->path);`
			`hashcpy(norigin->blob_sha1, p->one->sha1);`
blame,cat-file --textconv: Don't assume mode is ``S_IFREF \| 0664'' We need to get the correct mode when blame reads the source from the working tree, the index, or trees. This allows us to omit running textconv filters on symbolic links. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Reviewed-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-09-29 13:35:24 +02:00			`norigin->mode = p->one->mode;`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`fill_origin_blob(&sb->revs->diffopt, norigin, &file_p);`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00			`if (!file_p.ptr)`
			`continue;`

			`for (j = 0; j < num_ents; j++) {`
			`find_copy_in_blob(sb, blame_list[j].ent,`
			`norigin, this, &file_p);`
			`copy_split_if_better(sb, blame_list[j].split,`
			`this);`
			`decref_split(this);`
			`}`
			`origin_decref(norigin);`
			`}`
git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:50:17 +02:00
git-pickaxe: swap comparison loop used for -C When assigning blames for code movements across file boundaries, we used to iterate over blame entries (i.e. groups of lines to be blamed) in the outer loop and compared each entry with paths in the parent commit in an inner loop. This meant that we opened the blob data from each path number of times. Reorganize the loop so that we read the same path only once, and compare it against all relevant blame entries. This should perform better, but seems to give mixed results, though. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 12:30:53 +02:00			`for (j = 0; j < num_ents; j++) {`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00			`struct blame_entry *split = blame_list[j].split;`
			`if (split[1].suspect &&`
			`blame_copy_score < ent_score(sb, &split[1])) {`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`split_blame(blamed, &unblamedtail, split,`
			`blame_list[j].ent);`
			`} else {`
			`blame_list[j].ent->next = leftover;`
			`leftover = blame_list[j].ent;`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00			`}`
			`decref_split(split);`
git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:50:17 +02:00			`}`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00			`free(blame_list);`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`*unblamedtail = NULL;`
			`toosmall = filter_small(sb, toosmall, &unblamed, blame_copy_score);`
			`} while (unblamed);`
			`target->suspects = reverse_blame(leftover, NULL);`
git-pickaxe: re-scan the blob after making progress with -C The reason to do this is the same as in the previous change for line copy detection within the same file (-M). Also this fixes -C and -C -C (aka find-copies-harder) logic; in this application we are not interested in the similarity matching diffcore-rename makes, because we are only interested in scanning files that were modified, or in the case of -C -C, scanning all files in the parent and we want to do that ourselves. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 01:39:03 +01:00			`diff_flush(&diff_opts);`
remove diff_tree_{setup,release}_paths Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-07-14 10:35:58 +02:00			`free_pathspec(&diff_opts.pathspec);`
git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:50:17 +02:00			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* The blobs of origin and porigin exactly match, so everything`
git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00			`* origin is suspected for can be blamed on the parent.`
			`*/`
			`static void pass_whole_blame(struct scoreboard *sb,`
			`struct origin origin, struct origin porigin)`
			`{`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`struct blame_entry e, suspects;`
git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00
			`if (!porigin->file.ptr && origin->file.ptr) {`
			`/* Steal its file */`
			`porigin->file = origin->file;`
			`origin->file.ptr = NULL;`
			`}`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`suspects = origin->suspects;`
			`origin->suspects = NULL;`
			`for (e = suspects; e; e = e->next) {`
git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00			`origin_incref(porigin);`
			`origin_decref(e->suspect);`
			`e->suspect = porigin;`
			`}`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`queue_blames(sb, porigin, suspects);`
git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00			`}`

builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`/*`
			`* We pass blame from the current commit to its parents. We keep saying`
			`* "parent" (and "porigin"), but what we mean is to find scapegoat to`
			`* exonerate ourselves.`
			`*/`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`static struct commit_list first_scapegoat(struct rev_info revs, struct commit *commit)`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`{`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`if (!reverse)`
			`return commit->parents;`
			`return lookup_decoration(&revs->children, &commit->object);`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`}`

git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`static int num_scapegoats(struct rev_info revs, struct commit commit)`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`{`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`struct commit_list *l = first_scapegoat(revs, commit);`
use commit_list_count() to count the members of commit_lists Call commit_list_count() instead of open-coding it repeatedly. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-07-17 01:52:09 +02:00			`return commit_list_count(l);`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`}`

blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`/* Distribute collected unsorted blames to the respected sorted lists`
			`* in the various origins.`
			`*/`
			`static void distribute_blame(struct scoreboard sb, struct blame_entry blamed)`
			`{`
			`blamed = blame_sort(blamed, compare_blame_suspect);`
			`while (blamed)`
			`{`
			`struct origin *porigin = blamed->suspect;`
			`struct blame_entry *suspects = NULL;`
			`do {`
			`struct blame_entry *next = blamed->next;`
			`blamed->next = suspects;`
			`suspects = blamed;`
			`blamed = next;`
			`} while (blamed && blamed->suspect == porigin);`
			`suspects = reverse_blame(suspects, NULL);`
			`queue_blames(sb, porigin, suspects);`
			`}`
			`}`

builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`#define MAXSG 16`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`static void pass_blame(struct scoreboard sb, struct origin origin, int opt)`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`{`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`struct rev_info *revs = sb->revs;`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`int i, pass, num_sg;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`struct commit *commit = origin->commit;`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`struct commit_list *sg;`
			`struct origin *sg_buf[MAXSG];`
			`struct origin porigin, *sg_origin = sg_buf;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`struct blame_entry *toosmall = NULL;`
			`struct blame_entry blames, *blametail = &blames;`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`num_sg = num_scapegoats(revs, commit);`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`if (!num_sg)`
			`goto finish;`
			`else if (num_sg < ARRAY_SIZE(sg_buf))`
			`memset(sg_buf, 0, sizeof(sg_buf));`
			`else`
			`sg_origin = xcalloc(num_sg, sizeof(*sg_origin));`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`/*`
			`* The first pass looks for unrenamed path to optimize for`
git-pickaxe: split find_origin() into find_rename() and find_origin(). When a merge adds a new file from the second parent, the earlier code tried to find renames in the first parent before noticing that the vertion from the second parent was added without modification. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-31 02:17:41 +01:00			`* common cases, then we look for renames in the second pass.`
			`*/`
blame: pay attention to --no-follow If you know your history did not have renames, or if you care only about the history after a large rename that happened some time ago, "git blame --no-follow $path" is a way to tell the command not to bother about renames. When you use -C, the lines that came from the renamed file will still be found without the whole-file rename detection, so it is not all that interesting either way, though. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-21 22:52:25 +02:00			`for (pass = 0; pass < 2 - no_whole_file_rename; pass++) {`
git-pickaxe: split find_origin() into find_rename() and find_origin(). When a merge adds a new file from the second parent, the earlier code tried to find renames in the first parent before noticing that the vertion from the second parent was added without modification. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-31 02:17:41 +01:00			`struct origin (find)(struct scoreboard *,`
			`struct commit , struct origin );`
			`find = pass ? find_rename : find_origin;`

git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`for (i = 0, sg = first_scapegoat(revs, commit);`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`i < num_sg && sg;`
			`sg = sg->next, i++) {`
			`struct commit *p = sg->item;`
git-pickaxe: simplify Octopus merges further If more than one parents in an Octopus merge have the same origin, ignore later ones because it would not make any difference in the outcome. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-04 21:20:09 +01:00			`int j, same;`
git-pickaxe: split find_origin() into find_rename() and find_origin(). When a merge adds a new file from the second parent, the earlier code tried to find renames in the first parent before noticing that the vertion from the second parent was added without modification. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-31 02:17:41 +01:00
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`if (sg_origin[i])`
git-pickaxe: split find_origin() into find_rename() and find_origin(). When a merge adds a new file from the second parent, the earlier code tried to find renames in the first parent before noticing that the vertion from the second parent was added without modification. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-31 02:17:41 +01:00			`continue;`
			`if (parse_commit(p))`
			`continue;`
git-pickaxe: cache one already found path per commit. Depending on how bushy the commit DAG is, this saves calls to the internal diff-tree for fork-point commits. For example, annotating Makefile in the kernel repository saves about a third of such diff-tree calls. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-31 10:00:01 +01:00			`porigin = find(sb, p, origin);`
git-pickaxe: split find_origin() into find_rename() and find_origin(). When a merge adds a new file from the second parent, the earlier code tried to find renames in the first parent before noticing that the vertion from the second parent was added without modification. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-31 02:17:41 +01:00			`if (!porigin)`
			`continue;`
			`if (!hashcmp(porigin->blob_sha1, origin->blob_sha1)) {`
git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00			`pass_whole_blame(sb, origin, porigin);`
git-pickaxe: split find_origin() into find_rename() and find_origin(). When a merge adds a new file from the second parent, the earlier code tried to find renames in the first parent before noticing that the vertion from the second parent was added without modification. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-31 02:17:41 +01:00			`origin_decref(porigin);`
			`goto finish;`
			`}`
git-pickaxe: simplify Octopus merges further If more than one parents in an Octopus merge have the same origin, ignore later ones because it would not make any difference in the outcome. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-04 21:20:09 +01:00			`for (j = same = 0; j < i; j++)`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`if (sg_origin[j] &&`
			`!hashcmp(sg_origin[j]->blob_sha1,`
git-pickaxe: simplify Octopus merges further If more than one parents in an Octopus merge have the same origin, ignore later ones because it would not make any difference in the outcome. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-04 21:20:09 +01:00			`porigin->blob_sha1)) {`
			`same = 1;`
			`break;`
			`}`
			`if (!same)`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`sg_origin[i] = porigin;`
git-pickaxe: simplify Octopus merges further If more than one parents in an Octopus merge have the same origin, ignore later ones because it would not make any difference in the outcome. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-04 21:20:09 +01:00			`else`
			`origin_decref(porigin);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
			`}`

git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00			`num_commits++;`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`for (i = 0, sg = first_scapegoat(revs, commit);`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`i < num_sg && sg;`
			`sg = sg->next, i++) {`
			`struct origin *porigin = sg_origin[i];`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`if (!porigin)`
			`continue;`
blame: show "previous" information in --porcelain/--incremental format When the final blame is laid for a line to a <commit, path> pair, it also gives a "previous" information to --porcelain and --incremental output format. It gives the parent commit of the blamed commit, _and_ a path in that parent commit that corresponds to the blamed path --- in short, it is the origin that would have been blamed (or passed blame through) for the line _if_ the blamed commit did not change that line. This unfortunately makes sanity checking of refcount quite complex, so I ripped it out for now. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-06-05 07:58:40 +02:00			`if (!origin->previous) {`
			`origin_incref(porigin);`
			`origin->previous = porigin;`
			`}`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`pass_blame_to_parent(sb, origin, porigin);`
			`if (!origin->suspects)`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`goto finish;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00
			`/*`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`* Optionally find moves in parents' files.`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`*/`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`if (opt & PICKAXE_BLAME_MOVE) {`
			`filter_small(sb, &toosmall, &origin->suspects, blame_move_score);`
			`if (origin->suspects) {`
			`for (i = 0, sg = first_scapegoat(revs, commit);`
			`i < num_sg && sg;`
			`sg = sg->next, i++) {`
			`struct origin *porigin = sg_origin[i];`
			`if (!porigin)`
			`continue;`
			`find_move_in_parent(sb, &blametail, &toosmall, origin, porigin);`
			`if (!origin->suspects)`
			`break;`
			`}`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00			`}`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`}`
git-pickaxe -M: blame line movements within a file. This makes pickaxe more intelligent than the classic blame. A typical example is a change that moves one static C function from lower part of the file to upper part of the same file, because you added a new caller in the middle. The versions in the parent and the child would look like this: parent child A static foo() { B ... C } D A E B F C G D static foo() { ... call foo(); ... E } F H G H With the classic blame algorithm, we can blame lines A B C D E F G and H to the parent. The child is guilty of introducing the line "... call foo();", and the blame is placed on the child. However, the classic blame algorithm fails to notice that the implementation of foo() at the top of the file is not new, and moved from the lower part of the parent. This commit introduces detection of such line movements, and correctly blames the lines that were simply moved in the file to the parent. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:49:30 +02:00
git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:50:17 +02:00			`/*`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`* Optionally find copies from parents' files.`
git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:50:17 +02:00			`*/`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`if (opt & PICKAXE_BLAME_COPY) {`
			`if (blame_copy_score > blame_move_score)`
			`filter_small(sb, &toosmall, &origin->suspects, blame_copy_score);`
			`else if (blame_copy_score < blame_move_score) {`
			`origin->suspects = blame_merge(origin->suspects, toosmall);`
			`toosmall = NULL;`
			`filter_small(sb, &toosmall, &origin->suspects, blame_copy_score);`
			`}`
			`if (!origin->suspects)`
			`goto finish;`

git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`for (i = 0, sg = first_scapegoat(revs, commit);`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`i < num_sg && sg;`
			`sg = sg->next, i++) {`
			`struct origin *porigin = sg_origin[i];`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`find_copy_in_parent(sb, &blametail, &toosmall,`
			`origin, sg->item, porigin, opt);`
			`if (!origin->suspects)`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`goto finish;`
git-pickaxe -C: blame cut-and-pasted lines. This completes the initial round of git-pickaxe. In addition to the detection of line movements we already have, this finds new lines that were created by moving or cutting-and-pasting lines from different files in the parent. With this, git pickaxe -f -n -C v1.4.0 -- revision.c finds that a major part of that file actually came from rev-list.c when Linus split the latter at commit ae563642 and blames them to earlier commits that touch rev-list.c. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 03:50:17 +02:00			`}`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`}`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`finish:`
			`*blametail = NULL;`
			`distribute_blame(sb, blames);`
			`/*`
			`* prepend toosmall to origin->suspects`
			`*`
			`* There is no point in sorting: this ends up on a big`
			`* unsorted list in the caller anyway.`
			`*/`
			`if (toosmall) {`
			`struct blame_entry **tail = &toosmall;`
			`while (*tail)`
			`tail = &(*tail)->next;`
			`*tail = origin->suspects;`
			`origin->suspects = toosmall;`
			`}`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`for (i = 0; i < num_sg; i++) {`
			`if (sg_origin[i]) {`
			`drop_origin_blob(sg_origin[i]);`
			`origin_decref(sg_origin[i]);`
blame: drop blob data after passing blame to the parent We used to keep the blob data for each origin that has any remaining line in the result, but this will get very costly with a huge file that has a deep history. This patch releases the blob after we ran diff between the child rev and its parents. When passing blame from a parent to its parent (i.e. the grandparent), the blob data for the parent may need to be read again, but it should be relatively cheap, thanks to delta-base cache. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-12 01:05:50 +01:00			`}`
			`}`
			`drop_origin_blob(origin);`
builtin-blame.c: allow more than 16 parents This removes the hardcoded 16 parents limit from git-blame by allowing the parent array to be allocated dynamically. As the ultimate objective is not about allowing dodecapus, but about annotating the history upside down, it also renames "parent" in the code to "scapegoat"; the name of the game used to be "pass blame to your parents", but now it is "find a scapegoat to pass blame on". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 09:56:23 +02:00			`if (sg_buf != sg_origin)`
			`free(sg_origin);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Information on commits, used for output.`
			`*/`
standardize brace placement in struct definitions In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char baz; }; Indeed, grepping for 'struct [a-z_] {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-16 08:08:34 +01:00			`struct commit_info {`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`struct strbuf author;`
			`struct strbuf author_mail;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`unsigned long author_time;`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`struct strbuf author_tz;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
			`/* filled only when asked for details */`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`struct strbuf committer;`
			`struct strbuf committer_mail;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`unsigned long committer_time;`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`struct strbuf committer_tz;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`struct strbuf summary;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`};`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Parse author/committer line in the commit object buffer`
			`*/`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`static void get_ac_line(const char inbuf, const char what,`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`struct strbuf name, struct strbuf mail,`
			`unsigned long time, struct strbuf tz)`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`{`
Use split_ident_line to parse author and committer Currently blame.c::get_acline(), pretty.c::pp_user_info() and shortlog.c::insert_one_record() are parsing author name, email, time and tz themselves. Use ident.c::split_ident_line() for better code reuse. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:38 +01:00			`struct ident_split ident;`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`size_t len, maillen, namelen;`
			`char tmp, endp;`
			`const char namebuf, mailbuf;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
			`tmp = strstr(inbuf, what);`
			`if (!tmp)`
			`goto error_out;`
			`tmp += strlen(what);`
			`endp = strchr(tmp, '\n');`
			`if (!endp)`
			`len = strlen(tmp);`
			`else`
			`len = endp - tmp;`
Use split_ident_line to parse author and committer Currently blame.c::get_acline(), pretty.c::pp_user_info() and shortlog.c::insert_one_record() are parsing author name, email, time and tz themselves. Use ident.c::split_ident_line() for better code reuse. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:38 +01:00
			`if (split_ident_line(&ident, tmp, len)) {`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`error_out:`
			`/* Ugh */`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`tmp = "(unknown)";`
			`strbuf_addstr(name, tmp);`
			`strbuf_addstr(mail, tmp);`
			`strbuf_addstr(tz, tmp);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`*time = 0;`
			`return;`
			`}`

Use split_ident_line to parse author and committer Currently blame.c::get_acline(), pretty.c::pp_user_info() and shortlog.c::insert_one_record() are parsing author name, email, time and tz themselves. Use ident.c::split_ident_line() for better code reuse. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:38 +01:00			`namelen = ident.name_end - ident.name_begin;`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`namebuf = ident.name_begin;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`maillen = ident.mail_end - ident.mail_begin;`
			`mailbuf = ident.mail_begin;`
Apply mailmap in git-blame output. This makes git-blame to use the same mailmap used by git-shortlog. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-27 09:42:15 +02:00
blame: handle broken commit headers gracefully split_ident_line() can leave us with the pointers date_begin, date_end, tz_begin and tz_end all set to NULL. Check them before use and supply the same fallback values as in the case of a negative return code from split_ident_line(). The "(unknown)" is not actually shown in the output, though, because it will be converted to a number (zero) eventually. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-17 20:33:54 +02:00			`if (ident.date_begin && ident.date_end)`
			`*time = strtoul(ident.date_begin, NULL, 10);`
			`else`
			`*time = 0;`
Apply mailmap in git-blame output. This makes git-blame to use the same mailmap used by git-shortlog. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-27 09:42:15 +02:00
blame: handle broken commit headers gracefully split_ident_line() can leave us with the pointers date_begin, date_end, tz_begin and tz_end all set to NULL. Check them before use and supply the same fallback values as in the case of a negative return code from split_ident_line(). The "(unknown)" is not actually shown in the output, though, because it will be converted to a number (zero) eventually. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-17 20:33:54 +02:00			`if (ident.tz_begin && ident.tz_end)`
			`strbuf_add(tz, ident.tz_begin, ident.tz_end - ident.tz_begin);`
			`else`
			`strbuf_addstr(tz, "(unknown)");`
Apply mailmap in git-blame output. This makes git-blame to use the same mailmap used by git-shortlog. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-27 09:42:15 +02:00
			`/*`
Change current mailmap usage to do matching on both name and email of author/committer. Signed-off-by: Marius Storm-Olsen <marius@trolltech.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-08 15:34:30 +01:00			`* Now, convert both name and e-mail using mailmap`
Apply mailmap in git-blame output. This makes git-blame to use the same mailmap used by git-shortlog. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-27 09:42:15 +02:00			`*/`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`map_user(&mailmap, &mailbuf, &maillen,`
			`&namebuf, &namelen);`

			`strbuf_addf(mail, "<%.*s>", (int)maillen, mailbuf);`
			`strbuf_add(name, namebuf, namelen);`
			`}`

			`static void commit_info_init(struct commit_info *ci)`
			`{`

			`strbuf_init(&ci->author, 0);`
			`strbuf_init(&ci->author_mail, 0);`
			`strbuf_init(&ci->author_tz, 0);`
			`strbuf_init(&ci->committer, 0);`
			`strbuf_init(&ci->committer_mail, 0);`
			`strbuf_init(&ci->committer_tz, 0);`
			`strbuf_init(&ci->summary, 0);`
			`}`

			`static void commit_info_destroy(struct commit_info *ci)`
			`{`

			`strbuf_release(&ci->author);`
			`strbuf_release(&ci->author_mail);`
			`strbuf_release(&ci->author_tz);`
			`strbuf_release(&ci->committer);`
			`strbuf_release(&ci->committer_mail);`
			`strbuf_release(&ci->committer_tz);`
			`strbuf_release(&ci->summary);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

			`static void get_commit_info(struct commit *commit,`
			`struct commit_info *ret,`
			`int detailed)`
			`{`
			`int len;`
pretty: remove reencode_commit_message() This function has only two callsites, and is a thin wrapper whose usefulness is dubious. When the caller needs to learn the log output encoding, it should be able to do so by directly calling get_log_output_encoding() and calling the underlying logmsg_reencode() with it. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-18 02:12:55 +02:00			`const char subject, encoding;`
logmsg_reencode: return const buffer The return value from logmsg_reencode may be either a newly allocated buffer or a pointer to the existing commit->buffer. We would not want the caller to accidentally free() or modify the latter, so let's mark it as const. We can cast away the constness in logmsg_free, but only once we have determined that it is a free-able buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-10 23:39:30 +02:00			`const char *message;`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00
			`commit_info_init(ret);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
pretty: remove reencode_commit_message() This function has only two callsites, and is a thin wrapper whose usefulness is dubious. When the caller needs to learn the log output encoding, it should be able to do so by directly calling get_log_output_encoding() and calling the underlying logmsg_reencode() with it. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-18 02:12:55 +02:00			`encoding = get_log_output_encoding();`
pretty: save commit encoding from logmsg_reencode if the caller needs it The commit encoding is parsed by logmsg_reencode, there's no need for the caller to re-parse it again. The reencoded message now has the new encoding, not the original one. The caller would need to read commit object again before parsing. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-19 01:08:40 +02:00			`message = logmsg_reencode(commit, NULL, encoding);`
builtin-blame: Reencode commit messages according to git-log rules. Currently git-blame outputs text from the commit messages (e.g. the author name and the summary string) as-is, without even providing any information about the encoding used for the data. It makes interpreting the data in multilingual environment very difficult. This commit changes the blame implementation to recode the messages using the rules used by other commands like git-log. Namely, the target encoding can be specified through the i18n.commitEncoding or i18n.logOutputEncoding options, or directly on the command line using the --encoding parameter. Converting the encoding before output seems to be more friendly to the porcelain tools than simply providing the value of the encoding header, and does not require changing the output format. If anybody needs the old behavior, it is possible to achieve it by specifying --encoding=none. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-21 22:55:57 +02:00			`get_ac_line(message, "\nauthor ",`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`&ret->author, &ret->author_mail,`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`&ret->author_time, &ret->author_tz);`

builtin-blame: Reencode commit messages according to git-log rules. Currently git-blame outputs text from the commit messages (e.g. the author name and the summary string) as-is, without even providing any information about the encoding used for the data. It makes interpreting the data in multilingual environment very difficult. This commit changes the blame implementation to recode the messages using the rules used by other commands like git-log. Namely, the target encoding can be specified through the i18n.commitEncoding or i18n.logOutputEncoding options, or directly on the command line using the --encoding parameter. Converting the encoding before output seems to be more friendly to the porcelain tools than simply providing the value of the encoding header, and does not require changing the output format. If anybody needs the old behavior, it is possible to achieve it by specifying --encoding=none. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-21 22:55:57 +02:00			`if (!detailed) {`
convert logmsg_reencode to get_commit_buffer Like the callsites in the previous commit, logmsg_reencode already falls back to read_sha1_file when necessary. However, I split its conversion out into its own commit because it's a bit more complex. We return either: 1. The original commit->buffer 2. A newly allocated buffer from read_sha1_file 3. A reencoded buffer (based on either 1 or 2 above). while trying to do as few extra reads/allocations as possible. Callers currently free the result with logmsg_free, but we can simplify this by pointing them straight to unuse_commit_buffer. This is a slight layering violation, in that we may be passing a buffer from (3). However, since the end result is to free() anything except (1), which is unlikely to change, and because this makes the interface much simpler, it's a reasonable bending of the rules. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-10 23:41:39 +02:00			`unuse_commit_buffer(commit, message);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`return;`
builtin-blame: Reencode commit messages according to git-log rules. Currently git-blame outputs text from the commit messages (e.g. the author name and the summary string) as-is, without even providing any information about the encoding used for the data. It makes interpreting the data in multilingual environment very difficult. This commit changes the blame implementation to recode the messages using the rules used by other commands like git-log. Namely, the target encoding can be specified through the i18n.commitEncoding or i18n.logOutputEncoding options, or directly on the command line using the --encoding parameter. Converting the encoding before output seems to be more friendly to the porcelain tools than simply providing the value of the encoding header, and does not require changing the output format. If anybody needs the old behavior, it is possible to achieve it by specifying --encoding=none. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-21 22:55:57 +02:00			`}`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
builtin-blame: Reencode commit messages according to git-log rules. Currently git-blame outputs text from the commit messages (e.g. the author name and the summary string) as-is, without even providing any information about the encoding used for the data. It makes interpreting the data in multilingual environment very difficult. This commit changes the blame implementation to recode the messages using the rules used by other commands like git-log. Namely, the target encoding can be specified through the i18n.commitEncoding or i18n.logOutputEncoding options, or directly on the command line using the --encoding parameter. Converting the encoding before output seems to be more friendly to the porcelain tools than simply providing the value of the encoding header, and does not require changing the output format. If anybody needs the old behavior, it is possible to achieve it by specifying --encoding=none. Signed-off-by: Alexander Gavrilov <angavrilov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-10-21 22:55:57 +02:00			`get_ac_line(message, "\ncommitter ",`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`&ret->committer, &ret->committer_mail,`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`&ret->committer_time, &ret->committer_tz);`

blame: use find_commit_subject() instead of custom code Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-07-22 15:18:35 +02:00			`len = find_commit_subject(message, &subject);`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`if (len)`
			`strbuf_add(&ret->summary, subject, len);`
			`else`
			`strbuf_addf(&ret->summary, "(%s)", sha1_to_hex(commit->object.sha1));`

convert logmsg_reencode to get_commit_buffer Like the callsites in the previous commit, logmsg_reencode already falls back to read_sha1_file when necessary. However, I split its conversion out into its own commit because it's a bit more complex. We return either: 1. The original commit->buffer 2. A newly allocated buffer from read_sha1_file 3. A reencoded buffer (based on either 1 or 2 above). while trying to do as few extra reads/allocations as possible. Callers currently free the result with logmsg_free, but we can simplify this by pointing them straight to unuse_commit_buffer. This is a slight layering violation, in that we may be passing a buffer from (3). However, since the end result is to free() anything except (1), which is unlikely to change, and because this makes the interface much simpler, it's a reasonable bending of the rules. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-10 23:41:39 +02:00			`unuse_commit_buffer(commit, message);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* To allow LF and other nonportable characters in pathnames,`
			`* they are c-style quoted as needed.`
			`*/`
git-blame --porcelain: quote filename in c-style when needed. Otherwise a pathname that has funny characters such as LF would screw up the parsing programs of the output. Strictly speaking, this is not backward compatible, but the current output for pathnames that have embedded LF and such cannot be sanely parsed anyway, and pathnames that only use characters from the portable pathname character set won't be affected. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:42:31 +01:00			`static void write_filename_info(const char *path)`
			`{`
			`printf("filename ");`
Full rework of quote_c_style and write_name_quoted. * quote_c_style works on a strbuf instead of a wild buffer. * quote_c_style is now clever enough to not add double quotes if not needed. * write_name_quoted inherits those advantages, but also take a different set of arguments. Now instead of asking for quotes or not, you pass a "terminator". If it's \0 then we assume you don't want to escape, else C escaping is performed. In any case, the terminator is also appended to the stream. It also no longer takes the prefix/prefix_len arguments, as it's seldomly used, and makes some optimizations harder. * write_name_quotedpfx is created to work like write_name_quoted and take the prefix/prefix_len arguments. Thanks to those API changes, diff.c has somehow lost weight, thanks to the removal of functions that were wrappers around the old write_name_quoted trying to give it a semantics like the new one, but performing a lot of allocations for this goal. Now we always write directly to the stream, no intermediate allocation is performed. As a side effect of the refactor in builtin-apply.c, the length of the bar graphs in diffstats are not affected anymore by the fact that the path was clipped. Signed-off-by: Pierre Habouzit <madcoder@debian.org> 2007-09-20 00:42:15 +02:00			`write_name_quoted(path, stdout, '\n');`
git-blame --porcelain: quote filename in c-style when needed. Otherwise a pathname that has funny characters such as LF would screw up the parsing programs of the output. Strictly speaking, this is not backward compatible, but the current output for pathnames that have embedded LF and such cannot be sanely parsed anyway, and pathnames that only use characters from the portable pathname character set won't be affected. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:42:31 +01:00			`}`

git-blame: refactor code to emit "porcelain format" output Both the --porcelain and --incremental format shared the same output format but implemented with two identical codepaths. This merges them into one shared function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-06-05 07:48:42 +02:00			`/*`
			`* Porcelain/Incremental format wants to show a lot of details per`
			`* commit. Instead of repeating this every line, emit it only once,`
blame: refactor porcelain output This is in preparation for adding more porcelain output options. The three changes are: 1. emit_porcelain now receives the format option flags 2. emit_one_suspect_detail takes an optional "repeat" parameter to suppress the "show only once" behavior 3. The code for emitting porcelain suspect is factored into its own function for repeatability. There should be no functional changes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-09 15:34:02 +02:00			`* the first time each commit appears in the output (unless the`
			`* user has specifically asked for us to repeat).`
git-blame: refactor code to emit "porcelain format" output Both the --porcelain and --incremental format shared the same output format but implemented with two identical codepaths. This merges them into one shared function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-06-05 07:48:42 +02:00			`*/`
blame: refactor porcelain output This is in preparation for adding more porcelain output options. The three changes are: 1. emit_porcelain now receives the format option flags 2. emit_one_suspect_detail takes an optional "repeat" parameter to suppress the "show only once" behavior 3. The code for emitting porcelain suspect is factored into its own function for repeatability. There should be no functional changes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-09 15:34:02 +02:00			`static int emit_one_suspect_detail(struct origin *suspect, int repeat)`
git-blame: refactor code to emit "porcelain format" output Both the --porcelain and --incremental format shared the same output format but implemented with two identical codepaths. This merges them into one shared function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-06-05 07:48:42 +02:00			`{`
			`struct commit_info ci;`

blame: refactor porcelain output This is in preparation for adding more porcelain output options. The three changes are: 1. emit_porcelain now receives the format option flags 2. emit_one_suspect_detail takes an optional "repeat" parameter to suppress the "show only once" behavior 3. The code for emitting porcelain suspect is factored into its own function for repeatability. There should be no functional changes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-09 15:34:02 +02:00			`if (!repeat && (suspect->commit->object.flags & METAINFO_SHOWN))`
git-blame: refactor code to emit "porcelain format" output Both the --porcelain and --incremental format shared the same output format but implemented with two identical codepaths. This merges them into one shared function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-06-05 07:48:42 +02:00			`return 0;`

			`suspect->commit->object.flags \|= METAINFO_SHOWN;`
			`get_commit_info(suspect->commit, &ci, 1);`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`printf("author %s\n", ci.author.buf);`
			`printf("author-mail %s\n", ci.author_mail.buf);`
git-blame: refactor code to emit "porcelain format" output Both the --porcelain and --incremental format shared the same output format but implemented with two identical codepaths. This merges them into one shared function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-06-05 07:48:42 +02:00			`printf("author-time %lu\n", ci.author_time);`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`printf("author-tz %s\n", ci.author_tz.buf);`
			`printf("committer %s\n", ci.committer.buf);`
			`printf("committer-mail %s\n", ci.committer_mail.buf);`
git-blame: refactor code to emit "porcelain format" output Both the --porcelain and --incremental format shared the same output format but implemented with two identical codepaths. This merges them into one shared function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-06-05 07:48:42 +02:00			`printf("committer-time %lu\n", ci.committer_time);`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`printf("committer-tz %s\n", ci.committer_tz.buf);`
			`printf("summary %s\n", ci.summary.buf);`
git-blame: refactor code to emit "porcelain format" output Both the --porcelain and --incremental format shared the same output format but implemented with two identical codepaths. This merges them into one shared function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-06-05 07:48:42 +02:00			`if (suspect->commit->object.flags & UNINTERESTING)`
			`printf("boundary\n");`
blame: show "previous" information in --porcelain/--incremental format When the final blame is laid for a line to a <commit, path> pair, it also gives a "previous" information to --porcelain and --incremental output format. It gives the parent commit of the blamed commit, _and_ a path in that parent commit that corresponds to the blamed path --- in short, it is the origin that would have been blamed (or passed blame through) for the line _if_ the blamed commit did not change that line. This unfortunately makes sanity checking of refcount quite complex, so I ripped it out for now. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-06-05 07:58:40 +02:00			`if (suspect->previous) {`
			`struct origin *prev = suspect->previous;`
			`printf("previous %s ", sha1_to_hex(prev->commit->object.sha1));`
			`write_name_quoted(prev->path, stdout, '\n');`
			`}`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00
			`commit_info_destroy(&ci);`

git-blame: refactor code to emit "porcelain format" output Both the --porcelain and --incremental format shared the same output format but implemented with two identical codepaths. This merges them into one shared function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-06-05 07:48:42 +02:00			`return 1;`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`* The blame_entry is found to be guilty for the range.`
			`* Show it in incremental output.`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`*/`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`static void found_guilty_entry(struct blame_entry *ent)`
			`{`
			`if (incremental) {`
			`struct origin *suspect = ent->suspect;`

			`printf("%s %d %d %d\n",`
			`sha1_to_hex(suspect->commit->object.sha1),`
			`ent->s_lno + 1, ent->lno + 1, ent->num_lines);`
blame: refactor porcelain output This is in preparation for adding more porcelain output options. The three changes are: 1. emit_porcelain now receives the format option flags 2. emit_one_suspect_detail takes an optional "repeat" parameter to suppress the "show only once" behavior 3. The code for emitting porcelain suspect is factored into its own function for repeatability. There should be no functional changes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-09 15:34:02 +02:00			`emit_one_suspect_detail(suspect, 0);`
git-blame --porcelain: quote filename in c-style when needed. Otherwise a pathname that has funny characters such as LF would screw up the parsing programs of the output. Strictly speaking, this is not backward compatible, but the current output for pathnames that have embedded LF and such cannot be sanely parsed anyway, and pathnames that only use characters from the portable pathname character set won't be affected. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:42:31 +01:00			`write_filename_info(suspect->path);`
Don't fflush(stdout) when it's not helpful This patch arose from a discussion started by Jim Meyering's patch whose intention was to provide better diagnostics for failed writes. Linus proposed a better way to do things, which also had the added benefit that adding a fflush() to git-log-* operations and incremental git-blame operations could improve interactive respose time feel, at the cost of making things a bit slower when we aren't piping the output to a downstream program. This patch skips the fflush() calls when stdout is a regular file, or if the environment variable GIT_FLUSH is set to "0". This latter can speed up a command such as: GIT_FLUSH=0 strace -c -f -e write time git-rev-list HEAD \| wc -l a tiny amount. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-29 19:40:46 +02:00			`maybe_flush_or_die(stdout, "stdout");`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`}`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`* The main loop -- while we have blobs with lines whose true origin`
			`* is still unknown, pick one blob, and allow its lines to pass blames`
			`* to its parents. */`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`static void assign_blame(struct scoreboard *sb, int opt)`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`{`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`struct rev_info *revs = sb->revs;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`struct commit *commit = prio_queue_get(&sb->commits);`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`while (commit) {`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`struct blame_entry *ent;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`struct origin *suspect = commit->util;`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00
			`/* find one suspect to break down */`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`while (suspect && !suspect->suspects)`
			`suspect = suspect->next;`

			`if (!suspect) {`
			`commit = prio_queue_get(&sb->commits);`
			`continue;`
			`}`

			`assert(commit == suspect->commit);`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* We will use this suspect later in the loop,`
			`* so hold onto it in the meantime.`
			`*/`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`origin_incref(suspect);`
assume parse_commit checks commit->object.parsed The parse_commit function will check the "parsed" flag of the object and do nothing if it is set. There is no need for callers to check the flag themselves, and doing so only clutters the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-10-24 10:53:01 +02:00			`parse_commit(commit);`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`if (reverse \|\|`
			`(!(commit->object.flags & UNINTERESTING) &&`
			`!(revs->max_age != -1 && commit->date < revs->max_age)))`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`pass_blame(sb, suspect, opt);`
			`else {`
			`commit->object.flags \|= UNINTERESTING;`
			`if (commit->object.parsed)`
			`mark_parents_uninteresting(commit);`
			`}`
			`/* treat root commit as boundary */`
			`if (!commit->parents && !show_root)`
			`commit->object.flags \|= UNINTERESTING;`

			`/* Take responsibility for the remaining entries */`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`ent = suspect->suspects;`
			`if (ent) {`
			`suspect->guilty = 1;`
			`for (;;) {`
			`struct blame_entry *next = ent->next;`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`found_guilty_entry(ent);`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`if (next) {`
			`ent = next;`
			`continue;`
			`}`
			`ent->next = sb->ent;`
			`sb->ent = suspect->suspects;`
			`suspect->suspects = NULL;`
			`break;`
			`}`
			`}`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`origin_decref(suspect);`

			`if (DEBUG) /* sanity */`
			`sanity_check_refcnt(sb);`
			`}`
			`}`

			`static const char format_time(unsigned long time, const char tz_str,`
			`int show_raw_time)`
			`{`
blame: fix broken time_buf paddings in relative timestamp Command `git blame --date relative` aligns the date field with a fixed-width (defined by blame_date_width), and if time_str is shorter than that, it adds spaces for padding. But there are two bugs in the following codes: time_len = strlen(time_str); ... memset(time_buf + time_len, ' ', blame_date_width - time_len); 1. The type of blame_date_width is size_t, which is unsigned. If time_len is greater than blame_date_width, the result of "blame_date_width - time_len" will never be a negative number, but a really big positive number, and will cause memory overwrite. This bug can be triggered if either l10n message for function show_date_relative() in date.c is longer than 30 characters, then `git blame --date relative` may exit abnormally. 2. When show blame information with relative time, the UTF-8 characters in time_str will break the alignment of columns after the date field. This is because the time_buf padding with spaces should have a constant display width, not a fixed strlen size. So we should call utf8_strwidth() instead of strlen() for width calibration. Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-21 08:02:03 +02:00			`static struct strbuf time_buf = STRBUF_INIT;`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00
blame: fix broken time_buf paddings in relative timestamp Command `git blame --date relative` aligns the date field with a fixed-width (defined by blame_date_width), and if time_str is shorter than that, it adds spaces for padding. But there are two bugs in the following codes: time_len = strlen(time_str); ... memset(time_buf + time_len, ' ', blame_date_width - time_len); 1. The type of blame_date_width is size_t, which is unsigned. If time_len is greater than blame_date_width, the result of "blame_date_width - time_len" will never be a negative number, but a really big positive number, and will cause memory overwrite. This bug can be triggered if either l10n message for function show_date_relative() in date.c is longer than 30 characters, then `git blame --date relative` may exit abnormally. 2. When show blame information with relative time, the UTF-8 characters in time_str will break the alignment of columns after the date field. This is because the time_buf padding with spaces should have a constant display width, not a fixed strlen size. So we should call utf8_strwidth() instead of strlen() for width calibration. Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-21 08:02:03 +02:00			`strbuf_reset(&time_buf);`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`if (show_raw_time) {`
blame: fix broken time_buf paddings in relative timestamp Command `git blame --date relative` aligns the date field with a fixed-width (defined by blame_date_width), and if time_str is shorter than that, it adds spaces for padding. But there are two bugs in the following codes: time_len = strlen(time_str); ... memset(time_buf + time_len, ' ', blame_date_width - time_len); 1. The type of blame_date_width is size_t, which is unsigned. If time_len is greater than blame_date_width, the result of "blame_date_width - time_len" will never be a negative number, but a really big positive number, and will cause memory overwrite. This bug can be triggered if either l10n message for function show_date_relative() in date.c is longer than 30 characters, then `git blame --date relative` may exit abnormally. 2. When show blame information with relative time, the UTF-8 characters in time_str will break the alignment of columns after the date field. This is because the time_buf padding with spaces should have a constant display width, not a fixed strlen size. So we should call utf8_strwidth() instead of strlen() for width calibration. Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-21 08:02:03 +02:00			`strbuf_addf(&time_buf, "%lu %s", time, tz_str);`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`}`
Make git blame's date output format configurable, like git log Add the following: - git config value blame.date that expects one of the git log date formats (e.g. relative,local,default,iso,...); - git blame command line option --date expects one of the git log date formats; - documentation in blame-options.txt; - git blame uses the appropriate date.c functions and enums to make sense of the date format and provide appropriate data; git blame continues to line up the output columns by padding the date column up to the max width of the chosen date format. The date format for git blame without both blame.date and --date continues to be ISO for backwards compatibility. git annotate ignores the date format specifiers and continues to uses the ISO format, as before. Signed-off-by: Eugene Letuchy <eugene@facebook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-20 23:51:11 +01:00			`else {`
builtin/blame.c: reduce scope of variables Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-01-29 14:33:15 +01:00			`const char *time_str;`
blame: fix broken time_buf paddings in relative timestamp Command `git blame --date relative` aligns the date field with a fixed-width (defined by blame_date_width), and if time_str is shorter than that, it adds spaces for padding. But there are two bugs in the following codes: time_len = strlen(time_str); ... memset(time_buf + time_len, ' ', blame_date_width - time_len); 1. The type of blame_date_width is size_t, which is unsigned. If time_len is greater than blame_date_width, the result of "blame_date_width - time_len" will never be a negative number, but a really big positive number, and will cause memory overwrite. This bug can be triggered if either l10n message for function show_date_relative() in date.c is longer than 30 characters, then `git blame --date relative` may exit abnormally. 2. When show blame information with relative time, the UTF-8 characters in time_str will break the alignment of columns after the date field. This is because the time_buf padding with spaces should have a constant display width, not a fixed strlen size. So we should call utf8_strwidth() instead of strlen() for width calibration. Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-21 08:02:03 +02:00			`size_t time_width;`
builtin/blame.c: reduce scope of variables Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-01-29 14:33:15 +01:00			`int tz;`
Make git blame's date output format configurable, like git log Add the following: - git config value blame.date that expects one of the git log date formats (e.g. relative,local,default,iso,...); - git blame command line option --date expects one of the git log date formats; - documentation in blame-options.txt; - git blame uses the appropriate date.c functions and enums to make sense of the date format and provide appropriate data; git blame continues to line up the output columns by padding the date column up to the max width of the chosen date format. The date format for git blame without both blame.date and --date continues to be ISO for backwards compatibility. git annotate ignores the date format specifiers and continues to uses the ISO format, as before. Signed-off-by: Eugene Letuchy <eugene@facebook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-20 23:51:11 +01:00			`tz = atoi(tz_str);`
			`time_str = show_date(time, tz, blame_date_mode);`
blame: fix broken time_buf paddings in relative timestamp Command `git blame --date relative` aligns the date field with a fixed-width (defined by blame_date_width), and if time_str is shorter than that, it adds spaces for padding. But there are two bugs in the following codes: time_len = strlen(time_str); ... memset(time_buf + time_len, ' ', blame_date_width - time_len); 1. The type of blame_date_width is size_t, which is unsigned. If time_len is greater than blame_date_width, the result of "blame_date_width - time_len" will never be a negative number, but a really big positive number, and will cause memory overwrite. This bug can be triggered if either l10n message for function show_date_relative() in date.c is longer than 30 characters, then `git blame --date relative` may exit abnormally. 2. When show blame information with relative time, the UTF-8 characters in time_str will break the alignment of columns after the date field. This is because the time_buf padding with spaces should have a constant display width, not a fixed strlen size. So we should call utf8_strwidth() instead of strlen() for width calibration. Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-21 08:02:03 +02:00			`strbuf_addstr(&time_buf, time_str);`
			`/*`
			`* Add space paddings to time_buf to display a fixed width`
			`* string, and use time_width for display width calibration.`
			`*/`
			`for (time_width = utf8_strwidth(time_str);`
			`time_width < blame_date_width;`
			`time_width++)`
			`strbuf_addch(&time_buf, ' ');`
Make git blame's date output format configurable, like git log Add the following: - git config value blame.date that expects one of the git log date formats (e.g. relative,local,default,iso,...); - git blame command line option --date expects one of the git log date formats; - documentation in blame-options.txt; - git blame uses the appropriate date.c functions and enums to make sense of the date format and provide appropriate data; git blame continues to line up the output columns by padding the date column up to the max width of the chosen date format. The date format for git blame without both blame.date and --date continues to be ISO for backwards compatibility. git annotate ignores the date format specifiers and continues to uses the ISO format, as before. Signed-off-by: Eugene Letuchy <eugene@facebook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-20 23:51:11 +01:00			`}`
blame: fix broken time_buf paddings in relative timestamp Command `git blame --date relative` aligns the date field with a fixed-width (defined by blame_date_width), and if time_str is shorter than that, it adds spaces for padding. But there are two bugs in the following codes: time_len = strlen(time_str); ... memset(time_buf + time_len, ' ', blame_date_width - time_len); 1. The type of blame_date_width is size_t, which is unsigned. If time_len is greater than blame_date_width, the result of "blame_date_width - time_len" will never be a negative number, but a really big positive number, and will cause memory overwrite. This bug can be triggered if either l10n message for function show_date_relative() in date.c is longer than 30 characters, then `git blame --date relative` may exit abnormally. 2. When show blame information with relative time, the UTF-8 characters in time_str will break the alignment of columns after the date field. This is because the time_buf padding with spaces should have a constant display width, not a fixed strlen size. So we should call utf8_strwidth() instead of strlen() for width calibration. Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-21 08:02:03 +02:00			`return time_buf.buf;`
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`}`

git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`#define OUTPUT_ANNOTATE_COMPAT 001`
			`#define OUTPUT_LONG_OBJECT_NAME 002`
			`#define OUTPUT_RAW_TIMESTAMP 004`
			`#define OUTPUT_PORCELAIN 010`
			`#define OUTPUT_SHOW_NAME 020`
			`#define OUTPUT_SHOW_NUMBER 040`
git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 23:51:12 +02:00			`#define OUTPUT_SHOW_SCORE 0100`
blame -s: suppress author name and time. With this "git blame -b -s HEAD~n..HEAD" becomes a nicer way to review the result of recent changes in context. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-13 00:50:45 +02:00			`#define OUTPUT_NO_AUTHOR 0200`
blame: Add option to show author email instead of name Add a new option -e (or --show-email) to git-blame that will display the author's email instead of name on each line. This option works for both git-blame and git-annotate. Signed-off-by: Kevin Ballard <kevin@sb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-16 08:57:51 +02:00			`#define OUTPUT_SHOW_EMAIL 0400`
blame: add --line-porcelain output format This is just like --porcelain, except that we always output the commit information for each line, not just the first time it is referenced. This can make quick and dirty scripts much easier to write; see the example added to the blame documentation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-09 15:34:42 +02:00			`#define OUTPUT_LINE_PORCELAIN 01000`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
blame: refactor porcelain output This is in preparation for adding more porcelain output options. The three changes are: 1. emit_porcelain now receives the format option flags 2. emit_one_suspect_detail takes an optional "repeat" parameter to suppress the "show only once" behavior 3. The code for emitting porcelain suspect is factored into its own function for repeatability. There should be no functional changes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-09 15:34:02 +02:00			`static void emit_porcelain_details(struct origin *suspect, int repeat)`
			`{`
			`if (emit_one_suspect_detail(suspect, repeat) \|\|`
			`(suspect->commit->object.flags & MORE_THAN_ONE_PATH))`
			`write_filename_info(suspect->path);`
			`}`

			`static void emit_porcelain(struct scoreboard sb, struct blame_entry ent,`
			`int opt)`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`{`
blame: add --line-porcelain output format This is just like --porcelain, except that we always output the commit information for each line, not just the first time it is referenced. This can make quick and dirty scripts much easier to write; see the example added to the blame documentation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-09 15:34:42 +02:00			`int repeat = opt & OUTPUT_LINE_PORCELAIN;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`int cnt;`
			`const char *cp;`
			`struct origin *suspect = ent->suspect;`
			`char hex[41];`

			`strcpy(hex, sha1_to_hex(suspect->commit->object.sha1));`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`printf("%s %d %d %d\n",`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`hex,`
			`ent->s_lno + 1,`
			`ent->lno + 1,`
			`ent->num_lines);`
blame: add --line-porcelain output format This is just like --porcelain, except that we always output the commit information for each line, not just the first time it is referenced. This can make quick and dirty scripts much easier to write; see the example added to the blame documentation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-09 15:34:42 +02:00			`emit_porcelain_details(suspect, repeat);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
			`cp = nth_line(sb, ent->lno);`
			`for (cnt = 0; cnt < ent->num_lines; cnt++) {`
			`char ch;`
blame: add --line-porcelain output format This is just like --porcelain, except that we always output the commit information for each line, not just the first time it is referenced. This can make quick and dirty scripts much easier to write; see the example added to the blame documentation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-09 15:34:42 +02:00			`if (cnt) {`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`printf("%s %d %d\n", hex,`
			`ent->s_lno + 1 + cnt,`
			`ent->lno + 1 + cnt);`
blame: add --line-porcelain output format This is just like --porcelain, except that we always output the commit information for each line, not just the first time it is referenced. This can make quick and dirty scripts much easier to write; see the example added to the blame documentation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-09 15:34:42 +02:00			`if (repeat)`
			`emit_porcelain_details(suspect, 1);`
			`}`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`putchar('\t');`
			`do {`
			`ch = *cp++;`
			`putchar(ch);`
			`} while (ch != '\n' &&`
			`cp < sb->final_buf + sb->final_buf_size);`
			`}`
blame: make sure that the last line ends in an LF This is convenient when parsing multiple the blame of multiple files, for example: git ls-files -z --exclude-standard -- "*.[ch]" \| xargs --null -n 1 git blame -p > output and then analyzing the 'output' file using a seperate script. Currently the parsing is difficult when not all files have a newline at EOF, this patch ensures that even such files have a newline at the end of the blame output. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> CC: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-20 05:06:28 +02:00
			`if (sb->final_buf_size && cp[-1] != '\n')`
			`putchar('\n');`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

			`static void emit_other(struct scoreboard sb, struct blame_entry ent, int opt)`
			`{`
			`int cnt;`
			`const char *cp;`
			`struct origin *suspect = ent->suspect;`
			`struct commit_info ci;`
			`char hex[41];`
			`int show_raw_time = !!(opt & OUTPUT_RAW_TIMESTAMP);`

			`get_commit_info(suspect->commit, &ci, 1);`
			`strcpy(hex, sha1_to_hex(suspect->commit->object.sha1));`

			`cp = nth_line(sb, ent->lno);`
			`for (cnt = 0; cnt < ent->num_lines; cnt++) {`
			`char ch;`
blame: add --abbrev command line option and make it honor core.abbrev If user sets config.abbrev option, use it as if --abbrev was given. This is the default value and user can override different abbrev length by specifying the --abbrev=N command line option. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-04-06 04:20:50 +02:00			`int length = (opt & OUTPUT_LONG_OBJECT_NAME) ? 40 : abbrev;`
git-blame: show lines attributed to boundary commits differently. When blaming with revision ranges, often many lines are attributed to different commits at the boundary, but they are not interesting for the purpose of finding project history during that revision range. This outputs the lines blamed on boundary commits differently. When showing "human format" output, their SHA-1 are shown with '^' prefixed. In "porcelain format", the commit will be shown with an extra attribute line "boundary". Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-02 05:45:45 +01:00
			`if (suspect->commit->object.flags & UNINTERESTING) {`
annotate: fix for cvsserver. git-cvsserver does not want the boundary commits shown any differently. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-06 10:52:04 +01:00			`if (blank_boundary)`
			`memset(hex, ' ', length);`
"blame -c" should be compatible with "annotate" There is no reason to have a separate variable cmd_is_annotate; OUTPUT_ANNOTATE_COMPAT option is supposed to produce the compatibility output, and we should produce the same output even when the command was not invoked as "annotate" but as "blame -c". Noticed by Pasky. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-09-05 09:57:35 +02:00			`else if (!(opt & OUTPUT_ANNOTATE_COMPAT)) {`
blame: -b (blame.blankboundary) and --root (blame.showroot) When blame.blankboundary is set (or -b option is given), commit object names are blanked out in the "human readable" output format for boundary commits. When blame.showroot is not set (or --root is not given), the root commits are treated as boundary commits. The code still attributes the lines to them, but with -b their object names are not shown. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-18 23:04:38 +01:00			`length--;`
			`putchar('^');`
			`}`
git-blame: show lines attributed to boundary commits differently. When blaming with revision ranges, often many lines are attributed to different commits at the boundary, but they are not interesting for the purpose of finding project history during that revision range. This outputs the lines blamed on boundary commits differently. When showing "human format" output, their SHA-1 are shown with '^' prefixed. In "porcelain format", the commit will be shown with an extra attribute line "boundary". Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-02 05:45:45 +01:00			`}`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
git-blame: show lines attributed to boundary commits differently. When blaming with revision ranges, often many lines are attributed to different commits at the boundary, but they are not interesting for the purpose of finding project history during that revision range. This outputs the lines blamed on boundary commits differently. When showing "human format" output, their SHA-1 are shown with '^' prefixed. In "porcelain format", the commit will be shown with an extra attribute line "boundary". Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-02 05:45:45 +01:00			`printf("%.*s", length, hex);`
blame: Add option to show author email instead of name Add a new option -e (or --show-email) to git-blame that will display the author's email instead of name on each line. This option works for both git-blame and git-annotate. Signed-off-by: Kevin Ballard <kevin@sb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-16 08:57:51 +02:00			`if (opt & OUTPUT_ANNOTATE_COMPAT) {`
			`const char *name;`
			`if (opt & OUTPUT_SHOW_EMAIL)`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`name = ci.author_mail.buf;`
blame: Add option to show author email instead of name Add a new option -e (or --show-email) to git-blame that will display the author's email instead of name on each line. This option works for both git-blame and git-annotate. Signed-off-by: Kevin Ballard <kevin@sb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-16 08:57:51 +02:00			`else`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`name = ci.author.buf;`
blame: Add option to show author email instead of name Add a new option -e (or --show-email) to git-blame that will display the author's email instead of name on each line. This option works for both git-blame and git-annotate. Signed-off-by: Kevin Ballard <kevin@sb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-16 08:57:51 +02:00			`printf("\t(%10s\t%10s\t%d)", name,`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`format_time(ci.author_time, ci.author_tz.buf,`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`show_raw_time),`
			`ent->lno + 1 + cnt);`
blame: Add option to show author email instead of name Add a new option -e (or --show-email) to git-blame that will display the author's email instead of name on each line. This option works for both git-blame and git-annotate. Signed-off-by: Kevin Ballard <kevin@sb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-16 08:57:51 +02:00			`} else {`
git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 23:51:12 +02:00			`if (opt & OUTPUT_SHOW_SCORE)`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`printf(" %*d %02d",`
			`max_score_digits, ent->score,`
			`ent->suspect->refcnt);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`if (opt & OUTPUT_SHOW_NAME)`
			`printf(" %-.s", longest_file, longest_file,`
			`suspect->path);`
			`if (opt & OUTPUT_SHOW_NUMBER)`
			`printf(" %*d", max_orig_digits,`
			`ent->s_lno + 1 + cnt);`
blame -s: suppress author name and time. With this "git blame -b -s HEAD~n..HEAD" becomes a nicer way to review the result of recent changes in context. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-13 00:50:45 +02:00
builtin-blame.c: Use utf8_strwidth for author's names git blame misaligns output if a author's name has a differing display width and strlen; for instance, an accented Latin letter that takes two bytes to encode will cause the rest of the line to be shifted to the left by one. To fix this, use utf8_strwidth instead of strlen (and compute the padding ourselves, since printf doesn't know about UTF-8). Signed-off-by: Geoffrey Thomas <geofft@mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-30 10:41:29 +01:00			`if (!(opt & OUTPUT_NO_AUTHOR)) {`
blame: Add option to show author email instead of name Add a new option -e (or --show-email) to git-blame that will display the author's email instead of name on each line. This option works for both git-blame and git-annotate. Signed-off-by: Kevin Ballard <kevin@sb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-16 08:57:51 +02:00			`const char *name;`
			`int pad;`
			`if (opt & OUTPUT_SHOW_EMAIL)`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`name = ci.author_mail.buf;`
blame: Add option to show author email instead of name Add a new option -e (or --show-email) to git-blame that will display the author's email instead of name on each line. This option works for both git-blame and git-annotate. Signed-off-by: Kevin Ballard <kevin@sb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-16 08:57:51 +02:00			`else`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`name = ci.author.buf;`
blame: Add option to show author email instead of name Add a new option -e (or --show-email) to git-blame that will display the author's email instead of name on each line. This option works for both git-blame and git-annotate. Signed-off-by: Kevin Ballard <kevin@sb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-16 08:57:51 +02:00			`pad = longest_author - utf8_strwidth(name);`
builtin-blame.c: Use utf8_strwidth for author's names git blame misaligns output if a author's name has a differing display width and strlen; for instance, an accented Latin letter that takes two bytes to encode will cause the rest of the line to be shifted to the left by one. To fix this, use utf8_strwidth instead of strlen (and compute the padding ourselves, since printf doesn't know about UTF-8). Signed-off-by: Geoffrey Thomas <geofft@mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-30 10:41:29 +01:00			`printf(" (%s%*s %10s",`
blame: Add option to show author email instead of name Add a new option -e (or --show-email) to git-blame that will display the author's email instead of name on each line. This option works for both git-blame and git-annotate. Signed-off-by: Kevin Ballard <kevin@sb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-16 08:57:51 +02:00			`name, pad, "",`
blame -s: suppress author name and time. With this "git blame -b -s HEAD~n..HEAD" becomes a nicer way to review the result of recent changes in context. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-13 00:50:45 +02:00			`format_time(ci.author_time,`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`ci.author_tz.buf,`
blame -s: suppress author name and time. With this "git blame -b -s HEAD~n..HEAD" becomes a nicer way to review the result of recent changes in context. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-13 00:50:45 +02:00			`show_raw_time));`
builtin-blame.c: Use utf8_strwidth for author's names git blame misaligns output if a author's name has a differing display width and strlen; for instance, an accented Latin letter that takes two bytes to encode will cause the rest of the line to be shifted to the left by one. To fix this, use utf8_strwidth instead of strlen (and compute the padding ourselves, since printf doesn't know about UTF-8). Signed-off-by: Geoffrey Thomas <geofft@mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-30 10:41:29 +01:00			`}`
blame -s: suppress author name and time. With this "git blame -b -s HEAD~n..HEAD" becomes a nicer way to review the result of recent changes in context. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-13 00:50:45 +02:00			`printf(" %*d) ",`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`max_digits, ent->lno + 1 + cnt);`
			`}`
			`do {`
			`ch = *cp++;`
			`putchar(ch);`
			`} while (ch != '\n' &&`
			`cp < sb->final_buf + sb->final_buf_size);`
			`}`
blame: make sure that the last line ends in an LF This is convenient when parsing multiple the blame of multiple files, for example: git ls-files -z --exclude-standard -- "*.[ch]" \| xargs --null -n 1 git blame -p > output and then analyzing the 'output' file using a seperate script. Currently the parsing is difficult when not all files have a newline at EOF, this patch ensures that even such files have a newline at the end of the blame output. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> CC: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-20 05:06:28 +02:00
			`if (sb->final_buf_size && cp[-1] != '\n')`
			`putchar('\n');`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00
			`commit_info_destroy(&ci);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

			`static void output(struct scoreboard *sb, int option)`
			`{`
			`struct blame_entry *ent;`

			`if (option & OUTPUT_PORCELAIN) {`
			`for (ent = sb->ent; ent; ent = ent->next) {`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`int count = 0;`
			`struct origin *suspect;`
			`struct commit *commit = ent->suspect->commit;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`if (commit->object.flags & MORE_THAN_ONE_PATH)`
			`continue;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`for (suspect = commit->util; suspect; suspect = suspect->next) {`
			`if (suspect->guilty && count++) {`
			`commit->object.flags \|= MORE_THAN_ONE_PATH;`
			`break;`
			`}`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
			`}`
			`}`

			`for (ent = sb->ent; ent; ent = ent->next) {`
			`if (option & OUTPUT_PORCELAIN)`
blame: refactor porcelain output This is in preparation for adding more porcelain output options. The three changes are: 1. emit_porcelain now receives the format option flags 2. emit_one_suspect_detail takes an optional "repeat" parameter to suppress the "show only once" behavior 3. The code for emitting porcelain suspect is factored into its own function for repeatability. There should be no functional changes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-05-09 15:34:02 +02:00			`emit_porcelain(sb, ent, option);`
git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 23:51:12 +02:00			`else {`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`emit_other(sb, ent, option);`
git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 23:51:12 +02:00			`}`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
			`}`

blame: factor out get_next_line() Move the code for finding the start of the next line into a helper function in order to reduce duplication. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-13 21:53:03 +02:00			`static const char get_next_line(const char start, const char *end)`
			`{`
			`const char *nl = memchr(start, '\n', end - start);`
blame: simplify prepare_lines() Changing get_next_line() to return the end pointer instead of NULL in case no newline character is found treats allows us to treat complete and incomplete lines the same, simplifying the code. Switching to counting lines instead of EOLs allows us to start counting at the first character, instead of having to call get_next_line() first. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-13 21:54:59 +02:00			`return nl ? nl + 1 : end;`
blame: factor out get_next_line() Move the code for finding the start of the next line into a helper function in order to reduce duplication. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-13 21:53:03 +02:00			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* To allow quick access to the contents of nth line in the`
			`* final image, prepare an index in the scoreboard.`
			`*/`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`static int prepare_lines(struct scoreboard *sb)`
			`{`
			`const char *buf = sb->final_buf;`
			`unsigned long len = sb->final_buf_size;`
blame.c: prepare_lines should not call xrealloc for every line Making a single preparation run for counting the lines will avoid memory fragmentation. Also, fix the allocated memory size which was wrong when sizeof(int ) != sizeof(int), and would have been too small for sizeof(int ) < sizeof(int), admittedly unlikely. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-02-12 15:27:24 +01:00			`const char *end = buf + len;`
			`const char *p;`
			`int *lineno;`
blame: simplify prepare_lines() Changing get_next_line() to return the end pointer instead of NULL in case no newline character is found treats allows us to treat complete and incomplete lines the same, simplifying the code. Switching to counting lines instead of EOLs allows us to start counting at the first character, instead of having to call get_next_line() first. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-13 21:54:59 +02:00			`int num = 0;`
blame.c: prepare_lines should not call xrealloc for every line Making a single preparation run for counting the lines will avoid memory fragmentation. Also, fix the allocated memory size which was wrong when sizeof(int ) != sizeof(int), and would have been too small for sizeof(int ) < sizeof(int), admittedly unlikely. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-02-12 15:27:24 +01:00
blame: simplify prepare_lines() Changing get_next_line() to return the end pointer instead of NULL in case no newline character is found treats allows us to treat complete and incomplete lines the same, simplifying the code. Switching to counting lines instead of EOLs allows us to start counting at the first character, instead of having to call get_next_line() first. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-13 21:54:59 +02:00			`for (p = buf; p < end; p = get_next_line(p, end))`
blame: factor out get_next_line() Move the code for finding the start of the next line into a helper function in order to reduce duplication. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-13 21:53:03 +02:00			`num++;`
blame.c: prepare_lines should not call xrealloc for every line Making a single preparation run for counting the lines will avoid memory fragmentation. Also, fix the allocated memory size which was wrong when sizeof(int ) != sizeof(int), and would have been too small for sizeof(int ) < sizeof(int), admittedly unlikely. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-02-12 15:27:24 +01:00
blame: simplify prepare_lines() Changing get_next_line() to return the end pointer instead of NULL in case no newline character is found treats allows us to treat complete and incomplete lines the same, simplifying the code. Switching to counting lines instead of EOLs allows us to start counting at the first character, instead of having to call get_next_line() first. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-13 21:54:59 +02:00			`sb->lineno = lineno = xmalloc(sizeof(sb->lineno) (num + 1));`
blame.c: prepare_lines should not call xrealloc for every line Making a single preparation run for counting the lines will avoid memory fragmentation. Also, fix the allocated memory size which was wrong when sizeof(int ) != sizeof(int), and would have been too small for sizeof(int ) < sizeof(int), admittedly unlikely. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-02-12 15:27:24 +01:00
blame: simplify prepare_lines() Changing get_next_line() to return the end pointer instead of NULL in case no newline character is found treats allows us to treat complete and incomplete lines the same, simplifying the code. Switching to counting lines instead of EOLs allows us to start counting at the first character, instead of having to call get_next_line() first. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-13 21:54:59 +02:00			`for (p = buf; p < end; p = get_next_line(p, end))`
blame: factor out get_next_line() Move the code for finding the start of the next line into a helper function in order to reduce duplication. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-13 21:53:03 +02:00			`*lineno++ = p - buf;`
blame.c: prepare_lines should not call xrealloc for every line Making a single preparation run for counting the lines will avoid memory fragmentation. Also, fix the allocated memory size which was wrong when sizeof(int ) != sizeof(int), and would have been too small for sizeof(int ) < sizeof(int), admittedly unlikely. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-02-12 15:27:24 +01:00
blame: simplify prepare_lines() Changing get_next_line() to return the end pointer instead of NULL in case no newline character is found treats allows us to treat complete and incomplete lines the same, simplifying the code. Switching to counting lines instead of EOLs allows us to start counting at the first character, instead of having to call get_next_line() first. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-13 21:54:59 +02:00			`*lineno = len;`
blame.c: prepare_lines should not call xrealloc for every line Making a single preparation run for counting the lines will avoid memory fragmentation. Also, fix the allocated memory size which was wrong when sizeof(int ) != sizeof(int), and would have been too small for sizeof(int ) < sizeof(int), admittedly unlikely. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-02-12 15:27:24 +01:00
blame: simplify prepare_lines() Changing get_next_line() to return the end pointer instead of NULL in case no newline character is found treats allows us to treat complete and incomplete lines the same, simplifying the code. Switching to counting lines instead of EOLs allows us to start counting at the first character, instead of having to call get_next_line() first. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-13 21:54:59 +02:00			`sb->num_lines = num;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`return sb->num_lines;`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Add phony grafts for use with -S; this is primarily to`
Start conforming code to "git subcmd" style User notifications are presented as 'git cmd', and code comments are presented as '"cmd"' or 'git's cmd', rather than 'git-cmd'. Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-30 13:12:53 +02:00			`* support git's cvsserver that wants to give a linear history`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`* to its clients.`
			`*/`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`static int read_ancestry(const char *graft_file)`
			`{`
			`FILE *fp = fopen(graft_file, "r");`
Remove the line length limit for graft files Support for grafts predates Git's strbuf, and hence it is understandable that there was a hard-coded line length limit of 1023 characters (which was chosen a bit awkwardly, given that it is exactly one byte short of aligning with the 41 bytes occupied by a commit name and the following space or new-line character). While regular commit histories hardly win comprehensibility in general if they merge more than twenty-two branches in one go, it is not Git's business to limit grafts in such a way. In this particular developer's case, the use case that requires substantially longer graft lines to be supported is the visualization of the commits' order implied by their changes: commits are considered to have an implicit relationship iff exchanging them in an interactive rebase would result in merge conflicts. Thusly implied branches tend to be very shallow in general, and the resulting thicket of implied branches is usually very wide; It is actually quite common that most of the commits in a topic branch have not even one implied parent, so that a final merge commit has about as many implied parents as there are commits in said branch. [jc: squashed in tests by Jonathan] Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-12-27 21:49:57 +01:00			`struct strbuf buf = STRBUF_INIT;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`if (!fp)`
			`return -1;`
Remove the line length limit for graft files Support for grafts predates Git's strbuf, and hence it is understandable that there was a hard-coded line length limit of 1023 characters (which was chosen a bit awkwardly, given that it is exactly one byte short of aligning with the 41 bytes occupied by a commit name and the following space or new-line character). While regular commit histories hardly win comprehensibility in general if they merge more than twenty-two branches in one go, it is not Git's business to limit grafts in such a way. In this particular developer's case, the use case that requires substantially longer graft lines to be supported is the visualization of the commits' order implied by their changes: commits are considered to have an implicit relationship iff exchanging them in an interactive rebase would result in merge conflicts. Thusly implied branches tend to be very shallow in general, and the resulting thicket of implied branches is usually very wide; It is actually quite common that most of the commits in a topic branch have not even one implied parent, so that a final merge commit has about as many implied parents as there are commits in said branch. [jc: squashed in tests by Jonathan] Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-12-27 21:49:57 +01:00			`while (!strbuf_getwholeline(&buf, fp, '\n')) {`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`/* The format is just "Commit Parent1 Parent2 ...\n" */`
Remove the line length limit for graft files Support for grafts predates Git's strbuf, and hence it is understandable that there was a hard-coded line length limit of 1023 characters (which was chosen a bit awkwardly, given that it is exactly one byte short of aligning with the 41 bytes occupied by a commit name and the following space or new-line character). While regular commit histories hardly win comprehensibility in general if they merge more than twenty-two branches in one go, it is not Git's business to limit grafts in such a way. In this particular developer's case, the use case that requires substantially longer graft lines to be supported is the visualization of the commits' order implied by their changes: commits are considered to have an implicit relationship iff exchanging them in an interactive rebase would result in merge conflicts. Thusly implied branches tend to be very shallow in general, and the resulting thicket of implied branches is usually very wide; It is actually quite common that most of the commits in a topic branch have not even one implied parent, so that a final merge commit has about as many implied parents as there are commits in said branch. [jc: squashed in tests by Jonathan] Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-12-27 21:49:57 +01:00			`struct commit_graft *graft = read_graft_line(buf.buf, buf.len);`
git-annotate: fix -S on graft file with comments. The graft file can contain comment lines and read_graft_line can return NULL for such an input, which should be skipped by the reader. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-10 22:39:01 +01:00			`if (graft)`
			`register_commit_graft(graft, 0);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
			`fclose(fp);`
Remove the line length limit for graft files Support for grafts predates Git's strbuf, and hence it is understandable that there was a hard-coded line length limit of 1023 characters (which was chosen a bit awkwardly, given that it is exactly one byte short of aligning with the 41 bytes occupied by a commit name and the following space or new-line character). While regular commit histories hardly win comprehensibility in general if they merge more than twenty-two branches in one go, it is not Git's business to limit grafts in such a way. In this particular developer's case, the use case that requires substantially longer graft lines to be supported is the visualization of the commits' order implied by their changes: commits are considered to have an implicit relationship iff exchanging them in an interactive rebase would result in merge conflicts. Thusly implied branches tend to be very shallow in general, and the resulting thicket of implied branches is usually very wide; It is actually quite common that most of the commits in a topic branch have not even one implied parent, so that a final merge commit has about as many implied parents as there are commits in said branch. [jc: squashed in tests by Jonathan] Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-12-27 21:49:57 +01:00			`strbuf_release(&buf);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`return 0;`
			`}`

blame: compute abbreviation width that ensures uniqueness Julia Lawall noticed that in linux-next repository the commit object 60d5c9f5 (shown with the default abbreviation width baked into "git blame") in output from $ git blame -L 3675,3675 60d5c9f5b -- \ drivers/staging/brcm80211/brcmfmac/wl_iw.c is no longer unique in the repository, which results in "short SHA1 60d5c9f5 is ambiguous". Compute the minimum abbreviation width that ensures uniqueness when the user did not specify the --abbrev option to avoid this. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-02 09:54:00 +02:00			`static int update_auto_abbrev(int auto_abbrev, struct origin *suspect)`
			`{`
			`const char *uniq = find_unique_abbrev(suspect->commit->object.sha1,`
			`auto_abbrev);`
			`int len = strlen(uniq);`
			`if (auto_abbrev < len)`
			`return len;`
			`return auto_abbrev;`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* How many columns do we need to show line numbers, authors,`
			`* and filenames?`
			`*/`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`static void find_alignment(struct scoreboard sb, int option)`
			`{`
			`int longest_src_lines = 0;`
			`int longest_dst_lines = 0;`
git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 23:51:12 +02:00			`unsigned largest_score = 0;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`struct blame_entry *e;`
blame: compute abbreviation width that ensures uniqueness Julia Lawall noticed that in linux-next repository the commit object 60d5c9f5 (shown with the default abbreviation width baked into "git blame") in output from $ git blame -L 3675,3675 60d5c9f5b -- \ drivers/staging/brcm80211/brcmfmac/wl_iw.c is no longer unique in the repository, which results in "short SHA1 60d5c9f5 is ambiguous". Compute the minimum abbreviation width that ensures uniqueness when the user did not specify the --abbrev option to avoid this. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-02 09:54:00 +02:00			`int compute_auto_abbrev = (abbrev < 0);`
			`int auto_abbrev = default_abbrev;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
			`for (e = sb->ent; e; e = e->next) {`
			`struct origin *suspect = e->suspect;`
			`struct commit_info ci;`
			`int num;`

blame: compute abbreviation width that ensures uniqueness Julia Lawall noticed that in linux-next repository the commit object 60d5c9f5 (shown with the default abbreviation width baked into "git blame") in output from $ git blame -L 3675,3675 60d5c9f5b -- \ drivers/staging/brcm80211/brcmfmac/wl_iw.c is no longer unique in the repository, which results in "short SHA1 60d5c9f5 is ambiguous". Compute the minimum abbreviation width that ensures uniqueness when the user did not specify the --abbrev option to avoid this. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-02 09:54:00 +02:00			`if (compute_auto_abbrev)`
			`auto_abbrev = update_auto_abbrev(auto_abbrev, suspect);`
git blame -C: fix output format tweaks when crossing file boundary. We used to get the case that more than two paths came from the same commit wrong when computing the output width and deciding to turn on --show-name option automatically. When we find that lines that came from a path that is different from what we started digging from, we should always turn --show-name on, and we should count the name length for all files involved. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-29 07:29:18 +01:00			`if (strcmp(suspect->path, sb->path))`
			`*option \|= OUTPUT_SHOW_NAME;`
			`num = strlen(suspect->path);`
			`if (longest_file < num)`
			`longest_file = num;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`if (!(suspect->commit->object.flags & METAINFO_SHOWN)) {`
			`suspect->commit->object.flags \|= METAINFO_SHOWN;`
			`get_commit_info(suspect->commit, &ci, 1);`
blame: Add option to show author email instead of name Add a new option -e (or --show-email) to git-blame that will display the author's email instead of name on each line. This option works for both git-blame and git-annotate. Signed-off-by: Kevin Ballard <kevin@sb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-16 08:57:51 +02:00			`if (*option & OUTPUT_SHOW_EMAIL)`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`num = utf8_strwidth(ci.author_mail.buf);`
blame: Add option to show author email instead of name Add a new option -e (or --show-email) to git-blame that will display the author's email instead of name on each line. This option works for both git-blame and git-annotate. Signed-off-by: Kevin Ballard <kevin@sb.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-16 08:57:51 +02:00			`else`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00			`num = utf8_strwidth(ci.author.buf);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`if (longest_author < num)`
			`longest_author = num;`
			`}`
			`num = e->s_lno + e->num_lines;`
			`if (longest_src_lines < num)`
			`longest_src_lines = num;`
			`num = e->lno + e->num_lines;`
			`if (longest_dst_lines < num)`
			`longest_dst_lines = num;`
git-pickaxe: improve "best match" heuristics Instead of comparing number of lines matched, look at the matched characters and count alnums, so that we do not pass blame on not-so-interesting lines, such as an empty line and a line that is indentation followed by a closing brace. Add an option --score-debug to show the score of each blame_entry while we cook this further on the "next" branch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 23:51:12 +02:00			`if (largest_score < ent_score(sb, e))`
			`largest_score = ent_score(sb, e);`
mailmap: simplify map_user() interface Simplify map_user(), mostly to avoid copies of string buffers. It also simplifies caller functions. map_user() directly receive pointers and length from the commit buffer as mail and name. If mapping of the user and mail can be done, the pointer is updated to a new location. Lengths are also updated if necessary. The caller of map_user() can then copy the new email and name if necessary. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-05 22:26:40 +01:00
			`commit_info_destroy(&ci);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
make lineno_width() from blame reusable for others builtin/blame.c has a helper function to compute how many columns we need to show a line-number, whose implementation is reusable as a more generic helper function to count the number of columns necessary to show any cardinal number. Rename it to decimal_width(), move it to pager.c and export it for use by future callers. Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-02-12 15:16:20 +01:00			`max_orig_digits = decimal_width(longest_src_lines);`
			`max_digits = decimal_width(longest_dst_lines);`
			`max_score_digits = decimal_width(largest_score);`
blame: compute abbreviation width that ensures uniqueness Julia Lawall noticed that in linux-next repository the commit object 60d5c9f5 (shown with the default abbreviation width baked into "git blame") in output from $ git blame -L 3675,3675 60d5c9f5b -- \ drivers/staging/brcm80211/brcmfmac/wl_iw.c is no longer unique in the repository, which results in "short SHA1 60d5c9f5 is ambiguous". Compute the minimum abbreviation width that ensures uniqueness when the user did not specify the --abbrev option to avoid this. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-02 09:54:00 +02:00
			`if (compute_auto_abbrev)`
			`/* one more abbrev length is needed for the boundary commit */`
			`abbrev = auto_abbrev + 1;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* For debugging -- origin is refcounted, and this asserts that`
			`* we do not underflow.`
			`*/`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`static void sanity_check_refcnt(struct scoreboard *sb)`
			`{`
			`int baa = 0;`
			`struct blame_entry *ent;`

			`for (ent = sb->ent; ent; ent = ent->next) {`
git-pickaxe: tighten sanity checks. When compiled for debugging, make sure that refcnt sanity check code detects underflows in origin reference counting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-30 23:27:52 +01:00			`/* Nobody should have zero or negative refcnt */`
git-pickaxe: fix origin refcounting When we introduced the cached origin per commit, we gave up proper garbage collecting because it meant that commits hold onto their cached copy. There is no need to do so. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 04:18:50 +01:00			`if (ent->suspect->refcnt <= 0) {`
			`fprintf(stderr, "%s in %s has negative refcnt %d\n",`
			`ent->suspect->path,`
			`sha1_to_hex(ent->suspect->commit->object.sha1),`
			`ent->suspect->refcnt);`
git-pickaxe: tighten sanity checks. When compiled for debugging, make sure that refcnt sanity check code detects underflows in origin reference counting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-30 23:27:52 +01:00			`baa = 1;`
git-pickaxe: fix origin refcounting When we introduced the cached origin per commit, we gave up proper garbage collecting because it meant that commits hold onto their cached copy. There is no need to do so. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 04:18:50 +01:00			`}`
git-pickaxe: tighten sanity checks. When compiled for debugging, make sure that refcnt sanity check code detects underflows in origin reference counting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-30 23:27:52 +01:00			`}`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`if (baa) {`
			`int opt = 0160;`
			`find_alignment(sb, &opt);`
			`output(sb, opt);`
git-pickaxe: fix origin refcounting When we introduced the cached origin per commit, we gave up proper garbage collecting because it meant that commits hold onto their cached copy. There is no need to do so. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 04:18:50 +01:00			`die("Baa %d!", baa);`
git-pickaxe: WIP to refcount origin structure. The origin structure is allocated for each commit and path while the code traverse down it is copied into different blame entries. To avoid leaks, try refcounting them. This still seems to leak, which I haven't tracked down fully yet. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-29 12:07:40 +01:00			`}`
			`}`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* Used for the command line parsing; check if the path exists`
			`* in the working tree.`
			`*/`
Rename path_list to string_list The name path_list was correct for the first usage of that data structure, but it really is a general-purpose string list. $ perl -i -pe 's/path-list/string-list/g' $(git grep -l path-list) $ perl -i -pe 's/path_list/string_list/g' $(git grep -l path_list) $ git mv path-list.h string-list.h $ git mv path-list.c string-list.c $ perl -i -pe 's/has_path/has_string/g' $(git grep -l has_path) $ perl -i -pe 's/path/string/g' string-list.[ch] $ git mv Documentation/technical/api-path-list.txt \ Documentation/technical/api-string-list.txt $ perl -i -pe 's/strdup_paths/strdup_strings/g' $(git grep -l strdup_paths) ... and then fix all users of string-list to access the member "string" instead of "path". Documentation/technical/api-string-list.txt needed some rewrapping, too. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-21 20:03:49 +02:00			`static int has_string_in_work_tree(const char *path)`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`{`
			`struct stat st;`
			`return !lstat(path, &st);`
			`}`

git-pickaxe: introduce heuristics to avoid "trivial" chunks This adds scoring logic to blame_entry to prevent blames on very trivial chunks (e.g. lots of empty lines, indent followed by a closing brace) from being passed down to unrelated lines in the parent. The current heuristics are quite simple and may need to be tweaked later, but we need to start somewhere. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 00:37:12 +02:00			`static unsigned parse_score(const char *arg)`
			`{`
			`char *end;`
			`unsigned long score = strtoul(arg, &end, 10);`
			`if (*end)`
			`return 0;`
			`return score;`
			`}`

git-pickaxe: work properly in a subdirectory. We forgot to add prefix to the given path. [jc: interestingly enough, Jeff King had the same idea after I pushed mine out to "pu", and his patch was cleaner, so I dropped mine.] Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-02 08:22:49 +01:00			`static const char add_prefix(const char prefix, const char *path)`
			`{`
Make blame accept absolute paths Blame did not always use prefix_path. Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-01 05:07:04 +01:00			`return prefix_path(prefix, prefix ? strlen(prefix) : 0, path);`
git-pickaxe: work properly in a subdirectory. We forgot to add prefix to the given path. [jc: interestingly enough, Jeff King had the same idea after I pushed mine out to "pu", and his patch was cleaner, so I dropped mine.] Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-02 08:22:49 +01:00			`}`

Provide git_config with a callback-data parameter git_config() only had a function parameter, but no callback data parameter. This assumes that all callback functions only modify global variables. With this patch, every callback gets a void * parameter, and it is hoped that this will help the libification effort. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-14 19:46:53 +02:00			`static int git_blame_config(const char var, const char value, void *cb)`
blame: -b (blame.blankboundary) and --root (blame.showroot) When blame.blankboundary is set (or -b option is given), commit object names are blanked out in the "human readable" output format for boundary commits. When blame.showroot is not set (or --root is not given), the root commits are treated as boundary commits. The code still attributes the lines to them, but with -b their object names are not shown. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-18 23:04:38 +01:00			`{`
			`if (!strcmp(var, "blame.showroot")) {`
			`show_root = git_config_bool(var, value);`
			`return 0;`
			`}`
			`if (!strcmp(var, "blame.blankboundary")) {`
			`blank_boundary = git_config_bool(var, value);`
			`return 0;`
			`}`
Make git blame's date output format configurable, like git log Add the following: - git config value blame.date that expects one of the git log date formats (e.g. relative,local,default,iso,...); - git blame command line option --date expects one of the git log date formats; - documentation in blame-options.txt; - git blame uses the appropriate date.c functions and enums to make sense of the date format and provide appropriate data; git blame continues to line up the output columns by padding the date column up to the max width of the chosen date format. The date format for git blame without both blame.date and --date continues to be ISO for backwards compatibility. git annotate ignores the date format specifiers and continues to uses the ISO format, as before. Signed-off-by: Eugene Letuchy <eugene@facebook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-20 23:51:11 +01:00			`if (!strcmp(var, "blame.date")) {`
			`if (!value)`
			`return config_error_nonbool(var);`
			`blame_date_mode = parse_date_format(value);`
			`return 0;`
			`}`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00
drop odd return value semantics from userdiff_config When the userdiff_config function was introduced in be58e70 (diff: unify external diff and funcname parsing code, 2008-10-05), it used a return value convention unlike any other config callback. Like other callbacks, it used "-1" to signal error. But it returned "1" to indicate that it found something, and "0" otherwise; other callbacks simply returned "0" to indicate that no error occurred. This distinction was necessary at the time, because the userdiff namespace overlapped slightly with the color configuration namespace. So "diff.color.foo" could mean "the 'foo' slot of diff coloring" or "the 'foo' component of the "color" userdiff driver". Because the color-parsing code would die on an unknown color slot, we needed the userdiff code to indicate that it had matched the variable, letting us bypass the color-parsing code entirely. Later, in 8b8e862 (ignore unknown color configuration, 2009-12-12), the color-parsing code learned to silently ignore unknown slots. This means we no longer need to protect userdiff-matched variables from reaching the color-parsing code. We can therefore change the userdiff_config calling convention to a more normal one. This drops some code from each caller, which is nice. But more importantly, it reduces the cognitive load for readers who may wonder why userdiff_config is unlike every other config callback. There's no need to add a new test confirming that this works; t4020 already contains a test that sets diff.color.external. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-02-07 19:23:02 +01:00			`if (userdiff_config(var, value) < 0)`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`return -1;`

Provide git_config with a callback-data parameter git_config() only had a function parameter, but no callback data parameter. This assumes that all callback functions only modify global variables. With this patch, every callback gets a void * parameter, and it is hoped that this will help the libification effort. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-14 19:46:53 +02:00			`return git_default_config(var, value, cb);`
blame: -b (blame.blankboundary) and --root (blame.showroot) When blame.blankboundary is set (or -b option is given), commit object names are blanked out in the "human readable" output format for boundary commits. When blame.showroot is not set (or --root is not given), the root commits are treated as boundary commits. The code still attributes the lines to them, but with -b their object names are not shown. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-18 23:04:38 +01:00			`}`

blame: allow "blame file" in the middle of a conflicted merge "git blame file" has always meant "find the origin of each line of the file in the history leading to HEAD, oh by the way, blame the lines that are modified locally to the working tree". This teaches "git blame" that during a conflicted merge, some uncommitted changes may have come from the other history that is being merged. The verify_working_tree_path() function introduced in the previous patch to notice a typo in the filename (primarily on case insensitive filesystems) has been updated to allow a filename that does not exist in HEAD (i.e. the tip of our history) as long as it exists one of the commits being merged, so that a "we deleted, the other side modified" case tracks the history of the file in the history of the other side. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-11 23:30:03 +02:00			`static void verify_working_tree_path(struct commit work_tree, const char path)`
blame $path: avoid getting fooled by case insensitive filesystems "git blame MAKEFILE" run in a history that has "Makefile" but not MAKEFILE can get confused on a case insensitive filesystem, because the check we run to see if there is a corresponding file in the working tree with lstat("MAKEFILE") succeeds. In addition to that check, we have to make sure that the given path also exists in the commit we start digging history from (i.e. "HEAD"). Note that this reveals the breakage in a test added in cd8ae20 (git-blame shouldn't crash if run in an unmerged tree, 2007-10-18), which expects the entire merge-in-progress path to be blamed to the working tree when it did not exist in our tree. As it is clear in the log message of that commit, the old breakage was that it was causing an internal error and the fix was about avoiding it. Just check that the command does not die an uncontrolled death. For this particular case, the blame should fail, as the history for the file in that contents has not been committed yet at the point in the test. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-11 01:30:20 +02:00			`{`
blame: allow "blame file" in the middle of a conflicted merge "git blame file" has always meant "find the origin of each line of the file in the history leading to HEAD, oh by the way, blame the lines that are modified locally to the working tree". This teaches "git blame" that during a conflicted merge, some uncommitted changes may have come from the other history that is being merged. The verify_working_tree_path() function introduced in the previous patch to notice a typo in the filename (primarily on case insensitive filesystems) has been updated to allow a filename that does not exist in HEAD (i.e. the tip of our history) as long as it exists one of the commits being merged, so that a "we deleted, the other side modified" case tracks the history of the file in the history of the other side. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-11 23:30:03 +02:00			`struct commit_list *parents;`
blame $path: avoid getting fooled by case insensitive filesystems "git blame MAKEFILE" run in a history that has "Makefile" but not MAKEFILE can get confused on a case insensitive filesystem, because the check we run to see if there is a corresponding file in the working tree with lstat("MAKEFILE") succeeds. In addition to that check, we have to make sure that the given path also exists in the commit we start digging history from (i.e. "HEAD"). Note that this reveals the breakage in a test added in cd8ae20 (git-blame shouldn't crash if run in an unmerged tree, 2007-10-18), which expects the entire merge-in-progress path to be blamed to the working tree when it did not exist in our tree. As it is clear in the log message of that commit, the old breakage was that it was causing an internal error and the fix was about avoiding it. Just check that the command does not die an uncontrolled death. For this particular case, the blame should fail, as the history for the file in that contents has not been committed yet at the point in the test. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-11 01:30:20 +02:00
blame: allow "blame file" in the middle of a conflicted merge "git blame file" has always meant "find the origin of each line of the file in the history leading to HEAD, oh by the way, blame the lines that are modified locally to the working tree". This teaches "git blame" that during a conflicted merge, some uncommitted changes may have come from the other history that is being merged. The verify_working_tree_path() function introduced in the previous patch to notice a typo in the filename (primarily on case insensitive filesystems) has been updated to allow a filename that does not exist in HEAD (i.e. the tip of our history) as long as it exists one of the commits being merged, so that a "we deleted, the other side modified" case tracks the history of the file in the history of the other side. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-11 23:30:03 +02:00			`for (parents = work_tree->parents; parents; parents = parents->next) {`
			`const unsigned char *commit_sha1 = parents->item->object.sha1;`
			`unsigned char blob_sha1[20];`
			`unsigned mode;`

			`if (!get_tree_entry(commit_sha1, path, blob_sha1, &mode) &&`
			`sha1_object_info(blob_sha1, NULL) == OBJ_BLOB)`
			`return;`
			`}`
			`die("no such path '%s' in HEAD", path);`
			`}`

			`static struct commit_list append_parent(struct commit_list tail, const unsigned char *sha1)`
			`{`
			`struct commit *parent;`

			`parent = lookup_commit_reference(sha1);`
			`if (!parent)`
			`die("no such commit %s", sha1_to_hex(sha1));`
			`return &commit_list_insert(parent, tail)->next;`
			`}`

			`static void append_merge_parents(struct commit_list **tail)`
			`{`
			`int merge_head;`
			`const char *merge_head_file = git_path("MERGE_HEAD");`
			`struct strbuf line = STRBUF_INIT;`

			`merge_head = open(merge_head_file, O_RDONLY);`
			`if (merge_head < 0) {`
			`if (errno == ENOENT)`
			`return;`
			`die("cannot open '%s' for reading", merge_head_file);`
			`}`

			`while (!strbuf_getwholeline_fd(&line, merge_head, '\n')) {`
			`unsigned char sha1[20];`
			`if (line.len < 40 \|\| get_sha1_hex(line.buf, sha1))`
			`die("unknown line in '%s': %s", merge_head_file, line.buf);`
			`tail = append_parent(tail, sha1);`
			`}`
			`close(merge_head);`
			`strbuf_release(&line);`
blame $path: avoid getting fooled by case insensitive filesystems "git blame MAKEFILE" run in a history that has "Makefile" but not MAKEFILE can get confused on a case insensitive filesystem, because the check we run to see if there is a corresponding file in the working tree with lstat("MAKEFILE") succeeds. In addition to that check, we have to make sure that the given path also exists in the commit we start digging history from (i.e. "HEAD"). Note that this reveals the breakage in a test added in cd8ae20 (git-blame shouldn't crash if run in an unmerged tree, 2007-10-18), which expects the entire merge-in-progress path to be blamed to the working tree when it did not exist in our tree. As it is clear in the log message of that commit, the old breakage was that it was causing an internal error and the fix was about avoiding it. Just check that the command does not die an uncontrolled death. For this particular case, the blame should fail, as the history for the file in that contents has not been committed yet at the point in the test. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-11 01:30:20 +02:00			`}`

commit: record buffer length in cache Most callsites which use the commit buffer try to use the cached version attached to the commit, rather than re-reading from disk. Unfortunately, that interface provides only a pointer to the NUL-terminated buffer, with no indication of the original length. For the most part, this doesn't matter. People do not put NULs in their commit messages, and the log code is happy to treat it all as a NUL-terminated string. However, some code paths do care. For example, when checking signatures, we want to be very careful that we verify all the bytes to avoid malicious trickery. This patch just adds an optional "size" out-pointer to get_commit_buffer and friends. The existing callers all pass NULL (there did not seem to be any obvious sites where we could avoid an immediate strlen() call, though perhaps with some further refactoring we could). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-10 23:44:13 +02:00			`/*`
			`* This isn't as simple as passing sb->buf and sb->len, because we`
			`* want to transfer ownership of the buffer to the commit (so we`
			`* must use detach).`
			`*/`
			`static void set_commit_buffer_from_strbuf(struct commit c, struct strbuf sb)`
			`{`
			`size_t len;`
			`void *buf = strbuf_detach(sb, &len);`
			`set_commit_buffer(c, buf, len);`
			`}`

builtin-blame.c: move prepare_final() into a separate function. After parsing the command line, we have a long loop to compute the commit object to start annotating from. Move the logic to a separate function, so that later patches become easier to read. It also makes fill_origin_blob() return void; the check is always done on !file->ptr, and nobody looks at the return value from the function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`/*`
			`* Prepare a dummy commit that represents the work tree (or staged) item.`
			`* Note that annotating work tree item never works in the reverse.`
			`*/`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`static struct commit fake_working_tree_commit(struct diff_options opt,`
			`const char *path,`
			`const char *contents_from)`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`{`
			`struct commit *commit;`
			`struct origin *origin;`
blame: allow "blame file" in the middle of a conflicted merge "git blame file" has always meant "find the origin of each line of the file in the history leading to HEAD, oh by the way, blame the lines that are modified locally to the working tree". This teaches "git blame" that during a conflicted merge, some uncommitted changes may have come from the other history that is being merged. The verify_working_tree_path() function introduced in the previous patch to notice a typo in the filename (primarily on case insensitive filesystems) has been updated to allow a filename that does not exist in HEAD (i.e. the tip of our history) as long as it exists one of the commits being merged, so that a "we deleted, the other side modified" case tracks the history of the file in the history of the other side. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-11 23:30:03 +02:00			`struct commit_list *parent_tail, parent;`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`unsigned char head_sha1[20];`
Replace calls to strbuf_init(&foo, 0) with STRBUF_INIT initializer Many call sites use strbuf_init(&foo, 0) to initialize local strbuf variable "foo" which has not been accessed since its declaration. These can be replaced with a static initialization using the STRBUF_INIT macro which is just as readable, saves a function call, and takes up fewer lines. Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2008-10-09 21:12:12 +02:00			`struct strbuf buf = STRBUF_INIT;`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`const char *ident;`
			`time_t now;`
			`int size, len;`
			`struct cache_entry *ce;`
			`unsigned mode;`
blame: allow "blame file" in the middle of a conflicted merge "git blame file" has always meant "find the origin of each line of the file in the history leading to HEAD, oh by the way, blame the lines that are modified locally to the working tree". This teaches "git blame" that during a conflicted merge, some uncommitted changes may have come from the other history that is being merged. The verify_working_tree_path() function introduced in the previous patch to notice a typo in the filename (primarily on case insensitive filesystems) has been updated to allow a filename that does not exist in HEAD (i.e. the tip of our history) as long as it exists one of the commits being merged, so that a "we deleted, the other side modified" case tracks the history of the file in the history of the other side. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-11 23:30:03 +02:00			`struct strbuf msg = STRBUF_INIT;`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00
			`time(&now);`
do not create "struct commit" with xcalloc In both blame and merge-recursive, we sometimes create a "fake" commit struct for convenience (e.g., to represent the HEAD state as if we would commit it). By allocating ourselves rather than using alloc_commit_node, we do not properly set the "index" field of the commit. This can produce subtle bugs if we then use commit-slab on the resulting commit, as we will share the "0" index with another commit. We can fix this by using alloc_commit_node() to allocate. Note that we cannot free the result, as it is part of our commit allocator. However, both cases were already leaking the allocated commit anyway, so there's nothing to fix up. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-10 23:39:11 +02:00			`commit = alloc_commit_node();`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`commit->object.parsed = 1;`
			`commit->date = now;`
blame: allow "blame file" in the middle of a conflicted merge "git blame file" has always meant "find the origin of each line of the file in the history leading to HEAD, oh by the way, blame the lines that are modified locally to the working tree". This teaches "git blame" that during a conflicted merge, some uncommitted changes may have come from the other history that is being merged. The verify_working_tree_path() function introduced in the previous patch to notice a typo in the filename (primarily on case insensitive filesystems) has been updated to allow a filename that does not exist in HEAD (i.e. the tip of our history) as long as it exists one of the commits being merged, so that a "we deleted, the other side modified" case tracks the history of the file in the history of the other side. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-11 23:30:03 +02:00			`parent_tail = &commit->parents;`

			`if (!resolve_ref_unsafe("HEAD", head_sha1, 1, NULL))`
			`die("no such ref: HEAD");`

			`parent_tail = append_parent(parent_tail, head_sha1);`
			`append_merge_parents(parent_tail);`
			`verify_working_tree_path(commit, path);`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00
			`origin = make_origin(commit, path);`

blame: allow "blame file" in the middle of a conflicted merge "git blame file" has always meant "find the origin of each line of the file in the history leading to HEAD, oh by the way, blame the lines that are modified locally to the working tree". This teaches "git blame" that during a conflicted merge, some uncommitted changes may have come from the other history that is being merged. The verify_working_tree_path() function introduced in the previous patch to notice a typo in the filename (primarily on case insensitive filesystems) has been updated to allow a filename that does not exist in HEAD (i.e. the tip of our history) as long as it exists one of the commits being merged, so that a "we deleted, the other side modified" case tracks the history of the file in the history of the other side. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-11 23:30:03 +02:00			`ident = fmt_ident("Not Committed Yet", "not.committed.yet", NULL, 0);`
			`strbuf_addstr(&msg, "tree 0000000000000000000000000000000000000000\n");`
			`for (parent = commit->parents; parent; parent = parent->next)`
			`strbuf_addf(&msg, "parent %s\n",`
			`sha1_to_hex(parent->item->object.sha1));`
			`strbuf_addf(&msg,`
			`"author %s\n"`
			`"committer %s\n\n"`
			`"Version of %s from %s\n",`
			`ident, ident, path,`
			`(!contents_from ? path :`
			`(!strcmp(contents_from, "-") ? "standard input" : contents_from)));`
commit: record buffer length in cache Most callsites which use the commit buffer try to use the cached version attached to the commit, rather than re-reading from disk. Unfortunately, that interface provides only a pointer to the NUL-terminated buffer, with no indication of the original length. For the most part, this doesn't matter. People do not put NULs in their commit messages, and the log code is happy to treat it all as a NUL-terminated string. However, some code paths do care. For example, when checking signatures, we want to be very careful that we verify all the bytes to avoid malicious trickery. This patch just adds an optional "size" out-pointer to get_commit_buffer and friends. The existing callers all pass NULL (there did not seem to be any obvious sites where we could avoid an immediate strlen() call, though perhaps with some further refactoring we could). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-10 23:44:13 +02:00			`set_commit_buffer_from_strbuf(commit, &msg);`
blame: allow "blame file" in the middle of a conflicted merge "git blame file" has always meant "find the origin of each line of the file in the history leading to HEAD, oh by the way, blame the lines that are modified locally to the working tree". This teaches "git blame" that during a conflicted merge, some uncommitted changes may have come from the other history that is being merged. The verify_working_tree_path() function introduced in the previous patch to notice a typo in the filename (primarily on case insensitive filesystems) has been updated to allow a filename that does not exist in HEAD (i.e. the tip of our history) as long as it exists one of the commits being merged, so that a "we deleted, the other side modified" case tracks the history of the file in the history of the other side. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-11 23:30:03 +02:00
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`if (!contents_from \|\| strcmp("-", contents_from)) {`
			`struct stat st;`
			`const char *read_from;`
blame.c: Properly initialize strbuf after calling textconv_object(), again 2564aa4 started to initialize buf.alloc, but that should actually be one more byte than the string length due to the trailing \0. Also, do not modify buf.alloc out of the strbuf code. Use the existing strbuf_attach instead. Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 18:33:34 +01:00			`char *buf_ptr;`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`unsigned long buf_len;`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00
			`if (contents_from) {`
			`if (stat(contents_from, &st) < 0)`
Use die_errno() instead of die() when checking syscalls Lots of die() calls did not actually report the kind of error, which can leave the user confused as to the real problem. Use die_errno() where we check a system/library call that sets errno on failure, or one of the following that wrap such calls: Function Passes on error from -------- -------------------- odb_pack_keep open read_ancestry fopen read_in_full xread strbuf_read xread strbuf_read_file open or strbuf_read_file strbuf_readlink readlink write_in_full xwrite Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-27 17:58:47 +02:00			`die_errno("Cannot stat '%s'", contents_from);`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`read_from = contents_from;`
			`}`
			`else {`
			`if (lstat(path, &st) < 0)`
Use die_errno() instead of die() when checking syscalls Lots of die() calls did not actually report the kind of error, which can leave the user confused as to the real problem. Use die_errno() where we check a system/library call that sets errno on failure, or one of the following that wrap such calls: Function Passes on error from -------- -------------------- odb_pack_keep open read_ancestry fopen read_in_full xread strbuf_read xread strbuf_read_file open or strbuf_read_file strbuf_readlink readlink write_in_full xwrite Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-27 17:58:47 +02:00			`die_errno("Cannot lstat '%s'", path);`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`read_from = path;`
			`}`
			`mode = canon_mode(st.st_mode);`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`switch (st.st_mode & S_IFMT) {`
			`case S_IFREG:`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`if (DIFF_OPT_TST(opt, ALLOW_TEXTCONV) &&`
diff: do not use null sha1 as a sentinel value The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-28 17:03:01 +02:00			`textconv_object(read_from, mode, null_sha1, 0, &buf_ptr, &buf_len))`
blame.c: Properly initialize strbuf after calling textconv_object(), again 2564aa4 started to initialize buf.alloc, but that should actually be one more byte than the string length due to the trailing \0. Also, do not modify buf.alloc out of the strbuf code. Use the existing strbuf_attach instead. Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-07 18:33:34 +01:00			`strbuf_attach(&buf, buf_ptr, buf_len, buf_len + 1);`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`else if (strbuf_read_file(&buf, read_from, st.st_size) != st.st_size)`
Use die_errno() instead of die() when checking syscalls Lots of die() calls did not actually report the kind of error, which can leave the user confused as to the real problem. Use die_errno() where we check a system/library call that sets errno on failure, or one of the following that wrap such calls: Function Passes on error from -------- -------------------- odb_pack_keep open read_ancestry fopen read_in_full xread strbuf_read xread strbuf_read_file open or strbuf_read_file strbuf_readlink readlink write_in_full xwrite Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-27 17:58:47 +02:00			`die_errno("cannot open or read '%s'", read_from);`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`break;`
			`case S_IFLNK:`
builtin-blame.c: use strbuf_readlink() When faking a commit out of the work tree contents, use strbuf_readlink() to read the contents of symbolic links. Signed-off-by: Junio C Hamano <gitster@pobox.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-12-17 21:37:53 +01:00			`if (strbuf_readlink(&buf, read_from, st.st_size) < 0)`
Use die_errno() instead of die() when checking syscalls Lots of die() calls did not actually report the kind of error, which can leave the user confused as to the real problem. Use die_errno() where we check a system/library call that sets errno on failure, or one of the following that wrap such calls: Function Passes on error from -------- -------------------- odb_pack_keep open read_ancestry fopen read_in_full xread strbuf_read xread strbuf_read_file open or strbuf_read_file strbuf_readlink readlink write_in_full xwrite Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-27 17:58:47 +02:00			`die_errno("cannot readlink '%s'", read_from);`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`break;`
			`default:`
			`die("unsupported file type %s", read_from);`
			`}`
			`}`
			`else {`
			`/* Reading from stdin */`
			`mode = 0;`
Strbuf API extensions and fixes. * Add strbuf_rtrim to remove trailing spaces. * Add strbuf_insert to insert data at a given position. * Off-by one fix in strbuf_addf: strbuf_avail() does not counts the final \0 so the overflow test for snprintf is the strict comparison. This is not critical as the growth mechanism chosen will always allocate _more_ memory than asked, so the second test will not fail. It's some kind of miracle though. * Add size extension hints for strbuf_init and strbuf_read. If 0, default applies, else: + initial buffer has the given size for strbuf_init. + first growth checks it has at least this size rather than the default 8192. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-10 12:35:04 +02:00			`if (strbuf_read(&buf, 0, 0) < 0)`
Convert existing die(..., strerror(errno)) to die_errno() Change calls to die(..., strerror(errno)) to use the new die_errno(). In the process, also make slight style adjustments: at least state _something_ about the function that failed (instead of just printing the pathname), and put paths in single quotes. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-27 17:58:46 +02:00			`die_errno("failed to read from stdin");`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`}`
Use strbuf API in apply, blame, commit-tree and diff Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-06 13:20:09 +02:00			`origin->file.ptr = buf.buf;`
			`origin->file.size = buf.len;`
			`pretend_sha1_file(buf.buf, buf.len, OBJ_BLOB, origin->blob_sha1);`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00
			`/*`
			`* Read the current index, replace the path entry with`
			`* origin->blob_sha1 without mucking with its mode or type`
			`* bits; we are not going to write this index out -- we just`
			`* want to run "diff-index --cached".`
			`*/`
			`discard_cache();`
			`read_cache();`

			`len = strlen(path);`
			`if (!mode) {`
			`int pos = cache_name_pos(path, len);`
			`if (0 <= pos)`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`mode = active_cache[pos]->ce_mode;`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`else`
			`/* Let's not bother reading from HEAD tree */`
			`mode = S_IFREG \| 0644;`
			`}`
			`size = cache_entry_size(len);`
			`ce = xcalloc(1, size);`
			`hashcpy(ce->sha1, origin->blob_sha1);`
			`memcpy(ce->name, path, len);`
Strip namelen out of ce_flags into a ce_namelen field Strip the name length from the ce_flags field and move it into its own ce_namelen field in struct cache_entry. This will both give us a tiny bit of a performance enhancement when working with long pathnames and is a refactoring for more readability of the code. It enhances readability, by making it more clear what is a flag, and where the length is stored and make it clear which functions use stages in comparisions and which only use the length. It also makes CE_NAMEMASK private, so that users don't mistakenly write the name length in the flags. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-11 11:22:37 +02:00			`ce->ce_flags = create_ce_flags(0);`
			`ce->ce_namelen = len;`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`ce->ce_mode = create_ce_mode(mode);`
			`add_cache_entry(ce, ADD_CACHE_OK_TO_ADD\|ADD_CACHE_OK_TO_REPLACE);`

			`/*`
			`* We are not going to write this out, so this does not matter`
			`* right now, but someday we might optimize diff-index --cached`
			`* with cache-tree information.`
			`*/`
cache-tree: mark istate->cache_changed on cache tree invalidation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-13 14:19:31 +02:00			`cache_tree_invalidate_path(&the_index, path);`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00
			`return commit;`
			`}`

git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`static const char prepare_final(struct scoreboard sb)`
builtin-blame.c: move prepare_final() into a separate function. After parsing the command line, we have a long loop to compute the commit object to start annotating from. Move the logic to a separate function, so that later patches become easier to read. It also makes fill_origin_blob() return void; the check is always done on !file->ptr, and nobody looks at the return value from the function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`{`
			`int i;`
			`const char *final_commit_name = NULL;`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`struct rev_info *revs = sb->revs;`
builtin-blame.c: move prepare_final() into a separate function. After parsing the command line, we have a long loop to compute the commit object to start annotating from. Move the logic to a separate function, so that later patches become easier to read. It also makes fill_origin_blob() return void; the check is always done on !file->ptr, and nobody looks at the return value from the function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00
			`/*`
			`* There must be one and only one positive commit in the`
			`* revs->pending array.`
			`*/`
			`for (i = 0; i < revs->pending.nr; i++) {`
			`struct object *obj = revs->pending.objects[i].item;`
			`if (obj->flags & UNINTERESTING)`
			`continue;`
			`while (obj->type == OBJ_TAG)`
			`obj = deref_tag(obj, NULL, 0);`
			`if (obj->type != OBJ_COMMIT)`
			`die("Non commit %s?", revs->pending.objects[i].name);`
			`if (sb->final)`
			`die("More than one commit to dig from %s and %s?",`
			`revs->pending.objects[i].name,`
			`final_commit_name);`
			`sb->final = (struct commit *) obj;`
			`final_commit_name = revs->pending.objects[i].name;`
			`}`
			`return final_commit_name;`
			`}`

git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`static const char prepare_initial(struct scoreboard sb)`
			`{`
			`int i;`
			`const char *final_commit_name = NULL;`
			`struct rev_info *revs = sb->revs;`

			`/*`
			`* There must be one and only one negative commit, and it must be`
			`* the boundary.`
			`*/`
			`for (i = 0; i < revs->pending.nr; i++) {`
			`struct object *obj = revs->pending.objects[i].item;`
			`if (!(obj->flags & UNINTERESTING))`
			`continue;`
			`while (obj->type == OBJ_TAG)`
			`obj = deref_tag(obj, NULL, 0);`
			`if (obj->type != OBJ_COMMIT)`
			`die("Non commit %s?", revs->pending.objects[i].name);`
			`if (sb->final)`
			`die("More than one commit to dig down to %s and %s?",`
			`revs->pending.objects[i].name,`
			`final_commit_name);`
			`sb->final = (struct commit *) obj;`
			`final_commit_name = revs->pending.objects[i].name;`
			`}`
			`if (!final_commit_name)`
			`die("No commit to dig down to?");`
			`return final_commit_name;`
			`}`

git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00			`static int blame_copy_callback(const struct option option, const char arg, int unset)`
			`{`
			`int *opt = option->value;`

			`/*`
			`* -C enables copy from removed files;`
			`* -C -C enables copy from existing files, but only`
			`* when blaming a new file;`
			`* -C -C -C enables copy from existing files for`
			`* everybody`
			`*/`
			`if (*opt & PICKAXE_BLAME_COPY_HARDER)`
			`*opt \|= PICKAXE_BLAME_COPY_HARDEST;`
			`if (*opt & PICKAXE_BLAME_COPY)`
			`*opt \|= PICKAXE_BLAME_COPY_HARDER;`
			`*opt \|= PICKAXE_BLAME_COPY \| PICKAXE_BLAME_MOVE;`

			`if (arg)`
			`blame_copy_score = parse_score(arg);`
			`return 0;`
			`}`

			`static int blame_move_callback(const struct option option, const char arg, int unset)`
			`{`
			`int *opt = option->value;`

			`*opt \|= PICKAXE_BLAME_MOVE;`

			`if (arg)`
			`blame_move_score = parse_score(arg);`
			`return 0;`
			`}`

git-pickaxe: retire pickaxe Just make it take over blame's place. Documentation and command have all stopped mentioning "git-pickaxe". The built-in synonym is left in the command table, so you can still say "git pickaxe", but it probably is a good idea to retire it as well. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-09 03:47:54 +01:00			`int cmd_blame(int argc, const char *argv, const char prefix)`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`{`
			`struct rev_info revs;`
			`const char *path;`
			`struct scoreboard sb;`
			`struct origin *o;`
blame: accept multiple -L ranges git-blame accepts only a single -L option or none. Clients requiring blame information for multiple disjoint ranges are therefore forced either to invoke git-blame multiple times, once for each range, or only once with no -L option to cover the entire file, both of which can be costly. Teach git-blame to accept multiple -L ranges. Overlapping and out-of-order ranges are accepted. In this patch, the X in -LX,Y is absolute (for instance, /RE/ patterns search from line 1), and Y is relative to X. Follow-up patches provide more flexibility over how X is anchored. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:38 +02:00			`struct blame_entry *ent = NULL;`
			`long dashdash_pos, lno;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`const char *final_commit_name = NULL;`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`enum object_type type;`
git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00
blame: accept multiple -L ranges git-blame accepts only a single -L option or none. Clients requiring blame information for multiple disjoint ranges are therefore forced either to invoke git-blame multiple times, once for each range, or only once with no -L option to cover the entire file, both of which can be costly. Teach git-blame to accept multiple -L ranges. Overlapping and out-of-order ranges are accepted. In this patch, the X in -LX,Y is absolute (for instance, /RE/ patterns search from line 1), and Y is relative to X. Follow-up patches provide more flexibility over how X is anchored. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:38 +02:00			`static struct string_list range_list;`
git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00			`static int output_option = 0, opt = 0;`
			`static int show_stats = 0;`
			`static const char *revs_file = NULL;`
			`static const char *contents_from = NULL;`
			`static const struct option options[] = {`
Replace deprecated OPT_BOOLEAN by OPT_BOOL This task emerged from b04ba2bb (parse-options: deprecate OPT_BOOLEAN, 2011-09-27). All occurrences of the respective variables have been reviewed and none of them relied on the counting up mechanism, but all of them were using the variable as a true boolean. This patch does not change semantics of any command intentionally. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-03 13:51:19 +02:00			`OPT_BOOL(0, "incremental", &incremental, N_("Show blame entries as we find them, incrementally")),`
			`OPT_BOOL('b', NULL, &blank_boundary, N_("Show blank SHA-1 for boundary commits (Default: off)")),`
			`OPT_BOOL(0, "root", &show_root, N_("Do not treat root commits as boundaries (Default: off)")),`
			`OPT_BOOL(0, "show-stats", &show_stats, N_("Show work cost statistics")),`
i18n: blame: mark parseopt strings for translation Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-20 14:31:54 +02:00			`OPT_BIT(0, "score-debug", &output_option, N_("Show output score for blame entries"), OUTPUT_SHOW_SCORE),`
			`OPT_BIT('f', "show-name", &output_option, N_("Show original filename (Default: auto)"), OUTPUT_SHOW_NAME),`
			`OPT_BIT('n', "show-number", &output_option, N_("Show original linenumber (Default: off)"), OUTPUT_SHOW_NUMBER),`
			`OPT_BIT('p', "porcelain", &output_option, N_("Show in a format designed for machine consumption"), OUTPUT_PORCELAIN),`
			`OPT_BIT(0, "line-porcelain", &output_option, N_("Show porcelain format with per-line commit information"), OUTPUT_PORCELAIN\|OUTPUT_LINE_PORCELAIN),`
			`OPT_BIT('c', NULL, &output_option, N_("Use the same output mode as git-annotate (Default: off)"), OUTPUT_ANNOTATE_COMPAT),`
			`OPT_BIT('t', NULL, &output_option, N_("Show raw timestamp (Default: off)"), OUTPUT_RAW_TIMESTAMP),`
			`OPT_BIT('l', NULL, &output_option, N_("Show long commit SHA1 (Default: off)"), OUTPUT_LONG_OBJECT_NAME),`
			`OPT_BIT('s', NULL, &output_option, N_("Suppress author name and timestamp (Default: off)"), OUTPUT_NO_AUTHOR),`
			`OPT_BIT('e', "show-email", &output_option, N_("Show author email instead of name (Default: off)"), OUTPUT_SHOW_EMAIL),`
			`OPT_BIT('w', NULL, &xdl_opts, N_("Ignore whitespace differences"), XDF_IGNORE_WHITESPACE),`
			`OPT_BIT(0, "minimal", &xdl_opts, N_("Spend extra cycles to find better match"), XDF_NEED_MINIMAL),`
			`OPT_STRING('S', NULL, &revs_file, N_("file"), N_("Use revisions from <file> instead of calling git-rev-list")),`
			`OPT_STRING(0, "contents", &contents_from, N_("file"), N_("Use <file>'s contents as the final image")),`
			`{ OPTION_CALLBACK, 'C', NULL, &opt, N_("score"), N_("Find line copies within and across files"), PARSE_OPT_OPTARG, blame_copy_callback },`
			`{ OPTION_CALLBACK, 'M', NULL, &opt, N_("score"), N_("Find line movements within and across files"), PARSE_OPT_OPTARG, blame_move_callback },`
blame: accept multiple -L ranges git-blame accepts only a single -L option or none. Clients requiring blame information for multiple disjoint ranges are therefore forced either to invoke git-blame multiple times, once for each range, or only once with no -L option to cover the entire file, both of which can be costly. Teach git-blame to accept multiple -L ranges. Overlapping and out-of-order ranges are accepted. In this patch, the X in -LX,Y is absolute (for instance, /RE/ patterns search from line 1), and Y is relative to X. Follow-up patches provide more flexibility over how X is anchored. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:38 +02:00			`OPT_STRING_LIST('L', NULL, &range_list, N_("n,m"), N_("Process only line range n,m, counting from 1")),`
blame: add --abbrev command line option and make it honor core.abbrev If user sets config.abbrev option, use it as if --abbrev was given. This is the default value and user can override different abbrev length by specifying the --abbrev=N command line option. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-04-06 04:20:50 +02:00			`OPT__ABBREV(&abbrev),`
git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00			`OPT_END()`
			`};`

			`struct parse_opt_ctx_t ctx;`
"blame -c" should be compatible with "annotate" There is no reason to have a separate variable cmd_is_annotate; OUTPUT_ANNOTATE_COMPAT option is supposed to produce the compatibility output, and we should produce the same output even when the command was not invoked as "annotate" but as "blame -c". Noticed by Pasky. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-09-05 09:57:35 +02:00			`int cmd_is_annotate = !strcmp(argv[0], "annotate");`
blame: accept multiple -L ranges git-blame accepts only a single -L option or none. Clients requiring blame information for multiple disjoint ranges are therefore forced either to invoke git-blame multiple times, once for each range, or only once with no -L option to cover the entire file, both of which can be costly. Teach git-blame to accept multiple -L ranges. Overlapping and out-of-order ranges are accepted. In this patch, the X in -LX,Y is absolute (for instance, /RE/ patterns search from line 1), and Y is relative to X. Follow-up patches provide more flexibility over how X is anchored. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:38 +02:00			`struct range_set ranges;`
			`unsigned int range_i;`
blame: teach -L/RE/ to search from end of previous -L range Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:42 +02:00			`long anchor;`
annotate: fix for cvsserver. git-cvsserver does not want the boundary commits shown any differently. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-06 10:52:04 +01:00
Provide git_config with a callback-data parameter git_config() only had a function parameter, but no callback data parameter. This assumes that all callback functions only modify global variables. With this patch, every callback gets a void * parameter, and it is hoped that this will help the libification effort. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-05-14 19:46:53 +02:00			`git_config(git_blame_config, NULL);`
git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00			`init_revisions(&revs, NULL);`
Make git blame's date output format configurable, like git log Add the following: - git config value blame.date that expects one of the git log date formats (e.g. relative,local,default,iso,...); - git blame command line option --date expects one of the git log date formats; - documentation in blame-options.txt; - git blame uses the appropriate date.c functions and enums to make sense of the date format and provide appropriate data; git blame continues to line up the output columns by padding the date column up to the max width of the chosen date format. The date format for git blame without both blame.date and --date continues to be ISO for backwards compatibility. git annotate ignores the date format specifiers and continues to uses the ISO format, as before. Signed-off-by: Eugene Letuchy <eugene@facebook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-20 23:51:11 +01:00			`revs.date_mode = blame_date_mode;`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`DIFF_OPT_SET(&revs.diffopt, ALLOW_TEXTCONV);`
blame: pay attention to --no-follow If you know your history did not have renames, or if you care only about the history after a large rename that happened some time ago, "git blame --no-follow $path" is a way to tell the command not to bother about renames. When you use -C, the lines that came from the renamed file will still be found without the whole-file rename detection, so it is not all that interesting either way, though. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-21 22:52:25 +02:00			`DIFF_OPT_SET(&revs.diffopt, FOLLOW_RENAMES);`
Make git blame's date output format configurable, like git log Add the following: - git config value blame.date that expects one of the git log date formats (e.g. relative,local,default,iso,...); - git blame command line option --date expects one of the git log date formats; - documentation in blame-options.txt; - git blame uses the appropriate date.c functions and enums to make sense of the date format and provide appropriate data; git blame continues to line up the output columns by padding the date column up to the max width of the chosen date format. The date format for git blame without both blame.date and --date continues to be ISO for backwards compatibility. git annotate ignores the date format specifiers and continues to uses the ISO format, as before. Signed-off-by: Eugene Letuchy <eugene@facebook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-20 23:51:11 +01:00
git-pickaxe: do not keep commit buffer. We need the commit buffer data while generating the final result, but until then we do not need them. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 08:49:31 +02:00			`save_commit_buffer = 0;`
git-blame: migrate to incremental parse-option [2/2] Now use handle_revision_args instead of parse_revisions, and simplify the handling of old-style arguments a lot thanks to the filtering. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:35 +02:00			`dashdash_pos = 0;`
git-pickaxe: do not keep commit buffer. We need the commit buffer data while generating the final result, but until then we do not need them. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 08:49:31 +02:00
parse-options: Don't call parse_options_check() so much parse_options_check() is being called for each invocation of parse_options_step which can be quite a bit for some commands. The commit introducing this function cb9d398 (parse-options: add parse_options_check to validate option specs., 2009-06-09) had the correct motivation and explicitly states that parse_options_check() should be called from parse_options_start(). However, the implementation differs from the motivation. Fix it. Signed-off-by: Stephen Boyd <bebarino@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-06 08:57:42 +01:00			`parse_options_start(&ctx, argc, argv, prefix, options,`
			`PARSE_OPT_KEEP_DASHDASH \| PARSE_OPT_KEEP_ARGV0);`
git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00			`for (;;) {`
			`switch (parse_options_step(&ctx, options, blame_opt_usage)) {`
			`case PARSE_OPT_HELP:`
			`exit(129);`
			`case PARSE_OPT_DONE:`
git-blame: migrate to incremental parse-option [2/2] Now use handle_revision_args instead of parse_revisions, and simplify the handling of old-style arguments a lot thanks to the filtering. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:35 +02:00			`if (ctx.argv[0])`
			`dashdash_pos = ctx.cpidx;`
git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00			`goto parse_done;`
			`}`

			`if (!strcmp(ctx.argv[0], "--reverse")) {`
			`ctx.argv[0] = "--children";`
			`reverse = 1;`
			`}`
revisions: refactor handle_revision_opt into parse_revision_opt. It seems we're using handle_revision_opt the same way each time, have a wrapper around it that does the 9-liner we copy each time instead. handle_revision_opt can be static in the module for now, it's always possible to make it public again if needed. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-09 23:38:34 +02:00			`parse_revision_opt(&revs, &ctx, options, blame_opt_usage);`
git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00			`}`
			`parse_done:`
blame: pay attention to --no-follow If you know your history did not have renames, or if you care only about the history after a large rename that happened some time ago, "git blame --no-follow $path" is a way to tell the command not to bother about renames. When you use -C, the lines that came from the renamed file will still be found without the whole-file rename detection, so it is not all that interesting either way, though. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-09-21 22:52:25 +02:00			`no_whole_file_rename = !DIFF_OPT_TST(&revs.diffopt, FOLLOW_RENAMES);`
			`DIFF_OPT_CLR(&revs.diffopt, FOLLOW_RENAMES);`
git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00			`argc = parse_options_end(&ctx);`

blame: compute abbreviation width that ensures uniqueness Julia Lawall noticed that in linux-next repository the commit object 60d5c9f5 (shown with the default abbreviation width baked into "git blame") in output from $ git blame -L 3675,3675 60d5c9f5b -- \ drivers/staging/brcm80211/brcmfmac/wl_iw.c is no longer unique in the repository, which results in "short SHA1 60d5c9f5 is ambiguous". Compute the minimum abbreviation width that ensures uniqueness when the user did not specify the --abbrev option to avoid this. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-02 09:54:00 +02:00			`if (0 < abbrev)`
			`/* one more abbrev length is needed for the boundary commit */`
			`abbrev++;`
blame: add --abbrev command line option and make it honor core.abbrev If user sets config.abbrev option, use it as if --abbrev was given. This is the default value and user can override different abbrev length by specifying the --abbrev=N command line option. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-04-06 04:20:50 +02:00
blame: read custom grafts given by -S before calling setup_revisions() setup_revisions() while getting the command line arguments parses the given commits from the command line, which means their direct parents will not be rewritten by the custom graft file. Call read_ancestry() early to work around this issue. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-03-18 08:13:03 +01:00			`if (revs_file && read_ancestry(revs_file))`
Convert existing die(..., strerror(errno)) to die_errno() Change calls to die(..., strerror(errno)) to use the new die_errno(). In the process, also make slight style adjustments: at least state _something_ about the function that failed (instead of just printing the pathname), and put paths in single quotes. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-27 17:58:46 +02:00			`die_errno("reading graft file '%s' failed", revs_file);`
blame: read custom grafts given by -S before calling setup_revisions() setup_revisions() while getting the command line arguments parses the given commits from the command line, which means their direct parents will not be rewritten by the custom graft file. Call read_ancestry() early to work around this issue. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-03-18 08:13:03 +01:00
Make git blame's date output format configurable, like git log Add the following: - git config value blame.date that expects one of the git log date formats (e.g. relative,local,default,iso,...); - git blame command line option --date expects one of the git log date formats; - documentation in blame-options.txt; - git blame uses the appropriate date.c functions and enums to make sense of the date format and provide appropriate data; git blame continues to line up the output columns by padding the date column up to the max width of the chosen date format. The date format for git blame without both blame.date and --date continues to be ISO for backwards compatibility. git annotate ignores the date format specifiers and continues to uses the ISO format, as before. Signed-off-by: Eugene Letuchy <eugene@facebook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-20 23:51:11 +01:00			`if (cmd_is_annotate) {`
"blame -c" should be compatible with "annotate" There is no reason to have a separate variable cmd_is_annotate; OUTPUT_ANNOTATE_COMPAT option is supposed to produce the compatibility output, and we should produce the same output even when the command was not invoked as "annotate" but as "blame -c". Noticed by Pasky. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-09-05 09:57:35 +02:00			`output_option \|= OUTPUT_ANNOTATE_COMPAT;`
Make git blame's date output format configurable, like git log Add the following: - git config value blame.date that expects one of the git log date formats (e.g. relative,local,default,iso,...); - git blame command line option --date expects one of the git log date formats; - documentation in blame-options.txt; - git blame uses the appropriate date.c functions and enums to make sense of the date format and provide appropriate data; git blame continues to line up the output columns by padding the date column up to the max width of the chosen date format. The date format for git blame without both blame.date and --date continues to be ISO for backwards compatibility. git annotate ignores the date format specifiers and continues to uses the ISO format, as before. Signed-off-by: Eugene Letuchy <eugene@facebook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-20 23:51:11 +01:00			`blame_date_mode = DATE_ISO8601;`
			`} else {`
			`blame_date_mode = revs.date_mode;`
			`}`

			`/* The maximum width used to show the dates */`
			`switch (blame_date_mode) {`
			`case DATE_RFC2822:`
			`blame_date_width = sizeof("Thu, 19 Oct 2006 16:00:04 -0700");`
			`break;`
			`case DATE_ISO8601:`
			`blame_date_width = sizeof("2006-10-19 16:00:04 -0700");`
			`break;`
			`case DATE_RAW:`
			`blame_date_width = sizeof("1161298804 -0700");`
			`break;`
			`case DATE_SHORT:`
			`blame_date_width = sizeof("2006-10-19");`
			`break;`
			`case DATE_RELATIVE:`
blame: dynamic blame_date_width for different locales When show date in relative date format for git-blame, the max display width of datetime is set as the length of the string "Thu Oct 19 16:00:04 2006 -0700" (30 characters long). But actually the max width for C locale is only 22 (the length of string "x years, xx months ago"). And for other locale, it maybe smaller. E.g. For Chinese locale, only needs a half (16-character width). Set blame_date_width as the display width of _("4 years, 11 months ago"), so that translators can make the choice. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-22 16:39:10 +02:00			`/* TRANSLATORS: This string is used to tell us the maximum`
			`display width for a relative timestamp in "git blame"`
			`output. For C locale, "4 years, 11 months ago", which`
			`takes 22 places, is the longest among various forms of`
			`relative timestamps, but your language may need more or`
			`fewer display columns. */`
			`blame_date_width = utf8_strwidth(_("4 years, 11 months ago")) + 1; /* add the null */`
			`break;`
Make git blame's date output format configurable, like git log Add the following: - git config value blame.date that expects one of the git log date formats (e.g. relative,local,default,iso,...); - git blame command line option --date expects one of the git log date formats; - documentation in blame-options.txt; - git blame uses the appropriate date.c functions and enums to make sense of the date format and provide appropriate data; git blame continues to line up the output columns by padding the date column up to the max width of the chosen date format. The date format for git blame without both blame.date and --date continues to be ISO for backwards compatibility. git annotate ignores the date format specifiers and continues to uses the ISO format, as before. Signed-off-by: Eugene Letuchy <eugene@facebook.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-20 23:51:11 +01:00			`case DATE_LOCAL:`
			`case DATE_NORMAL:`
			`blame_date_width = sizeof("Thu Oct 19 16:00:04 2006 -0700");`
			`break;`
			`}`
			`blame_date_width -= 1; /* strip the null */`
"blame -c" should be compatible with "annotate" There is no reason to have a separate variable cmd_is_annotate; OUTPUT_ANNOTATE_COMPAT option is supposed to produce the compatibility output, and we should produce the same output even when the command was not invoked as "annotate" but as "blame -c". Noticed by Pasky. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-09-05 09:57:35 +02:00
Teach --find-copies-harder to "git blame" It's equivalent to "-C -C" with the diff family. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-31 09:05:22 +02:00			`if (DIFF_OPT_TST(&revs.diffopt, FIND_COPIES_HARDER))`
			`opt \|= (PICKAXE_BLAME_COPY \| PICKAXE_BLAME_MOVE \|`
			`PICKAXE_BLAME_COPY_HARDER);`

git-pickaxe: introduce heuristics to avoid "trivial" chunks This adds scoring logic to blame_entry to prevent blames on very trivial chunks (e.g. lots of empty lines, indent followed by a closing brace) from being passed down to unrelated lines in the parent. The current heuristics are quite simple and may need to be tweaked later, but we need to start somewhere. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-21 00:37:12 +02:00			`if (!blame_move_score)`
			`blame_move_score = BLAME_DEFAULT_MOVE_SCORE;`
			`if (!blame_copy_score)`
			`blame_copy_score = BLAME_DEFAULT_COPY_SCORE;`

git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* We have collected options unknown to us in argv[1..unk]`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`* which are to be passed to revision machinery if we are`
Assorted typo fixes Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-04 05:49:16 +01:00			`* going to do the "bottom" processing.`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`*`
			`* The remaining are:`
			`*`
Typos in code comments, an error message, documentation Signed-off-by: Ralf Wildenhues <Ralf.Wildenhues@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-08-22 13:12:12 +02:00			`* (1) if dashdash_pos != 0, it is either`
git-blame: migrate to incremental parse-option [2/2] Now use handle_revision_args instead of parse_revisions, and simplify the handling of old-style arguments a lot thanks to the filtering. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:35 +02:00			`* "blame [revisions] -- <path>" or`
			`* "blame -- <path> <rev>"`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`*`
Typos in code comments, an error message, documentation Signed-off-by: Ralf Wildenhues <Ralf.Wildenhues@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-08-22 13:12:12 +02:00			`* (2) otherwise, it is one of the two:`
git-blame: migrate to incremental parse-option [2/2] Now use handle_revision_args instead of parse_revisions, and simplify the handling of old-style arguments a lot thanks to the filtering. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:35 +02:00			`* "blame [revisions] <path>"`
			`* "blame <path> <rev>"`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`*`
git-blame: migrate to incremental parse-option [2/2] Now use handle_revision_args instead of parse_revisions, and simplify the handling of old-style arguments a lot thanks to the filtering. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:35 +02:00			`* Note that we must strip out <path> from the arguments: we do not`
			`* want the path pruning but we may want "bottom" processing.`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`*/`
git-blame: migrate to incremental parse-option [2/2] Now use handle_revision_args instead of parse_revisions, and simplify the handling of old-style arguments a lot thanks to the filtering. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:35 +02:00			`if (dashdash_pos) {`
			`switch (argc - dashdash_pos - 1) {`
			`case 2: /* (1b) */`
			`if (argc != 4)`
git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00			`usage_with_options(blame_opt_usage, options);`
git-blame: migrate to incremental parse-option [2/2] Now use handle_revision_args instead of parse_revisions, and simplify the handling of old-style arguments a lot thanks to the filtering. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:35 +02:00			`/* reorder for the new way: <rev> -- <path> */`
			`argv[1] = argv[3];`
			`argv[3] = argv[2];`
			`argv[2] = "--";`
			`/* FALLTHROUGH */`
			`case 1: /* (1a) */`
			`path = add_prefix(prefix, argv[--argc]);`
			`argv[argc] = NULL;`
			`break;`
			`default:`
			`usage_with_options(blame_opt_usage, options);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
git-blame: migrate to incremental parse-option [2/2] Now use handle_revision_args instead of parse_revisions, and simplify the handling of old-style arguments a lot thanks to the filtering. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:35 +02:00			`} else {`
			`if (argc < 2)`
git-blame: migrate to incremental parse-option [1/2] This step merely moves the parser to an incremental version, still using parse_revisions. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:34 +02:00			`usage_with_options(blame_opt_usage, options);`
git-blame: migrate to incremental parse-option [2/2] Now use handle_revision_args instead of parse_revisions, and simplify the handling of old-style arguments a lot thanks to the filtering. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:35 +02:00			`path = add_prefix(prefix, argv[argc - 1]);`
Rename path_list to string_list The name path_list was correct for the first usage of that data structure, but it really is a general-purpose string list. $ perl -i -pe 's/path-list/string-list/g' $(git grep -l path-list) $ perl -i -pe 's/path_list/string_list/g' $(git grep -l path_list) $ git mv path-list.h string-list.h $ git mv path-list.c string-list.c $ perl -i -pe 's/has_path/has_string/g' $(git grep -l has_path) $ perl -i -pe 's/path/string/g' string-list.[ch] $ git mv Documentation/technical/api-path-list.txt \ Documentation/technical/api-string-list.txt $ perl -i -pe 's/strdup_paths/strdup_strings/g' $(git grep -l strdup_paths) ... and then fix all users of string-list to access the member "string" instead of "path". Documentation/technical/api-string-list.txt needed some rewrapping, too. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-21 20:03:49 +02:00			`if (argc == 3 && !has_string_in_work_tree(path)) { /* (2b) */`
git-blame: migrate to incremental parse-option [2/2] Now use handle_revision_args instead of parse_revisions, and simplify the handling of old-style arguments a lot thanks to the filtering. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:35 +02:00			`path = add_prefix(prefix, argv[1]);`
			`argv[1] = argv[2];`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
git-blame: migrate to incremental parse-option [2/2] Now use handle_revision_args instead of parse_revisions, and simplify the handling of old-style arguments a lot thanks to the filtering. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:35 +02:00			`argv[argc - 1] = "--";`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
git-blame: migrate to incremental parse-option [2/2] Now use handle_revision_args instead of parse_revisions, and simplify the handling of old-style arguments a lot thanks to the filtering. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:35 +02:00			`setup_work_tree();`
Rename path_list to string_list The name path_list was correct for the first usage of that data structure, but it really is a general-purpose string list. $ perl -i -pe 's/path-list/string-list/g' $(git grep -l path-list) $ perl -i -pe 's/path_list/string_list/g' $(git grep -l path_list) $ git mv path-list.h string-list.h $ git mv path-list.c string-list.c $ perl -i -pe 's/has_path/has_string/g' $(git grep -l has_path) $ perl -i -pe 's/path/string/g' string-list.[ch] $ git mv Documentation/technical/api-path-list.txt \ Documentation/technical/api-string-list.txt $ perl -i -pe 's/strdup_paths/strdup_strings/g' $(git grep -l strdup_paths) ... and then fix all users of string-list to access the member "string" instead of "path". Documentation/technical/api-string-list.txt needed some rewrapping, too. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-21 20:03:49 +02:00			`if (!has_string_in_work_tree(path))`
Convert existing die(..., strerror(errno)) to die_errno() Change calls to die(..., strerror(errno)) to use the new die_errno(). In the process, also make slight style adjustments: at least state _something_ about the function that failed (instead of just printing the pathname), and put paths in single quotes. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-06-27 17:58:46 +02:00			`die_errno("cannot stat path '%s'", path);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`

Teach --stdin option to "log" family Move the logic to read revs from standard input that rev-list knows about from it to revision machinery, so that all the users of setup_revisions() can feed the list of revs from the standard input when "--stdin" is used on the command line. Allow some users of the revision machinery that want different semantics from the "--stdin" option to disable it by setting an option in the rev_info structure. This also cleans up the kludge made to bundle.c via cut and paste. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-03 15:59:18 +01:00			`revs.disable_stdin = 1;`
git-blame: migrate to incremental parse-option [2/2] Now use handle_revision_args instead of parse_revisions, and simplify the handling of old-style arguments a lot thanks to the filtering. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-07-08 15:19:35 +02:00			`setup_revisions(argc, argv, &revs, NULL);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`memset(&sb, 0, sizeof(sb));`

git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`sb.revs = &revs;`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`if (!reverse) {`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`final_commit_name = prepare_final(&sb);`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`sb.commits.compare = compare_commits_by_commit_date;`
			`}`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`else if (contents_from)`
			`die("--contents and --children do not blend well.");`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`else {`
git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`final_commit_name = prepare_initial(&sb);`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`sb.commits.compare = compare_commits_by_reverse_commit_date;`
			`}`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
			`if (!sb.final) {`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* "--not A B -- path" without anything positive;`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`* do not default to HEAD, but use the working tree`
			`* or "--contents".`
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`*/`
Make git-blame fail when working tree is needed and we're not in one Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-03 13:22:55 +01:00			`setup_work_tree();`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`sb.final = fake_working_tree_commit(&sb.revs->diffopt,`
			`path, contents_from);`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`add_pending_object(&revs, &(sb.final->object), ":");`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`}`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`else if (contents_from)`
			`die("Cannot use --contents with final commit object name");`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
git-blame: somewhat better commenting. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 02:36:22 +01:00			`/*`
			`* If we have bottom, this will mark the ancestors of the`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`* bottom commits we would reach while traversing as`
			`* uninteresting.`
			`*/`
check return code of prepare_revision_walk A failure in prepare_revision_walk can be caused by a not parseable object. Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-18 08:31:56 +01:00			`if (prepare_revision_walk(&revs))`
builtin/blame.c: add translation to warning about failed revision walk Signed-off-by: Stefan Beller <stefanbeller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-10 23:33:25 +02:00			`die(_("revision walk setup failed"));`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`if (is_null_sha1(sb.final->object.sha1)) {`
			`o = sb.final->util;`
use xmemdupz() to allocate copies of strings given by start and length Use xmemdupz() to allocate the memory, copy the data and make sure to NUL-terminate the result, all in one step. The resulting code is shorter, doesn't contain the constants 1 and '\0', and avoids duplicating function parameters. For blame, the last copied byte (o->file.ptr[o->file.size]) is always set to NUL by fake_working_tree_commit() or read_sha1_file(), so no information is lost by the conversion to using xmemdupz(). Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-07-19 17:35:34 +02:00			`sb.final_buf = xmemdupz(o->file.ptr, o->file.size);`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`sb.final_buf_size = o->file.size;`
			`}`
			`else {`
			`o = get_origin(&sb, sb.final, path);`
blame,cat-file --textconv: Don't assume mode is ``S_IFREF \| 0664'' We need to get the correct mode when blame reads the source from the working tree, the index, or trees. This allows us to omit running textconv filters on symbolic links. Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Reviewed-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-09-29 13:35:24 +02:00			`if (fill_blob_sha1_and_mode(o))`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`die("no such path %s in %s", path, final_commit_name);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`if (DIFF_OPT_TST(&sb.revs->diffopt, ALLOW_TEXTCONV) &&`
diff: do not use null sha1 as a sentinel value The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-07-28 17:03:01 +02:00			`textconv_object(path, o->mode, o->blob_sha1, 1, (char **) &sb.final_buf,`
textconv: support for blame This patches enables to perform textconv with blame if a textconv driver is available fos the file. The main task is performed by the textconv_object function which prepares diff_filespec and if possible converts the file using diff textconv API. Only regular files are converted, so the mode of diff_filespec is faked. Textconv conversion is enabled by default (equivalent to the option --textconv), since blaming binary files is useless in most cases. The option --no-textconv is used to disable textconv conversion. The declarations of several functions are modified to give access to a diff_options, in order to know whether the textconv option is activated or not. Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr> Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-15 15:58:48 +02:00			`&sb.final_buf_size))`
			`;`
			`else`
			`sb.final_buf = read_sha1_file(o->blob_sha1, &type,`
			`&sb.final_buf_size);`

blame: check return value from read_sha1_file() Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-25 10:26:20 +02:00			`if (!sb.final_buf)`
			`die("Cannot read blob %s for path %s",`
			`sha1_to_hex(o->blob_sha1),`
			`path);`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`}`
git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00			`num_read_blob++;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`lno = prepare_lines(&sb);`

blame: accept multiple -L ranges git-blame accepts only a single -L option or none. Clients requiring blame information for multiple disjoint ranges are therefore forced either to invoke git-blame multiple times, once for each range, or only once with no -L option to cover the entire file, both of which can be costly. Teach git-blame to accept multiple -L ranges. Overlapping and out-of-order ranges are accepted. In this patch, the X in -LX,Y is absolute (for instance, /RE/ patterns search from line 1), and Y is relative to X. Follow-up patches provide more flexibility over how X is anchored. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:38 +02:00			`if (lno && !range_list.nr)`
			`string_list_append(&range_list, xstrdup("1"));`

blame: teach -L/RE/ to search from end of previous -L range Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:42 +02:00			`anchor = 1;`
blame: accept multiple -L ranges git-blame accepts only a single -L option or none. Clients requiring blame information for multiple disjoint ranges are therefore forced either to invoke git-blame multiple times, once for each range, or only once with no -L option to cover the entire file, both of which can be costly. Teach git-blame to accept multiple -L ranges. Overlapping and out-of-order ranges are accepted. In this patch, the X in -LX,Y is absolute (for instance, /RE/ patterns search from line 1), and Y is relative to X. Follow-up patches provide more flexibility over how X is anchored. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:38 +02:00			`range_set_init(&ranges, range_list.nr);`
			`for (range_i = 0; range_i < range_list.nr; ++range_i) {`
			`long bottom, top;`
			`if (parse_range_arg(range_list.items[range_i].string,`
blame: teach -L/RE/ to search from end of previous -L range Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:42 +02:00			`nth_line_cb, &sb, lno, anchor,`
blame: accept multiple -L ranges git-blame accepts only a single -L option or none. Clients requiring blame information for multiple disjoint ranges are therefore forced either to invoke git-blame multiple times, once for each range, or only once with no -L option to cover the entire file, both of which can be costly. Teach git-blame to accept multiple -L ranges. Overlapping and out-of-order ranges are accepted. In this patch, the X in -LX,Y is absolute (for instance, /RE/ patterns search from line 1), and Y is relative to X. Follow-up patches provide more flexibility over how X is anchored. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:38 +02:00			`&bottom, &top, sb.path))`
			`usage(blame_usage);`
			`if (lno < top \|\| ((lno \|\| bottom) && lno < bottom))`
			`die("file %s has only %lu lines", path, lno);`
			`if (bottom < 1)`
			`bottom = 1;`
			`if (top < 1)`
			`top = lno;`
			`bottom--;`
			`range_set_append_unsafe(&ranges, bottom, top);`
blame: teach -L/RE/ to search from end of previous -L range Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:42 +02:00			`anchor = top + 1;`
blame: accept multiple -L ranges git-blame accepts only a single -L option or none. Clients requiring blame information for multiple disjoint ranges are therefore forced either to invoke git-blame multiple times, once for each range, or only once with no -L option to cover the entire file, both of which can be costly. Teach git-blame to accept multiple -L ranges. Overlapping and out-of-order ranges are accepted. In this patch, the X in -LX,Y is absolute (for instance, /RE/ patterns search from line 1), and Y is relative to X. Follow-up patches provide more flexibility over how X is anchored. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:38 +02:00			`}`
			`sort_and_merge_range_set(&ranges);`

			`for (range_i = ranges.nr; range_i > 0; --range_i) {`
			`const struct range *r = &ranges.ranges[range_i - 1];`
			`long bottom = r->start;`
			`long top = r->end;`
			`struct blame_entry *next = ent;`
			`ent = xcalloc(1, sizeof(*ent));`
			`ent->lno = bottom;`
			`ent->num_lines = top - bottom;`
			`ent->suspect = o;`
			`ent->s_lno = bottom;`
			`ent->next = next;`
			`origin_incref(o);`
			`}`
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00
			`o->suspects = ent;`
			`prio_queue_put(&sb.commits, o->commit);`

blame: accept multiple -L ranges git-blame accepts only a single -L option or none. Clients requiring blame information for multiple disjoint ranges are therefore forced either to invoke git-blame multiple times, once for each range, or only once with no -L option to cover the entire file, both of which can be costly. Teach git-blame to accept multiple -L ranges. Overlapping and out-of-order ranges are accepted. In this patch, the X in -LX,Y is absolute (for instance, /RE/ patterns search from line 1), and Y is relative to X. Follow-up patches provide more flexibility over how X is anchored. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-08-06 15:59:38 +02:00			`origin_decref(o);`

			`range_set_release(&ranges);`
			`string_list_clear(&range_list, 0);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`sb.ent = NULL;`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`sb.path = path;`

Add mailmap.file as configurational option for mailmap location This allows us to augment the repo mailmap file, and to use mailmap files elsewhere than the repository root. Meaning that the entries in mailmap.file will override the entries in "./.mailmap", should they match. Signed-off-by: Marius Storm-Olsen <marius@trolltech.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-08 15:34:27 +01:00			`read_mailmap(&mailmap, NULL);`
Apply mailmap in git-blame output. This makes git-blame to use the same mailmap used by git-shortlog. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-27 09:42:15 +02:00
Delay pager setup in git blame This avoids to launch the pager when git blame fails for any reason. Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-03 13:22:53 +01:00			`if (!incremental)`
			`setup_pager();`

git-blame --reverse This new option allows "git blame" to read an old version of the file, and up to which commit each line survived (i.e. their children rewrote the line out of the contents). The previous revision machinery update to decorate each commit with its children was leading to this change. When the --reverse option is given, we read the old version and pass blame to the children of the current suspect, instead of the usual order of starting from the latest and passing blame to parents. The standard yardstick of "blame" in git.git history is "rev-list.c" which was refactored heavily in its existence. For example: git blame -C -C -w --reverse 9de48752..master -- rev-list.c begins like this: 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 1) #include "cache... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 2) #include "commi... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 3) #include "tree.... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 4) #include "blob.... 213523f4 rev-list.c (JC Hamano 2006-03-01 5) #include "epoch... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 6) ab57c8dd rev-list.c (JC Hamano 2006-02-24 7) #define SEEN ab57c8dd rev-list.c (JC Hamano 2006-02-24 8) #define INTERES... 213523f4 rev-list.c (JC Hamano 2006-03-01 9) #define COUNTED... 7e21c29b rev-list.c (LTorvalds 2005-07-06 10) #define SHOWN ... 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 11) 6c41b801 builtin-rev-list.c (JC Hamano 2008-04-02 12) static const ch... b1349229 rev-list.c (LTorvalds 2005-07-26 13) "usage: git-... This reveals that the original first four lines survived until now in builtin-rev-list.c , inclusion of "epoch.h" was removed after 213523f4 while the contents was still in rev-list.c. This mode probably needs more tweaking so that the commit that removed the line (i.e. the children of the commits listed in the above sample output) is shown instead to be useful, but then there is a little matter of which child of a fork point to show. For now, you can find the diff that rewrote the fifth line above by doing: $ git log --children 213523f4^.. to find its child, which is 1025fe5 (Merge branch 'lt/rev-list' into next, 2006-03-01), and then look at that child with: $ git show 1025fe5 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-03 07:17:53 +02:00			`assign_blame(&sb, opt);`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00
git-blame --incremental This adds --incremental option to help GUI porcelains to show the result from git-blame incrementally. The output gives the origin information in the same format as the porcelain format. The first line has commit object name, the line number of the first line in the group in the original file, the line number of that file in the final image, and number of lines in the group. Then subsequent lines show the metainformation for the commit when the commit is shown for the first time, except the filename information is always shown (we cannot even make it conditional to -C option as blame always follows the renaming of the file wholesale). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-28 10:34:06 +01:00			`if (incremental)`
			`return 0;`

blame: large-scale performance rewrite The previous implementation used a single sorted linear list of blame entries for organizing all partial or completed work. Every subtask had to scan the whole list, with most entries not being relevant to the task. The resulting run-time was quadratic to the number of separate chunks. This change gives every subtask its own data to work with. Subtasks are organized into "struct origin" chains hanging off particular commits. Commits are organized into a priority queue, processing them in commit date order in order to keep most of the work affecting a particular blob collated even in the presence of an extensive merge history. For large files with a diversified history, a speedup by a factor of 3 or more is not unusual. Signed-off-by: David Kastrup <dak@gnu.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-04-26 01:56:49 +02:00			`sb.ent = blame_sort(sb.ent, compare_blame_final);`

git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`coalesce(&sb);`

			`if (!(output_option & OUTPUT_PORCELAIN))`
			`find_alignment(&sb, &output_option);`

			`output(&sb, output_option);`
			`free((void *)sb.final_buf);`
			`for (ent = sb.ent; ent; ) {`
			`struct blame_entry *e = ent->next;`
			`free(ent);`
			`ent = e;`
			`}`
git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00
blame: --show-stats for easier optimization work. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:52:43 +01:00			`if (show_stats) {`
git-pickaxe: optimize by avoiding repeated read_sha1_file(). It turns out that pickaxe reads the same blob repeatedly while blame can reuse the blob already read for the parent when handling a child commit when it's parent's turn to pass its blame to the grandparent. Have a cache in the origin structure to keep the blob there, which will be garbage collected when the origin loses the last reference to it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-05 20:51:41 +01:00			`printf("num read blob: %d\n", num_read_blob);`
			`printf("num get patch: %d\n", num_get_patch);`
			`printf("num commits: %d\n", num_commits);`
			`}`
git-pickaxe: blame rewritten. Currently it does what git-blame does, but only faster. More importantly, its internal structure is designed to support content movement (aka cut-and-paste) more easily by allowing more than one paths to be taken from the same commit. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-10-20 01:00:04 +02:00			`return 0;`
			`}`