mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-02 15:28:21 +01:00

1738 lines

42 KiB

C

Raw Normal View History

[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`/*`
			`* Copyright (C) 2005 Junio C Hamano`
			`*/`
			`#include <sys/types.h>`
			`#include <sys/wait.h>`
[PATCH] diff.c: clean temporary files When diff-cache -p and friends are interrupted, they can leave their temporary files behind. Also when the external diff program is killed instead of exiting (this usually happens when piping the output to a pager, which can cause SIGPIPE when the user quits viewing the diff early), they incorrectly died without cleaning their temporary file. This fixes these problems. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-28 19:13:01 +02:00			`#include <signal.h>`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00			`#include "cache.h"`
[PATCH] Make sq_expand() available as sq_quote(). A useful shell safety helper sq_expand() was hidden as a static function in diff.c. Extract it out and make it available as sq_quote(). Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-08 08:58:32 +02:00			`#include "quote.h"`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00			`#include "diff.h"`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`#include "diffcore.h"`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`#include "xdiff-interface.h"`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`static int use_size_cache;`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00
diff: make default rename detection limit configurable. A while ago, a rename-detection limit logic was implemented as a response to this thread: http://marc.theaimsgroup.com/?l=git&m=112413080630175 where gitweb was found to be using a lot of time and memory to detect renames on huge commits. git-diff family takes -l<num> flag, and if the number of paths that are rename destination candidates (i.e. new paths with -M, or modified paths with -C) are larger than that number, skips rename/copy detection even when -M or -C is specified on the command line. This commit makes the rename detection limit easier to use. You can have: [diff] renamelimit = 30 in your .git/config file to specify the default rename detection limit. You can override this from the command line; giving 0 means 'unlimited': git diff -M -l0 We might want to change the default behaviour, when you do not have the configuration, to limit it to say 20 paths or so. This would also help the diffstat generation after a big 'git pull'. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-15 21:48:08 +01:00			`int diff_rename_limit_default = -1;`

Move diff.renamelimit out of default configuration. Otherwise we would end up linking all the unneeded stuff into git-daemon only to link with git_default_config. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-22 07:52:37 +01:00			`int git_diff_config(const char var, const char value)`
			`{`
			`if (!strcmp(var, "diff.renamelimit")) {`
			`diff_rename_limit_default = git_config_int(var, value);`
			`return 0;`
			`}`

			`return git_default_config(var, value);`
			`}`

Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00			`static char quote_one(const char str)`
			`{`
			`int needlen;`
			`char *xp;`

			`if (!str)`
			`return NULL;`
			`needlen = quote_c_style(str, NULL, NULL, 0);`
			`if (!needlen)`
			`return strdup(str);`
			`xp = xmalloc(needlen + 1);`
			`quote_c_style(str, xp, NULL, 0);`
			`return xp;`
			`}`

			`static char quote_two(const char one, const char *two)`
			`{`
			`int need_one = quote_c_style(one, NULL, NULL, 1);`
			`int need_two = quote_c_style(two, NULL, NULL, 1);`
			`char *xp;`

			`if (need_one + need_two) {`
			`if (!need_one) need_one = strlen(one);`
			`if (!need_two) need_one = strlen(two);`

			`xp = xmalloc(need_one + need_two + 3);`
			`xp[0] = '"';`
			`quote_c_style(one, xp + 1, NULL, 1);`
			`quote_c_style(two, xp + need_one + 1, NULL, 1);`
			`strcpy(xp + need_one + need_two + 1, "\"");`
			`return xp;`
			`}`
			`need_one = strlen(one);`
			`need_two = strlen(two);`
			`xp = xmalloc(need_one + need_two + 1);`
			`strcpy(xp, one);`
			`strcpy(xp + need_one, two);`
			`return xp;`
			`}`

[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`static const char *external_diff(void)`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00			`{`
Rename environment variables. H. Peter Anvin mentioned that using SHA1_whatever as an environment variable name is not nice and we should instead use names starting with "GIT_" prefix to avoid conflicts. Here is what this patch does: * Renames the following environment variables: New name Old Name GIT_AUTHOR_DATE AUTHOR_DATE GIT_AUTHOR_EMAIL AUTHOR_EMAIL GIT_AUTHOR_NAME AUTHOR_NAME GIT_COMMITTER_EMAIL COMMIT_AUTHOR_EMAIL GIT_COMMITTER_NAME COMMIT_AUTHOR_NAME GIT_ALTERNATE_OBJECT_DIRECTORIES SHA1_FILE_DIRECTORIES GIT_OBJECT_DIRECTORY SHA1_FILE_DIRECTORY * Introduces a compatibility macro, gitenv(), which does an getenv() and if it fails calls gitenv_bc(), which in turn picks up the value from old name while giving a warning about using an old name. * Changes all users of the environment variable to fetch environment variable with the new name using gitenv(). * Updates the documentation and scripts shipped with Linus GIT distribution. The transition plan is as follows: * We will keep the backward compatibility list used by gitenv() for now, so the current scripts and user environments continue to work as before. The users will get warnings when they have old name but not new name in their environment to the stderr. * The Porcelain layers should start using new names. However, just in case it ends up calling old Plumbing layer implementation, they should also export old names, taking values from the corresponding new names, during the transition period. * After a transition period, we would drop the compatibility support and drop gitenv(). Revert the callers to directly call getenv() but keep using the new names. The last part is probably optional and the transition duration needs to be set to a reasonable value. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-10 02:57:56 +02:00			`static const char *external_diff_cmd = NULL;`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`static int done_preparing = 0;`

			`if (done_preparing)`
			`return external_diff_cmd;`
Retire support for old environment variables. We have deprecated the old environment variable names for quite a while and now it's time to remove them. Gone are: SHA1_FILE_DIRECTORIES AUTHOR_DATE AUTHOR_EMAIL AUTHOR_NAME COMMIT_AUTHOR_EMAIL COMMIT_AUTHOR_NAME SHA1_FILE_DIRECTORY Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-09 23:48:54 +02:00			`external_diff_cmd = getenv("GIT_EXTERNAL_DIFF");`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`done_preparing = 1;`
			`return external_diff_cmd;`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00			`}`

[PATCH] git: use git_mkstemp() instead of mkstemp() for diff generation. This lets you run git diff in a repository otherwise read-only to you. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-04 22:49:49 +02:00			`#define TEMPFILE_PATH_LEN 50`

[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`static struct diff_tempfile {`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`const char name; / filename external diff should read from */`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`char hex[41];`
			`char mode[10];`
[PATCH] git: use git_mkstemp() instead of mkstemp() for diff generation. This lets you run git diff in a repository otherwise read-only to you. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-04 22:49:49 +02:00			`char tmp_path[TEMPFILE_PATH_LEN];`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`} diff_temp[2];`

true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`static int count_lines(const char *data, int size)`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`{`
			`int count, ch, completely_empty = 1, nl_just_seen = 0;`
			`count = 0;`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`while (0 < size--) {`
			`ch = *data++;`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`if (ch == '\n') {`
			`count++;`
			`nl_just_seen = 1;`
			`completely_empty = 0;`
			`}`
			`else {`
			`nl_just_seen = 0;`
			`completely_empty = 0;`
			`}`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`}`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`if (completely_empty)`
			`return 0;`
			`if (!nl_just_seen)`
			`count++; /* no trailing newline */`
			`return count;`
			`}`

			`static void print_line_count(int count)`
			`{`
			`switch (count) {`
			`case 0:`
			`printf("0,0");`
			`break;`
			`case 1:`
			`printf("1");`
			`break;`
			`default:`
			`printf("1,%d", count);`
			`break;`
			`}`
			`}`

true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`static void copy_file(int prefix, const char *data, int size)`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`{`
			`int ch, nl_just_seen = 1;`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`while (0 < size--) {`
			`ch = *data++;`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`if (nl_just_seen)`
			`putchar(prefix);`
			`putchar(ch);`
			`if (ch == '\n')`
			`nl_just_seen = 1;`
			`else`
			`nl_just_seen = 0;`
			`}`
			`if (!nl_just_seen)`
			`printf("\n\\ No newline at end of file\n");`
			`}`

			`static void emit_rewrite_diff(const char *name_a,`
			`const char *name_b,`
diff: fix output of total-rewrite diff. We did not read in the file data before emitting the total-rewrite diff. Noticed by Pasky. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-09 04:45:39 +02:00			`struct diff_filespec *one,`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`struct diff_filespec *two)`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`{`
			`int lc_a, lc_b;`
diff: fix output of total-rewrite diff. We did not read in the file data before emitting the total-rewrite diff. Noticed by Pasky. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-09 04:45:39 +02:00			`diff_populate_filespec(one, 0);`
			`diff_populate_filespec(two, 0);`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`lc_a = count_lines(one->data, one->size);`
			`lc_b = count_lines(two->data, two->size);`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`printf("--- %s\n+++ %s\n@@ -", name_a, name_b);`
			`print_line_count(lc_a);`
			`printf(" +");`
			`print_line_count(lc_b);`
			`printf(" @@\n");`
			`if (lc_a)`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`copy_file('-', one->data, one->size);`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`if (lc_b)`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`copy_file('+', two->data, two->size);`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`}`

true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`static int fill_mmfile(mmfile_t mf, struct diff_filespec one)`
Use a real built-in diff generator This uses a simplified libxdiff setup to generate unified diffs _without_ doing fork/execve of GNU "diff". This has several huge advantages, for example: Before: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m24.818s user 0m13.332s sys 0m8.664s After: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m4.563s user 0m2.944s sys 0m1.580s and the fact that this should be a lot more portable (ie we can ignore all the issues with doing fork/execve under Windows). Perhaps even more importantly, this allows us to do diffs without actually ever writing out the git file contents to a temporary file (and without any of the shell quoting issues on filenames etc etc). NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the current "diff-core" code actually will always write the temp-files, because it used to be something that you simply had to do. So this current one actually writes a temp-file like before, and then reads it into memory again just to do the diff. Stupid. But if this basic infrastructure is accepted, we can start switching over diff-core to not write temp-files, which should speed things up even further, especially when doing big tree-to-tree diffs. Now, in the interest of full disclosure, I should also point out a few downsides: - the libxdiff algorithm is different, and I bet GNU diff has gotten a lot more testing. And the thing is, generating a diff is not an exact science - you can get two different diffs (and you will), and they can both be perfectly valid. So it's not possible to "validate" the libxdiff output by just comparing it against GNU diff. - GNU diff does some nice eye-candy, like trying to figure out what the last function was, and adding that information to the "@@ .." line. libxdiff doesn't do that. - The libxdiff thing has some known deficiencies. In particular, it gets the "\No newline at end of file" case wrong. So this is currently for the experimental branch only. I hope Davide will help fix it. That said, I think the huge performance advantage, and the fact that it integrates better is definitely worth it. But it should go into a development branch at least due to the missing newline issue. Technical note: this is based on libxdiff-0.17, but I did some surgery to get rid of the extraneous fat - stuff that git doesn't need, and seriously cutting down on mmfile_t, which had much more capabilities than the diff algorithm either needed or used. In this version, "mmfile_t" is just a trivial <pointer,length> tuple. That said, I tried to keep the differences to simple removals, so that you can do a diff between this and the libxdiff origin, and you'll basically see just things getting deleted. Even the mmfile_t simplifications are left in a state where the diffs should be readable. Apologies to Davide, whom I'd love to get feedback on this all from (I wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old complex format had a helper function for that, but I did my surgery with the goal in mind that eventually we _should_ just do mmfile_t mf; buf = read_sha1_file(sha1, type, &size); mf->ptr = buf; mf->size = size; .. use "mf" directly .. which was really a nightmare with the old "helpful" mmfile_t, and really is that easy with the new cut-down interfaces). [ Btw, as any hawk-eye can see from the diff, this was actually generated with itself, so it is "self-hosting". That's about all the testing it has gotten, along with the above kernel diff, which eye-balls correctly, but shows the newline issue when you double-check it with "git-apply" ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 05:13:22 +01:00			`{`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`if (!DIFF_FILE_VALID(one)) {`
			`mf->ptr = ""; /* does not matter */`
			`mf->size = 0;`
Use a real built-in diff generator This uses a simplified libxdiff setup to generate unified diffs _without_ doing fork/execve of GNU "diff". This has several huge advantages, for example: Before: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m24.818s user 0m13.332s sys 0m8.664s After: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m4.563s user 0m2.944s sys 0m1.580s and the fact that this should be a lot more portable (ie we can ignore all the issues with doing fork/execve under Windows). Perhaps even more importantly, this allows us to do diffs without actually ever writing out the git file contents to a temporary file (and without any of the shell quoting issues on filenames etc etc). NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the current "diff-core" code actually will always write the temp-files, because it used to be something that you simply had to do. So this current one actually writes a temp-file like before, and then reads it into memory again just to do the diff. Stupid. But if this basic infrastructure is accepted, we can start switching over diff-core to not write temp-files, which should speed things up even further, especially when doing big tree-to-tree diffs. Now, in the interest of full disclosure, I should also point out a few downsides: - the libxdiff algorithm is different, and I bet GNU diff has gotten a lot more testing. And the thing is, generating a diff is not an exact science - you can get two different diffs (and you will), and they can both be perfectly valid. So it's not possible to "validate" the libxdiff output by just comparing it against GNU diff. - GNU diff does some nice eye-candy, like trying to figure out what the last function was, and adding that information to the "@@ .." line. libxdiff doesn't do that. - The libxdiff thing has some known deficiencies. In particular, it gets the "\No newline at end of file" case wrong. So this is currently for the experimental branch only. I hope Davide will help fix it. That said, I think the huge performance advantage, and the fact that it integrates better is definitely worth it. But it should go into a development branch at least due to the missing newline issue. Technical note: this is based on libxdiff-0.17, but I did some surgery to get rid of the extraneous fat - stuff that git doesn't need, and seriously cutting down on mmfile_t, which had much more capabilities than the diff algorithm either needed or used. In this version, "mmfile_t" is just a trivial <pointer,length> tuple. That said, I tried to keep the differences to simple removals, so that you can do a diff between this and the libxdiff origin, and you'll basically see just things getting deleted. Even the mmfile_t simplifications are left in a state where the diffs should be readable. Apologies to Davide, whom I'd love to get feedback on this all from (I wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old complex format had a helper function for that, but I did my surgery with the goal in mind that eventually we _should_ just do mmfile_t mf; buf = read_sha1_file(sha1, type, &size); mf->ptr = buf; mf->size = size; .. use "mf" directly .. which was really a nightmare with the old "helpful" mmfile_t, and really is that easy with the new cut-down interfaces). [ Btw, as any hawk-eye can see from the diff, this was actually generated with itself, so it is "self-hosting". That's about all the testing it has gotten, along with the above kernel diff, which eye-balls correctly, but shows the newline issue when you double-check it with "git-apply" ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 05:13:22 +01:00			`return 0;`
			`}`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`else if (diff_populate_filespec(one, 0))`
			`return -1;`
			`mf->ptr = one->data;`
			`mf->size = one->size;`
Use a real built-in diff generator This uses a simplified libxdiff setup to generate unified diffs _without_ doing fork/execve of GNU "diff". This has several huge advantages, for example: Before: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m24.818s user 0m13.332s sys 0m8.664s After: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m4.563s user 0m2.944s sys 0m1.580s and the fact that this should be a lot more portable (ie we can ignore all the issues with doing fork/execve under Windows). Perhaps even more importantly, this allows us to do diffs without actually ever writing out the git file contents to a temporary file (and without any of the shell quoting issues on filenames etc etc). NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the current "diff-core" code actually will always write the temp-files, because it used to be something that you simply had to do. So this current one actually writes a temp-file like before, and then reads it into memory again just to do the diff. Stupid. But if this basic infrastructure is accepted, we can start switching over diff-core to not write temp-files, which should speed things up even further, especially when doing big tree-to-tree diffs. Now, in the interest of full disclosure, I should also point out a few downsides: - the libxdiff algorithm is different, and I bet GNU diff has gotten a lot more testing. And the thing is, generating a diff is not an exact science - you can get two different diffs (and you will), and they can both be perfectly valid. So it's not possible to "validate" the libxdiff output by just comparing it against GNU diff. - GNU diff does some nice eye-candy, like trying to figure out what the last function was, and adding that information to the "@@ .." line. libxdiff doesn't do that. - The libxdiff thing has some known deficiencies. In particular, it gets the "\No newline at end of file" case wrong. So this is currently for the experimental branch only. I hope Davide will help fix it. That said, I think the huge performance advantage, and the fact that it integrates better is definitely worth it. But it should go into a development branch at least due to the missing newline issue. Technical note: this is based on libxdiff-0.17, but I did some surgery to get rid of the extraneous fat - stuff that git doesn't need, and seriously cutting down on mmfile_t, which had much more capabilities than the diff algorithm either needed or used. In this version, "mmfile_t" is just a trivial <pointer,length> tuple. That said, I tried to keep the differences to simple removals, so that you can do a diff between this and the libxdiff origin, and you'll basically see just things getting deleted. Even the mmfile_t simplifications are left in a state where the diffs should be readable. Apologies to Davide, whom I'd love to get feedback on this all from (I wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old complex format had a helper function for that, but I did my surgery with the goal in mind that eventually we _should_ just do mmfile_t mf; buf = read_sha1_file(sha1, type, &size); mf->ptr = buf; mf->size = size; .. use "mf" directly .. which was really a nightmare with the old "helpful" mmfile_t, and really is that easy with the new cut-down interfaces). [ Btw, as any hawk-eye can see from the diff, this was actually generated with itself, so it is "self-hosting". That's about all the testing it has gotten, along with the above kernel diff, which eye-balls correctly, but shows the newline issue when you double-check it with "git-apply" ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 05:13:22 +01:00			`return 0;`
			`}`

built-in diff: minimum tweaks This fixes up a couple of minor issues with the real built-in diff to be more usable: - Omit ---/+++ header unless we emit diff output; - Detect and punt binary diff like GNU does; - Honor GIT_DIFF_OPTS minimally (only -u<number> and --unified=<number> are currently supported); - Omit line count of 1 from "@@ -l,k +m,n @@" hunk header (i.e. when k == 1 or n == 1) - Adjust testsuite for the lack of -p support. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 21:16:17 +01:00			`struct emit_callback {`
			`const char **label_path;`
			`};`

Use a real built-in diff generator This uses a simplified libxdiff setup to generate unified diffs _without_ doing fork/execve of GNU "diff". This has several huge advantages, for example: Before: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m24.818s user 0m13.332s sys 0m8.664s After: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m4.563s user 0m2.944s sys 0m1.580s and the fact that this should be a lot more portable (ie we can ignore all the issues with doing fork/execve under Windows). Perhaps even more importantly, this allows us to do diffs without actually ever writing out the git file contents to a temporary file (and without any of the shell quoting issues on filenames etc etc). NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the current "diff-core" code actually will always write the temp-files, because it used to be something that you simply had to do. So this current one actually writes a temp-file like before, and then reads it into memory again just to do the diff. Stupid. But if this basic infrastructure is accepted, we can start switching over diff-core to not write temp-files, which should speed things up even further, especially when doing big tree-to-tree diffs. Now, in the interest of full disclosure, I should also point out a few downsides: - the libxdiff algorithm is different, and I bet GNU diff has gotten a lot more testing. And the thing is, generating a diff is not an exact science - you can get two different diffs (and you will), and they can both be perfectly valid. So it's not possible to "validate" the libxdiff output by just comparing it against GNU diff. - GNU diff does some nice eye-candy, like trying to figure out what the last function was, and adding that information to the "@@ .." line. libxdiff doesn't do that. - The libxdiff thing has some known deficiencies. In particular, it gets the "\No newline at end of file" case wrong. So this is currently for the experimental branch only. I hope Davide will help fix it. That said, I think the huge performance advantage, and the fact that it integrates better is definitely worth it. But it should go into a development branch at least due to the missing newline issue. Technical note: this is based on libxdiff-0.17, but I did some surgery to get rid of the extraneous fat - stuff that git doesn't need, and seriously cutting down on mmfile_t, which had much more capabilities than the diff algorithm either needed or used. In this version, "mmfile_t" is just a trivial <pointer,length> tuple. That said, I tried to keep the differences to simple removals, so that you can do a diff between this and the libxdiff origin, and you'll basically see just things getting deleted. Even the mmfile_t simplifications are left in a state where the diffs should be readable. Apologies to Davide, whom I'd love to get feedback on this all from (I wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old complex format had a helper function for that, but I did my surgery with the goal in mind that eventually we _should_ just do mmfile_t mf; buf = read_sha1_file(sha1, type, &size); mf->ptr = buf; mf->size = size; .. use "mf" directly .. which was really a nightmare with the old "helpful" mmfile_t, and really is that easy with the new cut-down interfaces). [ Btw, as any hawk-eye can see from the diff, this was actually generated with itself, so it is "self-hosting". That's about all the testing it has gotten, along with the above kernel diff, which eye-balls correctly, but shows the newline issue when you double-check it with "git-apply" ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 05:13:22 +01:00			`static int fn_out(void priv, mmbuffer_t mb, int nbuf)`
			`{`
			`int i;`
built-in diff: minimum tweaks This fixes up a couple of minor issues with the real built-in diff to be more usable: - Omit ---/+++ header unless we emit diff output; - Detect and punt binary diff like GNU does; - Honor GIT_DIFF_OPTS minimally (only -u<number> and --unified=<number> are currently supported); - Omit line count of 1 from "@@ -l,k +m,n @@" hunk header (i.e. when k == 1 or n == 1) - Adjust testsuite for the lack of -p support. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 21:16:17 +01:00			`struct emit_callback *ecbdata = priv;`
Use a real built-in diff generator This uses a simplified libxdiff setup to generate unified diffs _without_ doing fork/execve of GNU "diff". This has several huge advantages, for example: Before: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m24.818s user 0m13.332s sys 0m8.664s After: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m4.563s user 0m2.944s sys 0m1.580s and the fact that this should be a lot more portable (ie we can ignore all the issues with doing fork/execve under Windows). Perhaps even more importantly, this allows us to do diffs without actually ever writing out the git file contents to a temporary file (and without any of the shell quoting issues on filenames etc etc). NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the current "diff-core" code actually will always write the temp-files, because it used to be something that you simply had to do. So this current one actually writes a temp-file like before, and then reads it into memory again just to do the diff. Stupid. But if this basic infrastructure is accepted, we can start switching over diff-core to not write temp-files, which should speed things up even further, especially when doing big tree-to-tree diffs. Now, in the interest of full disclosure, I should also point out a few downsides: - the libxdiff algorithm is different, and I bet GNU diff has gotten a lot more testing. And the thing is, generating a diff is not an exact science - you can get two different diffs (and you will), and they can both be perfectly valid. So it's not possible to "validate" the libxdiff output by just comparing it against GNU diff. - GNU diff does some nice eye-candy, like trying to figure out what the last function was, and adding that information to the "@@ .." line. libxdiff doesn't do that. - The libxdiff thing has some known deficiencies. In particular, it gets the "\No newline at end of file" case wrong. So this is currently for the experimental branch only. I hope Davide will help fix it. That said, I think the huge performance advantage, and the fact that it integrates better is definitely worth it. But it should go into a development branch at least due to the missing newline issue. Technical note: this is based on libxdiff-0.17, but I did some surgery to get rid of the extraneous fat - stuff that git doesn't need, and seriously cutting down on mmfile_t, which had much more capabilities than the diff algorithm either needed or used. In this version, "mmfile_t" is just a trivial <pointer,length> tuple. That said, I tried to keep the differences to simple removals, so that you can do a diff between this and the libxdiff origin, and you'll basically see just things getting deleted. Even the mmfile_t simplifications are left in a state where the diffs should be readable. Apologies to Davide, whom I'd love to get feedback on this all from (I wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old complex format had a helper function for that, but I did my surgery with the goal in mind that eventually we _should_ just do mmfile_t mf; buf = read_sha1_file(sha1, type, &size); mf->ptr = buf; mf->size = size; .. use "mf" directly .. which was really a nightmare with the old "helpful" mmfile_t, and really is that easy with the new cut-down interfaces). [ Btw, as any hawk-eye can see from the diff, this was actually generated with itself, so it is "self-hosting". That's about all the testing it has gotten, along with the above kernel diff, which eye-balls correctly, but shows the newline issue when you double-check it with "git-apply" ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 05:13:22 +01:00
built-in diff: minimum tweaks This fixes up a couple of minor issues with the real built-in diff to be more usable: - Omit ---/+++ header unless we emit diff output; - Detect and punt binary diff like GNU does; - Honor GIT_DIFF_OPTS minimally (only -u<number> and --unified=<number> are currently supported); - Omit line count of 1 from "@@ -l,k +m,n @@" hunk header (i.e. when k == 1 or n == 1) - Adjust testsuite for the lack of -p support. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 21:16:17 +01:00			`if (ecbdata->label_path[0]) {`
			`printf("--- %s\n", ecbdata->label_path[0]);`
			`printf("+++ %s\n", ecbdata->label_path[1]);`
			`ecbdata->label_path[0] = ecbdata->label_path[1] = NULL;`
			`}`
Use a real built-in diff generator This uses a simplified libxdiff setup to generate unified diffs _without_ doing fork/execve of GNU "diff". This has several huge advantages, for example: Before: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m24.818s user 0m13.332s sys 0m8.664s After: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m4.563s user 0m2.944s sys 0m1.580s and the fact that this should be a lot more portable (ie we can ignore all the issues with doing fork/execve under Windows). Perhaps even more importantly, this allows us to do diffs without actually ever writing out the git file contents to a temporary file (and without any of the shell quoting issues on filenames etc etc). NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the current "diff-core" code actually will always write the temp-files, because it used to be something that you simply had to do. So this current one actually writes a temp-file like before, and then reads it into memory again just to do the diff. Stupid. But if this basic infrastructure is accepted, we can start switching over diff-core to not write temp-files, which should speed things up even further, especially when doing big tree-to-tree diffs. Now, in the interest of full disclosure, I should also point out a few downsides: - the libxdiff algorithm is different, and I bet GNU diff has gotten a lot more testing. And the thing is, generating a diff is not an exact science - you can get two different diffs (and you will), and they can both be perfectly valid. So it's not possible to "validate" the libxdiff output by just comparing it against GNU diff. - GNU diff does some nice eye-candy, like trying to figure out what the last function was, and adding that information to the "@@ .." line. libxdiff doesn't do that. - The libxdiff thing has some known deficiencies. In particular, it gets the "\No newline at end of file" case wrong. So this is currently for the experimental branch only. I hope Davide will help fix it. That said, I think the huge performance advantage, and the fact that it integrates better is definitely worth it. But it should go into a development branch at least due to the missing newline issue. Technical note: this is based on libxdiff-0.17, but I did some surgery to get rid of the extraneous fat - stuff that git doesn't need, and seriously cutting down on mmfile_t, which had much more capabilities than the diff algorithm either needed or used. In this version, "mmfile_t" is just a trivial <pointer,length> tuple. That said, I tried to keep the differences to simple removals, so that you can do a diff between this and the libxdiff origin, and you'll basically see just things getting deleted. Even the mmfile_t simplifications are left in a state where the diffs should be readable. Apologies to Davide, whom I'd love to get feedback on this all from (I wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old complex format had a helper function for that, but I did my surgery with the goal in mind that eventually we _should_ just do mmfile_t mf; buf = read_sha1_file(sha1, type, &size); mf->ptr = buf; mf->size = size; .. use "mf" directly .. which was really a nightmare with the old "helpful" mmfile_t, and really is that easy with the new cut-down interfaces). [ Btw, as any hawk-eye can see from the diff, this was actually generated with itself, so it is "self-hosting". That's about all the testing it has gotten, along with the above kernel diff, which eye-balls correctly, but shows the newline issue when you double-check it with "git-apply" ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 05:13:22 +01:00			`for (i = 0; i < nbuf; i++)`
			`if (!fwrite(mb[i].ptr, mb[i].size, 1, stdout))`
			`return -1;`
			`return 0;`
			`}`

diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`struct diffstat_t {`
			`struct xdiff_emit_state xm;`

			`int nr;`
			`int alloc;`
			`struct diffstat_file {`
			`char *name;`
diff-files --stat: do not dump core with unmerged index. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-16 03:38:32 +02:00			`unsigned is_unmerged:1;`
			`unsigned is_binary:1;`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`unsigned int added, deleted;`
			`} **files;`
			`};`

			`static struct diffstat_file diffstat_add(struct diffstat_t diffstat,`
			`const char *name)`
			`{`
			`struct diffstat_file *x;`
			`x = xcalloc(sizeof (*x), 1);`
			`if (diffstat->nr == diffstat->alloc) {`
			`diffstat->alloc = alloc_nr(diffstat->alloc);`
			`diffstat->files = xrealloc(diffstat->files,`
			`diffstat->alloc * sizeof(x));`
			`}`
			`diffstat->files[diffstat->nr++] = x;`
			`x->name = strdup(name);`
			`return x;`
			`}`

			`static void diffstat_consume(void priv, char line, unsigned long len)`
			`{`
			`struct diffstat_t *diffstat = priv;`
			`struct diffstat_file *x = diffstat->files[diffstat->nr - 1];`

			`if (line[0] == '+')`
			`x->added++;`
			`else if (line[0] == '-')`
			`x->deleted++;`
			`}`

			`static const char pluses[] = "++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++";`
			`static const char minuses[]= "----------------------------------------------------------------------";`

			`static void show_stats(struct diffstat_t* data)`
			`{`
			`char *prefix = "";`
			`int i, len, add, del, total, adds = 0, dels = 0;`
			`int max, max_change = 0, max_len = 0;`
			`int total_files = data->nr;`

			`if (data->nr == 0)`
			`return;`

			`for (i = 0; i < data->nr; i++) {`
			`struct diffstat_file *file = data->files[i];`

Fix filename scaling for binary files Set maximum filename length for binary files so that scaling won't be triggered and result in invalid string access. Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-18 23:26:43 +02:00			`len = strlen(file->name);`
			`if (max_len < len)`
			`max_len = len;`

diff-files --stat: do not dump core with unmerged index. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-16 03:38:32 +02:00			`if (file->is_binary \|\| file->is_unmerged)`
			`continue;`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`if (max_change < file->added + file->deleted)`
			`max_change = file->added + file->deleted;`
			`}`

			`for (i = 0; i < data->nr; i++) {`
			`char *name = data->files[i]->name;`
			`int added = data->files[i]->added;`
			`int deleted = data->files[i]->deleted;`

			`if (0 < (len = quote_c_style(name, NULL, NULL, 0))) {`
			`char *qname = xmalloc(len + 1);`
			`quote_c_style(name, qname, NULL, 0);`
			`free(name);`
diff-options: add --stat (take 2) ... and a fix for an invalid free(): Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 01:09:48 +02:00			`data->files[i]->name = name = qname;`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`}`

			`/*`
			`* "scale" the filename`
			`*/`
			`len = strlen(name);`
			`max = max_len;`
			`if (max > 50)`
			`max = 50;`
			`if (len > max) {`
			`char *slash;`
			`prefix = "...";`
			`max -= 3;`
			`name += len - max;`
			`slash = strchr(name, '/');`
			`if (slash)`
			`name = slash;`
			`}`
			`len = max;`

			`/*`
			`* scale the add/delete`
			`*/`
			`max = max_change;`
			`if (max + len > 70)`
			`max = 70 - len;`

diff-files --stat: do not dump core with unmerged index. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-16 03:38:32 +02:00			`if (data->files[i]->is_binary) {`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`printf(" %s%-*s \| Bin\n", prefix, len, name);`
diff-options: add --stat (take 2) ... and a fix for an invalid free(): Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 01:09:48 +02:00			`goto free_diffstat_file;`
diff-files --stat: do not dump core with unmerged index. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-16 03:38:32 +02:00			`}`
			`else if (data->files[i]->is_unmerged) {`
			`printf(" %s%-*s \| Unmerged\n", prefix, len, name);`
			`goto free_diffstat_file;`
			`}`
			`else if (added + deleted == 0) {`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`total_files--;`
diff-options: add --stat (take 2) ... and a fix for an invalid free(): Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 01:09:48 +02:00			`goto free_diffstat_file;`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`}`

			`add = added;`
			`del = deleted;`
			`total = add + del;`
			`adds += add;`
			`dels += del;`

			`if (max_change > 0) {`
			`total = (total * max + max_change / 2) / max_change;`
			`add = (add * max + max_change / 2) / max_change;`
			`del = total - add;`
			`}`
			`printf(" %s%-s \|%5d %.s%.*s\n", prefix,`
			`len, name, added + deleted,`
			`add, pluses, del, minuses);`
diff-options: add --stat (take 2) ... and a fix for an invalid free(): Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 01:09:48 +02:00			`free_diffstat_file:`
			`free(data->files[i]->name);`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`free(data->files[i]);`
			`}`
			`free(data->files);`
			`printf(" %d files changed, %d insertions(+), %d deletions(-)\n",`
			`total_files, adds, dels);`
			`}`

built-in diff: minimum tweaks This fixes up a couple of minor issues with the real built-in diff to be more usable: - Omit ---/+++ header unless we emit diff output; - Detect and punt binary diff like GNU does; - Honor GIT_DIFF_OPTS minimally (only -u<number> and --unified=<number> are currently supported); - Omit line count of 1 from "@@ -l,k +m,n @@" hunk header (i.e. when k == 1 or n == 1) - Adjust testsuite for the lack of -p support. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 21:16:17 +01:00			`#define FIRST_FEW_BYTES 8000`
			`static int mmfile_is_binary(mmfile_t *mf)`
			`{`
			`long sz = mf->size;`
			`if (FIRST_FEW_BYTES < sz)`
			`sz = FIRST_FEW_BYTES;`
			`if (memchr(mf->ptr, 0, sz))`
			`return 1;`
			`return 0;`
			`}`

true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`static void builtin_diff(const char *name_a,`
[PATCH] Diff-helper update This patch adds a framework and a stub implementation of rename detection to diff-helper program. The current stub code is just enough to detect pure renames in diff-tree output and not fancier. The plan is perhaps to use the same delta code when Nico's delta storage patch is merged for similarity evaluation purposes. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-18 08:29:49 +02:00			`const char *name_b,`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`struct diff_filespec *one,`
			`struct diff_filespec *two,`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`const char *xfrm_msg,`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`int complete_rewrite)`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00			`{`
Use a real built-in diff generator This uses a simplified libxdiff setup to generate unified diffs _without_ doing fork/execve of GNU "diff". This has several huge advantages, for example: Before: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m24.818s user 0m13.332s sys 0m8.664s After: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m4.563s user 0m2.944s sys 0m1.580s and the fact that this should be a lot more portable (ie we can ignore all the issues with doing fork/execve under Windows). Perhaps even more importantly, this allows us to do diffs without actually ever writing out the git file contents to a temporary file (and without any of the shell quoting issues on filenames etc etc). NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the current "diff-core" code actually will always write the temp-files, because it used to be something that you simply had to do. So this current one actually writes a temp-file like before, and then reads it into memory again just to do the diff. Stupid. But if this basic infrastructure is accepted, we can start switching over diff-core to not write temp-files, which should speed things up even further, especially when doing big tree-to-tree diffs. Now, in the interest of full disclosure, I should also point out a few downsides: - the libxdiff algorithm is different, and I bet GNU diff has gotten a lot more testing. And the thing is, generating a diff is not an exact science - you can get two different diffs (and you will), and they can both be perfectly valid. So it's not possible to "validate" the libxdiff output by just comparing it against GNU diff. - GNU diff does some nice eye-candy, like trying to figure out what the last function was, and adding that information to the "@@ .." line. libxdiff doesn't do that. - The libxdiff thing has some known deficiencies. In particular, it gets the "\No newline at end of file" case wrong. So this is currently for the experimental branch only. I hope Davide will help fix it. That said, I think the huge performance advantage, and the fact that it integrates better is definitely worth it. But it should go into a development branch at least due to the missing newline issue. Technical note: this is based on libxdiff-0.17, but I did some surgery to get rid of the extraneous fat - stuff that git doesn't need, and seriously cutting down on mmfile_t, which had much more capabilities than the diff algorithm either needed or used. In this version, "mmfile_t" is just a trivial <pointer,length> tuple. That said, I tried to keep the differences to simple removals, so that you can do a diff between this and the libxdiff origin, and you'll basically see just things getting deleted. Even the mmfile_t simplifications are left in a state where the diffs should be readable. Apologies to Davide, whom I'd love to get feedback on this all from (I wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old complex format had a helper function for that, but I did my surgery with the goal in mind that eventually we _should_ just do mmfile_t mf; buf = read_sha1_file(sha1, type, &size); mf->ptr = buf; mf->size = size; .. use "mf" directly .. which was really a nightmare with the old "helpful" mmfile_t, and really is that easy with the new cut-down interfaces). [ Btw, as any hawk-eye can see from the diff, this was actually generated with itself, so it is "self-hosting". That's about all the testing it has gotten, along with the above kernel diff, which eye-balls correctly, but shows the newline issue when you double-check it with "git-apply" ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 05:13:22 +01:00			`mmfile_t mf1, mf2;`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`const char *lbl[2];`
			`char a_one, b_two;`

			`a_one = quote_two("a/", name_a);`
			`b_two = quote_two("b/", name_b);`
			`lbl[0] = DIFF_FILE_VALID(one) ? a_one : "/dev/null";`
			`lbl[1] = DIFF_FILE_VALID(two) ? b_two : "/dev/null";`
			`printf("diff --git %s %s\n", a_one, b_two);`
			`if (lbl[0][0] == '/') {`
			`/* /dev/null */`
			`printf("new file mode %06o\n", two->mode);`
[PATCH] Show dissimilarity index for D and N case. The way broken deletes and creates are shown in the -p (diff-patch) output format has become consistent with how rename/copy edits are shown. They will show "dissimilarity index" value, immediately following the "deleted file mode" and "new file mode" lines. The git-apply is taught to grok such an extended header. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-31 01:40:16 +02:00			`if (xfrm_msg && xfrm_msg[0])`
			`puts(xfrm_msg);`
			`}`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`else if (lbl[1][0] == '/') {`
			`printf("deleted file mode %06o\n", one->mode);`
[PATCH] Show dissimilarity index for D and N case. The way broken deletes and creates are shown in the -p (diff-patch) output format has become consistent with how rename/copy edits are shown. They will show "dissimilarity index" value, immediately following the "deleted file mode" and "new file mode" lines. The git-apply is taught to grok such an extended header. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-31 01:40:16 +02:00			`if (xfrm_msg && xfrm_msg[0])`
			`puts(xfrm_msg);`
			`}`
[PATCH 1/3] Update mode-change strings in diff output. This updates the mode change strings to be a bit more machine friendly. Although this might go against the spirit of readability for human consumption, these mode bits strings are shown only when unusual things (mode change, file creation and deletion) happens, output normalized for machine consumption would be permissible. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Petr Baudis <pasky@ucw.cz> 2005-05-14 03:40:14 +02:00			`else {`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`if (one->mode != two->mode) {`
			`printf("old mode %06o\n", one->mode);`
			`printf("new mode %06o\n", two->mode);`
[PATCH] Fix diff output take #4. This implements the output format suggested by Linus in <Pine.LNX.4.58.0505161556260.18337@ppc970.osdl.org>, except the imaginary diff option is spelled "diff --git" with double dashes as suggested by Matthias Urlichs. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-18 18:10:47 +02:00			`}`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`if (xfrm_msg && xfrm_msg[0])`
[PATCH] Remove final newline from the value of xfrm_msg variable. This change makes the implementation of git-external-diff-script cleaner. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:54:06 +02:00			`puts(xfrm_msg);`
Make git diff-generation use a simpler spawn-like interface Instead of depending of fork() and execve() and doing things in between the two, make the git diff functions do everything up front, and then do a single "spawn_prog()" invocation to run the actual external diff program (if any is even needed). This actually ends up simplifying the code, and should make it much easier to make it efficient under broken operating systems (read: Windows). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-27 00:51:24 +01:00			`/*`
			`* we do not run diff between different kind`
			`* of objects.`
			`*/`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`if ((one->mode ^ two->mode) & S_IFMT)`
			`goto free_ab_and_return;`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`if (complete_rewrite) {`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`emit_rewrite_diff(name_a, name_b, one, two);`
			`goto free_ab_and_return;`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`}`
Update diff engine for symlinks stored in the cache. This patch updates the external diff interface engine for the change to store the symbolic links in the cache, recently done by Kay Sievers. The main thing it does is when comparing with the work tree, it prepares the counterpart to the blob being compared by doing a readlink followed by sending that result to a temporary file to be diffed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-06 01:10:21 +02:00			`}`
Make git diff-generation use a simpler spawn-like interface Instead of depending of fork() and execve() and doing things in between the two, make the git diff functions do everything up front, and then do a single "spawn_prog()" invocation to run the actual external diff program (if any is even needed). This actually ends up simplifying the code, and should make it much easier to make it efficient under broken operating systems (read: Windows). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-27 00:51:24 +01:00
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`if (fill_mmfile(&mf1, one) < 0 \|\| fill_mmfile(&mf2, two) < 0)`
Use a real built-in diff generator This uses a simplified libxdiff setup to generate unified diffs _without_ doing fork/execve of GNU "diff". This has several huge advantages, for example: Before: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m24.818s user 0m13.332s sys 0m8.664s After: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m4.563s user 0m2.944s sys 0m1.580s and the fact that this should be a lot more portable (ie we can ignore all the issues with doing fork/execve under Windows). Perhaps even more importantly, this allows us to do diffs without actually ever writing out the git file contents to a temporary file (and without any of the shell quoting issues on filenames etc etc). NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the current "diff-core" code actually will always write the temp-files, because it used to be something that you simply had to do. So this current one actually writes a temp-file like before, and then reads it into memory again just to do the diff. Stupid. But if this basic infrastructure is accepted, we can start switching over diff-core to not write temp-files, which should speed things up even further, especially when doing big tree-to-tree diffs. Now, in the interest of full disclosure, I should also point out a few downsides: - the libxdiff algorithm is different, and I bet GNU diff has gotten a lot more testing. And the thing is, generating a diff is not an exact science - you can get two different diffs (and you will), and they can both be perfectly valid. So it's not possible to "validate" the libxdiff output by just comparing it against GNU diff. - GNU diff does some nice eye-candy, like trying to figure out what the last function was, and adding that information to the "@@ .." line. libxdiff doesn't do that. - The libxdiff thing has some known deficiencies. In particular, it gets the "\No newline at end of file" case wrong. So this is currently for the experimental branch only. I hope Davide will help fix it. That said, I think the huge performance advantage, and the fact that it integrates better is definitely worth it. But it should go into a development branch at least due to the missing newline issue. Technical note: this is based on libxdiff-0.17, but I did some surgery to get rid of the extraneous fat - stuff that git doesn't need, and seriously cutting down on mmfile_t, which had much more capabilities than the diff algorithm either needed or used. In this version, "mmfile_t" is just a trivial <pointer,length> tuple. That said, I tried to keep the differences to simple removals, so that you can do a diff between this and the libxdiff origin, and you'll basically see just things getting deleted. Even the mmfile_t simplifications are left in a state where the diffs should be readable. Apologies to Davide, whom I'd love to get feedback on this all from (I wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old complex format had a helper function for that, but I did my surgery with the goal in mind that eventually we _should_ just do mmfile_t mf; buf = read_sha1_file(sha1, type, &size); mf->ptr = buf; mf->size = size; .. use "mf" directly .. which was really a nightmare with the old "helpful" mmfile_t, and really is that easy with the new cut-down interfaces). [ Btw, as any hawk-eye can see from the diff, this was actually generated with itself, so it is "self-hosting". That's about all the testing it has gotten, along with the above kernel diff, which eye-balls correctly, but shows the newline issue when you double-check it with "git-apply" ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 05:13:22 +01:00			`die("unable to read files to diff");`

built-in diff: minimum tweaks This fixes up a couple of minor issues with the real built-in diff to be more usable: - Omit ---/+++ header unless we emit diff output; - Detect and punt binary diff like GNU does; - Honor GIT_DIFF_OPTS minimally (only -u<number> and --unified=<number> are currently supported); - Omit line count of 1 from "@@ -l,k +m,n @@" hunk header (i.e. when k == 1 or n == 1) - Adjust testsuite for the lack of -p support. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 21:16:17 +01:00			`if (mmfile_is_binary(&mf1) \|\| mmfile_is_binary(&mf2))`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`printf("Binary files %s and %s differ\n", lbl[0], lbl[1]);`
built-in diff: minimum tweaks This fixes up a couple of minor issues with the real built-in diff to be more usable: - Omit ---/+++ header unless we emit diff output; - Detect and punt binary diff like GNU does; - Honor GIT_DIFF_OPTS minimally (only -u<number> and --unified=<number> are currently supported); - Omit line count of 1 from "@@ -l,k +m,n @@" hunk header (i.e. when k == 1 or n == 1) - Adjust testsuite for the lack of -p support. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 21:16:17 +01:00			`else {`
			`/* Crazy xdl interfaces.. */`
			`const char *diffopts = getenv("GIT_DIFF_OPTS");`
Use a real built-in diff generator This uses a simplified libxdiff setup to generate unified diffs _without_ doing fork/execve of GNU "diff". This has several huge advantages, for example: Before: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m24.818s user 0m13.332s sys 0m8.664s After: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m4.563s user 0m2.944s sys 0m1.580s and the fact that this should be a lot more portable (ie we can ignore all the issues with doing fork/execve under Windows). Perhaps even more importantly, this allows us to do diffs without actually ever writing out the git file contents to a temporary file (and without any of the shell quoting issues on filenames etc etc). NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the current "diff-core" code actually will always write the temp-files, because it used to be something that you simply had to do. So this current one actually writes a temp-file like before, and then reads it into memory again just to do the diff. Stupid. But if this basic infrastructure is accepted, we can start switching over diff-core to not write temp-files, which should speed things up even further, especially when doing big tree-to-tree diffs. Now, in the interest of full disclosure, I should also point out a few downsides: - the libxdiff algorithm is different, and I bet GNU diff has gotten a lot more testing. And the thing is, generating a diff is not an exact science - you can get two different diffs (and you will), and they can both be perfectly valid. So it's not possible to "validate" the libxdiff output by just comparing it against GNU diff. - GNU diff does some nice eye-candy, like trying to figure out what the last function was, and adding that information to the "@@ .." line. libxdiff doesn't do that. - The libxdiff thing has some known deficiencies. In particular, it gets the "\No newline at end of file" case wrong. So this is currently for the experimental branch only. I hope Davide will help fix it. That said, I think the huge performance advantage, and the fact that it integrates better is definitely worth it. But it should go into a development branch at least due to the missing newline issue. Technical note: this is based on libxdiff-0.17, but I did some surgery to get rid of the extraneous fat - stuff that git doesn't need, and seriously cutting down on mmfile_t, which had much more capabilities than the diff algorithm either needed or used. In this version, "mmfile_t" is just a trivial <pointer,length> tuple. That said, I tried to keep the differences to simple removals, so that you can do a diff between this and the libxdiff origin, and you'll basically see just things getting deleted. Even the mmfile_t simplifications are left in a state where the diffs should be readable. Apologies to Davide, whom I'd love to get feedback on this all from (I wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old complex format had a helper function for that, but I did my surgery with the goal in mind that eventually we _should_ just do mmfile_t mf; buf = read_sha1_file(sha1, type, &size); mf->ptr = buf; mf->size = size; .. use "mf" directly .. which was really a nightmare with the old "helpful" mmfile_t, and really is that easy with the new cut-down interfaces). [ Btw, as any hawk-eye can see from the diff, this was actually generated with itself, so it is "self-hosting". That's about all the testing it has gotten, along with the above kernel diff, which eye-balls correctly, but shows the newline issue when you double-check it with "git-apply" ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 05:13:22 +01:00			`xpparam_t xpp;`
			`xdemitconf_t xecfg;`
			`xdemitcb_t ecb;`
built-in diff: minimum tweaks This fixes up a couple of minor issues with the real built-in diff to be more usable: - Omit ---/+++ header unless we emit diff output; - Detect and punt binary diff like GNU does; - Honor GIT_DIFF_OPTS minimally (only -u<number> and --unified=<number> are currently supported); - Omit line count of 1 from "@@ -l,k +m,n @@" hunk header (i.e. when k == 1 or n == 1) - Adjust testsuite for the lack of -p support. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 21:16:17 +01:00			`struct emit_callback ecbdata;`
Use a real built-in diff generator This uses a simplified libxdiff setup to generate unified diffs _without_ doing fork/execve of GNU "diff". This has several huge advantages, for example: Before: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m24.818s user 0m13.332s sys 0m8.664s After: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m4.563s user 0m2.944s sys 0m1.580s and the fact that this should be a lot more portable (ie we can ignore all the issues with doing fork/execve under Windows). Perhaps even more importantly, this allows us to do diffs without actually ever writing out the git file contents to a temporary file (and without any of the shell quoting issues on filenames etc etc). NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the current "diff-core" code actually will always write the temp-files, because it used to be something that you simply had to do. So this current one actually writes a temp-file like before, and then reads it into memory again just to do the diff. Stupid. But if this basic infrastructure is accepted, we can start switching over diff-core to not write temp-files, which should speed things up even further, especially when doing big tree-to-tree diffs. Now, in the interest of full disclosure, I should also point out a few downsides: - the libxdiff algorithm is different, and I bet GNU diff has gotten a lot more testing. And the thing is, generating a diff is not an exact science - you can get two different diffs (and you will), and they can both be perfectly valid. So it's not possible to "validate" the libxdiff output by just comparing it against GNU diff. - GNU diff does some nice eye-candy, like trying to figure out what the last function was, and adding that information to the "@@ .." line. libxdiff doesn't do that. - The libxdiff thing has some known deficiencies. In particular, it gets the "\No newline at end of file" case wrong. So this is currently for the experimental branch only. I hope Davide will help fix it. That said, I think the huge performance advantage, and the fact that it integrates better is definitely worth it. But it should go into a development branch at least due to the missing newline issue. Technical note: this is based on libxdiff-0.17, but I did some surgery to get rid of the extraneous fat - stuff that git doesn't need, and seriously cutting down on mmfile_t, which had much more capabilities than the diff algorithm either needed or used. In this version, "mmfile_t" is just a trivial <pointer,length> tuple. That said, I tried to keep the differences to simple removals, so that you can do a diff between this and the libxdiff origin, and you'll basically see just things getting deleted. Even the mmfile_t simplifications are left in a state where the diffs should be readable. Apologies to Davide, whom I'd love to get feedback on this all from (I wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old complex format had a helper function for that, but I did my surgery with the goal in mind that eventually we _should_ just do mmfile_t mf; buf = read_sha1_file(sha1, type, &size); mf->ptr = buf; mf->size = size; .. use "mf" directly .. which was really a nightmare with the old "helpful" mmfile_t, and really is that easy with the new cut-down interfaces). [ Btw, as any hawk-eye can see from the diff, this was actually generated with itself, so it is "self-hosting". That's about all the testing it has gotten, along with the above kernel diff, which eye-balls correctly, but shows the newline issue when you double-check it with "git-apply" ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 05:13:22 +01:00
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`ecbdata.label_path = lbl;`
Use a real built-in diff generator This uses a simplified libxdiff setup to generate unified diffs _without_ doing fork/execve of GNU "diff". This has several huge advantages, for example: Before: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m24.818s user 0m13.332s sys 0m8.664s After: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m4.563s user 0m2.944s sys 0m1.580s and the fact that this should be a lot more portable (ie we can ignore all the issues with doing fork/execve under Windows). Perhaps even more importantly, this allows us to do diffs without actually ever writing out the git file contents to a temporary file (and without any of the shell quoting issues on filenames etc etc). NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the current "diff-core" code actually will always write the temp-files, because it used to be something that you simply had to do. So this current one actually writes a temp-file like before, and then reads it into memory again just to do the diff. Stupid. But if this basic infrastructure is accepted, we can start switching over diff-core to not write temp-files, which should speed things up even further, especially when doing big tree-to-tree diffs. Now, in the interest of full disclosure, I should also point out a few downsides: - the libxdiff algorithm is different, and I bet GNU diff has gotten a lot more testing. And the thing is, generating a diff is not an exact science - you can get two different diffs (and you will), and they can both be perfectly valid. So it's not possible to "validate" the libxdiff output by just comparing it against GNU diff. - GNU diff does some nice eye-candy, like trying to figure out what the last function was, and adding that information to the "@@ .." line. libxdiff doesn't do that. - The libxdiff thing has some known deficiencies. In particular, it gets the "\No newline at end of file" case wrong. So this is currently for the experimental branch only. I hope Davide will help fix it. That said, I think the huge performance advantage, and the fact that it integrates better is definitely worth it. But it should go into a development branch at least due to the missing newline issue. Technical note: this is based on libxdiff-0.17, but I did some surgery to get rid of the extraneous fat - stuff that git doesn't need, and seriously cutting down on mmfile_t, which had much more capabilities than the diff algorithm either needed or used. In this version, "mmfile_t" is just a trivial <pointer,length> tuple. That said, I tried to keep the differences to simple removals, so that you can do a diff between this and the libxdiff origin, and you'll basically see just things getting deleted. Even the mmfile_t simplifications are left in a state where the diffs should be readable. Apologies to Davide, whom I'd love to get feedback on this all from (I wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old complex format had a helper function for that, but I did my surgery with the goal in mind that eventually we _should_ just do mmfile_t mf; buf = read_sha1_file(sha1, type, &size); mf->ptr = buf; mf->size = size; .. use "mf" directly .. which was really a nightmare with the old "helpful" mmfile_t, and really is that easy with the new cut-down interfaces). [ Btw, as any hawk-eye can see from the diff, this was actually generated with itself, so it is "self-hosting". That's about all the testing it has gotten, along with the above kernel diff, which eye-balls correctly, but shows the newline issue when you double-check it with "git-apply" ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 05:13:22 +01:00			`xpp.flags = XDF_NEED_MINIMAL;`
			`xecfg.ctxlen = 3;`
xdiff: Show function names in hunk headers. The speed of the built-in diff generator is nice; but the function names shown by `diff -p' are /really/ nice. And I hate having to choose. So, we hack xdiff to find the function names and print them. xdiff has grown a flag to say whether to dig up the function names. The builtin_diff function passes this flag unconditionally. I suppose it could parse GIT_DIFF_OPTS, but it doesn't at the moment. I've also reintroduced the `function name' into the test suite, from which it was removed in commit 3ce8f089. The function names are parsed by a particularly stupid algorithm at the moment: it just tries to find a line in the `old' file, from before the start of the hunk, whose first character looks plausible. Still, it's most definitely a start. Signed-off-by: Mark Wooding <mdw@distorted.org.uk> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-28 04:23:31 +02:00			`xecfg.flags = XDL_EMIT_FUNCNAMES;`
built-in diff: minimum tweaks This fixes up a couple of minor issues with the real built-in diff to be more usable: - Omit ---/+++ header unless we emit diff output; - Detect and punt binary diff like GNU does; - Honor GIT_DIFF_OPTS minimally (only -u<number> and --unified=<number> are currently supported); - Omit line count of 1 from "@@ -l,k +m,n @@" hunk header (i.e. when k == 1 or n == 1) - Adjust testsuite for the lack of -p support. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 21:16:17 +01:00			`if (!diffopts)`
			`;`
			`else if (!strncmp(diffopts, "--unified=", 10))`
			`xecfg.ctxlen = strtoul(diffopts + 10, NULL, 10);`
			`else if (!strncmp(diffopts, "-u", 2))`
			`xecfg.ctxlen = strtoul(diffopts + 2, NULL, 10);`
Use a real built-in diff generator This uses a simplified libxdiff setup to generate unified diffs _without_ doing fork/execve of GNU "diff". This has several huge advantages, for example: Before: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m24.818s user 0m13.332s sys 0m8.664s After: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m4.563s user 0m2.944s sys 0m1.580s and the fact that this should be a lot more portable (ie we can ignore all the issues with doing fork/execve under Windows). Perhaps even more importantly, this allows us to do diffs without actually ever writing out the git file contents to a temporary file (and without any of the shell quoting issues on filenames etc etc). NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the current "diff-core" code actually will always write the temp-files, because it used to be something that you simply had to do. So this current one actually writes a temp-file like before, and then reads it into memory again just to do the diff. Stupid. But if this basic infrastructure is accepted, we can start switching over diff-core to not write temp-files, which should speed things up even further, especially when doing big tree-to-tree diffs. Now, in the interest of full disclosure, I should also point out a few downsides: - the libxdiff algorithm is different, and I bet GNU diff has gotten a lot more testing. And the thing is, generating a diff is not an exact science - you can get two different diffs (and you will), and they can both be perfectly valid. So it's not possible to "validate" the libxdiff output by just comparing it against GNU diff. - GNU diff does some nice eye-candy, like trying to figure out what the last function was, and adding that information to the "@@ .." line. libxdiff doesn't do that. - The libxdiff thing has some known deficiencies. In particular, it gets the "\No newline at end of file" case wrong. So this is currently for the experimental branch only. I hope Davide will help fix it. That said, I think the huge performance advantage, and the fact that it integrates better is definitely worth it. But it should go into a development branch at least due to the missing newline issue. Technical note: this is based on libxdiff-0.17, but I did some surgery to get rid of the extraneous fat - stuff that git doesn't need, and seriously cutting down on mmfile_t, which had much more capabilities than the diff algorithm either needed or used. In this version, "mmfile_t" is just a trivial <pointer,length> tuple. That said, I tried to keep the differences to simple removals, so that you can do a diff between this and the libxdiff origin, and you'll basically see just things getting deleted. Even the mmfile_t simplifications are left in a state where the diffs should be readable. Apologies to Davide, whom I'd love to get feedback on this all from (I wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old complex format had a helper function for that, but I did my surgery with the goal in mind that eventually we _should_ just do mmfile_t mf; buf = read_sha1_file(sha1, type, &size); mf->ptr = buf; mf->size = size; .. use "mf" directly .. which was really a nightmare with the old "helpful" mmfile_t, and really is that easy with the new cut-down interfaces). [ Btw, as any hawk-eye can see from the diff, this was actually generated with itself, so it is "self-hosting". That's about all the testing it has gotten, along with the above kernel diff, which eye-balls correctly, but shows the newline issue when you double-check it with "git-apply" ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 05:13:22 +01:00			`ecb.outf = fn_out;`
built-in diff: minimum tweaks This fixes up a couple of minor issues with the real built-in diff to be more usable: - Omit ---/+++ header unless we emit diff output; - Detect and punt binary diff like GNU does; - Honor GIT_DIFF_OPTS minimally (only -u<number> and --unified=<number> are currently supported); - Omit line count of 1 from "@@ -l,k +m,n @@" hunk header (i.e. when k == 1 or n == 1) - Adjust testsuite for the lack of -p support. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 21:16:17 +01:00			`ecb.priv = &ecbdata;`
Use a real built-in diff generator This uses a simplified libxdiff setup to generate unified diffs _without_ doing fork/execve of GNU "diff". This has several huge advantages, for example: Before: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m24.818s user 0m13.332s sys 0m8.664s After: [torvalds@g5 linux]$ time git diff v2.6.16.. > /dev/null real 0m4.563s user 0m2.944s sys 0m1.580s and the fact that this should be a lot more portable (ie we can ignore all the issues with doing fork/execve under Windows). Perhaps even more importantly, this allows us to do diffs without actually ever writing out the git file contents to a temporary file (and without any of the shell quoting issues on filenames etc etc). NOTE! THIS PATCH DOES NOT DO THAT OPTIMIZATION YET! I was lazy, and the current "diff-core" code actually will always write the temp-files, because it used to be something that you simply had to do. So this current one actually writes a temp-file like before, and then reads it into memory again just to do the diff. Stupid. But if this basic infrastructure is accepted, we can start switching over diff-core to not write temp-files, which should speed things up even further, especially when doing big tree-to-tree diffs. Now, in the interest of full disclosure, I should also point out a few downsides: - the libxdiff algorithm is different, and I bet GNU diff has gotten a lot more testing. And the thing is, generating a diff is not an exact science - you can get two different diffs (and you will), and they can both be perfectly valid. So it's not possible to "validate" the libxdiff output by just comparing it against GNU diff. - GNU diff does some nice eye-candy, like trying to figure out what the last function was, and adding that information to the "@@ .." line. libxdiff doesn't do that. - The libxdiff thing has some known deficiencies. In particular, it gets the "\No newline at end of file" case wrong. So this is currently for the experimental branch only. I hope Davide will help fix it. That said, I think the huge performance advantage, and the fact that it integrates better is definitely worth it. But it should go into a development branch at least due to the missing newline issue. Technical note: this is based on libxdiff-0.17, but I did some surgery to get rid of the extraneous fat - stuff that git doesn't need, and seriously cutting down on mmfile_t, which had much more capabilities than the diff algorithm either needed or used. In this version, "mmfile_t" is just a trivial <pointer,length> tuple. That said, I tried to keep the differences to simple removals, so that you can do a diff between this and the libxdiff origin, and you'll basically see just things getting deleted. Even the mmfile_t simplifications are left in a state where the diffs should be readable. Apologies to Davide, whom I'd love to get feedback on this all from (I wrote my own "fill_mmfile()" for the new simpler mmfile_t format: the old complex format had a helper function for that, but I did my surgery with the goal in mind that eventually we _should_ just do mmfile_t mf; buf = read_sha1_file(sha1, type, &size); mf->ptr = buf; mf->size = size; .. use "mf" directly .. which was really a nightmare with the old "helpful" mmfile_t, and really is that easy with the new cut-down interfaces). [ Btw, as any hawk-eye can see from the diff, this was actually generated with itself, so it is "self-hosting". That's about all the testing it has gotten, along with the above kernel diff, which eye-balls correctly, but shows the newline issue when you double-check it with "git-apply" ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-25 05:13:22 +01:00			`xdl_diff(&mf1, &mf2, &xpp, &xecfg, &ecb);`
			`}`

true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`free_ab_and_return:`
			`free(a_one);`
			`free(b_two);`
			`return;`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00			`}`

diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`static void builtin_diffstat(const char name_a, const char name_b,`
			`struct diff_filespec one, struct diff_filespec two,`
			`struct diffstat_t *diffstat)`
			`{`
			`mmfile_t mf1, mf2;`
			`struct diffstat_file *data;`

			`data = diffstat_add(diffstat, name_a ? name_a : name_b);`

diff-files --stat: do not dump core with unmerged index. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-16 03:38:32 +02:00			`if (!one \|\| !two) {`
			`data->is_unmerged = 1;`
			`return;`
			`}`

diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`if (fill_mmfile(&mf1, one) < 0 \|\| fill_mmfile(&mf2, two) < 0)`
			`die("unable to read files to diff");`

			`if (mmfile_is_binary(&mf1) \|\| mmfile_is_binary(&mf2))`
diff-files --stat: do not dump core with unmerged index. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-16 03:38:32 +02:00			`data->is_binary = 1;`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`else {`
			`/* Crazy xdl interfaces.. */`
			`xpparam_t xpp;`
			`xdemitconf_t xecfg;`
			`xdemitcb_t ecb;`

			`xpp.flags = XDF_NEED_MINIMAL;`
diff --stat: no need to ask funcnames nor context. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 06:35:54 +02:00			`xecfg.ctxlen = 0;`
			`xecfg.flags = 0;`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`ecb.outf = xdiff_outf;`
			`ecb.priv = diffstat;`
			`xdl_diff(&mf1, &mf2, &xpp, &xecfg, &ecb);`
			`}`
			`}`

[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`struct diff_filespec alloc_filespec(const char path)`
			`{`
			`int namelen = strlen(path);`
			`struct diff_filespec spec = xmalloc(sizeof(spec) + namelen + 1);`
[PATCH] Fix alloc_filespec() initialization This simplifies and fixes the initialization of a "diff_filespec" when allocated. The old code would not initialize "sha1_valid". Noticed by valgrind. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-14 22:41:24 +02:00
			`memset(spec, 0, sizeof(*spec));`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`spec->path = (char *)(spec + 1);`
[PATCH] Fix alloc_filespec() initialization This simplifies and fixes the initialization of a "diff_filespec" when allocated. The old code would not initialize "sha1_valid". Noticed by valgrind. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-14 22:41:24 +02:00			`memcpy(spec->path, path, namelen+1);`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`return spec;`
			`}`

			`void fill_filespec(struct diff_filespec spec, const unsigned char sha1,`
			`unsigned short mode)`
			`{`
[PATCH] Diff updates to express type changes With the introduction of type 'T' in the diff-raw output, and the "apply-patch" program Linus has been quietly working on without much advertisement, it started to make sense to emit usable information in the "diff --git" patch output format as well. Earlier built-in diff driver punted and did not say anything about a symbolic link changing into a file or vice versa, but this version represents it as a pair of deletion and creation. It also fixes a minor problem dealing with old archive created with ancient git. The earlier code was reporting file mode change between 100664 and 100644 (we shouldn't). The linux-2.6 git tree has a good example that exposes this problem. A good test case is commit ce1dc02f76432a46db149241e015a4f782974623. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-26 11:24:30 +02:00			`if (mode) {`
tree/diff header cleanup. Introduce tree-walk.[ch] and move "struct tree_desc" and associated functions from various places. Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and move it to cache.h. This macro returns the canonicalized st_mode value in the host byte order for files, symlinks and directories -- to be compared with a tree_desc entry. create_ce_mode(mode) in cache.h is similar but is intended to be used for index entries (so it does not work for directories) and returns the value in the network byte order. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-30 08:55:43 +02:00			`spec->mode = canon_mode(mode);`
[PATCH] The diff-raw format updates. Update the diff-raw format as Linus and I discussed, except that it does not use sequence of underscore '_' letters to express nonexistence. All '0' mode is used for that purpose instead. The new diff-raw format can express rename/copy, and the earlier restriction that -M and -C _must_ be used with the patch format output is no longer necessary. The patch makes -M and -C flags independent of -p flag, so you need to say git-whatchanged -M -p to get the diff/patch format. Updated are both documentations and tests. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 04:42:18 +02:00			`memcpy(spec->sha1, sha1, 20);`
			`spec->sha1_valid = !!memcmp(sha1, null_sha1, 20);`
			`}`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`}`

Optimize diff-cache -p --cached This patch optimizes "diff-cache -p --cached" by avoiding to inflate blobs into temporary files when the blob recorded in the cache matches the corresponding file in the work tree. The file in the work tree is passed as the comparison source in such a case instead. This optimization kicks in only when we have already read the cache this optimization and this is deliberate. Especially, diff-tree does not use this code, because changes are contained in small number of files relative to the project size most of the time, and reading cache is so expensive for a large project that the cost of reading it outweighs the savings by not inflating blobs. Also this patch cleans up the structure passed from diff clients by removing one unused structure member. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:45:24 +02:00			`/*`
			`* Given a name and sha1 pair, if the dircache tells us the file in`
			`* the work tree has that object contents, return true, so that`
			`* prepare_temp_file() does not have to inflate and extract.`
			`*/`
			`static int work_tree_matches(const char name, const unsigned char sha1)`
			`{`
			`struct cache_entry *ce;`
			`struct stat st;`
			`int pos, len;`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00
Optimize diff-cache -p --cached This patch optimizes "diff-cache -p --cached" by avoiding to inflate blobs into temporary files when the blob recorded in the cache matches the corresponding file in the work tree. The file in the work tree is passed as the comparison source in such a case instead. This optimization kicks in only when we have already read the cache this optimization and this is deliberate. Especially, diff-tree does not use this code, because changes are contained in small number of files relative to the project size most of the time, and reading cache is so expensive for a large project that the cost of reading it outweighs the savings by not inflating blobs. Also this patch cleans up the structure passed from diff clients by removing one unused structure member. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:45:24 +02:00			`/* We do not read the cache ourselves here, because the`
			`* benchmark with my previous version that always reads cache`
			`* shows that it makes things worse for diff-tree comparing`
			`* two linux-2.6 kernel trees in an already checked out work`
[PATCH] Diff-helper update This patch adds a framework and a stub implementation of rename detection to diff-helper program. The current stub code is just enough to detect pure renames in diff-tree output and not fancier. The plan is perhaps to use the same delta code when Nico's delta storage patch is merged for similarity evaluation purposes. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-18 08:29:49 +02:00			`* tree. This is because most diff-tree comparisons deal with`
Optimize diff-cache -p --cached This patch optimizes "diff-cache -p --cached" by avoiding to inflate blobs into temporary files when the blob recorded in the cache matches the corresponding file in the work tree. The file in the work tree is passed as the comparison source in such a case instead. This optimization kicks in only when we have already read the cache this optimization and this is deliberate. Especially, diff-tree does not use this code, because changes are contained in small number of files relative to the project size most of the time, and reading cache is so expensive for a large project that the cost of reading it outweighs the savings by not inflating blobs. Also this patch cleans up the structure passed from diff clients by removing one unused structure member. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:45:24 +02:00			`* only a small number of files, while reading the cache is`
			`* expensive for a large project, and its cost outweighs the`
			`* savings we get by not inflating the object to a temporary`
			`* file. Practically, this code only helps when we are used`
			`* by diff-cache --cached, which does read the cache before`
			`* calling us.`
[PATCH] diff overhaul This cleans up the way calls are made into the diff core from diff-tree family and diff-helper. Earlier, these programs had "if (generating_patch)" sprinkled all over the place, but those ugliness are gone and handled uniformly from the diff core, even when not generating patch format. This also allowed diff-cache and diff-files to acquire -R (reverse) option to generate diff in reverse. Users of diff-tree can swap two trees easily so I did not add -R there. [ Linus' note: I'll add -R to "diff-tree" too, since a "commit diff" doesn't have another tree to switch around: the other tree is always the parent(s) of the commit ] Also -M<digits-as-mantissa> suggestion made by Linus has been implemented. Documentation updates are also included. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-20 04:00:36 +02:00			`*/`
Optimize diff-cache -p --cached This patch optimizes "diff-cache -p --cached" by avoiding to inflate blobs into temporary files when the blob recorded in the cache matches the corresponding file in the work tree. The file in the work tree is passed as the comparison source in such a case instead. This optimization kicks in only when we have already read the cache this optimization and this is deliberate. Especially, diff-tree does not use this code, because changes are contained in small number of files relative to the project size most of the time, and reading cache is so expensive for a large project that the cost of reading it outweighs the savings by not inflating blobs. Also this patch cleans up the structure passed from diff clients by removing one unused structure member. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:45:24 +02:00			`if (!active_cache)`
			`return 0;`

			`len = strlen(name);`
			`pos = cache_name_pos(name, len);`
			`if (pos < 0)`
			`return 0;`
			`ce = active_cache[pos];`
Update diff engine for symlinks stored in the cache. This patch updates the external diff interface engine for the change to store the symbolic links in the cache, recently done by Kay Sievers. The main thing it does is when comparing with the work tree, it prepares the counterpart to the blob being compared by doing a readlink followed by sending that result to a temporary file to be diffed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-06 01:10:21 +02:00			`if ((lstat(name, &st) < 0) \|\|`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`!S_ISREG(st.st_mode) \|\| /* careful! */`
"Assume unchanged" git This adds "assume unchanged" logic, started by this message in the list discussion recently: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org> This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit. You can use two new options to update-index to set and reset the CE_VALID bit: git-update-index --assume-unchanged path... git-update-index --no-assume-unchanged path... These forms manipulate only the CE_VALID bit; it does not change the object name recorded in the index file. Nor they add a new entry to the index. When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after: - update-index to explicitly register the current object name to the index file. - when update-index --refresh finds the path to be up-to-date. - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file. The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings. Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability: - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety. - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-). - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified. Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path". This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there probably needs other cases we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and ready to be tested by people who asked for this feature. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-09 06:15:24 +01:00			`ce_match_stat(ce, &st, 0) \|\|`
Optimize diff-cache -p --cached This patch optimizes "diff-cache -p --cached" by avoiding to inflate blobs into temporary files when the blob recorded in the cache matches the corresponding file in the work tree. The file in the work tree is passed as the comparison source in such a case instead. This optimization kicks in only when we have already read the cache this optimization and this is deliberate. Especially, diff-tree does not use this code, because changes are contained in small number of files relative to the project size most of the time, and reading cache is so expensive for a large project that the cost of reading it outweighs the savings by not inflating blobs. Also this patch cleans up the structure passed from diff clients by removing one unused structure member. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:45:24 +02:00			`memcmp(sha1, ce->sha1, 20))`
			`return 0;`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`/* we return 1 only when we can stat, it is a regular file,`
			`* stat information matches, and sha1 recorded in the cache`
			`* matches. I.e. we know the file in the work tree really is`
			`* the same as the <name, sha1> pair.`
			`*/`
Optimize diff-cache -p --cached This patch optimizes "diff-cache -p --cached" by avoiding to inflate blobs into temporary files when the blob recorded in the cache matches the corresponding file in the work tree. The file in the work tree is passed as the comparison source in such a case instead. This optimization kicks in only when we have already read the cache this optimization and this is deliberate. Especially, diff-tree does not use this code, because changes are contained in small number of files relative to the project size most of the time, and reading cache is so expensive for a large project that the cost of reading it outweighs the savings by not inflating blobs. Also this patch cleans up the structure passed from diff clients by removing one unused structure member. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:45:24 +02:00			`return 1;`
			`}`

[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`static struct sha1_size_cache {`
			`unsigned char sha1[20];`
			`unsigned long size;`
			`} **sha1_size_cache;`
			`static int sha1_size_cache_nr, sha1_size_cache_alloc;`

			`static struct sha1_size_cache locate_size_cache(unsigned char sha1,`
[PATCH] diff.c: locate_size_cache() fix. This fixes two bugs. - declaration of auto variable "cmp" was preceeded by a statement, causing compilation error on real C compilers; noticed and patch given by Yoichi Yuasa. - the function's calling convention was overloading its size parameter to mean "largest possible value means do not add entry", which was a bad taste. Brought up during a discussion with Peter Baudis. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-04 08:02:23 +02:00			`int find_only,`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`unsigned long size)`
			`{`
			`int first, last;`
			`struct sha1_size_cache *e;`

			`first = 0;`
			`last = sha1_size_cache_nr;`
			`while (last > first) {`
[PATCH] diff.c: locate_size_cache() fix. This fixes two bugs. - declaration of auto variable "cmp" was preceeded by a statement, causing compilation error on real C compilers; noticed and patch given by Yoichi Yuasa. - the function's calling convention was overloading its size parameter to mean "largest possible value means do not add entry", which was a bad taste. Brought up during a discussion with Peter Baudis. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-04 08:02:23 +02:00			`int cmp, next = (last + first) >> 1;`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`e = sha1_size_cache[next];`
[PATCH] diff.c: locate_size_cache() fix. This fixes two bugs. - declaration of auto variable "cmp" was preceeded by a statement, causing compilation error on real C compilers; noticed and patch given by Yoichi Yuasa. - the function's calling convention was overloading its size parameter to mean "largest possible value means do not add entry", which was a bad taste. Brought up during a discussion with Peter Baudis. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-04 08:02:23 +02:00			`cmp = memcmp(e->sha1, sha1, 20);`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`if (!cmp)`
			`return e;`
			`if (cmp < 0) {`
			`last = next;`
			`continue;`
			`}`
			`first = next+1;`
			`}`
			`/* not found */`
[PATCH] diff.c: locate_size_cache() fix. This fixes two bugs. - declaration of auto variable "cmp" was preceeded by a statement, causing compilation error on real C compilers; noticed and patch given by Yoichi Yuasa. - the function's calling convention was overloading its size parameter to mean "largest possible value means do not add entry", which was a bad taste. Brought up during a discussion with Peter Baudis. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-04 08:02:23 +02:00			`if (find_only)`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`return NULL;`
			`/* insert to make it at "first" */`
			`if (sha1_size_cache_alloc <= sha1_size_cache_nr) {`
			`sha1_size_cache_alloc = alloc_nr(sha1_size_cache_alloc);`
			`sha1_size_cache = xrealloc(sha1_size_cache,`
			`sha1_size_cache_alloc *`
			`sizeof(*sha1_size_cache));`
			`}`
			`sha1_size_cache_nr++;`
			`if (first < sha1_size_cache_nr)`
			`memmove(sha1_size_cache + first + 1, sha1_size_cache + first,`
			`(sha1_size_cache_nr - first - 1) *`
			`sizeof(*sha1_size_cache));`
			`e = xmalloc(sizeof(struct sha1_size_cache));`
			`sha1_size_cache[first] = e;`
			`memcpy(e->sha1, sha1, 20);`
			`e->size = size;`
			`return e;`
			`}`

[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`/*`
			`* While doing rename detection and pickaxe operation, we may need to`
			`* grab the data for the blob (or file) for our own in-core comparison.`
			`* diff_filespec has data and size fields for this purpose.`
			`*/`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`int diff_populate_filespec(struct diff_filespec *s, int size_only)`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`{`
			`int err = 0;`
[PATCH] The diff-raw format updates. Update the diff-raw format as Linus and I discussed, except that it does not use sequence of underscore '_' letters to express nonexistence. All '0' mode is used for that purpose instead. The new diff-raw format can express rename/copy, and the earlier restriction that -M and -C _must_ be used with the patch format output is no longer necessary. The patch makes -M and -C flags independent of -p flag, so you need to say git-whatchanged -M -p to get the diff/patch format. Updated are both documentations and tests. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 04:42:18 +02:00			`if (!DIFF_FILE_VALID(s))`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`die("internal error: asking to populate invalid file.");`
			`if (S_ISDIR(s->mode))`
			`return -1;`

[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`if (!use_size_cache)`
			`size_only = 0;`

[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`if (s->data)`
			`return err;`
			`if (!s->sha1_valid \|\|`
			`work_tree_matches(s->path, s->sha1)) {`
			`struct stat st;`
			`int fd;`
			`if (lstat(s->path, &st) < 0) {`
			`if (errno == ENOENT) {`
			`err_empty:`
			`err = -1;`
			`empty:`
			`s->data = "";`
			`s->size = 0;`
			`return err;`
			`}`
			`}`
			`s->size = st.st_size;`
			`if (!s->size)`
			`goto empty;`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`if (size_only)`
			`return 0;`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`if (S_ISLNK(st.st_mode)) {`
			`int ret;`
			`s->data = xmalloc(s->size);`
			`s->should_free = 1;`
			`ret = readlink(s->path, s->data, s->size);`
			`if (ret < 0) {`
			`free(s->data);`
			`goto err_empty;`
			`}`
			`return 0;`
			`}`
			`fd = open(s->path, O_RDONLY);`
			`if (fd < 0)`
			`goto err_empty;`
			`s->data = mmap(NULL, s->size, PROT_READ, MAP_PRIVATE, fd, 0);`
			`close(fd);`
[PATCH] mmap error handling I have reviewed all occurrences of mmap() in git and fixed three types of errors/defects: 1) The result is not checked. 2) The file descriptor is closed if mmap() succeeds, but not when it fails. 3) Various casts applied to -1 are used instead of MAP_FAILED, which is specifically defined to check mmap() return value. [jc: This is a second round of Pavel's patch. He fixed up the problem that close() potentially clobbering the errno from mmap, which the first round had.] Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-29 16:49:14 +02:00			`if (s->data == MAP_FAILED)`
			`goto err_empty;`
			`s->should_munmap = 1;`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`}`
			`else {`
			`char type[20];`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`struct sha1_size_cache *e;`

			`if (size_only) {`
[PATCH] diff.c: locate_size_cache() fix. This fixes two bugs. - declaration of auto variable "cmp" was preceeded by a statement, causing compilation error on real C compilers; noticed and patch given by Yoichi Yuasa. - the function's calling convention was overloading its size parameter to mean "largest possible value means do not add entry", which was a bad taste. Brought up during a discussion with Peter Baudis. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-04 08:02:23 +02:00			`e = locate_size_cache(s->sha1, 1, 0);`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`if (e) {`
			`s->size = e->size;`
			`return 0;`
			`}`
[PATCH] Enhance sha1_file_size() into sha1_object_info() This lets us eliminate one use of map_sha1_file() outside sha1_file.c, to bring us one step closer to the packed GIT. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-27 12:34:06 +02:00			`if (!sha1_object_info(s->sha1, type, &s->size))`
[PATCH] diff.c: locate_size_cache() fix. This fixes two bugs. - declaration of auto variable "cmp" was preceeded by a statement, causing compilation error on real C compilers; noticed and patch given by Yoichi Yuasa. - the function's calling convention was overloading its size parameter to mean "largest possible value means do not add entry", which was a bad taste. Brought up during a discussion with Peter Baudis. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-04 08:02:23 +02:00			`locate_size_cache(s->sha1, 0, s->size);`
[PATCH] Find size of SHA1 object without inflating everything. This adds sha1_file_size() helper function and uses it in the rename/copy similarity estimator. The helper function handles deltified object as well. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 00:20:54 +02:00			`}`
			`else {`
			`s->data = read_sha1_file(s->sha1, type, &s->size);`
			`s->should_free = 1;`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`}`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`}`
			`return 0;`
			`}`

Revert "[PATCH] plug memory leak in diff.c::diff_free_filepair()" This reverts 068eac91ce04b9aca163acb1927c3878c45d1a07 commit. 2005-09-14 23:06:50 +02:00			`void diff_free_filespec_data(struct diff_filespec *s)`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`{`
			`if (s->should_free)`
			`free(s->data);`
			`else if (s->should_munmap)`
			`munmap(s->data, s->size);`
Revert "[PATCH] plug memory leak in diff.c::diff_free_filepair()" This reverts 068eac91ce04b9aca163acb1927c3878c45d1a07 commit. 2005-09-14 23:06:50 +02:00			`s->should_free = s->should_munmap = 0;`
			`s->data = NULL;`
diffcore-rename: somewhat optimized. This changes diffcore-rename to reuse statistics information gathered during similarity estimation, and updates the hashtable implementation used to keep track of the statistics to be denser. This seems to give better performance. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-12 12:22:10 +01:00			`free(s->cnt_data);`
			`s->cnt_data = NULL;`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`}`

Update diff engine for symlinks stored in the cache. This patch updates the external diff interface engine for the change to store the symbolic links in the cache, recently done by Kay Sievers. The main thing it does is when comparing with the work tree, it prepares the counterpart to the blob being compared by doing a readlink followed by sending that result to a temporary file to be diffed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-06 01:10:21 +02:00			`static void prep_temp_blob(struct diff_tempfile *temp,`
			`void *blob,`
			`unsigned long size,`
Consolidate null_sha1[]. Signed-off-by: Junio C Hamano <junio@twinsun.com> 2005-09-30 23:02:47 +02:00			`const unsigned char *sha1,`
Update diff engine for symlinks stored in the cache. This patch updates the external diff interface engine for the change to store the symbolic links in the cache, recently done by Kay Sievers. The main thing it does is when comparing with the work tree, it prepares the counterpart to the blob being compared by doing a readlink followed by sending that result to a temporary file to be diffed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-06 01:10:21 +02:00			`int mode)`
			`{`
			`int fd;`

[PATCH] git: use git_mkstemp() instead of mkstemp() for diff generation. This lets you run git diff in a repository otherwise read-only to you. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-04 22:49:49 +02:00			`fd = git_mkstemp(temp->tmp_path, TEMPFILE_PATH_LEN, ".diff_XXXXXX");`
Update diff engine for symlinks stored in the cache. This patch updates the external diff interface engine for the change to store the symbolic links in the cache, recently done by Kay Sievers. The main thing it does is when comparing with the work tree, it prepares the counterpart to the blob being compared by doing a readlink followed by sending that result to a temporary file to be diffed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-06 01:10:21 +02:00			`if (fd < 0)`
			`die("unable to create temp-file");`
			`if (write(fd, blob, size) != size)`
			`die("unable to write temp-file");`
			`close(fd);`
			`temp->name = temp->tmp_path;`
			`strcpy(temp->hex, sha1_to_hex(sha1));`
			`temp->hex[40] = 0;`
			`sprintf(temp->mode, "%06o", mode);`
			`}`

[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`static void prepare_temp_file(const char *name,`
			`struct diff_tempfile *temp,`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`struct diff_filespec *one)`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00			`{`
[PATCH] The diff-raw format updates. Update the diff-raw format as Linus and I discussed, except that it does not use sequence of underscore '_' letters to express nonexistence. All '0' mode is used for that purpose instead. The new diff-raw format can express rename/copy, and the earlier restriction that -M and -C _must_ be used with the patch format output is no longer necessary. The patch makes -M and -C flags independent of -p flag, so you need to say git-whatchanged -M -p to get the diff/patch format. Updated are both documentations and tests. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 04:42:18 +02:00			`if (!DIFF_FILE_VALID(one)) {`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`not_a_valid_file:`
[PATCH] diff.c: clean temporary files When diff-cache -p and friends are interrupted, they can leave their temporary files behind. Also when the external diff program is killed instead of exiting (this usually happens when piping the output to a pager, which can cause SIGPIPE when the user quits viewing the diff early), they incorrectly died without cleaning their temporary file. This fixes these problems. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-28 19:13:01 +02:00			`/* A '-' entry produces this for file-2, and`
			`* a '+' entry produces this for file-1.`
			`*/`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`temp->name = "/dev/null";`
			`strcpy(temp->hex, ".");`
			`strcpy(temp->mode, ".");`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00			`return;`
			`}`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`if (!one->sha1_valid \|\|`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`work_tree_matches(name, one->sha1)) {`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`struct stat st;`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`if (lstat(name, &st) < 0) {`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`if (errno == ENOENT)`
			`goto not_a_valid_file;`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`die("stat(%s): %s", name, strerror(errno));`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`}`
Update diff engine for symlinks stored in the cache. This patch updates the external diff interface engine for the change to store the symbolic links in the cache, recently done by Kay Sievers. The main thing it does is when comparing with the work tree, it prepares the counterpart to the blob being compared by doing a readlink followed by sending that result to a temporary file to be diffed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-06 01:10:21 +02:00			`if (S_ISLNK(st.st_mode)) {`
			`int ret;`
avoid asking ?alloc() for zero bytes. Avoid asking for zero bytes when that change simplifies overall logic. Later we would change the wrapper to ask for 1 byte on platforms that return NULL for zero byte request. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-26 21:34:56 +01:00			`char buf[PATH_MAX + 1]; /* ought to be SYMLINK_MAX */`
			`if (sizeof(buf) <= st.st_size)`
			`die("symlink too long: %s", name);`
Update diff engine for symlinks stored in the cache. This patch updates the external diff interface engine for the change to store the symbolic links in the cache, recently done by Kay Sievers. The main thing it does is when comparing with the work tree, it prepares the counterpart to the blob being compared by doing a readlink followed by sending that result to a temporary file to be diffed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-06 01:10:21 +02:00			`ret = readlink(name, buf, st.st_size);`
			`if (ret < 0)`
			`die("readlink(%s)", name);`
			`prep_temp_blob(temp, buf, st.st_size,`
			`(one->sha1_valid ?`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`one->sha1 : null_sha1),`
Update diff engine for symlinks stored in the cache. This patch updates the external diff interface engine for the change to store the symbolic links in the cache, recently done by Kay Sievers. The main thing it does is when comparing with the work tree, it prepares the counterpart to the blob being compared by doing a readlink followed by sending that result to a temporary file to be diffed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-06 01:10:21 +02:00			`(one->sha1_valid ?`
			`one->mode : S_IFLNK));`
			`}`
			`else {`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`/* we can borrow from the file in the work tree */`
			`temp->name = name;`
Update diff engine for symlinks stored in the cache. This patch updates the external diff interface engine for the change to store the symbolic links in the cache, recently done by Kay Sievers. The main thing it does is when comparing with the work tree, it prepares the counterpart to the blob being compared by doing a readlink followed by sending that result to a temporary file to be diffed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-06 01:10:21 +02:00			`if (!one->sha1_valid)`
			`strcpy(temp->hex, sha1_to_hex(null_sha1));`
			`else`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`strcpy(temp->hex, sha1_to_hex(one->sha1));`
[PATCH] diff: further cleanup. When preparing data to feed the external diff, we should give the mode we obtained from the caller, even when we are dealing with a file with 0{40} SHA1 (i.e. the caller said "look at the filesystem"), since the mode passed by the caller via diff_addremove() or diff_change() is always trustworthy. This is _not_ a bugfix --- the existing code stat() on the file ifself and does the same computation on st.st_mode to compute the mode the same way the caller did to give the original mode. We cannot remove the stat() call from here, but the extra computation to create the mode value is unnecessary. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-30 09:07:39 +02:00			`/* Even though we may sometimes borrow the`
			`* contents from the work tree, we always want`
			`* one->mode. mode is trustworthy even when`
			`* !(one->sha1_valid), as long as`
			`* DIFF_FILE_VALID(one).`
			`*/`
			`sprintf(temp->mode, "%06o", one->mode);`
Update diff engine for symlinks stored in the cache. This patch updates the external diff interface engine for the change to store the symbolic links in the cache, recently done by Kay Sievers. The main thing it does is when comparing with the work tree, it prepares the counterpart to the blob being compared by doing a readlink followed by sending that result to a temporary file to be diffed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-06 01:10:21 +02:00			`}`
			`return;`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`}`
			`else {`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`if (diff_populate_filespec(one, 0))`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`die("cannot read data blob for %s", one->path);`
			`prep_temp_blob(temp, one->data, one->size,`
			`one->sha1, one->mode);`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`}`
			`}`

			`static void remove_tempfile(void)`
			`{`
			`int i;`

			`for (i = 0; i < 2; i++)`
			`if (diff_temp[i].name == diff_temp[i].tmp_path) {`
			`unlink(diff_temp[i].name);`
			`diff_temp[i].name = NULL;`
			`}`
			`}`

[PATCH] diff.c: clean temporary files When diff-cache -p and friends are interrupted, they can leave their temporary files behind. Also when the external diff program is killed instead of exiting (this usually happens when piping the output to a pager, which can cause SIGPIPE when the user quits viewing the diff early), they incorrectly died without cleaning their temporary file. This fixes these problems. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-28 19:13:01 +02:00			`static void remove_tempfile_on_signal(int signo)`
			`{`
			`remove_tempfile();`
Allow diff and index commands to be interrupted So far, e.g. git-update-index --refresh was basically uninterruptable by ctrl-c, since it hooked the SIGINT handler, but that handler would only unlink the lockfile but not actually quit. This makes it propagate the signal to the default handler. Note that I expected it to work without resetting the signal handler to SIG_DFL, but without that it ended in an infinite loop of tgkill()s - is my glibc violating SUS or what? Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-01 00:40:33 +01:00			`signal(SIGINT, SIG_DFL);`
			`raise(signo);`
[PATCH] diff.c: clean temporary files When diff-cache -p and friends are interrupted, they can leave their temporary files behind. Also when the external diff program is killed instead of exiting (this usually happens when piping the output to a pager, which can cause SIGPIPE when the user quits viewing the diff early), they incorrectly died without cleaning their temporary file. This fixes these problems. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-28 19:13:01 +02:00			`}`

Make git diff-generation use a simpler spawn-like interface Instead of depending of fork() and execve() and doing things in between the two, make the git diff functions do everything up front, and then do a single "spawn_prog()" invocation to run the actual external diff program (if any is even needed). This actually ends up simplifying the code, and should make it much easier to make it efficient under broken operating systems (read: Windows). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-27 00:51:24 +01:00			`static int spawn_prog(const char pgm, const char *arg)`
			`{`
			`pid_t pid;`
			`int status;`

			`fflush(NULL);`
			`pid = fork();`
			`if (pid < 0)`
			`die("unable to fork");`
			`if (!pid) {`
			`execvp(pgm, (char const) arg);`
			`exit(255);`
			`}`

			`while (waitpid(pid, &status, 0) < 0) {`
			`if (errno == EINTR)`
			`continue;`
			`return -1;`
			`}`

			`/* Earlier we did not check the exit status because`
			`* diff exits non-zero if files are different, and`
			`* we are not interested in knowing that. It was a`
			`* mistake which made it harder to quit a diff-*`
			`* session that uses the git-apply-patch-script as`
			`* the GIT_EXTERNAL_DIFF. A custom GIT_EXTERNAL_DIFF`
			`* should also exit non-zero only when it wants to`
			`* abort the entire diff-* session.`
			`*/`
			`if (WIFEXITED(status) && !WEXITSTATUS(status))`
			`return 0;`
			`return -1;`
			`}`

[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`/* An external diff command takes:`
			`*`
			`* diff-cmd name infile1 infile1-sha1 infile1-mode \`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`* infile2 infile2-sha1 infile2-mode [ rename-to ]`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`*`
			`*/`
[PATCH] Diff updates to express type changes With the introduction of type 'T' in the diff-raw output, and the "apply-patch" program Linus has been quietly working on without much advertisement, it started to make sense to emit usable information in the "diff --git" patch output format as well. Earlier built-in diff driver punted and did not say anything about a symbolic link changing into a file or vice versa, but this version represents it as a pair of deletion and creation. It also fixes a minor problem dealing with old archive created with ancient git. The earlier code was reporting file mode change between 100664 and 100644 (we shouldn't). The linux-2.6 git tree has a good example that exposes this problem. A good test case is commit ce1dc02f76432a46db149241e015a4f782974623. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-26 11:24:30 +02:00			`static void run_external_diff(const char *pgm,`
			`const char *name,`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`const char *other,`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`struct diff_filespec *one,`
			`struct diff_filespec *two,`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`const char *xfrm_msg,`
			`int complete_rewrite)`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`{`
Make git diff-generation use a simpler spawn-like interface Instead of depending of fork() and execve() and doing things in between the two, make the git diff functions do everything up front, and then do a single "spawn_prog()" invocation to run the actual external diff program (if any is even needed). This actually ends up simplifying the code, and should make it much easier to make it efficient under broken operating systems (read: Windows). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-27 00:51:24 +01:00			`const char *spawn_arg[10];`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`struct diff_tempfile *temp = diff_temp;`
Make git diff-generation use a simpler spawn-like interface Instead of depending of fork() and execve() and doing things in between the two, make the git diff functions do everything up front, and then do a single "spawn_prog()" invocation to run the actual external diff program (if any is even needed). This actually ends up simplifying the code, and should make it much easier to make it efficient under broken operating systems (read: Windows). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-27 00:51:24 +01:00			`int retval;`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`static int atexit_asked = 0;`
Fix ?: statements. Omitting the first branch in ?: is a GNU extension. Cute, but not supported by other compilers. Replaced mostly by explicit tests. Calls to getenv() simply are repeated on non-GNU compilers. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> 2005-08-23 22:34:07 +02:00			`const char *othername;`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`const char **arg = &spawn_arg[0];`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00
Fix ?: statements. Omitting the first branch in ?: is a GNU extension. Cute, but not supported by other compilers. Replaced mostly by explicit tests. Calls to getenv() simply are repeated on non-GNU compilers. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> 2005-08-23 22:34:07 +02:00			`othername = (other? other : name);`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00			`if (one && two) {`
			`prepare_temp_file(name, &temp[0], one);`
Fix ?: statements. Omitting the first branch in ?: is a GNU extension. Cute, but not supported by other compilers. Replaced mostly by explicit tests. Calls to getenv() simply are repeated on non-GNU compilers. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> 2005-08-23 22:34:07 +02:00			`prepare_temp_file(othername, &temp[1], two);`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00			`if (! atexit_asked &&`
			`(temp[0].name == temp[0].tmp_path \|\|`
			`temp[1].name == temp[1].tmp_path)) {`
			`atexit_asked = 1;`
			`atexit(remove_tempfile);`
			`}`
[PATCH] diff.c: clean temporary files When diff-cache -p and friends are interrupted, they can leave their temporary files behind. Also when the external diff program is killed instead of exiting (this usually happens when piping the output to a pager, which can cause SIGPIPE when the user quits viewing the diff early), they incorrectly died without cleaning their temporary file. This fixes these problems. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-28 19:13:01 +02:00			`signal(SIGINT, remove_tempfile_on_signal);`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`}`

true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`if (one && two) {`
			`*arg++ = pgm;`
			`*arg++ = name;`
			`*arg++ = temp[0].name;`
			`*arg++ = temp[0].hex;`
			`*arg++ = temp[0].mode;`
			`*arg++ = temp[1].name;`
			`*arg++ = temp[1].hex;`
			`*arg++ = temp[1].mode;`
			`if (other) {`
			`*arg++ = other;`
			`*arg++ = xfrm_msg;`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00			`}`
Make git diff-generation use a simpler spawn-like interface Instead of depending of fork() and execve() and doing things in between the two, make the git diff functions do everything up front, and then do a single "spawn_prog()" invocation to run the actual external diff program (if any is even needed). This actually ends up simplifying the code, and should make it much easier to make it efficient under broken operating systems (read: Windows). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-27 00:51:24 +01:00			`} else {`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`*arg++ = pgm;`
			`*arg++ = name;`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`}`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`*arg = NULL;`
			`retval = spawn_prog(pgm, spawn_arg);`
Make git diff-generation use a simpler spawn-like interface Instead of depending of fork() and execve() and doing things in between the two, make the git diff functions do everything up front, and then do a single "spawn_prog()" invocation to run the actual external diff program (if any is even needed). This actually ends up simplifying the code, and should make it much easier to make it efficient under broken operating systems (read: Windows). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-27 00:51:24 +01:00			`remove_tempfile();`
			`if (retval) {`
Terminate diff-* on non-zero exit from GIT_EXTERNAL_DIFF (slightly updated from the version posted to the GIT mailing list with small bugfixes). This patch changes the git-apply-patch-script to exit non-zero when the patch cannot be applied. Previously, the external diff driver deliberately ignored the exit status of GIT_EXTERNAL_DIFF command, which was a design mistake. It now stops the processing when GIT_EXTERNAL_DIFF exits non-zero, so the damages from running git-diff-* with git-apply-patch-script between two wrong trees can be contained. The "diff" command line generated by the built-in driver is changed to always exit 0 in order to match this new behaviour. I know Pasky does not use GIT_EXTERNAL_DIFF yet, so this change should not break Cogito, either. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:38:06 +02:00			`fprintf(stderr, "external diff died, stopping at %s.\n", name);`
			`exit(1);`
[PATCH] diff.c: clean temporary files When diff-cache -p and friends are interrupted, they can leave their temporary files behind. Also when the external diff program is killed instead of exiting (this usually happens when piping the output to a pager, which can cause SIGPIPE when the user quits viewing the diff early), they incorrectly died without cleaning their temporary file. This fixes these problems. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-28 19:13:01 +02:00			`}`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`}`

true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`static void run_diff_cmd(const char *pgm,`
			`const char *name,`
			`const char *other,`
			`struct diff_filespec *one,`
			`struct diff_filespec *two,`
			`const char *xfrm_msg,`
			`int complete_rewrite)`
			`{`
			`if (pgm) {`
			`run_external_diff(pgm, name, other, one, two, xfrm_msg,`
			`complete_rewrite);`
			`return;`
			`}`
			`if (one && two)`
			`builtin_diff(name, other ? other : name,`
			`one, two, xfrm_msg, complete_rewrite);`
			`else`
			`printf("* Unmerged path %s\n", name);`
			`}`

Show original and resulting blob object info in diff output. This adds more cruft to diff --git header to record the blob SHA1 and the mode the patch/diff is intended to be applied against, to help the receiving end fall back on a three-way merge. The new header looks like this: diff --git a/apply.c b/apply.c index 7be5041..8366082 100644 --- a/apply.c +++ b/apply.c @@ -14,6 +14,7 @@ // files that are being modified, but doesn't apply the patch // --stat does just a diffstat, and doesn't actually apply +// --show-index-info shows the old and new index info for... ... Upon receiving such a patch, if the patch did not apply cleanly to the target tree, the recipient can try to find the matching old objects in her object database and create a temporary tree, apply the patch to that temporary tree, and attempt a 3-way merge between the patched temporary tree and the target tree using the original temporary tree as the common ancestor. The patch lifts the code to compute the hash for an on-filesystem object from update-index.c and makes it available to the diff output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-07 12:42:00 +02:00			`static void diff_fill_sha1_info(struct diff_filespec *one)`
			`{`
			`if (DIFF_FILE_VALID(one)) {`
			`if (!one->sha1_valid) {`
			`struct stat st;`
Handle symlinks graciously This patch converts a stat() to an lstat() call, thereby fixing the case when the date of a symlink was not the same as the one recorded in the index. The included test case demonstrates this. This is for the case that the symlink points to a non-existing file. If the file exists, worse things than just an error message happen. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-26 22:31:42 +01:00			`if (lstat(one->path, &st) < 0)`
Show original and resulting blob object info in diff output. This adds more cruft to diff --git header to record the blob SHA1 and the mode the patch/diff is intended to be applied against, to help the receiving end fall back on a three-way merge. The new header looks like this: diff --git a/apply.c b/apply.c index 7be5041..8366082 100644 --- a/apply.c +++ b/apply.c @@ -14,6 +14,7 @@ // files that are being modified, but doesn't apply the patch // --stat does just a diffstat, and doesn't actually apply +// --show-index-info shows the old and new index info for... ... Upon receiving such a patch, if the patch did not apply cleanly to the target tree, the recipient can try to find the matching old objects in her object database and create a temporary tree, apply the patch to that temporary tree, and attempt a 3-way merge between the patched temporary tree and the target tree using the original temporary tree as the common ancestor. The patch lifts the code to compute the hash for an on-filesystem object from update-index.c and makes it available to the diff output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-07 12:42:00 +02:00			`die("stat %s", one->path);`
			`if (index_path(one->sha1, one->path, &st, 0))`
			`die("cannot hash %s\n", one->path);`
			`}`
			`}`
			`else`
			`memset(one->sha1, 0, 20);`
			`}`

diff: --full-index A new option, --full-index, is introduced to diff family. This causes the full object name of pre- and post-images to appear on the index line of patch formatted output, to be used in conjunction with --allow-binary-replacement option of git-apply. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-15 02:53:22 +01:00			`static void run_diff(struct diff_filepair p, struct diff_options o)`
[PATCH] Diff updates to express type changes With the introduction of type 'T' in the diff-raw output, and the "apply-patch" program Linus has been quietly working on without much advertisement, it started to make sense to emit usable information in the "diff --git" patch output format as well. Earlier built-in diff driver punted and did not say anything about a symbolic link changing into a file or vice versa, but this version represents it as a pair of deletion and creation. It also fixes a minor problem dealing with old archive created with ancient git. The earlier code was reporting file mode change between 100664 and 100644 (we shouldn't). The linux-2.6 git tree has a good example that exposes this problem. A good test case is commit ce1dc02f76432a46db149241e015a4f782974623. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-26 11:24:30 +02:00			`{`
			`const char *pgm = external_diff();`
Show original and resulting blob object info in diff output. This adds more cruft to diff --git header to record the blob SHA1 and the mode the patch/diff is intended to be applied against, to help the receiving end fall back on a three-way merge. The new header looks like this: diff --git a/apply.c b/apply.c index 7be5041..8366082 100644 --- a/apply.c +++ b/apply.c @@ -14,6 +14,7 @@ // files that are being modified, but doesn't apply the patch // --stat does just a diffstat, and doesn't actually apply +// --show-index-info shows the old and new index info for... ... Upon receiving such a patch, if the patch did not apply cleanly to the target tree, the recipient can try to find the matching old objects in her object database and create a temporary tree, apply the patch to that temporary tree, and attempt a 3-way merge between the patched temporary tree and the target tree using the original temporary tree as the common ancestor. The patch lifts the code to compute the hash for an on-filesystem object from update-index.c and makes it available to the diff output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-07 12:42:00 +02:00			`char msg[PATH_MAX2+300], xfrm_msg;`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`struct diff_filespec *one;`
			`struct diff_filespec *two;`
			`const char *name;`
			`const char *other;`
Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00			`char name_munged, other_munged;`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`int complete_rewrite = 0;`
Show original and resulting blob object info in diff output. This adds more cruft to diff --git header to record the blob SHA1 and the mode the patch/diff is intended to be applied against, to help the receiving end fall back on a three-way merge. The new header looks like this: diff --git a/apply.c b/apply.c index 7be5041..8366082 100644 --- a/apply.c +++ b/apply.c @@ -14,6 +14,7 @@ // files that are being modified, but doesn't apply the patch // --stat does just a diffstat, and doesn't actually apply +// --show-index-info shows the old and new index info for... ... Upon receiving such a patch, if the patch did not apply cleanly to the target tree, the recipient can try to find the matching old objects in her object database and create a temporary tree, apply the patch to that temporary tree, and attempt a 3-way merge between the patched temporary tree and the target tree using the original temporary tree as the common ancestor. The patch lifts the code to compute the hash for an on-filesystem object from update-index.c and makes it available to the diff output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-07 12:42:00 +02:00			`int len;`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00
			`if (DIFF_PAIR_UNMERGED(p)) {`
			`/* unmerged */`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`run_diff_cmd(pgm, p->one->path, NULL, NULL, NULL, NULL, 0);`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`return;`
			`}`

			`name = p->one->path;`
			`other = (strcmp(name, p->two->path) ? p->two->path : NULL);`
Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00			`name_munged = quote_one(name);`
			`other_munged = quote_one(other);`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`one = p->one; two = p->two;`
Show original and resulting blob object info in diff output. This adds more cruft to diff --git header to record the blob SHA1 and the mode the patch/diff is intended to be applied against, to help the receiving end fall back on a three-way merge. The new header looks like this: diff --git a/apply.c b/apply.c index 7be5041..8366082 100644 --- a/apply.c +++ b/apply.c @@ -14,6 +14,7 @@ // files that are being modified, but doesn't apply the patch // --stat does just a diffstat, and doesn't actually apply +// --show-index-info shows the old and new index info for... ... Upon receiving such a patch, if the patch did not apply cleanly to the target tree, the recipient can try to find the matching old objects in her object database and create a temporary tree, apply the patch to that temporary tree, and attempt a 3-way merge between the patched temporary tree and the target tree using the original temporary tree as the common ancestor. The patch lifts the code to compute the hash for an on-filesystem object from update-index.c and makes it available to the diff output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-07 12:42:00 +02:00
			`diff_fill_sha1_info(one);`
			`diff_fill_sha1_info(two);`

			`len = 0;`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`switch (p->status) {`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`case DIFF_STATUS_COPIED:`
Show original and resulting blob object info in diff output. This adds more cruft to diff --git header to record the blob SHA1 and the mode the patch/diff is intended to be applied against, to help the receiving end fall back on a three-way merge. The new header looks like this: diff --git a/apply.c b/apply.c index 7be5041..8366082 100644 --- a/apply.c +++ b/apply.c @@ -14,6 +14,7 @@ // files that are being modified, but doesn't apply the patch // --stat does just a diffstat, and doesn't actually apply +// --show-index-info shows the old and new index info for... ... Upon receiving such a patch, if the patch did not apply cleanly to the target tree, the recipient can try to find the matching old objects in her object database and create a temporary tree, apply the patch to that temporary tree, and attempt a 3-way merge between the patched temporary tree and the target tree using the original temporary tree as the common ancestor. The patch lifts the code to compute the hash for an on-filesystem object from update-index.c and makes it available to the diff output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-07 12:42:00 +02:00			`len += snprintf(msg + len, sizeof(msg) - len,`
			`"similarity index %d%%\n"`
			`"copy from %s\n"`
			`"copy to %s\n",`
			`(int)(0.5 + p->score * 100.0/MAX_SCORE),`
Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00			`name_munged, other_munged);`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`break;`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`case DIFF_STATUS_RENAMED:`
Show original and resulting blob object info in diff output. This adds more cruft to diff --git header to record the blob SHA1 and the mode the patch/diff is intended to be applied against, to help the receiving end fall back on a three-way merge. The new header looks like this: diff --git a/apply.c b/apply.c index 7be5041..8366082 100644 --- a/apply.c +++ b/apply.c @@ -14,6 +14,7 @@ // files that are being modified, but doesn't apply the patch // --stat does just a diffstat, and doesn't actually apply +// --show-index-info shows the old and new index info for... ... Upon receiving such a patch, if the patch did not apply cleanly to the target tree, the recipient can try to find the matching old objects in her object database and create a temporary tree, apply the patch to that temporary tree, and attempt a 3-way merge between the patched temporary tree and the target tree using the original temporary tree as the common ancestor. The patch lifts the code to compute the hash for an on-filesystem object from update-index.c and makes it available to the diff output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-07 12:42:00 +02:00			`len += snprintf(msg + len, sizeof(msg) - len,`
			`"similarity index %d%%\n"`
			`"rename from %s\n"`
			`"rename to %s\n",`
			`(int)(0.5 + p->score * 100.0/MAX_SCORE),`
Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00			`name_munged, other_munged);`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`break;`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`case DIFF_STATUS_MODIFIED:`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`if (p->score) {`
Show original and resulting blob object info in diff output. This adds more cruft to diff --git header to record the blob SHA1 and the mode the patch/diff is intended to be applied against, to help the receiving end fall back on a three-way merge. The new header looks like this: diff --git a/apply.c b/apply.c index 7be5041..8366082 100644 --- a/apply.c +++ b/apply.c @@ -14,6 +14,7 @@ // files that are being modified, but doesn't apply the patch // --stat does just a diffstat, and doesn't actually apply +// --show-index-info shows the old and new index info for... ... Upon receiving such a patch, if the patch did not apply cleanly to the target tree, the recipient can try to find the matching old objects in her object database and create a temporary tree, apply the patch to that temporary tree, and attempt a 3-way merge between the patched temporary tree and the target tree using the original temporary tree as the common ancestor. The patch lifts the code to compute the hash for an on-filesystem object from update-index.c and makes it available to the diff output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-07 12:42:00 +02:00			`len += snprintf(msg + len, sizeof(msg) - len,`
			`"dissimilarity index %d%%\n",`
			`(int)(0.5 + p->score *`
			`100.0/MAX_SCORE));`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`complete_rewrite = 1;`
			`break;`
			`}`
			`/* fallthru */`
			`default:`
Show original and resulting blob object info in diff output. This adds more cruft to diff --git header to record the blob SHA1 and the mode the patch/diff is intended to be applied against, to help the receiving end fall back on a three-way merge. The new header looks like this: diff --git a/apply.c b/apply.c index 7be5041..8366082 100644 --- a/apply.c +++ b/apply.c @@ -14,6 +14,7 @@ // files that are being modified, but doesn't apply the patch // --stat does just a diffstat, and doesn't actually apply +// --show-index-info shows the old and new index info for... ... Upon receiving such a patch, if the patch did not apply cleanly to the target tree, the recipient can try to find the matching old objects in her object database and create a temporary tree, apply the patch to that temporary tree, and attempt a 3-way merge between the patched temporary tree and the target tree using the original temporary tree as the common ancestor. The patch lifts the code to compute the hash for an on-filesystem object from update-index.c and makes it available to the diff output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-07 12:42:00 +02:00			`/* nothing */`
			`;`
[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`}`

Show original and resulting blob object info in diff output. This adds more cruft to diff --git header to record the blob SHA1 and the mode the patch/diff is intended to be applied against, to help the receiving end fall back on a three-way merge. The new header looks like this: diff --git a/apply.c b/apply.c index 7be5041..8366082 100644 --- a/apply.c +++ b/apply.c @@ -14,6 +14,7 @@ // files that are being modified, but doesn't apply the patch // --stat does just a diffstat, and doesn't actually apply +// --show-index-info shows the old and new index info for... ... Upon receiving such a patch, if the patch did not apply cleanly to the target tree, the recipient can try to find the matching old objects in her object database and create a temporary tree, apply the patch to that temporary tree, and attempt a 3-way merge between the patched temporary tree and the target tree using the original temporary tree as the common ancestor. The patch lifts the code to compute the hash for an on-filesystem object from update-index.c and makes it available to the diff output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-07 12:42:00 +02:00			`if (memcmp(one->sha1, two->sha1, 20)) {`
			`char one_sha1[41];`
abbrev cleanup: use symbolic constants The minimum length of abbreviated object name was hardcoded in different places to be 4, risking inconsistencies in the future. Also there were three different "default abbreviation precision". Use two C preprocessor symbols to clean up this mess. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-25 10:03:18 +01:00			`int abbrev = o->full_index ? 40 : DEFAULT_ABBREV;`
Show original and resulting blob object info in diff output. This adds more cruft to diff --git header to record the blob SHA1 and the mode the patch/diff is intended to be applied against, to help the receiving end fall back on a three-way merge. The new header looks like this: diff --git a/apply.c b/apply.c index 7be5041..8366082 100644 --- a/apply.c +++ b/apply.c @@ -14,6 +14,7 @@ // files that are being modified, but doesn't apply the patch // --stat does just a diffstat, and doesn't actually apply +// --show-index-info shows the old and new index info for... ... Upon receiving such a patch, if the patch did not apply cleanly to the target tree, the recipient can try to find the matching old objects in her object database and create a temporary tree, apply the patch to that temporary tree, and attempt a 3-way merge between the patched temporary tree and the target tree using the original temporary tree as the common ancestor. The patch lifts the code to compute the hash for an on-filesystem object from update-index.c and makes it available to the diff output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-07 12:42:00 +02:00			`memcpy(one_sha1, sha1_to_hex(one->sha1), 41);`

			`len += snprintf(msg + len, sizeof(msg) - len,`
diff: --abbrev option When I show transcripts to explain how something works, I often find myself hand-editing the diff-raw output to shorten various object names in the output. This adds --abbrev option to the diff family, which shortens diff-raw output and diff-tree commit id headers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-14 02:21:41 +01:00			`"index %.s..%.s",`
			`abbrev, one_sha1, abbrev,`
			`sha1_to_hex(two->sha1));`
Show original and resulting blob object info in diff output. This adds more cruft to diff --git header to record the blob SHA1 and the mode the patch/diff is intended to be applied against, to help the receiving end fall back on a three-way merge. The new header looks like this: diff --git a/apply.c b/apply.c index 7be5041..8366082 100644 --- a/apply.c +++ b/apply.c @@ -14,6 +14,7 @@ // files that are being modified, but doesn't apply the patch // --stat does just a diffstat, and doesn't actually apply +// --show-index-info shows the old and new index info for... ... Upon receiving such a patch, if the patch did not apply cleanly to the target tree, the recipient can try to find the matching old objects in her object database and create a temporary tree, apply the patch to that temporary tree, and attempt a 3-way merge between the patched temporary tree and the target tree using the original temporary tree as the common ancestor. The patch lifts the code to compute the hash for an on-filesystem object from update-index.c and makes it available to the diff output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-07 12:42:00 +02:00			`if (one->mode == two->mode)`
			`len += snprintf(msg + len, sizeof(msg) - len,`
			`" %06o", one->mode);`
			`len += snprintf(msg + len, sizeof(msg) - len, "\n");`
			`}`

			`if (len)`
			`msg[--len] = 0;`
			`xfrm_msg = len ? msg : NULL;`

[PATCH] Diff updates to express type changes With the introduction of type 'T' in the diff-raw output, and the "apply-patch" program Linus has been quietly working on without much advertisement, it started to make sense to emit usable information in the "diff --git" patch output format as well. Earlier built-in diff driver punted and did not say anything about a symbolic link changing into a file or vice versa, but this version represents it as a pair of deletion and creation. It also fixes a minor problem dealing with old archive created with ancient git. The earlier code was reporting file mode change between 100664 and 100644 (we shouldn't). The linux-2.6 git tree has a good example that exposes this problem. A good test case is commit ce1dc02f76432a46db149241e015a4f782974623. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-26 11:24:30 +02:00			`if (!pgm &&`
			`DIFF_FILE_VALID(one) && DIFF_FILE_VALID(two) &&`
			`(S_IFMT & one->mode) != (S_IFMT & two->mode)) {`
			`/* a filepair that changes between file and symlink`
			`* needs to be split into deletion and creation.`
			`*/`
			`struct diff_filespec *null = alloc_filespec(two->path);`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`run_diff_cmd(NULL, name, other, one, null, xfrm_msg, 0);`
[PATCH] Diff updates to express type changes With the introduction of type 'T' in the diff-raw output, and the "apply-patch" program Linus has been quietly working on without much advertisement, it started to make sense to emit usable information in the "diff --git" patch output format as well. Earlier built-in diff driver punted and did not say anything about a symbolic link changing into a file or vice versa, but this version represents it as a pair of deletion and creation. It also fixes a minor problem dealing with old archive created with ancient git. The earlier code was reporting file mode change between 100664 and 100644 (we shouldn't). The linux-2.6 git tree has a good example that exposes this problem. A good test case is commit ce1dc02f76432a46db149241e015a4f782974623. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-26 11:24:30 +02:00			`free(null);`
			`null = alloc_filespec(one->path);`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`run_diff_cmd(NULL, name, other, null, two, xfrm_msg, 0);`
[PATCH] Diff updates to express type changes With the introduction of type 'T' in the diff-raw output, and the "apply-patch" program Linus has been quietly working on without much advertisement, it started to make sense to emit usable information in the "diff --git" patch output format as well. Earlier built-in diff driver punted and did not say anything about a symbolic link changing into a file or vice versa, but this version represents it as a pair of deletion and creation. It also fixes a minor problem dealing with old archive created with ancient git. The earlier code was reporting file mode change between 100664 and 100644 (we shouldn't). The linux-2.6 git tree has a good example that exposes this problem. A good test case is commit ce1dc02f76432a46db149241e015a4f782974623. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-26 11:24:30 +02:00			`free(null);`
			`}`
			`else`
true built-in diff: run everything in-core. This stops using temporary files when we are using the built-in diff (including the complete rewrite). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-26 09:12:17 +02:00			`run_diff_cmd(pgm, name, other, one, two, xfrm_msg,`
			`complete_rewrite);`
Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00
			`free(name_munged);`
			`free(other_munged);`
[PATCH] Diff updates to express type changes With the introduction of type 'T' in the diff-raw output, and the "apply-patch" program Linus has been quietly working on without much advertisement, it started to make sense to emit usable information in the "diff --git" patch output format as well. Earlier built-in diff driver punted and did not say anything about a symbolic link changing into a file or vice versa, but this version represents it as a pair of deletion and creation. It also fixes a minor problem dealing with old archive created with ancient git. The earlier code was reporting file mode change between 100664 and 100644 (we shouldn't). The linux-2.6 git tree has a good example that exposes this problem. A good test case is commit ce1dc02f76432a46db149241e015a4f782974623. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-26 11:24:30 +02:00			`}`

diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`static void run_diffstat(struct diff_filepair p, struct diff_options o,`
			`struct diffstat_t *diffstat)`
			`{`
			`const char *name;`
			`const char *other;`

			`if (DIFF_PAIR_UNMERGED(p)) {`
			`/* unmerged */`
			`builtin_diffstat(p->one->path, NULL, NULL, NULL, diffstat);`
			`return;`
			`}`

			`name = p->one->path;`
			`other = (strcmp(name, p->two->path) ? p->two->path : NULL);`

			`diff_fill_sha1_info(p->one);`
			`diff_fill_sha1_info(p->two);`

			`builtin_diffstat(name, other, p->one, p->two, diffstat);`
			`}`

Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`void diff_setup(struct diff_options *options)`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`{`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`memset(options, 0, sizeof(*options));`
			`options->output_format = DIFF_FORMAT_RAW;`
			`options->line_termination = '\n';`
			`options->break_opt = -1;`
Diff: -l<num> to limit rename/copy detection. When many paths are modified, rename detection takes a lot of time. The new option -l<num> can be used to disable rename detection when more than <num> paths are possibly created as renames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:18:27 +02:00			`options->rename_limit = -1;`
Split up tree diff functions into tree-diff.c library This makes the tree diff functionality independent of the "git-diff-tree" program, by splitting the core functionality up into a library file. This will be needed for when we teach git-rev-list to only follow a specified set of pathnames, rather than the global revision history. Most of it is a fairly straightforward code move, but it also involves some calling convention cleanup, and moving some of the static variables from diff-tree.c into the options structure. The actual tree change callback routines also become paramterized by the diff_options structure, allowing the library functionality to do something else than just show the diff on stdout. Right now the only user of this functionality remains git-diff-tree itself. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-21 06:05:05 +02:00
			`options->change = diff_change;`
			`options->add_remove = diff_addremove;`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`}`

			`int diff_setup_done(struct diff_options *options)`
			`{`
diff: make default rename detection limit configurable. A while ago, a rename-detection limit logic was implemented as a response to this thread: http://marc.theaimsgroup.com/?l=git&m=112413080630175 where gitweb was found to be using a lot of time and memory to detect renames on huge commits. git-diff family takes -l<num> flag, and if the number of paths that are rename destination candidates (i.e. new paths with -M, or modified paths with -C) are larger than that number, skips rename/copy detection even when -M or -C is specified on the command line. This commit makes the rename detection limit easier to use. You can have: [diff] renamelimit = 30 in your .git/config file to specify the default rename detection limit. You can override this from the command line; giving 0 means 'unlimited': git diff -M -l0 We might want to change the default behaviour, when you do not have the configuration, to limit it to say 20 paths or so. This would also help the diffstat generation after a big 'git pull'. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-15 21:48:08 +01:00			`if ((options->find_copies_harder &&`
			`options->detect_rename != DIFF_DETECT_COPY) \|\|`
			`(0 <= options->rename_limit && !options->detect_rename))`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`return -1;`
diff --stat: make sure to set recursive. Just like "patch" format always needs recursive, "diffstat" format does not make sense without setting recursive. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-18 20:29:33 +02:00
			`/*`
			`* These cases always need recursive; we do not drop caller-supplied`
			`* recursive bits for other formats here.`
			`*/`
			`if ((options->output_format == DIFF_FORMAT_PATCH) \|\|`
			`(options->output_format == DIFF_FORMAT_DIFFSTAT) \|\|`
			`(options->with_stat))`
			`options->recursive = 1;`

diff: make default rename detection limit configurable. A while ago, a rename-detection limit logic was implemented as a response to this thread: http://marc.theaimsgroup.com/?l=git&m=112413080630175 where gitweb was found to be using a lot of time and memory to detect renames on huge commits. git-diff family takes -l<num> flag, and if the number of paths that are rename destination candidates (i.e. new paths with -M, or modified paths with -C) are larger than that number, skips rename/copy detection even when -M or -C is specified on the command line. This commit makes the rename detection limit easier to use. You can have: [diff] renamelimit = 30 in your .git/config file to specify the default rename detection limit. You can override this from the command line; giving 0 means 'unlimited': git diff -M -l0 We might want to change the default behaviour, when you do not have the configuration, to limit it to say 20 paths or so. This would also help the diffstat generation after a big 'git pull'. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-15 21:48:08 +01:00			`if (options->detect_rename && options->rename_limit < 0)`
			`options->rename_limit = diff_rename_limit_default;`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`if (options->setup & DIFF_SETUP_USE_CACHE) {`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`if (!active_cache)`
			`/* read-cache does not die even when it fails`
			`* so it is safe for us to do this here. Also`
			`* it does not smudge active_cache or active_nr`
			`* when it fails, so we do not have to worry about`
code comments: spell Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-29 10:30:08 +01:00			`* cleaning it up ourselves either.`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`*/`
			`read_cache();`
			`}`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`if (options->setup & DIFF_SETUP_USE_SIZE_CACHE)`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`use_size_cache = 1;`
diff: --abbrev option When I show transcripts to explain how something works, I often find myself hand-editing the diff-raw output to shorten various object names in the output. This adds --abbrev option to the diff family, which shortens diff-raw output and diff-tree commit id headers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-14 02:21:41 +01:00			`if (options->abbrev <= 0 \|\| 40 < options->abbrev)`
			`options->abbrev = 40; /* full */`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00
			`return 0;`
			`}`

			`int diff_opt_parse(struct diff_options options, const char *av, int ac)`
			`{`
			`const char *arg = av[0];`
			`if (!strcmp(arg, "-p") \|\| !strcmp(arg, "-u"))`
			`options->output_format = DIFF_FORMAT_PATCH;`
diff-* --patch-with-raw This new flag outputs the diff-raw output and diff-patch output at the same time. Requested by Cogito. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 02:36:53 +02:00			`else if (!strcmp(arg, "--patch-with-raw")) {`
			`options->output_format = DIFF_FORMAT_PATCH;`
			`options->with_raw = 1;`
			`}`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`else if (!strcmp(arg, "--stat"))`
			`options->output_format = DIFF_FORMAT_DIFFSTAT;`
diff-options: add --patch-with-stat With this option, git prepends a diffstat in front of the patch. Since I really, really do not know what a diffstat of a combined diff ("merge diff") should look like, the diffstat is not generated for these. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-15 13:41:18 +02:00			`else if (!strcmp(arg, "--patch-with-stat")) {`
			`options->output_format = DIFF_FORMAT_PATCH;`
			`options->with_stat = 1;`
			`}`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`else if (!strcmp(arg, "-z"))`
			`options->line_termination = 0;`
Diff: -l<num> to limit rename/copy detection. When many paths are modified, rename detection takes a lot of time. The new option -l<num> can be used to disable rename detection when more than <num> paths are possibly created as renames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:18:27 +02:00			`else if (!strncmp(arg, "-l", 2))`
			`options->rename_limit = strtoul(arg+2, NULL, 10);`
diff: --full-index A new option, --full-index, is introduced to diff family. This causes the full object name of pre- and post-images to appear on the index line of patch formatted output, to be used in conjunction with --allow-binary-replacement option of git-apply. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-15 02:53:22 +01:00			`else if (!strcmp(arg, "--full-index"))`
			`options->full_index = 1;`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`else if (!strcmp(arg, "--name-only"))`
			`options->output_format = DIFF_FORMAT_NAME;`
Diff: --name-status output format. The new output format shows only the status letter and paths. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:20:06 +02:00			`else if (!strcmp(arg, "--name-status"))`
			`options->output_format = DIFF_FORMAT_NAME_STATUS;`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`else if (!strcmp(arg, "-R"))`
			`options->reverse_diff = 1;`
			`else if (!strncmp(arg, "-S", 2))`
			`options->pickaxe = arg + 2;`
			`else if (!strcmp(arg, "-s"))`
			`options->output_format = DIFF_FORMAT_NO_OUTPUT;`
			`else if (!strncmp(arg, "-O", 2))`
			`options->orderfile = arg + 2;`
			`else if (!strncmp(arg, "--diff-filter=", 14))`
			`options->filter = arg + 14;`
			`else if (!strcmp(arg, "--pickaxe-all"))`
			`options->pickaxe_opts = DIFF_PICKAXE_ALL;`
Support for pickaxe matching regular expressions git-diff-* --pickaxe-regex will change the -S pickaxe to match POSIX extended regular expressions instead of fixed strings. The regex.h library is a rather stupid interface and I like pcre too, but with any luck it will be everywhere we will want to run Git on, it being POSIX.2 and all. I'm not sure if we can expect platforms like AIX to conform to POSIX.2 or if win32 has regex.h. We might add a flag to Makefile if there is a portability trouble potential. Signed-off-by: Petr Baudis <pasky@suse.cz> 2006-03-29 02:16:33 +02:00			`else if (!strcmp(arg, "--pickaxe-regex"))`
			`options->pickaxe_opts = DIFF_PICKAXE_REGEX;`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`else if (!strncmp(arg, "-B", 2)) {`
			`if ((options->break_opt =`
			`diff_scoreopt_parse(arg)) == -1)`
			`return -1;`
			`}`
			`else if (!strncmp(arg, "-M", 2)) {`
			`if ((options->rename_score =`
			`diff_scoreopt_parse(arg)) == -1)`
			`return -1;`
			`options->detect_rename = DIFF_DETECT_RENAME;`
			`}`
			`else if (!strncmp(arg, "-C", 2)) {`
			`if ((options->rename_score =`
			`diff_scoreopt_parse(arg)) == -1)`
			`return -1;`
			`options->detect_rename = DIFF_DETECT_COPY;`
			`}`
			`else if (!strcmp(arg, "--find-copies-harder"))`
			`options->find_copies_harder = 1;`
diff: --abbrev option When I show transcripts to explain how something works, I often find myself hand-editing the diff-raw output to shorten various object names in the output. This adds --abbrev option to the diff family, which shortens diff-raw output and diff-tree commit id headers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-14 02:21:41 +01:00			`else if (!strcmp(arg, "--abbrev"))`
abbrev cleanup: use symbolic constants The minimum length of abbreviated object name was hardcoded in different places to be 4, risking inconsistencies in the future. Also there were three different "default abbreviation precision". Use two C preprocessor symbols to clean up this mess. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-25 10:03:18 +01:00			`options->abbrev = DEFAULT_ABBREV;`
diff --abbrev=<n> option fix. Earier specifying an abbreviation shorter than minimum fell back to full 40 letters, which was nonsense. Make it to fall back to the minimum number (currently 4). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-27 11:19:51 +01:00			`else if (!strncmp(arg, "--abbrev=", 9)) {`
diff: --abbrev option When I show transcripts to explain how something works, I often find myself hand-editing the diff-raw output to shorten various object names in the output. This adds --abbrev option to the diff family, which shortens diff-raw output and diff-tree commit id headers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-14 02:21:41 +01:00			`options->abbrev = strtoul(arg + 9, NULL, 10);`
diff --abbrev=<n> option fix. Earier specifying an abbreviation shorter than minimum fell back to full 40 letters, which was nonsense. Make it to fall back to the minimum number (currently 4). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-27 11:19:51 +01:00			`if (options->abbrev < MINIMUM_ABBREV)`
			`options->abbrev = MINIMUM_ABBREV;`
			`else if (40 < options->abbrev)`
			`options->abbrev = 40;`
			`}`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`else`
			`return 0;`
			`return 1;`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`}`

[PATCH] diff: Clean up diff_scoreopt_parse(). This cleans up diff_scoreopt_parse() function that is used to parse the fractional notation -B, -C and -M option takes. The callers are modified to check for errors and complain. Earlier they silently ignored malformed input and falled back on the default. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:37:54 +02:00			`static int parse_num(const char **cp_p)`
			`{`
rename/copy score parsing updates. Better variant, which handles stuff like "4.5%" and rejects "192.168.0.1". Additionally, make sure numbers are unsigned (I'm making them unsigned long just for the hell of it), to make sure that artificial wraparound scenarios don't cause harm. -hpa [jc: with this, -M100 changes its meaning back to 10%. People wanting to say "pure renames only" should now say -M100% or -M1.0; sounds a bit like an earthquake, but arguably things are more consistent this way ;-)] Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-21 23:17:12 +01:00			`unsigned long num, scale;`
			`int ch, dot;`
[PATCH] diff: Clean up diff_scoreopt_parse(). This cleans up diff_scoreopt_parse() function that is used to parse the fractional notation -B, -C and -M option takes. The callers are modified to check for errors and complain. Earlier they silently ignored malformed input and falled back on the default. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:37:54 +02:00			`const char cp = cp_p;`

rename/copy score parsing updates. Better variant, which handles stuff like "4.5%" and rejects "192.168.0.1". Additionally, make sure numbers are unsigned (I'm making them unsigned long just for the hell of it), to make sure that artificial wraparound scenarios don't cause harm. -hpa [jc: with this, -M100 changes its meaning back to 10%. People wanting to say "pure renames only" should now say -M100% or -M1.0; sounds a bit like an earthquake, but arguably things are more consistent this way ;-)] Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-21 23:17:12 +01:00			`num = 0;`
[PATCH] diff: Clean up diff_scoreopt_parse(). This cleans up diff_scoreopt_parse() function that is used to parse the fractional notation -B, -C and -M option takes. The callers are modified to check for errors and complain. Earlier they silently ignored malformed input and falled back on the default. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:37:54 +02:00			`scale = 1;`
rename/copy score parsing updates. Better variant, which handles stuff like "4.5%" and rejects "192.168.0.1". Additionally, make sure numbers are unsigned (I'm making them unsigned long just for the hell of it), to make sure that artificial wraparound scenarios don't cause harm. -hpa [jc: with this, -M100 changes its meaning back to 10%. People wanting to say "pure renames only" should now say -M100% or -M1.0; sounds a bit like an earthquake, but arguably things are more consistent this way ;-)] Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-21 23:17:12 +01:00			`dot = 0;`
			`for(;;) {`
			`ch = *cp;`
			`if ( !dot && ch == '.' ) {`
			`scale = 1;`
			`dot = 1;`
			`} else if ( ch == '%' ) {`
			`scale = dot ? scale*100 : 100;`
			`cp++; /* % is always at the end */`
			`break;`
			`} else if ( ch >= '0' && ch <= '9' ) {`
			`if ( scale < 100000 ) {`
			`scale *= 10;`
			`num = (num*10) + (ch-'0');`
			`}`
			`} else {`
			`break;`
[PATCH] diff: Clean up diff_scoreopt_parse(). This cleans up diff_scoreopt_parse() function that is used to parse the fractional notation -B, -C and -M option takes. The callers are modified to check for errors and complain. Earlier they silently ignored malformed input and falled back on the default. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:37:54 +02:00			`}`
Fix compilation warnings. ... found by compiling them with gcc 2.95. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-30 06:17:21 +02:00			`cp++;`
[PATCH] diff: Clean up diff_scoreopt_parse(). This cleans up diff_scoreopt_parse() function that is used to parse the fractional notation -B, -C and -M option takes. The callers are modified to check for errors and complain. Earlier they silently ignored malformed input and falled back on the default. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:37:54 +02:00			`}`
			`*cp_p = cp;`

			`/* user says num divided by scale and we say internally that`
			`* is MAX_SCORE * num / scale.`
			`*/`
rename/copy score parsing updates. Better variant, which handles stuff like "4.5%" and rejects "192.168.0.1". Additionally, make sure numbers are unsigned (I'm making them unsigned long just for the hell of it), to make sure that artificial wraparound scenarios don't cause harm. -hpa [jc: with this, -M100 changes its meaning back to 10%. People wanting to say "pure renames only" should now say -M100% or -M1.0; sounds a bit like an earthquake, but arguably things are more consistent this way ;-)] Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-21 23:17:12 +01:00			`return (num >= scale) ? MAX_SCORE : (MAX_SCORE * num / scale);`
[PATCH] diff: Clean up diff_scoreopt_parse(). This cleans up diff_scoreopt_parse() function that is used to parse the fractional notation -B, -C and -M option takes. The callers are modified to check for errors and complain. Earlier they silently ignored malformed input and falled back on the default. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:37:54 +02:00			`}`

			`int diff_scoreopt_parse(const char *opt)`
			`{`
[PATCH] diff: Update -B heuristics. As Linus pointed out on the mailing list discussion, -B should break a files that has many inserts even if it still keeps enough of the original contents, so that the broken pieces can later be matched with other files by -M or -C. However, if such a broken pair does not get picked up by -M or -C, we would want to apply different criteria; namely, regardless of the amount of new material in the result, the determination of "rewrite" should be done by looking at the amount of original material still left in the result. If you still have the original 97 lines from a 100-line document, it does not matter if you add your own 13 lines to make a 110-line document, or if you add 903 lines to make a 1000-line document. It is not a rewrite but an in-place edit. On the other hand, if you did lose 97 lines from the original, it does not matter if you added 27 lines to make a 30-line document or if you added 997 lines to make a 1000-line document. You did a complete rewrite in either case. This patch introduces a post-processing phase that runs after diffcore-rename matches up broken pairs diffcore-break creates. The purpose of this post-processing is to pick up these broken pieces and merge them back into in-place modifications. For this, the score parameter -B option takes is changed into a pair of numbers, and it takes "-B99/80" format when fully spelled out. The first number is the minimum amount of "edit" (same definition as what diffcore-rename uses, which is "sum of deletion and insertion") that a modification needs to have to be broken, and the second number is the minimum amount of "delete" a surviving broken pair must have to avoid being merged back together. It can be abbreviated to "-B" to use default for both, "-B9" or "-B9/" to use 90% for "edit" but default (80%) for merge avoidance, or "-B/75" to use default (99%) "edit" and 75% for merge avoidance. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:40:28 +02:00			`int opt1, opt2, cmd;`
[PATCH] diff: Clean up diff_scoreopt_parse(). This cleans up diff_scoreopt_parse() function that is used to parse the fractional notation -B, -C and -M option takes. The callers are modified to check for errors and complain. Earlier they silently ignored malformed input and falled back on the default. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:37:54 +02:00
			`if (*opt++ != '-')`
			`return -1;`
			`cmd = *opt++;`
			`if (cmd != 'M' && cmd != 'C' && cmd != 'B')`
			`return -1; /* that is not a -M, -C nor -B option */`

			`opt1 = parse_num(&opt);`
[PATCH] diff: Update -B heuristics. As Linus pointed out on the mailing list discussion, -B should break a files that has many inserts even if it still keeps enough of the original contents, so that the broken pieces can later be matched with other files by -M or -C. However, if such a broken pair does not get picked up by -M or -C, we would want to apply different criteria; namely, regardless of the amount of new material in the result, the determination of "rewrite" should be done by looking at the amount of original material still left in the result. If you still have the original 97 lines from a 100-line document, it does not matter if you add your own 13 lines to make a 110-line document, or if you add 903 lines to make a 1000-line document. It is not a rewrite but an in-place edit. On the other hand, if you did lose 97 lines from the original, it does not matter if you added 27 lines to make a 30-line document or if you added 997 lines to make a 1000-line document. You did a complete rewrite in either case. This patch introduces a post-processing phase that runs after diffcore-rename matches up broken pairs diffcore-break creates. The purpose of this post-processing is to pick up these broken pieces and merge them back into in-place modifications. For this, the score parameter -B option takes is changed into a pair of numbers, and it takes "-B99/80" format when fully spelled out. The first number is the minimum amount of "edit" (same definition as what diffcore-rename uses, which is "sum of deletion and insertion") that a modification needs to have to be broken, and the second number is the minimum amount of "delete" a surviving broken pair must have to avoid being merged back together. It can be abbreviated to "-B" to use default for both, "-B9" or "-B9/" to use 90% for "edit" but default (80%) for merge avoidance, or "-B/75" to use default (99%) "edit" and 75% for merge avoidance. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:40:28 +02:00			`if (cmd != 'B')`
			`opt2 = 0;`
			`else {`
			`if (*opt == 0)`
			`opt2 = 0;`
			`else if (*opt != '/')`
			`return -1; /* we expect -B80/99 or -B80 */`
			`else {`
			`opt++;`
			`opt2 = parse_num(&opt);`
			`}`
			`}`
[PATCH] diff: Clean up diff_scoreopt_parse(). This cleans up diff_scoreopt_parse() function that is used to parse the fractional notation -B, -C and -M option takes. The callers are modified to check for errors and complain. Earlier they silently ignored malformed input and falled back on the default. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:37:54 +02:00			`if (*opt != 0)`
			`return -1;`
[PATCH] diff: Update -B heuristics. As Linus pointed out on the mailing list discussion, -B should break a files that has many inserts even if it still keeps enough of the original contents, so that the broken pieces can later be matched with other files by -M or -C. However, if such a broken pair does not get picked up by -M or -C, we would want to apply different criteria; namely, regardless of the amount of new material in the result, the determination of "rewrite" should be done by looking at the amount of original material still left in the result. If you still have the original 97 lines from a 100-line document, it does not matter if you add your own 13 lines to make a 110-line document, or if you add 903 lines to make a 1000-line document. It is not a rewrite but an in-place edit. On the other hand, if you did lose 97 lines from the original, it does not matter if you added 27 lines to make a 30-line document or if you added 997 lines to make a 1000-line document. You did a complete rewrite in either case. This patch introduces a post-processing phase that runs after diffcore-rename matches up broken pairs diffcore-break creates. The purpose of this post-processing is to pick up these broken pieces and merge them back into in-place modifications. For this, the score parameter -B option takes is changed into a pair of numbers, and it takes "-B99/80" format when fully spelled out. The first number is the minimum amount of "edit" (same definition as what diffcore-rename uses, which is "sum of deletion and insertion") that a modification needs to have to be broken, and the second number is the minimum amount of "delete" a surviving broken pair must have to avoid being merged back together. It can be abbreviated to "-B" to use default for both, "-B9" or "-B9/" to use 90% for "edit" but default (80%) for merge avoidance, or "-B/75" to use default (99%) "edit" and 75% for merge avoidance. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:40:28 +02:00			`return opt1 \| (opt2 << 16);`
[PATCH] diff: Clean up diff_scoreopt_parse(). This cleans up diff_scoreopt_parse() function that is used to parse the fractional notation -B, -C and -M option takes. The callers are modified to check for errors and complain. Earlier they silently ignored malformed input and falled back on the default. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:37:54 +02:00			`}`

[PATCH] Diffcore updates. This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 19:04:37 +02:00			`struct diff_queue_struct diff_queued_diff;`

			`void diff_q(struct diff_queue_struct queue, struct diff_filepair dp)`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`{`
[PATCH] Diffcore updates. This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 19:04:37 +02:00			`if (queue->alloc <= queue->nr) {`
			`queue->alloc = alloc_nr(queue->alloc);`
			`queue->queue = xrealloc(queue->queue,`
			`sizeof(dp) * queue->alloc);`
[PATCH] The diff-raw format updates. Update the diff-raw format as Linus and I discussed, except that it does not use sequence of underscore '_' letters to express nonexistence. All '0' mode is used for that purpose instead. The new diff-raw format can express rename/copy, and the earlier restriction that -M and -C _must_ be used with the patch format output is no longer necessary. The patch makes -M and -C flags independent of -p flag, so you need to say git-whatchanged -M -p to get the diff/patch format. Updated are both documentations and tests. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 04:42:18 +02:00			`}`
[PATCH] Diffcore updates. This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 19:04:37 +02:00			`queue->queue[queue->nr++] = dp;`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`}`

[PATCH] Introducing software archaeologist's tool "pickaxe". This steals the "pickaxe" feature from JIT and make it available to the bare Plumbing layer. From the command line, the user gives a string he is intersted in. Using the diff-core infrastructure previously introduced, it filters the differences to limit the output only to the diffs between <src> and <dst> where the string appears only in one but not in the other. For example: $ ./git-rev-list HEAD \| ./git-diff-tree -Sdiff-tree-helper --stdin -M would show the diffs that touch the string "diff-tree-helper". In real software-archaeologist application, you would typically look for a few to several lines of code and see where that code came from. The "pickaxe" module runs after "rename/copy detection" module, so it even crosses the file rename boundary, as the above example demonstrates. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:40:01 +02:00			`struct diff_filepair diff_queue(struct diff_queue_struct queue,`
[PATCH] Diffcore updates. This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 19:04:37 +02:00			`struct diff_filespec *one,`
			`struct diff_filespec *two)`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`{`
[PATCH] Introducing software archaeologist's tool "pickaxe". This steals the "pickaxe" feature from JIT and make it available to the bare Plumbing layer. From the command line, the user gives a string he is intersted in. Using the diff-core infrastructure previously introduced, it filters the differences to limit the output only to the diffs between <src> and <dst> where the string appears only in one but not in the other. For example: $ ./git-rev-list HEAD \| ./git-diff-tree -Sdiff-tree-helper --stdin -M would show the diffs that touch the string "diff-tree-helper". In real software-archaeologist application, you would typically look for a few to several lines of code and see where that code came from. The "pickaxe" module runs after "rename/copy detection" module, so it even crosses the file rename boundary, as the above example demonstrates. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:40:01 +02:00			`struct diff_filepair dp = xmalloc(sizeof(dp));`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`dp->one = one;`
			`dp->two = two;`
[PATCH] Rename/copy detection fix. The rename/copy detection logic in earlier round was only good enough to show patch output and discussion on the mailing list about the diff-raw format updates revealed many problems with it. This patch fixes all the ones known to me, without making things I want to do later impossible, mostly related to patch reordering. (1) Earlier rename/copy detector determined which one is rename and which one is copy too early, which made it impossible to later introduce diffcore transformers to reorder patches. This patch fixes it by moving that logic to the very end of the processing. (2) Earlier output routine diff_flush() was pruning all the "no-change" entries indiscriminatingly. This was done due to my false assumption that one of the requirements in the diff-raw output was not to show such an entry (which resulted in my incorrect comment about "diff-helper never being able to be equivalent to built-in diff driver"). My special thanks go to Linus for correcting me about this. When we produce diff-raw output, for the downstream to be able to tell renames from copies, sometimes it _is_ necessary to output "no-change" entries, and this patch adds diffcore_prune() function for doing it. (3) Earlier diff_filepair structure was trying to be not too specific about rename/copy operations, but the purpose of the structure was to record one or two paths, which _was_ indeed about rename/copy. This patch discards xfrm_msg field which was trying to be generic for this wrong reason, and introduces a couple of fields (rename_score and rename_rank) that are explicitly specific to rename/copy logic. One thing to note is that the information in a single diff_filepair structure _still_ does not distinguish renames from copies, and it is deliberately so. This is to allow patches to be reordered in later stages. (4) This patch also adds some tests about diff-raw format output and makes sure that necessary "no-change" entries appear on the output. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 06:26:09 +02:00			`dp->score = 0;`
[PATCH] Fix rename/copy when dealing with temporarily broken pairs. When rename/copy uses a file that was broken by diffcore-break as the source, and the broken filepair gets merged back later, the output was mislabeled as a rename. In this case, the source file ends up staying in the output, so we should label it as a copy instead. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-12 05:55:20 +02:00			`dp->status = 0;`
[PATCH] Fix the way diffcore-rename records unremoved source. Earier version of diffcore-rename used to keep unmodified filepair in its output so that the last stage of the processing that tells renames from copies can make all of rename/copy to copies. However this had a bad interaction with other diffcore filters that wanted to run after diffcore-rename, in that such unmodified filepair must be retained for proper distinction between renames and copies to happen. This patch fixes the problem by changing the way diffcore-rename records the information needed to distinguish "all are copies" case and "the last one is a rename" case. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:55:55 +02:00			`dp->source_stays = 0;`
[PATCH] Add -B flag to diff-* brothers. A new diffcore transformation, diffcore-break.c, is introduced. When the -B flag is given, a patch that represents a complete rewrite is broken into a deletion followed by a creation. This makes it easier to review such a complete rewrite patch. The -B flag takes the same syntax as the -M and -C flags to specify the minimum amount of non-source material the resulting file needs to have to be considered a complete rewrite, and defaults to 99% if not specified. As the new test t4008-diff-break-rewrite.sh demonstrates, if a file is a complete rewrite, it is broken into a delete/create pair, which can further be subjected to the usual rename detection if -M or -C is used. For example, if file0 gets completely rewritten to make it as if it were rather based on file1 which itself disappeared, the following happens: The original change looks like this: file0 --> file0' (quite different from file0) file1 --> /dev/null After diffcore-break runs, it would become this: file0 --> /dev/null /dev/null --> file0' file1 --> /dev/null Then diffcore-rename matches them up: file1 --> file0' The internal score values are finer grained now. Earlier maximum of 10000 has been raised to 60000; there is no user visible changes but there is no reason to waste available bits. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-30 09:08:37 +02:00			`dp->broken_pair = 0;`
Plug diff leaks. It is a bit embarrassing that it took this long for a fix since the problem was first reported on Aug 13th. Message-ID: <87y876gl1r.wl@mail2.atmark-techno.com> From: Yasushi SHOJI <yashi@atmark-techno.com> Newsgroups: gmane.comp.version-control.git Subject: [patch] possible memory leak in diff.c::diff_free_filepair() Date: Sat, 13 Aug 2005 19:58:56 +0900 This time I used valgrind to make sure that it does not overeagerly discard memory that is still being used. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-16 01:13:43 +02:00			`if (queue)`
			`diff_q(queue, dp);`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`return dp;`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`}`

[PATCH] Introduce diff_free_filepair() funcion. This introduces a new function to free a common data structure, and plugs some leaks. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:50:30 +02:00			`void diff_free_filepair(struct diff_filepair *p)`
			`{`
Revert "[PATCH] plug memory leak in diff.c::diff_free_filepair()" This reverts 068eac91ce04b9aca163acb1927c3878c45d1a07 commit. 2005-09-14 23:06:50 +02:00			`diff_free_filespec_data(p->one);`
			`diff_free_filespec_data(p->two);`
Plug diff leaks. It is a bit embarrassing that it took this long for a fix since the problem was first reported on Aug 13th. Message-ID: <87y876gl1r.wl@mail2.atmark-techno.com> From: Yasushi SHOJI <yashi@atmark-techno.com> Newsgroups: gmane.comp.version-control.git Subject: [patch] possible memory leak in diff.c::diff_free_filepair() Date: Sat, 13 Aug 2005 19:58:56 +0900 This time I used valgrind to make sure that it does not overeagerly discard memory that is still being used. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-16 01:13:43 +02:00			`free(p->one);`
			`free(p->two);`
[PATCH] Introduce diff_free_filepair() funcion. This introduces a new function to free a common data structure, and plugs some leaks. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:50:30 +02:00			`free(p);`
			`}`

diff: --abbrev option When I show transcripts to explain how something works, I often find myself hand-editing the diff-raw output to shorten various object names in the output. This adds --abbrev option to the diff family, which shortens diff-raw output and diff-tree commit id headers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-14 02:21:41 +01:00			`/* This is different from find_unique_abbrev() in that`
find_unique_abbrev() simplification. Earlier it did not grok the 0{40} SHA1 very well, but what it needed to do was to find the shortest 0{N} that is not used as a valid object name to be consistent with the way names of valid objects are abbreviated. This makes some users simpler. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-10 10:51:12 +01:00			`* it stuffs the result with dots for alignment.`
diff: --abbrev option When I show transcripts to explain how something works, I often find myself hand-editing the diff-raw output to shorten various object names in the output. This adds --abbrev option to the diff family, which shortens diff-raw output and diff-tree commit id headers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-14 02:21:41 +01:00			`*/`
			`const char diff_unique_abbrev(const unsigned char sha1, int len)`
			`{`
			`int abblen;`
			`const char *abbrev;`
			`if (len == 40)`
			`return sha1_to_hex(sha1);`

			`abbrev = find_unique_abbrev(sha1, len);`
find_unique_abbrev() simplification. Earlier it did not grok the 0{40} SHA1 very well, but what it needed to do was to find the shortest 0{N} that is not used as a valid object name to be consistent with the way names of valid objects are abbreviated. This makes some users simpler. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-10 10:51:12 +01:00			`if (!abbrev)`
			`return sha1_to_hex(sha1);`
diff: --abbrev option When I show transcripts to explain how something works, I often find myself hand-editing the diff-raw output to shorten various object names in the output. This adds --abbrev option to the diff family, which shortens diff-raw output and diff-tree commit id headers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-14 02:21:41 +01:00			`abblen = strlen(abbrev);`
			`if (abblen < 37) {`
			`static char hex[41];`
			`if (len < abblen && abblen <= len + 2)`
			`sprintf(hex, "%s%.*s", abbrev, len+3-abblen, "..");`
			`else`
			`sprintf(hex, "%s...", abbrev);`
			`return hex;`
			`}`
			`return sha1_to_hex(sha1);`
			`}`

[PATCH] Fix diff-pruning logic which was running prune too early. For later stages to reorder patches, pruning logic and rename detection logic should not decide which delete to discard (because another entry said it will take over the file as a rename) until the very end. Also fix some tests that were assuming the earlier "last one is rename or keep everything else is copy" semantics of diff-raw format, which no longer is true. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 03:14:03 +02:00			`static void diff_flush_raw(struct diff_filepair *p,`
			`int line_termination,`
Diff: --name-status output format. The new output format shows only the status letter and paths. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:20:06 +02:00			`int inter_name_termination,`
diff-* --patch-with-raw This new flag outputs the diff-raw output and diff-patch output at the same time. Requested by Cogito. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 02:36:53 +02:00			`struct diff_options *options,`
			`int output_format)`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`{`
[PATCH] diff-raw format update take #2. This changes the diff-raw format again, following the mailing list discussion. The new format explicitly expresses which one is a rename and which one is a copy. The documentation and tests are updated to match this change. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 23:55:33 +02:00			`int two_paths;`
			`char status[10];`
diff: --abbrev option When I show transcripts to explain how something works, I often find myself hand-editing the diff-raw output to shorten various object names in the output. This adds --abbrev option to the diff family, which shortens diff-raw output and diff-tree commit id headers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-14 02:21:41 +01:00			`int abbrev = options->abbrev;`
Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00			`const char path_one, path_two;`
[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00
Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00			`path_one = p->one->path;`
			`path_two = p->two->path;`
[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00			`if (line_termination) {`
Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00			`path_one = quote_one(path_one);`
			`path_two = quote_one(path_two);`
[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00			`}`

[PATCH] Rework -B output. Patch for a completely rewritten file detected by the -B flag was shown as a pair of creation followed by deletion in earlier versions. This was an misguided attempt to make reviewing such a complete rewrite easier, and unnecessarily ended up confusing git-apply. Instead, show the entire contents of old version prefixed with '-', followed by the entire contents of new version prefixed with '+'. This gives the same easy-to-review for human consumer while keeping it a single, regular modification patch for machine consumption, something that even GNU patch can grok. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-19 22:17:50 +02:00			`if (p->score)`
			`sprintf(status, "%c%03d", p->status,`
			`(int)(0.5 + p->score * 100.0/MAX_SCORE));`
			`else {`
			`status[0] = p->status;`
			`status[1] = 0;`
			`}`
[PATCH] diff-raw format update take #2. This changes the diff-raw format again, following the mailing list discussion. The new format explicitly expresses which one is a rename and which one is a copy. The documentation and tests are updated to match this change. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 23:55:33 +02:00			`switch (p->status) {`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`case DIFF_STATUS_COPIED:`
			`case DIFF_STATUS_RENAMED:`
[PATCH] diff-raw format update take #2. This changes the diff-raw format again, following the mailing list discussion. The new format explicitly expresses which one is a rename and which one is a copy. The documentation and tests are updated to match this change. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 23:55:33 +02:00			`two_paths = 1;`
			`break;`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`case DIFF_STATUS_ADDED:`
			`case DIFF_STATUS_DELETED:`
[PATCH] Add -B flag to diff-* brothers. A new diffcore transformation, diffcore-break.c, is introduced. When the -B flag is given, a patch that represents a complete rewrite is broken into a deletion followed by a creation. This makes it easier to review such a complete rewrite patch. The -B flag takes the same syntax as the -M and -C flags to specify the minimum amount of non-source material the resulting file needs to have to be considered a complete rewrite, and defaults to 99% if not specified. As the new test t4008-diff-break-rewrite.sh demonstrates, if a file is a complete rewrite, it is broken into a delete/create pair, which can further be subjected to the usual rename detection if -M or -C is used. For example, if file0 gets completely rewritten to make it as if it were rather based on file1 which itself disappeared, the following happens: The original change looks like this: file0 --> file0' (quite different from file0) file1 --> /dev/null After diffcore-break runs, it would become this: file0 --> /dev/null /dev/null --> file0' file1 --> /dev/null Then diffcore-rename matches them up: file1 --> file0' The internal score values are finer grained now. Earlier maximum of 10000 has been raised to 60000; there is no user visible changes but there is no reason to waste available bits. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-30 09:08:37 +02:00			`two_paths = 0;`
			`break;`
[PATCH] diff-raw format update take #2. This changes the diff-raw format again, following the mailing list discussion. The new format explicitly expresses which one is a rename and which one is a copy. The documentation and tests are updated to match this change. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 23:55:33 +02:00			`default:`
			`two_paths = 0;`
			`break;`
[PATCH] Diffcore updates. This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 19:04:37 +02:00			`}`
Diff: --name-status output format. The new output format shows only the status letter and paths. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:20:06 +02:00			`if (output_format != DIFF_FORMAT_NAME_STATUS) {`
			`printf(":%06o %06o %s ",`
diff: --abbrev option When I show transcripts to explain how something works, I often find myself hand-editing the diff-raw output to shorten various object names in the output. This adds --abbrev option to the diff family, which shortens diff-raw output and diff-tree commit id headers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-14 02:21:41 +01:00			`p->one->mode, p->two->mode,`
			`diff_unique_abbrev(p->one->sha1, abbrev));`
			`printf("%s ",`
			`diff_unique_abbrev(p->two->sha1, abbrev));`
Diff: --name-status output format. The new output format shows only the status letter and paths. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:20:06 +02:00			`}`
Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00			`printf("%s%c%s", status, inter_name_termination, path_one);`
[PATCH] diff-raw format update take #2. This changes the diff-raw format again, following the mailing list discussion. The new format explicitly expresses which one is a rename and which one is a copy. The documentation and tests are updated to match this change. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 23:55:33 +02:00			`if (two_paths)`
Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00			`printf("%c%s", inter_name_termination, path_two);`
[PATCH] diff-raw format update take #2. This changes the diff-raw format again, following the mailing list discussion. The new format explicitly expresses which one is a rename and which one is a copy. The documentation and tests are updated to match this change. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 23:55:33 +02:00			`putchar(line_termination);`
Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00			`if (path_one != p->one->path)`
			`free((void*)path_one);`
			`if (path_two != p->two->path)`
			`free((void*)path_two);`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`}`

[PATCH] git-diff-*: --name-only and --name-only-z. Porcelain layers often want to find only names of changed files, and even with diff-raw output format they end up having to pick out only the filename. Support --name-only (and --name-only-z for xargs -0 and cpio -0 users that want to treat filenames with embedded newlines sanely) flag to help them. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-13 21:45:51 +02:00			`static void diff_flush_name(struct diff_filepair *p,`
Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00			`int inter_name_termination,`
[PATCH] git-diff-*: --name-only and --name-only-z. Porcelain layers often want to find only names of changed files, and even with diff-raw output format they end up having to pick out only the filename. Support --name-only (and --name-only-z for xargs -0 and cpio -0 users that want to treat filenames with embedded newlines sanely) flag to help them. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-13 21:45:51 +02:00			`int line_termination)`
			`{`
Update git-diff-* to use C-style quoting for funny pathnames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-15 06:56:46 +02:00			`char *path = p->two->path;`

			`if (line_termination)`
			`path = quote_one(p->two->path);`
			`else`
			`path = p->two->path;`
			`printf("%s%c", path, line_termination);`
			`if (p->two->path != path)`
			`free(path);`
[PATCH] git-diff-*: --name-only and --name-only-z. Porcelain layers often want to find only names of changed files, and even with diff-raw output format they end up having to pick out only the filename. Support --name-only (and --name-only-z for xargs -0 and cpio -0 users that want to treat filenames with embedded newlines sanely) flag to help them. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-13 21:45:51 +02:00			`}`

[PATCH] Rename/copy detection fix. The rename/copy detection logic in earlier round was only good enough to show patch output and discussion on the mailing list about the diff-raw format updates revealed many problems with it. This patch fixes all the ones known to me, without making things I want to do later impossible, mostly related to patch reordering. (1) Earlier rename/copy detector determined which one is rename and which one is copy too early, which made it impossible to later introduce diffcore transformers to reorder patches. This patch fixes it by moving that logic to the very end of the processing. (2) Earlier output routine diff_flush() was pruning all the "no-change" entries indiscriminatingly. This was done due to my false assumption that one of the requirements in the diff-raw output was not to show such an entry (which resulted in my incorrect comment about "diff-helper never being able to be equivalent to built-in diff driver"). My special thanks go to Linus for correcting me about this. When we produce diff-raw output, for the downstream to be able to tell renames from copies, sometimes it _is_ necessary to output "no-change" entries, and this patch adds diffcore_prune() function for doing it. (3) Earlier diff_filepair structure was trying to be not too specific about rename/copy operations, but the purpose of the structure was to record one or two paths, which _was_ indeed about rename/copy. This patch discards xfrm_msg field which was trying to be generic for this wrong reason, and introduces a couple of fields (rename_score and rename_rank) that are explicitly specific to rename/copy logic. One thing to note is that the information in a single diff_filepair structure _still_ does not distinguish renames from copies, and it is deliberately so. This is to allow patches to be reordered in later stages. (4) This patch also adds some tests about diff-raw format output and makes sure that necessary "no-change" entries appear on the output. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 06:26:09 +02:00			`int diff_unmodified_pair(struct diff_filepair *p)`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`{`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`/* This function is written stricter than necessary to support`
			`* the currently implemented transformers, but the idea is to`
[PATCH] Introducing software archaeologist's tool "pickaxe". This steals the "pickaxe" feature from JIT and make it available to the bare Plumbing layer. From the command line, the user gives a string he is intersted in. Using the diff-core infrastructure previously introduced, it filters the differences to limit the output only to the diffs between <src> and <dst> where the string appears only in one but not in the other. For example: $ ./git-rev-list HEAD \| ./git-diff-tree -Sdiff-tree-helper --stdin -M would show the diffs that touch the string "diff-tree-helper". In real software-archaeologist application, you would typically look for a few to several lines of code and see where that code came from. The "pickaxe" module runs after "rename/copy detection" module, so it even crosses the file rename boundary, as the above example demonstrates. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:40:01 +02:00			`* let transformers to produce diff_filepairs any way they want,`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`* and filter and clean them up here before producing the output.`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`*/`
[PATCH] Diffcore updates. This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 19:04:37 +02:00			`struct diff_filespec one, two;`

			`if (DIFF_PAIR_UNMERGED(p))`
			`return 0; /* unmerged is interesting */`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00
[PATCH] Diffcore updates. This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 19:04:37 +02:00			`one = p->one;`
			`two = p->two;`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00
[PATCH] Diff updates to express type changes With the introduction of type 'T' in the diff-raw output, and the "apply-patch" program Linus has been quietly working on without much advertisement, it started to make sense to emit usable information in the "diff --git" patch output format as well. Earlier built-in diff driver punted and did not say anything about a symbolic link changing into a file or vice versa, but this version represents it as a pair of deletion and creation. It also fixes a minor problem dealing with old archive created with ancient git. The earlier code was reporting file mode change between 100664 and 100644 (we shouldn't). The linux-2.6 git tree has a good example that exposes this problem. A good test case is commit ce1dc02f76432a46db149241e015a4f782974623. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-26 11:24:30 +02:00			`/* deletion, addition, mode or type change`
			`* and rename are all interesting.`
			`*/`
[PATCH] The diff-raw format updates. Update the diff-raw format as Linus and I discussed, except that it does not use sequence of underscore '_' letters to express nonexistence. All '0' mode is used for that purpose instead. The new diff-raw format can express rename/copy, and the earlier restriction that -M and -C _must_ be used with the patch format output is no longer necessary. The patch makes -M and -C flags independent of -p flag, so you need to say git-whatchanged -M -p to get the diff/patch format. Updated are both documentations and tests. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 04:42:18 +02:00			`if (DIFF_FILE_VALID(one) != DIFF_FILE_VALID(two) \|\|`
[PATCH] Diff updates to express type changes With the introduction of type 'T' in the diff-raw output, and the "apply-patch" program Linus has been quietly working on without much advertisement, it started to make sense to emit usable information in the "diff --git" patch output format as well. Earlier built-in diff driver punted and did not say anything about a symbolic link changing into a file or vice versa, but this version represents it as a pair of deletion and creation. It also fixes a minor problem dealing with old archive created with ancient git. The earlier code was reporting file mode change between 100664 and 100644 (we shouldn't). The linux-2.6 git tree has a good example that exposes this problem. A good test case is commit ce1dc02f76432a46db149241e015a4f782974623. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-26 11:24:30 +02:00			`DIFF_PAIR_MODE_CHANGED(p) \|\|`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`strcmp(one->path, two->path))`
			`return 0;`
[PATCH] diff overhaul This cleans up the way calls are made into the diff core from diff-tree family and diff-helper. Earlier, these programs had "if (generating_patch)" sprinkled all over the place, but those ugliness are gone and handled uniformly from the diff core, even when not generating patch format. This also allowed diff-cache and diff-files to acquire -R (reverse) option to generate diff in reverse. Users of diff-tree can swap two trees easily so I did not add -R there. [ Linus' note: I'll add -R to "diff-tree" too, since a "commit diff" doesn't have another tree to switch around: the other tree is always the parent(s) of the commit ] Also -M<digits-as-mantissa> suggestion made by Linus has been implemented. Documentation updates are also included. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-20 04:00:36 +02:00
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`/* both are valid and point at the same path. that is, we are`
			`* dealing with a change.`
[PATCH] diff overhaul This cleans up the way calls are made into the diff core from diff-tree family and diff-helper. Earlier, these programs had "if (generating_patch)" sprinkled all over the place, but those ugliness are gone and handled uniformly from the diff core, even when not generating patch format. This also allowed diff-cache and diff-files to acquire -R (reverse) option to generate diff in reverse. Users of diff-tree can swap two trees easily so I did not add -R there. [ Linus' note: I'll add -R to "diff-tree" too, since a "commit diff" doesn't have another tree to switch around: the other tree is always the parent(s) of the commit ] Also -M<digits-as-mantissa> suggestion made by Linus has been implemented. Documentation updates are also included. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-20 04:00:36 +02:00			`*/`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`if (one->sha1_valid && two->sha1_valid &&`
			`!memcmp(one->sha1, two->sha1, sizeof(one->sha1)))`
			`return 1; /* no change */`
			`if (!one->sha1_valid && !two->sha1_valid)`
			`return 1; /* both look at the same file on the filesystem. */`
			`return 0;`
[PATCH] diff overhaul This cleans up the way calls are made into the diff core from diff-tree family and diff-helper. Earlier, these programs had "if (generating_patch)" sprinkled all over the place, but those ugliness are gone and handled uniformly from the diff core, even when not generating patch format. This also allowed diff-cache and diff-files to acquire -R (reverse) option to generate diff in reverse. Users of diff-tree can swap two trees easily so I did not add -R there. [ Linus' note: I'll add -R to "diff-tree" too, since a "commit diff" doesn't have another tree to switch around: the other tree is always the parent(s) of the commit ] Also -M<digits-as-mantissa> suggestion made by Linus has been implemented. Documentation updates are also included. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-20 04:00:36 +02:00			`}`

diff: --full-index A new option, --full-index, is introduced to diff family. This causes the full object name of pre- and post-images to appear on the index line of patch formatted output, to be used in conjunction with --allow-binary-replacement option of git-apply. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-15 02:53:22 +01:00			`static void diff_flush_patch(struct diff_filepair p, struct diff_options o)`
[PATCH] Rename/copy detection fix. The rename/copy detection logic in earlier round was only good enough to show patch output and discussion on the mailing list about the diff-raw format updates revealed many problems with it. This patch fixes all the ones known to me, without making things I want to do later impossible, mostly related to patch reordering. (1) Earlier rename/copy detector determined which one is rename and which one is copy too early, which made it impossible to later introduce diffcore transformers to reorder patches. This patch fixes it by moving that logic to the very end of the processing. (2) Earlier output routine diff_flush() was pruning all the "no-change" entries indiscriminatingly. This was done due to my false assumption that one of the requirements in the diff-raw output was not to show such an entry (which resulted in my incorrect comment about "diff-helper never being able to be equivalent to built-in diff driver"). My special thanks go to Linus for correcting me about this. When we produce diff-raw output, for the downstream to be able to tell renames from copies, sometimes it _is_ necessary to output "no-change" entries, and this patch adds diffcore_prune() function for doing it. (3) Earlier diff_filepair structure was trying to be not too specific about rename/copy operations, but the purpose of the structure was to record one or two paths, which _was_ indeed about rename/copy. This patch discards xfrm_msg field which was trying to be generic for this wrong reason, and introduces a couple of fields (rename_score and rename_rank) that are explicitly specific to rename/copy logic. One thing to note is that the information in a single diff_filepair structure _still_ does not distinguish renames from copies, and it is deliberately so. This is to allow patches to be reordered in later stages. (4) This patch also adds some tests about diff-raw format output and makes sure that necessary "no-change" entries appear on the output. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 06:26:09 +02:00			`{`
			`if (diff_unmodified_pair(p))`
			`return;`

			`if ((DIFF_FILE_VALID(p->one) && S_ISDIR(p->one->mode)) \|\|`
			`(DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`return; /* no tree diffs in patch format */`
[PATCH] Rename/copy detection fix. The rename/copy detection logic in earlier round was only good enough to show patch output and discussion on the mailing list about the diff-raw format updates revealed many problems with it. This patch fixes all the ones known to me, without making things I want to do later impossible, mostly related to patch reordering. (1) Earlier rename/copy detector determined which one is rename and which one is copy too early, which made it impossible to later introduce diffcore transformers to reorder patches. This patch fixes it by moving that logic to the very end of the processing. (2) Earlier output routine diff_flush() was pruning all the "no-change" entries indiscriminatingly. This was done due to my false assumption that one of the requirements in the diff-raw output was not to show such an entry (which resulted in my incorrect comment about "diff-helper never being able to be equivalent to built-in diff driver"). My special thanks go to Linus for correcting me about this. When we produce diff-raw output, for the downstream to be able to tell renames from copies, sometimes it _is_ necessary to output "no-change" entries, and this patch adds diffcore_prune() function for doing it. (3) Earlier diff_filepair structure was trying to be not too specific about rename/copy operations, but the purpose of the structure was to record one or two paths, which _was_ indeed about rename/copy. This patch discards xfrm_msg field which was trying to be generic for this wrong reason, and introduces a couple of fields (rename_score and rename_rank) that are explicitly specific to rename/copy logic. One thing to note is that the information in a single diff_filepair structure _still_ does not distinguish renames from copies, and it is deliberately so. This is to allow patches to be reordered in later stages. (4) This patch also adds some tests about diff-raw format output and makes sure that necessary "no-change" entries appear on the output. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 06:26:09 +02:00
diff: --full-index A new option, --full-index, is introduced to diff family. This causes the full object name of pre- and post-images to appear on the index line of patch formatted output, to be used in conjunction with --allow-binary-replacement option of git-apply. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-15 02:53:22 +01:00			`run_diff(p, o);`
[PATCH] Rename/copy detection fix. The rename/copy detection logic in earlier round was only good enough to show patch output and discussion on the mailing list about the diff-raw format updates revealed many problems with it. This patch fixes all the ones known to me, without making things I want to do later impossible, mostly related to patch reordering. (1) Earlier rename/copy detector determined which one is rename and which one is copy too early, which made it impossible to later introduce diffcore transformers to reorder patches. This patch fixes it by moving that logic to the very end of the processing. (2) Earlier output routine diff_flush() was pruning all the "no-change" entries indiscriminatingly. This was done due to my false assumption that one of the requirements in the diff-raw output was not to show such an entry (which resulted in my incorrect comment about "diff-helper never being able to be equivalent to built-in diff driver"). My special thanks go to Linus for correcting me about this. When we produce diff-raw output, for the downstream to be able to tell renames from copies, sometimes it _is_ necessary to output "no-change" entries, and this patch adds diffcore_prune() function for doing it. (3) Earlier diff_filepair structure was trying to be not too specific about rename/copy operations, but the purpose of the structure was to record one or two paths, which _was_ indeed about rename/copy. This patch discards xfrm_msg field which was trying to be generic for this wrong reason, and introduces a couple of fields (rename_score and rename_rank) that are explicitly specific to rename/copy logic. One thing to note is that the information in a single diff_filepair structure _still_ does not distinguish renames from copies, and it is deliberately so. This is to allow patches to be reordered in later stages. (4) This patch also adds some tests about diff-raw format output and makes sure that necessary "no-change" entries appear on the output. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 06:26:09 +02:00			`}`

diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`static void diff_flush_stat(struct diff_filepair p, struct diff_options o,`
			`struct diffstat_t *diffstat)`
			`{`
			`if (diff_unmodified_pair(p))`
			`return;`

			`if ((DIFF_FILE_VALID(p->one) && S_ISDIR(p->one->mode)) \|\|`
			`(DIFF_FILE_VALID(p->two) && S_ISDIR(p->two->mode)))`
			`return; /* no tree diffs in patch format */`

			`run_diffstat(p, o, diffstat);`
			`}`

[PATCH] Fix diff-pruning logic which was running prune too early. For later stages to reorder patches, pruning logic and rename detection logic should not decide which delete to discard (because another entry said it will take over the file as a rename) until the very end. Also fix some tests that were assuming the earlier "last one is rename or keep everything else is copy" semantics of diff-raw format, which no longer is true. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 03:14:03 +02:00			`int diff_queue_is_empty(void)`
[PATCH] Rename/copy detection fix. The rename/copy detection logic in earlier round was only good enough to show patch output and discussion on the mailing list about the diff-raw format updates revealed many problems with it. This patch fixes all the ones known to me, without making things I want to do later impossible, mostly related to patch reordering. (1) Earlier rename/copy detector determined which one is rename and which one is copy too early, which made it impossible to later introduce diffcore transformers to reorder patches. This patch fixes it by moving that logic to the very end of the processing. (2) Earlier output routine diff_flush() was pruning all the "no-change" entries indiscriminatingly. This was done due to my false assumption that one of the requirements in the diff-raw output was not to show such an entry (which resulted in my incorrect comment about "diff-helper never being able to be equivalent to built-in diff driver"). My special thanks go to Linus for correcting me about this. When we produce diff-raw output, for the downstream to be able to tell renames from copies, sometimes it _is_ necessary to output "no-change" entries, and this patch adds diffcore_prune() function for doing it. (3) Earlier diff_filepair structure was trying to be not too specific about rename/copy operations, but the purpose of the structure was to record one or two paths, which _was_ indeed about rename/copy. This patch discards xfrm_msg field which was trying to be generic for this wrong reason, and introduces a couple of fields (rename_score and rename_rank) that are explicitly specific to rename/copy logic. One thing to note is that the information in a single diff_filepair structure _still_ does not distinguish renames from copies, and it is deliberately so. This is to allow patches to be reordered in later stages. (4) This patch also adds some tests about diff-raw format output and makes sure that necessary "no-change" entries appear on the output. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 06:26:09 +02:00			`{`
[PATCH] Fix diff-pruning logic which was running prune too early. For later stages to reorder patches, pruning logic and rename detection logic should not decide which delete to discard (because another entry said it will take over the file as a rename) until the very end. Also fix some tests that were assuming the earlier "last one is rename or keep everything else is copy" semantics of diff-raw format, which no longer is true. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 03:14:03 +02:00			`struct diff_queue_struct *q = &diff_queued_diff;`
[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00			`int i;`
			`for (i = 0; i < q->nr; i++)`
			`if (!diff_unmodified_pair(q->queue[i]))`
			`return 0;`
			`return 1;`
[PATCH] Rename/copy detection fix. The rename/copy detection logic in earlier round was only good enough to show patch output and discussion on the mailing list about the diff-raw format updates revealed many problems with it. This patch fixes all the ones known to me, without making things I want to do later impossible, mostly related to patch reordering. (1) Earlier rename/copy detector determined which one is rename and which one is copy too early, which made it impossible to later introduce diffcore transformers to reorder patches. This patch fixes it by moving that logic to the very end of the processing. (2) Earlier output routine diff_flush() was pruning all the "no-change" entries indiscriminatingly. This was done due to my false assumption that one of the requirements in the diff-raw output was not to show such an entry (which resulted in my incorrect comment about "diff-helper never being able to be equivalent to built-in diff driver"). My special thanks go to Linus for correcting me about this. When we produce diff-raw output, for the downstream to be able to tell renames from copies, sometimes it _is_ necessary to output "no-change" entries, and this patch adds diffcore_prune() function for doing it. (3) Earlier diff_filepair structure was trying to be not too specific about rename/copy operations, but the purpose of the structure was to record one or two paths, which _was_ indeed about rename/copy. This patch discards xfrm_msg field which was trying to be generic for this wrong reason, and introduces a couple of fields (rename_score and rename_rank) that are explicitly specific to rename/copy logic. One thing to note is that the information in a single diff_filepair structure _still_ does not distinguish renames from copies, and it is deliberately so. This is to allow patches to be reordered in later stages. (4) This patch also adds some tests about diff-raw format output and makes sure that necessary "no-change" entries appear on the output. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 06:26:09 +02:00			`}`

[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00			`#if DIFF_DEBUG`
			`void diff_debug_filespec(struct diff_filespec s, int x, const char one)`
			`{`
			`fprintf(stderr, "queue[%d] %s (%s) %s %06o %s\n",`
Fix ?: statements. Omitting the first branch in ?: is a GNU extension. Cute, but not supported by other compilers. Replaced mostly by explicit tests. Calls to getenv() simply are repeated on non-GNU compilers. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> 2005-08-23 22:34:07 +02:00			`x, one ? one : "",`
[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00			`s->path,`
			`DIFF_FILE_VALID(s) ? "valid" : "invalid",`
			`s->mode,`
			`s->sha1_valid ? sha1_to_hex(s->sha1) : "");`
			`fprintf(stderr, "queue[%d] %s size %lu flags %d\n",`
Fix ?: statements. Omitting the first branch in ?: is a GNU extension. Cute, but not supported by other compilers. Replaced mostly by explicit tests. Calls to getenv() simply are repeated on non-GNU compilers. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> 2005-08-23 22:34:07 +02:00			`x, one ? one : "",`
[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00			`s->size, s->xfrm_flags);`
			`}`

			`void diff_debug_filepair(const struct diff_filepair *p, int i)`
			`{`
			`diff_debug_filespec(p->one, i, "one");`
			`diff_debug_filespec(p->two, i, "two");`
[PATCH] Add -B flag to diff-* brothers. A new diffcore transformation, diffcore-break.c, is introduced. When the -B flag is given, a patch that represents a complete rewrite is broken into a deletion followed by a creation. This makes it easier to review such a complete rewrite patch. The -B flag takes the same syntax as the -M and -C flags to specify the minimum amount of non-source material the resulting file needs to have to be considered a complete rewrite, and defaults to 99% if not specified. As the new test t4008-diff-break-rewrite.sh demonstrates, if a file is a complete rewrite, it is broken into a delete/create pair, which can further be subjected to the usual rename detection if -M or -C is used. For example, if file0 gets completely rewritten to make it as if it were rather based on file1 which itself disappeared, the following happens: The original change looks like this: file0 --> file0' (quite different from file0) file1 --> /dev/null After diffcore-break runs, it would become this: file0 --> /dev/null /dev/null --> file0' file1 --> /dev/null Then diffcore-rename matches them up: file1 --> file0' The internal score values are finer grained now. Earlier maximum of 10000 has been raised to 60000; there is no user visible changes but there is no reason to waste available bits. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-30 09:08:37 +02:00			`fprintf(stderr, "score %d, status %c stays %d broken %d\n",`
Fix ?: statements. Omitting the first branch in ?: is a GNU extension. Cute, but not supported by other compilers. Replaced mostly by explicit tests. Calls to getenv() simply are repeated on non-GNU compilers. Signed-off-by: Jason Riedy <ejr@cs.berkeley.edu> 2005-08-23 22:34:07 +02:00			`p->score, p->status ? p->status : '?',`
[PATCH] Add -B flag to diff-* brothers. A new diffcore transformation, diffcore-break.c, is introduced. When the -B flag is given, a patch that represents a complete rewrite is broken into a deletion followed by a creation. This makes it easier to review such a complete rewrite patch. The -B flag takes the same syntax as the -M and -C flags to specify the minimum amount of non-source material the resulting file needs to have to be considered a complete rewrite, and defaults to 99% if not specified. As the new test t4008-diff-break-rewrite.sh demonstrates, if a file is a complete rewrite, it is broken into a delete/create pair, which can further be subjected to the usual rename detection if -M or -C is used. For example, if file0 gets completely rewritten to make it as if it were rather based on file1 which itself disappeared, the following happens: The original change looks like this: file0 --> file0' (quite different from file0) file1 --> /dev/null After diffcore-break runs, it would become this: file0 --> /dev/null /dev/null --> file0' file1 --> /dev/null Then diffcore-rename matches them up: file1 --> file0' The internal score values are finer grained now. Earlier maximum of 10000 has been raised to 60000; there is no user visible changes but there is no reason to waste available bits. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-30 09:08:37 +02:00			`p->source_stays, p->broken_pair);`
[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00			`}`

			`void diff_debug_queue(const char msg, struct diff_queue_struct q)`
[PATCH] Diffcore updates. This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 19:04:37 +02:00			`{`
			`int i;`
[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00			`if (msg)`
			`fprintf(stderr, "%s\n", msg);`
			`fprintf(stderr, "q->nr = %d\n", q->nr);`
[PATCH] Diffcore updates. This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 19:04:37 +02:00			`for (i = 0; i < q->nr; i++) {`
			`struct diff_filepair *p = q->queue[i];`
[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00			`diff_debug_filepair(p, i);`
			`}`
			`}`
			`#endif`

			`static void diff_resolve_rename_copy(void)`
			`{`
			`int i, j;`
			`struct diff_filepair p, pp;`
			`struct diff_queue_struct *q = &diff_queued_diff;`

			`diff_debug_queue("resolve-rename-copy", q);`

			`for (i = 0; i < q->nr; i++) {`
			`p = q->queue[i];`
[PATCH] Fix type-change handling when assigning the status code to filepairs. The interim single-liner '?' fix resulted delete entries that should not have emitted coming out in the output as an unintended side effect; I caught this with the "rename" test in the test suite. This patch instead fixes the code that assigns the status code to each filepair. I verified this does not break the testcase in udev.git tree Kay Sievers gave us, by running git-diff-tree on that tree which showed 21 file to symlink changes. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-26 00:07:08 +02:00			`p->status = 0; /* undecided */`
[PATCH] Fix diff-pruning logic which was running prune too early. For later stages to reorder patches, pruning logic and rename detection logic should not decide which delete to discard (because another entry said it will take over the file as a rename) until the very end. Also fix some tests that were assuming the earlier "last one is rename or keep everything else is copy" semantics of diff-raw format, which no longer is true. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 03:14:03 +02:00			`if (DIFF_PAIR_UNMERGED(p))`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`p->status = DIFF_STATUS_UNMERGED;`
[PATCH] Fix the way diffcore-rename records unremoved source. Earier version of diffcore-rename used to keep unmodified filepair in its output so that the last stage of the processing that tells renames from copies can make all of rename/copy to copies. However this had a bad interaction with other diffcore filters that wanted to run after diffcore-rename, in that such unmodified filepair must be retained for proper distinction between renames and copies to happen. This patch fixes the problem by changing the way diffcore-rename records the information needed to distinguish "all are copies" case and "the last one is a rename" case. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:55:55 +02:00			`else if (!DIFF_FILE_VALID(p->one))`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`p->status = DIFF_STATUS_ADDED;`
[PATCH] diff: fix the culling of unneeded delete record. The commit 15d061b435a7e3b6bead39df3889f4af78c4b00a [PATCH] Fix the way diffcore-rename records unremoved source. still leaves unneeded delete records in its output stream by mistake, which was covered up by having an extra check to turn such a delete into a no-op downstream. Fix the check in the diffcore-rename to simplify the output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-30 09:08:07 +02:00			`else if (!DIFF_FILE_VALID(p->two))`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`p->status = DIFF_STATUS_DELETED;`
[PATCH] Fix type-change handling when assigning the status code to filepairs. The interim single-liner '?' fix resulted delete entries that should not have emitted coming out in the output as an unintended side effect; I caught this with the "rename" test in the test suite. This patch instead fixes the code that assigns the status code to each filepair. I verified this does not break the testcase in udev.git tree Kay Sievers gave us, by running git-diff-tree on that tree which showed 21 file to symlink changes. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-26 00:07:08 +02:00			`else if (DIFF_PAIR_TYPE_CHANGED(p))`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`p->status = DIFF_STATUS_TYPE_CHANGED;`
[PATCH] Fix type-change handling when assigning the status code to filepairs. The interim single-liner '?' fix resulted delete entries that should not have emitted coming out in the output as an unintended side effect; I caught this with the "rename" test in the test suite. This patch instead fixes the code that assigns the status code to each filepair. I verified this does not break the testcase in udev.git tree Kay Sievers gave us, by running git-diff-tree on that tree which showed 21 file to symlink changes. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-26 00:07:08 +02:00
			`/* from this point on, we are dealing with a pair`
			`* whose both sides are valid and of the same type, i.e.`
			`* either in-place edit or rename/copy edit.`
			`*/`
[PATCH] diff: code clean-up and removal of rename hack. A new macro, DIFF_PAIR_RENAME(), is introduced to distinguish a filepair that is a rename/copy (the definition of which is src and dst are different paths, of course). This removes the hack used in the record_rename_pair() to always put a non-zero value in the score field. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-30 01:56:48 +02:00			`else if (DIFF_PAIR_RENAME(p)) {`
[PATCH] Fix the way diffcore-rename records unremoved source. Earier version of diffcore-rename used to keep unmodified filepair in its output so that the last stage of the processing that tells renames from copies can make all of rename/copy to copies. However this had a bad interaction with other diffcore filters that wanted to run after diffcore-rename, in that such unmodified filepair must be retained for proper distinction between renames and copies to happen. This patch fixes the problem by changing the way diffcore-rename records the information needed to distinguish "all are copies" case and "the last one is a rename" case. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:55:55 +02:00			`if (p->source_stays) {`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`p->status = DIFF_STATUS_COPIED;`
[PATCH] Fix the way diffcore-rename records unremoved source. Earier version of diffcore-rename used to keep unmodified filepair in its output so that the last stage of the processing that tells renames from copies can make all of rename/copy to copies. However this had a bad interaction with other diffcore filters that wanted to run after diffcore-rename, in that such unmodified filepair must be retained for proper distinction between renames and copies to happen. This patch fixes the problem by changing the way diffcore-rename records the information needed to distinguish "all are copies" case and "the last one is a rename" case. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:55:55 +02:00			`continue;`
			`}`
			`/* See if there is some other filepair that`
			`* copies from the same source as us. If so`
Fix copy marking from diffcore-rename. When (A,B) ==> (B,C) rename-copy was detected, we incorrectly said that C was created by copying B. This is because we only check if the path of rename/copy source still exists in the resulting tree to see if the file is renamed out of existence. In this case, the new B is created by copying or renaming A, so the original B is lost and we should say C is a rename of B not a copy of B. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-10 21:42:32 +02:00			`* we are a copy. Otherwise we are either a`
			`* copy if the path stays, or a rename if it`
			`* does not, but we already handled "stays" case.`
[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00			`*/`
[PATCH] Fix the way diffcore-rename records unremoved source. Earier version of diffcore-rename used to keep unmodified filepair in its output so that the last stage of the processing that tells renames from copies can make all of rename/copy to copies. However this had a bad interaction with other diffcore filters that wanted to run after diffcore-rename, in that such unmodified filepair must be retained for proper distinction between renames and copies to happen. This patch fixes the problem by changing the way diffcore-rename records the information needed to distinguish "all are copies" case and "the last one is a rename" case. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:55:55 +02:00			`for (j = i + 1; j < q->nr; j++) {`
[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00			`pp = q->queue[j];`
			`if (strcmp(pp->one->path, p->one->path))`
[PATCH] Fix the way diffcore-rename records unremoved source. Earier version of diffcore-rename used to keep unmodified filepair in its output so that the last stage of the processing that tells renames from copies can make all of rename/copy to copies. However this had a bad interaction with other diffcore filters that wanted to run after diffcore-rename, in that such unmodified filepair must be retained for proper distinction between renames and copies to happen. This patch fixes the problem by changing the way diffcore-rename records the information needed to distinguish "all are copies" case and "the last one is a rename" case. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:55:55 +02:00			`continue; /* not us */`
[PATCH] diff: code clean-up and removal of rename hack. A new macro, DIFF_PAIR_RENAME(), is introduced to distinguish a filepair that is a rename/copy (the definition of which is src and dst are different paths, of course). This removes the hack used in the record_rename_pair() to always put a non-zero value in the score field. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-30 01:56:48 +02:00			`if (!DIFF_PAIR_RENAME(pp))`
[PATCH] Fix the way diffcore-rename records unremoved source. Earier version of diffcore-rename used to keep unmodified filepair in its output so that the last stage of the processing that tells renames from copies can make all of rename/copy to copies. However this had a bad interaction with other diffcore filters that wanted to run after diffcore-rename, in that such unmodified filepair must be retained for proper distinction between renames and copies to happen. This patch fixes the problem by changing the way diffcore-rename records the information needed to distinguish "all are copies" case and "the last one is a rename" case. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:55:55 +02:00			`continue; /* not a rename/copy */`
			`/* pp is a rename/copy from the same source */`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`p->status = DIFF_STATUS_COPIED;`
[PATCH] Fix the way diffcore-rename records unremoved source. Earier version of diffcore-rename used to keep unmodified filepair in its output so that the last stage of the processing that tells renames from copies can make all of rename/copy to copies. However this had a bad interaction with other diffcore filters that wanted to run after diffcore-rename, in that such unmodified filepair must be retained for proper distinction between renames and copies to happen. This patch fixes the problem by changing the way diffcore-rename records the information needed to distinguish "all are copies" case and "the last one is a rename" case. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:55:55 +02:00			`break;`
[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00			`}`
			`if (!p->status)`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`p->status = DIFF_STATUS_RENAMED;`
[PATCH] Fix diff-pruning logic which was running prune too early. For later stages to reorder patches, pruning logic and rename detection logic should not decide which delete to discard (because another entry said it will take over the file as a rename) until the very end. Also fix some tests that were assuming the earlier "last one is rename or keep everything else is copy" semantics of diff-raw format, which no longer is true. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 03:14:03 +02:00			`}`
[PATCH] Mode only changes from diff. This fixes another bug. - Mode-only changes were pruned incorrectly from the output. - Added test to catch the above problem. - Normalize rename/copy similarity score in the diff-raw output to per-cent, no matter what scale we internally use. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-26 01:00:04 +02:00			`else if (memcmp(p->one->sha1, p->two->sha1, 20) \|\|`
			`p->one->mode != p->two->mode)`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`p->status = DIFF_STATUS_MODIFIED;`
[PATCH] diff: mode bits fixes The core GIT repository has trees that record regular file mode in 0664 instead of normalized 0644 pattern. Comparing such a tree with another tree that records the same file in 0644 pattern without content changes with git-diff-tree causes it to feed otherwise unmodified pairs to the diff_change() routine, which triggers a sanity check routine and barfs. This patch fixes the problem, along with the fix to another caller that uses unnormalized mode bits to call diff_change() routine in a similar way. Without this patch, you will see "fatal error" from diff-tree when you run git-deltafy-script on the core GIT repository itself. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-01 20:38:07 +02:00			`else {`
			`/* This is a "no-change" entry and should not`
			`* happen anymore, but prepare for broken callers.`
[PATCH] Fix the way diffcore-rename records unremoved source. Earier version of diffcore-rename used to keep unmodified filepair in its output so that the last stage of the processing that tells renames from copies can make all of rename/copy to copies. However this had a bad interaction with other diffcore filters that wanted to run after diffcore-rename, in that such unmodified filepair must be retained for proper distinction between renames and copies to happen. This patch fixes the problem by changing the way diffcore-rename records the information needed to distinguish "all are copies" case and "the last one is a rename" case. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:55:55 +02:00			`*/`
[PATCH] diff: mode bits fixes The core GIT repository has trees that record regular file mode in 0664 instead of normalized 0644 pattern. Comparing such a tree with another tree that records the same file in 0644 pattern without content changes with git-diff-tree causes it to feed otherwise unmodified pairs to the diff_change() routine, which triggers a sanity check routine and barfs. This patch fixes the problem, along with the fix to another caller that uses unnormalized mode bits to call diff_change() routine in a similar way. Without this patch, you will see "fatal error" from diff-tree when you run git-deltafy-script on the core GIT repository itself. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-01 20:38:07 +02:00			`error("feeding unmodified %s to diffcore",`
			`p->one->path);`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`p->status = DIFF_STATUS_UNKNOWN;`
[PATCH] diff: mode bits fixes The core GIT repository has trees that record regular file mode in 0664 instead of normalized 0644 pattern. Comparing such a tree with another tree that records the same file in 0644 pattern without content changes with git-diff-tree causes it to feed otherwise unmodified pairs to the diff_change() routine, which triggers a sanity check routine and barfs. This patch fixes the problem, along with the fix to another caller that uses unnormalized mode bits to call diff_change() routine in a similar way. Without this patch, you will see "fatal error" from diff-tree when you run git-deltafy-script on the core GIT repository itself. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-01 20:38:07 +02:00			`}`
[PATCH] Diffcore updates. This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 19:04:37 +02:00			`}`
[PATCH] Redo rename/copy detection logic. Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 10:10:48 +02:00			`diff_debug_queue("resolve-rename-copy done", q);`
[PATCH] Prepare diffcore interface for diff-tree header supression. This does not actually supress the extra headers when pickaxe is used, but prepares enough support for diff-tree to implement it. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 04:40:36 +02:00			`}`

diff-* --patch-with-raw This new flag outputs the diff-raw output and diff-patch output at the same time. Requested by Cogito. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 02:36:53 +02:00			`static void flush_one_pair(struct diff_filepair *p,`
			`int diff_output_format,`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`struct diff_options *options,`
			`struct diffstat_t *diffstat)`
[PATCH] Prepare diffcore interface for diff-tree header supression. This does not actually supress the extra headers when pickaxe is used, but prepares enough support for diff-tree to implement it. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 04:40:36 +02:00			`{`
[PATCH] Fix diff-pruning logic which was running prune too early. For later stages to reorder patches, pruning logic and rename detection logic should not decide which delete to discard (because another entry said it will take over the file as a rename) until the very end. Also fix some tests that were assuming the earlier "last one is rename or keep everything else is copy" semantics of diff-raw format, which no longer is true. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 03:14:03 +02:00			`int inter_name_termination = '\t';`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`int line_termination = options->line_termination;`
Split up "diff_format" into "format" and "line_termination". This removes the separate "formats" for name and name-with-zero- termination. It also removes the difference between HUMAN and MACHINE formats, and they both become DIFF_FORMAT_RAW, with the difference being just in the line and inter-filename termination. It also makes the code easier to understand. 2005-07-15 02:59:17 +02:00			`if (!line_termination)`
			`inter_name_termination = 0;`
[PATCH] Fix diff-pruning logic which was running prune too early. For later stages to reorder patches, pruning logic and rename detection logic should not decide which delete to discard (because another entry said it will take over the file as a rename) until the very end. Also fix some tests that were assuming the earlier "last one is rename or keep everything else is copy" semantics of diff-raw format, which no longer is true. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 03:14:03 +02:00
diff-* --patch-with-raw This new flag outputs the diff-raw output and diff-patch output at the same time. Requested by Cogito. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 02:36:53 +02:00			`switch (p->status) {`
			`case DIFF_STATUS_UNKNOWN:`
			`break;`
			`case 0:`
			`die("internal error in diff-resolve-rename-copy");`
			`break;`
			`default:`
			`switch (diff_output_format) {`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`case DIFF_FORMAT_DIFFSTAT:`
			`diff_flush_stat(p, options, diffstat);`
			`break;`
diff-* --patch-with-raw This new flag outputs the diff-raw output and diff-patch output at the same time. Requested by Cogito. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 02:36:53 +02:00			`case DIFF_FORMAT_PATCH:`
			`diff_flush_patch(p, options);`
[PATCH] Fix diff-pruning logic which was running prune too early. For later stages to reorder patches, pruning logic and rename detection logic should not decide which delete to discard (because another entry said it will take over the file as a rename) until the very end. Also fix some tests that were assuming the earlier "last one is rename or keep everything else is copy" semantics of diff-raw format, which no longer is true. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 03:14:03 +02:00			`break;`
diff-* --patch-with-raw This new flag outputs the diff-raw output and diff-patch output at the same time. Requested by Cogito. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 02:36:53 +02:00			`case DIFF_FORMAT_RAW:`
			`case DIFF_FORMAT_NAME_STATUS:`
			`diff_flush_raw(p, line_termination,`
			`inter_name_termination,`
			`options, diff_output_format);`
[PATCH] git-diff-*: --name-only and --name-only-z. Porcelain layers often want to find only names of changed files, and even with diff-raw output format they end up having to pick out only the filename. Support --name-only (and --name-only-z for xargs -0 and cpio -0 users that want to treat filenames with embedded newlines sanely) flag to help them. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-13 21:45:51 +02:00			`break;`
diff-* --patch-with-raw This new flag outputs the diff-raw output and diff-patch output at the same time. Requested by Cogito. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 02:36:53 +02:00			`case DIFF_FORMAT_NAME:`
			`diff_flush_name(p,`
			`inter_name_termination,`
			`line_termination);`
			`break;`
			`case DIFF_FORMAT_NO_OUTPUT:`
			`break;`
			`}`
			`}`
			`}`

			`void diff_flush(struct diff_options *options)`
			`{`
			`struct diff_queue_struct *q = &diff_queued_diff;`
			`int i;`
			`int diff_output_format = options->output_format;`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`struct diffstat_t *diffstat = NULL;`

diff-options: add --patch-with-stat With this option, git prepends a diffstat in front of the patch. Since I really, really do not know what a diffstat of a combined diff ("merge diff") should look like, the diffstat is not generated for these. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-15 13:41:18 +02:00			`if (diff_output_format == DIFF_FORMAT_DIFFSTAT \|\| options->with_stat) {`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`diffstat = xcalloc(sizeof (struct diffstat_t), 1);`
			`diffstat->xm.consume = diffstat_consume;`
			`}`
diff-* --patch-with-raw This new flag outputs the diff-raw output and diff-patch output at the same time. Requested by Cogito. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 02:36:53 +02:00
			`if (options->with_raw) {`
			`for (i = 0; i < q->nr; i++) {`
			`struct diff_filepair *p = q->queue[i];`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`flush_one_pair(p, DIFF_FORMAT_RAW, options, NULL);`
[PATCH] Fix diff-pruning logic which was running prune too early. For later stages to reorder patches, pruning logic and rename detection logic should not decide which delete to discard (because another entry said it will take over the file as a rename) until the very end. Also fix some tests that were assuming the earlier "last one is rename or keep everything else is copy" semantics of diff-raw format, which no longer is true. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 03:14:03 +02:00			`}`
Separate the raw diff and patch with a newline More friendly for human reading I believe, and possibly friendlier to some parsers (although only by an epsilon). Signed-off-by: Petr Baudis <pasky@suse.cz> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 13:30:46 +02:00			`putchar(options->line_termination);`
diff-* --patch-with-raw This new flag outputs the diff-raw output and diff-patch output at the same time. Requested by Cogito. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 02:36:53 +02:00			`}`
diff-options: add --patch-with-stat With this option, git prepends a diffstat in front of the patch. Since I really, really do not know what a diffstat of a combined diff ("merge diff") should look like, the diffstat is not generated for these. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-15 13:41:18 +02:00			`if (options->with_stat) {`
			`for (i = 0; i < q->nr; i++) {`
			`struct diff_filepair *p = q->queue[i];`
			`flush_one_pair(p, DIFF_FORMAT_DIFFSTAT, options,`
			`diffstat);`
			`}`
			`show_stats(diffstat);`
			`free(diffstat);`
			`diffstat = NULL;`
			`putchar(options->line_termination);`
			`}`
diff-* --patch-with-raw This new flag outputs the diff-raw output and diff-patch output at the same time. Requested by Cogito. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-11 02:36:53 +02:00			`for (i = 0; i < q->nr; i++) {`
			`struct diff_filepair *p = q->queue[i];`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00			`flush_one_pair(p, diff_output_format, options, diffstat);`
diff_flush(): leakfix. We were leaking filepairs when output-format was set to NO_OUTPUT. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-05 11:06:49 +02:00			`diff_free_filepair(p);`
[PATCH] possible memory leak in diff.c::diff_free_filepair() Here is a patch to fix the problem in the simplest way. 2005-08-21 09:14:16 +02:00			`}`
diff-options: add --stat (take 2) Now, you can say "git diff --stat" (to get an idea how many changes are uncommitted), or "git log --stat". Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-14 00:15:30 +02:00
			`if (diffstat) {`
			`show_stats(diffstat);`
			`free(diffstat);`
			`}`

[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`free(q->queue);`
			`q->queue = NULL;`
			`q->nr = q->alloc = 0;`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00			`}`

[PATCH] Add --diff-filter= output restriction to diff-* family. This is a halfway between debugging aid and a helper to write an ultra-smart merge scripts. The new option takes a string that consists of a list of "status" letters, and limits the diff output to only those classes of changes, with two exceptions: - A broken pair (aka "complete rewrite"), does not match D (deleted) or N (created). Use B to look for them. - The letter "A" in the diff-filter string does not match anything itself, but causes the entire diff that contains selected patches to be output (this behaviour is similar to that of --pickaxe-all for the -S option). For example, $ git-rev-list HEAD \| git-diff-tree --stdin -s -v -B -C --diff-filter=BCR shows a list of commits that have complete rewrite, copy, or rename. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-12 05:57:13 +02:00			`static void diffcore_apply_filter(const char *filter)`
			`{`
			`int i;`
			`struct diff_queue_struct *q = &diff_queued_diff;`
			`struct diff_queue_struct outq;`
			`outq.queue = NULL;`
			`outq.nr = outq.alloc = 0;`

			`if (!filter)`
			`return;`

Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`if (strchr(filter, DIFF_STATUS_FILTER_AON)) {`
[PATCH] Add --diff-filter= output restriction to diff-* family. This is a halfway between debugging aid and a helper to write an ultra-smart merge scripts. The new option takes a string that consists of a list of "status" letters, and limits the diff output to only those classes of changes, with two exceptions: - A broken pair (aka "complete rewrite"), does not match D (deleted) or N (created). Use B to look for them. - The letter "A" in the diff-filter string does not match anything itself, but causes the entire diff that contains selected patches to be output (this behaviour is similar to that of --pickaxe-all for the -S option). For example, $ git-rev-list HEAD \| git-diff-tree --stdin -s -v -B -C --diff-filter=BCR shows a list of commits that have complete rewrite, copy, or rename. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-12 05:57:13 +02:00			`int found;`
			`for (i = found = 0; !found && i < q->nr; i++) {`
			`struct diff_filepair *p = q->queue[i];`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00			`if (((p->status == DIFF_STATUS_MODIFIED) &&`
			`((p->score &&`
			`strchr(filter, DIFF_STATUS_FILTER_BROKEN)) \|\|`
			`(!p->score &&`
			`strchr(filter, DIFF_STATUS_MODIFIED)))) \|\|`
			`((p->status != DIFF_STATUS_MODIFIED) &&`
			`strchr(filter, p->status)))`
[PATCH] Add --diff-filter= output restriction to diff-* family. This is a halfway between debugging aid and a helper to write an ultra-smart merge scripts. The new option takes a string that consists of a list of "status" letters, and limits the diff output to only those classes of changes, with two exceptions: - A broken pair (aka "complete rewrite"), does not match D (deleted) or N (created). Use B to look for them. - The letter "A" in the diff-filter string does not match anything itself, but causes the entire diff that contains selected patches to be output (this behaviour is similar to that of --pickaxe-all for the -S option). For example, $ git-rev-list HEAD \| git-diff-tree --stdin -s -v -B -C --diff-filter=BCR shows a list of commits that have complete rewrite, copy, or rename. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-12 05:57:13 +02:00			`found++;`
			`}`
			`if (found)`
			`return;`

			`/* otherwise we will clear the whole queue`
			`* by copying the empty outq at the end of this`
			`* function, but first clear the current entries`
			`* in the queue.`
			`*/`
			`for (i = 0; i < q->nr; i++)`
			`diff_free_filepair(q->queue[i]);`
			`}`
			`else {`
			`/* Only the matching ones */`
			`for (i = 0; i < q->nr; i++) {`
			`struct diff_filepair *p = q->queue[i];`
Use symbolic constants for diff-raw status indicators. Both Cogito and StGIT prefer to see 'A' for new files. The current 'N' is visually harder to distinguish from 'M', which is used for modified files. Prepare the internals to use symbolic constants to make the change easier. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-25 22:05:44 +02:00
			`if (((p->status == DIFF_STATUS_MODIFIED) &&`
			`((p->score &&`
			`strchr(filter, DIFF_STATUS_FILTER_BROKEN)) \|\|`
			`(!p->score &&`
			`strchr(filter, DIFF_STATUS_MODIFIED)))) \|\|`
			`((p->status != DIFF_STATUS_MODIFIED) &&`
			`strchr(filter, p->status)))`
[PATCH] Add --diff-filter= output restriction to diff-* family. This is a halfway between debugging aid and a helper to write an ultra-smart merge scripts. The new option takes a string that consists of a list of "status" letters, and limits the diff output to only those classes of changes, with two exceptions: - A broken pair (aka "complete rewrite"), does not match D (deleted) or N (created). Use B to look for them. - The letter "A" in the diff-filter string does not match anything itself, but causes the entire diff that contains selected patches to be output (this behaviour is similar to that of --pickaxe-all for the -S option). For example, $ git-rev-list HEAD \| git-diff-tree --stdin -s -v -B -C --diff-filter=BCR shows a list of commits that have complete rewrite, copy, or rename. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-12 05:57:13 +02:00			`diff_q(&outq, p);`
			`else`
			`diff_free_filepair(p);`
			`}`
			`}`
			`free(q->queue);`
			`*q = outq;`
			`}`

Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`void diffcore_std(struct diff_options *options)`
[PATCH] diff: consolidate various calls into diffcore. The three diff-* brothers had a sequence of calls into diffcore that were almost identical. Introduce a new diffcore_std() function that takes all the necessary arguments to consolidate it. This will make later enhancements and changing the order of diffcore application simpler. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-30 01:56:13 +02:00			`{`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`if (options->break_opt != -1)`
			`diffcore_break(options->break_opt);`
			`if (options->detect_rename)`
Diff: -l<num> to limit rename/copy detection. When many paths are modified, rename detection takes a lot of time. The new option -l<num> can be used to disable rename detection when more than <num> paths are possibly created as renames. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:18:27 +02:00			`diffcore_rename(options);`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`if (options->break_opt != -1)`
[PATCH] diff: Update -B heuristics. As Linus pointed out on the mailing list discussion, -B should break a files that has many inserts even if it still keeps enough of the original contents, so that the broken pieces can later be matched with other files by -M or -C. However, if such a broken pair does not get picked up by -M or -C, we would want to apply different criteria; namely, regardless of the amount of new material in the result, the determination of "rewrite" should be done by looking at the amount of original material still left in the result. If you still have the original 97 lines from a 100-line document, it does not matter if you add your own 13 lines to make a 110-line document, or if you add 903 lines to make a 1000-line document. It is not a rewrite but an in-place edit. On the other hand, if you did lose 97 lines from the original, it does not matter if you added 27 lines to make a 30-line document or if you added 997 lines to make a 1000-line document. You did a complete rewrite in either case. This patch introduces a post-processing phase that runs after diffcore-rename matches up broken pairs diffcore-break creates. The purpose of this post-processing is to pick up these broken pieces and merge them back into in-place modifications. For this, the score parameter -B option takes is changed into a pair of numbers, and it takes "-B99/80" format when fully spelled out. The first number is the minimum amount of "edit" (same definition as what diffcore-rename uses, which is "sum of deletion and insertion") that a modification needs to have to be broken, and the second number is the minimum amount of "delete" a surviving broken pair must have to avoid being merged back together. It can be abbreviated to "-B" to use default for both, "-B9" or "-B9/" to use 90% for "edit" but default (80%) for merge avoidance, or "-B/75" to use default (99%) "edit" and 75% for merge avoidance. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:40:28 +02:00			`diffcore_merge_broken();`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`if (options->pickaxe)`
			`diffcore_pickaxe(options->pickaxe, options->pickaxe_opts);`
			`if (options->orderfile)`
			`diffcore_order(options->orderfile);`
[PATCH] Add --diff-filter= output restriction to diff-* family. This is a halfway between debugging aid and a helper to write an ultra-smart merge scripts. The new option takes a string that consists of a list of "status" letters, and limits the diff output to only those classes of changes, with two exceptions: - A broken pair (aka "complete rewrite"), does not match D (deleted) or N (created). Use B to look for them. - The letter "A" in the diff-filter string does not match anything itself, but causes the entire diff that contains selected patches to be output (this behaviour is similar to that of --pickaxe-all for the -S option). For example, $ git-rev-list HEAD \| git-diff-tree --stdin -s -v -B -C --diff-filter=BCR shows a list of commits that have complete rewrite, copy, or rename. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-12 05:57:13 +02:00			`diff_resolve_rename_copy();`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`diffcore_apply_filter(options->filter);`
[PATCH] Add --diff-filter= output restriction to diff-* family. This is a halfway between debugging aid and a helper to write an ultra-smart merge scripts. The new option takes a string that consists of a list of "status" letters, and limits the diff output to only those classes of changes, with two exceptions: - A broken pair (aka "complete rewrite"), does not match D (deleted) or N (created). Use B to look for them. - The letter "A" in the diff-filter string does not match anything itself, but causes the entire diff that contains selected patches to be output (this behaviour is similar to that of --pickaxe-all for the -S option). For example, $ git-rev-list HEAD \| git-diff-tree --stdin -s -v -B -C --diff-filter=BCR shows a list of commits that have complete rewrite, copy, or rename. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-12 05:57:13 +02:00			`}`


Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`void diffcore_std_no_resolve(struct diff_options *options)`
[PATCH] Add --diff-filter= output restriction to diff-* family. This is a halfway between debugging aid and a helper to write an ultra-smart merge scripts. The new option takes a string that consists of a list of "status" letters, and limits the diff output to only those classes of changes, with two exceptions: - A broken pair (aka "complete rewrite"), does not match D (deleted) or N (created). Use B to look for them. - The letter "A" in the diff-filter string does not match anything itself, but causes the entire diff that contains selected patches to be output (this behaviour is similar to that of --pickaxe-all for the -S option). For example, $ git-rev-list HEAD \| git-diff-tree --stdin -s -v -B -C --diff-filter=BCR shows a list of commits that have complete rewrite, copy, or rename. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-12 05:57:13 +02:00			`{`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`if (options->pickaxe)`
			`diffcore_pickaxe(options->pickaxe, options->pickaxe_opts);`
			`if (options->orderfile)`
			`diffcore_order(options->orderfile);`
			`diffcore_apply_filter(options->filter);`
[PATCH] diff: consolidate various calls into diffcore. The three diff-* brothers had a sequence of calls into diffcore that were almost identical. Introduce a new diffcore_std() function that takes all the necessary arguments to consolidate it. This will make later enhancements and changing the order of diffcore application simpler. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-30 01:56:13 +02:00			`}`

Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`void diff_addremove(struct diff_options *options,`
			`int addremove, unsigned mode,`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00			`const unsigned char *sha1,`
			`const char base, const char path)`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`{`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00			`char concatpath[PATH_MAX];`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`struct diff_filespec one, two;`

			`/* This may look odd, but it is a preparation for`
			`* feeding "there are unchanged files which should`
			`* not produce diffs, but when you are doing copy`
			`* detection you would need them, so here they are"`
			`* entries to the diff-core. They will be prefixed`
			`* with something like '=' or '*' (I haven't decided`
			`* which but should not make any difference).`
[PATCH] Rename/copy detection fix. The rename/copy detection logic in earlier round was only good enough to show patch output and discussion on the mailing list about the diff-raw format updates revealed many problems with it. This patch fixes all the ones known to me, without making things I want to do later impossible, mostly related to patch reordering. (1) Earlier rename/copy detector determined which one is rename and which one is copy too early, which made it impossible to later introduce diffcore transformers to reorder patches. This patch fixes it by moving that logic to the very end of the processing. (2) Earlier output routine diff_flush() was pruning all the "no-change" entries indiscriminatingly. This was done due to my false assumption that one of the requirements in the diff-raw output was not to show such an entry (which resulted in my incorrect comment about "diff-helper never being able to be equivalent to built-in diff driver"). My special thanks go to Linus for correcting me about this. When we produce diff-raw output, for the downstream to be able to tell renames from copies, sometimes it _is_ necessary to output "no-change" entries, and this patch adds diffcore_prune() function for doing it. (3) Earlier diff_filepair structure was trying to be not too specific about rename/copy operations, but the purpose of the structure was to record one or two paths, which _was_ indeed about rename/copy. This patch discards xfrm_msg field which was trying to be generic for this wrong reason, and introduces a couple of fields (rename_score and rename_rank) that are explicitly specific to rename/copy logic. One thing to note is that the information in a single diff_filepair structure _still_ does not distinguish renames from copies, and it is deliberately so. This is to allow patches to be reordered in later stages. (4) This patch also adds some tests about diff-raw format output and makes sure that necessary "no-change" entries appear on the output. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-23 06:26:09 +02:00			`* Feeding the same new and old to diff_change()`
[PATCH] Fix diff-pruning logic which was running prune too early. For later stages to reorder patches, pruning logic and rename detection logic should not decide which delete to discard (because another entry said it will take over the file as a rename) until the very end. Also fix some tests that were assuming the earlier "last one is rename or keep everything else is copy" semantics of diff-raw format, which no longer is true. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-24 03:14:03 +02:00			`* also has the same effect.`
			`* Before the final output happens, they are pruned after`
			`* merged into rename/copy pairs as appropriate.`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`*/`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`if (options->reverse_diff)`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`addremove = (addremove == '+' ? '-' :`
			`addremove == '-' ? '+' : addremove);`
[PATCH] Simplify "reverse-diff" logic in the diff core. Instead of swapping the arguments just before output, this patch makes the swapping happen on the input side of the diff core, when "reverse-diff" is in effect. This greatly simplifies the logic, but more importantly it is necessary for upcoming "copy detection" work. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-20 18:54:07 +02:00
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`if (!path) path = "";`
			`sprintf(concatpath, "%s%s", base, path);`
			`one = alloc_filespec(concatpath);`
			`two = alloc_filespec(concatpath);`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`if (addremove != '+')`
			`fill_filespec(one, sha1, mode);`
			`if (addremove != '-')`
			`fill_filespec(two, sha1, mode);`
[PATCH] Detect renames in diff family. This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-19 12:32:35 +02:00
[PATCH] Prepare diffcore interface for diff-tree header supression. This does not actually supress the extra headers when pickaxe is used, but prepares enough support for diff-tree to implement it. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 04:40:36 +02:00			`diff_queue(&diff_queued_diff, one, two);`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`}`

Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`void diff_change(struct diff_options *options,`
			`unsigned old_mode, unsigned new_mode,`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00			`const unsigned char *old_sha1,`
			`const unsigned char *new_sha1,`
[PATCH] The diff-raw format updates. Update the diff-raw format as Linus and I discussed, except that it does not use sequence of underscore '_' letters to express nonexistence. All '0' mode is used for that purpose instead. The new diff-raw format can express rename/copy, and the earlier restriction that -M and -C _must_ be used with the patch format output is no longer necessary. The patch makes -M and -C flags independent of -p flag, so you need to say git-whatchanged -M -p to get the diff/patch format. Updated are both documentations and tests. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 04:42:18 +02:00			`const char base, const char path)`
			`{`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00			`char concatpath[PATH_MAX];`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`struct diff_filespec one, two;`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`if (options->reverse_diff) {`
[PATCH] Simplify "reverse-diff" logic in the diff core. Instead of swapping the arguments just before output, this patch makes the swapping happen on the input side of the diff core, when "reverse-diff" is in effect. This greatly simplifies the logic, but more importantly it is necessary for upcoming "copy detection" work. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-20 18:54:07 +02:00			`unsigned tmp;`
			`const unsigned char *tmp_c;`
			`tmp = old_mode; old_mode = new_mode; new_mode = tmp;`
			`tmp_c = old_sha1; old_sha1 = new_sha1; new_sha1 = tmp_c;`
			`}`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`if (!path) path = "";`
			`sprintf(concatpath, "%s%s", base, path);`
			`one = alloc_filespec(concatpath);`
			`two = alloc_filespec(concatpath);`
			`fill_filespec(one, old_sha1, old_mode);`
			`fill_filespec(two, new_sha1, new_mode);`

[PATCH] Prepare diffcore interface for diff-tree header supression. This does not actually supress the extra headers when pickaxe is used, but prepares enough support for diff-tree to implement it. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 04:40:36 +02:00			`diff_queue(&diff_queued_diff, one, two);`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00			`}`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`void diff_unmerge(struct diff_options *options,`
			`const char *path)`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00			`{`
[PATCH] Diffcore updates. This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 19:04:37 +02:00			`struct diff_filespec one, two;`
			`one = alloc_filespec(path);`
			`two = alloc_filespec(path);`
			`diff_queue(&diff_queued_diff, one, two);`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00			`}`