mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-15 05:33:04 +01:00

797 lines

19 KiB

C

Raw Normal View History

[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`/*`
			`* Copyright (C) 2005 Junio C Hamano`
			`*/`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00			`#include "cache.h"`
[PATCH] Make sq_expand() available as sq_quote(). A useful shell safety helper sq_expand() was hidden as a static function in diff.c. Extract it out and make it available as sq_quote(). Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-08 08:58:32 +02:00			`#include "quote.h"`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`#include "commit.h"`
[PATCH] Split external diff command interface to a separate file. With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 03:22:47 +02:00			`#include "diff.h"`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00			`#include "diffcore.h"`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`#include "revision.h"`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`#include "cache-tree.h"`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`#include "path-list.h"`
Make run_diff_index() use unpack_trees(), not read_tree() A plain "git commit" would still run lstat() a lot more than necessary, because wt_status_print() would cause the index to be repeatedly flushed and re-read by wt_read_cache(), and that would cause the CE_UPTODATE bit to be lost, resulting in the files in the index being lstat'ed three times each. The reason why wt-status.c ended up invalidating and re-reading the cache multiple times was that it uses "run_diff_index()", which in turn uses "read_tree()" to populate the index with both the old index and the tree we want to compare against. So this patch re-writes run_diff_index() to not use read_tree(), but instead use "unpack_trees()" to diff the index to a tree. That, in turn, means that we don't need to modify the index itself, which then means that we don't need to invalidate it and re-read it! This, together with the lstat() optimizations, means that "git commit" on the kernel tree really only needs to lstat() the index entries once. That noticeably cuts down on the cached timings. Best time before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.399s user 0m0.232s sys 0m0.164s Best time after: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.254s user 0m0.140s sys 0m0.112s so it's a noticeable improvement in addition to being a nice conceptual cleanup (it's really not that pretty that "run_diff_index()" dirties the index!) Doing an "strace -c" on it also shows that as it cuts the number of lstat() calls by two thirds, it goes from being lstat()-limited to being limited by getdents() (which is the readdir system call): Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 60.69 0.000704 0 69230 31 lstat 23.62 0.000274 0 5522 getdents 8.36 0.000097 0 5508 2638 open 2.59 0.000030 0 2869 close 2.50 0.000029 0 274 write 1.47 0.000017 0 2844 fstat After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 45.17 0.000276 0 5522 getdents 26.51 0.000162 0 23112 31 lstat 19.80 0.000121 0 5503 2638 open 4.91 0.000030 0 2864 close 1.48 0.000020 0 274 write 1.34 0.000018 0 2844 fstat ... It passes the test-suite for me, but this is another of one of those really core functions, and certainly pretty subtle, so.. NOTE! The Linux lstat() system call is really quite cheap when everything is cached, so the fact that this is quite noticeable on Linux is likely to mean that it is much more noticeable on other operating systems. I bet you'll see a much bigger performance improvement from this on Windows in particular. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-20 02:27:12 +01:00			`#include "unpack-trees.h"`
[PATCH] Diff overhaul, adding half of copy detection. This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-21 11:39:09 +02:00
Optimize diff-cache -p --cached This patch optimizes "diff-cache -p --cached" by avoiding to inflate blobs into temporary files when the blob recorded in the cache matches the corresponding file in the work tree. The file in the work tree is passed as the comparison source in such a case instead. This optimization kicks in only when we have already read the cache this optimization and this is deliberate. Especially, diff-tree does not use this code, because changes are contained in small number of files relative to the project size most of the time, and reading cache is so expensive for a large project that the cost of reading it outweighs the savings by not inflating blobs. Also this patch cleans up the structure passed from diff clients by removing one unused structure member. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:45:24 +02:00			`/*`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`* diff-files`
Optimize diff-cache -p --cached This patch optimizes "diff-cache -p --cached" by avoiding to inflate blobs into temporary files when the blob recorded in the cache matches the corresponding file in the work tree. The file in the work tree is passed as the comparison source in such a case instead. This optimization kicks in only when we have already read the cache this optimization and this is deliberate. Especially, diff-tree does not use this code, because changes are contained in small number of files relative to the project size most of the time, and reading cache is so expensive for a large project that the cost of reading it outweighs the savings by not inflating blobs. Also this patch cleans up the structure passed from diff clients by removing one unused structure member. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-04 10:45:24 +02:00			`*/`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`static int read_directory(const char path, struct path_list list)`
			`{`
			`DIR *dir;`
			`struct dirent *e;`

			`if (!(dir = opendir(path)))`
			`return error("Could not open directory %s", path);`

			`while ((e = readdir(dir)))`
			`if (strcmp(".", e->d_name) && strcmp("..", e->d_name))`
diff-lib.c: don't strdup twice The static function read_directory in diff-lib.c is only ever called with struct path_list lists with .strdup_paths turned on, i.e. path_list_insert will strdup the paths for us (again). Let's take advantage of that and stop doing it twice. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-07 20:19:08 +02:00			`path_list_insert(e->d_name, list);`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00
			`closedir(dir);`
			`return 0;`
			`}`

diff --no-index: support /dev/null as filename This allows us to create "new file" and "delete file" patches. It also cleans up the code. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:36:31 +01:00			`static int get_mode(const char path, int mode)`
			`{`
			`struct stat st;`

			`if (!path \|\| !strcmp(path, "/dev/null"))`
			`*mode = 0;`
			`else if (!strcmp(path, "-"))`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`*mode = create_ce_mode(0666);`
diff --no-index: support /dev/null as filename This allows us to create "new file" and "delete file" patches. It also cleans up the code. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:36:31 +01:00			`else if (stat(path, &st))`
			`return error("Could not access '%s'", path);`
			`else`
			`*mode = st.st_mode;`
			`return 0;`
			`}`

Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`static int queue_diff(struct diff_options *o,`
			`const char name1, const char name2)`
			`{`
			`int mode1 = 0, mode2 = 0;`

diff --no-index: support /dev/null as filename This allows us to create "new file" and "delete file" patches. It also cleans up the code. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:36:31 +01:00			`if (get_mode(name1, &mode1) \|\| get_mode(name2, &mode2))`
			`return -1;`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00
			`if (mode1 && mode2 && S_ISDIR(mode1) != S_ISDIR(mode2))`
			`return error("file/directory conflict: %s, %s", name1, name2);`

			`if (S_ISDIR(mode1) \|\| S_ISDIR(mode2)) {`
			`char buffer1[PATH_MAX], buffer2[PATH_MAX];`
			`struct path_list p1 = {NULL, 0, 0, 1}, p2 = {NULL, 0, 0, 1};`
			`int len1 = 0, len2 = 0, i1, i2, ret = 0;`

			`if (name1 && read_directory(name1, &p1))`
			`return -1;`
			`if (name2 && read_directory(name2, &p2)) {`
			`path_list_clear(&p1, 0);`
			`return -1;`
			`}`

			`if (name1) {`
			`len1 = strlen(name1);`
			`if (len1 > 0 && name1[len1 - 1] == '/')`
			`len1--;`
			`memcpy(buffer1, name1, len1);`
			`buffer1[len1++] = '/';`
			`}`

			`if (name2) {`
			`len2 = strlen(name2);`
			`if (len2 > 0 && name2[len2 - 1] == '/')`
			`len2--;`
			`memcpy(buffer2, name2, len2);`
			`buffer2[len2++] = '/';`
			`}`

			`for (i1 = i2 = 0; !ret && (i1 < p1.nr \|\| i2 < p2.nr); ) {`
			`const char n1, n2;`
			`int comp;`

			`if (i1 == p1.nr)`
			`comp = 1;`
			`else if (i2 == p2.nr)`
			`comp = -1;`
			`else`
			`comp = strcmp(p1.items[i1].path,`
			`p2.items[i2].path);`

			`if (comp > 0)`
			`n1 = NULL;`
			`else {`
			`n1 = buffer1;`
			`strncpy(buffer1 + len1, p1.items[i1++].path,`
			`PATH_MAX - len1);`
			`}`

			`if (comp < 0)`
			`n2 = NULL;`
			`else {`
			`n2 = buffer2;`
			`strncpy(buffer2 + len2, p2.items[i2++].path,`
			`PATH_MAX - len2);`
			`}`

			`ret = queue_diff(o, n1, n2);`
			`}`
			`path_list_clear(&p1, 0);`
			`path_list_clear(&p2, 0);`

			`return ret;`
			`} else {`
			`struct diff_filespec d1, d2;`

Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`if (DIFF_OPT_TST(o, REVERSE_DIFF)) {`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`unsigned tmp;`
			`const char *tmp_c;`
			`tmp = mode1; mode1 = mode2; mode2 = tmp;`
			`tmp_c = name1; name1 = name2; name2 = tmp_c;`
			`}`

			`if (!name1)`
			`name1 = "/dev/null";`
			`if (!name2)`
			`name2 = "/dev/null";`
			`d1 = alloc_filespec(name1);`
			`d2 = alloc_filespec(name2);`
			`fill_filespec(d1, null_sha1, mode1);`
			`fill_filespec(d2, null_sha1, mode2);`

			`diff_queue(&diff_queued_diff, d1, d2);`
			`return 0;`
			`}`
			`}`

Do not default to --no-index when given two directories. git-diff -- a/ b/ always defaulted to --no-index, primarily because the function is_in_index() was implemented quite incorrectly. Noticed by Patrick Maaß and Simon Schubert independently, initial patch was provided by Patrick but I fixed it differently. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-13 12:23:20 +02:00			`/*`
			`* Does the path name a blob in the working tree, or a directory`
			`* in the working tree?`
			`*/`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`static int is_in_index(const char *path)`
			`{`
Do not default to --no-index when given two directories. git-diff -- a/ b/ always defaulted to --no-index, primarily because the function is_in_index() was implemented quite incorrectly. Noticed by Patrick Maaß and Simon Schubert independently, initial patch was provided by Patrick but I fixed it differently. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-13 12:23:20 +02:00			`int len, pos;`
			`struct cache_entry *ce;`

			`len = strlen(path);`
			`while (path[len-1] == '/')`
			`len--;`
			`if (!len)`
			`return 1; /* "." */`
			`pos = cache_name_pos(path, len);`
			`if (0 <= pos)`
			`return 1;`
			`pos = -1 - pos;`
			`while (pos < active_nr) {`
			`ce = active_cache[pos++];`
			`if (ce_namelen(ce) <= len \|\|`
			`strncmp(ce->name, path, len) \|\|`
			`(ce->name[len] > '/'))`
			`break; /* path cannot be a prefix */`
			`if (ce->name[len] == '/')`
			`return 1;`
			`}`
			`return 0;`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`}`

			`static int handle_diff_files_args(struct rev_info *revs,`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`int argc, const char **argv,`
			`unsigned int *options)`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`{`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`*options = 0;`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00
			`/* revs->max_count == -2 means --no-index */`
			`while (1 < argc && argv[1][0] == '-') {`
			`if (!strcmp(argv[1], "--base"))`
			`revs->max_count = 1;`
			`else if (!strcmp(argv[1], "--ours"))`
			`revs->max_count = 2;`
			`else if (!strcmp(argv[1], "--theirs"))`
			`revs->max_count = 3;`
			`else if (!strcmp(argv[1], "-n") \|\|`
Allow git-diff exit with codes similar to diff(1) This introduces a new command-line option: --exit-code. The diff programs will return 1 for differences, return 0 for equality, and something else for errors. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-14 01:17:04 +01:00			`!strcmp(argv[1], "--no-index")) {`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`revs->max_count = -2;`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`DIFF_OPT_SET(&revs->diffopt, EXIT_WITH_STATUS);`
			`DIFF_OPT_SET(&revs->diffopt, NO_INDEX);`
Allow git-diff exit with codes similar to diff(1) This introduces a new command-line option: --exit-code. The diff programs will return 1 for differences, return 0 for equality, and something else for errors. Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-14 01:17:04 +01:00			`}`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`else if (!strcmp(argv[1], "-q"))`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`*options \|= DIFF_SILENT_ON_REMOVED;`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`else`
			`return error("invalid option: %s", argv[1]);`
			`argv++; argc--;`
			`}`

			`if (revs->max_count == -1 && revs->diffopt.nr_paths == 2) {`
			`/*`
			`* If two files are specified, and at least one is untracked,`
			`* default to no-index.`
			`*/`
			`read_cache();`
			`if (!is_in_index(revs->diffopt.paths[0]) \|\|`
diff: squelch empty diffs even more When we compare two non-tracked files, or explicitly specify --no-index, the suggestion to run git-status is not helpful. The patch adds a new diff_options bitfield member, no_index, that is used instead of the special value of -2 of the rev_info field max_count to indicate that the index is not to be used. This makes it possible to pass that flag down to diffcore_skip_stat_unmatch(), which only has one diff_options parameter. This could even become a cleanup if we removed all assignments of max_count to a value of -2 (viz. replacement of a magic value with a self-documenting field name) but I didn't dare to do that so late in the rc game.. The no_index bit, if set, then tells diffcore_skip_stat_unmatch() to not account for any skipped stat-mismatches, which avoids the suggestion to run git-status. Signed-off-by: Rene Scharfe <rene.scharfe@lsfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-15 00:41:00 +02:00			`!is_in_index(revs->diffopt.paths[1])) {`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`revs->max_count = -2;`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`DIFF_OPT_SET(&revs->diffopt, NO_INDEX);`
diff: squelch empty diffs even more When we compare two non-tracked files, or explicitly specify --no-index, the suggestion to run git-status is not helpful. The patch adds a new diff_options bitfield member, no_index, that is used instead of the special value of -2 of the rev_info field max_count to indicate that the index is not to be used. This makes it possible to pass that flag down to diffcore_skip_stat_unmatch(), which only has one diff_options parameter. This could even become a cleanup if we removed all assignments of max_count to a value of -2 (viz. replacement of a magic value with a self-documenting field name) but I didn't dare to do that so late in the rc game.. The no_index bit, if set, then tells diffcore_skip_stat_unmatch() to not account for any skipped stat-mismatches, which avoids the suggestion to run git-status. Signed-off-by: Rene Scharfe <rene.scharfe@lsfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-15 00:41:00 +02:00			`}`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`}`

			`/*`
			`* Make sure there are NO revision (i.e. pending object) parameter,`
			`* rev.max_count is reasonable (0 <= n <= 3),`
			`* there is no other revision filtering parameters.`
			`*/`
			`if (revs->pending.nr \|\| revs->max_count > 3 \|\|`
			`revs->min_age != -1 \|\| revs->max_age != -1)`
			`return error("no revision allowed with diff-files");`

			`if (revs->max_count == -1 &&`
			`(revs->diffopt.output_format & DIFF_FORMAT_PATCH))`
			`revs->combine_merges = revs->dense_combined_merges = 1;`

			`return 0;`
			`}`

diff: make more cases implicit --no-index When specifying an absolute path, or a relative path pointing outside the working tree, do not fail, but roll your own diffopt parsing, and execute a --no-index diff. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:35:27 +01:00			`static int is_outside_repo(const char path, int nongit, const char prefix)`
			`{`
			`int i;`
Use is_absolute_path() in diff-lib.c, lockfile.c, setup.c, trace.c Using the helper function to test for absolute paths makes porting easier. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-25 23:29:03 +01:00			`if (nongit \|\| !strcmp(path, "-") \|\| is_absolute_path(path))`
diff: make more cases implicit --no-index When specifying an absolute path, or a relative path pointing outside the working tree, do not fail, but roll your own diffopt parsing, and execute a --no-index diff. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:35:27 +01:00			`return 1;`
			`if (prefixcmp(path, "../"))`
			`return 0;`
			`if (!prefix)`
			`return 1;`
			`for (i = strlen(prefix); !prefixcmp(path, "../"); ) {`
			`while (i > 0 && prefix[i - 1] != '/')`
			`i--;`
			`if (--i < 0)`
			`return 1;`
			`path += 3;`
			`}`
			`return 0;`
			`}`

			`int setup_diff_no_index(struct rev_info *revs,`
			`int argc, const char ** argv, int nongit, const char *prefix)`
			`{`
			`int i;`
			`for (i = 1; i < argc; i++)`
diff: support reading a file from stdin via "-" This allows you to say echo Hello World \| git diff x - to compare the contents of file "x" with the line "Hello World". This automatically switches to --no-index mode. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:36:10 +01:00			`if (argv[i][0] != '-' \|\| argv[i][1] == '\0')`
diff: make more cases implicit --no-index When specifying an absolute path, or a relative path pointing outside the working tree, do not fail, but roll your own diffopt parsing, and execute a --no-index diff. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:35:27 +01:00			`break;`
			`else if (!strcmp(argv[i], "--")) {`
			`i++;`
			`break;`
			`} else if (i < argc - 3 && !strcmp(argv[i], "--no-index")) {`
			`i = argc - 3;`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`DIFF_OPT_SET(&revs->diffopt, EXIT_WITH_STATUS);`
diff: make more cases implicit --no-index When specifying an absolute path, or a relative path pointing outside the working tree, do not fail, but roll your own diffopt parsing, and execute a --no-index diff. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:35:27 +01:00			`break;`
			`}`
			`if (argc != i + 2 \|\| (!is_outside_repo(argv[i + 1], nongit, prefix) &&`
			`!is_outside_repo(argv[i], nongit, prefix)))`
			`return -1;`

			`diff_setup(&revs->diffopt);`
			`for (i = 1; i < argc - 2; )`
			`if (!strcmp(argv[i], "--no-index"))`
			`i++;`
			`else {`
			`int j = diff_opt_parse(&revs->diffopt,`
			`argv + i, argc - i);`
			`if (!j)`
			`die("invalid diff option/value: %s", argv[i]);`
			`i += j;`
			`}`
diff-ni: allow running from a subdirectory. When run from a subdirectory of a repository, the command forgot to adjust paths given to it with prefix. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-04 08:45:14 +01:00
			`if (prefix) {`
			`int len = strlen(prefix);`

			`revs->diffopt.paths = xcalloc(2, sizeof(char*));`
			`for (i = 0; i < 2; i++) {`
diff-ni: fix the diff with standard input The earlier commit to read from stdin was full of problems, and this corrects them. - The mode bits should have been set to satisify S_ISREG(); we forgot to the S_IFREG bits and hardcoded 0644; - We did not give escape hatch to name a path whose name is really "-". Allow users to say "./-" for that; - Use of xread() was not prepared to see short read (e.g. reading from tty) nor handing read errors. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-04 09:17:27 +01:00			`const char *p = argv[argc - 2 + i];`
			`/*`
			`* stdin should be spelled as '-'; if you have`
			`* path that is '-', spell it as ./-.`
			`*/`
			`p = (strcmp(p, "-")`
			`? xstrdup(prefix_filename(prefix, len, p))`
			`: p);`
			`revs->diffopt.paths[i] = p;`
diff-ni: allow running from a subdirectory. When run from a subdirectory of a repository, the command forgot to adjust paths given to it with prefix. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-04 08:45:14 +01:00			`}`
			`}`
			`else`
			`revs->diffopt.paths = argv + argc - 2;`
diff: make more cases implicit --no-index When specifying an absolute path, or a relative path pointing outside the working tree, do not fail, but roll your own diffopt parsing, and execute a --no-index diff. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:35:27 +01:00			`revs->diffopt.nr_paths = 2;`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`DIFF_OPT_SET(&revs->diffopt, NO_INDEX);`
diff: make more cases implicit --no-index When specifying an absolute path, or a relative path pointing outside the working tree, do not fail, but roll your own diffopt parsing, and execute a --no-index diff. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:35:27 +01:00			`revs->max_count = -2;`
diff --no-index: do not forget to run diff_setup_done() Code inspection by Linus found this. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-14 21:12:32 +02:00			`if (diff_setup_done(&revs->diffopt) < 0)`
			`die("diff_setup_done failed");`
diff: make more cases implicit --no-index When specifying an absolute path, or a relative path pointing outside the working tree, do not fail, but roll your own diffopt parsing, and execute a --no-index diff. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:35:27 +01:00			`return 0;`
			`}`

Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`int run_diff_files_cmd(struct rev_info revs, int argc, const char *argv)`
			`{`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`unsigned int options;`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`if (handle_diff_files_args(revs, argc, argv, &options))`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`return -1;`

Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`if (DIFF_OPT_TST(&revs->diffopt, NO_INDEX)) {`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`if (revs->diffopt.nr_paths != 2)`
			`return error("need two files/directories with --no-index");`
diff --no-index: also imitate the exit status of diff(1) diff sets the exit status to 0 when no changes were found, to 1 when changes were found, and 2 means error. We imitate this to be able to use "git diff" in the test scripts. (Actually, keeping in line with the rest of git, -1 is returned on error, which corresponds to an exit status 255). To find out if the diff is not empty, a member called "found_changes" was introduced in struct diff_options, which is set in builtin_diff() and fn_out_consume(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:34:54 +01:00			`if (queue_diff(&revs->diffopt, revs->diffopt.paths[0],`
			`revs->diffopt.paths[1]))`
			`return -1;`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`diffcore_std(&revs->diffopt);`
			`diff_flush(&revs->diffopt);`
diff --no-index: also imitate the exit status of diff(1) diff sets the exit status to 0 when no changes were found, to 1 when changes were found, and 2 means error. We imitate this to be able to use "git diff" in the test scripts. (Actually, keeping in line with the rest of git, -1 is returned on error, which corresponds to an exit status 255). To find out if the diff is not empty, a member called "found_changes" was introduced in struct diff_options, which is set in builtin_diff() and fn_out_consume(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-25 23:34:54 +01:00			`/*`
			`* The return code for --no-index imitates diff(1):`
			`* 0 = no changes, 1 = changes, else error`
			`*/`
			`return revs->diffopt.found_changes;`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`}`

Evil Merge branch 'jc/status' (early part) into js/diff-ni * 'jc/status' (early part): run_diff_{files,index}(): update calling convention. update-index: do not die too early in a read-only repository. git-status: do not be totally useless in a read-only repository. This is to resolve semantic conflict (which is not textual) that changes the calling convention of run_diff_files() early. 2007-02-24 11:20:13 +01:00			`if (read_cache() < 0) {`
			`perror("read_cache");`
			`return -1;`
			`}`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`return run_diff_files(revs, options);`
Teach git-diff-files the new option `--no-index` With this flag and given two paths, git-diff-files behaves as a GNU diff lookalike (plus the git goodies like --check, colour, etc.). This flag is also available in git-diff. It also works outside of a git repository. In addition, if git-diff{,-files} is called without revision or stage parameter, and with exactly two paths at least one of which is not tracked, the default is --no-index. So, you can now say git diff /etc/inittab /etc/fstab and it actually works! This also unifies the duplicated argument parsing between cmd_diff_files() and builtin_diff_files(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-22 21:50:10 +01:00			`}`

ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`int run_diff_files(struct rev_info *revs, unsigned int option)`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`{`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`int entries, i;`
			`int diff_unmerged_stage = revs->max_count;`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`int silent_on_removed = option & DIFF_SILENT_ON_REMOVED;`
git-add: make the entry stat-clean after re-adding the same contents Earlier in commit 0781b8a9b2fe760fc4ed519a3a26e4b9bd6ccffe (add_file_to_index: skip rehashing if the cached stat already matches), add_file_to_index() were taught not to re-add the path if it already matches the index. The change meant well, but was not executed quite right. It used ie_modified() to see if the file on the work tree is really different from the index, and skipped adding the contents if the function says "not modified". This was wrong. There are three possible comparison results between the index and the file in the work tree: - with lstat(2) we _know_ they are different. E.g. if the length or the owner in the cached stat information is different from the length we just obtained from lstat(2), we can tell the file is modified without looking at the actual contents. - with lstat(2) we _know_ they are the same. The same length, the same owner, the same everything (but this has a twist, as described below). - we cannot tell from lstat(2) information alone and need to go to the filesystem to actually compare. The last case arises from what we call 'racy git' situation, that can be caused with this sequence: $ echo hello >file $ git add file $ echo aeiou >file ;# the same length If the second "echo" is done within the same filesystem timestamp granularity as the first "echo", then the timestamp recorded by "git add" and the timestamp we get from lstat(2) will be the same, and we can mistakenly say the file is not modified. The path is called 'racily clean'. We need to reliably detect racily clean paths are in fact modified. To solve this problem, when we write out the index, we mark the index entry that has the same timestamp as the index file itself (that is the time from the point of view of the filesystem) to tell any later code that does the lstat(2) comparison not to trust the cached stat info, and ie_modified() then actually goes to the filesystem to compare the contents for such a path. That's all good, but it should not be used for this "git add" optimization, as the goal of "git add" is to actually update the path in the index and make it stat-clean. With the false optimization, we did _not_ cause any data loss (after all, what we failed to do was only to update the cached stat information), but it made the following sequence leave the file stat dirty: $ echo hello >file $ git add file $ echo hello >file ;# the same contents $ git add file The solution is not to use ie_modified() which goes to the filesystem to see if it is really clean, but instead use ie_match_stat() with "assume racily clean paths are dirty" option, to force re-adding of such a path. There was another problem with "git add -u". The codepath shares the same issue when adding the paths that are found to be modified, but in addition, it asked "git diff-files" machinery run_diff_files() function (which is "git diff-files") to list the paths that are modified. But "git diff-files" machinery uses the same ie_modified() call so that it does not report racily clean _and_ actually clean paths as modified, which is not what we want. The patch allows the callers of run_diff_files() to pass the same "assume racily clean paths are dirty" option, and makes "git-add -u" codepath to use that option, to discover and re-add racily clean _and_ actually clean paths. We could further optimize on top of this patch to differentiate the case where the path really needs re-adding (i.e. the content of the racily clean entry was indeed different) and the case where only the cached stat information needs to be refreshed (i.e. the racily clean entry was actually clean), but I do not think it is worth it. This patch applies to maint and all the way up. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 03:22:52 +01:00			`unsigned ce_option = ((option & DIFF_RACY_IS_MODIFIED)`
			`? CE_MATCH_RACY_IS_DIRTY : 0);`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`if (diff_unmerged_stage < 0)`
			`diff_unmerged_stage = 2;`
run_diff_{files,index}(): update calling convention. They used to open and read index themselves, but they now expect their callers to do so. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-10 03:51:40 +01:00			`entries = active_nr;`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`for (i = 0; i < entries; i++) {`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00			`struct stat st;`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`unsigned int oldmode, newmode;`
			`struct cache_entry *ce = active_cache[i];`
			`int changed;`
Make git diff-generation use a simpler spawn-like interface Instead of depending of fork() and execve() and doing things in between the two, make the git diff functions do everything up front, and then do a single "spawn_prog()" invocation to run the actual external diff program (if any is even needed). This actually ends up simplifying the code, and should make it much easier to make it efficient under broken operating systems (read: Windows). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-27 00:51:24 +01:00
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`if (DIFF_OPT_TST(&revs->diffopt, QUIET) &&`
			`DIFF_OPT_TST(&revs->diffopt, HAS_CHANGES))`
Teach --quiet to diff backends. This teaches git-diff-files, git-diff-index and git-diff-tree backends to exit early under --quiet option. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-14 19:12:51 +01:00			`break;`

Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`if (!ce_path_match(ce, revs->prune_data))`
Make git diff-generation use a simpler spawn-like interface Instead of depending of fork() and execve() and doing things in between the two, make the git diff functions do everything up front, and then do a single "spawn_prog()" invocation to run the actual external diff program (if any is even needed). This actually ends up simplifying the code, and should make it much easier to make it efficient under broken operating systems (read: Windows). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-27 00:51:24 +01:00			`continue;`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`if (ce_stage(ce)) {`
Don't instantiate structures with FAMs. Since structures with `flexible array members' are an incomplete datatype ANSI C99 forbids creating instances of them. This patch removes such an instance from `diff-lib.c' and replaces it with a pointer to a `struct combine_diff_path'. Since all neccessary memory is allocated at once the number of calls to `xmalloc' is not increased. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-18 17:18:05 +02:00			`struct combine_diff_path *dpath;`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`int num_compare_stages = 0;`
Don't instantiate structures with FAMs. Since structures with `flexible array members' are an incomplete datatype ANSI C99 forbids creating instances of them. This patch removes such an instance from `diff-lib.c' and replaces it with a pointer to a `struct combine_diff_path'. Since all neccessary memory is allocated at once the number of calls to `xmalloc' is not increased. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-18 17:18:05 +02:00			`size_t path_len;`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00
Don't instantiate structures with FAMs. Since structures with `flexible array members' are an incomplete datatype ANSI C99 forbids creating instances of them. This patch removes such an instance from `diff-lib.c' and replaces it with a pointer to a `struct combine_diff_path'. Since all neccessary memory is allocated at once the number of calls to `xmalloc' is not increased. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-18 17:18:05 +02:00			`path_len = ce_namelen(ce);`

diff --cc: fix display of symlink conflicts during a merge. "git-diff-files --cc" to show conflicts during merge did not pass the correct mode information for the working tree down, and showed bogus combined diff. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 07:24:47 +01:00			`dpath = xmalloc(combine_diff_path_size(5, path_len));`
Don't instantiate structures with FAMs. Since structures with `flexible array members' are an incomplete datatype ANSI C99 forbids creating instances of them. This patch removes such an instance from `diff-lib.c' and replaces it with a pointer to a `struct combine_diff_path'. Since all neccessary memory is allocated at once the number of calls to `xmalloc' is not increased. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-18 17:18:05 +02:00			`dpath->path = (char *) &(dpath->parent[5]);`

			`dpath->next = NULL;`
			`dpath->len = path_len;`
			`memcpy(dpath->path, ce->name, path_len);`
			`dpath->path[path_len] = '\0';`
Convert memset(hash,0,20) to hashclr(hash). In the same spirit as hashcmp() and hashcpy(). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-23 22:57:23 +02:00			`hashclr(dpath->sha1);`
Don't instantiate structures with FAMs. Since structures with `flexible array members' are an incomplete datatype ANSI C99 forbids creating instances of them. This patch removes such an instance from `diff-lib.c' and replaces it with a pointer to a `struct combine_diff_path'. Since all neccessary memory is allocated at once the number of calls to `xmalloc' is not increased. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-18 17:18:05 +02:00			`memset(&(dpath->parent[0]), 0,`
diff --cc: fix display of symlink conflicts during a merge. "git-diff-files --cc" to show conflicts during merge did not pass the correct mode information for the working tree down, and showed bogus combined diff. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 07:24:47 +01:00			`sizeof(struct combine_diff_parent)*5);`

			`if (lstat(ce->name, &st) < 0) {`
			`if (errno != ENOENT && errno != ENOTDIR) {`
			`perror(ce->name);`
			`continue;`
			`}`
			`if (silent_on_removed)`
			`continue;`
			`}`
			`else`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`dpath->mode = ce_mode_from_stat(ce, st.st_mode);`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00
			`while (i < entries) {`
			`struct cache_entry *nce = active_cache[i];`
			`int stage;`

			`if (strcmp(ce->name, nce->name))`
			`break;`

			`/* Stage #2 (ours) is the first parent,`
			`* stage #3 (theirs) is the second.`
			`*/`
			`stage = ce_stage(nce);`
			`if (2 <= stage) {`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`int mode = nce->ce_mode;`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`num_compare_stages++;`
Convert memcpy(a,b,20) to hashcpy(a,b). This abstracts away the size of the hash values when copying them from memory location to memory location, much as the introduction of hashcmp abstracted away hash value comparsion. A few call sites were using char* rather than unsigned char* so I added the cast rather than open hashcpy to be void. This is a reasonable tradeoff as most call sites already use unsigned char and the existing hashcmp is also declared to be unsigned char*. [jc: Splitted the patch to "master" part, to be followed by a patch for merge-recursive.c which is not in "master" yet. Fixed the cast in the latter hunk to combine-diff.c which was wrong in the original. Also converted ones left-over in combine-diff.c, diff-lib.c and upload-pack.c ] Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-23 08:49:00 +02:00			`hashcpy(dpath->parent[stage-2].sha1, nce->sha1);`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`dpath->parent[stage-2].mode = ce_mode_from_stat(nce, mode);`
Don't instantiate structures with FAMs. Since structures with `flexible array members' are an incomplete datatype ANSI C99 forbids creating instances of them. This patch removes such an instance from `diff-lib.c' and replaces it with a pointer to a `struct combine_diff_path'. Since all neccessary memory is allocated at once the number of calls to `xmalloc' is not increased. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-18 17:18:05 +02:00			`dpath->parent[stage-2].status =`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`DIFF_STATUS_MODIFIED;`
			`}`

			`/* diff against the proper unmerged stage */`
			`if (stage == diff_unmerged_stage)`
			`ce = nce;`
			`i++;`
			`}`
			`/*`
			`* Compensate for loop update`
[PATCH] Optimize diff-tree -[CM] --stdin This attempts to optimize "diff-tree -[CM] --stdin", which compares successible tree pairs. This optimization does not make much sense for other commands in the diff-* brothers. When reading from --stdin and using rename/copy detection, the patch makes diff-tree to read the current index file first. This is done to reuse the optimization used by diff-cache in the non-cached case. Similarity estimator can avoid expanding a blob if the index says what is in the work tree has an exact copy of that blob already expanded. Another optimization the patch makes is to check only file sizes first to terminate similarity estimation early. In order for this to work, it needs a way to tell the size of the blob without expanding it. Since an obvious way of doing it, which is to keep all the blobs previously used in the memory, is too costly, it does so by keeping the filesize for each object it has already seen in memory. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:56:38 +02:00			`*/`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`i--;`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`if (revs->combine_merges && num_compare_stages == 2) {`
Don't instantiate structures with FAMs. Since structures with `flexible array members' are an incomplete datatype ANSI C99 forbids creating instances of them. This patch removes such an instance from `diff-lib.c' and replaces it with a pointer to a `struct combine_diff_path'. Since all neccessary memory is allocated at once the number of calls to `xmalloc' is not increased. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-18 17:18:05 +02:00			`show_combined_diff(dpath, 2,`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`revs->dense_combined_merges,`
			`revs);`
Don't instantiate structures with FAMs. Since structures with `flexible array members' are an incomplete datatype ANSI C99 forbids creating instances of them. This patch removes such an instance from `diff-lib.c' and replaces it with a pointer to a `struct combine_diff_path'. Since all neccessary memory is allocated at once the number of calls to `xmalloc' is not increased. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-18 17:18:05 +02:00			`free(dpath);`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`continue;`
rename/copy score parsing updates. Better variant, which handles stuff like "4.5%" and rejects "192.168.0.1". Additionally, make sure numbers are unsigned (I'm making them unsigned long just for the hell of it), to make sure that artificial wraparound scenarios don't cause harm. -hpa [jc: with this, -M100 changes its meaning back to 10%. People wanting to say "pure renames only" should now say -M100% or -M1.0; sounds a bit like an earthquake, but arguably things are more consistent this way ;-)] Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-21 23:17:12 +01:00			`}`
Don't instantiate structures with FAMs. Since structures with `flexible array members' are an incomplete datatype ANSI C99 forbids creating instances of them. This patch removes such an instance from `diff-lib.c' and replaces it with a pointer to a `struct combine_diff_path'. Since all neccessary memory is allocated at once the number of calls to `xmalloc' is not increased. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-18 17:18:05 +02:00			`free(dpath);`
			`dpath = NULL;`
[PATCH] diff: Clean up diff_scoreopt_parse(). This cleans up diff_scoreopt_parse() function that is used to parse the fractional notation -B, -C and -M option takes. The callers are modified to check for errors and complain. Earlier they silently ignored malformed input and falled back on the default. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:37:54 +02:00
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`/*`
			`* Show the diff for the 'ce' if we found the one`
			`* from the desired stage.`
			`*/`
diff-index --cached --raw: show tree entry on the LHS for unmerged entries. This updates the way diffcore represents an unmerged pair somewhat. It used to be that entries with mode=0 on both sides were used to represent an unmerged pair, but now it has an explicit flag. This is to allow diff-index --cached to report the entry from the tree when the path is unmerged in the index. This is used in updating "git reset <tree> -- <path>" to restore absense of the path in the index from the tree. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-05 10:25:18 +01:00			`diff_unmerge(&revs->diffopt, ce->name, 0, null_sha1);`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`if (ce_stage(ce) != diff_unmerged_stage)`
			`continue;`
[PATCH] diff: Update -B heuristics. As Linus pointed out on the mailing list discussion, -B should break a files that has many inserts even if it still keeps enough of the original contents, so that the broken pieces can later be matched with other files by -M or -C. However, if such a broken pair does not get picked up by -M or -C, we would want to apply different criteria; namely, regardless of the amount of new material in the result, the determination of "rewrite" should be done by looking at the amount of original material still left in the result. If you still have the original 97 lines from a 100-line document, it does not matter if you add your own 13 lines to make a 110-line document, or if you add 903 lines to make a 1000-line document. It is not a rewrite but an in-place edit. On the other hand, if you did lose 97 lines from the original, it does not matter if you added 27 lines to make a 30-line document or if you added 997 lines to make a 1000-line document. You did a complete rewrite in either case. This patch introduces a post-processing phase that runs after diffcore-rename matches up broken pairs diffcore-break creates. The purpose of this post-processing is to pick up these broken pieces and merge them back into in-place modifications. For this, the score parameter -B option takes is changed into a pair of numbers, and it takes "-B99/80" format when fully spelled out. The first number is the minimum amount of "edit" (same definition as what diffcore-rename uses, which is "sum of deletion and insertion") that a modification needs to have to be broken, and the second number is the minimum amount of "delete" a surviving broken pair must have to avoid being merged back together. It can be abbreviated to "-B" to use default for both, "-B9" or "-B9/" to use 90% for "edit" but default (80%) for merge avoidance, or "-B/75" to use default (99%) "edit" and 75% for merge avoidance. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 10:40:28 +02:00			`}`
[PATCH] Diffcore updates. This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-22 19:04:37 +02:00
Make run_diff_index() use unpack_trees(), not read_tree() A plain "git commit" would still run lstat() a lot more than necessary, because wt_status_print() would cause the index to be repeatedly flushed and re-read by wt_read_cache(), and that would cause the CE_UPTODATE bit to be lost, resulting in the files in the index being lstat'ed three times each. The reason why wt-status.c ended up invalidating and re-reading the cache multiple times was that it uses "run_diff_index()", which in turn uses "read_tree()" to populate the index with both the old index and the tree we want to compare against. So this patch re-writes run_diff_index() to not use read_tree(), but instead use "unpack_trees()" to diff the index to a tree. That, in turn, means that we don't need to modify the index itself, which then means that we don't need to invalidate it and re-read it! This, together with the lstat() optimizations, means that "git commit" on the kernel tree really only needs to lstat() the index entries once. That noticeably cuts down on the cached timings. Best time before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.399s user 0m0.232s sys 0m0.164s Best time after: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.254s user 0m0.140s sys 0m0.112s so it's a noticeable improvement in addition to being a nice conceptual cleanup (it's really not that pretty that "run_diff_index()" dirties the index!) Doing an "strace -c" on it also shows that as it cuts the number of lstat() calls by two thirds, it goes from being lstat()-limited to being limited by getdents() (which is the readdir system call): Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 60.69 0.000704 0 69230 31 lstat 23.62 0.000274 0 5522 getdents 8.36 0.000097 0 5508 2638 open 2.59 0.000030 0 2869 close 2.50 0.000029 0 274 write 1.47 0.000017 0 2844 fstat After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 45.17 0.000276 0 5522 getdents 26.51 0.000162 0 23112 31 lstat 19.80 0.000121 0 5503 2638 open 4.91 0.000030 0 2864 close 1.48 0.000020 0 274 write 1.34 0.000018 0 2844 fstat ... It passes the test-suite for me, but this is another of one of those really core functions, and certainly pretty subtle, so.. NOTE! The Linux lstat() system call is really quite cheap when everything is cached, so the fact that this is quite noticeable on Linux is likely to mean that it is much more noticeable on other operating systems. I bet you'll see a much bigger performance improvement from this on Windows in particular. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-20 02:27:12 +01:00			`if (ce_uptodate(ce))`
			`continue;`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`if (lstat(ce->name, &st) < 0) {`
			`if (errno != ENOENT && errno != ENOTDIR) {`
			`perror(ce->name);`
[PATCH] Fix the way diffcore-rename records unremoved source. Earier version of diffcore-rename used to keep unmodified filepair in its output so that the last stage of the processing that tells renames from copies can make all of rename/copy to copies. However this had a bad interaction with other diffcore filters that wanted to run after diffcore-rename, in that such unmodified filepair must be retained for proper distinction between renames and copies to happen. This patch fixes the problem by changing the way diffcore-rename records the information needed to distinguish "all are copies" case and "the last one is a rename" case. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-28 00:55:55 +02:00			`continue;`
			`}`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`if (silent_on_removed)`
			`continue;`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`diff_addremove(&revs->diffopt, '-', ce->ce_mode,`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`ce->sha1, ce->name, NULL);`
			`continue;`
[PATCH] Add --diff-filter= output restriction to diff-* family. This is a halfway between debugging aid and a helper to write an ultra-smart merge scripts. The new option takes a string that consists of a list of "status" letters, and limits the diff output to only those classes of changes, with two exceptions: - A broken pair (aka "complete rewrite"), does not match D (deleted) or N (created). Use B to look for them. - The letter "A" in the diff-filter string does not match anything itself, but causes the entire diff that contains selected patches to be output (this behaviour is similar to that of --pickaxe-all for the -S option). For example, $ git-rev-list HEAD \| git-diff-tree --stdin -s -v -B -C --diff-filter=BCR shows a list of commits that have complete rewrite, copy, or rename. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-12 05:57:13 +02:00			`}`
git-add: make the entry stat-clean after re-adding the same contents Earlier in commit 0781b8a9b2fe760fc4ed519a3a26e4b9bd6ccffe (add_file_to_index: skip rehashing if the cached stat already matches), add_file_to_index() were taught not to re-add the path if it already matches the index. The change meant well, but was not executed quite right. It used ie_modified() to see if the file on the work tree is really different from the index, and skipped adding the contents if the function says "not modified". This was wrong. There are three possible comparison results between the index and the file in the work tree: - with lstat(2) we _know_ they are different. E.g. if the length or the owner in the cached stat information is different from the length we just obtained from lstat(2), we can tell the file is modified without looking at the actual contents. - with lstat(2) we _know_ they are the same. The same length, the same owner, the same everything (but this has a twist, as described below). - we cannot tell from lstat(2) information alone and need to go to the filesystem to actually compare. The last case arises from what we call 'racy git' situation, that can be caused with this sequence: $ echo hello >file $ git add file $ echo aeiou >file ;# the same length If the second "echo" is done within the same filesystem timestamp granularity as the first "echo", then the timestamp recorded by "git add" and the timestamp we get from lstat(2) will be the same, and we can mistakenly say the file is not modified. The path is called 'racily clean'. We need to reliably detect racily clean paths are in fact modified. To solve this problem, when we write out the index, we mark the index entry that has the same timestamp as the index file itself (that is the time from the point of view of the filesystem) to tell any later code that does the lstat(2) comparison not to trust the cached stat info, and ie_modified() then actually goes to the filesystem to compare the contents for such a path. That's all good, but it should not be used for this "git add" optimization, as the goal of "git add" is to actually update the path in the index and make it stat-clean. With the false optimization, we did _not_ cause any data loss (after all, what we failed to do was only to update the cached stat information), but it made the following sequence leave the file stat dirty: $ echo hello >file $ git add file $ echo hello >file ;# the same contents $ git add file The solution is not to use ie_modified() which goes to the filesystem to see if it is really clean, but instead use ie_match_stat() with "assume racily clean paths are dirty" option, to force re-adding of such a path. There was another problem with "git add -u". The codepath shares the same issue when adding the paths that are found to be modified, but in addition, it asked "git diff-files" machinery run_diff_files() function (which is "git diff-files") to list the paths that are modified. But "git diff-files" machinery uses the same ie_modified() call so that it does not report racily clean _and_ actually clean paths as modified, which is not what we want. The patch allows the callers of run_diff_files() to pass the same "assume racily clean paths are dirty" option, and makes "git-add -u" codepath to use that option, to discover and re-add racily clean _and_ actually clean paths. We could further optimize on top of this patch to differentiate the case where the path really needs re-adding (i.e. the content of the racily clean entry was indeed different) and the case where only the cached stat information needs to be refreshed (i.e. the racily clean entry was actually clean), but I do not think it is worth it. This patch applies to maint and all the way up. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 03:22:52 +01:00			`changed = ce_match_stat(ce, &st, ce_option);`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`if (!changed && !DIFF_OPT_TST(&revs->diffopt, FIND_COPIES_HARDER))`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`continue;`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`oldmode = ce->ce_mode;`
			`newmode = ce_mode_from_stat(ce, st.st_mode);`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`diff_change(&revs->diffopt, oldmode, newmode,`
			`ce->sha1, (changed ? null_sha1 : ce->sha1),`
			`ce->name, NULL);`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00
[PATCH] Simplify "reverse-diff" logic in the diff core. Instead of swapping the arguments just before output, this patch makes the swapping happen on the input side of the diff core, when "reverse-diff" is in effect. This greatly simplifies the logic, but more importantly it is necessary for upcoming "copy detection" work. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-20 18:54:07 +02:00			`}`
Libify diff-files. This is the first installment to libify diff brothers. The updated diff-files uses revision.c::setup_revisions() infrastructure to parse its command line arguments, which means the pathname arguments are checked more strictly than before. The tests are adjusted to separate possibly missing paths from the rest of arguments with double-dashes, to show the kosher way. As Linus pointed out, renaming diff.c to diff-lib.c was simply stupid, so I am renaming it back. The new diff-lib.c is to contain pieces extracted from diff brothers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 08:57:45 +02:00			`diffcore_std(&revs->diffopt);`
			`diff_flush(&revs->diffopt);`
			`return 0;`
[PATCH] Reworked external diff interface. This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-27 18:21:00 +02:00			`}`
[PATCH] Diff-tree-helper take two. This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 18:25:05 +02:00
Libify diff-index. The second installment to libify diff brothers. The pathname arguments are checked more strictly than before because we now use the revision.c::setup_revisions() infrastructure. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 11:43:00 +02:00			`/*`
			`* diff-index`
			`*/`

			`/* A file entry went away or appeared */`
			`static void diff_index_show_file(struct rev_info *revs,`
			`const char *prefix,`
			`struct cache_entry *ce,`
			`unsigned char *sha1, unsigned int mode)`
			`{`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`diff_addremove(&revs->diffopt, prefix[0], mode,`
Libify diff-index. The second installment to libify diff brothers. The pathname arguments are checked more strictly than before because we now use the revision.c::setup_revisions() infrastructure. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 11:43:00 +02:00			`sha1, ce->name, NULL);`
			`}`

			`static int get_stat_data(struct cache_entry *ce,`
			`unsigned char **sha1p,`
			`unsigned int *modep,`
			`int cached, int match_missing)`
			`{`
			`unsigned char *sha1 = ce->sha1;`
			`unsigned int mode = ce->ce_mode;`

			`if (!cached) {`
			`static unsigned char no_sha1[20];`
			`int changed;`
			`struct stat st;`
			`if (lstat(ce->name, &st) < 0) {`
			`if (errno == ENOENT && match_missing) {`
			`*sha1p = sha1;`
			`*modep = mode;`
			`return 0;`
			`}`
			`return -1;`
			`}`
			`changed = ce_match_stat(ce, &st, 0);`
			`if (changed) {`
Do not take mode bits from index after type change. When we do not trust executable bit from lstat(2), we copied existing ce_mode bits without checking if the filesystem object is a regular file (which is the only thing we apply the "trust executable bit" business) nor if the blob in the index is a regular file (otherwise, we should do the same as registering a new regular file, which is to default non-executable). Noticed by Johannes Sixt. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-17 07:43:48 +01:00			`mode = ce_mode_from_stat(ce, st.st_mode);`
Libify diff-index. The second installment to libify diff brothers. The pathname arguments are checked more strictly than before because we now use the revision.c::setup_revisions() infrastructure. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 11:43:00 +02:00			`sha1 = no_sha1;`
			`}`
			`}`

			`*sha1p = sha1;`
			`*modep = mode;`
			`return 0;`
			`}`

			`static void show_new_file(struct rev_info *revs,`
			`struct cache_entry *new,`
			`int cached, int match_missing)`
			`{`
			`unsigned char *sha1;`
			`unsigned int mode;`

			`/* New file in the index: it might actually be different in`
			`* the working copy.`
			`*/`
			`if (get_stat_data(new, &sha1, &mode, cached, match_missing) < 0)`
			`return;`

			`diff_index_show_file(revs, "+", new, sha1, mode);`
			`}`

			`static int show_modified(struct rev_info *revs,`
			`struct cache_entry *old,`
			`struct cache_entry *new,`
			`int report_missing,`
			`int cached, int match_missing)`
			`{`
			`unsigned int mode, oldmode;`
			`unsigned char *sha1;`

			`if (get_stat_data(new, &sha1, &mode, cached, match_missing) < 0) {`
			`if (report_missing)`
			`diff_index_show_file(revs, "-", old,`
			`old->sha1, old->ce_mode);`
			`return -1;`
			`}`

diff-index --cc shows a 3-way diff between HEAD, index and working tree. This implements a 3-way diff between the HEAD commit, the state in the index, and the working directory. This is like the n-way diff for a merge, and uses much of the same code. It is invoked with the -c flag to git-diff-index, which it already accepted and did nothing with. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-04 13:38:40 +02:00			`if (revs->combine_merges && !cached &&`
			`(hashcmp(sha1, old->sha1) \|\| hashcmp(old->sha1, new->sha1))) {`
			`struct combine_diff_path *p;`
			`int pathlen = ce_namelen(new);`

			`p = xmalloc(combine_diff_path_size(2, pathlen));`
			`p->path = (char *) &p->parent[2];`
			`p->next = NULL;`
			`p->len = pathlen;`
			`memcpy(p->path, new->name, pathlen);`
			`p->path[pathlen] = 0;`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`p->mode = mode;`
diff-index --cc shows a 3-way diff between HEAD, index and working tree. This implements a 3-way diff between the HEAD commit, the state in the index, and the working directory. This is like the n-way diff for a merge, and uses much of the same code. It is invoked with the -c flag to git-diff-index, which it already accepted and did nothing with. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-04 13:38:40 +02:00			`hashclr(p->sha1);`
			`memset(p->parent, 0, 2 * sizeof(struct combine_diff_parent));`
			`p->parent[0].status = DIFF_STATUS_MODIFIED;`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`p->parent[0].mode = new->ce_mode;`
diff-index --cc shows a 3-way diff between HEAD, index and working tree. This implements a 3-way diff between the HEAD commit, the state in the index, and the working directory. This is like the n-way diff for a merge, and uses much of the same code. It is invoked with the -c flag to git-diff-index, which it already accepted and did nothing with. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-04 13:38:40 +02:00			`hashcpy(p->parent[0].sha1, new->sha1);`
			`p->parent[1].status = DIFF_STATUS_MODIFIED;`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`p->parent[1].mode = old->ce_mode;`
diff-index --cc shows a 3-way diff between HEAD, index and working tree. This implements a 3-way diff between the HEAD commit, the state in the index, and the working directory. This is like the n-way diff for a merge, and uses much of the same code. It is invoked with the -c flag to git-diff-index, which it already accepted and did nothing with. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-04 13:38:40 +02:00			`hashcpy(p->parent[1].sha1, old->sha1);`
			`show_combined_diff(p, 2, revs->dense_combined_merges, revs);`
			`free(p);`
			`return 0;`
			`}`

Libify diff-index. The second installment to libify diff brothers. The pathname arguments are checked more strictly than before because we now use the revision.c::setup_revisions() infrastructure. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 11:43:00 +02:00			`oldmode = old->ce_mode;`
Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length. Introduces global inline: hashcmp(const unsigned char sha1, const unsigned char sha2) Uses memcmp for comparison and returns the result based on the length of the hash name (a future runtime decision). Acked-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-17 20:54:57 +02:00			`if (mode == oldmode && !hashcmp(sha1, old->sha1) &&`
Make the diff_options bitfields be an unsigned with explicit masks. reverse_diff was a bit-value in disguise, it's merged in the flags now. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 20:05:14 +01:00			`!DIFF_OPT_TST(&revs->diffopt, FIND_COPIES_HARDER))`
Libify diff-index. The second installment to libify diff brothers. The pathname arguments are checked more strictly than before because we now use the revision.c::setup_revisions() infrastructure. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 11:43:00 +02:00			`return 0;`

			`diff_change(&revs->diffopt, oldmode, mode,`
			`old->sha1, sha1, old->name, NULL);`
			`return 0;`
			`}`

			`/*`
			`* This turns all merge entries into "stage 3". That guarantees that`
			`* when we read in the new tree (into "stage 1"), we won't lose sight`
			`* of the fact that we had unmerged entries.`
			`*/`
			`static void mark_merge_entries(void)`
			`{`
			`int i;`
			`for (i = 0; i < active_nr; i++) {`
			`struct cache_entry *ce = active_cache[i];`
			`if (!ce_stage(ce))`
			`continue;`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`ce->ce_flags \|= CE_STAGEMASK;`
Libify diff-index. The second installment to libify diff brothers. The pathname arguments are checked more strictly than before because we now use the revision.c::setup_revisions() infrastructure. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 11:43:00 +02:00			`}`
			`}`

Make run_diff_index() use unpack_trees(), not read_tree() A plain "git commit" would still run lstat() a lot more than necessary, because wt_status_print() would cause the index to be repeatedly flushed and re-read by wt_read_cache(), and that would cause the CE_UPTODATE bit to be lost, resulting in the files in the index being lstat'ed three times each. The reason why wt-status.c ended up invalidating and re-reading the cache multiple times was that it uses "run_diff_index()", which in turn uses "read_tree()" to populate the index with both the old index and the tree we want to compare against. So this patch re-writes run_diff_index() to not use read_tree(), but instead use "unpack_trees()" to diff the index to a tree. That, in turn, means that we don't need to modify the index itself, which then means that we don't need to invalidate it and re-read it! This, together with the lstat() optimizations, means that "git commit" on the kernel tree really only needs to lstat() the index entries once. That noticeably cuts down on the cached timings. Best time before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.399s user 0m0.232s sys 0m0.164s Best time after: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.254s user 0m0.140s sys 0m0.112s so it's a noticeable improvement in addition to being a nice conceptual cleanup (it's really not that pretty that "run_diff_index()" dirties the index!) Doing an "strace -c" on it also shows that as it cuts the number of lstat() calls by two thirds, it goes from being lstat()-limited to being limited by getdents() (which is the readdir system call): Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 60.69 0.000704 0 69230 31 lstat 23.62 0.000274 0 5522 getdents 8.36 0.000097 0 5508 2638 open 2.59 0.000030 0 2869 close 2.50 0.000029 0 274 write 1.47 0.000017 0 2844 fstat After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 45.17 0.000276 0 5522 getdents 26.51 0.000162 0 23112 31 lstat 19.80 0.000121 0 5503 2638 open 4.91 0.000030 0 2864 close 1.48 0.000020 0 274 write 1.34 0.000018 0 2844 fstat ... It passes the test-suite for me, but this is another of one of those really core functions, and certainly pretty subtle, so.. NOTE! The Linux lstat() system call is really quite cheap when everything is cached, so the fact that this is quite noticeable on Linux is likely to mean that it is much more noticeable on other operating systems. I bet you'll see a much bigger performance improvement from this on Windows in particular. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-20 02:27:12 +01:00			`/*`
			`* This gets a mix of an existing index and a tree, one pathname entry`
			`* at a time. The index entry may be a single stage-0 one, but it could`
			`* also be multiple unmerged entries (in which case idx_pos/idx_nr will`
			`* give you the position and number of entries in the index).`
			`*/`
			`static void do_oneway_diff(struct unpack_trees_options *o,`
			`struct cache_entry *idx,`
			`struct cache_entry *tree,`
			`int idx_pos, int idx_nr)`
Libify diff-index. The second installment to libify diff brothers. The pathname arguments are checked more strictly than before because we now use the revision.c::setup_revisions() infrastructure. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 11:43:00 +02:00			`{`
Make run_diff_index() use unpack_trees(), not read_tree() A plain "git commit" would still run lstat() a lot more than necessary, because wt_status_print() would cause the index to be repeatedly flushed and re-read by wt_read_cache(), and that would cause the CE_UPTODATE bit to be lost, resulting in the files in the index being lstat'ed three times each. The reason why wt-status.c ended up invalidating and re-reading the cache multiple times was that it uses "run_diff_index()", which in turn uses "read_tree()" to populate the index with both the old index and the tree we want to compare against. So this patch re-writes run_diff_index() to not use read_tree(), but instead use "unpack_trees()" to diff the index to a tree. That, in turn, means that we don't need to modify the index itself, which then means that we don't need to invalidate it and re-read it! This, together with the lstat() optimizations, means that "git commit" on the kernel tree really only needs to lstat() the index entries once. That noticeably cuts down on the cached timings. Best time before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.399s user 0m0.232s sys 0m0.164s Best time after: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.254s user 0m0.140s sys 0m0.112s so it's a noticeable improvement in addition to being a nice conceptual cleanup (it's really not that pretty that "run_diff_index()" dirties the index!) Doing an "strace -c" on it also shows that as it cuts the number of lstat() calls by two thirds, it goes from being lstat()-limited to being limited by getdents() (which is the readdir system call): Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 60.69 0.000704 0 69230 31 lstat 23.62 0.000274 0 5522 getdents 8.36 0.000097 0 5508 2638 open 2.59 0.000030 0 2869 close 2.50 0.000029 0 274 write 1.47 0.000017 0 2844 fstat After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 45.17 0.000276 0 5522 getdents 26.51 0.000162 0 23112 31 lstat 19.80 0.000121 0 5503 2638 open 4.91 0.000030 0 2864 close 1.48 0.000020 0 274 write 1.34 0.000018 0 2844 fstat ... It passes the test-suite for me, but this is another of one of those really core functions, and certainly pretty subtle, so.. NOTE! The Linux lstat() system call is really quite cheap when everything is cached, so the fact that this is quite noticeable on Linux is likely to mean that it is much more noticeable on other operating systems. I bet you'll see a much bigger performance improvement from this on Windows in particular. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-20 02:27:12 +01:00			`struct rev_info *revs = o->unpack_data;`
			`int match_missing, cached;`
Libified diff-index: backward compatibility fix. "diff-index -m" does not mean "do not ignore merges", but means "pretend missing files match the index". The previous round tried to address this, but failed because setup_revisions() ate "-m" flag before the caller had a chance to intervene. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 12:58:04 +02:00
War on whitespace This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-07 09:04:01 +02:00			`/*`
Libified diff-index: backward compatibility fix. "diff-index -m" does not mean "do not ignore merges", but means "pretend missing files match the index". The previous round tried to address this, but failed because setup_revisions() ate "-m" flag before the caller had a chance to intervene. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 12:58:04 +02:00			`* Backward compatibility wart - "diff-index -m" does`
Make run_diff_index() use unpack_trees(), not read_tree() A plain "git commit" would still run lstat() a lot more than necessary, because wt_status_print() would cause the index to be repeatedly flushed and re-read by wt_read_cache(), and that would cause the CE_UPTODATE bit to be lost, resulting in the files in the index being lstat'ed three times each. The reason why wt-status.c ended up invalidating and re-reading the cache multiple times was that it uses "run_diff_index()", which in turn uses "read_tree()" to populate the index with both the old index and the tree we want to compare against. So this patch re-writes run_diff_index() to not use read_tree(), but instead use "unpack_trees()" to diff the index to a tree. That, in turn, means that we don't need to modify the index itself, which then means that we don't need to invalidate it and re-read it! This, together with the lstat() optimizations, means that "git commit" on the kernel tree really only needs to lstat() the index entries once. That noticeably cuts down on the cached timings. Best time before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.399s user 0m0.232s sys 0m0.164s Best time after: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.254s user 0m0.140s sys 0m0.112s so it's a noticeable improvement in addition to being a nice conceptual cleanup (it's really not that pretty that "run_diff_index()" dirties the index!) Doing an "strace -c" on it also shows that as it cuts the number of lstat() calls by two thirds, it goes from being lstat()-limited to being limited by getdents() (which is the readdir system call): Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 60.69 0.000704 0 69230 31 lstat 23.62 0.000274 0 5522 getdents 8.36 0.000097 0 5508 2638 open 2.59 0.000030 0 2869 close 2.50 0.000029 0 274 write 1.47 0.000017 0 2844 fstat After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 45.17 0.000276 0 5522 getdents 26.51 0.000162 0 23112 31 lstat 19.80 0.000121 0 5503 2638 open 4.91 0.000030 0 2864 close 1.48 0.000020 0 274 write 1.34 0.000018 0 2844 fstat ... It passes the test-suite for me, but this is another of one of those really core functions, and certainly pretty subtle, so.. NOTE! The Linux lstat() system call is really quite cheap when everything is cached, so the fact that this is quite noticeable on Linux is likely to mean that it is much more noticeable on other operating systems. I bet you'll see a much bigger performance improvement from this on Windows in particular. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-20 02:27:12 +01:00			`* not mean "do not ignore merges", but "match_missing".`
			`*`
			`* But with the revision flag parsing, that's found in`
			`* "!revs->ignore_merges".`
			`*/`
			`cached = o->index_only;`
			`match_missing = !revs->ignore_merges;`

			`if (cached && idx && ce_stage(idx)) {`
			`if (tree)`
			`diff_unmerge(&revs->diffopt, idx->name, idx->ce_mode, idx->sha1);`
			`return;`
			`}`

			`/*`
			`* Something added to the tree?`
			`*/`
			`if (!tree) {`
			`show_new_file(revs, idx, cached, match_missing);`
			`return;`
			`}`

			`/*`
			`* Something removed from the tree?`
Libified diff-index: backward compatibility fix. "diff-index -m" does not mean "do not ignore merges", but means "pretend missing files match the index". The previous round tried to address this, but failed because setup_revisions() ate "-m" flag before the caller had a chance to intervene. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 12:58:04 +02:00			`*/`
Make run_diff_index() use unpack_trees(), not read_tree() A plain "git commit" would still run lstat() a lot more than necessary, because wt_status_print() would cause the index to be repeatedly flushed and re-read by wt_read_cache(), and that would cause the CE_UPTODATE bit to be lost, resulting in the files in the index being lstat'ed three times each. The reason why wt-status.c ended up invalidating and re-reading the cache multiple times was that it uses "run_diff_index()", which in turn uses "read_tree()" to populate the index with both the old index and the tree we want to compare against. So this patch re-writes run_diff_index() to not use read_tree(), but instead use "unpack_trees()" to diff the index to a tree. That, in turn, means that we don't need to modify the index itself, which then means that we don't need to invalidate it and re-read it! This, together with the lstat() optimizations, means that "git commit" on the kernel tree really only needs to lstat() the index entries once. That noticeably cuts down on the cached timings. Best time before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.399s user 0m0.232s sys 0m0.164s Best time after: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.254s user 0m0.140s sys 0m0.112s so it's a noticeable improvement in addition to being a nice conceptual cleanup (it's really not that pretty that "run_diff_index()" dirties the index!) Doing an "strace -c" on it also shows that as it cuts the number of lstat() calls by two thirds, it goes from being lstat()-limited to being limited by getdents() (which is the readdir system call): Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 60.69 0.000704 0 69230 31 lstat 23.62 0.000274 0 5522 getdents 8.36 0.000097 0 5508 2638 open 2.59 0.000030 0 2869 close 2.50 0.000029 0 274 write 1.47 0.000017 0 2844 fstat After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 45.17 0.000276 0 5522 getdents 26.51 0.000162 0 23112 31 lstat 19.80 0.000121 0 5503 2638 open 4.91 0.000030 0 2864 close 1.48 0.000020 0 274 write 1.34 0.000018 0 2844 fstat ... It passes the test-suite for me, but this is another of one of those really core functions, and certainly pretty subtle, so.. NOTE! The Linux lstat() system call is really quite cheap when everything is cached, so the fact that this is quite noticeable on Linux is likely to mean that it is much more noticeable on other operating systems. I bet you'll see a much bigger performance improvement from this on Windows in particular. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-20 02:27:12 +01:00			`if (!idx) {`
			`diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode);`
			`return;`
			`}`

			`/* Show difference between old and new */`
			`show_modified(revs, tree, idx, 1, cached, match_missing);`
			`}`

			`/*`
			`* Count how many index entries go with the first one`
			`*/`
			`static inline int count_skip(const struct cache_entry *src, int pos)`
			`{`
			`int skip = 1;`

			`/* We can only have multiple entries if the first one is not stage-0 */`
			`if (ce_stage(src)) {`
			`struct cache_entry **p = active_cache + pos;`
			`int namelen = ce_namelen(src);`

			`for (;;) {`
			`const struct cache_entry *ce;`
			`pos++;`
			`if (pos >= active_nr)`
			`break;`
			`ce = *++p;`
			`if (ce_namelen(ce) != namelen)`
			`break;`
			`if (memcmp(ce->name, src->name, namelen))`
			`break;`
			`skip++;`
			`}`
			`}`
			`return skip;`
			`}`

			`/*`
			`* The unpack_trees() interface is designed for merging, so`
			`* the different source entries are designed primarily for`
			`* the source trees, with the old index being really mainly`
			`* used for being replaced by the result.`
			`*`
			`* For diffing, the index is more important, and we only have a`
			`* single tree.`
			`*`
			`* We're supposed to return how many index entries we want to skip.`
			`*`
			`* This wrapper makes it all more readable, and takes care of all`
			`* the fairly complex unpack_trees() semantic requirements, including`
			`* the skipping, the path matching, the type conflict cases etc.`
			`*/`
			`static int oneway_diff(struct cache_entry **src,`
			`struct unpack_trees_options *o,`
			`int index_pos)`
			`{`
			`int skip = 0;`
			`struct cache_entry *idx = src[0];`
			`struct cache_entry *tree = src[1];`
			`struct rev_info *revs = o->unpack_data;`

			`if (index_pos >= 0)`
			`skip = count_skip(idx, index_pos);`

			`/*`
			`* Unpack-trees generates a DF/conflict entry if`
			`* there was a directory in the index and a tree`
			`* in the tree. From a diff standpoint, that's a`
			`* delete of the tree and a create of the file.`
			`*/`
			`if (tree == o->df_conflict_entry)`
			`tree = NULL;`

			`if (ce_path_match(idx ? idx : tree, revs->prune_data))`
			`do_oneway_diff(o, idx, tree, index_pos, skip);`

			`return skip;`
			`}`

			`int run_diff_index(struct rev_info *revs, int cached)`
			`{`
			`struct object *ent;`
			`struct tree *tree;`
			`const char *tree_name;`
			`struct unpack_trees_options opts;`
			`struct tree_desc t;`
Libify diff-index. The second installment to libify diff brothers. The pathname arguments are checked more strictly than before because we now use the revision.c::setup_revisions() infrastructure. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 11:43:00 +02:00
			`mark_merge_entries();`

Add "named object array" concept We've had this notion of a "object_list" for a long time, which eventually grew a "name" member because some users (notably git-rev-list) wanted to name each object as it is generated. That object_list is great for some things, but it isn't all that wonderful for others, and the "name" member is generally not used by everybody. This patch splits the users of the object_list array up into two: the traditional list users, who want the list-like format, and who don't actually use or want the name. And another class of users that really used the list as an extensible array, and generally wanted to name the objects. The patch is fairly straightforward, but it's also biggish. Most of it really just cleans things up: switching the revision parsing and listing over to the array makes things like the builtin-diff usage much simpler (we now see exactly how many members the array has, and we don't get the objects reversed from the order they were on the command line). One of the main reasons for doing this at all is that the malloc overhead of the simple object list was actually pretty high, and the array is just a lot denser. So this patch brings down memory usage by git-rev-list by just under 3% (on top of all the other memory use optimizations) on the mozilla archive. It does add more lines than it removes, and more importantly, it adds a whole new infrastructure for maintaining lists of objects, but on the other hand, the new dynamic array code is pretty obvious. The change to builtin-diff-tree.c shows a fairly good example of why an array interface is sometimes more natural, and just much simpler for everybody. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-20 02:42:35 +02:00			`ent = revs->pending.objects[0].item;`
			`tree_name = revs->pending.objects[0].name;`
Libify diff-index. The second installment to libify diff brothers. The pathname arguments are checked more strictly than before because we now use the revision.c::setup_revisions() infrastructure. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 11:43:00 +02:00			`tree = parse_tree_indirect(ent->sha1);`
			`if (!tree)`
			`return error("bad tree object %s", tree_name);`
Make run_diff_index() use unpack_trees(), not read_tree() A plain "git commit" would still run lstat() a lot more than necessary, because wt_status_print() would cause the index to be repeatedly flushed and re-read by wt_read_cache(), and that would cause the CE_UPTODATE bit to be lost, resulting in the files in the index being lstat'ed three times each. The reason why wt-status.c ended up invalidating and re-reading the cache multiple times was that it uses "run_diff_index()", which in turn uses "read_tree()" to populate the index with both the old index and the tree we want to compare against. So this patch re-writes run_diff_index() to not use read_tree(), but instead use "unpack_trees()" to diff the index to a tree. That, in turn, means that we don't need to modify the index itself, which then means that we don't need to invalidate it and re-read it! This, together with the lstat() optimizations, means that "git commit" on the kernel tree really only needs to lstat() the index entries once. That noticeably cuts down on the cached timings. Best time before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.399s user 0m0.232s sys 0m0.164s Best time after: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.254s user 0m0.140s sys 0m0.112s so it's a noticeable improvement in addition to being a nice conceptual cleanup (it's really not that pretty that "run_diff_index()" dirties the index!) Doing an "strace -c" on it also shows that as it cuts the number of lstat() calls by two thirds, it goes from being lstat()-limited to being limited by getdents() (which is the readdir system call): Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 60.69 0.000704 0 69230 31 lstat 23.62 0.000274 0 5522 getdents 8.36 0.000097 0 5508 2638 open 2.59 0.000030 0 2869 close 2.50 0.000029 0 274 write 1.47 0.000017 0 2844 fstat After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 45.17 0.000276 0 5522 getdents 26.51 0.000162 0 23112 31 lstat 19.80 0.000121 0 5503 2638 open 4.91 0.000030 0 2864 close 1.48 0.000020 0 274 write 1.34 0.000018 0 2844 fstat ... It passes the test-suite for me, but this is another of one of those really core functions, and certainly pretty subtle, so.. NOTE! The Linux lstat() system call is really quite cheap when everything is cached, so the fact that this is quite noticeable on Linux is likely to mean that it is much more noticeable on other operating systems. I bet you'll see a much bigger performance improvement from this on Windows in particular. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-20 02:27:12 +01:00
			`memset(&opts, 0, sizeof(opts));`
			`opts.head_idx = 1;`
			`opts.index_only = cached;`
			`opts.merge = 1;`
			`opts.fn = oneway_diff;`
			`opts.unpack_data = revs;`

			`init_tree_desc(&t, tree->buffer, tree->size);`
Allow callers of unpack_trees() to handle failure Return an error from unpack_trees() instead of calling die(), and exit with an error in read-tree, builtin-commit, and diff-lib. merge-recursive already expected an error return from unpack_trees, so it doesn't need to be changed. The merge function can return negative to abort. This will be used in builtin-checkout -m. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> 2008-02-07 17:39:48 +01:00			`if (unpack_trees(1, &t, &opts))`
			`exit(128);`
Make run_diff_index() use unpack_trees(), not read_tree() A plain "git commit" would still run lstat() a lot more than necessary, because wt_status_print() would cause the index to be repeatedly flushed and re-read by wt_read_cache(), and that would cause the CE_UPTODATE bit to be lost, resulting in the files in the index being lstat'ed three times each. The reason why wt-status.c ended up invalidating and re-reading the cache multiple times was that it uses "run_diff_index()", which in turn uses "read_tree()" to populate the index with both the old index and the tree we want to compare against. So this patch re-writes run_diff_index() to not use read_tree(), but instead use "unpack_trees()" to diff the index to a tree. That, in turn, means that we don't need to modify the index itself, which then means that we don't need to invalidate it and re-read it! This, together with the lstat() optimizations, means that "git commit" on the kernel tree really only needs to lstat() the index entries once. That noticeably cuts down on the cached timings. Best time before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.399s user 0m0.232s sys 0m0.164s Best time after: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.254s user 0m0.140s sys 0m0.112s so it's a noticeable improvement in addition to being a nice conceptual cleanup (it's really not that pretty that "run_diff_index()" dirties the index!) Doing an "strace -c" on it also shows that as it cuts the number of lstat() calls by two thirds, it goes from being lstat()-limited to being limited by getdents() (which is the readdir system call): Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 60.69 0.000704 0 69230 31 lstat 23.62 0.000274 0 5522 getdents 8.36 0.000097 0 5508 2638 open 2.59 0.000030 0 2869 close 2.50 0.000029 0 274 write 1.47 0.000017 0 2844 fstat After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 45.17 0.000276 0 5522 getdents 26.51 0.000162 0 23112 31 lstat 19.80 0.000121 0 5503 2638 open 4.91 0.000030 0 2864 close 1.48 0.000020 0 274 write 1.34 0.000018 0 2844 fstat ... It passes the test-suite for me, but this is another of one of those really core functions, and certainly pretty subtle, so.. NOTE! The Linux lstat() system call is really quite cheap when everything is cached, so the fact that this is quite noticeable on Linux is likely to mean that it is much more noticeable on other operating systems. I bet you'll see a much bigger performance improvement from this on Windows in particular. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-20 02:27:12 +01:00
Libify diff-index. The second installment to libify diff brothers. The pathname arguments are checked more strictly than before because we now use the revision.c::setup_revisions() infrastructure. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 11:43:00 +02:00			`diffcore_std(&revs->diffopt);`
			`diff_flush(&revs->diffopt);`
Make run_diff_index() use unpack_trees(), not read_tree() A plain "git commit" would still run lstat() a lot more than necessary, because wt_status_print() would cause the index to be repeatedly flushed and re-read by wt_read_cache(), and that would cause the CE_UPTODATE bit to be lost, resulting in the files in the index being lstat'ed three times each. The reason why wt-status.c ended up invalidating and re-reading the cache multiple times was that it uses "run_diff_index()", which in turn uses "read_tree()" to populate the index with both the old index and the tree we want to compare against. So this patch re-writes run_diff_index() to not use read_tree(), but instead use "unpack_trees()" to diff the index to a tree. That, in turn, means that we don't need to modify the index itself, which then means that we don't need to invalidate it and re-read it! This, together with the lstat() optimizations, means that "git commit" on the kernel tree really only needs to lstat() the index entries once. That noticeably cuts down on the cached timings. Best time before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.399s user 0m0.232s sys 0m0.164s Best time after: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.254s user 0m0.140s sys 0m0.112s so it's a noticeable improvement in addition to being a nice conceptual cleanup (it's really not that pretty that "run_diff_index()" dirties the index!) Doing an "strace -c" on it also shows that as it cuts the number of lstat() calls by two thirds, it goes from being lstat()-limited to being limited by getdents() (which is the readdir system call): Before: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 60.69 0.000704 0 69230 31 lstat 23.62 0.000274 0 5522 getdents 8.36 0.000097 0 5508 2638 open 2.59 0.000030 0 2869 close 2.50 0.000029 0 274 write 1.47 0.000017 0 2844 fstat After: % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 45.17 0.000276 0 5522 getdents 26.51 0.000162 0 23112 31 lstat 19.80 0.000121 0 5503 2638 open 4.91 0.000030 0 2864 close 1.48 0.000020 0 274 write 1.34 0.000018 0 2844 fstat ... It passes the test-suite for me, but this is another of one of those really core functions, and certainly pretty subtle, so.. NOTE! The Linux lstat() system call is really quite cheap when everything is cached, so the fact that this is quite noticeable on Linux is likely to mean that it is much more noticeable on other operating systems. I bet you'll see a much bigger performance improvement from this on Windows in particular. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-20 02:27:12 +01:00			`return 0;`
Libify diff-index. The second installment to libify diff brothers. The pathname arguments are checked more strictly than before because we now use the revision.c::setup_revisions() infrastructure. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-22 11:43:00 +02:00			`}`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00
			`int do_diff_cache(const unsigned char tree_sha1, struct diff_options opt)`
			`{`
			`struct tree *tree;`
			`struct rev_info revs;`
			`int i;`
			`struct cache_entry **dst;`
			`struct cache_entry *last = NULL;`
Also use unpack_trees() in do_diff_cache() As in run_diff_index(), we call unpack_trees() with the oneway_diff() function in do_diff_cache() now. This makes the function diff_cache() obsolete. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-20 16:19:56 +01:00			`struct unpack_trees_options opts;`
			`struct tree_desc t;`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00
			`/*`
			`* This is used by git-blame to run diff-cache internally;`
			`* it potentially needs to repeatedly run this, so we will`
			`* start by removing the higher order entries the last round`
			`* left behind.`
			`*/`
			`dst = active_cache;`
			`for (i = 0; i < active_nr; i++) {`
			`struct cache_entry *ce = active_cache[i];`
			`if (ce_stage(ce)) {`
			`if (last && !strcmp(ce->name, last->name))`
			`continue;`
			`cache_tree_invalidate_path(active_cache_tree,`
			`ce->name);`
			`last = ce;`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`ce->ce_flags \|= CE_REMOVE;`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`}`
			`*dst++ = ce;`
			`}`
			`active_nr = dst - active_cache;`

			`init_revisions(&revs, NULL);`
			`revs.prune_data = opt->paths;`
			`tree = parse_tree_indirect(tree_sha1);`
			`if (!tree)`
			`die("bad tree object %s", sha1_to_hex(tree_sha1));`
Also use unpack_trees() in do_diff_cache() As in run_diff_index(), we call unpack_trees() with the oneway_diff() function in do_diff_cache() now. This makes the function diff_cache() obsolete. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-20 16:19:56 +01:00
			`memset(&opts, 0, sizeof(opts));`
			`opts.head_idx = 1;`
			`opts.index_only = 1;`
			`opts.merge = 1;`
			`opts.fn = oneway_diff;`
			`opts.unpack_data = &revs;`

			`init_tree_desc(&t, tree->buffer, tree->size);`
Allow callers of unpack_trees() to handle failure Return an error from unpack_trees() instead of calling die(), and exit with an error in read-tree, builtin-commit, and diff-lib. merge-recursive already expected an error return from unpack_trees, so it doesn't need to be changed. The merge function can return negative to abort. This will be used in builtin-checkout -m. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> 2008-02-07 17:39:48 +01:00			`if (unpack_trees(1, &t, &opts))`
			`exit(128);`
Also use unpack_trees() in do_diff_cache() As in run_diff_index(), we call unpack_trees() with the oneway_diff() function in do_diff_cache() now. This makes the function diff_cache() obsolete. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-20 16:19:56 +01:00			`return 0;`
git-blame: no rev means start from the working tree file. Warning: this changes the semantics. This makes "git blame" without any positive rev to start digging from the working tree copy, which is made into a fake commit whose sole parent is the HEAD. It also adds --contents <file> option to pretend as if the working tree copy has the contents of the named file. You can use '-' to make the command read from the standard input. If you want the command to start annotating from the HEAD commit, you need to explicitly give HEAD parameter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-30 10:11:08 +01:00			`}`