mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-17 22:44:49 +01:00

768 lines

19 KiB

C


Add copyright notices. The tool interface sucks (especially "committing" information, which is just me doing everything by hand from the command line), but I think this is in theory actually a viable way of describing the world. So copyright it. 2005-04-08 00:16:10 +02:00			`/*`
			`* GIT - The information manager from hell`
			`*`
			`* Copyright (C) Linus Torvalds, 2005`
			`*/`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`#include "cache.h"`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`#include "cache-tree.h"`

			`/* Index extensions.`
			`*`
			`* The first letter should be 'A'..'Z' for extensions that are not`
			`* necessary for a correct operation (i.e. optimization data).`
			`* When new extensions are added that _needs_ to be understood in`
			`* order to correctly interpret the index file, pick character that`
			`* is outside the range, to cause the reader to abort.`
			`*/`

			`#define CACHE_EXT(s) ( (s[0]<<24)|(s[1]<<16)|(s[2]<<8)|(s[3]) )`
			`#define CACHE_EXT_TREE 0x54524545 /* "TREE" */`
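
As a rough illustration of the extension-naming rule in the comment above, the sketch below packs a four-character name into a 32-bit signature the same way `CACHE_EXT()` does, and treats the extension as optional when its first letter falls in 'A'..'Z'. The helper names `ext_signature` and `ext_is_optional` are hypothetical and are not part of git.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: pack the first four bytes of a name into a
 * big-endian 32-bit value, mirroring what CACHE_EXT() computes.
 */
static uint32_t ext_signature(const char *s)
{
	assert(strlen(s) >= 4);
	return ((uint32_t)(unsigned char)s[0] << 24) |
	       ((uint32_t)(unsigned char)s[1] << 16) |
	       ((uint32_t)(unsigned char)s[2] << 8) |
	        (uint32_t)(unsigned char)s[3];
}

/*
 * Hypothetical helper: an extension counts as optional when the first
 * letter of its name is in 'A'..'Z', as the comment above describes.
 */
static int ext_is_optional(uint32_t sig)
{
	unsigned char first = (unsigned char)(sig >> 24);

	return first >= 'A' && first <= 'Z';
}
```

With these helpers, `ext_signature("TREE")` yields the same value as `CACHE_EXT_TREE`, and a name starting with a lowercase letter is reported as mandatory.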
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
			`struct cache_entry **active_cache = NULL;`
Racy GIT

This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005:

    http://marc.theaimsgroup.com/?l=git&m=113014629716878

If you run the following sequence of commands:

    echo frotz >infocom
    git update-index --add infocom
    echo xyzzy >infocom

so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation.

Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing.

The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself.

This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwich the index file update must happen within the same second to trigger the problem.

Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`static time_t index_file_timestamp;`
Revert bogus optimization that avoids index file writes It didn't properly mark all cache updates as being dirty, and causes merge errors due to that. In particular, it didn't notice when a file was force-removed. Besides, it was ugly as hell. I've put in place a slightly cleaner version, but I've not enabled the optimization because I don't want to be burned again. 2005-05-07 01:48:43 +02:00			`unsigned int active_nr = 0, active_alloc = 0, active_cache_changed = 0;`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`struct cache_tree *active_cache_tree = NULL;`
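
The "Racy GIT" commit message above says that entries sharing the index file's mtime cannot be trusted on stat data alone. A minimal sketch of that decision follows; `needs_content_check` is a hypothetical helper, not git's API, and treating entries *newer* than the index file as suspect too is an assumption that goes beyond the "same mtime" wording of the message.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper, not part of git: decide whether an entry whose
 * stat data matched still needs its contents compared.
 * ASSUMPTION: "index_mtime <= entry_mtime" also covers entries newer
 * than the index file; the commit message only names the equal case.
 */
static int needs_content_check(int stat_matches, time_t entry_mtime,
			       time_t index_mtime)
{
	if (!stat_matches)
		return 0;	/* already known to be modified */
	if (!index_mtime)
		return 0;	/* index timestamp unknown */
	return index_mtime <= entry_mtime;
}

/* Small self-check of the three interesting cases. */
static void needs_content_check_demo(void)
{
	assert(needs_content_check(1, 100, 100));	/* same second: suspect */
	assert(!needs_content_check(1, 99, 100));	/* older entry: trusted */
	assert(!needs_content_check(0, 100, 100));	/* stat already differs */
}
```

Only the suspect entries pay for a full content comparison, so the common case stays a cheap stat check.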

[PATCH] Implement git-checkout-cache -u to update stat information in the cache. With -u flag, git-checkout-cache picks up the stat information from newly created file and updates the cache. This removes the need to run git-update-cache --refresh immediately after running git-checkout-cache. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-15 23:23:12 +02:00			`/*`
			`* This only updates the "non-critical" parts of the directory`
			`* cache, ie the parts that aren't tracked by GIT, and only used`
			`* to validate the cache.`
			`*/`
			`void fill_stat_cache_info(struct cache_entry *ce, struct stat *st)`
			`{`
			`ce->ce_ctime.sec = htonl(st->st_ctime);`
			`ce->ce_mtime.sec = htonl(st->st_mtime);`
Don't care about st_dev in the index file Thomas Glanzmann points out that it doesn't work well with different clients accessing the repository over NFS - they have different views on what the "device" for the filesystem is. Of course, other filesystems may not even have stable inode numbers. But we don't care. At least for now. 2005-05-23 00:08:15 +02:00			`#ifdef USE_NSEC`
[PATCH] Implement git-checkout-cache -u to update stat information in the cache. 2005-05-15 23:23:12 +02:00			`ce->ce_ctime.nsec = htonl(st->st_ctim.tv_nsec);`
			`ce->ce_mtime.nsec = htonl(st->st_mtim.tv_nsec);`
			`#endif`
			`ce->ce_dev = htonl(st->st_dev);`
			`ce->ce_ino = htonl(st->st_ino);`
			`ce->ce_uid = htonl(st->st_uid);`
			`ce->ce_gid = htonl(st->st_gid);`
			`ce->ce_size = htonl(st->st_size);`
"Assume unchanged" git

This adds "assume unchanged" logic, started by this message in the list discussion recently:

    <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org>

This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit.

You can use two new options to update-index to set and reset the CE_VALID bit:

    git-update-index --assume-unchanged path...
    git-update-index --no-assume-unchanged path...

These forms manipulate only the CE_VALID bit; they do not change the object name recorded in the index file, nor do they add a new entry to the index.

When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after:

 - update-index to explicitly register the current object name to the index file.
 - when update-index --refresh finds the path to be up-to-date.
 - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file.

The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings.

Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability:

 - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety.
 - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-).
 - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified.

Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path".

This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there are probably other cases where we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and are ready to be tested by people who asked for this feature.

Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-09 06:15:24 +01:00
			`if (assume_unchanged)`
			`ce->ce_flags |= htons(CE_VALID);`
[PATCH] Implement git-checkout-cache -u to update stat information in the cache. 2005-05-15 23:23:12 +02:00			`}`
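
`fill_stat_cache_info()` stores every stat field through `htonl()`, and the comparison code further down converts the live value with `htonl()` before comparing, so both sides are in network byte order. The sketch below shows that store-then-compare pattern on a cut-down record; `struct demo_entry` and both functions are hypothetical stand-ins, not git's `struct cache_entry`.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the stat part of a cache entry. */
struct demo_entry {
	uint32_t mtime_sec;	/* stored in network byte order */
	uint32_t size;		/* stored in network byte order */
};

/* Store host-order values in network byte order, as the index does. */
static void demo_fill(struct demo_entry *e, uint32_t mtime, uint32_t size)
{
	assert(e != NULL);
	e->mtime_sec = htonl(mtime);
	e->size = htonl(size);
}

/* Compare by converting the live value, not by decoding the stored one. */
static int demo_mtime_matches(const struct demo_entry *e, uint32_t mtime)
{
	assert(e != NULL);
	return e->mtime_sec == htonl(mtime);
}
```

Because the stored form is independent of the host's byte order, the same bytes compare correctly on big-endian and little-endian machines.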

Racy GIT 2005-12-20 09:02:15 +01:00			`static int ce_compare_data(struct cache_entry *ce, struct stat *st)`
			`{`
			`int match = -1;`
			`int fd = open(ce->name, O_RDONLY);`

			`if (fd >= 0) {`
			`unsigned char sha1[20];`
			`if (!index_fd(sha1, fd, st, 0, NULL))`
			`match = memcmp(sha1, ce->sha1, 20);`
			`close(fd);`
			`}`
			`return match;`
			`}`

			`static int ce_compare_link(struct cache_entry *ce, unsigned long expected_size)`
			`{`
			`int match = -1;`
			`char *target;`
			`void *buffer;`
			`unsigned long size;`
			`char type[10];`
			`int len;`

			`target = xmalloc(expected_size);`
			`len = readlink(ce->name, target, expected_size);`
			`if (len != expected_size) {`
			`free(target);`
			`return -1;`
			`}`
			`buffer = read_sha1_file(ce->sha1, type, &size);`
			`if (!buffer) {`
			`free(target);`
			`return -1;`
			`}`
			`if (size == expected_size)`
			`match = memcmp(buffer, target, size);`
			`free(buffer);`
			`free(target);`
			`return match;`
			`}`

			`static int ce_modified_check_fs(struct cache_entry *ce, struct stat *st)`
			`{`
			`switch (st->st_mode & S_IFMT) {`
			`case S_IFREG:`
			`if (ce_compare_data(ce, st))`
			`return DATA_CHANGED;`
			`break;`
			`case S_IFLNK:`
			`if (ce_compare_link(ce, st->st_size))`
			`return DATA_CHANGED;`
			`break;`
			`default:`
			`return TYPE_CHANGED;`
			`}`
			`return 0;`
			`}`

Racy GIT (part #2) The previous round caught the most trivial case well, but broke down once index file is updated again. Smudge problematic entries (they should be very few if any under normal interactive workflow) before writing a new index file out. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 21:12:18 +01:00			`static int ce_match_stat_basic(struct cache_entry *ce, struct stat *st)`
Make the cache stat information comparator public. Like the cache filename finder, it's a generically useful function, rather than something specific to the current "show-diff" thing. 2005-04-09 18:48:20 +02:00			`{`
			`unsigned int changed = 0;`

[PATCH] git and symlinks as tracked content Allow to store and track symlink in the repository. A symlink is stored the same way as a regular file, only with the appropriate mode bits set. The symlink target is therefore stored in a blob object. This will hopefully make our udev repository fully functional. :) Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-05 14:38:25 +02:00			`switch (ntohl(ce->ce_mode) & S_IFMT) {`
			`case S_IFREG:`
			`changed |= !S_ISREG(st->st_mode) ? TYPE_CHANGED : 0;`
Use core.filemode.

With "[core] filemode = false", you can tell git to ignore differences in the working tree file only in executable bit.

 * "git-update-index --refresh" does not say "needs update" if index entry and working tree file differs only in executable bit.
 * "git-update-index" on an existing path takes executable bit from the existing index entry, if the path and index entry are both regular files.
 * "git-diff-files" and "git-diff-index" without --cached flag pretend the path on the filesystem has the same executable bit as the existing index entry, if the path and index entry are both regular files.

If you are on a filesystem with unreliable mode bits, you may need to force the executable bit after registering the path in the index.

 * "git-update-index --chmod=+x foo" flips the executable bit of the index file entry for path "foo" on. Use "--chmod=-x" to flip it off.

Note that --chmod only works in index file and does not look at nor update the working tree. So if you are on a filesystem and do not have working executable bit, you would do:

 1. set the appropriate .git/config option;
 2. "git-update-index --add new-file.c"
 3. "git-ls-files --stage new-file.c" to see if it has the desired mode bits. If not, e.g. to drop executable bit picked up from the filesystem, say "git-update-index --chmod=-x new-file.c".

Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-12 03:45:33 +02:00			`/* We consider only the owner x bit to be relevant for`
			`* "mode changes"`
			`*/`
			`if (trust_executable_bit &&`
			`(0100 & (ntohl(ce->ce_mode) ^ st->st_mode)))`
[PATCH] fix compare symlink against readlink not data Fix update-cache to compare the blob of a symlink against the link-target and not the file it points to. Also ignore all permissions applied to links. Thanks to Greg for recognizing this while he added our list of symlinks back to the udev repository. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-06 15:45:01 +02:00			`changed |= MODE_CHANGED;`
[PATCH] git and symlinks as tracked content 2005-05-05 14:38:25 +02:00			`break;`
			`case S_IFLNK:`
			`changed |= !S_ISLNK(st->st_mode) ? TYPE_CHANGED : 0;`
			`break;`
			`default:`
			`die("internal error: ce_mode is %o", ntohl(ce->ce_mode));`
			`}`
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`if (ce->ce_mtime.sec != htonl(st->st_mtime))`
Make the cache stat information comparator public. 2005-04-09 18:48:20 +02:00			`changed |= MTIME_CHANGED;`
Convert the index file reading/writing to use network byte order. 2005-04-15 19:44:27 +02:00			`if (ce->ce_ctime.sec != htonl(st->st_ctime))`
			`changed |= CTIME_CHANGED;`

Don't care about st_dev in the index file 2005-05-23 00:08:15 +02:00			`#ifdef USE_NSEC`
Convert the index file reading/writing to use network byte order. 2005-04-15 19:44:27 +02:00			`/*`
			`* nsec seems unreliable - not all filesystems support it, so`
			`* as long as it is in the inode cache you get right nsec`
			`* but after it gets flushed, you get zero nsec.`
			`*/`
Fix NSEC compile problem, and properly parse the rev-tree cmd line. The rev-tree thing just happened to work. It shouldn't have. 2005-04-21 18:58:24 +02:00			`if (ce->ce_mtime.nsec != htonl(st->st_mtim.tv_nsec))`
Convert the index file reading/writing to use network byte order. 2005-04-15 19:44:27 +02:00			`changed |= MTIME_CHANGED;`
Fix NSEC compile problem, and properly parse the rev-tree cmd line. 2005-04-21 18:58:24 +02:00			`if (ce->ce_ctime.nsec != htonl(st->st_ctim.tv_nsec))`
Make the cache stat information comparator public. 2005-04-09 18:48:20 +02:00			`changed |= CTIME_CHANGED;`
Convert the index file reading/writing to use network byte order. 2005-04-15 19:44:27 +02:00			`#endif`

			`if (ce->ce_uid != htonl(st->st_uid) ||`
			`ce->ce_gid != htonl(st->st_gid))`
Make the cache stat information comparator public. 2005-04-09 18:48:20 +02:00			`changed |= OWNER_CHANGED;`
Don't care about st_dev in the index file 2005-05-23 00:08:15 +02:00			`if (ce->ce_ino != htonl(st->st_ino))`
Make the cache stat information comparator public. 2005-04-09 18:48:20 +02:00			`changed |= INODE_CHANGED;`
Don't care about st_dev in the index file 2005-05-23 00:08:15 +02:00
			`#ifdef USE_STDEV`
			`/*`
			`* st_dev breaks on network filesystems where different`
			`* clients will have different views of what "device"`
			`* the filesystem is on`
			`*/`
			`if (ce->ce_dev != htonl(st->st_dev))`
			`changed |= INODE_CHANGED;`
			`#endif`

Convert the index file reading/writing to use network byte order. 2005-04-15 19:44:27 +02:00			`if (ce->ce_size != htonl(st->st_size))`
Make the cache stat information comparator public. 2005-04-09 18:48:20 +02:00			`changed |= DATA_CHANGED;`
Show modified files in git-ls-files Add -m/--modified to show files that have been modified wrt. the index. [jc: The original came from Brian Gerst on Sep 1st but it only checked if the paths were cache dirty without actually checking the files were modified. I also added the usage string and a new test.] Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 00:11:15 +02:00
Racy GIT (part #2) 2005-12-20 21:12:18 +01:00			`return changed;`
			`}`
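
`ce_match_stat_basic()` accumulates one bit per kind of mismatch and returns the combined mask, so callers can test for a specific change or simply for non-zero. A stripped-down sketch of the same pattern is below. The `DEMO_*` flag values and `struct demo_stat` are placeholders invented for this example; the real `*_CHANGED` constants are defined in `cache.h`, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder flag values for this sketch only (assumption). */
enum {
	DEMO_MTIME_CHANGED = 1 << 0,
	DEMO_OWNER_CHANGED = 1 << 1,
	DEMO_DATA_CHANGED  = 1 << 2
};

/* Hypothetical, simplified record of the stat fields being compared. */
struct demo_stat {
	long mtime;
	unsigned int uid, gid;
	unsigned long size;
};

/* Return a bitmask naming every field that differs; 0 means "clean". */
static unsigned int demo_match_stat(const struct demo_stat *cached,
				    const struct demo_stat *live)
{
	unsigned int changed = 0;

	assert(cached != NULL && live != NULL);
	if (cached->mtime != live->mtime)
		changed |= DEMO_MTIME_CHANGED;
	if (cached->uid != live->uid || cached->gid != live->gid)
		changed |= DEMO_OWNER_CHANGED;
	if (cached->size != live->size)
		changed |= DEMO_DATA_CHANGED;
	return changed;
}
```

A caller that only cares about content can test `changed & DEMO_DATA_CHANGED`, while a refresh-style caller can treat any non-zero result as "needs update".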

"Assume unchanged" git 2006-02-09 06:15:24 +01:00			`int ce_match_stat(struct cache_entry *ce, struct stat *st, int ignore_valid)`
Racy GIT (part #2) 2005-12-20 21:12:18 +01:00			`{`
"Assume unchanged" git 2006-02-09 06:15:24 +01:00			`unsigned int changed;`

			`/*`
			`* If it's marked as always valid in the index, it's`
			`* valid whatever the checked-out copy says.`
			`*/`
			`if (!ignore_valid && (ce->ce_flags & htons(CE_VALID)))`
			`return 0;`

			`changed = ce_match_stat_basic(ce, st);`
Racy GIT (part #2) 2005-12-20 21:12:18 +01:00
Racy GIT 2005-12-20 09:02:15 +01:00			`/*`
			`* Within 1 second of this sequence:`
			`* echo xyzzy >file && git-update-index --add file`
			`* running this command:`
			`* echo frotz >file`
			`* would give a falsely clean cache entry. The mtime and`
			`* length match the cache, and other stat fields do not change.`
			`*`
			`* We could detect this at update-index time (the cache entry`
			`* being registered/updated records the same time as "now")`
			`* and delay the return from git-update-index, but that would`
			`* effectively mean we can make at most one commit per second,`
			`* which is not acceptable. Instead, we check cache entries`
			`* whose mtime are the same as the index file timestamp more`
"Assume unchanged" git This adds "assume unchanged" logic, started by this message in the list discussion recently: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org> This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit. You can use two new options to update-index to set and reset the CE_VALID bit: git-update-index --assume-unchanged path... git-update-index --no-assume-unchanged path... These forms manipulate only the CE_VALID bit; it does not change the object name recorded in the index file. Nor they add a new entry to the index. When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after: - update-index to explicitly register the current object name to the index file. - when update-index --refresh finds the path to be up-to-date. - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file. The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings. Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability: - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety. - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-). - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. 
You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified. Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path". This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there probably needs other cases we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and ready to be tested by people who asked for this feature. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-09 06:15:24 +01:00			`* carefully than others.`
			`*/`
			`if (!changed &&`
			`index_file_timestamp &&`
			`index_file_timestamp <= ntohl(ce->ce_mtime.sec))`
			`changed |= ce_modified_check_fs(ce, st);`
Show modified files in git-ls-files Add -m/--modified to show files that have been modified wrt. the index. [jc: The original came from Brian Gerst on Sep 1st but it only checked if the paths were cache dirty without actually checking the files were modified. I also added the usage string and a new test.] Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 00:11:15 +02:00
			`return changed;`
			`}`

			`int ce_modified(struct cache_entry *ce, struct stat *st, int really)`
			`{`
			`int changed, changed_fs;`
			`changed = ce_match_stat(ce, st, really);`
			`if (!changed)`
			`return 0;`
			`/*`
			`* If the mode or type has changed, there's no point in trying`
			`* to refresh the entry - it's not going to match`
			`*/`
			`if (changed & (MODE_CHANGED | TYPE_CHANGED))`
			`return changed;`

			`/* Immediately after read-tree or update-index --cacheinfo,`
			`* the length field is zero. For other cases the ce_size`
			`* should match the SHA1 recorded in the index entry.`
			`*/`
			`if ((changed & DATA_CHANGED) && ce->ce_size != htonl(0))`
			`return changed;`

			`changed_fs = ce_modified_check_fs(ce, st);`
			`if (changed_fs)`
			`return changed | changed_fs;`
			`return 0;`
			`}`
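The comment in `ce_match_stat()` above explains why entries whose mtime is not older than the index file's timestamp get a closer look. What follows is only a minimal sketch of that rule, not the code in this file: `struct demo_entry`, `demo_match_stat()` and the `content_differs` field are hypothetical stand-ins, and the real content comparison done by `ce_modified_check_fs()` is replaced by a precomputed flag.

```c
#include <time.h>

/* Hypothetical, stripped-down stand-in for a cache entry. */
struct demo_entry {
	time_t mtime;        /* mtime recorded when the entry was registered */
	int content_differs; /* stand-in for what a real content check would report */
};

/*
 * Return nonzero if the entry must be treated as changed.
 * The cheap stat comparison runs first; the (here simulated) content
 * check runs only when the stat data matches AND the entry is at least
 * as new as the index file, i.e. when the entry is "racily clean".
 */
int demo_match_stat(const struct demo_entry *e, time_t fs_mtime,
		    time_t index_timestamp)
{
	int changed = (e->mtime != fs_mtime);

	if (!changed && index_timestamp && index_timestamp <= e->mtime)
		changed |= e->content_differs;
	return changed;
}
```

In this sketch an entry written in the same second as the index is re-checked against its content, while an entry older than the index is trusted on stat data alone, which keeps the common case cheap.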

Introduce "base_name_compare()" helper function This one compares two pathnames that may be partial basenames, not full paths. We need to get the path sorting right, since a directory name will sort as if it had the final '/' at the end. 2005-05-20 18:09:18 +02:00			`int base_name_compare(const char *name1, int len1, int mode1,`
			`const char *name2, int len2, int mode2)`
			`{`
			`unsigned char c1, c2;`
			`int len = len1 < len2 ? len1 : len2;`
			`int cmp;`

			`cmp = memcmp(name1, name2, len);`
			`if (cmp)`
			`return cmp;`
			`c1 = name1[len];`
			`c2 = name2[len];`
			`if (!c1 && S_ISDIR(mode1))`
			`c1 = '/';`
			`if (!c2 && S_ISDIR(mode2))`
			`c2 = '/';`
			`return (c1 < c2) ? -1 : (c1 > c2) ? 1 : 0;`
			`}`
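To illustrate the directory rule, here is a self-contained sketch that mirrors the logic of `base_name_compare()`. The `demo_` names and the `DEMO_ISDIR` macro are assumptions made for this example; the macro hard-codes the conventional mode bits (`040000` for a directory) so the snippet does not depend on `<sys/stat.h>`. As in the function above, both names are assumed to be NUL-terminated, because the byte at index `len` is read.

```c
#include <string.h>

/* Assumption for this sketch: conventional POSIX file-type bits. */
#define DEMO_ISDIR(mode) (((mode) & 0170000) == 0040000)

int demo_base_name_compare(const char *name1, int len1, int mode1,
			   const char *name2, int len2, int mode2)
{
	unsigned char c1, c2;
	int len = len1 < len2 ? len1 : len2;
	int cmp = memcmp(name1, name2, len);

	if (cmp)
		return cmp;
	/* Common prefix is equal: compare the next byte of each name. */
	c1 = name1[len];
	c2 = name2[len];
	/* A directory name that ends here sorts as if it ended in '/'. */
	if (!c1 && DEMO_ISDIR(mode1))
		c1 = '/';
	if (!c2 && DEMO_ISDIR(mode2))
		c2 = '/';
	return (c1 < c2) ? -1 : (c1 > c2) ? 1 : 0;
}
```

With these definitions, a regular file `foo` sorts before `foo.c`, but a directory `foo` sorts after it, because `'/'` compares greater than `'.'`. The same directory still sorts before `foo0`, because `'/'` is less than `'0'`.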

Make cache entry comparison take the new "state" flag into account. This is what allows us to have multiple states of the same file in the index, and what makes it always sort correctly. 2005-04-16 07:51:44 +02:00			`int cache_name_compare(const char *name1, int flags1, const char *name2, int flags2)`
Make "cache_name_pos()" available to others. It finds the cache entry position for a given name, and is generally useful. Sure, everybody can just scan the active cache array, but since it's sorted, you actually want to search it with a binary search, so let's not duplicate that logic all over the place. 2005-04-09 18:26:55 +02:00			`{`
			`int len1 = flags1 & CE_NAMEMASK;`
			`int len2 = flags2 & CE_NAMEMASK;`
			`int len = len1 < len2 ? len1 : len2;`
			`int cmp;`

			`cmp = memcmp(name1, name2, len);`
			`if (cmp)`
			`return cmp;`
			`if (len1 < len2)`
			`return -1;`
			`if (len1 > len2)`
			`return 1;`

cache_name_compare() compares name and stage, nothing else. The code was a bit unclear in expressing what it wants to compare. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-13 08:46:25 +01:00			`/* Compare stages */`
			`flags1 &= CE_STAGEMASK;`
			`flags2 &= CE_STAGEMASK;`

			`if (flags1 < flags2)`
			`return -1;`
			`if (flags1 > flags2)`
			`return 1;`
			`return 0;`
			`}`
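The ordering implemented by `cache_name_compare()` (name bytes first, then name length, then stage) can be tried out with the standalone sketch below. The bit layout is an assumption made for the example: the `DEMO_` masks put the name length in the low 12 bits and the stage in bits 12 and 13, and `demo_make_flags()` is a hypothetical helper, not something defined in this file.

```c
#include <string.h>

/* Assumed layout for this sketch: low 12 bits = name length,
 * bits 12-13 = merge stage. */
#define DEMO_NAMEMASK   0x0fff
#define DEMO_STAGEMASK  0x3000
#define DEMO_STAGESHIFT 12

/* Hypothetical helper: pack a name length and a stage into one flags word. */
int demo_make_flags(int namelen, int stage)
{
	return (stage << DEMO_STAGESHIFT) | (namelen & DEMO_NAMEMASK);
}

int demo_cache_name_compare(const char *name1, int flags1,
			    const char *name2, int flags2)
{
	int len1 = flags1 & DEMO_NAMEMASK;
	int len2 = flags2 & DEMO_NAMEMASK;
	int len = len1 < len2 ? len1 : len2;
	int cmp = memcmp(name1, name2, len);

	if (cmp)
		return cmp;
	if (len1 != len2)
		return len1 < len2 ? -1 : 1;

	/* Names are identical: the stage decides. */
	flags1 &= DEMO_STAGEMASK;
	flags2 &= DEMO_STAGEMASK;
	if (flags1 != flags2)
		return flags1 < flags2 ? -1 : 1;
	return 0;
}
```

Under this layout, passing a plain name length as the flags argument is the same as asking for stage 0, which matches how `cache_name_pos()` calls the comparison with `namelen`.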

			`int cache_name_pos(const char *name, int namelen)`
			`{`
			`int first, last;`

			`first = 0;`
			`last = active_nr;`
			`while (last > first) {`
			`int next = (last + first) >> 1;`
			`struct cache_entry *ce = active_cache[next];`
[PATCH] Use ntohs instead of htons to convert ce_flags to host byte order Use ntohs instead of htons to convert ce_flags to host byte order Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-07 22:35:56 +02:00			`int cmp = cache_name_compare(name, namelen, ce->name, ntohs(ce->ce_flags));`
			`if (!cmp)`
Fix off-by-one error in removal of cache entry. Also make the return value of "cache_name_pos()" be sane: positive or zero if we found it (it's the index into the cache array), and "-pos-1" to indicate where it should go if we didn't. 2005-04-11 07:06:50 +02:00			`return next;`
			`if (cmp < 0) {`
			`last = next;`
			`continue;`
			`}`
			`first = next+1;`
			`}`
Fix off-by-one error in removal of cache entry. Also make the return value of "cache_name_pos()" be sane: positive or zero if we found it (it's the index into the cache array), and "-pos-1" to indicate where it should go if we didn't. 2005-04-11 07:06:50 +02:00			`return -first-1;`
Make "cache_name_pos()" available to others. It finds the cache entry position for a given name, and is generally useful. Sure, everybody can just scan the active cache array, but since it's sorted, you actually want to search it with a binary search, so let's not duplicate that logic all over the place. 2005-04-09 18:26:55 +02:00			`}`
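The return convention of `cache_name_pos()` can be illustrated with a minimal standalone sketch. It assumes a plain sorted array of strings compared with `strcmp()` in place of `active_cache[]` and `cache_name_compare()`; the names `sorted_names`, `sorted_nr`, and `name_pos` are hypothetical and not part of git.

```c
#include <string.h>

/* Hypothetical stand-in for the sorted active_cache[] array. */
static const char *sorted_names[] = { "Makefile", "cache.h", "read-cache.c" };
static const int sorted_nr = 3;

/*
 * Binary search over the sorted array.
 * Returns the index of `name` when it is present. Otherwise returns
 * -(insert_pos)-1, so a caller can recover the insertion point
 * with `pos = -ret - 1`.
 */
static int name_pos(const char *name)
{
	int first = 0, last = sorted_nr;

	while (last > first) {
		int next = first + (last - first) / 2;
		int cmp = strcmp(name, sorted_names[next]);

		if (!cmp)
			return next;		/* exact match */
		if (cmp < 0)
			last = next;		/* search the lower half */
		else
			first = next + 1;	/* search the upper half */
	}
	return -first - 1;			/* not found: encode insert position */
}
```

Encoding the miss as `-pos-1` rather than `-pos` keeps "not found, insert at index 0" distinguishable from "found at index 0".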

When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`/* Remove entry, return true if there are more entries to go.. */`
Rename some more cache-related functions same_name -> ce_same_name() remove_entry_at() -> remove_cache_entry_at() Signed-off-by: Brad Roberts <braddr@puremagic.com> Signed-off-by: Petr Baudis <pasky@ucw.cz> 2005-05-15 04:04:25 +02:00			`int remove_cache_entry_at(int pos)`
When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`{`
Revert bogus optimization that avoids index file writes It didn't properly mark all cache updates as being dirty, and causes merge errors due to that. In particular, it didn't notice when a file was force-removed. Besides, it was ugly as hell. I've put in place a slightly cleaner version, but I've not enabled the optimization because I don't want to be burned again. 2005-05-07 01:48:43 +02:00			`active_cache_changed = 1;`
When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`active_nr--;`
			`if (pos >= active_nr)`
			`return 0;`
			`memmove(active_cache + pos, active_cache + pos + 1, (active_nr - pos) * sizeof(struct cache_entry *));`
			`return 1;`
			`}`
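The removal step above closes the gap with a single `memmove()`. A small sketch of the same idea follows, assuming a fixed array of string pointers instead of `struct cache_entry *`; `entries`, `entries_nr`, and `remove_entry_at` are hypothetical names used only for illustration.

```c
#include <string.h>

/* Hypothetical stand-in for active_cache[] and active_nr. */
static const char *entries[4] = { "a", "b", "c", "d" };
static int entries_nr = 4;

/*
 * Remove the entry at `pos` by shifting the tail down one slot.
 * Returns 1 if an entry now sits at `pos` (more entries to go),
 * 0 if the removed entry was the last one.
 */
static int remove_entry_at(int pos)
{
	entries_nr--;
	if (pos >= entries_nr)
		return 0;
	/* The ranges overlap, so memmove() is required rather than memcpy(). */
	memmove(entries + pos, entries + pos + 1,
		(entries_nr - pos) * sizeof(entries[0]));
	return 1;
}
```

Because the following entry slides into `pos`, a caller that removes several consecutive entries can keep calling with the same `pos`, which is how `remove_file_from_cache()` loops.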

Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`int remove_file_from_cache(const char *path)`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`{`
			`int pos = cache_name_pos(path, strlen(path));`
[PATCH] update-cache --remove marks the path merged. When update-cache --remove is run, resolve unmerged state for the path. This is consistent with the update-cache --add behaviour. Essentially, the user is telling us how he wants to resolve the merge by running update-cache. Signed-off-by: Junio C Hamano <junkio@cox.net> Fixed to do the right thing at the end. Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-17 18:53:35 +02:00			`if (pos < 0)`
			`pos = -pos-1;`
			`while (pos < active_nr && !strcmp(active_cache[pos]->name, path))`
Rename some more cache-related functions same_name -> ce_same_name() remove_entry_at() -> remove_cache_entry_at() Signed-off-by: Brad Roberts <braddr@puremagic.com> Signed-off-by: Petr Baudis <pasky@ucw.cz> 2005-05-15 04:04:25 +02:00			`remove_cache_entry_at(pos);`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`return 0;`
			`}`

Rename some more cache-related functions same_name -> ce_same_name() remove_entry_at() -> remove_cache_entry_at() Signed-off-by: Brad Roberts <braddr@puremagic.com> Signed-off-by: Petr Baudis <pasky@ucw.cz> 2005-05-15 04:04:25 +02:00			`int ce_same_name(struct cache_entry *a, struct cache_entry *b)`
When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`{`
			`int len = ce_namelen(a);`
			`return ce_namelen(b) == len && !memcmp(a->name, b->name, len);`
			`}`

Make "ce_match_path()" a generic helper function ... and make git-diff-files use it too. This all _should_ make the diffcore-pathspec.c phase unnecessary, since the diff'ers now all do the path matching early interally. 2005-07-15 01:55:06 +02:00			`int ce_path_match(const struct cache_entry *ce, const char **pathspec)`
			`{`
			`const char *match, *name;`
			`int len;`

			`if (!pathspec)`
			`return 1;`

			`len = ce_namelen(ce);`
			`name = ce->name;`
			`while ((match = *pathspec++) != NULL) {`
			`int matchlen = strlen(match);`
			`if (matchlen > len)`
			`continue;`
			`if (memcmp(name, match, matchlen))`
			`continue;`
			`if (matchlen && name[matchlen-1] == '/')`
			`return 1;`
			`if (name[matchlen] == '/' || !name[matchlen])`
			`return 1;`
[PATCH] Improve handling of "." and ".." in git-diff-* This fixes up usage of ".." (without an ending slash) and "." (with or without the ending slash) in the git diff family. It also fixes pathspec matching for the case of an empty pathspec, since a "." in the top-level directory (or enough ".." under subdirectories) will result in an empty pathspec. We used to not match it against anything, but it should in fact match everything. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-17 05:44:32 +02:00			`if (!matchlen)`
			`return 1;`
Make "ce_match_path()" a generic helper function ... and make git-diff-files use it too. This all _should_ make the diffcore-pathspec.c phase unnecessary, since the diff'ers now all do the path matching early interally. 2005-07-15 01:55:06 +02:00			`}`
			`return 0;`
			`}`
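The matching rules in `ce_path_match()` can be tried in isolation with the sketch below. It assumes the entry name is passed as a plain C string rather than a `struct cache_entry`; `path_matches` is a hypothetical name, and the logic simply mirrors the checks in the listing above.

```c
#include <string.h>

/*
 * Return 1 if `name` is selected by any element of the NULL-terminated
 * `pathspec` array, 0 otherwise. A NULL pathspec matches everything.
 */
static int path_matches(const char *name, const char **pathspec)
{
	const char *match;
	size_t len = strlen(name);

	if (!pathspec)
		return 1;

	while ((match = *pathspec++) != NULL) {
		size_t matchlen = strlen(match);

		if (matchlen > len)
			continue;	/* spec is longer than the name */
		if (memcmp(name, match, matchlen))
			continue;	/* spec is not a prefix of the name */
		if (matchlen && name[matchlen - 1] == '/')
			return 1;	/* spec ended in '/': directory prefix */
		if (name[matchlen] == '/' || !name[matchlen])
			return 1;	/* prefix ends at a path boundary */
		if (!matchlen)
			return 1;	/* empty spec matches everything */
	}
	return 0;
}
```

With a pathspec of `{ "Documentation/", "read-cache.c", NULL }`, the name `Documentation/git.txt` matches through the directory rule and `read-cache.c` matches exactly, while `read-cache.c.orig` does not match because the prefix does not end at a path boundary.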

Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`/*`
			`* Do we have another file that has the beginning components being a`
			`* proper superset of the name we're trying to add?`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00			`*/`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`static int has_file_name(const struct cache_entry *ce, int pos, int ok_to_replace)`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00			`{`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`int retval = 0;`
			`int len = ce_namelen(ce);`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`int stage = ce_stage(ce);`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`const char *name = ce->name;`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`while (pos < active_nr) {`
			`struct cache_entry *p = active_cache[pos++];`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`if (len >= ce_namelen(p))`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00			`break;`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`if (memcmp(name, p->name, len))`
			`break;`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`if (ce_stage(p) != stage)`
			`continue;`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`if (p->name[len] != '/')`
			`continue;`
			`retval = -1;`
			`if (!ok_to_replace)`
			`break;`
			`remove_cache_entry_at(--pos);`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00			`}`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`return retval;`
			`}`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`/*`
			`* Do we have another file with a pathname that is a proper`
			`* subset of the name we're trying to add?`
			`*/`
			`static int has_dir_name(const struct cache_entry *ce, int pos, int ok_to_replace)`
			`{`
			`int retval = 0;`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`int stage = ce_stage(ce);`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`const char *name = ce->name;`
			`const char *slash = name + ce_namelen(ce);`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`for (;;) {`
			`int len;`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`for (;;) {`
			`if (*--slash == '/')`
			`break;`
			`if (slash <= ce->name)`
			`return retval;`
			`}`
			`len = slash - name;`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`pos = cache_name_pos(name, ntohs(create_ce_flags(len, stage)));`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`if (pos >= 0) {`
			`retval = -1;`
			`if (!ok_to_replace)`
			`break;`
Rename some more cache-related functions same_name -> ce_same_name() remove_entry_at() -> remove_cache_entry_at() Signed-off-by: Brad Roberts <braddr@puremagic.com> Signed-off-by: Petr Baudis <pasky@ucw.cz> 2005-05-15 04:04:25 +02:00			`remove_cache_entry_at(pos);`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`continue;`
			`}`

			`/*`
			`* Trivial optimization: if we find an entry that`
			`* already matches the sub-directory, then we know`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`* we're ok, and we can exit.`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`*/`
			`pos = -pos-1;`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`while (pos < active_nr) {`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`struct cache_entry *p = active_cache[pos];`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`if ((ce_namelen(p) <= len) ||`
			`(p->name[len] != '/') ||`
			`memcmp(p->name, name, len))`
			`break; /* not our subdirectory */`
			`if (ce_stage(p) == stage)`
			`/* p is at the same stage as our entry, and`
			`* is a subdirectory of what we are looking`
			`* at, so we cannot have conflicts at our`
			`* level or anything shorter.`
			`*/`
			`return retval;`
			`pos++;`
Add git-update-cache --replace option. When "path" exists as a file or a symlink in the index, an attempt to add "path/file" is refused because it results in file vs directory conflict. Similarly when "path/file1", "path/file2", etc. exist, an attempt to add "path" as a file or a symlink is refused. With git-update-cache --replace, these existing entries that conflict with the entry being added are automatically removed from the cache, with warning messages. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:55:21 +02:00			`}`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00			`}`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`return retval;`
			`}`

			`/* We may be in a situation where we already have path/file and path`
			`* is being added, or we already have path and path/file is being`
			`* added. Either one would result in a nonsense tree that has path`
			`* twice when git-write-tree tries to write it out. Prevent it.`
			`*`
			`* If ok-to-replace is specified, we remove the conflicting entries`
			`* from the cache so the caller should recompute the insert position.`
			`* When this happens, we return non-zero.`
			`*/`
			`static int check_file_directory_conflict(const struct cache_entry *ce, int pos, int ok_to_replace)`
			`{`
			`/*`
			`* We check if the path is a sub-path of a subsequent pathname`
			`* first, since removing those will not change the position`
			`* in the array`
			`*/`
			`int retval = has_file_name(ce, pos, ok_to_replace);`
			`/*`
			`* Then check if the path might have a clashing sub-directory`
			`* before it.`
			`*/`
			`return retval + has_dir_name(ce, pos, ok_to_replace);`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00			`}`
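The condition that `check_file_directory_conflict()` guards against can be stated compactly. The sketch below is a deliberately simplified linear scan, not the sorted-array search used above: it ignores stages and never removes entries, and the names `is_dir_prefix` and `has_df_conflict` are hypothetical.

```c
#include <string.h>

/* Return 1 if `b` lies underneath `a`, i.e. b == a + "/" + something. */
static int is_dir_prefix(const char *a, const char *b)
{
	size_t len = strlen(a);

	return strlen(b) > len && !memcmp(a, b, len) && b[len] == '/';
}

/*
 * Return 1 if adding `new_name` would make some path act as both a
 * file and a directory, given the `nr` existing entries in `names`.
 */
static int has_df_conflict(const char **names, int nr, const char *new_name)
{
	int i;

	for (i = 0; i < nr; i++) {
		/* new entry is a file where a directory already exists */
		if (is_dir_prefix(new_name, names[i]))
			return 1;
		/* new entry needs a directory where a file already exists */
		if (is_dir_prefix(names[i], new_name))
			return 1;
	}
	return 0;
}
```

The two branches correspond roughly to `has_file_name()` and `has_dir_name()`. The real functions exploit the sort order so that only neighbouring entries and one binary search per leading directory component are examined.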

Add git-update-cache --replace option. When "path" exists as a file or a symlink in the index, an attempt to add "path/file" is refused because it results in file vs directory conflict. Similarly when "path/file1", "path/file2", etc. exist, an attempt to add "path" as a file or a symlink is refused. With git-update-cache --replace, these existing entries that conflict with the entry being added are automatically removed from the cache, with warning messages. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:55:21 +02:00			`int add_cache_entry(struct cache_entry *ce, int option)`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`{`
			`int pos;`
Add git-update-cache --replace option. When "path" exists as a file or a symlink in the index, an attempt to add "path/file" is refused because it results in file vs directory conflict. Similarly when "path/file1", "path/file2", etc. exist, an attempt to add "path" as a file or a symlink is refused. With git-update-cache --replace, these existing entries that conflict with the entry being added are automatically removed from the cache, with warning messages. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:55:21 +02:00			`int ok_to_add = option & ADD_CACHE_OK_TO_ADD;`
			`int ok_to_replace = option & ADD_CACHE_OK_TO_REPLACE;`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`int skip_df_check = option & ADD_CACHE_SKIP_DFCHECK;`
"Assume unchanged" git This adds "assume unchanged" logic, started by this message in the list discussion recently: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org> This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit. You can use two new options to update-index to set and reset the CE_VALID bit: git-update-index --assume-unchanged path... git-update-index --no-assume-unchanged path... These forms manipulate only the CE_VALID bit; it does not change the object name recorded in the index file. Nor they add a new entry to the index. When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after: - update-index to explicitly register the current object name to the index file. - when update-index --refresh finds the path to be up-to-date. - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file. The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings. Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability: - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety. - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-). - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. 
You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified. Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path". This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there probably needs other cases we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and ready to be tested by people who asked for this feature. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-09 06:15:24 +01:00
[PATCH] Use ntohs instead of htons to convert ce_flags to host byte order Use ntohs instead of htons to convert ce_flags to host byte order Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-07 22:35:56 +02:00			`pos = cache_name_pos(ce->name, ntohs(ce->ce_flags));`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00
Use core.filemode. With "[core] filemode = false", you can tell git to ignore differences in the working tree file only in executable bit. * "git-update-index --refresh" does not say "needs update" if index entry and working tree file differs only in executable bit. * "git-update-index" on an existing path takes executable bit from the existing index entry, if the path and index entry are both regular files. * "git-diff-files" and "git-diff-index" without --cached flag pretend the path on the filesystem has the same executable bit as the existing index entry, if the path and index entry are both regular files. If you are on a filesystem with unreliable mode bits, you may need to force the executable bit after registering the path in the index. * "git-update-index --chmod=+x foo" flips the executable bit of the index file entry for path "foo" on. Use "--chmod=-x" to flip it off. Note that --chmod only works in index file and does not look at nor update the working tree. So if you are on a filesystem and do not have working executable bit, you would do: 1. set the appropriate .git/config option; 2. "git-update-index --add new-file.c" 3. "git-ls-files --stage new-file.c" to see if it has the desired mode bits. If not, e.g. to drop executable bit picked up from the filesystem, say "git-update-index --chmod=-x new-file.c". Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-12 03:45:33 +02:00			`/* existing match? Just replace it. */`
Fix off-by-one error in removal of cache entry. Also make the return value of "cache_name_pos()" be sane: positive or zero if we found it (it's the index into the cache array), and "-pos-1" to indicate where it should go if we didn't. 2005-04-11 07:06:50 +02:00			`if (pos >= 0) {`
Revert bogus optimization that avoids index file writes It didn't properly mark all cache updates as being dirty, and causes merge errors due to that. In particular, it didn't notice when a file was force-removed. Besides, it was ugly as hell. I've put in place a slightly cleaner version, but I've not enabled the optimization because I don't want to be burned again. 2005-05-07 01:48:43 +02:00			`active_cache_changed = 1;`
Fix off-by-one error in removal of cache entry. Also make the return value of "cache_name_pos()" be sane: positive or zero if we found it (it's the index into the cache array), and "-pos-1" to indicate where it should go if we didn't. 2005-04-11 07:06:50 +02:00			`active_cache[pos] = ce;`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`return 0;`
			`}`
Fix off-by-one error in removal of cache entry. Also make the return value of "cache_name_pos()" be sane: positive or zero if we found it (it's the index into the cache array), and "-pos-1" to indicate where it should go if we didn't. 2005-04-11 07:06:50 +02:00			`pos = -pos-1;`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00
When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`/*`
			`* Inserting a merged entry ("stage 0") into the index`
			`* will always replace all non-merged entries..`
			`*/`
			`if (pos < active_nr && ce_stage(ce) == 0) {`
Rename some more cache-related functions same_name -> ce_same_name() remove_entry_at() -> remove_cache_entry_at() Signed-off-by: Brad Roberts <braddr@puremagic.com> Signed-off-by: Petr Baudis <pasky@ucw.cz> 2005-05-15 04:04:25 +02:00			`while (ce_same_name(active_cache[pos], ce)) {`
When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`ok_to_add = 1;`
Rename some more cache-related functions same_name -> ce_same_name() remove_entry_at() -> remove_cache_entry_at() Signed-off-by: Brad Roberts <braddr@puremagic.com> Signed-off-by: Petr Baudis <pasky@ucw.cz> 2005-05-15 04:04:25 +02:00			`if (!remove_cache_entry_at(pos))`
When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`break;`
			`}`
			`}`

Make "update-cache" a bit friendlier to use (and harder to mis-use). It now requires the "--add" flag before you add any new files, and a "--remove" file if you want to mark files for removal. And giving it the "--refresh" flag makes it just update all the files that it already knows about. 2005-04-10 20:32:54 +02:00			`if (!ok_to_add)`
			`return -1;`

Use core.filemode. With "[core] filemode = false", you can tell git to ignore differences in the working tree file only in executable bit. * "git-update-index --refresh" does not say "needs update" if index entry and working tree file differs only in executable bit. * "git-update-index" on an existing path takes executable bit from the existing index entry, if the path and index entry are both regular files. * "git-diff-files" and "git-diff-index" without --cached flag pretend the path on the filesystem has the same executable bit as the existing index entry, if the path and index entry are both regular files. If you are on a filesystem with unreliable mode bits, you may need to force the executable bit after registering the path in the index. * "git-update-index --chmod=+x foo" flips the executable bit of the index file entry for path "foo" on. Use "--chmod=-x" to flip it off. Note that --chmod only works in index file and does not look at nor update the working tree. So if you are on a filesystem and do not have working executable bit, you would do: 1. set the appropriate .git/config option; 2. "git-update-index --add new-file.c" 3. "git-ls-files --stage new-file.c" to see if it has the desired mode bits. If not, e.g. to drop executable bit picked up from the filesystem, say "git-update-index --chmod=-x new-file.c". Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-12 03:45:33 +02:00			`if (!skip_df_check &&`
			`check_file_directory_conflict(ce, pos, ok_to_replace)) {`
Add git-update-cache --replace option. When "path" exists as a file or a symlink in the index, an attempt to add "path/file" is refused because it results in file vs directory conflict. Similarly when "path/file1", "path/file2", etc. exist, an attempt to add "path" as a file or a symlink is refused. With git-update-cache --replace, these existing entries that conflict with the entry being added are automatically removed from the cache, with warning messages. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:55:21 +02:00			`if (!ok_to_replace)`
			`return -1;`
[PATCH] Use ntohs instead of htons to convert ce_flags to host byte order Use ntohs instead of htons to convert ce_flags to host byte order Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-07 22:35:56 +02:00			`pos = cache_name_pos(ce->name, ntohs(ce->ce_flags));`
Add git-update-cache --replace option. When "path" exists as a file or a symlink in the index, an attempt to add "path/file" is refused because it results in file vs directory conflict. Similarly when "path/file1", "path/file2", etc. exist, an attempt to add "path" as a file or a symlink is refused. With git-update-cache --replace, these existing entries that conflict with the entry being added are automatically removed from the cache, with warning messages. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:55:21 +02:00			`pos = -pos-1;`
			`}`
git-update-cache refuses to add a file where a directory is registered. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`/* Make sure the array is big enough .. */`
			`if (active_nr == active_alloc) {`
			`active_alloc = alloc_nr(active_alloc);`
[PATCH] introduce xmalloc and xrealloc Introduce xmalloc and xrealloc to die gracefully with a descriptive message when out of memory, rather than taking a SIGSEGV. Signed-off-by: Christopher Li<chrislgit@chrisli.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-26 21:00:58 +02:00			`active_cache = xrealloc(active_cache, active_alloc * sizeof(struct cache_entry *));`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`}`

			`/* Add it in.. */`
			`active_nr++;`
			`if (active_nr > pos)`
			`memmove(active_cache + pos + 1, active_cache + pos, (active_nr - pos - 1) * sizeof(ce));`
			`active_cache[pos] = ce;`
Revert bogus optimization that avoids index file writes It didn't properly mark all cache updates as being dirty, and causes merge errors due to that. In particular, it didn't notice when a file was force-removed. Besides, it was ugly as hell. I've put in place a slightly cleaner version, but I've not enabled the optimization because I don't want to be burned again. 2005-05-07 01:48:43 +02:00			`active_cache_changed = 1;`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`return 0;`
			`}`
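The blame notes above describe the return convention of `cache_name_pos()`: a non-negative index when the name is found, and `-pos-1` to say where it should go when it is not. The following is a minimal sketch of that convention on a plain sorted array of strings. The names `name_pos`, `add_name`, `names` and `MAX_NAMES` are hypothetical and are not part of git; the fixed-size array stands in for the growable `active_cache`.

```c
#include <assert.h>
#include <string.h>

#define MAX_NAMES 16 /* hypothetical fixed capacity, for illustration only */

static const char *names[MAX_NAMES];
static int nr_names;

/* Binary search: index if found, otherwise -(insertion point) - 1. */
static int name_pos(const char *name)
{
	int first = 0, last = nr_names;

	while (last > first) {
		int next = first + (last - first) / 2;
		int cmp = strcmp(name, names[next]);

		if (!cmp)
			return next;
		if (cmp < 0)
			last = next;
		else
			first = next + 1;
	}
	return -first - 1;
}

/* Replace an existing name or insert a new one, keeping the array sorted. */
static int add_name(const char *name)
{
	int pos;

	assert(name != NULL);
	pos = name_pos(name);
	if (pos >= 0) {			/* existing match: just replace it */
		names[pos] = name;
		return 0;
	}
	pos = -pos - 1;			/* decode the insertion point */
	if (nr_names == MAX_NAMES)
		return -1;
	/* Shift the tail up by one slot to open a hole at pos. */
	memmove(names + pos + 1, names + pos,
		(nr_names - pos) * sizeof(*names));
	names[pos] = name;
	nr_names++;
	return 0;
}
```

Encoding the miss as `-pos-1` rather than `-pos` keeps "not found, insert at 0" distinct from "found at 0".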

index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`static int verify_hdr(struct cache_header *hdr, unsigned long size)`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`{`
			`SHA_CTX c;`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`unsigned char sha1[20];`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`if (hdr->hdr_signature != htonl(CACHE_SIGNATURE))`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`return error("bad signature");`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`if (hdr->hdr_version != htonl(2))`
			`return error("bad index version");`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`SHA1_Init(&c);`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`SHA1_Update(&c, hdr, size - 20);`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`SHA1_Final(sha1, &c);`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`if (memcmp(sha1, (void *)hdr + size - 20, 20))`
			`return error("bad index file sha1 signature");`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`return 0;`
			`}`
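`verify_hdr()` checks a signature and version stored in network byte order, then compares a checksum of everything except the trailer against the trailer itself. The sketch below shows the same layout on a toy format. It deliberately substitutes a 32-bit FNV-1a hash for SHA-1 so that it needs no crypto library; `toy_header`, `TOY_SIGNATURE`, `toy_sum`, `toy_seal` and `toy_verify` are all hypothetical names.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_SIGNATURE 0x544f5931u	/* "TOY1", hypothetical */
#define TOY_SUM_LEN 4			/* trailer size; SHA-1 would use 20 */

struct toy_header {
	uint32_t signature;	/* network byte order */
	uint32_t version;	/* network byte order */
};

/* FNV-1a, standing in for SHA-1 purely for illustration. */
static uint32_t toy_sum(const unsigned char *buf, size_t len)
{
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 16777619u;
	}
	return h;
}

/* Append the checksum of buf[0..len) right after it; returns the new size. */
static size_t toy_seal(unsigned char *buf, size_t len)
{
	uint32_t sum;

	assert(buf != NULL);
	sum = htonl(toy_sum(buf, len));
	memcpy(buf + len, &sum, TOY_SUM_LEN);
	return len + TOY_SUM_LEN;
}

/* Returns 0 if header and trailer check out, -1 otherwise. */
static int toy_verify(const unsigned char *buf, size_t size)
{
	struct toy_header hdr;
	uint32_t stored;

	if (size < sizeof(hdr) + TOY_SUM_LEN)
		return -1;
	memcpy(&hdr, buf, sizeof(hdr));
	if (hdr.signature != htonl(TOY_SIGNATURE))
		return -1;
	if (hdr.version != htonl(2))
		return -1;
	/* The trailer covers every byte before it, header included. */
	memcpy(&stored, buf + size - TOY_SUM_LEN, TOY_SUM_LEN);
	if (ntohl(stored) != toy_sum(buf, size - TOY_SUM_LEN))
		return -1;
	return 0;
}
```

Putting the checksum at the very end means a writer can hash the data as it streams out and append the result, which is the motivation given in the "sha1 ... at the very end of the file" commit message.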

index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`static int read_index_extension(const char *ext, void *data, unsigned long sz)`
			`{`
			`switch (CACHE_EXT(ext)) {`
			`case CACHE_EXT_TREE:`
			`active_cache_tree = cache_tree_read(data, sz);`
			`break;`
			`default:`
			`if (*ext < 'A' \|\| 'Z' < *ext)`
			`return error("index uses %.4s extension, which we do not understand",`
			`ext);`
			`fprintf(stderr, "ignoring %.4s extension\n", ext);`
			`break;`
			`}`
			`return 0;`
			`}`

			`int read_cache(void)`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`{`
			`int fd, i;`
			`struct stat st;`
			`unsigned long size, offset;`
			`void *map;`
			`struct cache_header *hdr;`

			`errno = EBUSY;`
			`if (active_cache)`
[PATCH] Better error reporting for "git status" Instead of "git status" ignoring (and hiding) potential errors from the "git-update-index" call, make it exit if it fails, and show the error. In order to do this, use the "-q" flag (to ignore not-up-to-date files) and add a new "--unmerged" flag that allows unmerged entries in the index without any errors. This also avoids marking the index "changed" if an entry isn't actually modified, and makes sure that we exit with an understandable error message if the index is corrupt or unreadable. "read_cache()" no longer returns an error for the caller to check. Finally, make die() and usage() exit with recognizable error codes, if we ever want to check the failure reason in scripts. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-01 22:24:27 +02:00			`return active_nr;`

Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`errno = ENOENT;`
Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. 
This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwich the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`index_file_timestamp = 0;`
Add support for a "GIT_INDEX_FILE" environment variable. We use that to specify alternative index files, which can be useful if you want to (for example) generate a temporary index file to do some specific operation that you don't want to mess with your main one with. It defaults to the regular ".git/index" if it hasn't been specified. 2005-04-21 19:55:18 +02:00			`fd = open(get_index_file(), O_RDONLY);`
[PATCH] Better error reporting for "git status" Instead of "git status" ignoring (and hiding) potential errors from the "git-update-index" call, make it exit if it fails, and show the error. In order to do this, use the "-q" flag (to ignore not-up-to-date files) and add a new "--unmerged" flag that allows unmerged entries in the index without any errors. This also avoids marking the index "changed" if an entry isn't actually modified, and makes sure that we exit with an understandable error message if the index is corrupt or unreadable. "read_cache()" no longer returns an error for the caller to check. Finally, make die() and usage() exit with recognizable error codes, if we ever want to check the failure reason in scripts. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-01 22:24:27 +02:00			`if (fd < 0) {`
			`if (errno == ENOENT)`
			`return 0;`
			`die("index file open failed (%s)", strerror(errno));`
			`}`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
Use "-Wall -O2" for the compiler to get more warnings. And fix up the warnings that it pointed out. Let's keep the tree clean from early on. Not that the code is very beautiful anyway ;) 2005-04-08 18:59:28 +02:00			`size = 0; // avoid gcc warning`
[PATCH] mmap error handling I have reviewed all occurrences of mmap() in git and fixed three types of errors/defects: 1) The result is not checked. 2) The file descriptor is closed if mmap() succeeds, but not when it fails. 3) Various casts applied to -1 are used instead of MAP_FAILED, which is specifically defined to check mmap() return value. [jc: This is a second round of Pavel's patch. He fixed up the problem that close() potentially clobbering the errno from mmap, which the first round had.] Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-29 16:49:14 +02:00			`map = MAP_FAILED;`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`if (!fstat(fd, &st)) {`
			`size = st.st_size;`
			`errno = EINVAL;`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`if (size >= sizeof(struct cache_header) + 20)`
Allow writing to the private index file mapping. We now modify the in-memory copy of the index file in "diff-cache", so we need to add PROT_WRITE. 2005-04-27 04:27:27 +02:00			`map = mmap(NULL, size, PROT_READ \| PROT_WRITE, MAP_PRIVATE, fd, 0);`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`}`
			`close(fd);`
[PATCH] mmap error handling I have reviewed all occurrences of mmap() in git and fixed three types of errors/defects: 1) The result is not checked. 2) The file descriptor is closed if mmap() succeeds, but not when it fails. 3) Various casts applied to -1 are used instead of MAP_FAILED, which is specifically defined to check mmap() return value. [jc: This is a second round of Pavel's patch. He fixed up the problem that close() potentially clobbering the errno from mmap, which the first round had.] Signed-off-by: Pavel Roskin <proski@gnu.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-29 16:49:14 +02:00			`if (map == MAP_FAILED)`
[PATCH] Better error reporting for "git status" Instead of "git status" ignoring (and hiding) potential errors from the "git-update-index" call, make it exit if it fails, and show the error. In order to do this, use the "-q" flag (to ignore not-up-to-date files) and add a new "--unmerged" flag that allows unmerged entries in the index without any errors. This also avoids marking the index "changed" if an entry isn't actually modified, and makes sure that we exit with an understandable error message if the index is corrupt or unreadable. "read_cache()" no longer returns an error for the caller to check. Finally, make die() and usage() exit with recognizable error codes, if we ever want to check the failure reason in scripts. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-01 22:24:27 +02:00			`die("index file mmap failed (%s)", strerror(errno));`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
			`hdr = map;`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`if (verify_hdr(hdr, size) < 0)`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`goto unmap;`

Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`active_nr = ntohl(hdr->hdr_entries);`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`active_alloc = alloc_nr(active_nr);`
read-cache.c: use xcalloc() not calloc() Elsewhere we use xcalloc(); we should consistently do so. Signed-off-by: Yakov Lerner <iler.ml@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-09 18:14:00 +02:00			`active_cache = xcalloc(active_alloc, sizeof(struct cache_entry *));`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
			`offset = sizeof(*hdr);`
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`for (i = 0; i < active_nr; i++) {`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`struct cache_entry *ce = map + offset;`
			`offset = offset + ce_size(ce);`
			`active_cache[i] = ce;`
			`}`
Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. 
This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwich the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`index_file_timestamp = st.st_mtime;`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`while (offset <= size - 20 - 8) {`
			`/* After an array of active_nr index entries,`
			`* there can be arbitrary number of extended`
			`* sections, each of which is prefixed with`
			`* extension name (4-byte) and section length`
			`* in 4-byte network byte order.`
			`*/`
			`unsigned long extsize;`
			`memcpy(&extsize, map + offset + 4, 4);`
			`extsize = ntohl(extsize);`
			`if (read_index_extension(map + offset,`
			`map + offset + 8, extsize) < 0)`
			`goto unmap;`
			`offset += 8;`
			`offset += extsize;`
			`}`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`return active_nr;`

			`unmap:`
			`munmap(map, size);`
			`errno = EINVAL;`
[PATCH] Better error reporting for "git status" Instead of "git status" ignoring (and hiding) potential errors from the "git-update-index" call, make it exit if it fails, and show the error. In order to do this, use the "-q" flag (to ignore not-up-to-date files) and add a new "--unmerged" flag that allows unmerged entries in the index without any errors. This also avoids marking the index "changed" if an entry isn't actually modified, and makes sure that we exit with an understandable error message if the index is corrupt or unreadable. "read_cache()" no longer returns an error for the caller to check. Finally, make die() and usage() exit with recognizable error codes, if we ever want to check the failure reason in scripts. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-01 22:24:27 +02:00			`die("index file corrupt");`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`}`
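The loop at the end of `read_cache()` walks a sequence of sections, each prefixed with a 4-byte name and a 4-byte length in network byte order. The sketch below walks the same framing over an in-memory buffer. It is not the git reader: `put_section` and `count_sections` are hypothetical helpers, the buffer is assumed to hold only the extension region (no entries and no checksum trailer), and the length is read into a `uint32_t` so that exactly four bytes are copied whatever the width of `unsigned long`.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write one section (name, big-endian length, payload); returns its size. */
static size_t put_section(unsigned char *out, const char *name,
			  const void *data, uint32_t len)
{
	uint32_t be_len = htonl(len);

	assert(strlen(name) == 4);
	memcpy(out, name, 4);
	memcpy(out + 4, &be_len, 4);
	memcpy(out + 8, data, len);
	return 8 + (size_t)len;
}

/*
 * Count the sections in buf[0..size). Returns -1 if a section claims more
 * bytes than remain or if stray bytes are left after the last section.
 */
static int count_sections(const unsigned char *buf, size_t size)
{
	size_t offset = 0;
	int n = 0;

	while (size - offset >= 8) {
		uint32_t len;

		memcpy(&len, buf + offset + 4, 4);
		len = ntohl(len);
		offset += 8;			/* skip name and length */
		if (len > size - offset)
			return -1;		/* payload overruns the buffer */
		offset += len;			/* skip payload */
		n++;
	}
	return offset == size ? n : -1;
}
```

Checking the claimed length against the bytes that remain, before advancing, keeps a corrupt length field from moving `offset` past the end of the mapping.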

Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`#define WRITE_BUFFER_SIZE 8192`
[PATCH] Kill a bunch of pointer sign warnings for gcc4 - Raw hashes should be unsigned char. - String functions want signed char. - Hash and compress functions want unsigned char. Signed-off By: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-18 14:14:09 +02:00			`static unsigned char write_buffer[WRITE_BUFFER_SIZE];`
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`static unsigned long write_buffer_len;`

Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`static int ce_write(SHA_CTX *context, int fd, void *data, unsigned int len)`
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`{`
			`while (len) {`
			`unsigned int buffered = write_buffer_len;`
			`unsigned int partial = WRITE_BUFFER_SIZE - buffered;`
			`if (partial > len)`
			`partial = len;`
			`memcpy(write_buffer + buffered, data, partial);`
			`buffered += partial;`
			`if (buffered == WRITE_BUFFER_SIZE) {`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`SHA1_Update(context, write_buffer, WRITE_BUFFER_SIZE);`
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`if (write(fd, write_buffer, WRITE_BUFFER_SIZE) != WRITE_BUFFER_SIZE)`
			`return -1;`
			`buffered = 0;`
			`}`
			`write_buffer_len = buffered;`
			`len -= partial;`
			`data += partial;`
			`}`
			`return 0;`
			`}`
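`ce_write()` gathers small writes into a fixed buffer and only emits full chunks, which is what the "chunking it nicely" commit message is about. Here is a minimal sketch of the same buffering, assuming an in-memory sink instead of a file descriptor so that the number of emitted chunks can be observed. `CHUNK`, `emit`, `buffered_write` and `buffered_flush` are hypothetical names, and the 8-byte chunk size is chosen only to keep the example small. The sketch advances an `unsigned char *` because arithmetic on `void *` is a GNU extension rather than standard C.

```c
#include <assert.h>
#include <string.h>

#define CHUNK 8	/* tiny on purpose; the original uses 8192 */

static unsigned char chunk[CHUNK];
static size_t chunk_len;

static unsigned char sink[64];	/* stands in for the output file */
static size_t sink_len;
static int nr_flushes;

/* Append n bytes to the sink and count one "system call". */
static void emit(const unsigned char *p, size_t n)
{
	assert(sink_len + n <= sizeof(sink));
	memcpy(sink + sink_len, p, n);
	sink_len += n;
	nr_flushes++;
}

/* Copy data into the chunk buffer, emitting it whenever it fills up. */
static void buffered_write(const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len) {
		size_t partial = CHUNK - chunk_len;

		if (partial > len)
			partial = len;
		memcpy(chunk + chunk_len, p, partial);
		chunk_len += partial;
		if (chunk_len == CHUNK) {
			emit(chunk, CHUNK);
			chunk_len = 0;
		}
		p += partial;
		len -= partial;
	}
}

/* Emit whatever is left in the chunk buffer. */
static void buffered_flush(void)
{
	if (chunk_len) {
		emit(chunk, chunk_len);
		chunk_len = 0;
	}
}
```

Two writes totalling 15 bytes produce one full 8-byte chunk during the writes and one 7-byte remainder at flush time.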

index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`static int write_index_ext_header(SHA_CTX *context, int fd,`
			`unsigned long ext, unsigned long sz)`
			`{`
			`ext = htonl(ext);`
			`sz = htonl(sz);`
			`if ((ce_write(context, fd, &ext, 4) < 0) \|\|`
			`(ce_write(context, fd, &sz, 4) < 0))`
			`return -1;`
			`return 0;`
			`}`

			`static int ce_flush(SHA_CTX *context, int fd)`
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`{`
			`unsigned int left = write_buffer_len;`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`if (left) {`
			`write_buffer_len = 0;`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`SHA1_Update(context, write_buffer, left);`
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`}`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00
[PATCH] Fix buffer overflow in ce_flush(). Add a check before appending SHA1 signature to write_buffer, flush it first if necessary. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-11 15:27:47 +02:00			`/* Flush first if not enough space for SHA1 signature */`
			`if (left + 20 > WRITE_BUFFER_SIZE) {`
			`if (write(fd, write_buffer, left) != left)`
			`return -1;`
			`left = 0;`
			`}`

Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`/* Append the SHA1 signature at the end */`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`SHA1_Final(write_buffer + left, context);`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`left += 20;`
			`if (write(fd, write_buffer, left) != left)`
			`return -1;`
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`return 0;`
			`}`

Racy GIT (part #2) The previous round caught the most trivial case well, but broke down once index file is updated again. Smudge problematic entries (they should be very few if any under normal interactive workflow) before writing a new index file out. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 21:12:18 +01:00			`static void ce_smudge_racily_clean_entry(struct cache_entry *ce)`
			`{`
			`/*`
			`* The only thing we care about in this function is to smudge the`
			`* falsely clean entry due to touch-update-touch race, so we leave`
			`* everything else as they are. We are called for entries whose`
			`* ce_mtime match the index file mtime.`
			`*/`
			`struct stat st;`

			`if (lstat(ce->name, &st) < 0)`
			`return;`
			`if (ce_match_stat_basic(ce, &st))`
			`return;`
			`if (ce_modified_check_fs(ce, &st)) {`
ce_smudge_racily_clean_entry: explain why it works. This is a tricky code and warrants extra commenting. I wasted 30 minutes trying to break it until I realized why it works. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 23:18:47 +01:00			`/* This is "racily clean"; smudge it. Note that this`
			`* is tricky code. At first glance, it may appear`
			`* that it can break with this sequence:`
			`*`
			`* $ echo xyzzy >frotz`
			`* $ git-update-index --add frotz`
			`* $ : >frotz`
			`* $ sleep 3`
			`* $ echo filfre >nitfol`
			`* $ git-update-index --add nitfol`
			`*`
			`* but it does not. When the second update-index runs,`
			`* it notices that the entry "frotz" has the same timestamp`
			`* as the index, and if we were to smudge it by resetting its`
			`* size to zero here, then the object name recorded in the`
			`* index would be that of the 6-byte file while the cached size`
			`* would be zero --- which would then match what we would`
			`* obtain from the filesystem next time we stat("frotz").`
			`*`
			`* However, the second update-index, before calling`
			`* this function, notices that the cached size is 6`
			`* bytes and what is on the filesystem is an empty`
			`* file, and never calls us, so the cached size information`
			`* for "frotz" stays 6 which does not match the filesystem.`
			`*/`
Racy GIT (part #2) The previous round caught the most trivial case well, but broke down once index file is updated again. Smudge problematic entries (they should be very few if any under normal interactive workflow) before writing a new index file out. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 21:12:18 +01:00			`ce->ce_size = htonl(0);`
			`}`
			`}`
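The three checks in the function above can be modelled without touching a filesystem. The following is a simplified sketch, assuming hypothetical types `sketch_entry` and `sketch_file` that hold only an mtime, a size and the content. The real code compares more stat fields and hashes the file rather than comparing strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a cache entry: cached stat data plus content. */
struct sketch_entry {
	uint32_t mtime;
	uint32_t size;
	const char *recorded;   /* content the index recorded */
};

/* Hypothetical stand-in for what lstat() and a read would report. */
struct sketch_file {
	uint32_t mtime;
	uint32_t size;
	const char *content;
};

/* Nonzero when the cached stat data no longer matches the file. */
static int stat_differs(const struct sketch_entry *e,
			const struct sketch_file *f)
{
	return e->mtime != f->mtime || e->size != f->size;
}

/* Nonzero when the file content is not what the index recorded. */
static int content_differs(const struct sketch_entry *e,
			   const struct sketch_file *f)
{
	return strcmp(e->recorded, f->content) != 0;
}

/*
 * Smudge only entries that look clean by stat data but whose content
 * has changed. A NULL file models a failed lstat().
 */
static void sketch_smudge(struct sketch_entry *e, const struct sketch_file *f)
{
	assert(e != NULL);
	if (!f)
		return;                 /* file is gone: leave entry alone */
	if (stat_differs(e, f))
		return;                 /* already visibly dirty */
	if (content_differs(e, f))
		e->size = 0;            /* force a later stat mismatch */
}
```

In this model, an entry recorded as `"frotz\n"` whose file now holds `"xyzzy\n"` with the same mtime and size ends up with a cached size of 0, so the next stat comparison reports it as changed. An entry whose file was truncated to empty is left with its cached size of 6, which is the case the long comment walks through.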

index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`int write_cache(int newfd, struct cache_entry **cache, int entries)`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`{`
			`SHA_CTX c;`
			`struct cache_header hdr;`
[PATCH] Bugfix: read-cache.c:write_cache() misrecords number of entries. When we choose to omit deleted entries, we should subtract numbers of such entries from the total number in the header. Signed-off-by: Junio C Hamano <junkio@cox.net> Oops. Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-10 10:32:37 +02:00			`int i, removed;`

			`for (i = removed = 0; i < entries; i++)`
			`if (!cache[i]->ce_mode)`
			`removed++;`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`hdr.hdr_signature = htonl(CACHE_SIGNATURE);`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`hdr.hdr_version = htonl(2);`
[PATCH] Bugfix: read-cache.c:write_cache() misrecords number of entries. When we choose to omit deleted entries, we should subtract numbers of such entries from the total number in the header. Signed-off-by: Junio C Hamano <junkio@cox.net> Oops. Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-10 10:32:37 +02:00			`hdr.hdr_entries = htonl(entries - removed);`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00
			`SHA1_Init(&c);`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`if (ce_write(&c, newfd, &hdr, sizeof(hdr)) < 0)`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`return -1;`

			`for (i = 0; i < entries; i++) {`
			`struct cache_entry *ce = cache[i];`
git-read-tree: remove deleted files in the working directory Only when "-u" is used of course. 2005-06-10 00:34:04 +02:00			`if (!ce->ce_mode)`
			`continue;`
Racy GIT (part #2) The previous round caught the most trivial case well, but broke down once index file is updated again. Smudge problematic entries (they should be very few if any under normal interactive workflow) before writing a new index file out. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 21:12:18 +01:00			`if (index_file_timestamp &&`
			`index_file_timestamp <= ntohl(ce->ce_mtime.sec))`
			`ce_smudge_racily_clean_entry(ce);`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`if (ce_write(&c, newfd, ce, ce_size(ce)) < 0)`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`return -1;`
			`}`
read-cache/write-cache: optionally return cache checksum SHA1. read_cache_1() and write_cache_1() takes an extra parameter *sha1 that returns the checksum of the index file when non-NULL. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-24 01:52:08 +02:00
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`/* Write extension data here */`
			`if (active_cache_tree) {`
			`unsigned long sz;`
			`void *data = cache_tree_write(active_cache_tree, &sz);`
			`if (data &&`
			`!write_index_ext_header(&c, newfd, CACHE_EXT_TREE, sz) &&`
			`!ce_write(&c, newfd, data, sz))`
			`;`
			`else {`
			`free(data);`
			`return -1;`
			`}`
			`}`
			`return ce_flush(&c, newfd);`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`}`
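The header logic in `write_cache()` can be sketched in isolation: count the entries that survive (nonzero mode), then store every field in network byte order. This is an illustrative sketch only. `sketch_header`, `count_live` and `fill_header` are hypothetical names, and the signature constant is assumed to match the "DIRC" value that `cache.h` defines for `CACHE_SIGNATURE`.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror CACHE_SIGNATURE from cache.h: the bytes "DIRC". */
#define SKETCH_SIGNATURE 0x44495243u

/* Hypothetical header layout: three 32-bit fields, big-endian on disk. */
struct sketch_header {
	uint32_t signature;
	uint32_t version;
	uint32_t entries;
};

/* Entries with a zero mode are treated as removed and not counted. */
static uint32_t count_live(const unsigned *modes, size_t n)
{
	uint32_t live = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (modes[i])
			live++;
	return live;
}

/* Fill the header so its in-memory bytes are already in on-disk order. */
static void fill_header(struct sketch_header *h,
			const unsigned *modes, size_t n)
{
	assert(h != NULL);
	assert(modes != NULL || n == 0);
	h->signature = htonl(SKETCH_SIGNATURE);
	h->version = htonl(2);
	h->entries = htonl(count_live(modes, n));
}
```

Storing the count of live entries rather than the raw array length matters because the write loop skips removed entries; a header that claimed more entries than were written is the bug the "misrecords number of entries" commit message refers to.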