mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-02 15:28:21 +01:00

1196 lines

30 KiB

C

Raw Normal View History

Add copyright notices. The tool interface sucks (especially "committing" information, which is just me doing everything by hand from the command line), but I think this is in theory actually a viable way of describing the world. So copyright it. 2005-04-08 00:16:10 +02:00			`/*`
			`* GIT - The information manager from hell`
			`*`
			`* Copyright (C) Linus Torvalds, 2005`
			`*/`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`#define NO_THE_INDEX_COMPATIBILITY_MACROS`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`#include "cache.h"`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`#include "cache-tree.h"`
Teach core object handling functions about gitlinks This teaches the really fundamental core SHA1 object handling routines about gitlinks. We can compare trees with gitlinks in them (although we can not actually generate patches for them yet - just raw git diffs), and they show up as commits in "git ls-tree". We also know to compare gitlinks as if they were directories (ie the normal "sort as trees" rules apply). [jc: amended a cut&paste error] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-10 06:20:29 +02:00			`#include "refs.h"`
git-add: Add support for --refresh option. This allows to refresh only a subset of the project files, based on the specified pathspecs. Signed-off-by: Alexandre Julliard <julliard@winehq.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-11 23:59:01 +02:00			`#include "dir.h"`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00
			`/* Index extensions.`
			`*`
			`* The first letter should be 'A'..'Z' for extensions that are not`
			`* necessary for a correct operation (i.e. optimization data).`
			`* When new extensions are added that _needs_ to be understood in`
			`* order to correctly interpret the index file, pick character that`
			`* is outside the range, to cause the reader to abort.`
			`*/`

			`#define CACHE_EXT(s) ( (s[0]<<24)\|(s[1]<<16)\|(s[2]<<8)\|(s[3]) )`
			`#define CACHE_EXT_TREE 0x54524545 /* "TREE" */`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
Move index-related variables into a structure. This defines a index_state structure and moves index-related global variables into it. Currently there is one instance of it, the_index, and everybody accesses it, so there is no code change. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 03:14:06 +02:00			`struct index_state the_index;`
Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 06:32:18 +02:00
[PATCH] Implement git-checkout-cache -u to update stat information in the cache. With -u flag, git-checkout-cache picks up the stat information from newly created file and updates the cache. This removes the need to run git-update-cache --refresh immediately after running git-checkout-cache. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-15 23:23:12 +02:00			`/*`
			`* This only updates the "non-critical" parts of the directory`
			`* cache, ie the parts that aren't tracked by GIT, and only used`
			`* to validate the cache.`
			`*/`
			`void fill_stat_cache_info(struct cache_entry ce, struct stat st)`
			`{`
			`ce->ce_ctime.sec = htonl(st->st_ctime);`
			`ce->ce_mtime.sec = htonl(st->st_mtime);`
Don't care about st_dev in the index file Thomas Glanzmann points out that it doesn't work well with different clients accessing the repository over NFS - they have different views on what the "device" for the filesystem is. Of course, other filesystems may not even have stable inode numbers. But we don't care. At least for now. 2005-05-23 00:08:15 +02:00			`#ifdef USE_NSEC`
[PATCH] Implement git-checkout-cache -u to update stat information in the cache. With -u flag, git-checkout-cache picks up the stat information from newly created file and updates the cache. This removes the need to run git-update-cache --refresh immediately after running git-checkout-cache. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-15 23:23:12 +02:00			`ce->ce_ctime.nsec = htonl(st->st_ctim.tv_nsec);`
			`ce->ce_mtime.nsec = htonl(st->st_mtim.tv_nsec);`
			`#endif`
			`ce->ce_dev = htonl(st->st_dev);`
			`ce->ce_ino = htonl(st->st_ino);`
			`ce->ce_uid = htonl(st->st_uid);`
			`ce->ce_gid = htonl(st->st_gid);`
			`ce->ce_size = htonl(st->st_size);`
"Assume unchanged" git This adds "assume unchanged" logic, started by this message in the list discussion recently: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org> This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit. You can use two new options to update-index to set and reset the CE_VALID bit: git-update-index --assume-unchanged path... git-update-index --no-assume-unchanged path... These forms manipulate only the CE_VALID bit; it does not change the object name recorded in the index file. Nor they add a new entry to the index. When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after: - update-index to explicitly register the current object name to the index file. - when update-index --refresh finds the path to be up-to-date. - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file. The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings. Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability: - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety. - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-). - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified. Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path". This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there probably needs other cases we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and ready to be tested by people who asked for this feature. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-09 06:15:24 +01:00
			`if (assume_unchanged)`
			`ce->ce_flags \|= htons(CE_VALID);`
[PATCH] Implement git-checkout-cache -u to update stat information in the cache. With -u flag, git-checkout-cache picks up the stat information from newly created file and updates the cache. This removes the need to run git-update-cache --refresh immediately after running git-checkout-cache. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-15 23:23:12 +02:00			`}`

Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`static int ce_compare_data(struct cache_entry ce, struct stat st)`
			`{`
			`int match = -1;`
			`int fd = open(ce->name, O_RDONLY);`

			`if (fd >= 0) {`
			`unsigned char sha1[20];`
index_fd(): pass optional path parameter as hint for blob conversion Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-28 20:52:04 +01:00			`if (!index_fd(sha1, fd, st, 0, OBJ_BLOB, ce->name))`
Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length. Introduces global inline: hashcmp(const unsigned char sha1, const unsigned char sha2) Uses memcmp for comparison and returns the result based on the length of the hash name (a future runtime decision). Acked-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-17 20:54:57 +02:00			`match = hashcmp(sha1, ce->sha1);`
Fix double "close()" in ce_compare_data Doing an "strace" on "git diff" shows that we close() a file descriptor twice (getting EBADFD on the second one) when we end up in ce_compare_data if the index does not match the checked-out stat information. The "index_fd()" function will already have closed the fd for us, so we should not close it again. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-31 18:55:15 +02:00			`/* index_fd() closed the file descriptor already */`
Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`}`
			`return match;`
			`}`

Cast 64 bit off_t to 32 bit size_t Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4. This implies that we are able to access and work on files whose maximum length is around 2^63-1 bytes, but we can only malloc or mmap somewhat less than 2^32-1 bytes of memory. On such a system an implicit conversion of off_t to size_t can cause the size_t to wrap, resulting in unexpected and exciting behavior. Right now we are working around all gcc warnings generated by the -Wshorten-64-to-32 option by passing the off_t through xsize_t(). In the future we should make xsize_t on such problematic platforms detect the wrapping and die if such a file is accessed. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:37 +01:00			`static int ce_compare_link(struct cache_entry *ce, size_t expected_size)`
Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`{`
			`int match = -1;`
			`char *target;`
			`void *buffer;`
			`unsigned long size;`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`enum object_type type;`
Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`int len;`

			`target = xmalloc(expected_size);`
			`len = readlink(ce->name, target, expected_size);`
			`if (len != expected_size) {`
			`free(target);`
			`return -1;`
			`}`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`buffer = read_sha1_file(ce->sha1, &type, &size);`
Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`if (!buffer) {`
			`free(target);`
			`return -1;`
			`}`
			`if (size == expected_size)`
			`match = memcmp(buffer, target, size);`
			`free(buffer);`
			`free(target);`
			`return match;`
			`}`

Teach core object handling functions about gitlinks This teaches the really fundamental core SHA1 object handling routines about gitlinks. We can compare trees with gitlinks in them (although we can not actually generate patches for them yet - just raw git diffs), and they show up as commits in "git ls-tree". We also know to compare gitlinks as if they were directories (ie the normal "sort as trees" rules apply). [jc: amended a cut&paste error] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-10 06:20:29 +02:00			`static int ce_compare_gitlink(struct cache_entry *ce)`
			`{`
			`unsigned char sha1[20];`

			`/*`
			`* We don't actually require that the .git directory`
rename dirlink to gitlink. Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' \| xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-21 22:08:28 +02:00			`* under GITLINK directory be a valid git directory. It`
Teach core object handling functions about gitlinks This teaches the really fundamental core SHA1 object handling routines about gitlinks. We can compare trees with gitlinks in them (although we can not actually generate patches for them yet - just raw git diffs), and they show up as commits in "git ls-tree". We also know to compare gitlinks as if they were directories (ie the normal "sort as trees" rules apply). [jc: amended a cut&paste error] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-10 06:20:29 +02:00			`* might even be missing (in case nobody populated that`
			`* sub-project).`
			`*`
			`* If so, we consider it always to match.`
			`*/`
			`if (resolve_gitlink_ref(ce->name, "HEAD", sha1) < 0)`
			`return 0;`
			`return hashcmp(sha1, ce->sha1);`
			`}`

Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`static int ce_modified_check_fs(struct cache_entry ce, struct stat st)`
			`{`
			`switch (st->st_mode & S_IFMT) {`
			`case S_IFREG:`
			`if (ce_compare_data(ce, st))`
			`return DATA_CHANGED;`
			`break;`
			`case S_IFLNK:`
Cast 64 bit off_t to 32 bit size_t Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4. This implies that we are able to access and work on files whose maximum length is around 2^63-1 bytes, but we can only malloc or mmap somewhat less than 2^32-1 bytes of memory. On such a system an implicit conversion of off_t to size_t can cause the size_t to wrap, resulting in unexpected and exciting behavior. Right now we are working around all gcc warnings generated by the -Wshorten-64-to-32 option by passing the off_t through xsize_t(). In the future we should make xsize_t on such problematic platforms detect the wrapping and die if such a file is accessed. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:37 +01:00			`if (ce_compare_link(ce, xsize_t(st->st_size)))`
Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`return DATA_CHANGED;`
			`break;`
Fix gitlink index entry filesystem matching The code to match up index entries with the filesystem was stupidly broken. We shouldn't compare the filesystem stat() information with S_IFDIRLNK, since that's purely a git-internal value, and not what the filesystem uses (on the filesystem, it's just a regular directory). Also, don't bother to make the stat() time comparisons etc for DIRLNK entries in ce_match_stat_basic(), since we do an exact match for these things, and the hints in the stat data simply doesn't matter. This fixes "git status" with submodules that haven't been checked out in the supermodule. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-13 18:24:13 +02:00			`case S_IFDIR:`
rename dirlink to gitlink. Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' \| xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-21 22:08:28 +02:00			`if (S_ISGITLINK(ntohl(ce->ce_mode)))`
Fix gitlink index entry filesystem matching The code to match up index entries with the filesystem was stupidly broken. We shouldn't compare the filesystem stat() information with S_IFDIRLNK, since that's purely a git-internal value, and not what the filesystem uses (on the filesystem, it's just a regular directory). Also, don't bother to make the stat() time comparisons etc for DIRLNK entries in ce_match_stat_basic(), since we do an exact match for these things, and the hints in the stat data simply doesn't matter. This fixes "git status" with submodules that haven't been checked out in the supermodule. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-13 18:24:13 +02:00			`return 0;`
Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`default:`
			`return TYPE_CHANGED;`
			`}`
			`return 0;`
			`}`

Racy GIT (part #2) The previous round caught the most trivial case well, but broke down once index file is updated again. Smudge problematic entries (they should be very few if any under normal interactive workflow) before writing a new index file out. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 21:12:18 +01:00			`static int ce_match_stat_basic(struct cache_entry ce, struct stat st)`
Make the cache stat information comparator public. Like the cache filename finder, it's a generically useful function, rather than something specific to the current "show-diff" thing. 2005-04-09 18:48:20 +02:00			`{`
			`unsigned int changed = 0;`

[PATCH] git and symlinks as tracked content Allow to store and track symlink in the repository. A symlink is stored the same way as a regular file, only with the appropriate mode bits set. The symlink target is therefore stored in a blob object. This will hopefully make our udev repository fully functional. :) Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-05 14:38:25 +02:00			`switch (ntohl(ce->ce_mode) & S_IFMT) {`
			`case S_IFREG:`
			`changed \|= !S_ISREG(st->st_mode) ? TYPE_CHANGED : 0;`
Use core.filemode. With "[core] filemode = false", you can tell git to ignore differences in the working tree file only in executable bit. * "git-update-index --refresh" does not say "needs update" if index entry and working tree file differs only in executable bit. * "git-update-index" on an existing path takes executable bit from the existing index entry, if the path and index entry are both regular files. * "git-diff-files" and "git-diff-index" without --cached flag pretend the path on the filesystem has the same executable bit as the existing index entry, if the path and index entry are both regular files. If you are on a filesystem with unreliable mode bits, you may need to force the executable bit after registering the path in the index. * "git-update-index --chmod=+x foo" flips the executable bit of the index file entry for path "foo" on. Use "--chmod=-x" to flip it off. Note that --chmod only works in index file and does not look at nor update the working tree. So if you are on a filesystem and do not have working executable bit, you would do: 1. set the appropriate .git/config option; 2. "git-update-index --add new-file.c" 3. "git-ls-files --stage new-file.c" to see if it has the desired mode bits. If not, e.g. to drop executable bit picked up from the filesystem, say "git-update-index --chmod=-x new-file.c". Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-12 03:45:33 +02:00			`/* We consider only the owner x bit to be relevant for`
			`* "mode changes"`
			`*/`
			`if (trust_executable_bit &&`
			`(0100 & (ntohl(ce->ce_mode) ^ st->st_mode)))`
[PATCH] fix compare symlink against readlink not data Fix update-cache to compare the blob of a symlink against the link-target and not the file it points to. Also ignore all permissions applied to links. Thanks to Greg for recognizing this while he added our list of symlinks back to the udev repository. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-06 15:45:01 +02:00			`changed \|= MODE_CHANGED;`
[PATCH] git and symlinks as tracked content Allow to store and track symlink in the repository. A symlink is stored the same way as a regular file, only with the appropriate mode bits set. The symlink target is therefore stored in a blob object. This will hopefully make our udev repository fully functional. :) Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-05 14:38:25 +02:00			`break;`
			`case S_IFLNK:`
Add core.symlinks to mark filesystems that do not support symbolic links. Some file systems that can host git repositories and their working copies do not support symbolic links. But then if the repository contains a symbolic link, it is impossible to check out the working copy. This patch enables partial support of symbolic links so that it is possible to check out a working copy on such a file system. A new flag core.symlinks (which is true by default) can be set to false to indicate that the filesystem does not support symbolic links. In this case, symbolic links that exist in the trees are checked out as small plain files, and checking in modifications of these files preserve the symlink property in the database (as long as an entry exists in the index). Of course, this does not magically make symbolic links work on such defective file systems; hence, this solution does not help if the working copy relies on that an entry is a real symbolic link. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-02 22:11:30 +01:00			`if (!S_ISLNK(st->st_mode) &&`
			`(has_symlinks \|\| !S_ISREG(st->st_mode)))`
			`changed \|= TYPE_CHANGED;`
[PATCH] git and symlinks as tracked content Allow to store and track symlink in the repository. A symlink is stored the same way as a regular file, only with the appropriate mode bits set. The symlink target is therefore stored in a blob object. This will hopefully make our udev repository fully functional. :) Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-05 14:38:25 +02:00			`break;`
rename dirlink to gitlink. Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' \| xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-21 22:08:28 +02:00			`case S_IFGITLINK:`
Teach core object handling functions about gitlinks This teaches the really fundamental core SHA1 object handling routines about gitlinks. We can compare trees with gitlinks in them (although we can not actually generate patches for them yet - just raw git diffs), and they show up as commits in "git ls-tree". We also know to compare gitlinks as if they were directories (ie the normal "sort as trees" rules apply). [jc: amended a cut&paste error] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-10 06:20:29 +02:00			`if (!S_ISDIR(st->st_mode))`
			`changed \|= TYPE_CHANGED;`
			`else if (ce_compare_gitlink(ce))`
			`changed \|= DATA_CHANGED;`
Fix gitlink index entry filesystem matching The code to match up index entries with the filesystem was stupidly broken. We shouldn't compare the filesystem stat() information with S_IFDIRLNK, since that's purely a git-internal value, and not what the filesystem uses (on the filesystem, it's just a regular directory). Also, don't bother to make the stat() time comparisons etc for DIRLNK entries in ce_match_stat_basic(), since we do an exact match for these things, and the hints in the stat data simply doesn't matter. This fixes "git status" with submodules that haven't been checked out in the supermodule. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-13 18:24:13 +02:00			`return changed;`
git-blame shouldn't crash if run in an unmerged tree If we are in the middle of resolving a merge conflict there may be one or more files whose entries in the index represent an unmerged state (index entries in the higher-order stages). Attempting to run git-blame on any file in such a working directory resulted in "fatal: internal error: ce_mode is 0" as we use the magic marker for an unmerged entry is 0 (set up by things like diff-lib.c's do_diff_cache() and builtin-read-tree.c's read_tree_unmerged()) and the ce_match_stat_basic() function gets upset about this. I'm not entirely sure that the whole "ce_mode = 0" case is a good idea to begin with, and maybe the right thing to do is to remove that horrid freakish special case, but removing the internal error seems to be the simplest fix for now. Linus [sp: Thanks to Björn Steinbrink for the test case] Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-18 08:31:30 +02:00			`case 0: /* Special case: unmerged file in index */`
			`return MODE_CHANGED \| DATA_CHANGED \| TYPE_CHANGED;`
[PATCH] git and symlinks as tracked content Allow to store and track symlink in the repository. A symlink is stored the same way as a regular file, only with the appropriate mode bits set. The symlink target is therefore stored in a blob object. This will hopefully make our udev repository fully functional. :) Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-05 14:38:25 +02:00			`default:`
			`die("internal error: ce_mode is %o", ntohl(ce->ce_mode));`
			`}`
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`if (ce->ce_mtime.sec != htonl(st->st_mtime))`
Make the cache stat information comparator public. Like the cache filename finder, it's a generically useful function, rather than something specific to the current "show-diff" thing. 2005-04-09 18:48:20 +02:00			`changed \|= MTIME_CHANGED;`
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`if (ce->ce_ctime.sec != htonl(st->st_ctime))`
			`changed \|= CTIME_CHANGED;`

Don't care about st_dev in the index file Thomas Glanzmann points out that it doesn't work well with different clients accessing the repository over NFS - they have different views on what the "device" for the filesystem is. Of course, other filesystems may not even have stable inode numbers. But we don't care. At least for now. 2005-05-23 00:08:15 +02:00			`#ifdef USE_NSEC`
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`/*`
			`* nsec seems unreliable - not all filesystems support it, so`
			`* as long as it is in the inode cache you get right nsec`
			`* but after it gets flushed, you get zero nsec.`
			`*/`
Fix NSEC compile problem, and properly parse the rev-tree cmd line. The rev-tree thing just happened to work. It shouldn't have. 2005-04-21 18:58:24 +02:00			`if (ce->ce_mtime.nsec != htonl(st->st_mtim.tv_nsec))`
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`changed \|= MTIME_CHANGED;`
Fix NSEC compile problem, and properly parse the rev-tree cmd line. The rev-tree thing just happened to work. It shouldn't have. 2005-04-21 18:58:24 +02:00			`if (ce->ce_ctime.nsec != htonl(st->st_ctim.tv_nsec))`
Make the cache stat information comparator public. Like the cache filename finder, it's a generically useful function, rather than something specific to the current "show-diff" thing. 2005-04-09 18:48:20 +02:00			`changed \|= CTIME_CHANGED;`
War on whitespace This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-07 09:04:01 +02:00			`#endif`
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00
			`if (ce->ce_uid != htonl(st->st_uid) \|\|`
			`ce->ce_gid != htonl(st->st_gid))`
Make the cache stat information comparator public. Like the cache filename finder, it's a generically useful function, rather than something specific to the current "show-diff" thing. 2005-04-09 18:48:20 +02:00			`changed \|= OWNER_CHANGED;`
Don't care about st_dev in the index file Thomas Glanzmann points out that it doesn't work well with different clients accessing the repository over NFS - they have different views on what the "device" for the filesystem is. Of course, other filesystems may not even have stable inode numbers. But we don't care. At least for now. 2005-05-23 00:08:15 +02:00			`if (ce->ce_ino != htonl(st->st_ino))`
Make the cache stat information comparator public. Like the cache filename finder, it's a generically useful function, rather than something specific to the current "show-diff" thing. 2005-04-09 18:48:20 +02:00			`changed \|= INODE_CHANGED;`
Don't care about st_dev in the index file Thomas Glanzmann points out that it doesn't work well with different clients accessing the repository over NFS - they have different views on what the "device" for the filesystem is. Of course, other filesystems may not even have stable inode numbers. But we don't care. At least for now. 2005-05-23 00:08:15 +02:00
			`#ifdef USE_STDEV`
			`/*`
			`* st_dev breaks on network filesystems where different`
			`* clients will have different views of what "device"`
			`* the filesystem is on`
			`*/`
			`if (ce->ce_dev != htonl(st->st_dev))`
			`changed \|= INODE_CHANGED;`
			`#endif`

Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`if (ce->ce_size != htonl(st->st_size))`
Make the cache stat information comparator public. Like the cache filename finder, it's a generically useful function, rather than something specific to the current "show-diff" thing. 2005-04-09 18:48:20 +02:00			`changed \|= DATA_CHANGED;`
Show modified files in git-ls-files Add -m/--modified to show files that have been modified wrt. the index. [jc: The original came from Brian Gerst on Sep 1st but it only checked if the paths were cache dirty without actually checking the files were modified. I also added the usage string and a new test.] Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 00:11:15 +02:00
Racy GIT (part #2) The previous round caught the most trivial case well, but broke down once index file is updated again. Smudge problematic entries (they should be very few if any under normal interactive workflow) before writing a new index file out. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 21:12:18 +01:00			`return changed;`
			`}`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`int ie_match_stat(struct index_state *istate,`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`struct cache_entry ce, struct stat st,`
			`unsigned int options)`
Racy GIT (part #2) The previous round caught the most trivial case well, but broke down once index file is updated again. Smudge problematic entries (they should be very few if any under normal interactive workflow) before writing a new index file out. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 21:12:18 +01:00			`{`
"Assume unchanged" git This adds "assume unchanged" logic, started by this message in the list discussion recently: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org> This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit. You can use two new options to update-index to set and reset the CE_VALID bit: git-update-index --assume-unchanged path... git-update-index --no-assume-unchanged path... These forms manipulate only the CE_VALID bit; it does not change the object name recorded in the index file. Nor they add a new entry to the index. When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after: - update-index to explicitly register the current object name to the index file. - when update-index --refresh finds the path to be up-to-date. - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file. The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings. Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability: - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety. - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-). - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified. Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path". This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there probably needs other cases we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and ready to be tested by people who asked for this feature. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-09 06:15:24 +01:00			`unsigned int changed;`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`int ignore_valid = options & CE_MATCH_IGNORE_VALID;`
			`int assume_racy_is_modified = options & CE_MATCH_RACY_IS_DIRTY;`
"Assume unchanged" git This adds "assume unchanged" logic, started by this message in the list discussion recently: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org> This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit. You can use two new options to update-index to set and reset the CE_VALID bit: git-update-index --assume-unchanged path... git-update-index --no-assume-unchanged path... These forms manipulate only the CE_VALID bit; it does not change the object name recorded in the index file. Nor they add a new entry to the index. When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after: - update-index to explicitly register the current object name to the index file. - when update-index --refresh finds the path to be up-to-date. - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file. The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings. Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability: - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety. - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-). - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified. Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path". This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there probably needs other cases we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and ready to be tested by people who asked for this feature. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-09 06:15:24 +01:00
			`/*`
			`* If it's marked as always valid in the index, it's`
			`* valid whatever the checked-out copy says.`
			`*/`
			`if (!ignore_valid && (ce->ce_flags & htons(CE_VALID)))`
			`return 0;`

			`changed = ce_match_stat_basic(ce, st);`
Racy GIT (part #2) The previous round caught the most trivial case well, but broke down once index file is updated again. Smudge problematic entries (they should be very few if any under normal interactive workflow) before writing a new index file out. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 21:12:18 +01:00
Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`/*`
			`* Within 1 second of this sequence:`
			`* echo xyzzy >file && git-update-index --add file`
			`* running this command:`
			`* echo frotz >file`
			`* would give a falsely clean cache entry. The mtime and`
			`* length match the cache, and other stat fields do not change.`
			`*`
			`* We could detect this at update-index time (the cache entry`
			`* being registered/updated records the same time as "now")`
			`* and delay the return from git-update-index, but that would`
			`* effectively mean we can make at most one commit per second,`
			`* which is not acceptable. Instead, we check cache entries`
			`* whose mtime are the same as the index file timestamp more`
"Assume unchanged" git This adds "assume unchanged" logic, started by this message in the list discussion recently: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org> This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit. You can use two new options to update-index to set and reset the CE_VALID bit: git-update-index --assume-unchanged path... git-update-index --no-assume-unchanged path... These forms manipulate only the CE_VALID bit; it does not change the object name recorded in the index file. Nor they add a new entry to the index. When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after: - update-index to explicitly register the current object name to the index file. - when update-index --refresh finds the path to be up-to-date. - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file. The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings. Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability: - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety. - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-). - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified. Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path". This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there probably needs other cases we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and ready to be tested by people who asked for this feature. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-09 06:15:24 +01:00			`* carefully than others.`
Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`*/`
			`if (!changed &&`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`istate->timestamp &&`
			`istate->timestamp <= ntohl(ce->ce_mtime.sec)) {`
Add check program "git-check-racy" This will help counting the racily clean paths, but it should be useless for daily use. Do not even enable it in the makefile. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-16 06:38:07 +02:00			`if (assume_racy_is_modified)`
			`changed \|= DATA_CHANGED;`
			`else`
			`changed \|= ce_modified_check_fs(ce, st);`
			`}`
Show modified files in git-ls-files Add -m/--modified to show files that have been modified wrt. the index. [jc: The original came from Brian Gerst on Sep 1st but it only checked if the paths were cache dirty without actually checking the files were modified. I also added the usage string and a new test.] Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 00:11:15 +02:00
Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`return changed;`
Show modified files in git-ls-files Add -m/--modified to show files that have been modified wrt. the index. [jc: The original came from Brian Gerst on Sep 1st but it only checked if the paths were cache dirty without actually checking the files were modified. I also added the usage string and a new test.] Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 00:11:15 +02:00			`}`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`int ie_modified(struct index_state *istate,`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`struct cache_entry ce, struct stat st, unsigned int options)`
Show modified files in git-ls-files Add -m/--modified to show files that have been modified wrt. the index. [jc: The original came from Brian Gerst on Sep 1st but it only checked if the paths were cache dirty without actually checking the files were modified. I also added the usage string and a new test.] Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 00:11:15 +02:00			`{`
Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`int changed, changed_fs;`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00
			`changed = ie_match_stat(istate, ce, st, options);`
Show modified files in git-ls-files Add -m/--modified to show files that have been modified wrt. the index. [jc: The original came from Brian Gerst on Sep 1st but it only checked if the paths were cache dirty without actually checking the files were modified. I also added the usage string and a new test.] Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 00:11:15 +02:00			`if (!changed)`
			`return 0;`
			`/*`
			`* If the mode or type has changed, there's no point in trying`
			`* to refresh the entry - it's not going to match`
			`*/`
			`if (changed & (MODE_CHANGED \| TYPE_CHANGED))`
			`return changed;`

			`/* Immediately after read-tree or update-index --cacheinfo,`
			`* the length field is zero. For other cases the ce_size`
			`* should match the SHA1 recorded in the index entry.`
			`*/`
			`if ((changed & DATA_CHANGED) && ce->ce_size != htonl(0))`
			`return changed;`

Racy GIT This fixes the longstanding "Racy GIT" problem, which was pretty much there from the beginning of time, but was first demonstrated by Pasky in this message on October 24, 2005: http://marc.theaimsgroup.com/?l=git&m=113014629716878 If you run the following sequence of commands: echo frotz >infocom git update-index --add infocom echo xyzzy >infocom so that the second update to file "infocom" does not change st_mtime, what is recorded as the stat information for the cache entry "infocom" exactly matches what is on the filesystem (owner, group, inum, mtime, ctime, mode, length). After this sequence, we incorrectly think "infocom" file still has string "frotz" in it, and get really confused. E.g. git-diff-files would say there is no change, git-update-index --refresh would not even look at the filesystem to correct the situation. Some ways of working around this issue were already suggested by Linus in the same thread on the same day, including waiting until the next second before returning from update-index if a cache entry written out has the current timestamp, but that means we can make at most one commit per second, and given that the e-mail patch workflow used by Linus needs to process at least 5 commits per second, it is not an acceptable solution. Linus notes that git-apply is primarily used to update the index while processing e-mailed patches, which is true, and git-apply's up-to-date check is fooled by the same problem but luckily in the other direction, so it is not really a big issue, but still it is disturbing. The function ce_match_stat() is called to bypass the comparison against filesystem data when the stat data recorded in the cache entry matches what stat() returns from the filesystem. This patch tackles the problem by changing it to actually go to the filesystem data for cache entries that have the same mtime as the index file itself. This works as long as the index file and working tree files are on the filesystems that share the same monotonic clock. Files on network mounted filesystems sometimes get skewed timestamps compared to "date" output, but as long as working tree files' timestamps are skewed the same way as the index file's, this approach still works. The only problematic files are the ones that have the same timestamp as the index file's, because two file updates that sandwitch the index file update must happen within the same second to trigger the problem. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 09:02:15 +01:00			`changed_fs = ce_modified_check_fs(ce, st);`
			`if (changed_fs)`
			`return changed \| changed_fs;`
Show modified files in git-ls-files Add -m/--modified to show files that have been modified wrt. the index. [jc: The original came from Brian Gerst on Sep 1st but it only checked if the paths were cache dirty without actually checking the files were modified. I also added the usage string and a new test.] Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 00:11:15 +02:00			`return 0;`
			`}`

Introduce "base_name_compare()" helper function This one compares two pathnames that may be partial basenames, not full paths. We need to get the path sorting right, since a directory name will sort as if it had the final '/' at the end. 2005-05-20 18:09:18 +02:00			`int base_name_compare(const char *name1, int len1, int mode1,`
			`const char *name2, int len2, int mode2)`
			`{`
			`unsigned char c1, c2;`
			`int len = len1 < len2 ? len1 : len2;`
			`int cmp;`

			`cmp = memcmp(name1, name2, len);`
			`if (cmp)`
			`return cmp;`
			`c1 = name1[len];`
			`c2 = name2[len];`
Fix thinko in subproject entry sorting This fixes a total thinko in my original series: subprojects do not sort like directories, because the index is sorted purely by full pathname, and since a subproject shows up in the index as a normal NUL-terminated string, it never has the issues with sorting with the '/' at the end. So if you have a subproject "proj" and a file "proj.c", the subproject sorts alphabetically before the file in the index (and must thus also sort that way in a tree object, since trees sort as the index). In contrast, it you have two files "proj/file" and "proj.c", the "proj.c" will sort alphabetically before "proj/file" in the index. The index itself, of course, does not actually contain an entry "proj/", but in the tree that gets written out, the tree entry "proj" will sort after the file entry "proj.c", which is the only real magic sorting rule. In other words: the magic sorting rule only affects tree entries, and only affects tree entries that point to other trees (ie are of the type S_IFDIR). Anyway, that thinko just means that we should remove the special case to make S_ISDIRLNK entries sort like S_ISDIR entries. They don't. They sort like normal files. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:39:12 +02:00			`if (!c1 && S_ISDIR(mode1))`
Introduce "base_name_compare()" helper function This one compares two pathnames that may be partial basenames, not full paths. We need to get the path sorting right, since a directory name will sort as if it had the final '/' at the end. 2005-05-20 18:09:18 +02:00			`c1 = '/';`
Fix thinko in subproject entry sorting This fixes a total thinko in my original series: subprojects do not sort like directories, because the index is sorted purely by full pathname, and since a subproject shows up in the index as a normal NUL-terminated string, it never has the issues with sorting with the '/' at the end. So if you have a subproject "proj" and a file "proj.c", the subproject sorts alphabetically before the file in the index (and must thus also sort that way in a tree object, since trees sort as the index). In contrast, it you have two files "proj/file" and "proj.c", the "proj.c" will sort alphabetically before "proj/file" in the index. The index itself, of course, does not actually contain an entry "proj/", but in the tree that gets written out, the tree entry "proj" will sort after the file entry "proj.c", which is the only real magic sorting rule. In other words: the magic sorting rule only affects tree entries, and only affects tree entries that point to other trees (ie are of the type S_IFDIR). Anyway, that thinko just means that we should remove the special case to make S_ISDIRLNK entries sort like S_ISDIR entries. They don't. They sort like normal files. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:39:12 +02:00			`if (!c2 && S_ISDIR(mode2))`
Introduce "base_name_compare()" helper function This one compares two pathnames that may be partial basenames, not full paths. We need to get the path sorting right, since a directory name will sort as if it had the final '/' at the end. 2005-05-20 18:09:18 +02:00			`c2 = '/';`
			`return (c1 < c2) ? -1 : (c1 > c2) ? 1 : 0;`
			`}`

Make cache entry comparison take the new "state" flag into account. This is what allows us to have multiple states of the same file in the index, and what makes it always sort correctly. 2005-04-16 07:51:44 +02:00			`int cache_name_compare(const char name1, int flags1, const char name2, int flags2)`
Make "cache_name_pos()" available to others. It finds the cache entry position for a given name, and is generally useful. Sure, everybody can just scan the active cache array, but since it's sorted, you actually want to search it with a binary search, so let's not duplicate that logic all over the place. 2005-04-09 18:26:55 +02:00			`{`
Make cache entry comparison take the new "state" flag into account. This is what allows us to have multiple states of the same file in the index, and what makes it always sort correctly. 2005-04-16 07:51:44 +02:00			`int len1 = flags1 & CE_NAMEMASK;`
			`int len2 = flags2 & CE_NAMEMASK;`
Make "cache_name_pos()" available to others. It finds the cache entry position for a given name, and is generally useful. Sure, everybody can just scan the active cache array, but since it's sorted, you actually want to search it with a binary search, so let's not duplicate that logic all over the place. 2005-04-09 18:26:55 +02:00			`int len = len1 < len2 ? len1 : len2;`
			`int cmp;`

			`cmp = memcmp(name1, name2, len);`
			`if (cmp)`
			`return cmp;`
			`if (len1 < len2)`
			`return -1;`
			`if (len1 > len2)`
			`return 1;`
"Assume unchanged" git This adds "assume unchanged" logic, started by this message in the list discussion recently: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org> This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit. You can use two new options to update-index to set and reset the CE_VALID bit: git-update-index --assume-unchanged path... git-update-index --no-assume-unchanged path... These forms manipulate only the CE_VALID bit; it does not change the object name recorded in the index file. Nor they add a new entry to the index. When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after: - update-index to explicitly register the current object name to the index file. - when update-index --refresh finds the path to be up-to-date. - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file. The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings. Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability: - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety. - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-). - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified. Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path". This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there probably needs other cases we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and ready to be tested by people who asked for this feature. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-09 06:15:24 +01:00
cache_name_compare() compares name and stage, nothing else. The code was a bit unclear in expressing what it wants to compare. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-13 08:46:25 +01:00			`/* Compare stages */`
			`flags1 &= CE_STAGEMASK;`
			`flags2 &= CE_STAGEMASK;`
"Assume unchanged" git This adds "assume unchanged" logic, started by this message in the list discussion recently: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org> This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit. You can use two new options to update-index to set and reset the CE_VALID bit: git-update-index --assume-unchanged path... git-update-index --no-assume-unchanged path... These forms manipulate only the CE_VALID bit; it does not change the object name recorded in the index file. Nor they add a new entry to the index. When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after: - update-index to explicitly register the current object name to the index file. - when update-index --refresh finds the path to be up-to-date. - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file. The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings. Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability: - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety. - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-). - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified. Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path". This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there probably needs other cases we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and ready to be tested by people who asked for this feature. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-09 06:15:24 +01:00
Make cache entry comparison take the new "state" flag into account. This is what allows us to have multiple states of the same file in the index, and what makes it always sort correctly. 2005-04-16 07:51:44 +02:00			`if (flags1 < flags2)`
			`return -1;`
			`if (flags1 > flags2)`
			`return 1;`
Make "cache_name_pos()" available to others. It finds the cache entry position for a given name, and is generally useful. Sure, everybody can just scan the active cache array, but since it's sorted, you actually want to search it with a binary search, so let's not duplicate that logic all over the place. 2005-04-09 18:26:55 +02:00			`return 0;`
			`}`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`int index_name_pos(struct index_state istate, const char name, int namelen)`
Make "cache_name_pos()" available to others. It finds the cache entry position for a given name, and is generally useful. Sure, everybody can just scan the active cache array, but since it's sorted, you actually want to search it with a binary search, so let's not duplicate that logic all over the place. 2005-04-09 18:26:55 +02:00			`{`
			`int first, last;`

			`first = 0;`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`last = istate->cache_nr;`
Make "cache_name_pos()" available to others. It finds the cache entry position for a given name, and is generally useful. Sure, everybody can just scan the active cache array, but since it's sorted, you actually want to search it with a binary search, so let's not duplicate that logic all over the place. 2005-04-09 18:26:55 +02:00			`while (last > first) {`
			`int next = (last + first) >> 1;`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`struct cache_entry *ce = istate->cache[next];`
[PATCH] Use ntohs instead of htons to convert ce_flags to host byte order Use ntohs instead of htons to convert ce_flags to host byte order Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-07 22:35:56 +02:00			`int cmp = cache_name_compare(name, namelen, ce->name, ntohs(ce->ce_flags));`
Make "cache_name_pos()" available to others. It finds the cache entry position for a given name, and is generally useful. Sure, everybody can just scan the active cache array, but since it's sorted, you actually want to search it with a binary search, so let's not duplicate that logic all over the place. 2005-04-09 18:26:55 +02:00			`if (!cmp)`
Fix off-by-one error in removal of cache entry. Also make the return value of "cache_name_pos()" be sane: positive or zero if we found it (it's the index into the cache array), and "-pos-1" to indicate where it should go if we didn't. 2005-04-11 07:06:50 +02:00			`return next;`
Make "cache_name_pos()" available to others. It finds the cache entry position for a given name, and is generally useful. Sure, everybody can just scan the active cache array, but since it's sorted, you actually want to search it with a binary search, so let's not duplicate that logic all over the place. 2005-04-09 18:26:55 +02:00			`if (cmp < 0) {`
			`last = next;`
			`continue;`
			`}`
			`first = next+1;`
			`}`
Fix off-by-one error in removal of cache entry. Also make the return value of "cache_name_pos()" be sane: positive or zero if we found it (it's the index into the cache array), and "-pos-1" to indicate where it should go if we didn't. 2005-04-11 07:06:50 +02:00			`return -first-1;`
Make "cache_name_pos()" available to others. It finds the cache entry position for a given name, and is generally useful. Sure, everybody can just scan the active cache array, but since it's sorted, you actually want to search it with a binary search, so let's not duplicate that logic all over the place. 2005-04-09 18:26:55 +02:00			`}`

When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`/* Remove entry, return true if there are more entries to go.. */`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`int remove_index_entry_at(struct index_state *istate, int pos)`
When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`{`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`istate->cache_changed = 1;`
			`istate->cache_nr--;`
			`if (pos >= istate->cache_nr)`
When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`return 0;`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`memmove(istate->cache + pos,`
			`istate->cache + pos + 1,`
			`(istate->cache_nr - pos) * sizeof(struct cache_entry *));`
When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`return 1;`
			`}`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`int remove_file_from_index(struct index_state istate, const char path)`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`{`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`int pos = index_name_pos(istate, path, strlen(path));`
[PATCH] update-cache --remove marks the path merged. When update-cache --remove is run, resolve unmerged state for the path. This is consistent with the update-cache --add behaviour. Essentially, the user is telling us how he wants to resolve the merge by running update-cache. Signed-off-by: Junio C Hamano <junkio@cox.net> Fixed to do the right thing at the end. Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-17 18:53:35 +02:00			`if (pos < 0)`
			`pos = -pos-1;`
Simplify cache API Earlier, add_file_to_index() invalidated the path in the cache-tree but remove_file_from_cache() did not, and the user of the latter needed to invalidate the entry himself. This led to a few bugs due to missed invalidate calls already. This patch makes the management of cache-tree less error prone by making more invalidate calls from lower level cache API functions. The rules are: - If you are going to write the index, you should either maintain cache_tree correctly. - If you cannot, alternatively you can remove the entire cache_tree by calling cache_tree_free() before you call write_cache(). - When you modify the index, cache_tree_invalidate_path() should be called with the path you are modifying, to discard the entry from the cache-tree structure. - The following cache API functions exported from read-cache.c (and the macro whose names have "cache" instead of "index") automatically call cache_tree_invalidate_path() for you: - remove_file_from_index(); - add_file_to_index(); - add_index_entry(); You can modify the index bypassing the above API functions (e.g. find an existing cache entry from the index and modify it in place). You need to call cache_tree_invalidate_path() yourself in such a case. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-14 05:33:11 +02:00			`cache_tree_invalidate_path(istate->cache_tree, path);`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`while (pos < istate->cache_nr && !strcmp(istate->cache[pos]->name, path))`
			`remove_index_entry_at(istate, pos);`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`return 0;`
			`}`

git add: respect core.filemode with unmerged entries When a merge left unmerged entries, git add failed to pick up the file mode from the index, when core.filemode == 0. If more than one unmerged entry is there, the order of stage preference is 2, 1, 3. Noticed by Johannes Sixt. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-29 19:32:46 +02:00			`static int compare_name(struct cache_entry ce, const char path, int namelen)`
			`{`
			`return namelen != ce_namelen(ce) \|\| memcmp(path, ce->name, namelen);`
			`}`

			`static int index_name_pos_also_unmerged(struct index_state *istate,`
			`const char *path, int namelen)`
			`{`
			`int pos = index_name_pos(istate, path, namelen);`
			`struct cache_entry *ce;`

			`if (pos >= 0)`
			`return pos;`

			`/* maybe unmerged? */`
			`pos = -1 - pos;`
			`if (pos >= istate->cache_nr \|\|`
			`compare_name((ce = istate->cache[pos]), path, namelen))`
			`return -1;`

			`/* order of preference: stage 2, 1, 3 */`
			`if (ce_stage(ce) == 1 && pos + 1 < istate->cache_nr &&`
			`ce_stage((ce = istate->cache[pos + 1])) == 2 &&`
			`!compare_name(ce, path, namelen))`
			`pos++;`
			`return pos;`
			`}`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`int add_file_to_index(struct index_state istate, const char path, int verbose)`
Make git-mv a builtin This also moves add_file_to_index() to read-cache.c. Oh, and while touching builtin-add.c, it also removes a duplicate git_config() call. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 03:52:35 +02:00			`{`
add_file_to_index: skip rehashing if the cached stat already matches An earlier commit 366bfcb6 broke git-add by moving read_cache() call down, because it wanted the directory walking code to grab paths that are already in the index. The change serves its purpose, but introduces a regression because the responsibility of avoiding unnecessary reindexing by matching the cached stat is shifted nowhere. This makes it the job of add_file_to_index() function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-31 02:12:58 +02:00			`int size, namelen, pos;`
Make git-mv a builtin This also moves add_file_to_index() to read-cache.c. Oh, and while touching builtin-add.c, it also removes a duplicate git_config() call. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 03:52:35 +02:00			`struct stat st;`
			`struct cache_entry *ce;`
git-add: make the entry stat-clean after re-adding the same contents Earlier in commit 0781b8a9b2fe760fc4ed519a3a26e4b9bd6ccffe (add_file_to_index: skip rehashing if the cached stat already matches), add_file_to_index() were taught not to re-add the path if it already matches the index. The change meant well, but was not executed quite right. It used ie_modified() to see if the file on the work tree is really different from the index, and skipped adding the contents if the function says "not modified". This was wrong. There are three possible comparison results between the index and the file in the work tree: - with lstat(2) we _know_ they are different. E.g. if the length or the owner in the cached stat information is different from the length we just obtained from lstat(2), we can tell the file is modified without looking at the actual contents. - with lstat(2) we _know_ they are the same. The same length, the same owner, the same everything (but this has a twist, as described below). - we cannot tell from lstat(2) information alone and need to go to the filesystem to actually compare. The last case arises from what we call 'racy git' situation, that can be caused with this sequence: $ echo hello >file $ git add file $ echo aeiou >file ;# the same length If the second "echo" is done within the same filesystem timestamp granularity as the first "echo", then the timestamp recorded by "git add" and the timestamp we get from lstat(2) will be the same, and we can mistakenly say the file is not modified. The path is called 'racily clean'. We need to reliably detect racily clean paths are in fact modified. To solve this problem, when we write out the index, we mark the index entry that has the same timestamp as the index file itself (that is the time from the point of view of the filesystem) to tell any later code that does the lstat(2) comparison not to trust the cached stat info, and ie_modified() then actually goes to the filesystem to compare the contents for such a path. That's all good, but it should not be used for this "git add" optimization, as the goal of "git add" is to actually update the path in the index and make it stat-clean. With the false optimization, we did _not_ cause any data loss (after all, what we failed to do was only to update the cached stat information), but it made the following sequence leave the file stat dirty: $ echo hello >file $ git add file $ echo hello >file ;# the same contents $ git add file The solution is not to use ie_modified() which goes to the filesystem to see if it is really clean, but instead use ie_match_stat() with "assume racily clean paths are dirty" option, to force re-adding of such a path. There was another problem with "git add -u". The codepath shares the same issue when adding the paths that are found to be modified, but in addition, it asked "git diff-files" machinery run_diff_files() function (which is "git diff-files") to list the paths that are modified. But "git diff-files" machinery uses the same ie_modified() call so that it does not report racily clean _and_ actually clean paths as modified, which is not what we want. The patch allows the callers of run_diff_files() to pass the same "assume racily clean paths are dirty" option, and makes "git-add -u" codepath to use that option, to discover and re-add racily clean _and_ actually clean paths. We could further optimize on top of this patch to differentiate the case where the path really needs re-adding (i.e. the content of the racily clean entry was indeed different) and the case where only the cached stat information needs to be refreshed (i.e. the racily clean entry was actually clean), but I do not think it is worth it. This patch applies to maint and all the way up. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 03:22:52 +01:00			`unsigned ce_option = CE_MATCH_IGNORE_VALID\|CE_MATCH_RACY_IS_DIRTY;`
Make git-mv a builtin This also moves add_file_to_index() to read-cache.c. Oh, and while touching builtin-add.c, it also removes a duplicate git_config() call. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 03:52:35 +02:00
			`if (lstat(path, &st))`
			`die("%s: unable to stat (%s)", path, strerror(errno));`

Teach core object handling functions about gitlinks This teaches the really fundamental core SHA1 object handling routines about gitlinks. We can compare trees with gitlinks in them (although we can not actually generate patches for them yet - just raw git diffs), and they show up as commits in "git ls-tree". We also know to compare gitlinks as if they were directories (ie the normal "sort as trees" rules apply). [jc: amended a cut&paste error] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-10 06:20:29 +02:00			`if (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode) && !S_ISDIR(st.st_mode))`
			`die("%s: can only add regular files, symbolic links or git-directories", path);`
Make git-mv a builtin This also moves add_file_to_index() to read-cache.c. Oh, and while touching builtin-add.c, it also removes a duplicate git_config() call. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 03:52:35 +02:00
			`namelen = strlen(path);`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`if (S_ISDIR(st.st_mode)) {`
			`while (namelen && path[namelen-1] == '/')`
			`namelen--;`
			`}`
Make git-mv a builtin This also moves add_file_to_index() to read-cache.c. Oh, and while touching builtin-add.c, it also removes a duplicate git_config() call. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 03:52:35 +02:00			`size = cache_entry_size(namelen);`
			`ce = xcalloc(1, size);`
			`memcpy(ce->name, path, namelen);`
			`ce->ce_flags = htons(namelen);`
			`fill_stat_cache_info(ce, &st);`

Add core.symlinks to mark filesystems that do not support symbolic links. Some file systems that can host git repositories and their working copies do not support symbolic links. But then if the repository contains a symbolic link, it is impossible to check out the working copy. This patch enables partial support of symbolic links so that it is possible to check out a working copy on such a file system. A new flag core.symlinks (which is true by default) can be set to false to indicate that the filesystem does not support symbolic links. In this case, symbolic links that exist in the trees are checked out as small plain files, and checking in modifications of these files preserve the symlink property in the database (as long as an entry exists in the index). Of course, this does not magically make symbolic links work on such defective file systems; hence, this solution does not help if the working copy relies on that an entry is a real symbolic link. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-02 22:11:30 +01:00			`if (trust_executable_bit && has_symlinks)`
Do not take mode bits from index after type change. When we do not trust executable bit from lstat(2), we copied existing ce_mode bits without checking if the filesystem object is a regular file (which is the only thing we apply the "trust executable bit" business) nor if the blob in the index is a regular file (otherwise, we should do the same as registering a new regular file, which is to default non-executable). Noticed by Johannes Sixt. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-17 07:43:48 +01:00			`ce->ce_mode = create_ce_mode(st.st_mode);`
			`else {`
Add core.symlinks to mark filesystems that do not support symbolic links. Some file systems that can host git repositories and their working copies do not support symbolic links. But then if the repository contains a symbolic link, it is impossible to check out the working copy. This patch enables partial support of symbolic links so that it is possible to check out a working copy on such a file system. A new flag core.symlinks (which is true by default) can be set to false to indicate that the filesystem does not support symbolic links. In this case, symbolic links that exist in the trees are checked out as small plain files, and checking in modifications of these files preserve the symlink property in the database (as long as an entry exists in the index). Of course, this does not magically make symbolic links work on such defective file systems; hence, this solution does not help if the working copy relies on that an entry is a real symbolic link. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-02 22:11:30 +01:00			`/* If there is an existing entry, pick the mode bits and type`
			`* from it, otherwise assume unexecutable regular file.`
Make git-mv a builtin This also moves add_file_to_index() to read-cache.c. Oh, and while touching builtin-add.c, it also removes a duplicate git_config() call. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 03:52:35 +02:00			`*/`
Do not take mode bits from index after type change. When we do not trust executable bit from lstat(2), we copied existing ce_mode bits without checking if the filesystem object is a regular file (which is the only thing we apply the "trust executable bit" business) nor if the blob in the index is a regular file (otherwise, we should do the same as registering a new regular file, which is to default non-executable). Noticed by Johannes Sixt. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-17 07:43:48 +01:00			`struct cache_entry *ent;`
git add: respect core.filemode with unmerged entries When a merge left unmerged entries, git add failed to pick up the file mode from the index, when core.filemode == 0. If more than one unmerged entry is there, the order of stage preference is 2, 1, 3. Noticed by Johannes Sixt. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-29 19:32:46 +02:00			`int pos = index_name_pos_also_unmerged(istate, path, namelen);`
Do not take mode bits from index after type change. When we do not trust executable bit from lstat(2), we copied existing ce_mode bits without checking if the filesystem object is a regular file (which is the only thing we apply the "trust executable bit" business) nor if the blob in the index is a regular file (otherwise, we should do the same as registering a new regular file, which is to default non-executable). Noticed by Johannes Sixt. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-17 07:43:48 +01:00
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`ent = (0 <= pos) ? istate->cache[pos] : NULL;`
Do not take mode bits from index after type change. When we do not trust executable bit from lstat(2), we copied existing ce_mode bits without checking if the filesystem object is a regular file (which is the only thing we apply the "trust executable bit" business) nor if the blob in the index is a regular file (otherwise, we should do the same as registering a new regular file, which is to default non-executable). Noticed by Johannes Sixt. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-17 07:43:48 +01:00			`ce->ce_mode = ce_mode_from_stat(ent, st.st_mode);`
Make git-mv a builtin This also moves add_file_to_index() to read-cache.c. Oh, and while touching builtin-add.c, it also removes a duplicate git_config() call. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 03:52:35 +02:00			`}`

add_file_to_index: skip rehashing if the cached stat already matches An earlier commit 366bfcb6 broke git-add by moving read_cache() call down, because it wanted the directory walking code to grab paths that are already in the index. The change serves its purpose, but introduces a regression because the responsibility of avoiding unnecessary reindexing by matching the cached stat is shifted nowhere. This makes it the job of add_file_to_index() function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-31 02:12:58 +02:00			`pos = index_name_pos(istate, ce->name, namelen);`
			`if (0 <= pos &&`
			`!ce_stage(istate->cache[pos]) &&`
git-add: make the entry stat-clean after re-adding the same contents Earlier in commit 0781b8a9b2fe760fc4ed519a3a26e4b9bd6ccffe (add_file_to_index: skip rehashing if the cached stat already matches), add_file_to_index() were taught not to re-add the path if it already matches the index. The change meant well, but was not executed quite right. It used ie_modified() to see if the file on the work tree is really different from the index, and skipped adding the contents if the function says "not modified". This was wrong. There are three possible comparison results between the index and the file in the work tree: - with lstat(2) we _know_ they are different. E.g. if the length or the owner in the cached stat information is different from the length we just obtained from lstat(2), we can tell the file is modified without looking at the actual contents. - with lstat(2) we _know_ they are the same. The same length, the same owner, the same everything (but this has a twist, as described below). - we cannot tell from lstat(2) information alone and need to go to the filesystem to actually compare. The last case arises from what we call 'racy git' situation, that can be caused with this sequence: $ echo hello >file $ git add file $ echo aeiou >file ;# the same length If the second "echo" is done within the same filesystem timestamp granularity as the first "echo", then the timestamp recorded by "git add" and the timestamp we get from lstat(2) will be the same, and we can mistakenly say the file is not modified. The path is called 'racily clean'. We need to reliably detect racily clean paths are in fact modified. To solve this problem, when we write out the index, we mark the index entry that has the same timestamp as the index file itself (that is the time from the point of view of the filesystem) to tell any later code that does the lstat(2) comparison not to trust the cached stat info, and ie_modified() then actually goes to the filesystem to compare the contents for such a path. That's all good, but it should not be used for this "git add" optimization, as the goal of "git add" is to actually update the path in the index and make it stat-clean. With the false optimization, we did _not_ cause any data loss (after all, what we failed to do was only to update the cached stat information), but it made the following sequence leave the file stat dirty: $ echo hello >file $ git add file $ echo hello >file ;# the same contents $ git add file The solution is not to use ie_modified() which goes to the filesystem to see if it is really clean, but instead use ie_match_stat() with "assume racily clean paths are dirty" option, to force re-adding of such a path. There was another problem with "git add -u". The codepath shares the same issue when adding the paths that are found to be modified, but in addition, it asked "git diff-files" machinery run_diff_files() function (which is "git diff-files") to list the paths that are modified. But "git diff-files" machinery uses the same ie_modified() call so that it does not report racily clean _and_ actually clean paths as modified, which is not what we want. The patch allows the callers of run_diff_files() to pass the same "assume racily clean paths are dirty" option, and makes "git-add -u" codepath to use that option, to discover and re-add racily clean _and_ actually clean paths. We could further optimize on top of this patch to differentiate the case where the path really needs re-adding (i.e. the content of the racily clean entry was indeed different) and the case where only the cached stat information needs to be refreshed (i.e. the racily clean entry was actually clean), but I do not think it is worth it. This patch applies to maint and all the way up. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 03:22:52 +01:00			`!ie_match_stat(istate, istate->cache[pos], &st, ce_option)) {`
add_file_to_index: skip rehashing if the cached stat already matches An earlier commit 366bfcb6 broke git-add by moving read_cache() call down, because it wanted the directory walking code to grab paths that are already in the index. The change serves its purpose, but introduces a regression because the responsibility of avoiding unnecessary reindexing by matching the cached stat is shifted nowhere. This makes it the job of add_file_to_index() function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-31 02:12:58 +02:00			`/* Nothing changed, really */`
			`free(ce);`
			`return 0;`
			`}`

Make git-mv a builtin This also moves add_file_to_index() to read-cache.c. Oh, and while touching builtin-add.c, it also removes a duplicate git_config() call. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 03:52:35 +02:00			`if (index_path(ce->sha1, path, &st, 1))`
			`die("unable to index file %s", path);`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`if (add_index_entry(istate, ce, ADD_CACHE_OK_TO_ADD\|ADD_CACHE_OK_TO_REPLACE))`
Make git-mv a builtin This also moves add_file_to_index() to read-cache.c. Oh, and while touching builtin-add.c, it also removes a duplicate git_config() call. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 03:52:35 +02:00			`die("unable to add %s to index",path);`
			`if (verbose)`
			`printf("add '%s'\n", path);`
			`return 0;`
			`}`

Move make_cache_entry() from merge-recursive.c into read-cache.c The function make_cache_entry() is too useful to be hidden away in merge-recursive. So move it to libgit.a (exposing it via cache.h). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-11 05:17:28 +02:00			`struct cache_entry *make_cache_entry(unsigned int mode,`
			`const unsigned char sha1, const char path, int stage,`
			`int refresh)`
			`{`
			`int size, len;`
			`struct cache_entry *ce;`

			`if (!verify_path(path))`
			`return NULL;`

			`len = strlen(path);`
			`size = cache_entry_size(len);`
			`ce = xcalloc(1, size);`

			`hashcpy(ce->sha1, sha1);`
			`memcpy(ce->name, path, len);`
			`ce->ce_flags = create_ce_flags(len, stage);`
			`ce->ce_mode = create_ce_mode(mode);`

			`if (refresh)`
			`return refresh_cache_entry(ce, 0);`

			`return ce;`
			`}`

Rename some more cache-related functions same_name -> ce_same_name() remove_entry_at() -> remove_cache_entry_at() Signed-off-by: Brad Roberts <braddr@puremagic.com> Signed-off-by: Petr Baudis <pasky@ucw.cz> 2005-05-15 04:04:25 +02:00			`int ce_same_name(struct cache_entry a, struct cache_entry b)`
When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`{`
			`int len = ce_namelen(a);`
			`return ce_namelen(b) == len && !memcmp(a->name, b->name, len);`
			`}`

Make "ce_match_path()" a generic helper function ... and make git-diff-files use it too. This all _should_ make the diffcore-pathspec.c phase unnecessary, since the diff'ers now all do the path matching early interally. 2005-07-15 01:55:06 +02:00			`int ce_path_match(const struct cache_entry ce, const char *pathspec)`
			`{`
			`const char match, name;`
			`int len;`

			`if (!pathspec)`
			`return 1;`

			`len = ce_namelen(ce);`
			`name = ce->name;`
			`while ((match = *pathspec++) != NULL) {`
			`int matchlen = strlen(match);`
			`if (matchlen > len)`
			`continue;`
			`if (memcmp(name, match, matchlen))`
			`continue;`
			`if (matchlen && name[matchlen-1] == '/')`
			`return 1;`
			`if (name[matchlen] == '/' \|\| !name[matchlen])`
			`return 1;`
[PATCH] Improve handling of "." and ".." in git-diff-* This fixes up usage of ".." (without an ending slash) and "." (with or without the ending slash) in the git diff family. It also fixes pathspec matching for the case of an empty pathspec, since a "." in the top-level directory (or enough ".." under subdirectories) will result in an empty pathspec. We used to not match it against anything, but it should in fact match everything. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-17 05:44:32 +02:00			`if (!matchlen)`
			`return 1;`
Make "ce_match_path()" a generic helper function ... and make git-diff-files use it too. This all _should_ make the diffcore-pathspec.c phase unnecessary, since the diff'ers now all do the path matching early interally. 2005-07-15 01:55:06 +02:00			`}`
			`return 0;`
			`}`

Prevent bogus paths from being added to the index. With this one, it's now a fatal error to try to add a pathname that cannot be added with "git add", i.e. [torvalds@g5 git]$ git add .git/config fatal: unable to add .git/config to index and [torvalds@g5 git]$ git add foo/../bar fatal: unable to add foo/../bar to index instead of the old "Ignoring path xyz" warning that would end up silently succeeding on any other paths. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-18 21:07:31 +02:00			`/*`
			`* We fundamentally don't like some paths: we don't want`
			`* dot or dot-dot anywhere, and for obvious reasons don't`
			`* want to recurse into ".git" either.`
			`*`
			`* Also, we don't want double slashes or slashes at the`
			`* end that can make pathnames ambiguous.`
			`*/`
			`static int verify_dotfile(const char *rest)`
			`{`
			`/*`
			`* The first character was '.', but that`
			`* has already been discarded, we now test`
			`* the rest.`
			`*/`
			`switch (*rest) {`
			`/* "." is not allowed */`
			`case '\0': case '/':`
			`return 0;`

			`/*`
			`* ".git" followed by NUL or slash is bad. This`
			`* shares the path end test with the ".." case.`
			`*/`
			`case 'g':`
			`if (rest[1] != 'i')`
			`break;`
			`if (rest[2] != 't')`
			`break;`
			`rest += 2;`
			`/* fallthrough */`
			`case '.':`
			`if (rest[1] == '\0' \|\| rest[1] == '/')`
			`return 0;`
			`}`
			`return 1;`
			`}`

			`int verify_path(const char *path)`
			`{`
			`char c;`

			`goto inside;`
			`for (;;) {`
			`if (!c)`
			`return 1;`
			`if (c == '/') {`
			`inside:`
			`c = *path++;`
			`switch (c) {`
			`default:`
			`continue;`
			`case '/': case '\0':`
			`break;`
			`case '.':`
			`if (verify_dotfile(path))`
			`continue;`
			`}`
			`return 0;`
			`}`
			`c = *path++;`
			`}`
			`}`

Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`/*`
			`* Do we have another file that has the beginning components being a`
			`* proper superset of the name we're trying to add?`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00			`*/`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`static int has_file_name(struct index_state *istate,`
			`const struct cache_entry *ce, int pos, int ok_to_replace)`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00			`{`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`int retval = 0;`
			`int len = ce_namelen(ce);`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`int stage = ce_stage(ce);`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`const char *name = ce->name;`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`while (pos < istate->cache_nr) {`
			`struct cache_entry *p = istate->cache[pos++];`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`if (len >= ce_namelen(p))`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00			`break;`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`if (memcmp(name, p->name, len))`
			`break;`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`if (ce_stage(p) != stage)`
			`continue;`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`if (p->name[len] != '/')`
			`continue;`
add_cache_entry(): removal of file foo does not conflict with foo/bar Similarly, removal of file foo/bar does not conflict with a file foo. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 10:55:37 +02:00			`if (!ce_stage(p) && !p->ce_mode)`
			`continue;`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`retval = -1;`
			`if (!ok_to_replace)`
			`break;`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`remove_index_entry_at(istate, --pos);`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00			`}`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`return retval;`
			`}`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`/*`
			`* Do we have another file with a pathname that is a proper`
			`* subset of the name we're trying to add?`
			`*/`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`static int has_dir_name(struct index_state *istate,`
			`const struct cache_entry *ce, int pos, int ok_to_replace)`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`{`
			`int retval = 0;`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`int stage = ce_stage(ce);`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`const char *name = ce->name;`
			`const char *slash = name + ce_namelen(ce);`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`for (;;) {`
			`int len;`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`for (;;) {`
			`if (*--slash == '/')`
			`break;`
			`if (slash <= ce->name)`
			`return retval;`
			`}`
			`len = slash - name;`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`pos = index_name_pos(istate, name, ntohs(create_ce_flags(len, stage)));`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`if (pos >= 0) {`
add_cache_entry(): removal of file foo does not conflict with foo/bar Similarly, removal of file foo/bar does not conflict with a file foo. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 10:55:37 +02:00			`/*`
			`* Found one, but not so fast. This could`
			`* be a marker that says "I was here, but`
			`* I am being removed". Such an entry is`
			`* not a part of the resulting tree, and`
			`* it is Ok to have a directory at the same`
			`* path.`
			`*/`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`if (stage \|\| istate->cache[pos]->ce_mode) {`
add_cache_entry(): removal of file foo does not conflict with foo/bar Similarly, removal of file foo/bar does not conflict with a file foo. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 10:55:37 +02:00			`retval = -1;`
			`if (!ok_to_replace)`
			`break;`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`remove_index_entry_at(istate, pos);`
add_cache_entry(): removal of file foo does not conflict with foo/bar Similarly, removal of file foo/bar does not conflict with a file foo. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 10:55:37 +02:00			`continue;`
			`}`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`}`
add_cache_entry(): removal of file foo does not conflict with foo/bar Similarly, removal of file foo/bar does not conflict with a file foo. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 10:55:37 +02:00			`else`
			`pos = -pos-1;`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00
			`/*`
			`* Trivial optimization: if we find an entry that`
			`* already matches the sub-directory, then we know`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`* we're ok, and we can exit.`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`*/`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`while (pos < istate->cache_nr) {`
			`struct cache_entry *p = istate->cache[pos];`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`if ((ce_namelen(p) <= len) \|\|`
			`(p->name[len] != '/') \|\|`
			`memcmp(p->name, name, len))`
			`break; /* not our subdirectory */`
add_cache_entry(): removal of file foo does not conflict with foo/bar Similarly, removal of file foo/bar does not conflict with a file foo. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 10:55:37 +02:00			`if (ce_stage(p) == stage && (stage \|\| p->ce_mode))`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`/* p is at the same stage as our entry, and`
			`* is a subdirectory of what we are looking`
			`* at, so we cannot have conflicts at our`
			`* level or anything shorter.`
			`*/`
			`return retval;`
			`pos++;`
Add git-update-cache --replace option. When "path" exists as a file or a symlink in the index, an attempt to add "path/file" is refused because it results in file vs directory conflict. Similarly when "path/file1", "path/file2", etc. exist, an attempt to add "path" as a file or a symlink is refused. With git-update-cache --replace, these existing entries that conflict with the entry being added are automatically removed from the cache, with warning messages. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:55:21 +02:00			`}`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00			`}`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`return retval;`
			`}`

			`/* We may be in a situation where we already have path/file and path`
			`* is being added, or we already have path and path/file is being`
			`* added. Either one would result in a nonsense tree that has path`
			`* twice when git-write-tree tries to write it out. Prevent it.`
War on whitespace This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-07 09:04:01 +02:00			`*`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`* If ok-to-replace is specified, we remove the conflicting entries`
			`* from the cache so the caller should recompute the insert position.`
			`* When this happens, we return non-zero.`
			`*/`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`static int check_file_directory_conflict(struct index_state *istate,`
			`const struct cache_entry *ce,`
			`int pos, int ok_to_replace)`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`{`
add_cache_entry(): removal of file foo does not conflict with foo/bar Similarly, removal of file foo/bar does not conflict with a file foo. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 10:55:37 +02:00			`int retval;`

			`/*`
			`* When ce is an "I am going away" entry, we allow it to be added`
			`*/`
			`if (!ce_stage(ce) && !ce->ce_mode)`
			`return 0;`

Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`/*`
			`* We check if the path is a sub-path of a subsequent pathname`
			`* first, since removing those will not change the position`
add_cache_entry(): removal of file foo does not conflict with foo/bar Similarly, removal of file foo/bar does not conflict with a file foo. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 10:55:37 +02:00			`* in the array.`
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`*/`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`retval = has_file_name(istate, ce, pos, ok_to_replace);`
add_cache_entry(): removal of file foo does not conflict with foo/bar Similarly, removal of file foo/bar does not conflict with a file foo. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-30 10:55:37 +02:00
Re-implement "check_file_directory_conflict()" This is (imho) more readable, and is also a lot faster. The expense of looking up sub-directory beginnings was killing us on things like "git-diff-cache", even though that one didn't even care at all about the file vs directory conflicts. We really only care when somebody tries to add a conflicting name to stage 0. We should go through the conflict rules more carefully some day. 2005-06-19 05:21:34 +02:00			`/*`
			`* Then check if the path might have a clashing sub-directory`
			`* before it.`
			`*/`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`return retval + has_dir_name(istate, ce, pos, ok_to_replace);`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00			`}`

Optimize "diff --cached" performance. The read_tree() function is called only from the call chain to run "git diff --cached" (this includes the internal call made by git-runstatus to run_diff_index()). The function vacates stage without any funky "merge" magic. The caller then goes and compares stage #1 entries from the tree with stage #0 entries from the original index. When adding the cache entries this way, it used the general purpose add_cache_entry(). This function looks for an existing entry to replace or if there is none to find where to insert the new entry, resolves D/F conflict and all the other things. For the purpose of reading entries into an empty stage, none of that processing is needed. We can instead append everything and then sort the result at the end. This commit changes read_tree() to first make sure that there is no existing cache entries at specified stage, and if that is the case, it runs add_cache_entry() with ADD_CACHE_JUST_APPEND flag (new), and then sort the resulting cache using qsort(). This new flag tells add_cache_entry() to omit all the checks such as "Does this path already exist? Does adding this path remove other existing entries because it turns a directory to a file?" and instead append the given cache entry straight at the end of the active cache. The caller of course is expected to sort the resulting cache at the end before using the result. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-09 22:42:50 +02:00			`static int add_index_entry_with_check(struct index_state istate, struct cache_entry ce, int option)`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`{`
			`int pos;`
Add git-update-cache --replace option. When "path" exists as a file or a symlink in the index, an attempt to add "path/file" is refused because it results in file vs directory conflict. Similarly when "path/file1", "path/file2", etc. exist, an attempt to add "path" as a file or a symlink is refused. With git-update-cache --replace, these existing entries that conflict with the entry being added are automatically removed from the cache, with warning messages. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:55:21 +02:00			`int ok_to_add = option & ADD_CACHE_OK_TO_ADD;`
			`int ok_to_replace = option & ADD_CACHE_OK_TO_REPLACE;`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`int skip_df_check = option & ADD_CACHE_SKIP_DFCHECK;`
"Assume unchanged" git This adds "assume unchanged" logic, started by this message in the list discussion recently: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org> This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit. You can use two new options to update-index to set and reset the CE_VALID bit: git-update-index --assume-unchanged path... git-update-index --no-assume-unchanged path... These forms manipulate only the CE_VALID bit; it does not change the object name recorded in the index file. Nor they add a new entry to the index. When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after: - update-index to explicitly register the current object name to the index file. - when update-index --refresh finds the path to be up-to-date. - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file. The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings. Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability: - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety. - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-). - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified. Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path". This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there probably needs other cases we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and ready to be tested by people who asked for this feature. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-09 06:15:24 +01:00
Simplify cache API Earlier, add_file_to_index() invalidated the path in the cache-tree but remove_file_from_cache() did not, and the user of the latter needed to invalidate the entry himself. This led to a few bugs due to missed invalidate calls already. This patch makes the management of cache-tree less error prone by making more invalidate calls from lower level cache API functions. The rules are: - If you are going to write the index, you should either maintain cache_tree correctly. - If you cannot, alternatively you can remove the entire cache_tree by calling cache_tree_free() before you call write_cache(). - When you modify the index, cache_tree_invalidate_path() should be called with the path you are modifying, to discard the entry from the cache-tree structure. - The following cache API functions exported from read-cache.c (and the macro whose names have "cache" instead of "index") automatically call cache_tree_invalidate_path() for you: - remove_file_from_index(); - add_file_to_index(); - add_index_entry(); You can modify the index bypassing the above API functions (e.g. find an existing cache entry from the index and modify it in place). You need to call cache_tree_invalidate_path() yourself in such a case. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-14 05:33:11 +02:00			`cache_tree_invalidate_path(istate->cache_tree, ce->name);`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`pos = index_name_pos(istate, ce->name, ntohs(ce->ce_flags));`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00
Use core.filemode. With "[core] filemode = false", you can tell git to ignore differences in the working tree file only in executable bit. * "git-update-index --refresh" does not say "needs update" if index entry and working tree file differs only in executable bit. * "git-update-index" on an existing path takes executable bit from the existing index entry, if the path and index entry are both regular files. * "git-diff-files" and "git-diff-index" without --cached flag pretend the path on the filesystem has the same executable bit as the existing index entry, if the path and index entry are both regular files. If you are on a filesystem with unreliable mode bits, you may need to force the executable bit after registering the path in the index. * "git-update-index --chmod=+x foo" flips the executable bit of the index file entry for path "foo" on. Use "--chmod=-x" to flip it off. Note that --chmod only works in index file and does not look at nor update the working tree. So if you are on a filesystem and do not have working executable bit, you would do: 1. set the appropriate .git/config option; 2. "git-update-index --add new-file.c" 3. "git-ls-files --stage new-file.c" to see if it has the desired mode bits. If not, e.g. to drop executable bit picked up from the filesystem, say "git-update-index --chmod=-x new-file.c". Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-12 03:45:33 +02:00			`/* existing match? Just replace it. */`
Fix off-by-one error in removal of cache entry. Also make the return value of "cache_name_pos()" be sane: positive or zero if we found it (it's the index into the cache array), and "-pos-1" to indicate where it should go if we didn't. 2005-04-11 07:06:50 +02:00			`if (pos >= 0) {`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`istate->cache_changed = 1;`
			`istate->cache[pos] = ce;`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`return 0;`
			`}`
Fix off-by-one error in removal of cache entry. Also make the return value of "cache_name_pos()" be sane: positive or zero if we found it (it's the index into the cache array), and "-pos-1" to indicate where it should go if we didn't. 2005-04-11 07:06:50 +02:00			`pos = -pos-1;`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00
When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`/*`
			`* Inserting a merged entry ("stage 0") into the index`
			`* will always replace all non-merged entries..`
			`*/`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`if (pos < istate->cache_nr && ce_stage(ce) == 0) {`
			`while (ce_same_name(istate->cache[pos], ce)) {`
When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`ok_to_add = 1;`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`if (!remove_index_entry_at(istate, pos))`
When inserting a index entry of stage 0, remove all old unmerged entries. This allows you to actually tell git that you've resolved a conflict. 2005-04-16 21:05:45 +02:00			`break;`
			`}`
			`}`

Make "update-cache" a bit friendlier to use (and harder to mis-use). It now requires the "--add" flag before you add any new files, and a "--remove" file if you want to mark files for removal. And giving it the "--refresh" flag makes it just update all the files that it already knows about. 2005-04-10 20:32:54 +02:00			`if (!ok_to_add)`
			`return -1;`
Prevent bogus paths from being added to the index. With this one, it's now a fatal error to try to add a pathname that cannot be added with "git add", i.e. [torvalds@g5 git]$ git add .git/config fatal: unable to add .git/config to index and [torvalds@g5 git]$ git add foo/../bar fatal: unable to add foo/../bar to index instead of the old "Ignoring path xyz" warning that would end up silently succeeding on any other paths. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-18 21:07:31 +02:00			`if (!verify_path(ce->name))`
			`return -1;`
Make "update-cache" a bit friendlier to use (and harder to mis-use). It now requires the "--add" flag before you add any new files, and a "--remove" file if you want to mark files for removal. And giving it the "--refresh" flag makes it just update all the files that it already knows about. 2005-04-10 20:32:54 +02:00
Use core.filemode. With "[core] filemode = false", you can tell git to ignore differences in the working tree file only in executable bit. * "git-update-index --refresh" does not say "needs update" if index entry and working tree file differs only in executable bit. * "git-update-index" on an existing path takes executable bit from the existing index entry, if the path and index entry are both regular files. * "git-diff-files" and "git-diff-index" without --cached flag pretend the path on the filesystem has the same executable bit as the existing index entry, if the path and index entry are both regular files. If you are on a filesystem with unreliable mode bits, you may need to force the executable bit after registering the path in the index. * "git-update-index --chmod=+x foo" flips the executable bit of the index file entry for path "foo" on. Use "--chmod=-x" to flip it off. Note that --chmod only works in index file and does not look at nor update the working tree. So if you are on a filesystem and do not have working executable bit, you would do: 1. set the appropriate .git/config option; 2. "git-update-index --add new-file.c" 3. "git-ls-files --stage new-file.c" to see if it has the desired mode bits. If not, e.g. to drop executable bit picked up from the filesystem, say "git-update-index --chmod=-x new-file.c". Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-12 03:45:33 +02:00			`if (!skip_df_check &&`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`check_file_directory_conflict(istate, ce, pos, ok_to_replace)) {`
Add git-update-cache --replace option. When "path" exists as a file or a symlink in the index, an attempt to add "path/file" is refused because it results in file vs directory conflict. Similarly when "path/file1", "path/file2", etc. exist, an attempt to add "path" as a file or a symlink is refused. With git-update-cache --replace, these existing entries that conflict with the entry being added are automatically removed from the cache, with warning messages. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:55:21 +02:00			`if (!ok_to_replace)`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`return error("'%s' appears as both a file and as a directory",`
			`ce->name);`
			`pos = index_name_pos(istate, ce->name, ntohs(ce->ce_flags));`
Add git-update-cache --replace option. When "path" exists as a file or a symlink in the index, an attempt to add "path/file" is refused because it results in file vs directory conflict. Similarly when "path/file1", "path/file2", etc. exist, an attempt to add "path" as a file or a symlink is refused. With git-update-cache --replace, these existing entries that conflict with the entry being added are automatically removed from the cache, with warning messages. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:55:21 +02:00			`pos = -pos-1;`
			`}`
Optimize "diff --cached" performance. The read_tree() function is called only from the call chain to run "git diff --cached" (this includes the internal call made by git-runstatus to run_diff_index()). The function vacates stage without any funky "merge" magic. The caller then goes and compares stage #1 entries from the tree with stage #0 entries from the original index. When adding the cache entries this way, it used the general purpose add_cache_entry(). This function looks for an existing entry to replace or if there is none to find where to insert the new entry, resolves D/F conflict and all the other things. For the purpose of reading entries into an empty stage, none of that processing is needed. We can instead append everything and then sort the result at the end. This commit changes read_tree() to first make sure that there is no existing cache entries at specified stage, and if that is the case, it runs add_cache_entry() with ADD_CACHE_JUST_APPEND flag (new), and then sort the resulting cache using qsort(). This new flag tells add_cache_entry() to omit all the checks such as "Does this path already exist? Does adding this path remove other existing entries because it turns a directory to a file?" and instead append the given cache entry straight at the end of the active cache. The caller of course is expected to sort the resulting cache at the end before using the result. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-09 22:42:50 +02:00			`return pos + 1;`
			`}`

			`int add_index_entry(struct index_state istate, struct cache_entry ce, int option)`
			`{`
			`int pos;`

			`if (option & ADD_CACHE_JUST_APPEND)`
			`pos = istate->cache_nr;`
			`else {`
			`int ret;`
			`ret = add_index_entry_with_check(istate, ce, option);`
			`if (ret <= 0)`
			`return ret;`
			`pos = ret - 1;`
			`}`
git-update-cache refuses to add a file where a directory is registed. And vice versa. The next commit will introduce an option --replace to allow replacing existing entries. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:48:12 +02:00
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`/* Make sure the array is big enough .. */`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`if (istate->cache_nr == istate->cache_alloc) {`
			`istate->cache_alloc = alloc_nr(istate->cache_alloc);`
			`istate->cache = xrealloc(istate->cache,`
			`istate->cache_alloc * sizeof(struct cache_entry *));`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`}`

			`/* Add it in.. */`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`istate->cache_nr++;`
Optimize "diff --cached" performance. The read_tree() function is called only from the call chain to run "git diff --cached" (this includes the internal call made by git-runstatus to run_diff_index()). The function vacates stage without any funky "merge" magic. The caller then goes and compares stage #1 entries from the tree with stage #0 entries from the original index. When adding the cache entries this way, it used the general purpose add_cache_entry(). This function looks for an existing entry to replace or if there is none to find where to insert the new entry, resolves D/F conflict and all the other things. For the purpose of reading entries into an empty stage, none of that processing is needed. We can instead append everything and then sort the result at the end. This commit changes read_tree() to first make sure that there is no existing cache entries at specified stage, and if that is the case, it runs add_cache_entry() with ADD_CACHE_JUST_APPEND flag (new), and then sort the resulting cache using qsort(). This new flag tells add_cache_entry() to omit all the checks such as "Does this path already exist? Does adding this path remove other existing entries because it turns a directory to a file?" and instead append the given cache entry straight at the end of the active cache. The caller of course is expected to sort the resulting cache at the end before using the result. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-09 22:42:50 +02:00			`if (istate->cache_nr > pos + 1)`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`memmove(istate->cache + pos + 1,`
			`istate->cache + pos,`
			`(istate->cache_nr - pos - 1) * sizeof(ce));`
			`istate->cache[pos] = ce;`
			`istate->cache_changed = 1;`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`return 0;`
			`}`

Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`/*`
			`* "refresh" does not calculate a new sha1 file or bring the`
			`* cache up-to-date for mode/content changes. But what it`
			`* _does_ do is to "re-match" the stat information of a file`
			`* with the cache, so that you can refresh the cache for a`
			`* file that hasn't been changed but where the stat entry is`
			`* out of date.`
			`*`
			`* For example, you'd want to do this after doing a "git-read-tree",`
			`* to link up the stat cache details with the proper files.`
			`*/`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`static struct cache_entry refresh_cache_ent(struct index_state istate,`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`struct cache_entry *ce,`
			`unsigned int options, int *err)`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`{`
			`struct stat st;`
			`struct cache_entry *updated;`
			`int changed, size;`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`int ignore_valid = options & CE_MATCH_IGNORE_VALID;`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00
Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 06:32:18 +02:00			`if (lstat(ce->name, &st) < 0) {`
Propagate cache error internal to refresh_cache() via parameter. The function refresh_cache() is the only user of cache_errno that switches its behaviour based on what internal function refresh_cache_entry() finds; pass the error status back in a parameter passed down to it, to get rid of the global variable cache_errno. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 06:34:34 +02:00			`if (err)`
			`*err = errno;`
Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 06:32:18 +02:00			`return NULL;`
			`}`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`changed = ie_match_stat(istate, ce, &st, options);`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`if (!changed) {`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`/*`
			`* The path is unchanged. If we were told to ignore`
			`* valid bit, then we did the actual stat check and`
			`* found that the entry is unmodified. If the entry`
			`* is not marked VALID, this is the place to mark it`
			`* valid again, under "assume unchanged" mode.`
			`*/`
			`if (ignore_valid && assume_unchanged &&`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`!(ce->ce_flags & htons(CE_VALID)))`
			`; /* mark this one VALID again */`
			`else`
Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 06:32:18 +02:00			`return ce;`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`}`

ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`if (ie_modified(istate, ce, &st, options)) {`
Propagate cache error internal to refresh_cache() via parameter. The function refresh_cache() is the only user of cache_errno that switches its behaviour based on what internal function refresh_cache_entry() finds; pass the error status back in a parameter passed down to it, to get rid of the global variable cache_errno. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 06:34:34 +02:00			`if (err)`
			`*err = EINVAL;`
Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 06:32:18 +02:00			`return NULL;`
			`}`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00
			`size = ce_size(ce);`
			`updated = xmalloc(size);`
			`memcpy(updated, ce, size);`
			`fill_stat_cache_info(updated, &st);`

ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`/*`
			`* If ignore_valid is not set, we should leave CE_VALID bit`
			`* alone. Otherwise, paths marked with --no-assume-unchanged`
			`* (i.e. things to be edited) will reacquire CE_VALID bit`
			`* automatically, which is not really what we want.`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`*/`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`if (!ignore_valid && assume_unchanged &&`
			`!(ce->ce_flags & htons(CE_VALID)))`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`updated->ce_flags &= ~htons(CE_VALID);`

			`return updated;`
			`}`

git-add: Add support for --refresh option. This allows to refresh only a subset of the project files, based on the specified pathspecs. Signed-off-by: Alexandre Julliard <julliard@winehq.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-11 23:59:01 +02:00			`int refresh_index(struct index_state istate, unsigned int flags, const char pathspec, char seen)`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`{`
			`int i;`
			`int has_errors = 0;`
			`int really = (flags & REFRESH_REALLY) != 0;`
			`int allow_unmerged = (flags & REFRESH_UNMERGED) != 0;`
			`int quiet = (flags & REFRESH_QUIET) != 0;`
			`int not_new = (flags & REFRESH_IGNORE_MISSING) != 0;`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`unsigned int options = really ? CE_MATCH_IGNORE_VALID : 0;`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`for (i = 0; i < istate->cache_nr; i++) {`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`struct cache_entry ce, new;`
Propagate cache error internal to refresh_cache() via parameter. The function refresh_cache() is the only user of cache_errno that switches its behaviour based on what internal function refresh_cache_entry() finds; pass the error status back in a parameter passed down to it, to get rid of the global variable cache_errno. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 06:34:34 +02:00			`int cache_errno = 0;`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`ce = istate->cache[i];`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`if (ce_stage(ce)) {`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`while ((i < istate->cache_nr) &&`
			`! strcmp(istate->cache[i]->name, ce->name))`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`i++;`
			`i--;`
			`if (allow_unmerged)`
			`continue;`
			`printf("%s: needs merge\n", ce->name);`
			`has_errors = 1;`
			`continue;`
			`}`

git-add: Add support for --refresh option. This allows to refresh only a subset of the project files, based on the specified pathspecs. Signed-off-by: Alexandre Julliard <julliard@winehq.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-11 23:59:01 +02:00			`if (pathspec && !match_pathspec(pathspec, ce->name, strlen(ce->name), 0, seen))`
			`continue;`

ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`new = refresh_cache_ent(istate, ce, options, &cache_errno);`
Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 06:32:18 +02:00			`if (new == ce)`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`continue;`
Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 06:32:18 +02:00			`if (!new) {`
			`if (not_new && cache_errno == ENOENT)`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`continue;`
Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 06:32:18 +02:00			`if (really && cache_errno == EINVAL) {`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`/* If we are doing --really-refresh that`
			`* means the index is not valid anymore.`
			`*/`
			`ce->ce_flags &= ~htons(CE_VALID);`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`istate->cache_changed = 1;`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`}`
			`if (quiet)`
			`continue;`
			`printf("%s: needs update\n", ce->name);`
			`has_errors = 1;`
			`continue;`
			`}`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`istate->cache_changed = 1;`
			`/* You can NOT just free istate->cache[i] here, since it`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`* might not be necessarily malloc()ed but can also come`
			`* from mmap(). */`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`istate->cache[i] = new;`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`}`
			`return has_errors;`
			`}`

Propagate cache error internal to refresh_cache() via parameter. The function refresh_cache() is the only user of cache_errno that switches its behaviour based on what internal function refresh_cache_entry() finds; pass the error status back in a parameter passed down to it, to get rid of the global variable cache_errno. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 06:34:34 +02:00			`struct cache_entry refresh_cache_entry(struct cache_entry ce, int really)`
			`{`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`return refresh_cache_ent(&the_index, ce, really, NULL);`
Propagate cache error internal to refresh_cache() via parameter. The function refresh_cache() is the only user of cache_errno that switches its behaviour based on what internal function refresh_cache_entry() finds; pass the error status back in a parameter passed down to it, to get rid of the global variable cache_errno. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 06:34:34 +02:00			`}`

index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`static int verify_hdr(struct cache_header *hdr, unsigned long size)`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`{`
			`SHA_CTX c;`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`unsigned char sha1[20];`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`if (hdr->hdr_signature != htonl(CACHE_SIGNATURE))`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`return error("bad signature");`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`if (hdr->hdr_version != htonl(2))`
			`return error("bad index version");`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`SHA1_Init(&c);`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`SHA1_Update(&c, hdr, size - 20);`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`SHA1_Final(sha1, &c);`
Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length. Introduces global inline: hashcmp(const unsigned char sha1, const unsigned char sha2) Uses memcmp for comparison and returns the result based on the length of the hash name (a future runtime decision). Acked-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-17 20:54:57 +02:00			`if (hashcmp(sha1, (unsigned char *)hdr + size - 20))`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`return error("bad index file sha1 signature");`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`return 0;`
			`}`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`static int read_index_extension(struct index_state *istate,`
			`const char ext, void data, unsigned long sz)`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`{`
			`switch (CACHE_EXT(ext)) {`
			`case CACHE_EXT_TREE:`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`istate->cache_tree = cache_tree_read(data, sz);`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`break;`
			`default:`
			`if (ext < 'A' \|\| 'Z' < ext)`
			`return error("index uses %.4s extension, which we do not understand",`
			`ext);`
			`fprintf(stderr, "ignoring %.4s extension\n", ext);`
			`break;`
			`}`
			`return 0;`
			`}`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`int read_index(struct index_state *istate)`
Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 06:32:18 +02:00			`{`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`return read_index_from(istate, get_index_file());`
Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 06:32:18 +02:00			`}`

			`/* remember to discard_cache() before reading a different cache! */`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`int read_index_from(struct index_state istate, const char path)`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`{`
			`int fd, i;`
			`struct stat st;`
Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 06:32:18 +02:00			`unsigned long offset;`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`struct cache_header *hdr;`

			`errno = EBUSY;`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`if (istate->mmap)`
			`return istate->cache_nr;`
[PATCH] Better error reporting for "git status" Instead of "git status" ignoring (and hiding) potential errors from the "git-update-index" call, make it exit if it fails, and show the error. In order to do this, use the "-q" flag (to ignore not-up-to-date files) and add a new "--unmerged" flag that allows unmerged entries in the index without any errors. This also avoids marking the index "changed" if an entry isn't actually modified, and makes sure that we exit with an understandable error message if the index is corrupt or unreadable. "read_cache()" no longer returns an error for the caller to check. Finally, make die() and usage() exit with recognizable error codes, if we ever want to check the failure reason in scripts. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-01 22:24:27 +02:00
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`errno = ENOENT;`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`istate->timestamp = 0;`
Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 06:32:18 +02:00			`fd = open(path, O_RDONLY);`
[PATCH] Better error reporting for "git status" Instead of "git status" ignoring (and hiding) potential errors from the "git-update-index" call, make it exit if it fails, and show the error. In order to do this, use the "-q" flag (to ignore not-up-to-date files) and add a new "--unmerged" flag that allows unmerged entries in the index without any errors. This also avoids marking the index "changed" if an entry isn't actually modified, and makes sure that we exit with an understandable error message if the index is corrupt or unreadable. "read_cache()" no longer returns an error for the caller to check. Finally, make die() and usage() exit with recognizable error codes, if we ever want to check the failure reason in scripts. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-01 22:24:27 +02:00			`if (fd < 0) {`
			`if (errno == ENOENT)`
			`return 0;`
			`die("index file open failed (%s)", strerror(errno));`
			`}`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
read_cache_from(): small simplification This change 'opens' the code block which maps the index file into memory, making the code clearer and easier to read. Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-25 16:18:17 +02:00			`if (fstat(fd, &st))`
Cleanup read_cache_from error handling. When I converted the mmap() call to xmmap() I failed to cleanup the way this routine handles errors and left some crufty code behind. This is a small cleanup, suggested by Johannes. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-26 05:40:58 +01:00			`die("cannot stat the open index (%s)", strerror(errno));`
read_cache_from(): small simplification This change 'opens' the code block which maps the index file into memory, making the code clearer and easier to read. Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-25 16:18:17 +02:00
			`errno = EINVAL;`
			`istate->mmap_size = xsize_t(st.st_size);`
			`if (istate->mmap_size < sizeof(struct cache_header) + 20)`
			`die("index file smaller than expected");`

			`istate->mmap = xmmap(NULL, istate->mmap_size, PROT_READ \| PROT_WRITE, MAP_PRIVATE, fd, 0);`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`close(fd);`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`hdr = istate->mmap;`
			`if (verify_hdr(hdr, istate->mmap_size) < 0)`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`goto unmap;`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`istate->cache_nr = ntohl(hdr->hdr_entries);`
			`istate->cache_alloc = alloc_nr(istate->cache_nr);`
			`istate->cache = xcalloc(istate->cache_alloc, sizeof(struct cache_entry *));`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
			`offset = sizeof(*hdr);`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`for (i = 0; i < istate->cache_nr; i++) {`
			`struct cache_entry *ce;`

			`ce = (struct cache_entry )((char )(istate->mmap) + offset);`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`offset = offset + ce_size(ce);`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`istate->cache[i] = ce;`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`}`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`istate->timestamp = st.st_mtime;`
			`while (offset <= istate->mmap_size - 20 - 8) {`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`/* After an array of active_nr index entries,`
			`* there can be arbitrary number of extended`
			`* sections, each of which is prefixed with`
			`* extension name (4-byte) and section length`
			`* in 4-byte network byte order.`
			`*/`
			`unsigned long extsize;`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`memcpy(&extsize, (char *)(istate->mmap) + offset + 4, 4);`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`extsize = ntohl(extsize);`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`if (read_index_extension(istate,`
			`((const char *) (istate->mmap)) + offset,`
			`(char *) (istate->mmap) + offset + 8,`
Remove all void-pointer arithmetic. ANSI C99 doesn't allow void-pointer arithmetic. This patch fixes this in various ways. Usually the strategy that required the least changes was used. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-18 17:18:09 +02:00			`extsize) < 0)`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`goto unmap;`
			`offset += 8;`
			`offset += extsize;`
			`}`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`return istate->cache_nr;`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
			`unmap:`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`munmap(istate->mmap, istate->mmap_size);`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`errno = EINVAL;`
[PATCH] Better error reporting for "git status" Instead of "git status" ignoring (and hiding) potential errors from the "git-update-index" call, make it exit if it fails, and show the error. In order to do this, use the "-q" flag (to ignore not-up-to-date files) and add a new "--unmerged" flag that allows unmerged entries in the index without any errors. This also avoids marking the index "changed" if an entry isn't actually modified, and makes sure that we exit with an understandable error message if the index is corrupt or unreadable. "read_cache()" no longer returns an error for the caller to check. Finally, make die() and usage() exit with recognizable error codes, if we ever want to check the failure reason in scripts. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-01 22:24:27 +02:00			`die("index file corrupt");`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`}`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`int discard_index(struct index_state *istate)`
Status update on merge-recursive in C This is just an update for people being interested. Alex and me were busy with that project for a few days now. While it has progressed nicely, there are quite a couple TODOs in merge-recursive.c, just search for "TODO". For impatient people: yes, it passes all the tests, and yes, according to the evil test Alex did, it is faster than the Python script. But no, it is not yet finished. Biggest points are: - there are still three external calls - in the end, it should not be necessary to write the index more than once (just before exiting) - a lot of things can be refactored to make the code easier and shorter BTW we cannot just plug in git-merge-tree yet, because git-merge-tree does not handle renames at all. This patch is meant for testing, and as such, - it compile the program to git-merge-recur - it adjusts the scripts and tests to use git-merge-recur instead of git-merge-recursive - it provides "TEST", a script to execute the tests regarding -recursive - it inlines the changes to read-cache.c (read_cache_from(), discard_cache() and refresh_cache_entry()) Brought to you by Alex Riesen and Dscho Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-08 18:42:41 +02:00			`{`
			`int ret;`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`istate->cache_nr = 0;`
			`istate->cache_changed = 0;`
			`istate->timestamp = 0;`
			`cache_tree_free(&(istate->cache_tree));`
			`if (istate->mmap == NULL)`
Status update on merge-recursive in C This is just an update for people being interested. Alex and me were busy with that project for a few days now. While it has progressed nicely, there are quite a couple TODOs in merge-recursive.c, just search for "TODO". For impatient people: yes, it passes all the tests, and yes, according to the evil test Alex did, it is faster than the Python script. But no, it is not yet finished. Biggest points are: - there are still three external calls - in the end, it should not be necessary to write the index more than once (just before exiting) - a lot of things can be refactored to make the code easier and shorter BTW we cannot just plug in git-merge-tree yet, because git-merge-tree does not handle renames at all. This patch is meant for testing, and as such, - it compile the program to git-merge-recur - it adjusts the scripts and tests to use git-merge-recur instead of git-merge-recursive - it provides "TEST", a script to execute the tests regarding -recursive - it inlines the changes to read-cache.c (read_cache_from(), discard_cache() and refresh_cache_entry()) Brought to you by Alex Riesen and Dscho Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-08 18:42:41 +02:00			`return 0;`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`ret = munmap(istate->mmap, istate->mmap_size);`
			`istate->mmap = NULL;`
			`istate->mmap_size = 0;`
Status update on merge-recursive in C This is just an update for people being interested. Alex and me were busy with that project for a few days now. While it has progressed nicely, there are quite a couple TODOs in merge-recursive.c, just search for "TODO". For impatient people: yes, it passes all the tests, and yes, according to the evil test Alex did, it is faster than the Python script. But no, it is not yet finished. Biggest points are: - there are still three external calls - in the end, it should not be necessary to write the index more than once (just before exiting) - a lot of things can be refactored to make the code easier and shorter BTW we cannot just plug in git-merge-tree yet, because git-merge-tree does not handle renames at all. This patch is meant for testing, and as such, - it compile the program to git-merge-recur - it adjusts the scripts and tests to use git-merge-recur instead of git-merge-recursive - it provides "TEST", a script to execute the tests regarding -recursive - it inlines the changes to read-cache.c (read_cache_from(), discard_cache() and refresh_cache_entry()) Brought to you by Alex Riesen and Dscho Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-08 18:42:41 +02:00
			`/* no need to throw away allocated active_cache */`
			`return ret;`
			`}`

Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`#define WRITE_BUFFER_SIZE 8192`
[PATCH] Kill a bunch of pointer sign warnings for gcc4 - Raw hashes should be unsigned char. - String functions want signed char. - Hash and compress functions want unsigned char. Signed-off By: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-18 14:14:09 +02:00			`static unsigned char write_buffer[WRITE_BUFFER_SIZE];`
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`static unsigned long write_buffer_len;`

read-cache: tweak racy-git delay logic Instead of looping over the entries and writing out, use a separate loop after all entries have been written out to check how many entries are racily clean. Make sure that the newly created index file gets the right timestamp when we check by flushing the buffered data by ce_write(). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-08 23:47:32 +02:00			`static int ce_write_flush(SHA_CTX *context, int fd)`
			`{`
			`unsigned int buffered = write_buffer_len;`
			`if (buffered) {`
			`SHA1_Update(context, write_buffer, buffered);`
short i/o: fix calls to write to use xwrite or write_in_full We have a number of badly checked write() calls. Often we are expecting write() to write exactly the size we requested or fail, this fails to handle interrupts or short writes. Switch to using the new write_in_full(). Otherwise we at a minimum need to check for EINTR and EAGAIN, where this is appropriate use xwrite(). Note, the changes to config handling are much larger and handled in the next patch in the sequence. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-08 16:58:23 +01:00			`if (write_in_full(fd, write_buffer, buffered) != buffered)`
read-cache: tweak racy-git delay logic Instead of looping over the entries and writing out, use a separate loop after all entries have been written out to check how many entries are racily clean. Make sure that the newly created index file gets the right timestamp when we check by flushing the buffered data by ce_write(). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-08 23:47:32 +02:00			`return -1;`
			`write_buffer_len = 0;`
			`}`
			`return 0;`
			`}`

Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`static int ce_write(SHA_CTX context, int fd, void data, unsigned int len)`
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`{`
			`while (len) {`
			`unsigned int buffered = write_buffer_len;`
			`unsigned int partial = WRITE_BUFFER_SIZE - buffered;`
			`if (partial > len)`
			`partial = len;`
			`memcpy(write_buffer + buffered, data, partial);`
			`buffered += partial;`
			`if (buffered == WRITE_BUFFER_SIZE) {`
read-cache: tweak racy-git delay logic Instead of looping over the entries and writing out, use a separate loop after all entries have been written out to check how many entries are racily clean. Make sure that the newly created index file gets the right timestamp when we check by flushing the buffered data by ce_write(). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-08 23:47:32 +02:00			`write_buffer_len = buffered;`
			`if (ce_write_flush(context, fd))`
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`return -1;`
			`buffered = 0;`
			`}`
			`write_buffer_len = buffered;`
			`len -= partial;`
Remove all void-pointer arithmetic. ANSI C99 doesn't allow void-pointer arithmetic. This patch fixes this in various ways. Usually the strategy that required the least changes was used. Signed-off-by: Florian Forster <octo@verplant.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-18 17:18:09 +02:00			`data = (char *) data + partial;`
War on whitespace This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-07 09:04:01 +02:00			`}`
			`return 0;`
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`}`

index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`static int write_index_ext_header(SHA_CTX *context, int fd,`
git-write-tree writes garbage on sparc64 In the "next" branch, write_index_ext_header() writes garbage on a 64-bit big-endian machine; the written index file will be unreadable. I noticed this on NetBSD/sparc64. Reproducible with: $ git init-db $ :>file $ git-update-index --add file $ git-write-tree $ git-update-index error: index uses extension, which we do not understand fatal: index file corrupt Signed-off-by: Dennis Stosberg <dennis@stosberg.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-28 21:08:08 +02:00			`unsigned int ext, unsigned int sz)`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`{`
			`ext = htonl(ext);`
			`sz = htonl(sz);`
read-cache.c cleanup Removes conditional returns. Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-14 22:38:14 +02:00			`return ((ce_write(context, fd, &ext, 4) < 0) \|\|`
			`(ce_write(context, fd, &sz, 4) < 0)) ? -1 : 0;`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`}`

			`static int ce_flush(SHA_CTX *context, int fd)`
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`{`
			`unsigned int left = write_buffer_len;`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`if (left) {`
			`write_buffer_len = 0;`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`SHA1_Update(context, write_buffer, left);`
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`}`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00
[PATCH] Fix buffer overflow in ce_flush(). Add a check before appending SHA1 signature to write_buffer, flush it first if necessary. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-11 15:27:47 +02:00			`/* Flush first if not enough space for SHA1 signature */`
			`if (left + 20 > WRITE_BUFFER_SIZE) {`
short i/o: fix calls to write to use xwrite or write_in_full We have a number of badly checked write() calls. Often we are expecting write() to write exactly the size we requested or fail, this fails to handle interrupts or short writes. Switch to using the new write_in_full(). Otherwise we at a minimum need to check for EINTR and EAGAIN, where this is appropriate use xwrite(). Note, the changes to config handling are much larger and handled in the next patch in the sequence. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-08 16:58:23 +01:00			`if (write_in_full(fd, write_buffer, left) != left)`
[PATCH] Fix buffer overflow in ce_flush(). Add a check before appending SHA1 signature to write_buffer, flush it first if necessary. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-11 15:27:47 +02:00			`return -1;`
			`left = 0;`
			`}`

Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`/* Append the SHA1 signature at the end */`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`SHA1_Final(write_buffer + left, context);`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`left += 20;`
short i/o: fix calls to write to use xwrite or write_in_full We have a number of badly checked write() calls. Often we are expecting write() to write exactly the size we requested or fail, this fails to handle interrupts or short writes. Switch to using the new write_in_full(). Otherwise we at a minimum need to check for EINTR and EAGAIN, where this is appropriate use xwrite(). Note, the changes to config handling are much larger and handled in the next patch in the sequence. Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-08 16:58:23 +01:00			`return (write_in_full(fd, write_buffer, left) != left) ? -1 : 0;`
Speed up index file writing by chunking it nicely. No point in making 17,000 small writes when you can make just a couple of hundred nice 8kB writes instead and save a lot of time. 2005-04-20 21:16:57 +02:00			`}`

Racy GIT (part #2) The previous round caught the most trivial case well, but broke down once index file is updated again. Smudge problematic entries (they should be very few if any under normal interactive workflow) before writing a new index file out. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 21:12:18 +01:00			`static void ce_smudge_racily_clean_entry(struct cache_entry *ce)`
			`{`
			`/*`
			`* The only thing we care about in this function is to smudge the`
			`* falsely clean entry due to touch-update-touch race, so we leave`
			`* everything else as they are. We are called for entries whose`
			`* ce_mtime match the index file mtime.`
			`*/`
			`struct stat st;`

			`if (lstat(ce->name, &st) < 0)`
			`return;`
			`if (ce_match_stat_basic(ce, &st))`
			`return;`
			`if (ce_modified_check_fs(ce, &st)) {`
ce_smudge_racily_clean_entry: explain why it works. This is a tricky code and warrants extra commenting. I wasted 30 minutes trying to break it until I realized why it works. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 23:18:47 +01:00			`/* This is "racily clean"; smudge it. Note that this`
			`* is a tricky code. At first glance, it may appear`
			`* that it can break with this sequence:`
			`*`
			`* $ echo xyzzy >frotz`
			`* $ git-update-index --add frotz`
			`* $ : >frotz`
			`* $ sleep 3`
			`* $ echo filfre >nitfol`
			`* $ git-update-index --add nitfol`
			`*`
Racy git: avoid having to be always too careful Immediately after a bulk checkout, most of the paths in the working tree would have the same timestamp as the index file, and this would force ce_match_stat() to take slow path for all of them. When writing an index file out, if many of the paths have very new (read: the same timestamp as the index file being written out) timestamp, we are better off delaying the return from the command, to make sure that later command to touch the working tree files will leave newer timestamps than recorded in the index, thereby avoiding to take the slow path. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-05 13:16:02 +02:00			`* but it does not. When the second update-index runs,`
ce_smudge_racily_clean_entry: explain why it works. This is a tricky code and warrants extra commenting. I wasted 30 minutes trying to break it until I realized why it works. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 23:18:47 +01:00			`* it notices that the entry "frotz" has the same timestamp`
			`* as index, and if we were to smudge it by resetting its`
			`* size to zero here, then the object name recorded`
			`* in index is the 6-byte file but the cached stat information`
			`* becomes zero --- which would then match what we would`
War on whitespace This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-07 09:04:01 +02:00			`* obtain from the filesystem next time we stat("frotz").`
ce_smudge_racily_clean_entry: explain why it works. This is a tricky code and warrants extra commenting. I wasted 30 minutes trying to break it until I realized why it works. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 23:18:47 +01:00			`*`
			`* However, the second update-index, before calling`
			`* this function, notices that the cached size is 6`
			`* bytes and what is on the filesystem is an empty`
			`* file, and never calls us, so the cached size information`
			`* for "frotz" stays 6 which does not match the filesystem.`
			`*/`
Racy GIT (part #2) The previous round caught the most trivial case well, but broke down once index file is updated again. Smudge problematic entries (they should be very few if any under normal interactive workflow) before writing a new index file out. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 21:12:18 +01:00			`ce->ce_size = htonl(0);`
			`}`
			`}`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`int write_index(struct index_state *istate, int newfd)`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`{`
			`SHA_CTX c;`
			`struct cache_header hdr;`
Small cache_tree_write refactor. This function cannot fail, make it void. Also make write_one act on a const char* instead of a char*. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-25 10:22:44 +02:00			`int i, err, removed;`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`struct cache_entry **cache = istate->cache;`
			`int entries = istate->cache_nr;`
[PATCH] Bugfix: read-cache.c:write_cache() misrecords number of entries. When we choose to omit deleted entries, we should subtract numbers of such entries from the total number in the header. Signed-off-by: Junio C Hamano <junkio@cox.net> Oops. Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-10 10:32:37 +02:00
			`for (i = removed = 0; i < entries; i++)`
			`if (!cache[i]->ce_mode)`
			`removed++;`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`hdr.hdr_signature = htonl(CACHE_SIGNATURE);`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`hdr.hdr_version = htonl(2);`
[PATCH] Bugfix: read-cache.c:write_cache() misrecords number of entries. When we choose to omit deleted entries, we should subtract numbers of such entries from the total number in the header. Signed-off-by: Junio C Hamano <junkio@cox.net> Oops. Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-10 10:32:37 +02:00			`hdr.hdr_entries = htonl(entries - removed);`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00
			`SHA1_Init(&c);`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`if (ce_write(&c, newfd, &hdr, sizeof(hdr)) < 0)`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`return -1;`

			`for (i = 0; i < entries; i++) {`
			`struct cache_entry *ce = cache[i];`
git-read-tree: remove deleted files in the working directory Only when "-u" is used of course. 2005-06-10 00:34:04 +02:00			`if (!ce->ce_mode)`
			`continue;`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`if (istate->timestamp &&`
			`istate->timestamp <= ntohl(ce->ce_mtime.sec))`
Racy GIT (part #2) The previous round caught the most trivial case well, but broke down once index file is updated again. Smudge problematic entries (they should be very few if any under normal interactive workflow) before writing a new index file out. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-20 21:12:18 +01:00			`ce_smudge_racily_clean_entry(ce);`
Make the sha1 of the index file go at the very end of the file. This allows us to both calculate it and verify it faster. 2005-04-20 21:36:41 +02:00			`if (ce_write(&c, newfd, ce, ce_size(ce)) < 0)`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`return -1;`
			`}`
read-cache/write-cache: optionally return cache checksum SHA1. read_cache_1() and write_cache_1() takes an extra parameter *sha1 that returns the checksum of the index file when non-NULL. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-24 01:52:08 +02:00
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`/* Write extension data here */`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`if (istate->cache_tree) {`
Small cache_tree_write refactor. This function cannot fail, make it void. Also make write_one act on a const char* instead of a char*. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-25 10:22:44 +02:00			`struct strbuf sb;`

			`strbuf_init(&sb, 0);`
			`cache_tree_write(&sb, istate->cache_tree);`
			`err = write_index_ext_header(&c, newfd, CACHE_EXT_TREE, sb.len) < 0`
			`\|\| ce_write(&c, newfd, sb.buf, sb.len) < 0;`
			`strbuf_release(&sb);`
			`if (err)`
index: make the index file format extensible. ... and move the cache-tree data into it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-25 06:18:58 +02:00			`return -1;`
			`}`
			`return ce_flush(&c, newfd);`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`}`