mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-15 05:33:04 +01:00

711 lines

24 KiB

C

Raw Normal View History

Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`#ifndef CACHE_H`
			`#define CACHE_H`

Clean up compatibility definitions. This attempts to clean up the way various compatibility functions are defined and used. - A new header file, git-compat-util.h, is introduced. This looks at various NO_XXX and does necessary function name replacements, equivalent of -Dstrcasestr=gitstrcasestr in the Makefile. - Those function name replacements are removed from the Makefile. - Common features such as usage(), die(), xmalloc() are moved from cache.h to git-compat-util.h; cache.h includes git-compat-util.h itself. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-05 20:54:29 +01:00			`#include "git-compat-util.h"`
Rewrite convert_to_{git,working_tree} to use strbuf's. * Now, those functions take an "out" strbuf argument, where they store their result if any. In that case, it also returns 1, else it returns 0. * those functions support "in place" editing, in the sense that it's OK to call them this way: convert_to_git(path, sb->buf, sb->len, sb); When doable, conversions are done in place for real, else the strbuf content is just replaced with the new one, transparentely for the caller. If you want to create a new filter working this way, being the accumulation of filter1, filter2, ... filtern, then your meta_filter would be: int meta_filter(..., const char src, size_t len, struct strbuf sb) { int ret = 0; ret \|= filter1(...., src, len, sb); if (ret) { src = sb->buf; len = sb->len; } ret \|= filter2(...., src, len, sb); if (ret) { src = sb->buf; len = sb->len; } .... return ret \| filtern(..., src, len, sb); } That's why subfilters the convert_to_* functions called were also rewritten to work this way. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-16 15:51:04 +02:00			`#include "strbuf.h"`
Create pathname-based hash-table lookup into index This creates a hash index of every single file added to the index. Right now that hash index isn't actually used for much: I implemented a "cache_name_exists()" function that uses it to efficiently look up a filename in the index without having to do the O(logn) binary search, but quite frankly, that's not why this patch is interesting. No, the whole and only reason to create the hash of the filenames in the index is that by modifying the hash function, you can fairly easily do things like making it always hash equivalent names into the same bucket. That, in turn, means that suddenly questions like "does this name exist in the index under an _equivalent_ name?" becomes much much cheaper. Guiding principles behind this patch: - it shouldn't be too costly. In fact, my primary goal here was to actually speed up "git commit" with a fully populated kernel tree, by being faster at checking whether a file already existed in the index. I did succeed, but only barely: Best before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.255s user 0m0.168s sys 0m0.088s Best after: [torvalds@woody linux]$ time ~/git/git commit > /dev/null real 0m0.233s user 0m0.144s sys 0m0.088s so some things are actually faster (~8%). Caveat: that's really the best case. Other things are invariably going to be slightly slower, since we populate that index cache, and quite frankly, few things really use it to look things up. That said, the cost is really quite small. The worst case is probably doing a "git ls-files", which will do very little except puopulate the index, and never actually looks anything up in it, just lists it. Before: [torvalds@woody linux]$ time git ls-files > /dev/null real 0m0.016s user 0m0.016s sys 0m0.000s After: [torvalds@woody linux]$ time ~/git/git ls-files > /dev/null real 0m0.021s user 0m0.012s sys 0m0.008s and while the thing has really gotten relatively much slower, we're still talking about something almost unmeasurable (eg 5ms). And that really should be pretty much the worst case. So we lose 5ms on one "benchmark", but win 22ms on another. Pick your poison - this patch has the advantage that it will _likely_ speed up the cases that are complex and expensive more than it slows down the cases that are already so fast that nobody cares. But if you look at relative speedups/slowdowns, it doesn't look so good. - It should be simple and clean The code may be a bit subtle (the reasons I do hash removal the way I do etc), but it re-uses the existing hash.c files, so it really is fairly small and straightforward apart from a few odd details. Now, this patch on its own doesn't really do much, but I think it's worth looking at, if only because if done correctly, the name hashing really can make an improvement to the whole issue of "do we have a filename that looks like this in the index already". And at least it gets real testing by being used even by default (ie there is a real use-case for it even without any insane filesystems). NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just not ashamed of it enough to really care. I took all the numbers out of my nether regions - I'm sure it's good enough that it works in practice, but the whole point was that you can make a really much fancier hash that hashes characters not directly, but by their upper-case value or something like that, and thus you get a case-insensitive hash, while still keeping the name and the index itself totally case sensitive. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-23 03:41:14 +01:00			`#include "hash.h"`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
Add support for alternate SHA1 library implementations. This one includes the Mozilla SHA1 implementation sent in by Edgar Toernig. It's dual-licenced under MPL-1.1 or GPL, so in the context of git, we obviously use the GPL version. Side note: the Mozilla SHA1 implementation is about twice as fast as the default openssl one on my G5, but the default openssl one has optimized x86 assembly language on x86. So choose wisely. 2005-04-21 21:33:22 +02:00			`#include SHA1_HEADER`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`#include <zlib.h>`

Improve accuracy of check for presence of deflateBound. ZLIB_VERNUM isn't defined in some zlib versions, so this patch does a proper linking test in autoconf to see whether deflateBound exists in zlib. Also, setting NO_DEFLATE_BOUND will also work for folk not using autoconf. Signed-off-by: David Symonds <dsymonds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-07 04:24:28 +01:00			`#if defined(NO_DEFLATE_BOUND) \|\| ZLIB_VERNUM < 0x1200`
[PATCH] compat: support pre-1.2 zlib Older zlib's don't have deflateBound() 2005-04-30 18:51:03 +02:00			`#define deflateBound(c,s) ((s) + (((s) + 7) >> 3) + (((s) + 63) >> 6) + 11)`
			`#endif`

Use setenv(), fix warnings - Fix -Wundef -Wold-style-definition warnings - Make pll_free() static [jc: original patch by Timo had another unrelated bits: - Use setenv() instead of putenv() I'm postponing that part for now.] Signed-off-by: Timo Hirvonen <tihirvon@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-26 16:13:46 +01:00			`#if defined(DT_UNKNOWN) && !defined(NO_D_TYPE_IN_DIRENT)`
[PATCH] compat: missing dirent.d_type field Not everybody has "d_type". 2005-04-30 18:51:03 +02:00			`#define DTYPE(de) ((de)->d_type)`
			`#else`
Undef DT_* before redefining them. When overriding DT_* macro detection with NO_D_TYPE_IN_DIRENT (recent Cygwin build problem, which hopefully is already fixed in their CVS snapshot version), we define DTYPE() macro to return just "we do not know", but still needed to use DT_* macro to avoid ifdef in the code we use them. If the platform defines DT_* macro but with unusable d_type, this would have resulted in us redefining these preprocessor symbols. Admittedly, that would be just a couple of compilation warnings, and on Cygwin at least this particular problem is transitory (the problem is already fixed in their CVS snapshot version), so this is a low priority fix. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-20 22:33:20 +01:00			`#undef DT_UNKNOWN`
			`#undef DT_DIR`
			`#undef DT_REG`
			`#undef DT_LNK`
[PATCH] compat: missing dirent.d_type field Not everybody has "d_type". 2005-04-30 18:51:03 +02:00			`#define DT_UNKNOWN 0`
			`#define DT_DIR 1`
			`#define DT_REG 2`
[PATCH 2/3] Support symlinks in git-ls-files --others. It is kind of surprising that this was missed in the last round, but the work tree scanner in git-ls-files was still deliberately ignoring symlinks. This patch fixes it, so that --others will correctly report unregistered symlinks. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Petr Baudis <pasky@ucw.cz> 2005-05-13 02:16:04 +02:00			`#define DT_LNK 3`
[PATCH] compat: missing dirent.d_type field Not everybody has "d_type". 2005-04-30 18:51:03 +02:00			`#define DTYPE(de) DT_UNKNOWN`
			`#endif`

Add S_IFINVALID mode S_IFINVALID is used to signal, that no mode information is available. Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-22 18:43:56 +02:00			`/* unknown mode (impossible combination S_IFIFO\|S_IFCHR) */`
			`#define S_IFINVALID 0030000`

Add "S_IFDIRLNK" file mode infrastructure for git links This just adds the basic helper functions to recognize and work with git tree entries that are links to other git repositories ("subprojects"). They still aren't actually connected up to any of the code-paths, but now all the infrastructure is in place. The next commit will start actually adding actual subproject support. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-10 06:14:58 +02:00			`/*`
			`* A "directory link" is a link to another git directory.`
			`*`
			`* The value 0160000 is not normally a valid mode, and`
			`* also just happens to be S_IFDIR + S_IFLNK`
			`*`
			`* NOTE! We really shouldn't depend on the S_IFxxx macros`
			`* always having the same values everywhere. We should use`
			`* our internal git values for these things, and then we can`
			`* translate that to the OS-specific value. It just so`
			`* happens that everybody shares the same bit representation`
			`* in the UNIX world (and apparently wider too..)`
			`*/`
rename dirlink to gitlink. Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' \| xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-21 22:08:28 +02:00			`#define S_IFGITLINK 0160000`
			`#define S_ISGITLINK(m) (((m) & S_IFMT) == S_IFGITLINK)`
Add "S_IFDIRLNK" file mode infrastructure for git links This just adds the basic helper functions to recognize and work with git tree entries that are links to other git repositories ("subprojects"). They still aren't actually connected up to any of the code-paths, but now all the infrastructure is in place. The next commit will start actually adding actual subproject support. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-10 06:14:58 +02:00
Add first cut at "git protocol" connect logic. Useful for pulling stuff off a dedicated server. Instead of connecting with ssh or just starting a local pipeline, we connect over TCP to the other side and try to see if there's a git server listening. Of course, since I haven't written the git server yet, that will never happen. But the server really just needs to listen on a port, and execute a "git-upload-pack" when somebody connects. (It should read one packet-line, which should be of the format "git-upload-pack directoryname\n" and eventually we migth have other commands the server might accept). 2005-07-14 03:46:20 +02:00			`/*`
			`* Intensive research over the course of many years has shown that`
			`* port 9418 is totally unused by anything else. Or`
			`*`
			`* Your search - "port 9418" - did not match any documents.`
			`*`
			`* as www.google.com puts it.`
[PATCH] Add note about IANA confirmation The git port (9418) is officially listed by IANA now. So document it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-12 20:23:00 +02:00			`*`
			`* This port has been properly assigned for git use by IANA:`
			`* git (Assigned-9418) [I06-050728-0001].`
			`*`
			`* git 9418/tcp git pack transfer service`
			`* git 9418/udp git pack transfer service`
			`*`
			`* with Linus Torvalds <torvalds@osdl.org> as the point of`
			`* contact. September 2005.`
			`*`
			`* See http://www.iana.org/assignments/port-numbers`
Add first cut at "git protocol" connect logic. Useful for pulling stuff off a dedicated server. Instead of connecting with ssh or just starting a local pipeline, we connect over TCP to the other side and try to see if there's a git server listening. Of course, since I haven't written the git server yet, that will never happen. But the server really just needs to listen on a port, and execute a "git-upload-pack" when somebody connects. (It should read one packet-line, which should be of the format "git-upload-pack directoryname\n" and eventually we migth have other commands the server might accept). 2005-07-14 03:46:20 +02:00			`*/`
			`#define DEFAULT_GIT_PORT 9418`

Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`/*`
			`* Basic data structures for the directory cache`
			`*/`

			`#define CACHE_SIGNATURE 0x44495243 /* "DIRC" */`
			`struct cache_header {`
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`unsigned int hdr_signature;`
			`unsigned int hdr_version;`
			`unsigned int hdr_entries;`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`};`

			`/*`
			`* The "cache_time" is just the low 32 bits of the`
			`* time. It doesn't matter if it overflows - we only`
			`* check it for equality in the 32 bits we save.`
			`*/`
			`struct cache_time {`
			`unsigned int sec;`
			`unsigned int nsec;`
			`};`

			`/*`
			`* dev/ino/uid/gid/size are also just tracked to the low 32 bits`
			`* Again - this is just a (very strong in practice) heuristic that`
			`* the inode hasn't changed.`
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`*`
			`* We save the fields in big-endian order to allow using the`
			`* index file over NFS transparently.`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`*/`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`struct ondisk_cache_entry {`
			`struct cache_time ctime;`
			`struct cache_time mtime;`
			`unsigned int dev;`
			`unsigned int ino;`
			`unsigned int mode;`
			`unsigned int uid;`
			`unsigned int gid;`
			`unsigned int size;`
			`unsigned char sha1[20];`
			`unsigned short flags;`
			`char name[FLEX_ARRAY]; /* more */`
			`};`

Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`struct cache_entry {`
Create pathname-based hash-table lookup into index This creates a hash index of every single file added to the index. Right now that hash index isn't actually used for much: I implemented a "cache_name_exists()" function that uses it to efficiently look up a filename in the index without having to do the O(logn) binary search, but quite frankly, that's not why this patch is interesting. No, the whole and only reason to create the hash of the filenames in the index is that by modifying the hash function, you can fairly easily do things like making it always hash equivalent names into the same bucket. That, in turn, means that suddenly questions like "does this name exist in the index under an _equivalent_ name?" becomes much much cheaper. Guiding principles behind this patch: - it shouldn't be too costly. In fact, my primary goal here was to actually speed up "git commit" with a fully populated kernel tree, by being faster at checking whether a file already existed in the index. I did succeed, but only barely: Best before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.255s user 0m0.168s sys 0m0.088s Best after: [torvalds@woody linux]$ time ~/git/git commit > /dev/null real 0m0.233s user 0m0.144s sys 0m0.088s so some things are actually faster (~8%). Caveat: that's really the best case. Other things are invariably going to be slightly slower, since we populate that index cache, and quite frankly, few things really use it to look things up. That said, the cost is really quite small. The worst case is probably doing a "git ls-files", which will do very little except puopulate the index, and never actually looks anything up in it, just lists it. Before: [torvalds@woody linux]$ time git ls-files > /dev/null real 0m0.016s user 0m0.016s sys 0m0.000s After: [torvalds@woody linux]$ time ~/git/git ls-files > /dev/null real 0m0.021s user 0m0.012s sys 0m0.008s and while the thing has really gotten relatively much slower, we're still talking about something almost unmeasurable (eg 5ms). And that really should be pretty much the worst case. So we lose 5ms on one "benchmark", but win 22ms on another. Pick your poison - this patch has the advantage that it will _likely_ speed up the cases that are complex and expensive more than it slows down the cases that are already so fast that nobody cares. But if you look at relative speedups/slowdowns, it doesn't look so good. - It should be simple and clean The code may be a bit subtle (the reasons I do hash removal the way I do etc), but it re-uses the existing hash.c files, so it really is fairly small and straightforward apart from a few odd details. Now, this patch on its own doesn't really do much, but I think it's worth looking at, if only because if done correctly, the name hashing really can make an improvement to the whole issue of "do we have a filename that looks like this in the index already". And at least it gets real testing by being used even by default (ie there is a real use-case for it even without any insane filesystems). NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just not ashamed of it enough to really care. I took all the numbers out of my nether regions - I'm sure it's good enough that it works in practice, but the whole point was that you can make a really much fancier hash that hashes characters not directly, but by their upper-case value or something like that, and thus you get a case-insensitive hash, while still keeping the name and the index itself totally case sensitive. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-23 03:41:14 +01:00			`struct cache_entry *next;`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`unsigned int ce_ctime;`
			`unsigned int ce_mtime;`
Convert the index file reading/writing to use network byte order. This allows using a git tree over NFS with different byte order, and makes it possible to just copy a fully populated repository and have the end result immediately usable (needing just a refresh to update the stat information). 2005-04-15 19:44:27 +02:00			`unsigned int ce_dev;`
			`unsigned int ce_ino;`
			`unsigned int ce_mode;`
			`unsigned int ce_uid;`
			`unsigned int ce_gid;`
			`unsigned int ce_size;`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`unsigned int ce_flags;`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`unsigned char sha1[20];`
[PATCH] Compilation: zero-length array declaration. ISO C99 (and GCC 3.x or later) lets you write a flexible array at the end of a structure, like this: struct frotz { int xyzzy; char nitfol[]; /* more / }; GCC 2.95 and 2.96 let you to do this with "char nitfol[0]"; unfortunately this is not allowed by ISO C90. This declares such construct like this: struct frotz { int xyzzy; char nitfol[FLEX_ARRAY]; / more */ }; and git-compat-util.h defines FLEX_ARRAY to 0 for gcc 2.95 and empty for others. If you are using a C90 C compiler, you should be able to override this with CFLAGS=-DFLEX_ARRAY=1 from the command line of "make". Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-07 10:33:54 +01:00			`char name[FLEX_ARRAY]; /* more */`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`};`

Make cache entry comparison take the new "state" flag into account. This is what allows us to have multiple states of the same file in the index, and what makes it always sort correctly. 2005-04-16 07:51:44 +02:00			`#define CE_NAMEMASK (0x0fff)`
			`#define CE_STAGEMASK (0x3000)`
"Assume unchanged" git This adds "assume unchanged" logic, started by this message in the list discussion recently: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org> This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit. You can use two new options to update-index to set and reset the CE_VALID bit: git-update-index --assume-unchanged path... git-update-index --no-assume-unchanged path... These forms manipulate only the CE_VALID bit; it does not change the object name recorded in the index file. Nor they add a new entry to the index. When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after: - update-index to explicitly register the current object name to the index file. - when update-index --refresh finds the path to be up-to-date. - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file. The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings. Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability: - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety. - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-). - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified. Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path". This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there probably needs other cases we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and ready to be tested by people who asked for this feature. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-09 06:15:24 +01:00			`#define CE_VALID (0x8000)`
[PATCH] Add --stage to show-files for new stage dircache. This adds --stage option to show-files command. It shows file-mode, SHA1, stage and pathname. Record separator follows the usual convention of -z option as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-16 17:33:23 +02:00			`#define CE_STAGESHIFT 12`
Make cache entry comparison take the new "state" flag into account. This is what allows us to have multiple states of the same file in the index, and what makes it always sort correctly. 2005-04-16 07:51:44 +02:00
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`/* In-memory only */`
			`#define CE_UPDATE (0x10000)`
			`#define CE_REMOVE (0x20000)`
Avoid running lstat(2) on the same cache entry. Aside from the lstat(2) done for work tree files, there are quite many lstat(2) calls in refname dwimming codepath. This patch is not about reducing them. * It adds a new ce_flag, CE_UPTODATE, that is meant to mark the cache entries that record a regular file blob that is up to date in the work tree. If somebody later walks the index and wants to see if the work tree has changes, they do not have to be checked with lstat(2) again. * fill_stat_cache_info() marks the cache entry it just added with CE_UPTODATE. This has the effect of marking the paths we write out of the index and lstat(2) immediately as "no need to lstat -- we know it is up-to-date", from quite a lot fo callers: - git-apply --index - git-update-index - git-checkout-index - git-add (uses add_file_to_index()) - git-commit (ditto) - git-mv (ditto) * refresh_cache_ent() also marks the cache entry that are clean with CE_UPTODATE. * write_index is changed not to write CE_UPTODATE out to the index file, because CE_UPTODATE is meant to be transient only in core. For the same reason, CE_UPDATE is not written to prevent an accident from happening. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-19 08:45:24 +01:00			`#define CE_UPTODATE (0x40000)`
Create pathname-based hash-table lookup into index This creates a hash index of every single file added to the index. Right now that hash index isn't actually used for much: I implemented a "cache_name_exists()" function that uses it to efficiently look up a filename in the index without having to do the O(logn) binary search, but quite frankly, that's not why this patch is interesting. No, the whole and only reason to create the hash of the filenames in the index is that by modifying the hash function, you can fairly easily do things like making it always hash equivalent names into the same bucket. That, in turn, means that suddenly questions like "does this name exist in the index under an _equivalent_ name?" becomes much much cheaper. Guiding principles behind this patch: - it shouldn't be too costly. In fact, my primary goal here was to actually speed up "git commit" with a fully populated kernel tree, by being faster at checking whether a file already existed in the index. I did succeed, but only barely: Best before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.255s user 0m0.168s sys 0m0.088s Best after: [torvalds@woody linux]$ time ~/git/git commit > /dev/null real 0m0.233s user 0m0.144s sys 0m0.088s so some things are actually faster (~8%). Caveat: that's really the best case. Other things are invariably going to be slightly slower, since we populate that index cache, and quite frankly, few things really use it to look things up. That said, the cost is really quite small. The worst case is probably doing a "git ls-files", which will do very little except puopulate the index, and never actually looks anything up in it, just lists it. Before: [torvalds@woody linux]$ time git ls-files > /dev/null real 0m0.016s user 0m0.016s sys 0m0.000s After: [torvalds@woody linux]$ time ~/git/git ls-files > /dev/null real 0m0.021s user 0m0.012s sys 0m0.008s and while the thing has really gotten relatively much slower, we're still talking about something almost unmeasurable (eg 5ms). And that really should be pretty much the worst case. So we lose 5ms on one "benchmark", but win 22ms on another. Pick your poison - this patch has the advantage that it will _likely_ speed up the cases that are complex and expensive more than it slows down the cases that are already so fast that nobody cares. But if you look at relative speedups/slowdowns, it doesn't look so good. - It should be simple and clean The code may be a bit subtle (the reasons I do hash removal the way I do etc), but it re-uses the existing hash.c files, so it really is fairly small and straightforward apart from a few odd details. Now, this patch on its own doesn't really do much, but I think it's worth looking at, if only because if done correctly, the name hashing really can make an improvement to the whole issue of "do we have a filename that looks like this in the index already". And at least it gets real testing by being used even by default (ie there is a real use-case for it even without any insane filesystems). NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just not ashamed of it enough to really care. I took all the numbers out of my nether regions - I'm sure it's good enough that it works in practice, but the whole point was that you can make a really much fancier hash that hashes characters not directly, but by their upper-case value or something like that, and thus you get a case-insensitive hash, while still keeping the name and the index itself totally case sensitive. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-23 03:41:14 +01:00			`#define CE_UNHASHED (0x80000)`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00
index: be careful when handling long names We currently use lower 12-bit (masked with CE_NAMEMASK) in the ce_flags field to store the length of the name in cache_entry, without checking the length parameter given to create_ce_flags(). This can make us store incorrect length. Currently we are mostly protected by the fact that many codepaths first copy the path in a variable of size PATH_MAX, which typically is 4096 that happens to match the limit, but that feels like a bug waiting to happen. Besides, that would not allow us to shorten the width of CE_NAMEMASK to use the bits for new flags. This redefines the meaning of the name length stored in the cache_entry. A name that does not fit is represented by storing CE_NAMEMASK in the field, and the actual length needs to be computed by actually counting the bytes in the name[] field. This way, only the unusually long paths need to suffer. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-19 08:42:00 +01:00			`static inline unsigned create_ce_flags(size_t len, unsigned stage)`
			`{`
			`if (len >= CE_NAMEMASK)`
			`len = CE_NAMEMASK;`
			`return (len \| (stage << CE_STAGESHIFT));`
			`}`

			`static inline size_t ce_namelen(const struct cache_entry *ce)`
			`{`
			`size_t len = ce->ce_flags & CE_NAMEMASK;`
			`if (len < CE_NAMEMASK)`
			`return len;`
			`return strlen(ce->name + CE_NAMEMASK) + CE_NAMEMASK;`
			`}`

[PATCH] Add --stage to show-files for new stage dircache. This adds --stage option to show-files command. It shows file-mode, SHA1, stage and pathname. Record separator follows the usual convention of -z option as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-16 17:33:23 +02:00			`#define ce_size(ce) cache_entry_size(ce_namelen(ce))`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`#define ondisk_ce_size(ce) ondisk_cache_entry_size(ce_namelen(ce))`
			`#define ce_stage(ce) ((CE_STAGEMASK & (ce)->ce_flags) >> CE_STAGESHIFT)`
Avoid running lstat(2) on the same cache entry. Aside from the lstat(2) done for work tree files, there are quite many lstat(2) calls in refname dwimming codepath. This patch is not about reducing them. * It adds a new ce_flag, CE_UPTODATE, that is meant to mark the cache entries that record a regular file blob that is up to date in the work tree. If somebody later walks the index and wants to see if the work tree has changes, they do not have to be checked with lstat(2) again. * fill_stat_cache_info() marks the cache entry it just added with CE_UPTODATE. This has the effect of marking the paths we write out of the index and lstat(2) immediately as "no need to lstat -- we know it is up-to-date", from quite a lot fo callers: - git-apply --index - git-update-index - git-checkout-index - git-add (uses add_file_to_index()) - git-commit (ditto) - git-mv (ditto) * refresh_cache_ent() also marks the cache entry that are clean with CE_UPTODATE. * write_index is changed not to write CE_UPTODATE out to the index file, because CE_UPTODATE is meant to be transient only in core. For the same reason, CE_UPDATE is not written to prevent an accident from happening. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-19 08:45:24 +01:00			`#define ce_uptodate(ce) ((ce)->ce_flags & CE_UPTODATE)`
			`#define ce_mark_uptodate(ce) ((ce)->ce_flags \|= CE_UPTODATE)`
[PATCH] Add --stage to show-files for new stage dircache. This adds --stage option to show-files command. It shows file-mode, SHA1, stage and pathname. Record separator follows the usual convention of -z option as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-16 17:33:23 +02:00
Be much more liberal about the file mode bits. We only really care about the difference between a file being executable or not (by its owner). Everything else we leave for the user umask to decide. 2005-04-17 07:26:31 +02:00			`#define ce_permissions(mode) (((mode) & 0100) ? 0755 : 0644)`
[PATCH] git and symlinks as tracked content Allow to store and track symlink in the repository. A symlink is stored the same way as a regular file, only with the appropriate mode bits set. The symlink target is therefore stored in a blob object. This will hopefully make our udev repository fully functional. :) Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-05 14:38:25 +02:00			`static inline unsigned int create_ce_mode(unsigned int mode)`
			`{`
			`if (S_ISLNK(mode))`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`return S_IFLNK;`
rename dirlink to gitlink. Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' \| xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-21 22:08:28 +02:00			`if (S_ISDIR(mode) \|\| S_ISGITLINK(mode))`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`return S_IFGITLINK;`
			`return S_IFREG \| ce_permissions(mode);`
[PATCH] git and symlinks as tracked content Allow to store and track symlink in the repository. A symlink is stored the same way as a regular file, only with the appropriate mode bits set. The symlink target is therefore stored in a blob object. This will hopefully make our udev repository fully functional. :) Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-05 14:38:25 +02:00			`}`
Do not take mode bits from index after type change. When we do not trust executable bit from lstat(2), we copied existing ce_mode bits without checking if the filesystem object is a regular file (which is the only thing we apply the "trust executable bit" business) nor if the blob in the index is a regular file (otherwise, we should do the same as registering a new regular file, which is to default non-executable). Noticed by Johannes Sixt. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-17 07:43:48 +01:00			`static inline unsigned int ce_mode_from_stat(struct cache_entry *ce, unsigned int mode)`
			`{`
Add core.symlinks to mark filesystems that do not support symbolic links. Some file systems that can host git repositories and their working copies do not support symbolic links. But then if the repository contains a symbolic link, it is impossible to check out the working copy. This patch enables partial support of symbolic links so that it is possible to check out a working copy on such a file system. A new flag core.symlinks (which is true by default) can be set to false to indicate that the filesystem does not support symbolic links. In this case, symbolic links that exist in the trees are checked out as small plain files, and checking in modifications of these files preserve the symlink property in the database (as long as an entry exists in the index). Of course, this does not magically make symbolic links work on such defective file systems; hence, this solution does not help if the working copy relies on that an entry is a real symbolic link. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-02 22:11:30 +01:00			`extern int trust_executable_bit, has_symlinks;`
			`if (!has_symlinks && S_ISREG(mode) &&`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`ce && S_ISLNK(ce->ce_mode))`
Add core.symlinks to mark filesystems that do not support symbolic links. Some file systems that can host git repositories and their working copies do not support symbolic links. But then if the repository contains a symbolic link, it is impossible to check out the working copy. This patch enables partial support of symbolic links so that it is possible to check out a working copy on such a file system. A new flag core.symlinks (which is true by default) can be set to false to indicate that the filesystem does not support symbolic links. In this case, symbolic links that exist in the trees are checked out as small plain files, and checking in modifications of these files preserve the symlink property in the database (as long as an entry exists in the index). Of course, this does not magically make symbolic links work on such defective file systems; hence, this solution does not help if the working copy relies on that an entry is a real symbolic link. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-02 22:11:30 +01:00			`return ce->ce_mode;`
Do not take mode bits from index after type change. When we do not trust executable bit from lstat(2), we copied existing ce_mode bits without checking if the filesystem object is a regular file (which is the only thing we apply the "trust executable bit" business) nor if the blob in the index is a regular file (otherwise, we should do the same as registering a new regular file, which is to default non-executable). Noticed by Johannes Sixt. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-17 07:43:48 +01:00			`if (!trust_executable_bit && S_ISREG(mode)) {`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`if (ce && S_ISREG(ce->ce_mode))`
Do not take mode bits from index after type change. When we do not trust executable bit from lstat(2), we copied existing ce_mode bits without checking if the filesystem object is a regular file (which is the only thing we apply the "trust executable bit" business) nor if the blob in the index is a regular file (otherwise, we should do the same as registering a new regular file, which is to default non-executable). Noticed by Johannes Sixt. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-17 07:43:48 +01:00			`return ce->ce_mode;`
			`return create_ce_mode(0666);`
			`}`
			`return create_ce_mode(mode);`
			`}`
tree/diff header cleanup. Introduce tree-walk.[ch] and move "struct tree_desc" and associated functions from various places. Rename DIFF_FILE_CANON_MODE(mode) macro to canon_mode(mode) and move it to cache.h. This macro returns the canonicalized st_mode value in the host byte order for files, symlinks and directories -- to be compared with a tree_desc entry. create_ce_mode(mode) in cache.h is similar but is intended to be used for index entries (so it does not work for directories) and returns the value in the network byte order. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-30 08:55:43 +02:00			`#define canon_mode(mode) \`
			`(S_ISREG(mode) ? (S_IFREG \| ce_permissions(mode)) : \`
rename dirlink to gitlink. Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' \| xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-21 22:08:28 +02:00			`S_ISLNK(mode) ? S_IFLNK : S_ISDIR(mode) ? S_IFDIR : S_IFGITLINK)`
Be much more liberal about the file mode bits. We only really care about the difference between a file being executable or not (by its owner). Everything else we leave for the user umask to decide. 2005-04-17 07:26:31 +02:00
[PATCH] Add --stage to show-files for new stage dircache. This adds --stage option to show-files command. It shows file-mode, SHA1, stage and pathname. Record separator follows the usual convention of -z option as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-16 17:33:23 +02:00			`#define cache_entry_size(len) ((offsetof(struct cache_entry,name) + (len) + 8) & ~7)`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`#define ondisk_cache_entry_size(len) ((offsetof(struct ondisk_cache_entry,name) + (len) + 8) & ~7)`
Encode a few extra flags per index entry. This will allow us to have the same name in different "states" in the index at the same time. Which in turn seems to be a very simple way to merge. 2005-04-16 06:45:38 +02:00
Move index-related variables into a structure. This defines a index_state structure and moves index-related global variables into it. Currently there is one instance of it, the_index, and everybody accesses it, so there is no code change. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 03:14:06 +02:00			`struct index_state {`
			`struct cache_entry **cache;`
			`unsigned int cache_nr, cache_alloc, cache_changed;`
			`struct cache_tree *cache_tree;`
			`time_t timestamp;`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`void *alloc;`
lazy index hashing This delays the hashing of index names until it becomes necessary for the first time. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-23 08:01:13 +01:00			`unsigned name_hash_initialized : 1;`
Create pathname-based hash-table lookup into index This creates a hash index of every single file added to the index. Right now that hash index isn't actually used for much: I implemented a "cache_name_exists()" function that uses it to efficiently look up a filename in the index without having to do the O(logn) binary search, but quite frankly, that's not why this patch is interesting. No, the whole and only reason to create the hash of the filenames in the index is that by modifying the hash function, you can fairly easily do things like making it always hash equivalent names into the same bucket. That, in turn, means that suddenly questions like "does this name exist in the index under an _equivalent_ name?" becomes much much cheaper. Guiding principles behind this patch: - it shouldn't be too costly. In fact, my primary goal here was to actually speed up "git commit" with a fully populated kernel tree, by being faster at checking whether a file already existed in the index. I did succeed, but only barely: Best before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.255s user 0m0.168s sys 0m0.088s Best after: [torvalds@woody linux]$ time ~/git/git commit > /dev/null real 0m0.233s user 0m0.144s sys 0m0.088s so some things are actually faster (~8%). Caveat: that's really the best case. Other things are invariably going to be slightly slower, since we populate that index cache, and quite frankly, few things really use it to look things up. That said, the cost is really quite small. The worst case is probably doing a "git ls-files", which will do very little except puopulate the index, and never actually looks anything up in it, just lists it. Before: [torvalds@woody linux]$ time git ls-files > /dev/null real 0m0.016s user 0m0.016s sys 0m0.000s After: [torvalds@woody linux]$ time ~/git/git ls-files > /dev/null real 0m0.021s user 0m0.012s sys 0m0.008s and while the thing has really gotten relatively much slower, we're still talking about something almost unmeasurable (eg 5ms). And that really should be pretty much the worst case. So we lose 5ms on one "benchmark", but win 22ms on another. Pick your poison - this patch has the advantage that it will _likely_ speed up the cases that are complex and expensive more than it slows down the cases that are already so fast that nobody cares. But if you look at relative speedups/slowdowns, it doesn't look so good. - It should be simple and clean The code may be a bit subtle (the reasons I do hash removal the way I do etc), but it re-uses the existing hash.c files, so it really is fairly small and straightforward apart from a few odd details. Now, this patch on its own doesn't really do much, but I think it's worth looking at, if only because if done correctly, the name hashing really can make an improvement to the whole issue of "do we have a filename that looks like this in the index already". And at least it gets real testing by being used even by default (ie there is a real use-case for it even without any insane filesystems). NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just not ashamed of it enough to really care. I took all the numbers out of my nether regions - I'm sure it's good enough that it works in practice, but the whole point was that you can make a really much fancier hash that hashes characters not directly, but by their upper-case value or something like that, and thus you get a case-insensitive hash, while still keeping the name and the index itself totally case sensitive. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-23 03:41:14 +01:00			`struct hash_table name_hash;`
Move index-related variables into a structure. This defines a index_state structure and moves index-related global variables into it. Currently there is one instance of it, the_index, and everybody accesses it, so there is no code change. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 03:14:06 +02:00			`};`

			`extern struct index_state the_index;`

Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`#ifndef NO_THE_INDEX_COMPATIBILITY_MACROS`
Move index-related variables into a structure. This defines a index_state structure and moves index-related global variables into it. Currently there is one instance of it, the_index, and everybody accesses it, so there is no code change. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 03:14:06 +02:00			`#define active_cache (the_index.cache)`
			`#define active_nr (the_index.cache_nr)`
			`#define active_alloc (the_index.cache_alloc)`
			`#define active_cache_changed (the_index.cache_changed)`
			`#define active_cache_tree (the_index.cache_tree)`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`#define read_cache() read_index(&the_index)`
			`#define read_cache_from(path) read_index_from(&the_index, (path))`
			`#define write_cache(newfd, cache, entries) write_index(&the_index, (newfd))`
			`#define discard_cache() discard_index(&the_index)`
			`#define cache_name_pos(name, namelen) index_name_pos(&the_index,(name),(namelen))`
			`#define add_cache_entry(ce, option) add_index_entry(&the_index, (ce), (option))`
			`#define remove_cache_entry_at(pos) remove_index_entry_at(&the_index, (pos))`
			`#define remove_file_from_cache(path) remove_file_from_index(&the_index, (path))`
			`#define add_file_to_cache(path, verbose) add_file_to_index(&the_index, (path), (verbose))`
git-add: Add support for --refresh option. This allows to refresh only a subset of the project files, based on the specified pathspecs. Signed-off-by: Alexandre Julliard <julliard@winehq.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-11 23:59:01 +02:00			`#define refresh_cache(flags) refresh_index(&the_index, (flags), NULL, NULL)`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00			`#define ce_match_stat(ce, st, options) ie_match_stat(&the_index, (ce), (st), (options))`
			`#define ce_modified(ce, st, options) ie_modified(&the_index, (ce), (st), (options))`
Create pathname-based hash-table lookup into index This creates a hash index of every single file added to the index. Right now that hash index isn't actually used for much: I implemented a "cache_name_exists()" function that uses it to efficiently look up a filename in the index without having to do the O(logn) binary search, but quite frankly, that's not why this patch is interesting. No, the whole and only reason to create the hash of the filenames in the index is that by modifying the hash function, you can fairly easily do things like making it always hash equivalent names into the same bucket. That, in turn, means that suddenly questions like "does this name exist in the index under an _equivalent_ name?" becomes much much cheaper. Guiding principles behind this patch: - it shouldn't be too costly. In fact, my primary goal here was to actually speed up "git commit" with a fully populated kernel tree, by being faster at checking whether a file already existed in the index. I did succeed, but only barely: Best before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.255s user 0m0.168s sys 0m0.088s Best after: [torvalds@woody linux]$ time ~/git/git commit > /dev/null real 0m0.233s user 0m0.144s sys 0m0.088s so some things are actually faster (~8%). Caveat: that's really the best case. Other things are invariably going to be slightly slower, since we populate that index cache, and quite frankly, few things really use it to look things up. That said, the cost is really quite small. The worst case is probably doing a "git ls-files", which will do very little except puopulate the index, and never actually looks anything up in it, just lists it. Before: [torvalds@woody linux]$ time git ls-files > /dev/null real 0m0.016s user 0m0.016s sys 0m0.000s After: [torvalds@woody linux]$ time ~/git/git ls-files > /dev/null real 0m0.021s user 0m0.012s sys 0m0.008s and while the thing has really gotten relatively much slower, we're still talking about something almost unmeasurable (eg 5ms). And that really should be pretty much the worst case. So we lose 5ms on one "benchmark", but win 22ms on another. Pick your poison - this patch has the advantage that it will _likely_ speed up the cases that are complex and expensive more than it slows down the cases that are already so fast that nobody cares. But if you look at relative speedups/slowdowns, it doesn't look so good. - It should be simple and clean The code may be a bit subtle (the reasons I do hash removal the way I do etc), but it re-uses the existing hash.c files, so it really is fairly small and straightforward apart from a few odd details. Now, this patch on its own doesn't really do much, but I think it's worth looking at, if only because if done correctly, the name hashing really can make an improvement to the whole issue of "do we have a filename that looks like this in the index already". And at least it gets real testing by being used even by default (ie there is a real use-case for it even without any insane filesystems). NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just not ashamed of it enough to really care. I took all the numbers out of my nether regions - I'm sure it's good enough that it works in practice, but the whole point was that you can make a really much fancier hash that hashes characters not directly, but by their upper-case value or something like that, and thus you get a case-insensitive hash, while still keeping the name and the index itself totally case sensitive. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-23 03:41:14 +01:00			`#define cache_name_exists(name, namelen) index_name_exists(&the_index, (name), (namelen))`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`#endif`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
index_fd(): use enum object_type instead of type name string. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-28 20:45:56 +01:00			`enum object_type {`
			`OBJ_BAD = -1,`
			`OBJ_NONE = 0,`
			`OBJ_COMMIT = 1,`
			`OBJ_TREE = 2,`
			`OBJ_BLOB = 3,`
			`OBJ_TAG = 4,`
			`/* 5 for future expansion */`
			`OBJ_OFS_DELTA = 6,`
			`OBJ_REF_DELTA = 7,`
			`OBJ_MAX,`
			`};`

rename: Break filepairs with different types. When we consider if a path has been totally rewritten, we did not touch changes from symlinks to files or vice versa. But a change that modifies even the type of a blob surely should count as a complete rewrite. While we are at it, modernise diffcore-break to be aware of gitlinks (we do not want to touch them). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-01 07:22:38 +01:00			`static inline enum object_type object_type(unsigned int mode)`
			`{`
			`return S_ISDIR(mode) ? OBJ_TREE :`
			`S_ISGITLINK(mode) ? OBJ_COMMIT :`
			`OBJ_BLOB;`
			`}`

Introduce GIT_DIR environment variable. During the mailing list discussion on renaming GIT_ environment variables, people felt that having one environment that lets the user (or Porcelain) specify both SHA1_FILE_DIRECTORY (now GIT_OBJECT_DIRECTORY) and GIT_INDEX_FILE for the default layout would be handy. This change introduces GIT_DIR environment variable, from which the defaults for GIT_INDEX_FILE and GIT_OBJECT_DIRECTORY are derived. When GIT_DIR is not defined, it defaults to ".git". GIT_INDEX_FILE defaults to "$GIT_DIR/index" and GIT_OBJECT_DIRECTORY defaults to "$GIT_DIR/objects". Special thanks for ideas and discussions go to Petr Baudis and Daniel Barkalow. Bugs are mine ;-) Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-10 07:57:58 +02:00			`#define GIT_DIR_ENVIRONMENT "GIT_DIR"`
introduce GIT_WORK_TREE to specify the work tree setup_gdg is used as abbreviation for setup_git_directory_gently. The work tree can be specified using the environment variable GIT_WORK_TREE and the config option core.worktree (the environment variable has precendence over the config option). Additionally there is a command line option --work-tree which sets the environment variable. setup_gdg does the following now: GIT_DIR unspecified repository in .git directory parent directory of the .git directory is used as work tree, GIT_WORK_TREE is ignored GIT_DIR unspecified repository in cwd GIT_DIR is set to cwd see the cases with GIT_DIR specified what happens next and also see the note below GIT_DIR specified GIT_WORK_TREE/core.worktree unspecified cwd is used as work tree GIT_DIR specified GIT_WORK_TREE/core.worktree specified the specified work tree is used Note on the case where GIT_DIR is unspecified and repository is in cwd: GIT_WORK_TREE is used but is_inside_git_dir is always true. I did it this way because setup_gdg might be called multiple times (e.g. when doing alias expansion) and in successive calls setup_gdg should do the same thing every time. Meaning of is_bare/is_inside_work_tree/is_inside_git_dir: (1) is_bare_repository A repository is bare if core.bare is true or core.bare is unspecified and the name suggests it is bare (directory not named .git). The bare option disables a few protective checks which are useful with a working tree. Currently this changes if a repository is bare: updates of HEAD are allowed git gc packs the refs the reflog is disabled by default (2) is_inside_work_tree True if the cwd is inside the associated working tree (if there is one), false otherwise. (3) is_inside_git_dir True if the cwd is inside the git directory, false otherwise. Before this patch is_inside_git_dir was always true for bare repositories. When setup_gdg finds a repository git_config(git_default_config) is always called. This ensure that is_bare_repository makes use of core.bare and does not guess even though core.bare is specified. inside_work_tree and inside_git_dir are set if setup_gdg finds a repository. The is_inside_work_tree and is_inside_git_dir functions will die if they are called before a successful call to setup_gdg. Signed-off-by: Matthias Lederhofer <matled@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-06 09:10:42 +02:00			`#define GIT_WORK_TREE_ENVIRONMENT "GIT_WORK_TREE"`
Introduce GIT_DIR environment variable. During the mailing list discussion on renaming GIT_ environment variables, people felt that having one environment that lets the user (or Porcelain) specify both SHA1_FILE_DIRECTORY (now GIT_OBJECT_DIRECTORY) and GIT_INDEX_FILE for the default layout would be handy. This change introduces GIT_DIR environment variable, from which the defaults for GIT_INDEX_FILE and GIT_OBJECT_DIRECTORY are derived. When GIT_DIR is not defined, it defaults to ".git". GIT_INDEX_FILE defaults to "$GIT_DIR/index" and GIT_OBJECT_DIRECTORY defaults to "$GIT_DIR/objects". Special thanks for ideas and discussions go to Petr Baudis and Daniel Barkalow. Bugs are mine ;-) Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-10 07:57:58 +02:00			`#define DEFAULT_GIT_DIR_ENVIRONMENT ".git"`
Rename environment variables. H. Peter Anvin mentioned that using SHA1_whatever as an environment variable name is not nice and we should instead use names starting with "GIT_" prefix to avoid conflicts. Here is what this patch does: * Renames the following environment variables: New name Old Name GIT_AUTHOR_DATE AUTHOR_DATE GIT_AUTHOR_EMAIL AUTHOR_EMAIL GIT_AUTHOR_NAME AUTHOR_NAME GIT_COMMITTER_EMAIL COMMIT_AUTHOR_EMAIL GIT_COMMITTER_NAME COMMIT_AUTHOR_NAME GIT_ALTERNATE_OBJECT_DIRECTORIES SHA1_FILE_DIRECTORIES GIT_OBJECT_DIRECTORY SHA1_FILE_DIRECTORY * Introduces a compatibility macro, gitenv(), which does an getenv() and if it fails calls gitenv_bc(), which in turn picks up the value from old name while giving a warning about using an old name. * Changes all users of the environment variable to fetch environment variable with the new name using gitenv(). * Updates the documentation and scripts shipped with Linus GIT distribution. The transition plan is as follows: * We will keep the backward compatibility list used by gitenv() for now, so the current scripts and user environments continue to work as before. The users will get warnings when they have old name but not new name in their environment to the stderr. * The Porcelain layers should start using new names. However, just in case it ends up calling old Plumbing layer implementation, they should also export old names, taking values from the corresponding new names, during the transition period. * After a transition period, we would drop the compatibility support and drop gitenv(). Revert the callers to directly call getenv() but keep using the new names. The last part is probably optional and the transition duration needs to be set to a reasonable value. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-10 02:57:56 +02:00			`#define DB_ENVIRONMENT "GIT_OBJECT_DIRECTORY"`
Add support for a "GIT_INDEX_FILE" environment variable. We use that to specify alternative index files, which can be useful if you want to (for example) generate a temporary index file to do some specific operation that you don't want to mess with your main one with. It defaults to the regular ".git/index" if it hasn't been specified. 2005-04-21 19:55:18 +02:00			`#define INDEX_ENVIRONMENT "GIT_INDEX_FILE"`
Teach parse_commit_buffer about grafting. Introduce a new file $GIT_DIR/info/grafts (or $GIT_GRAFT_FILE) which is a list of "fake commit parent records". Each line of this file is a commit ID, followed by parent commit IDs, all 40-byte hex SHA1 separated by a single SP in between. The records override the parent information we would normally read from the commit objects, allowing both adding "fake" parents (i.e. grafting), and pretending as if a commit is not a child of some of its real parents (i.e. cauterizing). Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-30 09:58:28 +02:00			`#define GRAFT_ENVIRONMENT "GIT_GRAFT_FILE"`
Use preprocessor constants for environment variable names. We broke the discipline Linus set up to allow compiler help us avoid typos in environment names in the early days of git over time. This defines a handful preprocessor constants for environment variable names used in relatively core parts of the system. I've left out variable names specific to subsystems such as HTTP and SSL as I do not think they are big problems. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-19 10:28:15 +01:00			`#define TEMPLATE_DIR_ENVIRONMENT "GIT_TEMPLATE_DIR"`
			`#define CONFIG_ENVIRONMENT "GIT_CONFIG"`
			`#define CONFIG_LOCAL_ENVIRONMENT "GIT_CONFIG_LOCAL"`
			`#define EXEC_PATH_ENVIRONMENT "GIT_EXEC_PATH"`
Add basic infrastructure to assign attributes to paths This adds the basic infrastructure to assign attributes to paths, in a way similar to what the exclusion mechanism does based on $GIT_DIR/info/exclude and .gitignore files. An attribute is just a simple string that does not contain any whitespace. They can be specified in $GIT_DIR/info/attributes file, and .gitattributes file in each directory. Each line in these files defines a pattern matching rule. Similar to the exclusion mechanism, a later match overrides an earlier match in the same file, and entries from .gitattributes file in the same directory takes precedence over the ones from parent directories. Lines in $GIT_DIR/info/attributes file are used as the lowest precedence default rules. A line is either a comment (an empty line, or a line that begins with a '#'), or a rule, which is a whitespace separated list of tokens. The first token on the line is a shell glob pattern. The rest are names of attributes, each of which can optionally be prefixed with '!'. Such a line means "if a path matches this glob, this attribute is set (or unset -- if the attribute name is prefixed with '!'). For glob matching, the same "if the pattern does not have a slash in it, the basename of the path is matched with fnmatch(3) against the pattern, otherwise, the path is matched with the pattern with FNM_PATHNAME" rule as the exclusion mechanism is used. This does not define what an attribute means. Tying an attribute to various effects it has on git operation for paths that have it will be specified separately. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-12 10:07:32 +02:00			`#define GITATTRIBUTES_FILE ".gitattributes"`
			`#define INFOATTRIBUTES_FILE "info/attributes"`
attribute macro support This adds "attribute macros" (for lack of better name). So far, we have low-level attributes such as crlf and diff, which are defined in operational terms --- setting or unsetting them on a particular path directly affects what is done to the path. For example, in order to decline diffs or crlf conversions on a binary blob, no diffs on PostScript files, and treat all other files normally, you would have something like these: * diff crlf .ps !diff proprietary.o !diff !crlf That is fine as the operation goes, but gets unwieldy rather rapidly, when we start adding more low-level attributes that are defined in operational terms. A near-term example of such an attribute would be 'merge-3way' which would control if git should attempt the usual 3-way file-level merge internally, or leave merging to a specialized external program of user's choice. When it is added, we do _not_ want to force the users to update the above to: diff crlf merge-3way .ps !diff proprietary.o !diff !crlf !merge-3way The way this patch solves this issue is to realize that the attributes the user is assigning to paths are not defined in terms of operations but in terms of what they are. All of the three low-level attributes usually make sense for most of the files that sane SCM users have git operate on (these files are typically called "text'). Only a few cases, such as binary blob, need exception to decline the "usual treatment given to text files" -- and people mark them as "binary". So this allows the $GIT_DIR/info/alternates and .gitattributes at the toplevel of the project to also specify attributes that assigns other attributes. The syntax is '[attr]' followed by an attribute name followed by a list of attribute names: [attr] binary !diff !crlf !merge-3way When "binary" attribute is set to a path, if the path has not got diff/crlf/merge-3way attribute set or unset by other rules, this rule unsets the three low-level attributes. It is expected that the user level .gitattributes will be expressed mostly in terms of attributes based on what the files are, and the above sample would become like this: (built-in attribute configuration) [attr] binary !diff !crlf !merge-3way diff crlf merge-3way (project specific .gitattributes) proprietary.o binary (user preference $GIT_DIR/info/attributes) .ps !diff There are a few caveats. As described above, you can define these macros only in $GIT_DIR/info/attributes and toplevel .gitattributes. * There is no attempt to detect circular definition of macro attributes, and definitions are evaluated from bottom to top as usual to fill in other attributes that have not yet got values. The following would work as expected: [attr] text diff crlf [attr] ps text !diff .ps ps while this would most likely not (I haven't tried): [attr] ps text !diff [attr] text diff crlf .ps ps * When a macro says "[attr] A B !C", saying that a path does not have attribute A does not let you tell anything about attributes B or C. That is, given this: [attr] text diff crlf [attr] ps text !diff .txt !ps path hello.txt, which would match ".txt" pattern, would have "ps" attribute set to zero, but that does not make text attribute of hello.txt set to false (nor diff attribute set to true). Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-14 17:54:37 +02:00			`#define ATTRIBUTE_MACRO_PREFIX "[attr]"`
Add support for a "GIT_INDEX_FILE" environment variable. We use that to specify alternative index files, which can be useful if you want to (for example) generate a temporary index file to do some specific operation that you don't want to mess with your main one with. It defaults to the regular ".git/index" if it hasn't been specified. 2005-04-21 19:55:18 +02:00
Introduce is_bare_repository() and core.bare configuration variable This removes the old is_bare_git_dir(const char ) to ask if a directory, if it is a GIT_DIR, is a bare repository, and replaces it with is_bare_repository(void ). The function looks at core.bare configuration variable if exists but uses the old heuristics: if it is ".git" or ends with "/.git", then it does not look like a bare repository, otherwise it does. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-07 11:00:28 +01:00			`extern int is_bare_repository_cfg;`
			`extern int is_bare_repository(void);`
Do not verify filenames in a bare repository For example, it makes no sense to check the presence of a file named "HEAD" when calling "git log HEAD" in a bare repository. Noticed by Han-Wen Nienhuys. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> 2007-01-20 03:09:34 +01:00			`extern int is_inside_git_dir(void);`
Clean up work-tree handling The old version of work-tree support was an unholy mess, barely readable, and not to the point. For example, why do you have to provide a worktree, when it is not used? As in "git status". Now it works. Another riddle was: if you can have work trees inside the git dir, why are some programs complaining that they need a work tree? IOW it is allowed to call $ git --git-dir=../ --work-tree=. bla when you really want to. In this case, you are both in the git directory and in the working tree. So, programs have to actually test for the right thing, namely if they are inside a working tree, and not if they are inside a git directory. Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was specified, unless there is a repository in the current working directory. It does now. The logic to determine if a repository is bare, or has a work tree (tertium non datur), is this: --work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true, which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR ends in /.git, which overrides the directory in which .git/ was found. In related news, a long standing bug was fixed: when in .git/bla/x.git/, which is a bare repository, git formerly assumed ../.. to be the appropriate git dir. This problem was reported by Shawn Pearce to have caused much pain, where a colleague mistakenly ran "git init" in "/" a long time ago, and bare repositories just would not work. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-01 02:30:14 +02:00			`extern char *git_work_tree_cfg;`
introduce GIT_WORK_TREE to specify the work tree setup_gdg is used as abbreviation for setup_git_directory_gently. The work tree can be specified using the environment variable GIT_WORK_TREE and the config option core.worktree (the environment variable has precendence over the config option). Additionally there is a command line option --work-tree which sets the environment variable. setup_gdg does the following now: GIT_DIR unspecified repository in .git directory parent directory of the .git directory is used as work tree, GIT_WORK_TREE is ignored GIT_DIR unspecified repository in cwd GIT_DIR is set to cwd see the cases with GIT_DIR specified what happens next and also see the note below GIT_DIR specified GIT_WORK_TREE/core.worktree unspecified cwd is used as work tree GIT_DIR specified GIT_WORK_TREE/core.worktree specified the specified work tree is used Note on the case where GIT_DIR is unspecified and repository is in cwd: GIT_WORK_TREE is used but is_inside_git_dir is always true. I did it this way because setup_gdg might be called multiple times (e.g. when doing alias expansion) and in successive calls setup_gdg should do the same thing every time. Meaning of is_bare/is_inside_work_tree/is_inside_git_dir: (1) is_bare_repository A repository is bare if core.bare is true or core.bare is unspecified and the name suggests it is bare (directory not named .git). The bare option disables a few protective checks which are useful with a working tree. Currently this changes if a repository is bare: updates of HEAD are allowed git gc packs the refs the reflog is disabled by default (2) is_inside_work_tree True if the cwd is inside the associated working tree (if there is one), false otherwise. (3) is_inside_git_dir True if the cwd is inside the git directory, false otherwise. Before this patch is_inside_git_dir was always true for bare repositories. When setup_gdg finds a repository git_config(git_default_config) is always called. This ensure that is_bare_repository makes use of core.bare and does not guess even though core.bare is specified. inside_work_tree and inside_git_dir are set if setup_gdg finds a repository. The is_inside_work_tree and is_inside_git_dir functions will die if they are called before a successful call to setup_gdg. Signed-off-by: Matthias Lederhofer <matled@gmx.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-06 09:10:42 +02:00			`extern int is_inside_work_tree(void);`
git_dir holds pointers to local strings, hence MUST be const. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-23 12:39:11 +02:00			`extern const char *get_git_dir(void);`
Introduce GIT_DIR environment variable. During the mailing list discussion on renaming GIT_ environment variables, people felt that having one environment that lets the user (or Porcelain) specify both SHA1_FILE_DIRECTORY (now GIT_OBJECT_DIRECTORY) and GIT_INDEX_FILE for the default layout would be handy. This change introduces GIT_DIR environment variable, from which the defaults for GIT_INDEX_FILE and GIT_OBJECT_DIRECTORY are derived. When GIT_DIR is not defined, it defaults to ".git". GIT_INDEX_FILE defaults to "$GIT_DIR/index" and GIT_OBJECT_DIRECTORY defaults to "$GIT_DIR/objects". Special thanks for ideas and discussions go to Petr Baudis and Daniel Barkalow. Bugs are mine ;-) Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-10 07:57:58 +02:00			`extern char *get_object_directory(void);`
[PATCH] Operations on refs This patch adds code to read a hash out of a specified file under {GIT_DIR}/refs/, and to write such files atomically and optionally with an compare and lock. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-06 22:31:29 +02:00			`extern char *get_refs_directory(void);`
Introduce GIT_DIR environment variable. During the mailing list discussion on renaming GIT_ environment variables, people felt that having one environment that lets the user (or Porcelain) specify both SHA1_FILE_DIRECTORY (now GIT_OBJECT_DIRECTORY) and GIT_INDEX_FILE for the default layout would be handy. This change introduces GIT_DIR environment variable, from which the defaults for GIT_INDEX_FILE and GIT_OBJECT_DIRECTORY are derived. When GIT_DIR is not defined, it defaults to ".git". GIT_INDEX_FILE defaults to "$GIT_DIR/index" and GIT_OBJECT_DIRECTORY defaults to "$GIT_DIR/objects". Special thanks for ideas and discussions go to Petr Baudis and Daniel Barkalow. Bugs are mine ;-) Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-10 07:57:58 +02:00			`extern char *get_index_file(void);`
Teach parse_commit_buffer about grafting. Introduce a new file $GIT_DIR/info/grafts (or $GIT_GRAFT_FILE) which is a list of "fake commit parent records". Each line of this file is a commit ID, followed by parent commit IDs, all 40-byte hex SHA1 separated by a single SP in between. The records override the parent information we would normally read from the commit objects, allowing both adding "fake" parents (i.e. grafting), and pretending as if a commit is not a child of some of its real parents (i.e. cauterizing). Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-07-30 09:58:28 +02:00			`extern char *get_graft_file(void);`
Add set_git_dir() function With the function set_git_dir() you can reset the path that will be used for git_path(), git_dir() and friends. The responsibility to close files and throw away information from the old git_dir lies with the caller. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-01 02:29:38 +02:00			`extern int set_git_dir(const char *path);`
Clean up work-tree handling The old version of work-tree support was an unholy mess, barely readable, and not to the point. For example, why do you have to provide a worktree, when it is not used? As in "git status". Now it works. Another riddle was: if you can have work trees inside the git dir, why are some programs complaining that they need a work tree? IOW it is allowed to call $ git --git-dir=../ --work-tree=. bla when you really want to. In this case, you are both in the git directory and in the working tree. So, programs have to actually test for the right thing, namely if they are inside a working tree, and not if they are inside a git directory. Also, GIT_DIR=../.git should behave the same as if no GIT_DIR was specified, unless there is a repository in the current working directory. It does now. The logic to determine if a repository is bare, or has a work tree (tertium non datur), is this: --work-tree=bla overrides GIT_WORK_TREE, which overrides core.bare = true, which overrides core.worktree, which overrides GIT_DIR/.. when GIT_DIR ends in /.git, which overrides the directory in which .git/ was found. In related news, a long standing bug was fixed: when in .git/bla/x.git/, which is a bare repository, git formerly assumed ../.. to be the appropriate git dir. This problem was reported by Shawn Pearce to have caused much pain, where a colleague mistakenly ran "git init" in "/" a long time ago, and bare repositories just would not work. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-01 02:30:14 +02:00			`extern const char *get_git_work_tree(void);`
Introduce GIT_DIR environment variable. During the mailing list discussion on renaming GIT_ environment variables, people felt that having one environment that lets the user (or Porcelain) specify both SHA1_FILE_DIRECTORY (now GIT_OBJECT_DIRECTORY) and GIT_INDEX_FILE for the default layout would be handy. This change introduces GIT_DIR environment variable, from which the defaults for GIT_INDEX_FILE and GIT_OBJECT_DIRECTORY are derived. When GIT_DIR is not defined, it defaults to ".git". GIT_INDEX_FILE defaults to "$GIT_DIR/index" and GIT_OBJECT_DIRECTORY defaults to "$GIT_DIR/objects". Special thanks for ideas and discussions go to Petr Baudis and Daniel Barkalow. Bugs are mine ;-) Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-10 07:57:58 +02:00
			`#define ALTERNATE_DB_ENVIRONMENT "GIT_ALTERNATE_OBJECT_DIRECTORIES"`
Add support for a "GIT_INDEX_FILE" environment variable. We use that to specify alternative index files, which can be useful if you want to (for example) generate a temporary index file to do some specific operation that you don't want to mess with your main one with. It defaults to the regular ".git/index" if it hasn't been specified. 2005-04-21 19:55:18 +02:00
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`extern const char *get_pathspec(const char prefix, const char **pathspec);`
Refactor working tree setup Create a setup_work_tree() that can be used from any command requiring a working tree conditionally. Signed-off-by: Mike Hommey <mh@glandium.org> Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-03 12:23:11 +01:00			`extern void setup_work_tree(void);`
working from subdirectory: preparation - prefix_filename() is like prefix_path() but can be used to name any file on the filesystem, not the files that might go into the index file. - setup_git_directory_gently() tries to find the GIT_DIR, but does not die() if called outside a git repository. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-26 08:14:15 +01:00			`extern const char setup_git_directory_gently(int );`
[PATCH] Make "git diff" work inside relative subdirectories We always show the diff as an absolute path, but pathnames to diff are taken relative to the current working directory (and if no pathnames are given, the default ends up being all of the current working directory). Note that "../xyz" also works, so you can do cd linux/drivers/char git diff ../block and it will generate a diff of the linux/drivers/block changes. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-17 03:06:34 +02:00			`extern const char *setup_git_directory(void);`
Diff clean-up. This is a long overdue clean-up to the code for parsing and passing diff options. It also tightens some constness issues. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-21 09:00:47 +02:00			`extern const char prefix_path(const char prefix, int len, const char *path);`
working from subdirectory: preparation - prefix_filename() is like prefix_path() but can be used to name any file on the filesystem, not the files that might go into the index file. - setup_git_directory_gently() tries to find the GIT_DIR, but does not die() if called outside a git repository. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-26 08:14:15 +01:00			`extern const char prefix_filename(const char prefix, int len, const char *path);`
Fix filename verification when in a subdirectory When we are in a subdirectory of a git archive, we need to take the prefix of that subdirectory into accoung when we verify filename arguments. Noted by Matthias Lederhofer This also uses the improved error reporting for all the other git commands that use the revision parsing interfaces, not just git-rev-parse. Also, it makes the error reporting for mixed filenames and argument flags clearer (you cannot put flags after the start of the pathname list). [jc: with fix to a trivial typo noticed by Timo Hirvonen] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-26 19:15:54 +02:00			`extern void verify_filename(const char prefix, const char name);`
revision parsing: make "rev -- paths" checks stronger. If you don't have a "--" marker, then: - all of the arguments we are going to assume are pathspecs must exist in the working tree. - none of the arguments we parsed as revisions could be interpreted as a filename. so that there really isn't any possibility of confusion in case somebody does have a revision that looks like a pathname too. The former rule has been in effect; this implements the latter. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-04-27 00:09:27 +02:00			`extern void verify_non_filename(const char prefix, const char name);`
[PATCH] Make "git diff" work inside relative subdirectories We always show the diff as an absolute path, but pathnames to diff are taken relative to the current working directory (and if no pathnames are given, the default ends up being all of the current working directory). Note that "../xyz" also works, so you can do cd linux/drivers/char git diff ../block and it will generate a diff of the linux/drivers/block changes. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-17 03:06:34 +02:00
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`#define alloc_nr(x) (((x)+16)*3/2)`

refactor dir_add_name This is in preparation for keeping two entry lists in the dir object. This patch adds and uses the ALLOC_GROW() macro, which implements the commonly used idiom of growing a dynamic array using the alloc_nr function (not just in dir.c, but everywhere). We also move creation of a dir_entry to dir_entry_new. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:44 +02:00			`/*`
			`* Realloc the buffer pointed at by variable 'x' so that it can hold`
			`* at least 'nr' entries; the number of entries currently allocated`
			`* is 'alloc', using the standard growing factor alloc_nr() macro.`
			`*`
			`* DO NOT USE any expression with side-effect for 'x' or 'alloc'.`
			`*/`
			`#define ALLOC_GROW(x, nr, alloc) \`
			`do { \`
Fix ALLOC_GROW off-by-one The ALLOC_GROW macro will never let us fill the array completely, instead allocating an extra chunk if that would be the case. This is because the 'nr' argument was originally treated as "how much we do have now" instead of "how much do we want". The latter makes much more sense because you can grow by more than one item. This off-by-one never resulted in an error because it meant we were overly conservative about when to allocate. Any callers which passed "how much we have now" need to be updated, or they will fail to allocate enough. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-17 00:37:39 +02:00			`if ((nr) > alloc) { \`
Extend --pretty=oneline to cover the first paragraph, so that an ugly commit message like this can be handled sanely. Currently, --pretty=oneline and --pretty=email (hence format-patch) take and use only the first line of the commit log message. This changes them to: - Take the first paragraph, where the definition of the first paragraph is "skip all blank lines from the beginning, and then grab everything up to the next empty line". - Replace all line breaks with a whitespace. This change would not affect a well-behaved commit message that adheres to the convention of "single line summary, a blank line, and then body of message", as its first paragraph always consists of a single line. Commit messages from different culture, such as the ones imported from CVS/SVN, can however get chomped with the existing behaviour at the first linebreak in the middle of sentence right now, which would become much easier to see with this change. The Subject: and --pretty=oneline output would become very long and unsightly for non-conforming commits, but their messages are already ugly anyway, and thischange at least avoids the loss of information. The Subject: line from a multi-line paragraph is folded using RFC2822 line folding rules at the places where line breaks were in the original. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-12 07:10:55 +02:00			`if (alloc_nr(alloc) < (nr)) \`
			`alloc = (nr); \`
			`else \`
			`alloc = alloc_nr(alloc); \`
refactor dir_add_name This is in preparation for keeping two entry lists in the dir object. This patch adds and uses the ALLOC_GROW() macro, which implements the commonly used idiom of growing a dynamic array using the alloc_nr function (not just in dir.c, but everywhere). We also move creation of a dir_entry to dir_entry_new. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:44 +02:00			`x = xrealloc((x), alloc * sizeof(*(x))); \`
			`} \`
			`} while(0)`

Make the cache stat information comparator public. Like the cache filename finder, it's a generically useful function, rather than something specific to the current "show-diff" thing. 2005-04-09 18:48:20 +02:00			`/* Initialize and use the cache information */`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`extern int read_index(struct index_state *);`
			`extern int read_index_from(struct index_state , const char path);`
			`extern int write_index(struct index_state *, int newfd);`
			`extern int discard_index(struct index_state *);`
Prevent bogus paths from being added to the index. With this one, it's now a fatal error to try to add a pathname that cannot be added with "git add", i.e. [torvalds@g5 git]$ git add .git/config fatal: unable to add .git/config to index and [torvalds@g5 git]$ git add foo/../bar fatal: unable to add foo/../bar to index instead of the old "Ignoring path xyz" warning that would end up silently succeeding on any other paths. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-18 21:07:31 +02:00			`extern int verify_path(const char *path);`
Create pathname-based hash-table lookup into index This creates a hash index of every single file added to the index. Right now that hash index isn't actually used for much: I implemented a "cache_name_exists()" function that uses it to efficiently look up a filename in the index without having to do the O(logn) binary search, but quite frankly, that's not why this patch is interesting. No, the whole and only reason to create the hash of the filenames in the index is that by modifying the hash function, you can fairly easily do things like making it always hash equivalent names into the same bucket. That, in turn, means that suddenly questions like "does this name exist in the index under an _equivalent_ name?" becomes much much cheaper. Guiding principles behind this patch: - it shouldn't be too costly. In fact, my primary goal here was to actually speed up "git commit" with a fully populated kernel tree, by being faster at checking whether a file already existed in the index. I did succeed, but only barely: Best before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.255s user 0m0.168s sys 0m0.088s Best after: [torvalds@woody linux]$ time ~/git/git commit > /dev/null real 0m0.233s user 0m0.144s sys 0m0.088s so some things are actually faster (~8%). Caveat: that's really the best case. Other things are invariably going to be slightly slower, since we populate that index cache, and quite frankly, few things really use it to look things up. That said, the cost is really quite small. The worst case is probably doing a "git ls-files", which will do very little except puopulate the index, and never actually looks anything up in it, just lists it. Before: [torvalds@woody linux]$ time git ls-files > /dev/null real 0m0.016s user 0m0.016s sys 0m0.000s After: [torvalds@woody linux]$ time ~/git/git ls-files > /dev/null real 0m0.021s user 0m0.012s sys 0m0.008s and while the thing has really gotten relatively much slower, we're still talking about something almost unmeasurable (eg 5ms). And that really should be pretty much the worst case. So we lose 5ms on one "benchmark", but win 22ms on another. Pick your poison - this patch has the advantage that it will _likely_ speed up the cases that are complex and expensive more than it slows down the cases that are already so fast that nobody cares. But if you look at relative speedups/slowdowns, it doesn't look so good. - It should be simple and clean The code may be a bit subtle (the reasons I do hash removal the way I do etc), but it re-uses the existing hash.c files, so it really is fairly small and straightforward apart from a few odd details. Now, this patch on its own doesn't really do much, but I think it's worth looking at, if only because if done correctly, the name hashing really can make an improvement to the whole issue of "do we have a filename that looks like this in the index already". And at least it gets real testing by being used even by default (ie there is a real use-case for it even without any insane filesystems). NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just not ashamed of it enough to really care. I took all the numbers out of my nether regions - I'm sure it's good enough that it works in practice, but the whole point was that you can make a really much fancier hash that hashes characters not directly, but by their upper-case value or something like that, and thus you get a case-insensitive hash, while still keeping the name and the index itself totally case sensitive. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-23 03:41:14 +01:00			`extern int index_name_exists(struct index_state istate, const char name, int namelen);`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`extern int index_name_pos(struct index_state , const char name, int namelen);`
Add git-update-cache --replace option. When "path" exists as a file or a symlink in the index, an attempt to add "path/file" is refused because it results in file vs directory conflict. Similarly when "path/file1", "path/file2", etc. exist, an attempt to add "path" as a file or a symlink is refused. With git-update-cache --replace, these existing entries that conflict with the entry being added are automatically removed from the cache, with warning messages. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-05-08 06:55:21 +02:00			`#define ADD_CACHE_OK_TO_ADD 1 /* Ok to add */`
			`#define ADD_CACHE_OK_TO_REPLACE 2 /* Ok to replace file/directory */`
[PATCH] Fix oversimplified optimization for add_cache_entry(). An earlier change to optimize directory-file conflict check broke what "read-tree --emu23" expects. This is fixed by this commit. (1) Introduces an explicit flag to tell add_cache_entry() not to check for conflicts and use it when reading an existing tree into an empty stage --- by definition this case can never introduce such conflicts. (2) Makes read-cache.c:has_file_name() and read-cache.c:has_dir_name() aware of the cache stages, and flag conflict only with paths in the same stage. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-25 11:25:29 +02:00			`#define ADD_CACHE_SKIP_DFCHECK 4 /* Ok to skip DF conflict checks */`
Optimize "diff --cached" performance. The read_tree() function is called only from the call chain to run "git diff --cached" (this includes the internal call made by git-runstatus to run_diff_index()). The function vacates stage without any funky "merge" magic. The caller then goes and compares stage #1 entries from the tree with stage #0 entries from the original index. When adding the cache entries this way, it used the general purpose add_cache_entry(). This function looks for an existing entry to replace or if there is none to find where to insert the new entry, resolves D/F conflict and all the other things. For the purpose of reading entries into an empty stage, none of that processing is needed. We can instead append everything and then sort the result at the end. This commit changes read_tree() to first make sure that there is no existing cache entries at specified stage, and if that is the case, it runs add_cache_entry() with ADD_CACHE_JUST_APPEND flag (new), and then sort the resulting cache using qsort(). This new flag tells add_cache_entry() to omit all the checks such as "Does this path already exist? Does adding this path remove other existing entries because it turns a directory to a file?" and instead append the given cache entry straight at the end of the active cache. The caller of course is expected to sort the resulting cache at the end before using the result. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-09 22:42:50 +02:00			`#define ADD_CACHE_JUST_APPEND 8 /* Append only; tree.c::read_tree() */`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`extern int add_index_entry(struct index_state , struct cache_entry ce, int option);`
Extract helper bits from c-merge-recursive work This backports the pieces that are not uncooked from the merge-recursive WIP we have seen earlier, to be used in git-mv rewritten in C. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-26 06:32:18 +02:00			`extern struct cache_entry refresh_cache_entry(struct cache_entry ce, int really);`
Make read-cache.c "the_index" free. This makes all low-level functions defined in read-cache.c to take an explicit index_state structure as their first parameter, to specify which index to work on. These functions traditionally operated on "the_index" and were named foo_cache(); the counterparts this patch introduces are called foo_index(). The traditional foo_cache() functions are made into macros that give "the_index" to their corresponding foo_index() functions. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-02 08:26:07 +02:00			`extern int remove_index_entry_at(struct index_state *, int pos);`
			`extern int remove_file_from_index(struct index_state , const char path);`
			`extern int add_file_to_index(struct index_state , const char path, int verbose);`
Move make_cache_entry() from merge-recursive.c into read-cache.c The function make_cache_entry() is too useful to be hidden away in merge-recursive. So move it to libgit.a (exposing it via cache.h). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-11 05:17:28 +02:00			`extern struct cache_entry make_cache_entry(unsigned int mode, const unsigned char sha1, const char *path, int stage, int refresh);`
Rename some more cache-related functions same_name -> ce_same_name() remove_entry_at() -> remove_cache_entry_at() Signed-off-by: Brad Roberts <braddr@puremagic.com> Signed-off-by: Petr Baudis <pasky@ucw.cz> 2005-05-15 04:04:25 +02:00			`extern int ce_same_name(struct cache_entry a, struct cache_entry b);`
ce_match_stat, run_diff_files: use symbolic constants for readability ce_match_stat() can be told: (1) to ignore CE_VALID bit (used under "assume unchanged" mode) and perform the stat comparison anyway; (2) not to perform the contents comparison for racily clean entries and report mismatch of cached stat information; using its "option" parameter. Give them symbolic constants. Similarly, run_diff_files() can be told not to report anything on removed paths. Also give it a symbolic constant for that. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-10 09:15:03 +01:00
			`/* do stat comparison even if CE_VALID is true */`
			`#define CE_MATCH_IGNORE_VALID 01`
			`/* do not check the contents but report dirty on racily-clean entries */`
			`#define CE_MATCH_RACY_IS_DIRTY 02`
			`extern int ie_match_stat(struct index_state , struct cache_entry , struct stat *, unsigned int);`
			`extern int ie_modified(struct index_state , struct cache_entry , struct stat *, unsigned int);`

Make "ce_match_path()" a generic helper function ... and make git-diff-files use it too. This all _should_ make the diffcore-pathspec.c phase unnecessary, since the diff'ers now all do the path matching early interally. 2005-07-15 01:55:06 +02:00			`extern int ce_path_match(const struct cache_entry ce, const char *pathspec);`
index_fd(): pass optional path parameter as hint for blob conversion Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-28 20:52:04 +01:00			`extern int index_fd(unsigned char sha1, int fd, struct stat st, int write_object, enum object_type type, const char *path);`
Allow saving an object from a pipe In order to support getting data into git with scripts, this adds a --stdin option to git-hash-object, which will make it read from stdin. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-10 23:25:24 +01:00			`extern int index_pipe(unsigned char sha1, int fd, const char type, int write_object);`
Show original and resulting blob object info in diff output. This adds more cruft to diff --git header to record the blob SHA1 and the mode the patch/diff is intended to be applied against, to help the receiving end fall back on a three-way merge. The new header looks like this: diff --git a/apply.c b/apply.c index 7be5041..8366082 100644 --- a/apply.c +++ b/apply.c @@ -14,6 +14,7 @@ // files that are being modified, but doesn't apply the patch // --stat does just a diffstat, and doesn't actually apply +// --show-index-info shows the old and new index info for... ... Upon receiving such a patch, if the patch did not apply cleanly to the target tree, the recipient can try to find the matching old objects in her object database and create a temporary tree, apply the patch to that temporary tree, and attempt a 3-way merge between the patched temporary tree and the target tree using the original temporary tree as the common ancestor. The patch lifts the code to compute the hash for an on-filesystem object from update-index.c and makes it available to the diff output routine. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-07 12:42:00 +02:00			`extern int index_path(unsigned char sha1, const char path, struct stat *st, int write_object);`
[PATCH] Implement git-checkout-cache -u to update stat information in the cache. With -u flag, git-checkout-cache picks up the stat information from newly created file and updates the cache. This removes the need to run git-update-cache --refresh immediately after running git-checkout-cache. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-15 23:23:12 +02:00			`extern void fill_stat_cache_info(struct cache_entry ce, struct stat st);`

Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00			`#define REFRESH_REALLY 0x0001 /* ignore_valid */`
			`#define REFRESH_UNMERGED 0x0002 /* allow unmerged */`
			`#define REFRESH_QUIET 0x0004 /* be quiet about it */`
			`#define REFRESH_IGNORE_MISSING 0x0008 /* ignore non-existent */`
git-add: Add support for --refresh option. This allows to refresh only a subset of the project files, based on the specified pathspecs. Signed-off-by: Alexandre Julliard <julliard@winehq.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-11 23:59:01 +02:00			`extern int refresh_index(struct index_state , unsigned int flags, const char pathspec, char seen);`
Libify the index refresh logic This cleans up and libifies the "git update-index --[really-]refresh" functionality. This will be eventually required for eventually doing the "commit" and "status" commands as built-ins. It really just moves "refresh_index()" from update-index.c to read-cache.c, but it also has to change the calling convention so that the function uses a "unsigned int flags" argument instead of various static flags variables for passing down the information about whether to be quiet or not, and allow unmerged entries etc. That actually cleans up update-index.c too, since it turns out that all those flags were really specific to that one function of the index update, so they shouldn't have had file-scope visibility even before. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-19 18:56:35 +02:00
Make index file locking code reusable to others. The framework to create lockfiles that are removed at exit is first used to reliably write the index file, but it is applicable to other things, so stop calling it "cache_file". This also rewords a few remaining error message that called the index file "cache file". Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-06 21:51:49 +02:00			`struct lock_file {`
			`struct lock_file *next;`
Close files opened by lock_file() before unlinking. This is needed on Windows since open files cannot be unlinked. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-13 21:05:03 +01:00			`int fd;`
lockfile: record the primary process. The usual process flow is the main process opens and holds the lock to the index, does its thing, perhaps spawning children during the course, and then writes the resulting index out by releaseing the lock. However, the lockfile interface uses atexit(3) to clean it up, without regard to who actually created the lock. This typically leads to a confusing behaviour of lock being released too early when the child exits, and then the parent process when it calls commit_lockfile() finds that it cannot unlock it. This fixes the problem by recording who created and holds the lock, and upon atexit(3) handler, child simply ignores the lockfile the parent created. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-21 12:11:10 +02:00			`pid_t owner;`
Fix infinite loop when deleting multiple packed refs. It was stupid to link the same element twice to lock_file_list and end up in a loop, so we certainly need a fix. But it is not like we are taking a lock on multiple files in this case. It is just that we leave the linked element on the list even after commit_lock_file() successfully removes the cruft. We cannot remove the list element in commit_lock_file(); if we are interrupted in the middle of list manipulation, the call to remove_lock_file_on_signal() will happen with a broken list structure pointed by lock_file_list, which would cause the cruft to remain, so not removing the list element is the right thing to do. Instead we should be reusing the element already on the list. There is already a code for that in lock_file() function in lockfile.c. The code checks lk->next and the element is linked only when it is not already on the list -- which is incorrect for the last element on the list (which has NULL in its next field), but if you read the check as "is this element already on the list?" it actually makes sense. We do not want to link it on the list again, nor we would want to set up signal/atexit over and over. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-02 20:19:05 +01:00			`char on_list;`
Make index file locking code reusable to others. The framework to create lockfiles that are removed at exit is first used to reliably write the index file, but it is applicable to other things, so stop calling it "cache_file". This also rewords a few remaining error message that called the index file "cache file". Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-06 21:51:49 +02:00			`char filename[PATH_MAX];`
[PATCH] Implement git-checkout-cache -u to update stat information in the cache. With -u flag, git-checkout-cache picks up the stat information from newly created file and updates the cache. This removes the need to run git-update-cache --refresh immediately after running git-checkout-cache. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-15 23:23:12 +02:00			`};`
Better error message when we are unable to lock the index file Most of the callers except the one in refs.c use the function to update the index file. Among the index writers, everybody except write-tree dies if they cannot open it for writing. This gives the function an extra argument, to tell it to die when it cannot create a new file as the lockfile. The only caller that does not have to die is write-tree, because updating the index for the cache-tree part is optional and not being able to do so does not affect the correctness. I think we do not have to be so careful and make the failure into die() the same way as other callers, but that would be a different patch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-12 10:03:47 +02:00			`extern int hold_lock_file_for_update(struct lock_file , const char path, int);`
Make index file locking code reusable to others. The framework to create lockfiles that are removed at exit is first used to reliably write the index file, but it is applicable to other things, so stop calling it "cache_file". This also rewords a few remaining error message that called the index file "cache file". Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-06 21:51:49 +02:00			`extern int commit_lock_file(struct lock_file *);`
_GIT_INDEX_OUTPUT: allow plumbing to output to an alternative index file. When defined, this allows plumbing commands that update the index (add, apply, checkout-index, merge-recursive, mv, read-tree, rm, update-index, and write-tree) to write their resulting index to an alternative index file while holding a lock to the original index file. With this, git-commit that jumps the index does not have to make an extra copy of the index file, and more importantly, it can do the update while holding the lock on the index. However, I think the interface to let an environment variable specify the output is a mistake, as shown in the documentation. If a curious user has the environment variable set to something other than the file GIT_INDEX_FILE points at, almost everything will break. This should instead be a command line parameter to tell these plumbing commands to write the result in the named file, to prevent stupid mistakes. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-01 08:09:02 +02:00
			`extern int hold_locked_index(struct lock_file *, int);`
			`extern int commit_locked_index(struct lock_file *);`
git-read-tree --index-output=<file> This corrects the interface mistake of the previous one, and gives a command line parameter to the only plumbing command that currently needs it: "git-read-tree". We can add the calls to set_alternate_index_output() to other plumbing commands that update the index if/when needed. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-01 08:27:41 +02:00			`extern void set_alternate_index_output(const char *);`
close_lock_file(): new function in the lockfile API The lockfile API is a handy way to obtain a file that is cleaned up if you die(). But sometimes you would need this sequence to work: 1. hold_lock_file_for_update() to get a file descriptor for writing; 2. write the contents out, without being able to decide if the results should be committed or rolled back; 3. do something else that makes the decision --- and this "something else" needs the lockfile not to have an open file descriptor for writing (e.g. Windows do not want a open file to be renamed); 4. call commit_lock_file() or rollback_lock_file() as appropriately. This adds close_lock_file() you can call between step 2 and 3 in the above sequence. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-16 20:05:32 +01:00			`extern int close_lock_file(struct lock_file *);`
Make index file locking code reusable to others. The framework to create lockfiles that are removed at exit is first used to reliably write the index file, but it is applicable to other things, so stop calling it "cache_file". This also rewords a few remaining error message that called the index file "cache file". Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-06 21:51:49 +02:00			`extern void rollback_lock_file(struct lock_file *);`
Use const qualifier for 'sha1' parameter in delete_ref function delete_ref function does not change the 'sha1' parameter. Non-const pointer causes a compiler warning if you call to the function using a const argument. Signed-off-by: Carlos Rica <jasampler@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-18 05:34:34 +02:00			`extern int delete_ref(const char , const unsigned char sha1);`
Make the cache stat information comparator public. Like the cache filename finder, it's a generically useful function, rather than something specific to the current "show-diff" thing. 2005-04-09 18:48:20 +02:00
apply --whitespace: configuration option. The new configuration option apply.whitespace can take one of "warn", "error", "error-all", or "strip". When git-apply is run to apply the patch to the index, they are used as the default value if there is no command line --whitespace option. Andrew can now tell people who feed him git trees to update to this version and say: git repo-config apply.whitespace error Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-27 23:47:45 +01:00			`/* Environment bits from configuration mechanism */`
Add ".git/config" file parser This is a first cut at a very simple parser for a git config file. The format of the file is a simple ini-file like thing, with simple variable/value pairs. You can (and should) make the variables have a simple single-level scope, ie a valid file looks something like this: # # This is the config file, and # a '#' or ';' character indicates # a comment # ; core variables [core] ; Don't trust file modes filemode = false ; Our diff algorithm [diff] external = "/usr/local/bin/gnu-diff -u" renames = true which parses into three variables: "core.filemode" is associated with the string "false", and "diff.external" gets the appropriate quoted value. Right now we only react to one variable: "core.filemode" is a boolean that decides if we should care about the 0100 (user-execute) bit of the stat information. Even that is just a parsing demonstration - this doesn't actually implement that st_mode compare logic itself. Different programs can react to different config options, although they should always fall back to calling "git_default_config()" on any config option name that they don't recognize. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-11 01:31:08 +02:00			`extern int trust_executable_bit;`
Add core.quotepath configuration variable. We always quote "unusual" byte values in a pathname using C-string style, to make it safer for parsing scripts that do not handle NUL separated records well (or just too lazy to bother). The absolute minimum bytes that need to be quoted for this purpose are TAB, LF (and other control characters), double quote and backslash. However, we have also always quoted the bytes in high 8-bit range; this was partly because we were lazy and partly because we were being cautious. This introduces an internal "quote_path_fully" variable, and core.quotepath configuration variable to control it. When set to false, it does not quote bytes in high 8-bit range anymore but passes them intact. The variable defaults to "true" to retain the traditional behaviour for now. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-25 00:11:24 +02:00			`extern int quote_path_fully;`
Add core.symlinks to mark filesystems that do not support symbolic links. Some file systems that can host git repositories and their working copies do not support symbolic links. But then if the repository contains a symbolic link, it is impossible to check out the working copy. This patch enables partial support of symbolic links so that it is possible to check out a working copy on such a file system. A new flag core.symlinks (which is true by default) can be set to false to indicate that the filesystem does not support symbolic links. In this case, symbolic links that exist in the trees are checked out as small plain files, and checking in modifications of these files preserve the symlink property in the database (as long as an entry exists in the index). Of course, this does not magically make symbolic links work on such defective file systems; hence, this solution does not help if the working copy relies on that an entry is a real symbolic link. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-02 22:11:30 +01:00			`extern int has_symlinks;`
"Assume unchanged" git This adds "assume unchanged" logic, started by this message in the list discussion recently: <Pine.LNX.4.64.0601311807470.7301@g5.osdl.org> This is a workaround for filesystems that do not have lstat() that is quick enough for the index mechanism to take advantage of. On the paths marked as "assumed to be unchanged", the user needs to explicitly use update-index to register the object name to be in the next commit. You can use two new options to update-index to set and reset the CE_VALID bit: git-update-index --assume-unchanged path... git-update-index --no-assume-unchanged path... These forms manipulate only the CE_VALID bit; it does not change the object name recorded in the index file. Nor they add a new entry to the index. When the configuration variable "core.ignorestat = true" is set, the index entries are marked with CE_VALID bit automatically after: - update-index to explicitly register the current object name to the index file. - when update-index --refresh finds the path to be up-to-date. - when tools like read-tree -u and apply --index update the working tree file and register the current object name to the index file. The flag is dropped upon read-tree that does not check out the index entry. This happens regardless of the core.ignorestat settings. Index entries marked with CE_VALID bit are assumed to be unchanged most of the time. However, there are cases that CE_VALID bit is ignored for the sake of safety and usability: - while "git-read-tree -m" or git-apply need to make sure that the paths involved in the merge do not have local modifications. This sacrifices performance for safety. - when git-checkout-index -f -q -u -a tries to see if it needs to checkout the paths. Otherwise you can never check anything out ;-). - when git-update-index --really-refresh (a new flag) tries to see if the index entry is up to date. You can start with everything marked as CE_VALID and run this once to drop CE_VALID bit for paths that are modified. Most notably, "update-index --refresh" honours CE_VALID and does not actively stat, so after you modified a file in the working tree, update-index --refresh would not notice until you tell the index about it with "git-update-index path" or "git-update-index --no-assume-unchanged path". This version is not expected to be perfect. I think diff between index and/or tree and working files may need some adjustment, and there probably needs other cases we should automatically unmark paths that are marked to be CE_VALID. But the basics seem to work, and ready to be tested by people who asked for this feature. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-09 06:15:24 +01:00			`extern int assume_unchanged;`
core.prefersymlinkrefs: use symlinks for .git/HEAD When inspecting a project whose build infrastructure used to assume that .git/HEAD is a symlink ref, core.prefersymlinkrefs in the config file of such a project would help to bisect its history. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-02 09:40:24 +02:00			`extern int prefer_symlink_refs;`
Log ref updates to logs/refs/<ref> If config parameter core.logAllRefUpdates is true or the log file already exists then append a line to ".git/logs/refs/<ref>" whenever git-update-ref <ref> is executed. Each log line contains the following information: oldsha1 <SP> newsha1 <SP> committer <LF> where committer is the current user, date, time and timezone in the standard GIT ident format. If the caller is unable to append to the log file then git-update-ref will fail without updating <ref>. An optional message may be included in the log line with the -m flag. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 11:55:40 +02:00			`extern int log_all_ref_updates;`
core.warnambiguousrefs: warns when "name" is used and both "name" branch and tag exists. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-03-21 03:45:47 +01:00			`extern int warn_ambiguous_refs;`
Introduce core.sharedrepository If the config variable 'core.sharedrepository' is set, the directories $GIT_DIR/objects/ $GIT_DIR/objects/?? $GIT_DIR/objects/pack $GIT_DIR/refs $GIT_DIR/refs/heads $GIT_DIR/refs/heads/tags are set group writable (and g+s, since the git group may be not the primary group of all users). Since all files are written as lock files first, and then moved to their destination, they do not have to be group writable. Indeed, if this leads to problems you found a bug. Note that -- as in my first attempt -- the config variable is set in the function which checks the repository format. If this were done in git_default_config instead, a lot of programs would need to be modified to call git_config(git_default_config) first. [jc: git variables should be in environment.c unless there is a compelling reason to do otherwise.] Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-22 23:13:56 +01:00			`extern int shared_repository;`
apply --whitespace: configuration option. The new configuration option apply.whitespace can take one of "warn", "error", "error-all", or "strip". When git-apply is run to apply the patch to the index, they are used as the default value if there is no command line --whitespace option. Andrew can now tell people who feed him git trees to update to this version and say: git repo-config apply.whitespace error Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-27 23:47:45 +01:00			`extern const char *apply_default_whitespace;`
Make zlib compression level configurable, and change default. With the change in default, "git add ." on kernel dir is about twice as fast as before, with only minimal (0.5%) change in object size. The speed difference is even more noticeable when committing large files, which is now up to 8 times faster. The configurability is through setting core.compression = [-1..9] which maps to the zlib constants; -1 is the default, 0 is no compression, and 1..9 are various speed/size tradeoffs, 9 being slowest. Signed-off-by: Joachim B Haga (cjhaga@fys.uio.no) Acked-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-03 22:11:47 +02:00			`extern int zlib_compression_level;`
Custom compression levels for objects and packs Add config variables pack.compression and core.loosecompression , and switch --compression=level to pack-objects. Loose objects will be compressed using core.loosecompression if set, else core.compression if set, else Z_BEST_SPEED. Packed objects will be compressed using --compression=level if seen, else pack.compression if set, else core.compression if set, else Z_DEFAULT_COMPRESSION. This is the "pack compression level". Loose objects added to a pack undeltified will be recompressed to the pack compression level if it is unequal to the current loose compression level by the preceding rules, or if the loose object was written while core.legacyheaders = true. Newly deltified loose objects are always compressed to the current pack compression level. Previously packed objects added to a pack are recompressed to the current pack compression level exactly when their deltification status changes, since the previous pack data cannot be reused. In either case, the --no-reuse-object switch from the first patch below will always force recompression to the current pack compression level, instead of assuming the pack compression level hasn't changed and pack data can be reused when possible. This applies on top of the following patches from Nicolas Pitre: [PATCH] allow for undeltified objects not to be reused [PATCH] make "repack -f" imply "pack-objects --no-reuse-object" Signed-off-by: Dana L. How <danahow@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-09 22:56:50 +02:00			`extern int core_compression_level;`
			`extern int core_compression_seen;`
Fully activate the sliding window pack access. This finally turns on the sliding window behavior for packfile data access by mapping limited size windows and chaining them under the packed_git->windows list. We consider a given byte offset to be within the window only if there would be at least 20 bytes (one hash worth of data) accessible after the requested offset. This range selection relates to the contract that use_pack() makes with its callers, allowing them to access one hash or one object header without needing to call use_pack() for every byte of data obtained. In the worst case scenario we will map the same page of data twice into memory: once at the end of one window and once again at the start of the next window. This duplicate page mapping will happen only when an object header or a delta base reference is spanned over the end of a window and is always limited to just one page of duplication, as no sane operating system will ever have a page size smaller than a hash. I am assuming that the possible wasted page of virtual address space is going to perform faster than the alternatives, which would be to copy the object header or ref delta into a temporary buffer prior to parsing, or to check the window range on every byte during header parsing. We may decide to revisit this decision in the future since this is just a gut instinct decision and has not actually been proven out by experimental testing. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-23 08:34:28 +01:00			`extern size_t packed_git_window_size;`
Introduce new config option for mmap limit. Rather than hardcoding the maximum number of bytes which can be mmapped from pack files we should make this value configurable, allowing the end user to increase or decrease this limit on a per-repository basis depending on the size of the repository and the capabilities of their operating system. In general users should not need to manually tune such a low-level setting within the core code, but being able to artifically limit the number of bytes which we can mmap at once from pack files will make it easier to craft test cases for the new mmap sliding window implementation. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-23 08:33:35 +01:00			`extern size_t packed_git_limit;`
Limit the size of the new delta_base_cache The new configuration variable core.deltaBaseCacheLimit allows the user to control how much memory they are willing to give to Git for caching base objects of deltas. This is not normally meant to be a user tweakable knob; the "out of the box" settings are meant to be suitable for almost all workloads. We default to 16 MiB under the assumption that the cache is not meant to consume all of the user's available memory, and that the cache's main purpose was to cache trees, for faster path limiters during revision traversal. Since trees tend to be relatively small objects, this relatively small limit should still allow a large number of objects. On the other hand we don't want the cache to start storing 200 different versions of a 200 MiB blob, as this could easily blow the entire address space of a 32 bit process. We evict OBJ_BLOB from the cache first (credit goes to Junio) as we want to favor OBJ_TREE within the cache. These are the objects that have the highest inflate() startup penalty, as they tend to be small and thus don't have that much of a chance to ammortize that penalty over the entire data. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-19 06:14:37 +01:00			`extern size_t delta_base_cache_limit;`
Lazy man's auto-CRLF It currently does NOT know about file attributes, so it does its conversion purely based on content. Maybe that is more in the "git philosophy" anyway, since content is king, but I think we should try to do the file attributes to turn it off on demand. Anyway, BY DEFAULT it is off regardless, because it requires a [core] AutoCRLF = true in your config file to be enabled. We could make that the default for Windows, of course, the same way we do some other things (filemode etc). But you can actually enable it on UNIX, and it will cause: - "git update-index" will write blobs without CRLF - "git diff" will diff working tree files without CRLF - "git checkout" will write files to the working tree _with_ CRLF and things work fine. Funnily, it actually shows an odd file in git itself: git clone -n git test-crlf cd test-crlf git config core.autocrlf true git checkout git diff shows a diff for "Documentation/docbook-xsl.css". Why? Because we have actually checked in that file with CRLF! So when "core.autocrlf" is true, we'll always generate a different hash for it in the index, because the index hash will be for the content _without_ CRLF. Is this complete? I dunno. It seems to work for me. It doesn't use the filename at all right now, and that's probably a deficiency (we could certainly make the "is_binary()" heuristics also take standard filename heuristics into account). I don't pass in the filename at all for the "index_fd()" case (git-update-index), so that would need to be passed around, but this actually works fine. NOTE NOTE NOTE! The "is_binary()" heuristics are totally made-up by yours truly. I will not guarantee that they work at all reasonable. Caveat emptor. But it _is_ simple, and it _is_ safe, since it's all off by default. The patch is pretty simple - the biggest part is the new "convert.c" file, but even that is really just basic stuff that anybody can write in "Teaching C 101" as a final project for their first class in programming. Not to say that it's bug-free, of course - but at least we're not talking about rocket surgery here. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-13 20:07:23 +01:00			`extern int auto_crlf;`
Add ".git/config" file parser This is a first cut at a very simple parser for a git config file. The format of the file is a simple ini-file like thing, with simple variable/value pairs. You can (and should) make the variables have a simple single-level scope, ie a valid file looks something like this: # # This is the config file, and # a '#' or ';' character indicates # a comment # ; core variables [core] ; Don't trust file modes filemode = false ; Our diff algorithm [diff] external = "/usr/local/bin/gnu-diff -u" renames = true which parses into three variables: "core.filemode" is associated with the string "false", and "diff.external" gets the appropriate quoted value. Right now we only react to one variable: "core.filemode" is a boolean that decides if we should care about the 0100 (user-execute) bit of the stat information. Even that is just a parsing demonstration - this doesn't actually implement that st_mode compare logic itself. Different programs can react to different config options, although they should always fall back to calling "git_default_config()" on any config option name that they don't recognize. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-11 01:31:08 +02:00
Repository format version check. This adds the repository format version code, first done by Martin Atukunda. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-26 00:59:09 +01:00			`#define GIT_REPO_VERSION 0`
			`extern int repository_format_version;`
			`extern int check_repository_format(void);`

Make the cache stat information comparator public. Like the cache filename finder, it's a generically useful function, rather than something specific to the current "show-diff" thing. 2005-04-09 18:48:20 +02:00			`#define MTIME_CHANGED 0x0001`
			`#define CTIME_CHANGED 0x0002`
			`#define OWNER_CHANGED 0x0004`
			`#define MODE_CHANGED 0x0008`
			`#define INODE_CHANGED 0x0010`
			`#define DATA_CHANGED 0x0020`
[PATCH] git and symlinks as tracked content Allow to store and track symlink in the repository. A symlink is stored the same way as a regular file, only with the appropriate mode bits set. The symlink target is therefore stored in a blob object. This will hopefully make our udev repository fully functional. :) Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-05 14:38:25 +02:00			`#define TYPE_CHANGED 0x0040`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
			`/* Return a statically allocated filename matching the sha1 signature */`
[PATCH] -Werror fixes GCC's format __attribute__ is good for checking errors, especially with -Wformat=2 parameter. This fixes most of the reported problems against 2005-08-09 snapshot. 2005-08-09 17:30:22 +02:00			`extern char mkpath(const char fmt, ...) __attribute__((format (printf, 1, 2)));`
			`extern char git_path(const char fmt, ...) __attribute__((format (printf, 1, 2)));`
Add "-R" flag to "diff-tree", so that it will recursively traverse a tree of trees as it diffs them. This makes diff-tree usable again in the new world order. 2005-04-10 23:03:58 +02:00			`extern char sha1_file_name(const unsigned char sha1);`
[PATCH] Functions for managing the set of packs the library is using (whitespace fixed) This adds support for reading an uninstalled index, and installing a pack file that was added while the program was running, as well as functions for determining where to put the file. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-01 02:53:44 +02:00			`extern char sha1_pack_name(const unsigned char sha1);`
			`extern char sha1_pack_index_name(const unsigned char sha1);`
show-branch: optionally use unique prefix as name. git-show-branch acquires two new options. --sha1-name to name commits using the unique prefix of their object names, and --no-name to not to show names at all. This was outlined in <7vk6gpyuyr.fsf@assigned-by-dhcp.cox.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-12 00:22:48 +02:00			`extern const char find_unique_abbrev(const unsigned char sha1, int);`
Consolidate null_sha1[]. Signed-off-by: Junio C Hamano <junio@twinsun.com> 2005-09-30 23:02:47 +02:00			`extern const unsigned char null_sha1[20];`
make inline is_null_sha1 global Replace sha1 comparisons to null_sha1 with a global inline (which previously an unused static inline in builtin-apply.c) [jc: with a fix from Jonas Fonseca.] Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-15 22:37:19 +02:00			`static inline int is_null_sha1(const unsigned char *sha1)`
			`{`
			`return !memcmp(sha1, null_sha1, 20);`
			`}`
Do not use memcmp(sha1_1, sha1_2, 20) with hardcoded length. Introduces global inline: hashcmp(const unsigned char sha1, const unsigned char sha2) Uses memcmp for comparison and returns the result based on the length of the hash name (a future runtime decision). Acked-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-17 20:54:57 +02:00			`static inline int hashcmp(const unsigned char sha1, const unsigned char sha2)`
			`{`
			`return memcmp(sha1, sha2, 20);`
			`}`
Convert memcpy(a,b,20) to hashcpy(a,b). This abstracts away the size of the hash values when copying them from memory location to memory location, much as the introduction of hashcmp abstracted away hash value comparsion. A few call sites were using char* rather than unsigned char* so I added the cast rather than open hashcpy to be void. This is a reasonable tradeoff as most call sites already use unsigned char and the existing hashcmp is also declared to be unsigned char*. [jc: Splitted the patch to "master" part, to be followed by a patch for merge-recursive.c which is not in "master" yet. Fixed the cast in the latter hunk to combine-diff.c which was wrong in the original. Also converted ones left-over in combine-diff.c, diff-lib.c and upload-pack.c ] Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-23 08:49:00 +02:00			`static inline void hashcpy(unsigned char sha_dst, const unsigned char sha_src)`
			`{`
			`memcpy(sha_dst, sha_src, 20);`
			`}`
Convert memset(hash,0,20) to hashclr(hash). In the same spirit as hashcmp() and hashcpy(). Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-23 22:57:23 +02:00			`static inline void hashclr(unsigned char *hash)`
			`{`
			`memset(hash, 0, 20);`
			`}`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
[PATCH] git: add git_mkstemp() Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-04 22:43:03 +02:00			`int git_mkstemp(char path, size_t n, const char template);`

shared repository: optionally allow reading to "others". This enhances core.sharedrepository to have additionally specify that read and exec permissions to be given to others as well. It is useful when serving a repository via gitweb and git-daemon that runs as a user outside the project group. The configuration item can take the following values: [core] sharedrepository ; the same as "group" sharedrepository = true ; ditto sharedrepository = 1 ; ditto sharedrepository = group ; allow rwx to group sharedrepository = all ; allow rwx to group, allow rx to other sharedrepository = umask ; not shared - use umask It also extends "git init-db" to take "--shared=all" and friends from the command line. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-10 08:09:49 +02:00			`enum sharedrepo {`
			`PERM_UMASK = 0,`
			`PERM_GROUP,`
			`PERM_EVERYBODY`
			`};`
			`int git_config_perm(const char var, const char value);`
Introduce core.sharedrepository If the config variable 'core.sharedrepository' is set, the directories $GIT_DIR/objects/ $GIT_DIR/objects/?? $GIT_DIR/objects/pack $GIT_DIR/refs $GIT_DIR/refs/heads $GIT_DIR/refs/heads/tags are set group writable (and g+s, since the git group may be not the primary group of all users). Since all files are written as lock files first, and then moved to their destination, they do not have to be group writable. Indeed, if this leads to problems you found a bug. Note that -- as in my first attempt -- the config variable is set in the function which checks the repository format. If this were done in git_default_config instead, a lot of programs would need to be modified to call git_config(git_default_config) first. [jc: git variables should be in environment.c unless there is a compelling reason to do otherwise.] Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-22 23:13:56 +01:00			`int adjust_shared_perm(const char *path);`
[PATCH] clone-pack.c:write_one_ref() - Create leading directories. The function write_one_ref() is passed the list of refs received from the other end, which was obtained by directory traversal under $GIT_DIR/refs; this can contain paths other than what git-init-db prepares and would fail to clone when there is such. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-06 10:11:52 +02:00			`int safe_create_leading_directories(char *path);`
Fix sparse warnings Make some functions static and convert func() function prototypes to to func(void). Fix declaration after statement, missing declaration and redundant declaration warnings. Signed-off-by: Timo Hirvonen <tihirvon@gmail.com> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-21 01:52:52 +01:00			`char enter_repo(char path, int strict);`
Add is_absolute_path() and make_absolute_path() This patch adds convenience functions to work with absolute paths. The function is_absolute_path() should help the efforts to integrate the MinGW fork. Note that make_absolute_path() returns a pointer to a static buffer. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-01 02:28:59 +02:00			`static inline int is_absolute_path(const char *path)`
			`{`
			`return path[0] == '/';`
			`}`
			`const char make_absolute_path(const char path);`
[PATCH] clone-pack.c:write_one_ref() - Create leading directories. The function write_one_ref() is passed the list of refs received from the other end, which was obtained by directory traversal under $GIT_DIR/refs; this can contain paths other than what git-init-db prepares and would fail to clone when there is such. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-06 10:11:52 +02:00
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`/* Read and unpack a sha1 file into memory, write memory to a sha1 file */`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`extern int sha1_object_info(const unsigned char , unsigned long );`
			`extern void * read_sha1_file(const unsigned char sha1, enum object_type type, unsigned long *size);`
index-pack: use hash_sha1_file() Use hash_sha1_file() instead of duplicating code to compute object SHA1. While at it make it accept a const pointer. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-20 21:02:09 +01:00			`extern int hash_sha1_file(const void buf, unsigned long len, const char type, unsigned char *sha1);`
[PATCH] Kill a bunch of pointer sign warnings for gcc4 - Raw hashes should be unsigned char. - String functions want signed char. - Hash and compress functions want unsigned char. Signed-off By: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-18 14:14:09 +02:00			`extern int write_sha1_file(void buf, unsigned long len, const char type, unsigned char *return_sha1);`
convert object type handling from a string to a number We currently have two parallel notation for dealing with object types in the code: a string and a numerical value. One of them is obviously redundent, and the most used one requires more stack space and a bunch of strcmp() all over the place. This is an initial step for the removal of the version using a char array found in object reading code paths. The patch is unfortunately large but there is no sane way to split it in smaller parts without breaking the system. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-26 20:55:59 +01:00			`extern int pretend_sha1_file(void , unsigned long, enum object_type, unsigned char );`
[PATCH] Additional functions for the objects database This adds two functions: one to check if an object is present in the local database, and one to add an object to the local database by reading it from a file descriptor and checking its hash. Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-24 03:47:23 +02:00
[PATCH] Anal retentive 'const unsigned char *sha1' Make 'sha1' parameters const where possible Signed-off-by: Jason McMullan <jason.mcmullan@timesys.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-03 17:05:39 +02:00			`extern int check_sha1_signature(const unsigned char sha1, void buf, unsigned long size, const char *type);`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
[PATCH] Parallelize pulling by ssh This causes ssh-pull to request objects in prefetch() and read then in fetch(), such that it reduces the unpipelined round-trip time. This also makes sha1_write_from_fd() support having a buffer of data which it accidentally read from the fd after the object; this was formerly not a problem, because it would always get a short read at the end of an object, because the next object had not been requested. This is no longer true. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-03 01:46:29 +02:00			`extern int write_sha1_from_fd(const unsigned char sha1, int fd, char buffer,`
			`size_t bufsize, size_t *bufposn);`
[PATCH] write_sha1_to_fd() Add write_sha1_to_fd(), which writes an object to a file descriptor. This includes support for unpacking it and recompressing it. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-11 00:25:38 +02:00			`extern int write_sha1_to_fd(int fd, const unsigned char *sha1);`
Constness tightening for move/link_temp_to_file() Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-01 09:17:47 +02:00			`extern int move_temp_to_file(const char tmpfile, const char filename);`
[PATCH] Additional functions for the objects database This adds two functions: one to check if an object is present in the local database, and one to add an object to the local database by reading it from a file descriptor and checking its hash. Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-24 03:47:23 +02:00
pack-objects --unpacked=<existing pack> option. Incremental repack without -a essentially boils down to: rev-list --objects --unpacked --all \| pack-objects $new_pack which picks up all loose objects that are still live and creates a new pack. This implements --unpacked=<existing pack> option to tell the revision walking machinery to pretend as if objects in such a pack are unpacked for the purpose of object listing. With this, we could say: rev-list --objects --unpacked=$active_pack --all \| pack-objects $new_pack instead, to mean "all live loose objects but pretend as if objects that are in this pack are also unpacked". The newly created pack would be perfect for updating $active_pack by replacing it. Since pack-objects now knows how to do the rev-list's work itself internally, you can also write the above example by: pack-objects --unpacked=$active_pack --all $new_pack </dev/null Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-06 11:12:09 +02:00			`extern int has_sha1_pack(const unsigned char sha1, const char *ignore);`
[PATCH] Additional functions for the objects database This adds two functions: one to check if an object is present in the local database, and one to add an object to the local database by reading it from a file descriptor and checking its hash. Signed-Off-By: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-24 03:47:23 +02:00			`extern int has_sha1_file(const unsigned char *sha1);`

[PATCH] Functions for managing the set of packs the library is using (whitespace fixed) This adds support for reading an uninstalled index, and installing a pack file that was added while the program was running, as well as functions for determining where to put the file. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-01 02:53:44 +02:00			`extern int has_pack_file(const unsigned char *sha1);`
			`extern int has_pack_index(const unsigned char *sha1);`

fix signed range problems with hex conversions Make hexval_table[] "const". Also make sure that the accessor function hexval() does not access the table with out-of-range values by declaring its parameter "unsigned char", instead of "unsigned int". With this, gcc can just generate: movzbl (%rdi), %eax movsbl hexval_table(%rax),%edx movzbl 1(%rdi), %eax movsbl hexval_table(%rax),%eax sall $4, %edx orl %eax, %edx for the code to generate a byte from two hex characters. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-30 19:32:19 +02:00			`extern const signed char hexval_table[256];`
			`static inline unsigned int hexval(unsigned char c)`
Make hexval() available to others. builtin-mailinfo.c has its own hexval implementaiton but it can share the table-lookup one recently implemented in sha1_file.c Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-21 01:04:46 +02:00			`{`
			`return hexval_table[c];`
			`}`

Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`/* Convert to/from hex/sha1 representation */`
abbrev cleanup: use symbolic constants The minimum length of abbreviated object name was hardcoded in different places to be 4, risking inconsistencies in the future. Also there were three different "default abbreviation precision". Use two C preprocessor symbols to clean up this mess. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-25 10:03:18 +01:00			`#define MINIMUM_ABBREV 4`
			`#define DEFAULT_ABBREV 7`

Add "get_sha1()" helper function. This allows the programs to use various simplified versions of the SHA1 names, eg just say "HEAD" for the SHA1 pointed to by the .git/HEAD file etc. For example, this commit has been done with git-commit-tree $(git-write-tree) -p HEAD instead of the traditional "$(cat .git/HEAD)" syntax. 2005-05-02 01:36:56 +02:00			`extern int get_sha1(const char str, unsigned char sha1);`
add get_sha1_with_mode get_sha1_with_mode basically behaves as get_sha1. It has an additional parameter for storing the mode of the object. If the mode can not be determined, it stores S_IFINVALID. Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-23 22:55:05 +02:00			`extern int get_sha1_with_mode(const char str, unsigned char sha1, unsigned *mode);`
Make "write_cache()" and friends available as generic routines. This is needed for the change to make "read-tree" just read into the cache (and then you do a "checkout-cache" to update your current dir contents). 2005-04-09 21:09:27 +02:00			`extern int get_sha1_hex(const char hex, unsigned char sha1);`
			`extern char sha1_to_hex(const unsigned char sha1); /* static buffer result! */`
[PATCH] Allow reading "symbolic refs" that point to other refs This extends the ref reading to understand a "symbolic ref": a ref file that starts with "ref: " and points to another ref file, and thus introduces the notion of ref aliases. This is in preparation of allowing HEAD to eventually not be a symlink, but one of these symbolic refs instead. [jc: Linus originally required the prefix to be "ref: " five bytes and nothing else, but I changed it to allow and strip any number of leading whitespaces to match what update-ref.c does.] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-25 18:59:37 +02:00			`extern int read_ref(const char filename, unsigned char sha1);`
Tell between packed, unpacked and symbolic refs. This adds a "int *flag" parameter to resolve_ref() and makes for_each_ref() family to call callback function with an extra "int flag" parameter. They are used to give two bits of information (REF_ISSYMREF and REF_ISPACKED) about the ref. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-21 07:02:01 +02:00			`extern const char resolve_ref(const char path, unsigned char sha1, int, int );`
dwim_ref(): Separate name-to-ref DWIM code out. I'll be using this in another function to figure out what to pass to resolve_ref(). Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-19 10:15:15 +01:00			`extern int dwim_ref(const char str, int len, unsigned char sha1, char **ref);`
log --reflog: use dwim_log Since "git log origin/master" uses dwim_log() to match "refs/remotes/origin/master", it makes sense to do that for "git log --reflog", too. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-09 01:28:23 +01:00			`extern int dwim_log(const char str, int len, unsigned char sha1, char **ref);`
dwim_ref(): Separate name-to-ref DWIM code out. I'll be using this in another function to figure out what to pass to resolve_ref(). Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-19 10:15:15 +01:00
add refname_match() We use at least two rulesets for matching abbreviated refnames with full refnames (starting with 'refs/'). git-rev-parse and git-fetch use slightly different rules. This commit introduces a new function refname_match (const char abbrev_name, const char full_name, const char *rules). abbrev_name is expanded using the rules and matched against full_name. If a match is found the function returns true. rules is a NULL-terminate list of format patterns with "%.s", for example: const char ref_rev_parse_rules[] = { "%.s", "refs/%.s", "refs/tags/%.s", "refs/heads/%.s", "refs/remotes/%.s", "refs/remotes/%.s/HEAD", NULL }; Asterisks are included in the format strings because this is the form required in sha1_name.c. Sharing the list with the functions there is a good idea to avoid duplicating the rules. Hopefully this facilitates unified matching rules in the future. This commit makes the rules used by rev-parse for resolving refs to sha1s available for string comparison. Before this change, the rules were buried in get_sha1() and dwim_ref(). A follow-up commit will refactor the rules used by fetch. refname_match() will be used for matching refspecs in git-send-pack. Thanks to Daniel Barkalow <barkalow@iabervon.org> for pointing out that ref_matches_abbrev in remote.c solves a similar problem and care should be taken to avoid confusion. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-11 15:01:46 +01:00			`extern int refname_match(const char abbrev_name, const char full_name, const char **rules);`
			`extern const char *ref_rev_parse_rules[];`
refactor fetch's ref matching to use refname_match() The old rules used by fetch were coded as a series of ifs. The old rules are: 1) match full refname if it starts with "refs/" or matches "HEAD" 2) verify that full refname starts with "refs/" 3) match abbreviated name in "refs/" if it starts with "heads/", "tags/", or "remotes/". 4) match abbreviated name in "refs/heads/" This is replaced by the new rules a) match full refname b) match abbreviated name prefixed with "refs/" c) match abbreviated name prefixed with "refs/heads/" The details of the new rules are different from the old rules. We no longer verify that the full refname starts with "refs/". The new rule (a) matches any full string. The old rules (1) and (2) were stricter. Now, the caller is responsible for using sensible full refnames. This should be the case for the current code. The new rule (b) is less strict than old rule (3). The new rule accepts abbreviated names that start with a non-standard prefix below "refs/". Despite this modifications the new rules should handle all cases as expected. Two tests are added to verify that fetch does not resolve short tags or HEAD in remotes. We may even think about loosening the rules a bit more and unify them with the rev-parse rules. This would be done by replacing ref_ref_fetch_rules with ref_ref_parse_rules. Note, the two new test would break. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-11 15:01:48 +01:00			`extern const char *ref_fetch_rules[];`
add refname_match() We use at least two rulesets for matching abbreviated refnames with full refnames (starting with 'refs/'). git-rev-parse and git-fetch use slightly different rules. This commit introduces a new function refname_match (const char abbrev_name, const char full_name, const char *rules). abbrev_name is expanded using the rules and matched against full_name. If a match is found the function returns true. rules is a NULL-terminate list of format patterns with "%.s", for example: const char ref_rev_parse_rules[] = { "%.s", "refs/%.s", "refs/tags/%.s", "refs/heads/%.s", "refs/remotes/%.s", "refs/remotes/%.s/HEAD", NULL }; Asterisks are included in the format strings because this is the form required in sha1_name.c. Sharing the list with the functions there is a good idea to avoid duplicating the rules. Hopefully this facilitates unified matching rules in the future. This commit makes the rules used by rev-parse for resolving refs to sha1s available for string comparison. Before this change, the rules were buried in get_sha1() and dwim_ref(). A follow-up commit will refactor the rules used by fetch. refname_match() will be used for matching refspecs in git-send-pack. Thanks to Daniel Barkalow <barkalow@iabervon.org> for pointing out that ref_matches_abbrev in remote.c solves a similar problem and care should be taken to avoid confusion. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-11 15:01:46 +01:00
add logref support to git-symbolic-ref Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-26 23:26:10 +01:00			`extern int create_symref(const char ref, const char refs_heads_master, const char *logmsg);`
Detached HEAD (experimental) This allows "git checkout v1.4.3" to dissociate the HEAD of repository from any branch. After this point, "git branch" starts reporting that you are not on any branch. You can go back to an existing branch by saying "git checkout master", for example. This is still experimental. While I think it makes sense to allow commits on top of detached HEAD, it is rather dangerous unless you are careful in the current form. Next "git checkout master" will obviously lose what you have done, so we might want to require "git checkout -f" out of a detached HEAD if we find that the HEAD commit is not an ancestor of any other branches. There is no such safety valve implemented right now. On the other hand, the reason the user did not start the ad-hoc work on a new branch with "git checkout -b" was probably because the work was of a throw-away nature, so the convenience of not having that safety valve might be even better. The user, after accumulating some commits on top of a detached HEAD, can always create a new branch with "git checkout -b" not to lose useful work done while the HEAD was detached. We'll see. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-02 08:31:08 +01:00			`extern int validate_headref(const char *ref);`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
Introduce "base_name_compare()" helper function This one compares two pathnames that may be partial basenames, not full paths. We need to get the path sorting right, since a directory name will sort as if it had the final '/' at the end. 2005-05-20 18:09:18 +02:00			`extern int base_name_compare(const char name1, int len1, int mode1, const char name2, int len2, int mode2);`
Export "cache_name_compare()" helper function. The "diff-tree" program needs it. 2005-04-09 21:59:11 +02:00			`extern int cache_name_compare(const char name1, int len1, const char name2, int len2);`
Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00
[PATCH] Rename and extend read_tree_with_tree_or_commit_sha1 This patch renames read_tree_with_tree_or_commit_sha1() to read_object_with_reference() and extends it to automatically dereference not just "commit" objects but "tag" objects. With this patch, you can say e.g.: ls-tree $tag read-tree -m $(merge-base $tag $HEAD) $tag $HEAD diff-cache $tag diff-tree $tag $HEAD Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-29 01:42:27 +02:00			`extern void read_object_with_reference(const unsigned char sha1,`
[PATCH] Kill a bunch of pointer sign warnings for gcc4 - Raw hashes should be unsigned char. - String functions want signed char. - Hash and compress functions want unsigned char. Signed-off By: Brian Gerst <bgerst@didntduck.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-05-18 14:14:09 +02:00			`const char *required_type,`
[PATCH] Rename and extend read_tree_with_tree_or_commit_sha1 This patch renames read_tree_with_tree_or_commit_sha1() to read_object_with_reference() and extends it to automatically dereference not just "commit" objects but "tag" objects. With this patch, you can say e.g.: ls-tree $tag read-tree -m $(merge-base $tag $HEAD) $tag $HEAD diff-cache $tag diff-tree $tag $HEAD Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-29 01:42:27 +02:00			`unsigned long *size,`
			`unsigned char *sha1_ret);`
[PATCH] Accept commit in some places when tree is needed. This patch implements read_tree_with_tree_or_commit_sha1(), which can be used when you are interested in reading an unpacked raw tree data but you do not know nor care if the SHA1 you obtained your user is a tree ID or a commit ID. Before this function's introduction, you would have called read_sha1_file(), examined its type, parsed it to call read_sha1_file() again if it is a commit, and verified that the resulting object is a tree. Instead, this function does that for you. It returns NULL if the given SHA1 is not either a tree or a commit. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-04-21 03:06:49 +02:00
Make show_rfc2822_date() just another date output format. These days, show_date() takes a date_mode parameter to specify the output format, and a separate specialized function for dates in E-mails does not make much sense anymore. This retires show_rfc2822_date() function and make it just another date output format. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-14 08:14:52 +02:00			`enum date_mode {`
			`DATE_NORMAL = 0,`
			`DATE_RELATIVE,`
			`DATE_SHORT,`
			`DATE_LOCAL,`
			`DATE_ISO8601,`
			`DATE_RFC2822`
			`};`

show_date(): rename the "relative" parameter to "mode" Now, show_date() can print three different kinds of dates: normal, relative and short (%Y-%m-%s) dates. To achieve this, the "int relative" was changed to "enum date_mode mode", which has three states: DATE_NORMAL, DATE_RELATIVE and DATE_SHORT. Since existing users of show_date() only call it with relative_date being either 0 or 1, and DATE_NORMAL and DATE_RELATIVE having these values, no behaviour is changed. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-27 16:21:04 +01:00			`const char *show_date(unsigned long time, int timezone, enum date_mode mode);`
[PATCH] Return proper error valud from "parse_date()" Right now we don't return any error value at all from parse_date(), and if we can't parse it, we just silently leave the result buffer unchanged. That's fine for the current user, which will always default to the current date, but it's a crappy interface, and we might well be better off with an error message rather than just the default date. So let's change the thing to return a negative value if an error occurs, and the length of the result otherwise (snprintf behaviour: if the buffer is too small, it returns how big it _would_ have been). [ I started looking at this in case we could support date-based revision names. Looks ugly. Would have to parse relative dates.. ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-20 00:53:50 +02:00			`int parse_date(const char date, char buf, int bufsize);`
[PATCH] Do date parsing by hand... ...since everything out there is either strange (libc mktime has issues with timezones) or introduces unnecessary dependencies for people (libcurl). This goes back to the old date parsing, but moves it out into a file of its own, and does the "struct tm" to "seconds since epoch" handling by hand. I grepped through the tz-database and it seems there's one "country" left that has non-60-minute DST: Lord Howe Island. All others dropped that before 1970. 2005-04-30 18:46:49 +02:00			`void datestamp(char *buf, int bufsize);`
git's rev-parse.c function show_datestring presumes gnu date Ok. This is the insane patch to do this. It really isn't very careful, and the reason I call it "approxidate()" will become obvious when you look at the code. It is very liberal in what it accepts, to the point where sometimes the results may not make a whole lot of sense. It accepts "last week" as a date string, by virtue of "last" parsing as the number 1, and it totally ignoring superfluous fluff like "ago", so "last week" ends up being exactly the same thing as "1 week ago". Fine so far. It has strange side effects: "last december" will actually parse as "Dec 1", which actually _does_ turn out right, because it will then notice that it's not December yet, so it will decide that you must be talking about a date last year. So it actually gets it right, but it's kind of for the "wrong" reasons. It also accepts the numbers 1..10 in string format ("one" .. "ten"), so you can do "ten weeks ago" or "ten hours ago" and it will do the right thing. But it will do some really strange thigns too: the string "this will last forever", will not recognize anyting but "last", which is recognized as "1", which since it doesn't understand anything else it will think is the day of the month. So if you do gitk --since="this will last forever" the date will actually parse as the first day of the current month. And it will parse the string "now" as "now", but only because it doesn't understand it at all, and it makes everything relative to "now". Similarly, it doesn't actually parse the "ago" or "from now", so "2 weeks ago" is exactly the same as "2 weeks from now". It's the current date minus 14 days. But hey, it's probably better (and certainly faster) than depending on GNU date. So now you can portably do things like gitk --since="two weeks and three days ago" git log --since="July 5" git-whatchanged --since="10 hours ago" git log --since="last october" and it will actually do exactly what you thought it would do (I think). It will count 17 days backwards, and it will do so even if you don't have GNU date installed. (I don't do "last monday" or similar yet, but I can extend it to that too if people want). It was kind of fun trying to write code that uses such totally relaxed "understanding" of dates yet tries to get it right for the trivial cases. The result should be mixed with a few strange preprocessor tricks, and be submitted for the IOCCC ;) Feel free to try it out, and see how many strange dates it gets right. Or wrong. And if you find some interesting (and valid - not "interesting" as in "strange", but "interesting" as in "I'd be interested in actually doing this) thing it gets wrong - usually by not understanding it and silently just doing some strange things - please holler. Now, as usual this certainly hasn't been getting a lot of testing. But my code always works, no? Linus Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-15 04:29:06 +01:00			`unsigned long approxidate(const char *);`
parse_date_format(): convert a format name to an enum date_mode Factor out the code to parse --date=<format> parameter to revision walkers into a separate function, parse_date_format(). This function is passed a string and converts it to an enum date_format: - "relative" => DATE_RELATIVE - "iso8601" or "iso" => DATE_ISO8601 - "rfc2822" => DATE_RFC2822 - "short" => DATE_SHORT - "local" => DATE_LOCAL - "default" => DATE_NORMAL In the event that none of these strings is found, the function die()s. Signed-off-by: Andy Parkins <andyparkins@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-28 16:17:31 +02:00			`enum date_mode parse_date_format(const char *format);`
[PATCH] Do date parsing by hand... ...since everything out there is either strange (libc mktime has issues with timezones) or introduces unnecessary dependencies for people (libcurl). This goes back to the old date parsing, but moves it out into a file of its own, and does the "struct tm" to "seconds since epoch" handling by hand. I grepped through the tz-database and it seems there's one "country" left that has non-60-minute DST: Lord Howe Island. All others dropped that before 1970. 2005-04-30 18:46:49 +02:00
Re-fix "builtin-commit: fix --signoff" An earlier fix to the said commit was incomplete; it mixed up the meaning of the flag parameter passed to the internal fmt_ident() function, so this corrects it. git_author_info() and git_committer_info() can be told to issue a warning when no usable user information is found, and optionally can be told to error out. Operations that actually use the information to record a new commit or a tag will still error out, but the caller to leave reflog record will just silently use bogus user information. Not warning on misconfigured user information while writing a reflog entry is somewhat debatable, but it is probably nicer to the users to silently let it pass, because the only information you are losing is who checked out the branch. * git_author_info() and git_committer_info() used to take 1 (positive int) to error out with a warning on misconfiguration; this is now signalled with a symbolic constant IDENT_ERROR_ON_NO_NAME. * These functions used to take -1 (negative int) to warn but continue; this is now signalled with a symbolic constant IDENT_WARN_ON_NO_NAME. * fmt_ident() function implements the above error reporting behaviour common to git_author_info() and git_committer_info(). A symbolic constant IDENT_NO_DATE can be or'ed in to the flag parameter to make it return only the "Name <email@address.xz>". * fmt_name() is a thin wrapper around fmt_ident() that always passes IDENT_ERROR_ON_NO_NAME and IDENT_NO_DATE. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-09 02:32:08 +01:00			`#define IDENT_WARN_ON_NO_NAME 1`
			`#define IDENT_ERROR_ON_NO_NAME 2`
			`#define IDENT_NO_DATE 4`
Delay "empty ident" errors until they really matter. Previous one warned people upfront to encourage fixing their environment early, but some people just use repositories and git tools read-only without making any changes, and in such a case there is not much point insisting on them having a usable ident. This round attempts to move the error until either "git-var" asks for the ident explicitly or "commit-tree" wants to use it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-19 05:31:05 +01:00			`extern const char *git_author_info(int);`
			`extern const char *git_committer_info(int);`
Rename get_ident() to fmt_ident() and make it available to outside This makes the functionality of ident.c::get_ident() available to other callers. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-05 02:50:14 +01:00			`extern const char fmt_ident(const char name, const char email, const char date_str, int);`
Fix --signoff in builtin-commit differently. Introduce fmt_name() specifically meant for formatting the name and email pair, to add signed-off-by value. This reverts parts of 13208572fbe8838fd8835548d7502202d1f7b21d (builtin-commit: fix --signoff) so that an empty datestamp string given to fmt_ident() by mistake will error out as before. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-02 22:43:34 +01:00			`extern const char fmt_name(const char name, const char *email);`
Abstract out the "name <email> date" handling of commit-tree.c We'll want to use it for the tagging too. 2005-07-12 20:49:27 +02:00
Make fiel checkout function available to the git library The merge stuff will want it soon, and we don't want to duplicate all the work.. 2005-06-06 06:59:54 +02:00			`struct checkout {`
			`const char *base_dir;`
			`int base_dir_len;`
			`unsigned force:1,`
			`quiet:1,`
			`not_new:1,`
			`refresh_cache:1;`
			`};`

entry.c: Use const qualifier for 'struct checkout' parameters Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-25 16:18:08 +02:00			`extern int checkout_entry(struct cache_entry ce, const struct checkout state, char *topath);`
Add has_symlink_leading_path() function. When we are applying a patch that creates a blob at a path, or when we are switching from a branch that does not have a blob at the path to another branch that has one, we need to make sure that there is nothing at the path in the working tree, as such a file is a local modification made by the user that would be lost by the operation. Normally, lstat() on the path and making sure ENOENT is returned is good enough for that purpose. However there is a twist. We may be creating a regular file arch/x86_64/boot/Makefile, while removing an existing symbolic link at arch/x86_64/boot that points at existing ../i386/boot directory that has Makefile in it. We always first check without touching filesystem and then perform the actual operation, so when we verify the new file, arch/x86_64/boot/Makefile, does not exist, we haven't removed the symbolic link arc/x86_64/boot symbolic link yet. lstat() on the file sees through the symbolic link and reports the file is there, which is not what we want. The function has_symlink_leading_path() function takes a path, and sees if any of the leading directory component is a symbolic link. When files in a new directory are created, we tend to process them together because both index and tree are sorted. The function takes advantage of this and allows the caller to cache and reuse which symbolic link on the filesystem caused the function to return true. The calling sequence would be: char last_symlink[PATH_MAX]; last_symlink = '\0'; for each index entry { if (!lose) continue; if (lstat(it)) if (errno == ENOENT) ; / happy / else error; else if (has_symlink_leading_path(it, last_symlink)) ; / happy / else error; / would lose local changes */ unlink_entry(it, last_symlink); } Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-12 07:11:07 +02:00			`extern int has_symlink_leading_path(const char name, char last_symlink);`
Make fiel checkout function available to the git library The merge stuff will want it soon, and we don't want to duplicate all the work.. 2005-06-06 06:59:54 +02:00
[PATCH] Expose packed_git and alt_odb. The commands git-fsck-cache and probably git-*-pull needs to have a way to enumerate objects contained in packed GIT archives and alternate object pools. This commit exposes the data structure used to keep track of them from sha1_file.c, and adds a couple of accessor interface functions for use by the enhanced git-fsck-cache command. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-28 23:56:57 +02:00			`extern struct alternate_object_database {`
Alternate object pool mechanism updates. It was a mistake to use GIT_ALTERNATE_OBJECT_DIRECTORIES environment variable to specify what alternate object pools to look for missing objects when working with an object database. It is not a property of the process running the git commands, but a property of the object database that is partial and needs other object pools to complete the set of objects it lacks. This patch allows you to have $GIT_OBJECT_DIRECTORY/info/alternates whose contents is in exactly the same format as the environment variable, to let an object database name alternate object pools it depends on. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-15 02:25:57 +02:00			`struct alternate_object_database *next;`
[PATCH] Expose packed_git and alt_odb. The commands git-fsck-cache and probably git-*-pull needs to have a way to enumerate objects contained in packed GIT archives and alternate object pools. This commit exposes the data structure used to keep track of them from sha1_file.c, and adds a couple of accessor interface functions for use by the enhanced git-fsck-cache command. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-28 23:56:57 +02:00			`char *name;`
[PATCH] Compilation: zero-length array declaration. ISO C99 (and GCC 3.x or later) lets you write a flexible array at the end of a structure, like this: struct frotz { int xyzzy; char nitfol[]; /* more / }; GCC 2.95 and 2.96 let you to do this with "char nitfol[0]"; unfortunately this is not allowed by ISO C90. This declares such construct like this: struct frotz { int xyzzy; char nitfol[FLEX_ARRAY]; / more */ }; and git-compat-util.h defines FLEX_ARRAY to 0 for gcc 2.95 and empty for others. If you are using a C90 C compiler, you should be able to override this with CFLAGS=-DFLEX_ARRAY=1 from the command line of "make". Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-07 10:33:54 +01:00			`char base[FLEX_ARRAY]; /* more */`
Alternate object pool mechanism updates. It was a mistake to use GIT_ALTERNATE_OBJECT_DIRECTORIES environment variable to specify what alternate object pools to look for missing objects when working with an object database. It is not a property of the process running the git commands, but a property of the object database that is partial and needs other object pools to complete the set of objects it lacks. This patch allows you to have $GIT_OBJECT_DIRECTORY/info/alternates whose contents is in exactly the same format as the environment variable, to let an object database name alternate object pools it depends on. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-15 02:25:57 +02:00			`} *alt_odb_list;`
[PATCH] Expose packed_git and alt_odb. The commands git-fsck-cache and probably git-*-pull needs to have a way to enumerate objects contained in packed GIT archives and alternate object pools. This commit exposes the data structure used to keep track of them from sha1_file.c, and adds a couple of accessor interface functions for use by the enhanced git-fsck-cache command. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-28 23:56:57 +02:00			`extern void prepare_alt_odb(void);`

Refactor packed_git to prepare for sliding mmap windows. The idea behind the sliding mmap window pack reader implementation is to have multiple mmap regions active against the same pack file, thereby allowing the process to mmap in only the active/hot sections of the pack and reduce overall virtual address space usage. To implement this we need to refactor the mmap related data (pack_base, pack_use_cnt) out of struct packed_git and move them into a new struct pack_window. We are refactoring the code to support a single struct pack_window per packfile, thereby emulating the prior behavior of mmap'ing the entire pack file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-23 08:33:44 +01:00			`struct pack_window {`
			`struct pack_window *next;`
			`unsigned char *base;`
			`off_t offset;`
			`size_t len;`
			`unsigned int last_used;`
			`unsigned int inuse_cnt;`
			`};`

[PATCH] Expose packed_git and alt_odb. The commands git-fsck-cache and probably git-*-pull needs to have a way to enumerate objects contained in packed GIT archives and alternate object pools. This commit exposes the data structure used to keep track of them from sha1_file.c, and adds a couple of accessor interface functions for use by the enhanced git-fsck-cache command. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-28 23:56:57 +02:00			`extern struct packed_git {`
			`struct packed_git *next;`
Refactor packed_git to prepare for sliding mmap windows. The idea behind the sliding mmap window pack reader implementation is to have multiple mmap regions active against the same pack file, thereby allowing the process to mmap in only the active/hot sections of the pack and reduce overall virtual address space usage. To implement this we need to refactor the mmap related data (pack_base, pack_use_cnt) out of struct packed_git and move them into a new struct pack_window. We are refactoring the code to support a single struct pack_window per packfile, thereby emulating the prior behavior of mmap'ing the entire pack file. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-23 08:33:44 +01:00			`struct pack_window *windows;`
Use off_t for index and pack file lengths. Since the index_size and pack_size members of struct packed_git are the lengths of those corresponding files we should use the off_t size of the operating system to store these file lengths, rather than an unsigned long. This would help in the future should we ever resurrect Junio's 64 bit index implementation. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-23 08:33:47 +01:00			`off_t pack_size;`
get rid of num_packed_objects() The coming index format change doesn't allow for the number of objects to be determined from the size of the index file directly. Instead, Let's initialize a field in the packed_git structure with the object count when the index is validated since the count is always known at that point. While at it let's reorder some struct packed_git fields to avoid padding due to needed 64-bit alignment for some of them. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-09 07:06:28 +02:00			`const void *index_data;`
			`size_t index_size;`
			`uint32_t num_objects;`
[PATCH] clean up pack index handling a bit Especially with the new index format to come, it is more appropriate to encapsulate more into check_packed_git_idx() and assume less of the index format in struct packed_git. To that effect, the index_base is renamed to index_data with void * type so it is not used directly but other pointers initialized with it. This allows for a couple pointer cast removal, as well as providing a better generic name to grep for when adding support for new index versions or formats. And index_data is declared const too while at it. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-16 21:42:50 +01:00			`int index_version;`
get rid of num_packed_objects() The coming index format change doesn't allow for the number of objects to be determined from the size of the index file directly. Instead, Let's initialize a field in the packed_git structure with the object count when the index is validated since the count is always known at that point. While at it let's reorder some struct packed_git fields to avoid padding due to needed 64-bit alignment for some of them. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-09 07:06:28 +02:00			`time_t mtime;`
Refactor how we open pack files to prepare for multiple windows. To efficiently support mmaping of multiple regions of the same pack file we want to keep the pack's file descriptor open while we are actively working with that pack. So we are now keeping that file descriptor in packed_git.pack_fd and closing it only after we unmap the last window. This is going to increase the number of file descriptors that are in use at once, however that will be bounded by the total number of pack files present and therefore should not be very high. It is a small tradeoff which we may need to revisit after some testing can be done on various repositories and systems. For code clarity we also want to seperate out the implementation of how we open a pack file from the implementation which locates a suitable window (or makes a new one) from the given pack file. Since this is a rather large delta I'm taking advantage of doing it now, in a fairly isolated change. When we open a pack file we need to examine the header and trailer without having a mmap in place, as we may only need to mmap the middle section of this particular pack. Consequently the verification code has been refactored to make use of the new read_or_die function. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-23 08:34:01 +01:00			`int pack_fd;`
Keep track of whether a pack is local or not If we want to re-pack just local packfiles, we need to know whether a particular object is local or not. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-14 00:38:28 +02:00			`int pack_local;`
[PATCH] Functions for managing the set of packs the library is using (whitespace fixed) This adds support for reading an uninstalled index, and installing a pack file that was added while the program was running, as well as functions for determining where to put the file. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-01 02:53:44 +02:00			`unsigned char sha1[20];`
[PATCH] Compilation: zero-length array declaration. ISO C99 (and GCC 3.x or later) lets you write a flexible array at the end of a structure, like this: struct frotz { int xyzzy; char nitfol[]; /* more / }; GCC 2.95 and 2.96 let you to do this with "char nitfol[0]"; unfortunately this is not allowed by ISO C90. This declares such construct like this: struct frotz { int xyzzy; char nitfol[FLEX_ARRAY]; / more */ }; and git-compat-util.h defines FLEX_ARRAY to 0 for gcc 2.95 and empty for others. If you are using a C90 C compiler, you should be able to override this with CFLAGS=-DFLEX_ARRAY=1 from the command line of "make". Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-07 10:33:54 +01:00			`/* something like ".git/objects/pack/xxxxx.pack" */`
			`char pack_name[FLEX_ARRAY]; /* more */`
[PATCH] Expose packed_git and alt_odb. The commands git-fsck-cache and probably git-*-pull needs to have a way to enumerate objects contained in packed GIT archives and alternate object pools. This commit exposes the data structure used to keep track of them from sha1_file.c, and adds a couple of accessor interface functions for use by the enhanced git-fsck-cache command. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-28 23:56:57 +02:00			`} *packed_git;`
[PATCH] verify-pack updates. Nico pointed out that having verify_pack.c and verify-pack.c was confusing. Rename verify_pack.c to pack-check.c as suggested, and enhances the verification done quite a bit. - Built-in sha1_file unpacking knows that a base object of a deltified object _must_ be in the same pack, and takes advantage of that fact. - Earlier verify-pack command only checked the SHA1 sum for the entire pack file and did not look into its contents. It now checks everything idx file claims to have unpacks correctly. - It now has a hook to give more detailed information for objects contained in the pack under -v flag. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-01 02:15:39 +02:00
			`struct pack_entry {`
Use off_t when we really mean a file offset. Not all platforms have declared 'unsigned long' to be a 64 bit value, but we want to support a 64 bit packfile (or close enough anyway) in the near future as some projects are getting large enough that their packed size exceeds 4 GiB. By using off_t, the POSIX type that is declared to mean an offset within a file, we support whatever maximum file size the underlying operating system will handle. For most modern systems this is up around 2^60 or higher. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:30 +01:00			`off_t offset;`
[PATCH] verify-pack updates. Nico pointed out that having verify_pack.c and verify-pack.c was confusing. Rename verify_pack.c to pack-check.c as suggested, and enhances the verification done quite a bit. - Built-in sha1_file unpacking knows that a base object of a deltified object _must_ be in the same pack, and takes advantage of that fact. - Earlier verify-pack command only checked the SHA1 sum for the entire pack file and did not look into its contents. It now checks everything idx file claims to have unpacks correctly. - It now has a hook to give more detailed information for objects contained in the pack under -v flag. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-01 02:15:39 +02:00			`unsigned char sha1[20];`
			`struct packed_git *p;`
			`};`

Merge three separate "fetch refs" functions It really just boils down to one "get_remote_heads()" function, and a common "struct ref" structure definition. 2005-07-16 22:55:50 +02:00			`struct ref {`
			`struct ref *next;`
			`unsigned char old_sha1[20];`
			`unsigned char new_sha1[20];`
Fix warning about bitfield in struct ref cache.h:503: warning: type of bit-field 'force' is a GCC extension cache.h:504: warning: type of bit-field 'merge' is a GCC extension cache.h:505: warning: type of bit-field 'nonfastforward' is a GCC extension cache.h:506: warning: type of bit-field 'deletion' is a GCC extension So we change it to an 'unsigned int' which is not a GCC extension. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-18 10:31:37 +01:00			`unsigned int force:1,`
			`merge:1,`
			`nonfastforward:1,`
			`deletion:1;`
send-pack: track errors for each ref Instead of keeping the 'ret' variable, we instead have a status flag for each ref that tracks what happened to it. We then print the ref status after all of the refs have been examined. This paves the way for three improvements: - updating tracking refs only for non-error refs - incorporating remote rejection into the printed status - printing errors in a different order than we processed (e.g., consolidating non-ff errors near the end with a special message) Signed-off-by: Jeff King <peff@peff.net> Acked-by: Alex Riesen <raa.lkml@gmail.com> Acked-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-17 13:54:27 +01:00			`enum {`
			`REF_STATUS_NONE = 0,`
			`REF_STATUS_OK,`
			`REF_STATUS_REJECT_NONFASTFORWARD,`
			`REF_STATUS_REJECT_NODELETE,`
			`REF_STATUS_UPTODATE,`
send-pack: assign remote errors to each ref This lets us show remote errors (e.g., a denied hook) along with the usual push output. There is a slightly clever optimization in receive_status that bears explanation. We need to correlate the returned status and our ref objects, which naively could be an O(m*n) operation. However, since the current implementation of receive-pack returns the errors to us in the same order that we sent them, we optimistically look for the next ref to be looked up to come after the last one we have found. So it should be an O(m+n) merge if the receive-pack behavior holds, but we fall back to a correct but slower behavior if it should change. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-17 13:56:03 +01:00			`REF_STATUS_REMOTE_REJECT,`
send-pack: tighten remote error reporting Previously, we set all ref pushes to 'OK', and then marked them as errors if the remote reported so. This has the problem that if the remote dies or fails to report a ref, we just assume it was OK. Instead, we use a new non-OK state to indicate that we are expecting status (if the remote doesn't support the report-status feature, we fall back on the old behavior). Thus we can flag refs for which we expected a status, but got none (conversely, we now also print a warning for refs for which we get a status, but weren't expecting one). This also allows us to simplify the receive_status exit code, since each ref is individually marked with failure until we get a success response. We can just print the usual status table, so the user still gets a sense of what we were trying to do when the failure happened. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-18 08:16:52 +01:00			`REF_STATUS_EXPECTING_REPORT,`
send-pack: track errors for each ref Instead of keeping the 'ret' variable, we instead have a status flag for each ref that tracks what happened to it. We then print the ref status after all of the refs have been examined. This paves the way for three improvements: - updating tracking refs only for non-error refs - incorporating remote rejection into the printed status - printing errors in a different order than we processed (e.g., consolidating non-ff errors near the end with a special message) Signed-off-by: Jeff King <peff@peff.net> Acked-by: Alex Riesen <raa.lkml@gmail.com> Acked-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-17 13:54:27 +01:00			`} status;`
send-pack: tighten remote error reporting Previously, we set all ref pushes to 'OK', and then marked them as errors if the remote reported so. This has the problem that if the remote dies or fails to report a ref, we just assume it was OK. Instead, we use a new non-OK state to indicate that we are expecting status (if the remote doesn't support the report-status feature, we fall back on the old behavior). Thus we can flag refs for which we expected a status, but got none (conversely, we now also print a warning for refs for which we get a status, but weren't expecting one). This also allows us to simplify the receive_status exit code, since each ref is individually marked with failure until we get a success response. We can just print the usual status table, so the user still gets a sense of what we were trying to do when the failure happened. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-18 08:16:52 +01:00			`char *remote_status;`
Renaming push. This allows git-send-pack to push local refs to a destination repository under different names. Here is the name mapping rules for refs. * If there is no ref mapping on the command line: - if '--all' is specified, it is equivalent to specifying <local> ":" <local> for all the existing local refs on the command line - otherwise, it is equivalent to specifying <ref> ":" <ref> for all the refs that exist on both sides. * <name> is just a shorthand for <name> ":" <name> * <src> ":" <dst> push ref that matches <src> to ref that matches <dst>. - It is an error if <src> does not match exactly one of local refs. - It is an error if <dst> matches more than one remote refs. - If <dst> does not match any remote refs, either - it has to start with "refs/"; <dst> is used as the destination literally in this case. - <src> == <dst> and the ref that matched the <src> must not exist in the set of remote refs; the ref matched <src> locally is used as the name of the destination. For example, - "git-send-pack --all <remote>" works exactly as before; - "git-send-pack <remote> master:upstream" pushes local master to remote ref that matches "upstream". If there is no such ref, it is an error. - "git-send-pack <remote> master:refs/heads/upstream" pushes local master to remote refs/heads/upstream, even when refs/heads/upstream does not exist. - "git-send-pack <remote> master" into an empty remote repository pushes the local ref/heads/master to the remote ref/heads/master. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-04 01:35:29 +02:00			`struct ref peer_ref; / when renaming */`
[PATCH] Compilation: zero-length array declaration. ISO C99 (and GCC 3.x or later) lets you write a flexible array at the end of a structure, like this: struct frotz { int xyzzy; char nitfol[]; /* more / }; GCC 2.95 and 2.96 let you to do this with "char nitfol[0]"; unfortunately this is not allowed by ISO C90. This declares such construct like this: struct frotz { int xyzzy; char nitfol[FLEX_ARRAY]; / more */ }; and git-compat-util.h defines FLEX_ARRAY to 0 for gcc 2.95 and empty for others. If you are using a C90 C compiler, you should be able to override this with CFLAGS=-DFLEX_ARRAY=1 from the command line of "make". Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-01-07 10:33:54 +01:00			`char name[FLEX_ARRAY]; /* more */`
Merge three separate "fetch refs" functions It really just boils down to one "get_remote_heads()" function, and a common "struct ref" structure definition. 2005-07-16 22:55:50 +02:00			`};`

Improve git-peek-remote This makes git-peek-remote able to basically do everything that git-ls-remote does (but obviously just for the native protocol, so no http[s]: or rsync: support). The default behaviour is the same, but you can now give a mixture of "--refs", "--tags" and "--heads" flags, where "--refs" forces git-peek-remote to only show real refs (ie none of the fakey tag lookups, but also not the special pseudo-refs like HEAD and MERGE_HEAD). The "--tags" and "--heads" flags respectively limit the output to just regular tags and heads, of course. You can still also ask to limit them by name too. You can combine the flags, so git peek-remote --refs --tags . will show all local _true_ tags, without the generated tag lookups (compare the output without the "--refs" flag). And "--tags --heads" will show both tags and heads, but will avoid (for example) any special refs outside of the standard locations. I'm also planning on adding a "--ignore-local" flag that allows us to ask it to ignore any refs that we already have in the local tree, but that's an independent thing. All this is obviously gearing up to making "git fetch" cheaper. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-04 21:29:10 +02:00			`#define REF_NORMAL (1u << 0)`
			`#define REF_HEADS (1u << 1)`
			`#define REF_TAGS (1u << 2)`

make "find_ref_by_name" a public function This was a static in remote.c, but is generally useful. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-18 08:13:10 +01:00			`extern struct ref find_ref_by_name(struct ref list, const char *name);`

connect: display connection progress Make git notify the user about host resolution/connection attempts. This is useful both as a progress indicator on slow links, and helps reassure the user there are no firewall problems. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-16 19:09:41 +02:00			`#define CONNECT_VERBOSE (1u << 0)`
Miscellaneous const changes and utilities The list of remote refs in struct transport should be const, because builtin-fetch will get confused if it changes. The url in git_connect should be const (and work on a copy) instead of requiring the caller to copy it. match_refs doesn't modify the refspecs it gets. get_fetch_map and get_remote_ref don't change the list they get. Allow transport get_refs_list methods to modify the struct transport. Add a function to copy a list of refs, when a function needs a mutable copy of a const list. Add a function to check the type of a ref, as per the code in connect.c Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-30 02:05:40 +01:00			`extern struct child_process git_connect(int fd[2], const char url, const char *prog, int flags);`
Change git_connect() to return a struct child_process instead of a pid_t. This prepares the API of git_connect() and finish_connect() to operate on a struct child_process. Currently, we just use that object as a placeholder for the pid that we used to return. A follow-up patch will change the implementation of git_connect() and finish_connect() to make full use of the object. Old code had early-return-on-error checks at the calling sites of git_connect(), but since git_connect() dies on errors anyway, these checks were removed. [sp: Corrected style nit of "conn == NULL" to "!conn"] Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-19 21:47:53 +02:00			`extern int finish_connect(struct child_process *conn);`
Move ref path matching to connect.c library It's a generic thing for matching refs from the other side. 2005-07-04 22:24:30 +02:00			`extern int path_match(const char path, int nr, char *match);`
Move "get_ack()" to common git_connect functions git-clone-pack will want it too. Soon. 2005-07-06 00:44:09 +02:00			`extern int get_ack(int fd, unsigned char *result_sha1);`
Improve git-peek-remote This makes git-peek-remote able to basically do everything that git-ls-remote does (but obviously just for the native protocol, so no http[s]: or rsync: support). The default behaviour is the same, but you can now give a mixture of "--refs", "--tags" and "--heads" flags, where "--refs" forces git-peek-remote to only show real refs (ie none of the fakey tag lookups, but also not the special pseudo-refs like HEAD and MERGE_HEAD). The "--tags" and "--heads" flags respectively limit the output to just regular tags and heads, of course. You can still also ask to limit them by name too. You can combine the flags, so git peek-remote --refs --tags . will show all local _true_ tags, without the generated tag lookups (compare the output without the "--refs" flag). And "--tags --heads" will show both tags and heads, but will avoid (for example) any special refs outside of the standard locations. I'm also planning on adding a "--ignore-local" flag that allows us to ask it to ignore any refs that we already have in the local tree, but that's an independent thing. All this is obviously gearing up to making "git fetch" cheaper. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-04 21:29:10 +02:00			`extern struct ref get_remote_heads(int in, struct ref list, int nr_match, char **match, unsigned int flags);`
Support receiving server capabilities This patch implements the client side of backward compatible upload-pack protocol extension, <20051027141619.0e8029f2.vsu@altlinux.ru> by Sergey. The updated server can append "server_capabilities" which is supposed to be a string containing space separated features of the server, after one of elements in the initial list of SHA1-refname line, hidden with an embedded NUL. After get_remote_heads(), check if the server supports the feature like if (server_supports("multi_ack")) do_something(); Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-28 04:48:54 +02:00			`extern int server_supports(const char *feature);`
Factor out the ssh connection stuff from send-pack.c I want to use it for git-fetch-pack too. 2005-07-04 20:57:58 +02:00
[PATCH] Functions for managing the set of packs the library is using (whitespace fixed) This adds support for reading an uninstalled index, and installing a pack file that was added while the program was running, as well as functions for determining where to put the file. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-01 02:53:44 +02:00			`extern struct packed_git parse_pack_index(unsigned char sha1);`
[PATCH] Possible cleanups for local-pull.c Hi. This patch contains the following possible cleanups: * Make some needlessly global functions in local-pull.c static * Change 'char ' to 'const char ' where appropriate Signed-off-by: Peter Hagervall <hager@cs.umu.se> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-09-02 14:17:10 +02:00			`extern struct packed_git parse_pack_index_file(const unsigned char sha1,`
[PATCH] clean up pack index handling a bit Especially with the new index format to come, it is more appropriate to encapsulate more into check_packed_git_idx() and assume less of the index format in struct packed_git. To that effect, the index_base is renamed to index_data with void * type so it is not used directly but other pointers initialized with it. This allows for a couple pointer cast removal, as well as providing a better generic name to grep for when adding support for new index versions or formats. And index_data is declared const too while at it. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-16 21:42:50 +01:00			`const char *idx_path);`
[PATCH] Functions for managing the set of packs the library is using (whitespace fixed) This adds support for reading an uninstalled index, and installing a pack file that was added while the program was running, as well as functions for determining where to put the file. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-01 02:53:44 +02:00
[PATCH] Expose packed_git and alt_odb. The commands git-fsck-cache and probably git-*-pull needs to have a way to enumerate objects contained in packed GIT archives and alternate object pools. This commit exposes the data structure used to keep track of them from sha1_file.c, and adds a couple of accessor interface functions for use by the enhanced git-fsck-cache command. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-28 23:56:57 +02:00			`extern void prepare_packed_git(void);`
Teach receive-pack how to keep pack files based on object count. Since keeping a pushed pack or exploding it into loose objects should be a local repository decision this teaches receive-pack to decide if it should call unpack-objects or index-pack --stdin --fix-thin based on the setting of receive.unpackLimit and the number of objects contained in the received pack. If the number of objects (hdr_entries) in the received pack is below the value of receive.unpackLimit (which is 5000 by default) then we unpack-objects as we have in the past. If the hdr_entries >= receive.unpackLimit then we call index-pack and ask it to include our pid and hostname in the .keep file to make it easier to identify why a given pack has been kept in the repository. Currently this leaves every received pack as a kept pack. We really don't want that as received packs will tend to be small. Instead we want to delete the .keep file automatically after all refs have been updated. That is being left as room for future improvement. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-11-01 23:06:21 +01:00			`extern void reprepare_packed_git(void);`
[PATCH] Functions for managing the set of packs the library is using (whitespace fixed) This adds support for reading an uninstalled index, and installing a pack file that was added while the program was running, as well as functions for determining where to put the file. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-01 02:53:44 +02:00			`extern void install_packed_git(struct packed_git *pack);`

War on whitespace This uses "git-apply --whitespace=strip" to fix whitespace errors that have crept in to our source files over time. There are a few files that need to have trailing whitespaces (most notably, test vectors). The results still passes the test, and build result in Documentation/ area is unchanged. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-07 09:04:01 +02:00			`extern struct packed_git find_sha1_pack(const unsigned char sha1,`
[PATCH] Functions for managing the set of packs the library is using (whitespace fixed) This adds support for reading an uninstalled index, and installing a pack file that was added while the program was running, as well as functions for determining where to put the file. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-08-01 02:53:44 +02:00			`struct packed_git *packs);`

cache.h; fix a couple of prototypes Trivial patch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-17 07:28:02 +01:00			`extern void pack_report(void);`
Lazily open pack index files on demand In some repository configurations the user may have many packfiles, but all of the recent commits/trees/tags/blobs are likely to be in the most recent packfile (the one with the newest mtime). It is therefore common to be able to complete an entire operation by accessing only one packfile, even if there are 25 packfiles available to the repository. Rather than opening and mmaping the corresponding .idx file for every pack found, we now only open and map the .idx when we suspect there might be an object of interest in there. Of course we cannot known in advance which packfile contains an object, so we still need to scan the entire packed_git list to locate anything. But odds are users want to access objects in the most recently created packfiles first, and that may be all they ever need for the current operation. Junio observed in b867092f that placing recent packfiles before older ones can slightly improve access times for recent objects, without degrading it for historical object access. This change improves upon Junio's observations by trying even harder to avoid the .idx files that we won't need. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-26 07:24:19 +02:00			`extern int open_pack_index(struct packed_git *);`
Use off_t when we really mean a file offset. Not all platforms have declared 'unsigned long' to be a 64 bit value, but we want to support a 64 bit packfile (or close enough anyway) in the near future as some projects are getting large enough that their packed size exceeds 4 GiB. By using off_t, the POSIX type that is declared to mean an offset within a file, we support whatever maximum file size the underlying operating system will handle. For most modern systems this is up around 2^60 or higher. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:30 +01:00			`extern unsigned char* use_pack(struct packed_git , struct pack_window , off_t, unsigned int );`
Fix random fast-import errors when compiled with NO_MMAP fast-import was relying on the fact that on most systems mmap() and write() are synchronized by the filesystem's buffer cache. We were relying on the ability to mmap() 20 bytes beyond the current end of the file, then later fill in those bytes with a future write() call, then read them through the previously obtained mmap() address. This isn't always true with some implementations of NFS, but it is especially not true with our NO_MMAP=YesPlease build time option used on some platforms. If fast-import was built with NO_MMAP=YesPlease we used the malloc()+pread() emulation and the subsequent write() call does not update the trailing 20 bytes of a previously obtained "mmap()" (aka malloc'd) address. Under NO_MMAP that behavior causes unpack_entry() in sha1_file.c to be unable to read an object header (or data) that has been unlucky enough to be written to the packfile at a location such that it is in the trailing 20 bytes of a window previously opened on that same packfile. This bug has gone unnoticed for a very long time as it is highly data dependent. Not only does the object have to be placed at the right position, but it also needs to be positioned behind some other object that has been accessed due to a branch cache invalidation. In other words the stars had to align just right, and if you did run into this bug you probably should also have purchased a lottery ticket. Fortunately the workaround is a lot easier than the bug explanation. Before we allow unpack_entry() to read data from a pack window that has also (possibly) been modified through write() we force all existing windows on that packfile to be closed. By closing the windows we ensure that any new access via the emulated mmap() will reread the packfile, updating to the current file content. This comes at a slight performance degredation as we cannot reuse previously cached windows when we update the packfile. But it is a fairly minor difference as the window closes happen at only two points: - When the packfile is finalized and its .idx is generated: At this stage we are getting ready to update the refs and any data access into the packfile is going to be random, and is going after only the branch tips (to ensure they are valid). Our existing windows (if any) are not likely to be positioned at useful locations to access those final tip commits so we probably were closing them before anyway. - When the branch cache missed and we need to reload: At this point fast-import is getting change commands for the next commit and it needs to go re-read a tree object it previously had written out to the packfile. What windows we had (if any) are not likely to cover the tree in question so we probably were closing them before anyway. We do try to avoid unnecessarily closing windows in the second case by checking to see if the packfile size has increased since the last time we called unpack_entry() on that packfile. If the size has not changed then we have not written additional data, and any existing window is still vaild. This nicely handles the cases where fast-import is going through a branch cache reload and needs to read many trees at once. During such an event we are not likely to be updating the packfile so we do not cycle the windows between reads. With this change in place t9301-fast-export.sh (which was broken by c3b0dec509fe136c5417422f31898b5a4e2d5e02) finally works again. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-18 04:57:00 +01:00			`extern void close_pack_windows(struct packed_git *);`
Replace use_packed_git with window cursors. Part of the implementation concept of the sliding mmap window for pack access is to permit multiple windows per pack to be mapped independently. Since the inuse_cnt is associated with the mmap and not with the file, this value is in struct pack_window and needs to be incremented/decremented for each pack_window accessed by any code. To faciliate that implementation we need to replace all uses of use_packed_git() and unuse_packed_git() with a different API that follows struct pack_window objects rather than struct packed_git. The way this works is when we need to start accessing a pack for the first time we should setup a new window 'cursor' by declaring a local and setting it to NULL: struct pack_windows w_curs = NULL; To obtain the memory region which contains a specific section of the pack file we invoke use_pack(), supplying the address of our current window cursor: unsigned int len; unsigned char addr = use_pack(p, &w_curs, offset, &len); the returned address `addr` will be the first byte at `offset` within the pack file. The optional variable len will also be updated with the number of bytes remaining following the address. Multiple calls to use_pack() with the same window cursor will update the window cursor, moving it from one window to another when necessary. In this way each window cursor variable maintains only one struct pack_window inuse at a time. Finally before exiting the scope which originally declared the window cursor we must invoke unuse_pack() to unuse the current window (which may be different from the one that was first obtained from use_pack): unuse_pack(&w_curs); This implementation is still not complete with regards to multiple windows, as only one window per pack file is supported right now. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-23 08:34:08 +01:00			`extern void unuse_pack(struct pack_window **);`
[PATCH] clean up pack index handling a bit Especially with the new index format to come, it is more appropriate to encapsulate more into check_packed_git_idx() and assume less of the index format in struct packed_git. To that effect, the index_base is renamed to index_data with void * type so it is not used directly but other pointers initialized with it. This allows for a couple pointer cast removal, as well as providing a better generic name to grep for when adding support for new index versions or formats. And index_data is declared const too while at it. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-16 21:42:50 +01:00			`extern struct packed_git add_packed_git(const char , int, int);`
Lazily open pack index files on demand In some repository configurations the user may have many packfiles, but all of the recent commits/trees/tags/blobs are likely to be in the most recent packfile (the one with the newest mtime). It is therefore common to be able to complete an entire operation by accessing only one packfile, even if there are 25 packfiles available to the repository. Rather than opening and mmaping the corresponding .idx file for every pack found, we now only open and map the .idx when we suspect there might be an object of interest in there. Of course we cannot known in advance which packfile contains an object, so we still need to scan the entire packed_git list to locate anything. But odds are users want to access objects in the most recently created packfiles first, and that may be all they ever need for the current operation. Junio observed in b867092f that placing recent packfiles before older ones can slightly improve access times for recent objects, without degrading it for historical object access. This change improves upon Junio's observations by trying even harder to avoid the .idx files that we won't need. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-26 07:24:19 +02:00			`extern const unsigned char nth_packed_object_sha1(struct packed_git , uint32_t);`
Use off_t when we really mean a file offset. Not all platforms have declared 'unsigned long' to be a 64 bit value, but we want to support a 64 bit packfile (or close enough anyway) in the near future as some projects are getting large enough that their packed size exceeds 4 GiB. By using off_t, the POSIX type that is declared to mean an offset within a file, we support whatever maximum file size the underlying operating system will handle. For most modern systems this is up around 2^60 or higher. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:30 +01:00			`extern off_t find_pack_entry_one(const unsigned char , struct packed_git );`
			`extern void unpack_entry(struct packed_git , off_t, enum object_type , unsigned long );`
more lightweight revalidation while reusing deflated stream in packing When copying from an existing pack and when copying from a loose object with new style header, the code makes sure that the piece we are going to copy out inflates well and inflate() consumes the data in full while doing so. The check to see if the xdelta really apply is quite expensive as you described, because you would need to have the image of the base object which can be represented as a delta against something else. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-04 06:09:18 +02:00			`extern unsigned long unpack_object_header_gently(const unsigned char buf, unsigned long len, enum object_type type, unsigned long *sizep);`
add get_size_from_delta() ... which consists of existing code split out of packed_delta_info() for other callers to use it as well. Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-16 18:31:56 +02:00			`extern unsigned long get_size_from_delta(struct packed_git , struct pack_window *, off_t);`
Use off_t when we really mean a file offset. Not all platforms have declared 'unsigned long' to be a 64 bit value, but we want to support a 64 bit packfile (or close enough anyway) in the near future as some projects are getting large enough that their packed size exceeds 4 GiB. By using off_t, the POSIX type that is declared to mean an offset within a file, we support whatever maximum file size the underlying operating system will handle. For most modern systems this is up around 2^60 or higher. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:30 +01:00			`extern const char packed_object_info_detail(struct packed_git , off_t, unsigned long , unsigned long , unsigned int , unsigned char );`
Export matches_pack_name() and fix its return value The function sounds boolean; make it behave as one, not "0 for success, non-zero for failure". Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-17 08:15:19 +02:00			`extern int matches_pack_name(struct packed_git p, const char name);`
[PATCH] Expose packed_git and alt_odb. The commands git-fsck-cache and probably git-*-pull needs to have a way to enumerate objects contained in packed GIT archives and alternate object pools. This commit exposes the data structure used to keep track of them from sha1_file.c, and adds a couple of accessor interface functions for use by the enhanced git-fsck-cache command. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-06-28 23:56:57 +02:00
[PATCH] Add update-server-info. The git-update-server-info command prepares informational files to help clients discover the contents of a repository, and pull from it via a dumb transport protocols. Currently, the following files are produced. - The $repo/info/refs file lists the name of heads and tags available in the $repo/refs/ directory, along with their SHA1. This can be used by git-ls-remote command running on the client side. - The $repo/info/rev-cache file describes the commit ancestry reachable from references in the $repo/refs/ directory. This file is in an append-only binary format to make the server side friendly to rsync mirroring scheme, and can be read by git-show-rev-cache command. - The $repo/objects/info/pack file lists the name of the packs available, the interdependencies among them, and the head commits and tags contained in them. Along with the other two files, this is designed to help clients to make smart pull decisions. The git-receive-pack command is changed to invoke it at the end, so just after a push to a public repository finishes via "git push", the server info is automatically updated. In addition, building of the rev-cache file can be done by a standalone git-build-rev-cache command separately. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org> 2005-07-24 02:54:41 +02:00			`/* Dumb servers support */`
			`extern int update_server_info(int);`

Add ".git/config" file parser This is a first cut at a very simple parser for a git config file. The format of the file is a simple ini-file like thing, with simple variable/value pairs. You can (and should) make the variables have a simple single-level scope, ie a valid file looks something like this: # # This is the config file, and # a '#' or ';' character indicates # a comment # ; core variables [core] ; Don't trust file modes filemode = false ; Our diff algorithm [diff] external = "/usr/local/bin/gnu-diff -u" renames = true which parses into three variables: "core.filemode" is associated with the string "false", and "diff.external" gets the appropriate quoted value. Right now we only react to one variable: "core.filemode" is a boolean that decides if we should care about the 0100 (user-execute) bit of the stat information. Even that is just a parsing demonstration - this doesn't actually implement that st_mode compare logic itself. Different programs can react to different config options, although they should always fall back to calling "git_default_config()" on any config option name that they don't recognize. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-11 01:31:08 +02:00			`typedef int (config_fn_t)(const char , const char *);`
			`extern int git_default_config(const char , const char );`
init-db: check template and repository format. This makes init-db repository version aware. It checks if an existing config file says the repository being reinitialized is of a wrong version and aborts before doing further harm. When copying the templates, it makes sure the they are of the right repository format version. Otherwise the templates are ignored with an warning message. It copies the templates before creating the HEAD, and if the config file is copied from the template directory, reads it, primarily to pick up the value of core.symrefsonly. It changes the way the result of the filemode reliability test is written to the configuration file using git_config_set(). The test is done even if the config file was copied from the templates. And finally, our own repository format version is written to the config file. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-26 01:03:56 +01:00			`extern int git_config_from_file(config_fn_t fn, const char *);`
Add ".git/config" file parser This is a first cut at a very simple parser for a git config file. The format of the file is a simple ini-file like thing, with simple variable/value pairs. You can (and should) make the variables have a simple single-level scope, ie a valid file looks something like this: # # This is the config file, and # a '#' or ';' character indicates # a comment # ; core variables [core] ; Don't trust file modes filemode = false ; Our diff algorithm [diff] external = "/usr/local/bin/gnu-diff -u" renames = true which parses into three variables: "core.filemode" is associated with the string "false", and "diff.external" gets the appropriate quoted value. Right now we only react to one variable: "core.filemode" is a boolean that decides if we should care about the 0100 (user-execute) bit of the stat information. Even that is just a parsing demonstration - this doesn't actually implement that st_mode compare logic itself. Different programs can react to different config options, although they should always fall back to calling "git_default_config()" on any config option name that they don't recognize. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-11 01:31:08 +02:00			`extern int git_config(config_fn_t fn);`
Add functions for parsing integers with size suffixes Split out the nnn{k,m,g} parsing code from git_config_int into git_parse_long, so command-line parameters can enjoy the same functionality. Also add get_parse_ulong for unsigned values. Make git_config_int use git_parse_long, and add get_config_ulong as well. Signed-off-by: Brian Downing <bdowning@lavos.net> Acked-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-12 15:32:26 +02:00			`extern int git_parse_long(const char , long );`
			`extern int git_parse_ulong(const char , unsigned long );`
Add ".git/config" file parser This is a first cut at a very simple parser for a git config file. The format of the file is a simple ini-file like thing, with simple variable/value pairs. You can (and should) make the variables have a simple single-level scope, ie a valid file looks something like this: # # This is the config file, and # a '#' or ';' character indicates # a comment # ; core variables [core] ; Don't trust file modes filemode = false ; Our diff algorithm [diff] external = "/usr/local/bin/gnu-diff -u" renames = true which parses into three variables: "core.filemode" is associated with the string "false", and "diff.external" gets the appropriate quoted value. Right now we only react to one variable: "core.filemode" is a boolean that decides if we should care about the 0100 (user-execute) bit of the stat information. Even that is just a parsing demonstration - this doesn't actually implement that st_mode compare logic itself. Different programs can react to different config options, although they should always fall back to calling "git_default_config()" on any config option name that they don't recognize. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-11 01:31:08 +02:00			`extern int git_config_int(const char , const char );`
Add functions for parsing integers with size suffixes Split out the nnn{k,m,g} parsing code from git_config_int into git_parse_long, so command-line parameters can enjoy the same functionality. Also add get_parse_ulong for unsigned values. Make git_config_int use git_parse_long, and add get_config_ulong as well. Signed-off-by: Brian Downing <bdowning@lavos.net> Acked-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-12 15:32:26 +02:00			`extern unsigned long git_config_ulong(const char , const char );`
Add ".git/config" file parser This is a first cut at a very simple parser for a git config file. The format of the file is a simple ini-file like thing, with simple variable/value pairs. You can (and should) make the variables have a simple single-level scope, ie a valid file looks something like this: # # This is the config file, and # a '#' or ';' character indicates # a comment # ; core variables [core] ; Don't trust file modes filemode = false ; Our diff algorithm [diff] external = "/usr/local/bin/gnu-diff -u" renames = true which parses into three variables: "core.filemode" is associated with the string "false", and "diff.external" gets the appropriate quoted value. Right now we only react to one variable: "core.filemode" is a boolean that decides if we should care about the 0100 (user-execute) bit of the stat information. Even that is just a parsing demonstration - this doesn't actually implement that st_mode compare logic itself. Different programs can react to different config options, although they should always fall back to calling "git_default_config()" on any config option name that they don't recognize. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-11 01:31:08 +02:00			`extern int git_config_bool(const char , const char );`
Add functions git_config_set() and git_config_set_multivar() The function git_config_set() does exactly what you think it does. Given a key (in the form "core.filemode") and a value, it sets the key to the value. Example: git_config_set("core.filemode", "true"); The function git_config_set_multivar() is meant for setting variables which can have several values for the same key. Example: [diff] twohead = resolve twohead = recarsive the typo in the second line can be replaced by git_config_set_multivar("diff.twohead", "recursive", "^recar"); The third argument of the function is a POSIX extended regex which has to match the value. If there is no key/value pair with a matching value, a new key/value pair is added. These commands are also capable of unsetting (deleting) entries: git_config_set_multivar("diff.twohead", NULL, "sol"); will delete the entry twohead = resolve Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-17 22:32:36 +01:00			`extern int git_config_set(const char , const char );`
git-config-set: add more options ... namely --replace-all, to replace any amount of matching lines, not just 0 or 1, --get, to get the value of one key, --get-all, the multivar version of --get, and --unset-all, which deletes all matching lines from .git/config Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-20 06:52:22 +01:00			`extern int git_config_set_multivar(const char , const char , const char *, int);`
add a function to rename sections in the config Given a config like this: # A config [very.interesting.section] not The command $ git repo-config --rename-section very.interesting.section bla.1 will lead to this config: # A config [bla "1"] not Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-16 15:14:14 +01:00			`extern int git_config_rename_section(const char , const char );`
Introduce git_etc_gitconfig() that encapsulates access of ETC_GITCONFIG. In a subsequent patch the path to the system-wide config file will be computed. This is a preparation for that change. It turns all accesses of ETC_GITCONFIG into function calls. There is no change in behavior. As a consequence, config.c is the only file that needs the definition of ETC_GITCONFIG. Hence, -DETC_GITCONFIG is removed from the CFLAGS and a special build rule for config.c is introduced. As a side-effect, changing the defintion of ETC_GITCONFIG (e.g. in config.mak) does not trigger a complete rebuild anymore. Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-13 21:05:05 +01:00			`extern const char *git_etc_gitconfig(void);`
Repository format version check. This adds the repository format version code, first done by Martin Atukunda. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-26 00:59:09 +01:00			`extern int check_repository_format_version(const char var, const char value);`
Add ".git/config" file parser This is a first cut at a very simple parser for a git config file. The format of the file is a simple ini-file like thing, with simple variable/value pairs. You can (and should) make the variables have a simple single-level scope, ie a valid file looks something like this: # # This is the config file, and # a '#' or ';' character indicates # a comment # ; core variables [core] ; Don't trust file modes filemode = false ; Our diff algorithm [diff] external = "/usr/local/bin/gnu-diff -u" renames = true which parses into three variables: "core.filemode" is associated with the string "false", and "diff.external" gets the appropriate quoted value. Right now we only react to one variable: "core.filemode" is a boolean that decides if we should care about the 0100 (user-execute) bit of the stat information. Even that is just a parsing demonstration - this doesn't actually implement that st_mode compare logic itself. Different programs can react to different config options, although they should always fall back to calling "git_default_config()" on any config option name that they don't recognize. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-11 01:31:08 +02:00
Use git config file for committer name and email info This starts using the "user.name" and "user.email" config variables if they exist as the default name and email when committing. This means that you don't have to use the GIT_COMMITTER_EMAIL environment variable to override your email - you can just edit the config file instead. The patch looks bigger than it is because it makes the default name and email information non-static and renames it appropriately. And it moves the common git environment variables into a new library file, so that you can link against libgit.a and get the git environment without having to link in zlib and libcrypt. In short, most of it is renaming and moving, the real change core is just a few new lines in "git_default_config()" that copies the user config values to the new base. It also changes "git-var -l" to list the config variables. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-12 03:47:34 +02:00			`#define MAX_GITNAME (1000)`
			`extern char git_default_email[MAX_GITNAME];`
			`extern char git_default_name[MAX_GITNAME];`

Correct new compiler warnings in builtin-revert The new builtin-revert code introduces a few new compiler errors when I'm building with my stricter set of checks enabled in CFLAGS. These all just stem from trying to store a constant string into a non-const char. Simple fix, make the variables const char. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-12 20:33:18 +01:00			`extern const char *git_commit_encoding;`
General const correctness fixes We shouldn't attempt to assign constant strings into char*, as the string is not writable at runtime. Likewise we should always be treating unsigned values as unsigned values, not as signed values. Most of these are very straightforward. The only exception is the (unnecessary) xstrdup/free in builtin-branch.c for the detached head case. Since this is a user-level interactive type program and that particular code path is executed no more than once, I feel that the extra xstrdup call is well worth the easy elimination of this warning. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:17 +01:00			`extern const char *git_log_output_encoding;`
Introduce i18n.commitencoding. This is to hold what the project-local rule as to the charset/encoding for the commit log message is. Lack of it defaults to utf-8. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-11-28 01:09:40 +01:00
Don't fflush(stdout) when it's not helpful This patch arose from a discussion started by Jim Meyering's patch whose intention was to provide better diagnostics for failed writes. Linus proposed a better way to do things, which also had the added benefit that adding a fflush() to git-log-* operations and incremental git-blame operations could improve interactive respose time feel, at the cost of making things a bit slower when we aren't piping the output to a downstream program. This patch skips the fflush() calls when stdout is a regular file, or if the environment variable GIT_FLUSH is set to "0". This latter can speed up a command such as: GIT_FLUSH=0 strace -c -f -e write time git-rev-list HEAD \| wc -l a tiny amount. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-29 19:40:46 +02:00			`/* IO helper functions */`
			`extern void maybe_flush_or_die(FILE , const char );`
pack-objects: Allow use of pre-generated pack. git-pack-objects can reuse pack files stored in $GIT_DIR/pack-cache directory, when a necessary pack is found. This is hopefully useful when upload-pack (called from git-daemon) is expected to receive requests for the same set of objects many times (e.g full cloning request of any project, or updates from the set of heads previous day to the latest for a slow moving project). Currently git-pack-objects does not keep pack files it creates for reusing. It might be useful to add --update-cache option to it, which would allow it store pack files it created in the pack-cache directory, and prune rarely used ones from it. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-10-22 10:28:13 +02:00			`extern int copy_fd(int ifd, int ofd);`
short i/o: fix calls to read to use xread or read_in_full We have a number of badly checked read() calls. Often we are expecting read() to read exactly the size we requested or fail, this fails to handle interrupts or short reads. Add a read_in_full() providing those semantics. Otherwise we at a minimum need to check for EINTR and EAGAIN, where this is appropriate use xread(). Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-08 16:58:08 +01:00			`extern int read_in_full(int fd, void *buf, size_t count);`
short i/o: clean up the naming for the write_{in,or}_xxx family We recently introduced a write_in_full() which would either write the specified object or emit an error message and fail. In order to fix the read side we now want to introduce a read_in_full() but without an error emit. This patch cleans up the naming of this family of calls: 1) convert the existing write_or_whine() to write_or_whine_pipe() to better indicate its pipe specific nature, 2) convert the existing write_in_full() calls to write_or_whine() to better indicate its nature, 3) introduce a write_in_full() providing a write or fail semantic, and 4) convert write_or_whine() and write_or_whine_pipe() to use write_in_full(). Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-08 16:57:52 +01:00			`extern int write_in_full(int fd, const void *buf, size_t count);`
Add write_or_die(), a helper function The little helper write_or_die() won't come back with bad news about full disks or broken pipes. It either succeeds or terminates the program, making additional error handling unnecessary. This patch adds the new function and uses it to replace two similar ones (the one in tar-tree originally has been copied from cat-file btw.). I chose to add the fd parameter which both lacked to make write_or_die() just as flexible as write() and thus suitable for lib-ification. There is a regression: error messages emitted by this function don't show the program name, while the replaced two functions did. That's acceptable, I think; a lot of other functions do the same. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-21 20:43:43 +02:00			`extern void write_or_die(int fd, const void *buf, size_t count);`
Trace into a file or an open fd and refactor tracing code. If GIT_TRACE is set to an absolute path (starting with a '/' character), we interpret this as a file path and we trace into it. Also if GIT_TRACE is set to an integer value greater than 1 and lower than 10, we interpret this as an open fd value and we trace into it. Note that this behavior is not compatible with the previous one. We also trace whole messages using one write(2) call to make sure messages from processes do net get mixed up in the middle. This patch makes it possible to get trace information when running "make test". Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-02 18:23:48 +02:00			`extern int write_or_whine(int fd, const void buf, size_t count, const char msg);`
short i/o: clean up the naming for the write_{in,or}_xxx family We recently introduced a write_in_full() which would either write the specified object or emit an error message and fail. In order to fix the read side we now want to introduce a read_in_full() but without an error emit. This patch cleans up the naming of this family of calls: 1) convert the existing write_or_whine() to write_or_whine_pipe() to better indicate its pipe specific nature, 2) convert the existing write_in_full() calls to write_or_whine() to better indicate its nature, 3) introduce a write_in_full() providing a write or fail semantic, and 4) convert write_or_whine() and write_or_whine_pipe() to use write_in_full(). Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-08 16:57:52 +01:00			`extern int write_or_whine_pipe(int fd, const void buf, size_t count, const char msg);`
fetch-pack: -k option to keep downloaded pack. Split out the functions that deal with the socketpair after finishing git protocol handshake to receive the packed data into a separate file, and use it in fetch-pack to keep/explode the received pack data. We earlier had something like that on clone-pack side once, but the list discussion resulted in the decision that it makes sense to always keep the pack for clone-pack, so unpacking option is not enabled on the clone-pack side, but we later still could do so easily if we wanted to with this change. Signed-off-by: Junio C Hamano <junkio@cox.net> 2005-12-15 07:17:38 +01:00
Introduce trivial new pager.c helper infrastructure This introduces the new function void setup_pager(void); to set up output to be written through a pager applocation. All in preparation for doing the simple scripts in C. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-28 20:26:21 +01:00			`/* pager.c */`
			`extern void setup_pager(void);`
Add core.pager config variable. This adds a configuration variable that performs the same function as, but is overridden by, GIT_PAGER. Signed-off-by: Brian Gernhardt <benji@silverinsanity.com> Acked-by: Johannes E. Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-03 20:18:11 +02:00			`extern char *pager_program;`
Support GIT_PAGER_IN_USE environment variable When deciding whether or not to turn on automatic color support, git_config_colorbool checks whether stdout is a tty. However, because we run a pager, if stdout is not a tty, we must check whether it is because we started the pager. This used to be done by checking the pager_in_use variable. This variable was set only when the git program being run started the pager; there was no way for an external program running git indicate that it had already started a pager. This patch allows a program to set GIT_PAGER_IN_USE to a true value to indicate that even though stdout is not a tty, it is because a pager is being used. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-11 07:27:33 +01:00			`extern int pager_in_use(void);`
pager: config variable pager.color enable/disable colored output when the pager is in use Signed-off-by: Matthias Lederhofer <matled@gmx.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-07-30 00:27:43 +02:00			`extern int pager_use_color;`
Introduce trivial new pager.c helper infrastructure This introduces the new function void setup_pager(void); to set up output to be written through a pager applocation. All in preparation for doing the simple scripts in C. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-02-28 20:26:21 +01:00
launch_editor(): Heed GIT_EDITOR and core.editor settings In the commit 'Add GIT_EDITOR environment and core.editor configuration variables', this was done for the shell scripts. Port it over to builtin-tag's version of launch_editor(), which is just about to be refactored into editor.c. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-20 14:06:09 +02:00			`extern char *editor_program;`
core.excludesfile clean-up There are inconsistencies in the way commands currently handle the core.excludesfile configuration variable. The problem is the variable is too new to be noticed by anything other than git-add and git-status. * git-ls-files does not notice any of the "ignore" files by default, as it predates the standardized set of ignore files. The calling scripts established the convention to use .git/info/exclude, .gitignore, and later core.excludesfile. * git-add and git-status know about it because they call add_excludes_from_file() directly with their own notion of which standard set of ignore files to use. This is just a stupid duplication of code that need to be updated every time the definition of the standard set of ignore files is changed. * git-read-tree takes --exclude-per-directory=<gitignore>, not because the flexibility was needed. Again, this was because the option predates the standardization of the ignore files. * git-merge-recursive uses hardcoded per-directory .gitignore and nothing else. git-clean (scripted version) does not honor core.* because its call to underlying ls-files does not know about it. git-clean in C (parked in 'pu') doesn't either. We probably could change git-ls-files to use the standard set when no excludes are specified on the command line and ignore processing was asked, or something like that, but that will be a change in semantics and might break people's scripts in a subtle way. I am somewhat reluctant to make such a change. On the other hand, I think it makes perfect sense to fix git-read-tree, git-merge-recursive and git-clean to follow the same rule as other commands. I do not think of a valid use case to give an exclude-per-directory that is nonstandard to read-tree command, outside a "negative" test in the t1004 test script. This patch is the first step to untangle this mess. The next step would be to teach read-tree, merge-recursive and clean (in C) to use setup_standard_excludes(). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 09:05:00 +01:00			`extern char *excludes_file;`
launch_editor(): Heed GIT_EDITOR and core.editor settings In the commit 'Add GIT_EDITOR environment and core.editor configuration variables', this was done for the shell scripts. Port it over to builtin-tag's version of launch_editor(), which is just about to be refactored into editor.c. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-07-20 14:06:09 +02:00
binary patch. This adds "binary patch" to the diff output and teaches apply what to do with them. On the diff generation side, traditionally, we said "Binary files differ\n" without giving anything other than the preimage and postimage object name on the index line. This was good enough for applying a patch generated from your own repository (very useful while rebasing), because the postimage would be available in such a case. However, this was not useful when the recipient of such a patch via e-mail were to apply it, even if the preimage was available. This patch allows the diff to generate "binary" patch when operating under --full-index option. The binary patch follows the usual extended git diff headers, and looks like this: "GIT binary patch\n" <length byte><data>"\n" ... "\n" Each line is prefixed with a "length-byte", whose value is upper or lowercase alphabet that encodes number of bytes that the data on the line decodes to (1..52 -- 'A' means 1, 'B' means 2, ..., 'Z' means 26, 'a' means 27, ...). <data> is 1 or more groups of 5-byte sequence, each of which encodes up to 4 bytes in base85 encoding. Because 52 / 4 * 5 = 65 and we have the length byte, an output line is capped to 66 characters. The payload is the same diff-delta as we use in the packfiles. On the consumption side, git-apply now can decode and apply the binary patch when --allow-binary-replacement is given, the diff was generated with --full-index, and the receiving repository has the preimage blob, which is the same condition as it always required when accepting an "Binary files differ\n" patch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-05 01:51:44 +02:00			`/* base85 */`
(encode_85, decode_85): Mark source buffer pointer as "const". Signed-off-by: Jim Meyering <jim@meyering.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-10 00:56:33 +02:00			`int decode_85(char dst, const char line, int linelen);`
			`void encode_85(char buf, const unsigned char data, int bytes);`
binary patch. This adds "binary patch" to the diff output and teaches apply what to do with them. On the diff generation side, traditionally, we said "Binary files differ\n" without giving anything other than the preimage and postimage object name on the index line. This was good enough for applying a patch generated from your own repository (very useful while rebasing), because the postimage would be available in such a case. However, this was not useful when the recipient of such a patch via e-mail were to apply it, even if the preimage was available. This patch allows the diff to generate "binary" patch when operating under --full-index option. The binary patch follows the usual extended git diff headers, and looks like this: "GIT binary patch\n" <length byte><data>"\n" ... "\n" Each line is prefixed with a "length-byte", whose value is upper or lowercase alphabet that encodes number of bytes that the data on the line decodes to (1..52 -- 'A' means 1, 'B' means 2, ..., 'Z' means 26, 'a' means 27, ...). <data> is 1 or more groups of 5-byte sequence, each of which encodes up to 4 bytes in base85 encoding. Because 52 / 4 * 5 = 65 and we have the length byte, an output line is capped to 66 characters. The payload is the same diff-delta as we use in the packfiles. On the consumption side, git-apply now can decode and apply the binary patch when --allow-binary-replacement is given, the diff was generated with --full-index, and the receiving repository has the preimage blob, which is the same condition as it always required when accepting an "Binary files differ\n" patch. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-05 01:51:44 +02:00
Add specialized object allocator This creates a simple specialized object allocator for basic objects. This avoids wasting space with malloc overhead (metadata and extra alignment), since the specialized allocator knows the alignment, and that objects, once allocated, are never freed. It also allows us to track some basic statistics about object allocations. For example, for the mozilla import, it shows object usage as follows: blobs: 627629 (14710 kB) trees: 1119035 (34969 kB) commits: 196423 (8440 kB) tags: 1336 (46 kB) and the simpler allocator shaves off about 2.5% off the memory footprint off a "git-rev-list --all --objects", and is a bit faster too. [ Side note: this concludes the series of "save memory in object storage". The thing is, there simply isn't much more to be saved on the objects. Doing "git-rev-list --all --objects" on the mozilla archive has a final total RSS of 131498 pages for me: that's about 513MB. Of that, the object overhead is now just 56MB, the rest is going somewhere else (put another way: the fact that this patch shaves off 2.5% of the total memory overhead, considering that objects are now not much more than 10% of the total shows how big the wasted space really was: this makes object allocations much more memory- and time-efficient). I haven't looked at where the rest is, but I suspect the bulk of it is just the pack-file loading. It may be that we should pack the tree objects separately from the blob objects: for git-rev-list --objects, we don't actually ever need to even look at the blobs, but since trees and blobs are interspersed in the pack-file, we end up not being dense in the tree accesses, so we end up looking at more pages than we strictly need to. So with a 535MB pack-file, it's entirely possible - even likely - that most of the remaining RSS is just the mmap of the pack-file itself. We don't need to map in _all_ of it, but we do end up mapping a fair amount. ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-19 19:44:15 +02:00			`/* alloc.c */`
Clean up object creation to use more common code This replaces the fairly odd "created_object()" function that did _most_ of the object setup with a more complete "create_object()" function that also has a more natural calling convention. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-17 07:11:43 +02:00			`extern void *alloc_blob_node(void);`
			`extern void *alloc_tree_node(void);`
			`extern void *alloc_commit_node(void);`
			`extern void *alloc_tag_node(void);`
			`extern void *alloc_object_node(void);`
Add specialized object allocator This creates a simple specialized object allocator for basic objects. This avoids wasting space with malloc overhead (metadata and extra alignment), since the specialized allocator knows the alignment, and that objects, once allocated, are never freed. It also allows us to track some basic statistics about object allocations. For example, for the mozilla import, it shows object usage as follows: blobs: 627629 (14710 kB) trees: 1119035 (34969 kB) commits: 196423 (8440 kB) tags: 1336 (46 kB) and the simpler allocator shaves off about 2.5% off the memory footprint off a "git-rev-list --all --objects", and is a bit faster too. [ Side note: this concludes the series of "save memory in object storage". The thing is, there simply isn't much more to be saved on the objects. Doing "git-rev-list --all --objects" on the mozilla archive has a final total RSS of 131498 pages for me: that's about 513MB. Of that, the object overhead is now just 56MB, the rest is going somewhere else (put another way: the fact that this patch shaves off 2.5% of the total memory overhead, considering that objects are now not much more than 10% of the total shows how big the wasted space really was: this makes object allocations much more memory- and time-efficient). I haven't looked at where the rest is, but I suspect the bulk of it is just the pack-file loading. It may be that we should pack the tree objects separately from the blob objects: for git-rev-list --objects, we don't actually ever need to even look at the blobs, but since trees and blobs are interspersed in the pack-file, we end up not being dense in the tree accesses, so we end up looking at more pages than we strictly need to. So with a 535MB pack-file, it's entirely possible - even likely - that most of the remaining RSS is just the mmap of the pack-file itself. We don't need to map in _all_ of it, but we do end up mapping a fair amount. ] Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-06-19 19:44:15 +02:00			`extern void alloc_report(void);`

Trace into a file or an open fd and refactor tracing code. If GIT_TRACE is set to an absolute path (starting with a '/' character), we interpret this as a file path and we trace into it. Also if GIT_TRACE is set to an integer value greater than 1 and lower than 10, we interpret this as an open fd value and we trace into it. Note that this behavior is not compatible with the previous one. We also trace whole messages using one write(2) call to make sure messages from processes do net get mixed up in the middle. This patch makes it possible to get trace information when running "make test". Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-02 18:23:48 +02:00			`/* trace.c */`
			`extern void trace_printf(const char *format, ...);`
Trace and quote with argv: get rid of unneeded count argument. Now that str_buf takes care of all the allocations, there is no more gain to pass an argument count. So this patch removes the "count" argument from: - "sq_quote_argv" - "trace_argv_printf" and all the callers. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-03 05:51:50 +01:00			`extern void trace_argv_printf(const char *argv, const char format, ...);`
Trace into a file or an open fd and refactor tracing code. If GIT_TRACE is set to an absolute path (starting with a '/' character), we interpret this as a file path and we trace into it. Also if GIT_TRACE is set to an integer value greater than 1 and lower than 10, we interpret this as an open fd value and we trace into it. Note that this behavior is not compatible with the previous one. We also trace whole messages using one write(2) call to make sure messages from processes do net get mixed up in the middle. This patch makes it possible to get trace information when running "make test". Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-02 18:23:48 +02:00
Lazy man's auto-CRLF It currently does NOT know about file attributes, so it does its conversion purely based on content. Maybe that is more in the "git philosophy" anyway, since content is king, but I think we should try to do the file attributes to turn it off on demand. Anyway, BY DEFAULT it is off regardless, because it requires a [core] AutoCRLF = true in your config file to be enabled. We could make that the default for Windows, of course, the same way we do some other things (filemode etc). But you can actually enable it on UNIX, and it will cause: - "git update-index" will write blobs without CRLF - "git diff" will diff working tree files without CRLF - "git checkout" will write files to the working tree _with_ CRLF and things work fine. Funnily, it actually shows an odd file in git itself: git clone -n git test-crlf cd test-crlf git config core.autocrlf true git checkout git diff shows a diff for "Documentation/docbook-xsl.css". Why? Because we have actually checked in that file with CRLF! So when "core.autocrlf" is true, we'll always generate a different hash for it in the index, because the index hash will be for the content _without_ CRLF. Is this complete? I dunno. It seems to work for me. It doesn't use the filename at all right now, and that's probably a deficiency (we could certainly make the "is_binary()" heuristics also take standard filename heuristics into account). I don't pass in the filename at all for the "index_fd()" case (git-update-index), so that would need to be passed around, but this actually works fine. NOTE NOTE NOTE! The "is_binary()" heuristics are totally made-up by yours truly. I will not guarantee that they work at all reasonable. Caveat emptor. But it _is_ simple, and it _is_ safe, since it's all off by default. The patch is pretty simple - the biggest part is the new "convert.c" file, but even that is really just basic stuff that anybody can write in "Teaching C 101" as a final project for their first class in programming. Not to say that it's bug-free, of course - but at least we're not talking about rocket surgery here. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-13 20:07:23 +01:00			`/* convert.c */`
Rewrite convert_to_{git,working_tree} to use strbuf's. * Now, those functions take an "out" strbuf argument, where they store their result if any. In that case, it also returns 1, else it returns 0. * those functions support "in place" editing, in the sense that it's OK to call them this way: convert_to_git(path, sb->buf, sb->len, sb); When doable, conversions are done in place for real, else the strbuf content is just replaced with the new one, transparentely for the caller. If you want to create a new filter working this way, being the accumulation of filter1, filter2, ... filtern, then your meta_filter would be: int meta_filter(..., const char src, size_t len, struct strbuf sb) { int ret = 0; ret \|= filter1(...., src, len, sb); if (ret) { src = sb->buf; len = sb->len; } ret \|= filter2(...., src, len, sb); if (ret) { src = sb->buf; len = sb->len; } .... return ret \| filtern(..., src, len, sb); } That's why subfilters the convert_to_* functions called were also rewritten to work this way. Signed-off-by: Pierre Habouzit <madcoder@debian.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-16 15:51:04 +02:00			`/* returns 1 if dst was used /`
			`extern int convert_to_git(const char path, const char src, size_t len, struct strbuf *dst);`
			`extern int convert_to_working_tree(const char path, const char src, size_t len, struct strbuf *dst);`
Lazy man's auto-CRLF It currently does NOT know about file attributes, so it does its conversion purely based on content. Maybe that is more in the "git philosophy" anyway, since content is king, but I think we should try to do the file attributes to turn it off on demand. Anyway, BY DEFAULT it is off regardless, because it requires a [core] AutoCRLF = true in your config file to be enabled. We could make that the default for Windows, of course, the same way we do some other things (filemode etc). But you can actually enable it on UNIX, and it will cause: - "git update-index" will write blobs without CRLF - "git diff" will diff working tree files without CRLF - "git checkout" will write files to the working tree _with_ CRLF and things work fine. Funnily, it actually shows an odd file in git itself: git clone -n git test-crlf cd test-crlf git config core.autocrlf true git checkout git diff shows a diff for "Documentation/docbook-xsl.css". Why? Because we have actually checked in that file with CRLF! So when "core.autocrlf" is true, we'll always generate a different hash for it in the index, because the index hash will be for the content _without_ CRLF. Is this complete? I dunno. It seems to work for me. It doesn't use the filename at all right now, and that's probably a deficiency (we could certainly make the "is_binary()" heuristics also take standard filename heuristics into account). I don't pass in the filename at all for the "index_fd()" case (git-update-index), so that would need to be passed around, but this actually works fine. NOTE NOTE NOTE! The "is_binary()" heuristics are totally made-up by yours truly. I will not guarantee that they work at all reasonable. Caveat emptor. But it _is_ simple, and it _is_ safe, since it's all off by default. The patch is pretty simple - the biggest part is the new "convert.c" file, but even that is really just basic stuff that anybody can write in "Teaching C 101" as a final project for their first class in programming. Not to say that it's bug-free, of course - but at least we're not talking about rocket surgery here. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-13 20:07:23 +01:00
Fix add_files_to_cache() to take pathspec, not user specified list of files This separates the logic to limit the extent of change to the index by where you are (controlled by "prefix") and what you specify from the command line (controlled by "pathspec"). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-18 10:12:04 +01:00			`/* add */`
			`void add_files_to_cache(int verbose, const char prefix, const char *pathspec);`

git-diff: resurrect the traditional empty "diff --git" behaviour The warning message to suggest "Consider running git-status" from "git-diff" that we experimented with during the 1.5.3 cycle turns out to be a bad idea. It robbed cache-dirty information from people who valued it, while still asking users to run "update-index --refresh". It was hoped that the new behaviour would at least have some educational value, but not showing the cache-dirty paths like before meant that the user would not even know easily which paths were cache-dirty, and it made the need to refresh the index look like even more unnecessary chore. This commit reinstates the traditional behaviour, but with a twist. By default, the empty "diff --git" output is totally squelched out from "git diff" output. At the end of the command, it automatically runs "update-index --refresh" as needed, without even bothering the user. In other words, people who do not care about the cache-dirtyness do not even have to see the warning. The traditional behaviour to see the stat-dirty output and to bypassing the overhead of content comparison can be specified by setting the configuration variable diff.autorefreshindex to false. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-31 22:13:42 +02:00			`/* diff.c */`
			`extern int diff_auto_refresh_index;`

A new merge stragety 'subtree'. This merge strategy largely piggy-backs on git-merge-recursive. When merging trees A and B, if B corresponds to a subtree of A, B is first adjusted to match the tree structure of A, instead of reading the trees at the same level. This adjustment is also done to the common ancestor tree. If you are pulling updates from git-gui repository into git.git repository, the root level of the former corresponds to git-gui/ subdirectory of the latter. The tree object of git-gui's toplevel is wrapped in a fake tree object, whose sole entry has name 'git-gui' and records object name of the true tree, before being used by the 3-way merge code. If you are merging the other way, only the git-gui/ subtree of git.git is extracted and merged into git-gui's toplevel. The detection of corresponding subtree is done by comparing the pathnames and types in the toplevel of the tree. Heuristics galore! That's the git way ;-). Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-02-16 01:32:45 +01:00			`/* match-trees.c */`
			`void shift_tree(const unsigned char , const unsigned char , unsigned char *, int);`

War on whitespace: first, a bit of retreat. This introduces core.whitespace configuration variable that lets you specify the definition of "whitespace error". Currently there are two kinds of whitespace errors defined: * trailing-space: trailing whitespaces at the end of the line. * space-before-tab: a SP appears immediately before HT in the indent part of the line. You can specify the desired types of errors to be detected by listing their names (unique abbreviations are accepted) separated by comma. By default, these two errors are always detected, as that is the traditional behaviour. You can disable detection of a particular type of error by prefixing a '-' in front of the name of the error, like this: [core] whitespace = -trailing-space This patch teaches the code to output colored diff with DIFF_WHITESPACE color to highlight the detected whitespace errors to honor the new configuration. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-02 08:24:27 +01:00			`/*`
			`* whitespace rules.`
			`* used by both diff and apply`
			`*/`
			`#define WS_TRAILING_SPACE 01`
			`#define WS_SPACE_BEFORE_TAB 02`
git-diff: complain about >=8 consecutive spaces in initial indent This introduces a new whitespace error type, "indent-with-non-tab". The error is about starting a line with 8 or more SP, instead of indenting it with a HT. This is not enabled by default, as some projects employ an indenting policy to use only SPs and no HTs. The kernel folks and git contributors may want to enable this detection with: [core] whitespace = indent-with-non-tab Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-03 03:00:27 +02:00			`#define WS_INDENT_WITH_NON_TAB 04`
War on whitespace: first, a bit of retreat. This introduces core.whitespace configuration variable that lets you specify the definition of "whitespace error". Currently there are two kinds of whitespace errors defined: * trailing-space: trailing whitespaces at the end of the line. * space-before-tab: a SP appears immediately before HT in the indent part of the line. You can specify the desired types of errors to be detected by listing their names (unique abbreviations are accepted) separated by comma. By default, these two errors are always detected, as that is the traditional behaviour. You can disable detection of a particular type of error by prefixing a '-' in front of the name of the error, like this: [core] whitespace = -trailing-space This patch teaches the code to output colored diff with DIFF_WHITESPACE color to highlight the detected whitespace errors to honor the new configuration. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-02 08:24:27 +01:00			`#define WS_DEFAULT_RULE (WS_TRAILING_SPACE\|WS_SPACE_BEFORE_TAB)`
Use gitattributes to define per-path whitespace rule The `core.whitespace` configuration variable allows you to define what `diff` and `apply` should consider whitespace errors for all paths in the project (See gitlink:git-config[1]). This attribute gives you finer control per path. For example, if you have these in the .gitattributes: frotz whitespace nitfol -whitespace xyzzy whitespace=-trailing all types of whitespace problems known to git are noticed in path 'frotz' (i.e. diff shows them in diff.whitespace color, and apply warns about them), no whitespace problem is noticed in path 'nitfol', and the default types of whitespace problems except "trailing whitespace" are noticed for path 'xyzzy'. A project with mixed Python and C might want to have: .c whitespace .py whitespace=-indent-with-non-tab in its toplevel .gitattributes file. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-06 09:14:14 +01:00			`extern unsigned whitespace_rule_cfg;`
			`extern unsigned whitespace_rule(const char *);`
			`extern unsigned parse_whitespace_rule(const char *);`
Unify whitespace checking This commit unifies three separate places where whitespace checking was performed: - the whitespace checking previously done in builtin-apply.c is extracted into a function in ws.c - the equivalent logic in "git diff" is removed - the emit_line_with_ws() function is also removed because that also rechecks the whitespace, and its functionality is rolled into ws.c The new function is called check_and_emit_line() and it does two things: checks a line for whitespace errors and optionally emits it. The checking is based on lines of content rather than patch lines (in other words, the caller must strip the leading "+" or "-"); this was suggested by Junio on the mailing list to allow for a future extension to "git show" to display whitespace errors in blobs. At the same time we teach it to report all classes of whitespace errors found for a given line rather than reporting only the first found error. Signed-off-by: Wincent Colaiuta <win@wincent.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-13 14:32:29 +01:00			`extern unsigned check_and_emit_line(const char *line, int len, unsigned ws_rule,`
			`FILE stream, const char set,`
			`const char reset, const char ws);`
			`extern char *whitespace_error_string(unsigned ws);`
War on whitespace: first, a bit of retreat. This introduces core.whitespace configuration variable that lets you specify the definition of "whitespace error". Currently there are two kinds of whitespace errors defined: * trailing-space: trailing whitespaces at the end of the line. * space-before-tab: a SP appears immediately before HT in the indent part of the line. You can specify the desired types of errors to be detected by listing their names (unique abbreviations are accepted) separated by comma. By default, these two errors are always detected, as that is the traditional behaviour. You can disable detection of a particular type of error by prefixing a '-' in front of the name of the error, like this: [core] whitespace = -trailing-space This patch teaches the code to output colored diff with DIFF_WHITESPACE color to highlight the detected whitespace errors to honor the new configuration. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-02 08:24:27 +01:00
Export three helper functions from ls-files This exports three helper functions from ls-files. * pathspec_match() checks if a given path matches a set of pathspecs and optionally records which pathspec was used. This function used to be called "match()" but renamed to be a bit less vague. * report_path_error() takes a set of pathspecs and the record pathspec_match() above leaves, and gives error message. This was split out of the main function of ls-files. * overlay_tree_on_cache() takes a tree-ish (typically "HEAD") and overlays it on the current in-core index. By iterating over the resulting index, the caller can find out the paths in either the index or the HEAD. This function used to be called "overlay_tree()" but renamed to be a bit more descriptive. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-18 10:13:32 +01:00			`/* ls-files */`
			`int pathspec_match(const char *spec, char matched, const char *filename, int skiplen);`
			`int report_path_error(const char ps_matched, const char *pathspec, int prefix_offset);`
			`void overlay_tree_on_cache(const char tree_name, const char prefix);`

Initial revision of "git", the information manager from hell 2005-04-08 00:13:13 +02:00			`#endif /* CACHE_H */`