mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-14 13:13:01 +01:00

840 lines

20 KiB

C

libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`/*`
			`* This handles recursive filename detection with exclude`
			`* files, index knowledge etc..`
			`*`
			`* Copyright (C) Linus Torvalds, 2005-2006`
			`* Junio Hamano, 2005-2006`
			`*/`
			`#include "cache.h"`
			`#include "dir.h"`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`#include "refs.h"`
Optimize directory listing with pathspec limiter. The way things are set up, you can now pass a "pathspec" to the "read_directory()" function. If you pass NULL, it acts exactly like it used to do (read everything). If you pass a non-NULL pointer, it will simplify it into a "these are the prefixes without any special characters", and stop any readdir() early if the path in question doesn't match any of the prefixes. NOTE! This does not obviate the need for the caller to do the exact pathspec match later. It's a first-level filter on "read_directory()", but it does not do the full pathspec thing. Maybe it should. But in the meantime, builtin-add.c really does need to do first read_directory(dir, .., pathspec); if (pathspec) prune_directory(dir, pathspec, baselen); ie the "prune_directory()" part will do the exact pathspec pruning, while the "read_directory()" will use the pathspec just to do some quick high-level pruning of the directories it will recurse into. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-31 05:39:30 +02:00			`struct path_simplify {`
			`int len;`
			`const char *path;`
			`};`

			`static int read_directory_recursive(struct dir_struct *dir,`
			`const char *path, const char *base, int baselen,`
			`int check_only, const struct path_simplify *simplify);`
gitignore: lazily find dtype When we process "foo/" entries in gitignore files on a system that does not have d_type member in "struct dirent", the earlier implementation ran lstat(2) separately when matching with entries that came from the command line, in-tree .gitignore files, and $GIT_DIR/info/excludes file. This optimizes it by delaying the lstat(2) call until it becomes absolutely necessary. The initial idea for this change was by Jeff King, but I optimized it further to pass pointers around. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-01 05:23:25 +01:00			`static int get_dtype(struct dirent *de, const char *path);`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00			`int common_prefix(const char **pathspec)`
			`{`
			`const char *path, *slash, *next;`
			`int prefix;`

			`if (!pathspec)`
			`return 0;`

			`path = *pathspec;`
			`slash = strrchr(path, '/');`
			`if (!slash)`
			`return 0;`

			`prefix = slash - path + 1;`
			`while ((next = *++pathspec) != NULL) {`
			`int len = strlen(next);`
dir.c(common_prefix): Fix two bugs The function common_prefix() is used to find the common subdirectory of a couple of pathnames. When checking if the next pathname matches up with the prefix, it incorrectly checked the whole path, not just the prefix (including the slash). Thus, the expensive part of the loop was executed always. The other bug is more serious: if the first and the last pathname in the list have a longer common prefix than the common prefix for _all_ pathnames in the list, the longer one would be chosen. This bug was probably hidden by the fact that bash's wildcard expansion sorts the results, and the code just so happens to work with sorted input. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-23 10:21:25 +02:00			`if (len >= prefix && !memcmp(path, next, prefix))`
			`continue;`
			`len = prefix - 1;`
			`for (;;) {`
			`if (!len)`
			`return 0;`
			`if (next[--len] != '/')`
			`continue;`
			`if (memcmp(path, next, len+1))`
			`continue;`
			`prefix = len + 1;`
			`break;`
			`}`
			`}`
			`return prefix;`
			`}`
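The prefix computation above can be tried in isolation. What follows is a minimal sketch, not git's code: `common_dir_prefix` is a hypothetical name, it assumes a NULL-terminated array of NUL-terminated paths, and it shrinks the candidate prefix with `strncmp` so that it never reads past the end of a shorter path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-alone helper (not part of git): return the length
 * of the longest leading directory, including its trailing '/', that is
 * shared by every path in the NULL-terminated array. Returns 0 when
 * there is no common directory.
 */
static size_t common_dir_prefix(const char **paths)
{
	const char *first, *slash;
	size_t prefix, i;

	assert(paths != NULL);
	first = paths[0];
	if (!first)
		return 0;

	/* Start with the directory part of the first path. */
	slash = strrchr(first, '/');
	if (!slash)
		return 0;
	prefix = (size_t)(slash - first) + 1;

	for (i = 1; paths[i] && prefix; i++) {
		/* Shrink the prefix one directory at a time until it fits. */
		while (prefix && strncmp(first, paths[i], prefix) != 0) {
			prefix--;	/* drop the trailing '/' */
			while (prefix && first[prefix - 1] != '/')
				prefix--;
		}
	}
	return prefix;
}
```

Because every path is compared against the current prefix, the result does not depend on the order of the paths, which is the property the common_prefix() fix quoted above is concerned with.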

Optimize match_pathspec() to avoid fnmatch() "git add *" is actually fundamentally different from "git add .", and yeah, you should generally use the latter. The reason? The argument list is actually something different from what you think it is. For git, it's a "pathspec", so what actually happens is that in both cases, it will really traverse the whole tree, and then match every file it finds against the pathspec. So think of the arguments not as a file list, but as a random bunch of patterns to match against the files you have! Which is why the cost is actually approximately O(n*m), where "n" is the size of the working tree, and "m" is the number of pathspecs. So the reason "git add ." is fast is actually that "m" in that case is just 1 (just one trivial pattern), and then "git add *" is slow because "m" is large (lots of complicated patterns). In both cases, 'n' is the same (== the whole set of files in your working tree). Anyway, here's a trivial patch that doesn't change this fundamental fact, but that avoids doing anything expensive until we've done some cheap initial tests. It may or may not help your test-case, but it's pretty simple and it matches the other git optimizations in this area (ie "conceptually handle the general case, but optimize the simple cases where we can exit early") Notice how this patch doesn't actually change the fundamental O(n^2) behaviour, but it makes it much cheaper by generally avoiding the expensive 'fnmatch' and 'strlen/strncmp' when they are obviously not needed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-19 23:22:38 +02:00			`static inline int special_char(unsigned char c1)`
			`{`
			`return !c1 || c1 == '*' || c1 == '[' || c1 == '?';`
			`}`

match_pathspec() -- return how well the spec matched This updates the return value from match_pathspec() so that the caller can tell cases between exact match, leading pathname match (i.e. file "foo/bar" matches a pathspec "foo"), or filename glob match. This can be used to prevent "rm dir" from removing "dir/file" without explicitly asking for recursive behaviour with -r flag, for example. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-25 12:09:52 +01:00			`/*`
			`* Does 'match' match the given name?`
			`* A match is found if`
			`*`
			`* (1) the 'match' string is leading directory of 'name', or`
			`* (2) the 'match' string is a wildcard and matches 'name', or`
			`* (3) the 'match' string is exactly the same as 'name'.`
			`*`
			`* and the return value tells which case it was.`
			`*`
			`* It returns 0 when there is no match.`
			`*/`
			`static int match_one(const char *match, const char *name, int namelen)`
			`{`
			`int matchlen;`

			`/* If the match was just the prefix, we matched */`
			`if (!*match)`
			`return MATCHED_RECURSIVELY;`

			`for (;;) {`
			`unsigned char c1 = *match;`
			`unsigned char c2 = *name;`
			`if (special_char(c1))`
			`break;`
			`if (c1 != c2)`
			`return 0;`
			`match++;`
			`name++;`
			`namelen--;`
			`}`


			`/*`
			`* If we don't match the matchstring exactly,`
			`* we need to match by fnmatch`
			`*/`
			`matchlen = strlen(match);`
			`if (strncmp(match, name, matchlen))`
			`return !fnmatch(match, name, 0) ? MATCHED_FNMATCH : 0;`

git clean: Don't automatically remove directories when run within subdirectory When git clean is run from a subdirectory it should follow the normal policy and only remove directories if they are passed in as a pathspec, or -d is specified. The fix is to send len which could be shorter than ent->len because we have stripped the trailing '/' that read_directory adds. Additionally, match_one() was modified to allow a name[] that is not NUL terminated. This allows us to check if the name matched the pathspec exactly instead of recursively. Signed-off-by: Shawn Bohrer <shawn.bohrer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-15 05:14:09 +02:00			`if (namelen == matchlen)`
			`return MATCHED_EXACTLY;`
			`if (match[matchlen-1] == '/' || name[matchlen] == '/')`
			`return MATCHED_RECURSIVELY;`
			`return 0;`
			`}`
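The three match kinds listed in the comment before match_one() can be exercised with a small stand-alone classifier. This is a sketch under stated assumptions rather than git's implementation: `classify_match` is a hypothetical name, both arguments are NUL-terminated (match_one() itself accepts a name that is not), and the numeric values of the `MATCHED_*` constants are assumed here; the code above only relies on them being non-zero and ordered. `fnmatch()` is the POSIX function from `<fnmatch.h>`.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* Assumed values; only their relative order matters to callers. */
enum {
	NO_MATCH = 0,
	MATCHED_RECURSIVELY = 1,	/* 'match' is a leading directory */
	MATCHED_FNMATCH = 2,		/* 'match' is a wildcard that matches */
	MATCHED_EXACTLY = 3		/* 'match' equals 'name' */
};

/* Hypothetical helper: classify how 'match' relates to 'name'. */
static int classify_match(const char *match, const char *name)
{
	size_t matchlen, namelen;

	assert(match != NULL && name != NULL);
	matchlen = strlen(match);
	namelen = strlen(name);

	/* An empty pathspec matches everything below it. */
	if (!matchlen)
		return MATCHED_RECURSIVELY;

	/* Not a literal prefix: fall back to glob matching. */
	if (strncmp(match, name, matchlen) != 0)
		return fnmatch(match, name, 0) == 0 ? MATCHED_FNMATCH : NO_MATCH;

	if (namelen == matchlen)
		return MATCHED_EXACTLY;

	/* A prefix only counts if it ends at a directory boundary. */
	if (match[matchlen - 1] == '/' || name[matchlen] == '/')
		return MATCHED_RECURSIVELY;
	return NO_MATCH;
}
```

The last check is why a pathspec `foo` matches `foo/bar` but not `foobar`.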

			`/*`
			`* Given a name and a list of pathspecs, see if the name matches`
			`* any of the pathspecs. The caller also wants to know whether`
			`* every pathspec matched at least one of the names it calls this`
			`* function with (otherwise the user could have mistyped the`
			`* unmatched pathspec), so a mark is left in the seen[] array for`
			`* each pathspec element that actually matched anything.`
			`*/`
			`int match_pathspec(const char **pathspec, const char *name, int namelen, int prefix, char *seen)`
			`{`
			`int retval;`
			`const char *match;`

			`name += prefix;`
			`namelen -= prefix;`

			`for (retval = 0; (match = *pathspec++) != NULL; seen++) {`
			`int how;`
			`if (retval && *seen == MATCHED_EXACTLY)`
			`continue;`
			`match += prefix;`
			`how = match_one(match, name, namelen);`
			`if (how) {`
			`if (retval < how)`
			`retval = how;`
			`if (*seen < how)`
			`*seen = how;`
			`}`
			`}`
			`return retval;`
			`}`
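The purpose of the seen[] array, catching pathspecs that never matched any name, can be illustrated separately. The sketch below is deliberately simplified and is not git's matcher: `mark_matches` and `first_unmatched` are hypothetical helpers, matching is reduced to `strcmp()` plus POSIX `fnmatch()`, and seen[] holds a plain 0/1 flag instead of the match kind.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: set seen[i] for every pathspec that matches
 * 'name'. Returns 1 if at least one pathspec matched, 0 otherwise.
 */
static int mark_matches(const char **pathspec, const char *name, char *seen)
{
	int matched = 0;
	size_t i;

	assert(pathspec != NULL && name != NULL && seen != NULL);
	for (i = 0; pathspec[i]; i++) {
		if (!strcmp(pathspec[i], name) ||
		    fnmatch(pathspec[i], name, 0) == 0) {
			seen[i] = 1;
			matched = 1;
		}
	}
	return matched;
}

/*
 * Hypothetical helper: after all names have been fed through
 * mark_matches(), return the index of the first pathspec that never
 * matched, or -1 if every pathspec matched something.
 */
static int first_unmatched(const char **pathspec, const char *seen)
{
	int i;

	for (i = 0; pathspec[i]; i++)
		if (!seen[i])
			return i;
	return -1;
}
```

A caller would run every candidate name through `mark_matches()` and then report the pathspec at the index returned by `first_unmatched()` as a likely typo.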

Speedup scanning for excluded files. Try to avoid a lot of work scanning for excluded files, by caching some more information when setting up the exclusion data structure. Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and reduces the amount of instructions executed (as measured by valgrind) by a factor of 2. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-28 21:27:13 +01:00			`static int no_wildcard(const char *string)`
			`{`
			`return string[strcspn(string, "*?[{")] == '\0';`
			`}`
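The `strcspn()` idiom in no_wildcard() returns the length of the initial segment that contains none of the listed characters, so the indexed character is the terminating NUL exactly when no wildcard character occurs. A minimal stand-alone copy for experimentation (the name `has_no_wildcard` is hypothetical, chosen to avoid clashing with the function above):

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 when 'string' contains none of the glob characters
 * '*', '?', '[' or '{', and 0 otherwise.
 */
static int has_no_wildcard(const char *string)
{
	assert(string != NULL);
	/* strcspn() stops at the first listed character or at the NUL. */
	return string[strcspn(string, "*?[{")] == '\0';
}
```

For example, `has_no_wildcard("Makefile")` is 1 and `has_no_wildcard("*.o")` is 0.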

			`void add_exclude(const char *string, const char *base,`
			`int baselen, struct exclude_list *which)`
			`{`
gitignore(5): Allow "foo/" in ignore list to match directory "foo" A pattern "foo/" in the exclude list did not match directory "foo", but a pattern "foo" did. This attempts to extend the exclude mechanism so that it would while not matching a regular file or a symbolic link "foo". In order to differentiate a directory and non directory, this passes down the type of path being checked to excluded() function. A downside is that the recursive directory walk may need to run lstat(2) more often on systems whose "struct dirent" do not give the type of the entry; earlier it did not have to do so for an excluded path, but we now need to figure out if a path is a directory before deciding to exclude it. This is especially bad because an idea similar to the earlier CE_UPTODATE optimization to reduce number of lstat(2) calls would by definition not apply to the codepaths involved, as (1) directories will not be registered in the index, and (2) excluded paths will not be in the index anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-31 10:17:48 +01:00			`struct exclude *x;`
			`size_t len;`
			`int to_exclude = 1;`
			`int flags = 0;`
			`if (*string == '!') {`
			`to_exclude = 0;`
			`string++;`
			`}`
			`len = strlen(string);`
			`if (len && string[len - 1] == '/') {`
			`char *s;`
			`x = xmalloc(sizeof(*x) + len);`
			`s = (char*)(x+1);`
			`memcpy(s, string, len - 1);`
			`s[len - 1] = '\0';`
			`string = s;`
			`x->pattern = s;`
			`flags = EXC_FLAG_MUSTBEDIR;`
			`} else {`
			`x = xmalloc(sizeof(*x));`
			`x->pattern = string;`
			`}`
			`x->to_exclude = to_exclude;`
Speedup scanning for excluded files. Try to avoid a lot of work scanning for excluded files, by caching some more information when setting up the exclusion data structure. Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and reduces the amount of instructions executed (as measured by valgrind) by a factor of 2. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-28 21:27:13 +01:00			`x->patternlen = strlen(string);`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`x->base = base;`
			`x->baselen = baselen;`
gitignore(5): Allow "foo/" in ignore list to match directory "foo" A pattern "foo/" in the exclude list did not match directory "foo", but a pattern "foo" did. This attempts to extend the exclude mechanism so that it would while not matching a regular file or a symbolic link "foo". In order to differentiate a directory and non directory, this passes down the type of path being checked to excluded() function. A downside is that the recursive directory walk may need to run lstat(2) more often on systems whose "struct dirent" do not give the type of the entry; earlier it did not have to do so for an excluded path, but we now need to figure out if a path is a directory before deciding to exclude it. This is especially bad because an idea similar to the earlier CE_UPTODATE optimization to reduce number of lstat(2) calls would by definition not apply to the codepaths involved, as (1) directories will not be registered in the index, and (2) excluded paths will not be in the index anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-31 10:17:48 +01:00			`x->flags = flags;`
Speedup scanning for excluded files. Try to avoid a lot of work scanning for excluded files, by caching some more information when setting up the exclusion data structure. Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and reduces the amount of instructions executed (as measured by valgrind) by a factor of 2. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-28 21:27:13 +01:00			`if (!strchr(string, '/'))`
			`x->flags |= EXC_FLAG_NODIR;`
			`if (no_wildcard(string))`
			`x->flags |= EXC_FLAG_NOWILDCARD;`
			`if (*string == '*' && no_wildcard(string+1))`
			`x->flags |= EXC_FLAG_ENDSWITH;`
dir.c: minor clean-up Replace handcrafted reallocation with ALLOC_GROW(). Reindent "file_exists()" helper function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-29 10:11:46 +01:00			`ALLOC_GROW(which->excludes, which->nr + 1, which->alloc);`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`which->excludes[which->nr++] = x;`
			`}`
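
The flag computation in `add_exclude()` above can be illustrated in isolation. What follows is a minimal sketch, not the code from this file: the `SK_*` constants, `sk_no_wildcard()` and `sk_classify()` are hypothetical names invented for the example, the flag values are arbitrary, and the set of characters treated as wildcards is an assumption about what `no_wildcard()` checks.

```c
#include <string.h>

/* Hypothetical flag values for this sketch; the real ones are defined elsewhere. */
#define SK_NODIR      1	/* pattern has no '/', so match the basename only */
#define SK_NOWILDCARD 2	/* pattern is a literal string */
#define SK_ENDSWITH   4	/* pattern is "*" followed by a literal suffix */
#define SK_MUSTBEDIR  8	/* pattern ended in '/', so only directories match */

/* Assumed stand-in for no_wildcard(): true when no glob metacharacter appears. */
static int sk_no_wildcard(const char *s)
{
	return s[strcspn(s, "*?[{\\")] == '\0';
}

/* Return the flags for one pattern, or -1 if it does not fit the buffer. */
static int sk_classify(const char *pattern)
{
	char buf[256];
	size_t len = strlen(pattern);
	int flags = 0;

	if (len >= sizeof(buf))
		return -1;
	memcpy(buf, pattern, len + 1);

	/* A trailing slash is stripped and remembered as "must be a directory". */
	if (len && buf[len - 1] == '/') {
		buf[len - 1] = '\0';
		flags |= SK_MUSTBEDIR;
	}
	if (!strchr(buf, '/'))
		flags |= SK_NODIR;
	if (sk_no_wildcard(buf))
		flags |= SK_NOWILDCARD;
	if (buf[0] == '*' && sk_no_wildcard(buf + 1))
		flags |= SK_ENDSWITH;
	return flags;
}
```

With these definitions, `sk_classify("foo/")` yields `SK_MUSTBEDIR | SK_NODIR | SK_NOWILDCARD`, and `sk_classify("*.o")` yields `SK_NODIR | SK_ENDSWITH`. Computing the flags once per pattern is what lets the matcher pick a cheap string comparison where a full `fnmatch()` call is not needed.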

			`static int add_excludes_from_file_1(const char *fname,`
			`const char *base,`
			`int baselen,`
per-directory-exclude: lazily read .gitignore files Operations that walk directories or trees, which potentially need to consult the .gitignore files, used to always try to open the .gitignore file every time they entered a new directory, even when they ended up not needing to call excluded() function to see if a path in the directory is ignored. This was done by push/pop exclude_per_directory() functions that managed the data in a stack. This changes the directory walking API to remove the need to call these two functions. Instead, the directory walk data structure caches the data used by excluded() function the last time, and lazily reuses it as much as possible. Among the data the last check used, the ones from deeper directories that the path we are checking is outside are discarded, data from the common leading directories are reused, and then the directories between the common directory and the directory the path being checked is in are checked for .gitignore file. This is very similar to the way gitattributes are handled. This API change also fixes "ls-files -c -i", which called excluded() without setting up the gitignore data via the old push/pop functions. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-29 11:17:44 +01:00			`char **buf_p,`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`struct exclude_list *which)`
			`{`
Use fstat instead of fseek Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-28 01:55:46 +02:00			`struct stat st;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`int fd, i;`
Cast 64 bit off_t to 32 bit size_t Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4. This implies that we are able to access and work on files whose maximum length is around 2^63-1 bytes, but we can only malloc or mmap somewhat less than 2^32-1 bytes of memory. On such a system an implicit conversion of off_t to size_t can cause the size_t to wrap, resulting in unexpected and exciting behavior. Right now we are working around all gcc warnings generated by the -Wshorten-64-to-32 option by passing the off_t through xsize_t(). In the future we should make xsize_t on such problematic platforms detect the wrapping and die if such a file is accessed. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:37 +01:00			`size_t size;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`char *buf, *entry;`

			`fd = open(fname, O_RDONLY);`
Use fstat instead of fseek Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-28 01:55:46 +02:00			`if (fd < 0 || fstat(fd, &st) < 0)`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`goto err;`
Cast 64 bit off_t to 32 bit size_t Some systems have sizeof(off_t) == 8 while sizeof(size_t) == 4. This implies that we are able to access and work on files whose maximum length is around 2^63-1 bytes, but we can only malloc or mmap somewhat less than 2^32-1 bytes of memory. On such a system an implicit conversion of off_t to size_t can cause the size_t to wrap, resulting in unexpected and exciting behavior. Right now we are working around all gcc warnings generated by the -Wshorten-64-to-32 option by passing the off_t through xsize_t(). In the future we should make xsize_t on such problematic platforms detect the wrapping and die if such a file is accessed. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-07 02:44:37 +01:00			`size = xsize_t(st.st_size);`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`if (size == 0) {`
			`close(fd);`
			`return 0;`
			`}`
			`buf = xmalloc(size+1);`
short i/o: fix calls to read to use xread or read_in_full We have a number of badly checked read() calls. Often we are expecting read() to read exactly the size we requested or fail, this fails to handle interrupts or short reads. Add a read_in_full() providing those semantics. Otherwise we at a minimum need to check for EINTR and EAGAIN, where this is appropriate use xread(). Signed-off-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-01-08 16:58:08 +01:00			`if (read_in_full(fd, buf, size) != size)`
Fix a memory leak Signed-off-by: Li Hong <leehong@pku.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-16 05:53:26 +01:00			`{`
			`free(buf);`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`goto err;`
Fix a memory leak Signed-off-by: Li Hong <leehong@pku.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-16 05:53:26 +01:00			`}`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`close(fd);`

per-directory-exclude: lazily read .gitignore files Operations that walk directories or trees, which potentially need to consult the .gitignore files, used to always try to open the .gitignore file every time they entered a new directory, even when they ended up not needing to call excluded() function to see if a path in the directory is ignored. This was done by push/pop exclude_per_directory() functions that managed the data in a stack. This changes the directory walking API to remove the need to call these two functions. Instead, the directory walk data structure caches the data used by excluded() function the last time, and lazily reuses it as much as possible. Among the data the last check used, the ones from deeper directories that the path we are checking is outside are discarded, data from the common leading directories are reused, and then the directories between the common directory and the directory the path being checked is in are checked for .gitignore file. This is very similar to the way gitattributes are handled. This API change also fixes "ls-files -c -i", which called excluded() without setting up the gitignore data via the old push/pop functions. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-29 11:17:44 +01:00			`if (buf_p)`
			`*buf_p = buf;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`buf[size++] = '\n';`
			`entry = buf;`
			`for (i = 0; i < size; i++) {`
			`if (buf[i] == '\n') {`
			`if (entry != buf + i && entry[0] != '#') {`
			`buf[i - (i && buf[i-1] == '\r')] = 0;`
			`add_exclude(entry, base, baselen, which);`
			`}`
			`entry = buf + i + 1;`
			`}`
			`}`
			`return 0;`

			`err:`
			`if (0 <= fd)`
			`close(fd);`
			`return -1;`
			`}`
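
The line-splitting loop in `add_excludes_from_file_1()` above can be tried on its own. This is a minimal sketch under stated assumptions: `sk_split_entries()` is a hypothetical helper that collects entries into a caller-supplied array where the original calls `add_exclude()`, and the caller must provide one spare byte past `size`, as the original does by allocating `size+1`.

```c
#include <stddef.h>

/*
 * Split buf (size bytes, with at least one spare byte after them) into
 * NUL-terminated entries in place. Blank lines and lines starting with
 * '#' are skipped, and a '\r' before the newline is dropped.
 * Returns the number of entries stored in out (at most max_out).
 */
static int sk_split_entries(char *buf, size_t size, char **out, int max_out)
{
	char *entry = buf;
	size_t i;
	int nr = 0;

	/* Append a newline so the last line needs no special case. */
	buf[size++] = '\n';
	for (i = 0; i < size; i++) {
		if (buf[i] != '\n')
			continue;
		if (entry != buf + i && entry[0] != '#') {
			/* Terminate at the '\r' if the line ended in "\r\n". */
			buf[i - (i && buf[i - 1] == '\r')] = '\0';
			if (nr < max_out)
				out[nr++] = entry;
		}
		entry = buf + i + 1;
	}
	return nr;
}
```

The entries point into `buf` without being copied, which is why the original hands the buffer back through `buf_p` so the caller can free it once the patterns are no longer needed.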

			`void add_excludes_from_file(struct dir_struct *dir, const char *fname)`
			`{`
per-directory-exclude: lazily read .gitignore files Operations that walk directories or trees, which potentially need to consult the .gitignore files, used to always try to open the .gitignore file every time they entered a new directory, even when they ended up not needing to call excluded() function to see if a path in the directory is ignored. This was done by push/pop exclude_per_directory() functions that managed the data in a stack. This changes the directory walking API to remove the need to call these two functions. Instead, the directory walk data structure caches the data used by excluded() function the last time, and lazily reuses it as much as possible. Among the data the last check used, the ones from deeper directories that the path we are checking is outside are discarded, data from the common leading directories are reused, and then the directories between the common directory and the directory the path being checked is in are checked for .gitignore file. This is very similar to the way gitattributes are handled. This API change also fixes "ls-files -c -i", which called excluded() without setting up the gitignore data via the old push/pop functions. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-29 11:17:44 +01:00			`if (add_excludes_from_file_1(fname, "", 0, NULL,`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`&dir->exclude_list[EXC_FILE]) < 0)`
			`die("cannot use %s as an exclude file", fname);`
			`}`

per-directory-exclude: lazily read .gitignore files Operations that walk directories or trees, which potentially need to consult the .gitignore files, used to always try to open the .gitignore file every time they entered a new directory, even when they ended up not needing to call excluded() function to see if a path in the directory is ignored. This was done by push/pop exclude_per_directory() functions that managed the data in a stack. This changes the directory walking API to remove the need to call these two functions. Instead, the directory walk data structure caches the data used by excluded() function the last time, and lazily reuses it as much as possible. Among the data the last check used, the ones from deeper directories that the path we are checking is outside are discarded, data from the common leading directories are reused, and then the directories between the common directory and the directory the path being checked is in are checked for .gitignore file. This is very similar to the way gitattributes are handled. This API change also fixes "ls-files -c -i", which called excluded() without setting up the gitignore data via the old push/pop functions. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-29 11:17:44 +01:00			`static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`{`
per-directory-exclude: lazily read .gitignore files Operations that walk directories or trees, which potentially need to consult the .gitignore files, used to always try to open the .gitignore file every time they entered a new directory, even when they ended up not needing to call excluded() function to see if a path in the directory is ignored. This was done by push/pop exclude_per_directory() functions that managed the data in a stack. This changes the directory walking API to remove the need to call these two functions. Instead, the directory walk data structure caches the data used by excluded() function the last time, and lazily reuses it as much as possible. Among the data the last check used, the ones from deeper directories that the path we are checking is outside are discarded, data from the common leading directories are reused, and then the directories between the common directory and the directory the path being checked is in are checked for .gitignore file. This is very similar to the way gitattributes are handled. This API change also fixes "ls-files -c -i", which called excluded() without setting up the gitignore data via the old push/pop functions. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-29 11:17:44 +01:00			`struct exclude_list *el;`
			`struct exclude_stack *stk = NULL;`
			`int current;`

			`if ((!dir->exclude_per_dir) ||`
			`(baselen + strlen(dir->exclude_per_dir) >= PATH_MAX))`
			`return; /* too long a path -- ignore */`

			`/* Pop the ones that are not the prefix of the path being checked. */`
			`el = &dir->exclude_list[EXC_DIRS];`
			`while ((stk = dir->exclude_stack) != NULL) {`
			`if (stk->baselen <= baselen &&`
			`!strncmp(dir->basebuf, base, stk->baselen))`
			`break;`
			`dir->exclude_stack = stk->prev;`
			`while (stk->exclude_ix < el->nr)`
			`free(el->excludes[--el->nr]);`
			`free(stk->filebuf);`
			`free(stk);`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`

per-directory-exclude: lazily read .gitignore files Operations that walk directories or trees, which potentially need to consult the .gitignore files, used to always try to open the .gitignore file every time they entered a new directory, even when they ended up not needing to call excluded() function to see if a path in the directory is ignored. This was done by push/pop exclude_per_directory() functions that managed the data in a stack. This changes the directory walking API to remove the need to call these two functions. Instead, the directory walk data structure caches the data used by excluded() function the last time, and lazily reuses it as much as possible. Among the data the last check used, the ones from deeper directories that the path we are checking is outside are discarded, data from the common leading directories are reused, and then the directories between the common directory and the directory the path being checked is in are checked for .gitignore file. This is very similar to the way gitattributes are handled. This API change also fixes "ls-files -c -i", which called excluded() without setting up the gitignore data via the old push/pop functions. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-29 11:17:44 +01:00			`/* Read from the parent directories and push them down. */`
			`current = stk ? stk->baselen : -1;`
			`while (current < baselen) {`
			`struct exclude_stack *stk = xcalloc(1, sizeof(*stk));`
			`const char *cp;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00
per-directory-exclude: lazily read .gitignore files Operations that walk directories or trees, which potentially need to consult the .gitignore files, used to always try to open the .gitignore file every time they entered a new directory, even when they ended up not needing to call excluded() function to see if a path in the directory is ignored. This was done by push/pop exclude_per_directory() functions that managed the data in a stack. This changes the directory walking API to remove the need to call these two functions. Instead, the directory walk data structure caches the data used by excluded() function the last time, and lazily reuses it as much as possible. Among the data the last check used, the ones from deeper directories that the path we are checking is outside are discarded, data from the common leading directories are reused, and then the directories between the common directory and the directory the path being checked is in are checked for .gitignore file. This is very similar to the way gitattributes are handled. This API change also fixes "ls-files -c -i", which called excluded() without setting up the gitignore data via the old push/pop functions. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-29 11:17:44 +01:00			`if (current < 0) {`
			`cp = base;`
			`current = 0;`
			`}`
			`else {`
			`cp = strchr(base + current + 1, '/');`
			`if (!cp)`
			`die("oops in prep_exclude");`
			`cp++;`
			`}`
			`stk->prev = dir->exclude_stack;`
			`stk->baselen = cp - base;`
			`stk->exclude_ix = el->nr;`
			`memcpy(dir->basebuf + current, base + current,`
			`stk->baselen - current);`
			`strcpy(dir->basebuf + stk->baselen, dir->exclude_per_dir);`
			`add_excludes_from_file_1(dir->basebuf,`
			`dir->basebuf, stk->baselen,`
			`&stk->filebuf, el);`
			`dir->exclude_stack = stk;`
			`current = stk->baselen;`
			`}`
			`dir->basebuf[baselen] = '\0';`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`

			`/* Scan the list and let the last match determine the fate.`
			`* Return 1 for exclude, 0 for include and -1 for undecided.`
			`*/`
			`static int excluded_1(const char *pathname,`
gitignore: lazily find dtype When we process "foo/" entries in gitignore files on a system that does not have d_type member in "struct dirent", the earlier implementation ran lstat(2) separately when matching with entries that came from the command line, in-tree .gitignore files, and $GIT_DIR/info/excludes file. This optimizes it by delaying the lstat(2) call until it becomes absolutely necessary. The initial idea for this change was by Jeff King, but I optimized it further to pass pointers to around. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-01 05:23:25 +01:00			`int pathlen, const char *basename, int *dtype,`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`struct exclude_list *el)`
			`{`
			`int i;`

			`if (el->nr) {`
			`for (i = el->nr - 1; 0 <= i; i--) {`
			`struct exclude *x = el->excludes[i];`
			`const char *exclude = x->pattern;`
Speedup scanning for excluded files. Try to avoid a lot of work scanning for excluded files, by caching some more information when setting up the exclusion data structure. Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and reduces the amount of instructions executed (as measured by valgrind) by a factor of 2. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-28 21:27:13 +01:00			`int to_exclude = x->to_exclude;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00
gitignore: lazily find dtype When we process "foo/" entries in gitignore files on a system that does not have d_type member in "struct dirent", the earlier implementation ran lstat(2) separately when matching with entries that came from the command line, in-tree .gitignore files, and $GIT_DIR/info/excludes file. This optimizes it by delaying the lstat(2) call until it becomes absolutely necessary. The initial idea for this change was by Jeff King, but I optimized it further to pass pointers to around. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-01 05:23:25 +01:00			`if (x->flags & EXC_FLAG_MUSTBEDIR) {`
			`if (*dtype == DT_UNKNOWN)`
			`*dtype = get_dtype(NULL, pathname);`
			`if (*dtype != DT_DIR)`
			`continue;`
			`}`
gitignore(5): Allow "foo/" in ignore list to match directory "foo" A pattern "foo/" in the exclude list did not match directory "foo", but a pattern "foo" did. This attempts to extend the exclude mechanism so that it would while not matching a regular file or a symbolic link "foo". In order to differentiate a directory and non directory, this passes down the type of path being checked to excluded() function. A downside is that the recursive directory walk may need to run lstat(2) more often on systems whose "struct dirent" do not give the type of the entry; earlier it did not have to do so for an excluded path, but we now need to figure out if a path is a directory before deciding to exclude it. This is especially bad because an idea similar to the earlier CE_UPTODATE optimization to reduce number of lstat(2) calls would by definition not apply to the codepaths involved, as (1) directories will not be registered in the index, and (2) excluded paths will not be in the index anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-31 10:17:48 +01:00
Speedup scanning for excluded files. Try to avoid a lot of work scanning for excluded files, by caching some more information when setting up the exclusion data structure. Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and reduces the amount of instructions executed (as measured by valgrind) by a factor of 2. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-28 21:27:13 +01:00			`if (x->flags & EXC_FLAG_NODIR) {`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`/* match basename */`
Speedup scanning for excluded files. Try to avoid a lot of work scanning for excluded files, by caching some more information when setting up the exclusion data structure. Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and reduces the amount of instructions executed (as measured by valgrind) by a factor of 2. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-28 21:27:13 +01:00			`if (x->flags & EXC_FLAG_NOWILDCARD) {`
			`if (!strcmp(exclude, basename))`
			`return to_exclude;`
			`} else if (x->flags & EXC_FLAG_ENDSWITH) {`
			`if (x->patternlen - 1 <= pathlen &&`
			`!strcmp(exclude + 1, pathname + pathlen - x->patternlen + 1))`
			`return to_exclude;`
			`} else {`
			`if (fnmatch(exclude, basename, 0) == 0)`
			`return to_exclude;`
			`}`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`
			`else {`
			`/* match with FNM_PATHNAME:`
			`* exclude has base (baselen long) implicitly`
			`* in front of it.`
			`*/`
			`int baselen = x->baselen;`
			`if (*exclude == '/')`
			`exclude++;`

			`if (pathlen < baselen ||`
			`(baselen && pathname[baselen-1] != '/') ||`
			`strncmp(pathname, x->base, baselen))`
			`continue;`

Speedup scanning for excluded files. Try to avoid a lot of work scanning for excluded files, by caching some more information when setting up the exclusion data structure. Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and reduces the amount of instructions executed (as measured by valgrind) by a factor of 2. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-28 21:27:13 +01:00			`if (x->flags & EXC_FLAG_NOWILDCARD) {`
			`if (!strcmp(exclude, pathname + baselen))`
			`return to_exclude;`
			`} else {`
			`if (fnmatch(exclude, pathname+baselen,`
			`FNM_PATHNAME) == 0)`
			`return to_exclude;`
			`}`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`
			`}`
			`}`
			`return -1; /* undecided */`
			`}`
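The basename branch above chooses one of three comparisons based on flags computed once per pattern. A minimal standalone sketch of that dispatch follows. The names `match_basename`, `PAT_NOWILDCARD`, and `PAT_ENDSWITH` are hypothetical and exist only for illustration; they are not git's API. The suffix case assumes a pattern of the form `*literal` and compares against the basename rather than the full path.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/* Hypothetical flags, standing in for EXC_FLAG_NOWILDCARD and EXC_FLAG_ENDSWITH. */
enum { PAT_NOWILDCARD = 1, PAT_ENDSWITH = 2 };

/* Returns 1 when basename matches pattern, 0 otherwise. */
int match_basename(const char *pattern, int flags, const char *basename)
{
	if (flags & PAT_NOWILDCARD)
		/* no glob characters: a plain string compare is enough */
		return !strcmp(pattern, basename);

	if (flags & PAT_ENDSWITH) {
		size_t suffixlen, namelen = strlen(basename);

		/* assumed shape: "*" followed by a literal suffix */
		assert(pattern[0] == '*');
		suffixlen = strlen(pattern) - 1;
		return suffixlen <= namelen &&
		       !strcmp(pattern + 1, basename + namelen - suffixlen);
	}

	/* general case: fall back to the full glob matcher */
	return fnmatch(pattern, basename, 0) == 0;
}
```

The two fast paths avoid calling `fnmatch()` for the common cases of a literal name and a `*.ext` style pattern, which is the kind of saving the "Speedup scanning for excluded files" change describes.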

gitignore: lazily find dtype When we process "foo/" entries in gitignore files on a system that does not have d_type member in "struct dirent", the earlier implementation ran lstat(2) separately when matching with entries that came from the command line, in-tree .gitignore files, and $GIT_DIR/info/excludes file. This optimizes it by delaying the lstat(2) call until it becomes absolutely necessary. The initial idea for this change was by Jeff King, but I optimized it further to pass pointers to around. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-01 05:23:25 +01:00			`int excluded(struct dir_struct *dir, const char *pathname, int *dtype_p)`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`{`
			`int pathlen = strlen(pathname);`
			`int st;`
Speedup scanning for excluded files. Try to avoid a lot of work scanning for excluded files, by caching some more information when setting up the exclusion data structure. Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and reduces the amount of instructions executed (as measured by valgrind) by a factor of 2. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-28 21:27:13 +01:00			`const char *basename = strrchr(pathname, '/');`
			`basename = (basename) ? basename+1 : pathname;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00
per-directory-exclude: lazily read .gitignore files Operations that walk directories or trees, which potentially need to consult the .gitignore files, used to always try to open the .gitignore file every time they entered a new directory, even when they ended up not needing to call excluded() function to see if a path in the directory is ignored. This was done by push/pop exclude_per_directory() functions that managed the data in a stack. This changes the directory walking API to remove the need to call these two functions. Instead, the directory walk data structure caches the data used by excluded() function the last time, and lazily reuses it as much as possible. Among the data the last check used, the ones from deeper directories that the path we are checking is outside are discarded, data from the common leading directories are reused, and then the directories between the common directory and the directory the path being checked is in are checked for .gitignore file. This is very similar to the way gitattributes are handled. This API change also fixes "ls-files -c -i", which called excluded() without setting up the gitignore data via the old push/pop functions. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-29 11:17:44 +01:00			`prep_exclude(dir, pathname, basename-pathname);`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`for (st = EXC_CMDL; st <= EXC_FILE; st++) {`
gitignore(5): Allow "foo/" in ignore list to match directory "foo" A pattern "foo/" in the exclude list did not match directory "foo", but a pattern "foo" did. This attempts to extend the exclude mechanism so that it would while not matching a regular file or a symbolic link "foo". In order to differentiate a directory and non directory, this passes down the type of path being checked to excluded() function. A downside is that the recursive directory walk may need to run lstat(2) more often on systems whose "struct dirent" do not give the type of the entry; earlier it did not have to do so for an excluded path, but we now need to figure out if a path is a directory before deciding to exclude it. This is especially bad because an idea similar to the earlier CE_UPTODATE optimization to reduce number of lstat(2) calls would by definition not apply to the codepaths involved, as (1) directories will not be registered in the index, and (2) excluded paths will not be in the index anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-31 10:17:48 +01:00			`switch (excluded_1(pathname, pathlen, basename,`
gitignore: lazily find dtype When we process "foo/" entries in gitignore files on a system that does not have d_type member in "struct dirent", the earlier implementation ran lstat(2) separately when matching with entries that came from the command line, in-tree .gitignore files, and $GIT_DIR/info/excludes file. This optimizes it by delaying the lstat(2) call until it becomes absolutely necessary. The initial idea for this change was by Jeff King, but I optimized it further to pass pointers to around. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-01 05:23:25 +01:00			`dtype_p, &dir->exclude_list[st])) {`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`case 0:`
			`return 0;`
			`case 1:`
			`return 1;`
			`}`
			`}`
			`return 0;`
			`}`

Style: place opening brace of a function definition at column 1 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-09 00:35:32 +01:00			`static struct dir_entry *dir_entry_new(const char *pathname, int len)`
			`{`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`struct dir_entry *ent;`

			`ent = xmalloc(sizeof(*ent) + len + 1);`
			`ent->len = len;`
			`memcpy(ent->name, pathname, len);`
			`ent->name[len] = 0;`
Fix 'git add' with .gitignore When '*.ig' is ignored, and you have two files f.ig and d.ig/foo in the working tree, $ git add . correctly ignored f.ig but failed to ignore d.ig/foo. This was caused by a thinko in an earlier commit 4888c534, when we tried to allow adding otherwise ignored files. After reverting that commit, this takes a much simpler approach. When we have an unmatched pathspec that talks about an existing pathname, we know it is an ignored path the user tried to add, so we include it in the set of paths directory walker returned. This does not let you say "git add -f D" on an ignored directory D and add everything under D. People can submit a patch to further allow it if they want to, but I think it is a saner behaviour to require explicit paths to be spelled out in such a case. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-29 20:01:31 +01:00			`return ent;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`

refactor dir_add_name This is in preparation for keeping two entry lists in the dir object. This patch adds and uses the ALLOC_GROW() macro, which implements the commonly used idiom of growing a dynamic array using the alloc_nr function (not just in dir.c, but everywhere). We also move creation of a dir_entry to dir_entry_new. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:44 +02:00			`struct dir_entry *dir_add_name(struct dir_struct *dir, const char *pathname, int len)`
			`{`
Add 'core.ignorecase' option ..and start using it for directory entry traversal (ie "git status" will not consider entries that match an existing entry case-insensitively to be a new file) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-03-22 00:52:46 +01:00			`if (cache_name_exists(pathname, len, ignore_case))`
refactor dir_add_name This is in preparation for keeping two entry lists in the dir object. This patch adds and uses the ALLOC_GROW() macro, which implements the commonly used idiom of growing a dynamic array using the alloc_nr function (not just in dir.c, but everywhere). We also move creation of a dir_entry to dir_entry_new. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:44 +02:00			`return NULL;`

Fix ALLOC_GROW calls with obsolete semantics ALLOC_GROW now expects the 'nr' argument to be "how much you want" and not "how much you have". This fixes all cases where we weren't previously adding anything to the 'nr'. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-17 00:43:40 +02:00			`ALLOC_GROW(dir->entries, dir->nr+1, dir->alloc);`
refactor dir_add_name This is in preparation for keeping two entry lists in the dir object. This patch adds and uses the ALLOC_GROW() macro, which implements the commonly used idiom of growing a dynamic array using the alloc_nr function (not just in dir.c, but everywhere). We also move creation of a dir_entry to dir_entry_new. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:44 +02:00			`return dir->entries[dir->nr++] = dir_entry_new(pathname, len);`
			`}`

dir_struct: add collect_ignored option When set, this option will cause read_directory to keep track of which entries were ignored. While this shouldn't effect functionality in most cases, it can make warning messages to the user much more useful. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:50 +02:00			`struct dir_entry *dir_add_ignored(struct dir_struct *dir, const char *pathname, int len)`
			`{`
			`if (cache_name_pos(pathname, len) >= 0)`
			`return NULL;`

Fix ALLOC_GROW calls with obsolete semantics ALLOC_GROW now expects the 'nr' argument to be "how much you want" and not "how much you have". This fixes all cases where we weren't previously adding anything to the 'nr'. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-17 00:43:40 +02:00			`ALLOC_GROW(dir->ignored, dir->ignored_nr+1, dir->ignored_alloc);`
dir_struct: add collect_ignored option When set, this option will cause read_directory to keep track of which entries were ignored. While this shouldn't effect functionality in most cases, it can make warning messages to the user much more useful. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:50 +02:00			`return dir->ignored[dir->ignored_nr++] = dir_entry_new(pathname, len);`
			`}`

Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`enum exist_status {`
			`index_nonexistent = 0,`
			`index_directory,`
			`index_gitdir,`
			`};`

			`/*`
			`* The index sorts alphabetically by entry name, which`
			`* means that a gitlink sorts as '\0' at the end, while`
			`* a directory (which is defined not as an entry, but as`
			`* the files it contains) will sort with the '/' at the`
			`* end.`
			`*/`
			`static enum exist_status directory_exists_in_index(const char *dirname, int len)`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`{`
			`int pos = cache_name_pos(dirname, len);`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`if (pos < 0)`
			`pos = -pos-1;`
			`while (pos < active_nr) {`
			`struct cache_entry *ce = active_cache[pos++];`
			`unsigned char endchar;`

			`if (strncmp(ce->name, dirname, len))`
			`break;`
			`endchar = ce->name[len];`
			`if (endchar > '/')`
			`break;`
			`if (endchar == '/')`
			`return index_directory;`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`if (!endchar && S_ISGITLINK(ce->ce_mode))`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`return index_gitdir;`
			`}`
			`return index_nonexistent;`
			`}`
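The comment above relies on bytewise ordering: for a name `foo`, an exact entry sorts first (next byte `'\0'`), then names such as `foo-x` or `foo.c` (bytes below `'/'`), then `foo/...`, and finally names whose next byte is above `'/'`. A minimal sketch of the same scan over a plain sorted array is given below. `struct entry`, `classify_dir`, and the `is_gitlink` field are hypothetical stand-ins for the index entry, this function, and `S_ISGITLINK(ce->ce_mode)`; the linear search replaces the binary search that `cache_name_pos()` would do.

```c
#include <assert.h>
#include <string.h>

enum exist_kind { KIND_NONEXISTENT = 0, KIND_DIRECTORY, KIND_GITLINK };

/* Hypothetical, simplified index entry. */
struct entry {
	const char *name;
	int is_gitlink;
};

/* entries must be sorted bytewise by name, as strcmp() would order them. */
enum exist_kind classify_dir(const struct entry *entries, int nr,
			     const char *dirname, int len)
{
	int pos = 0;

	assert(len > 0);
	/* find the first entry that does not sort before dirname */
	while (pos < nr && strncmp(entries[pos].name, dirname, len) < 0)
		pos++;

	for (; pos < nr; pos++) {
		unsigned char endchar;

		if (strncmp(entries[pos].name, dirname, len))
			break;		/* no longer shares the prefix */
		endchar = entries[pos].name[len];
		if (endchar > '/')
			break;		/* sorted past any "dirname/..." entry */
		if (endchar == '/')
			return KIND_DIRECTORY;
		if (!endchar && entries[pos].is_gitlink)
			return KIND_GITLINK;
	}
	return KIND_NONEXISTENT;
}
```

With entries `lib-a.c`, `lib.c`, `lib/x.c`, `libz` in that order, looking up `lib` walks past the first two names (their next bytes `'-'` and `'.'` sort below `'/'`) before reaching `lib/x.c` and reporting a directory.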

			`/*`
			`* When we find a directory when traversing the filesystem, we`
			`* have three distinct cases:`
			`*`
			`* - ignore it`
			`* - see it as a directory`
			`* - recurse into it`
			`*`
			`* and which one we choose depends on a combination of existing`
			`* git index contents and the flags passed into the directory`
			`* traversal routine.`
			`*`
			`* Case 1: If we already have entries in the index under that`
			`* directory name, we always recurse into the directory to see`
			`* all the files.`
			`*`
			`* Case 2: If we already have that directory name as a gitlink,`
			`* we always continue to see it as a gitlink, regardless of whether`
			`* there is an actual git directory there or not (it might not`
			`* be checked out as a subproject!)`
			`*`
			`* Case 3: if we didn't have it in the index previously, we`
			`* have a few sub-cases:`
			`*`
			`* (a) if "show_other_directories" is true, we show it as`
			`* just a directory, unless "hide_empty_directories" is`
			`* also true and the directory is empty, in which case`
			`* we just ignore it entirely.`
			`* (b) if it looks like a git directory, and we don't have`
rename dirlink to gitlink. Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' | xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-21 22:08:28 +02:00			`* 'no_gitlinks' set we treat it as a gitlink, and show it`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`* as a directory.`
			`* (c) otherwise, we recurse into it.`
			`*/`
			`enum directory_treatment {`
			`show_directory,`
			`ignore_directory,`
			`recurse_into_directory,`
			`};`

			`static enum directory_treatment treat_directory(struct dir_struct *dir,`
			`const char *dirname, int len,`
			`const struct path_simplify *simplify)`
			`{`
			`/* The "len-1" is to strip the final '/' */`
			`switch (directory_exists_in_index(dirname, len-1)) {`
			`case index_directory:`
			`return recurse_into_directory;`

			`case index_gitdir:`
Don't show gitlink directories when we want "other" files When "show_other_directories" is set, that implies that we are looking for untracked files, which obviously means that we should ignore directories that are marked as gitlinks in the index. This fixes "git status" in a superproject, that would otherwise always report that subprojects were "Untracked files:" Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-12 23:32:21 +02:00			`if (dir->show_other_directories)`
			`return ignore_directory;`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`return show_directory;`

			`case index_nonexistent:`
			`if (dir->show_other_directories)`
			`break;`
rename dirlink to gitlink. Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' | xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-21 22:08:28 +02:00			`if (!dir->no_gitlinks) {`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`unsigned char sha1[20];`
			`if (resolve_gitlink_ref(dirname, "HEAD", sha1) == 0)`
			`return show_directory;`
			`}`
			`return recurse_into_directory;`
			`}`

			`/* This is the "show_other_directories" case */`
			`if (!dir->hide_empty_directories)`
			`return show_directory;`
			`if (!read_directory_recursive(dir, dirname, dirname, len, 1, simplify))`
			`return ignore_directory;`
			`return show_directory;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`

Optimize directory listing with pathspec limiter. The way things are set up, you can now pass a "pathspec" to the "read_directory()" function. If you pass NULL, it acts exactly like it used to do (read everything). If you pass a non-NULL pointer, it will simplify it into a "these are the prefixes without any special characters", and stop any readdir() early if the path in question doesn't match any of the prefixes. NOTE! This does not obviate the need for the caller to do the exact pathspec match later. It's a first-level filter on "read_directory()", but it does not do the full pathspec thing. Maybe it should. But in the meantime, builtin-add.c really does need to do first read_directory(dir, .., pathspec); if (pathspec) prune_directory(dir, pathspec, baselen); ie the "prune_directory()" part will do the exact pathspec pruning, while the "read_directory()" will use the pathspec just to do some quick high-level pruning of the directories it will recurse into. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-31 05:39:30 +02:00			`/*`
			`* This is an inexact early pruning of any recursive directory`
			`* reading - if the path cannot possibly be in the pathspec,`
			`* return true, and we'll skip it early.`
			`*/`
			`static int simplify_away(const char *path, int pathlen, const struct path_simplify *simplify)`
			`{`
			`if (simplify) {`
			`for (;;) {`
			`const char *match = simplify->path;`
			`int len = simplify->len;`

			`if (!match)`
			`break;`
			`if (len > pathlen)`
			`len = pathlen;`
			`if (!memcmp(path, match, len))`
			`return 0;`
			`simplify++;`
			`}`
			`return 1;`
			`}`
			`return 0;`
			`}`
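The pruning above compares only as many bytes as the shorter of the path and the prefix, so a path is kept both when it lies under a prefix and when it is an ancestor directory that could still lead to one. A minimal standalone sketch of that check follows; `struct prefix` and `can_skip` are hypothetical names used for illustration, and the array is assumed to end with an entry whose `path` is NULL, as the loop above does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct path_simplify. */
struct prefix {
	int len;
	const char *path;	/* NULL marks the end of the array */
};

/* Returns 1 when path cannot fall under any prefix and may be skipped. */
int can_skip(const char *path, int pathlen, const struct prefix *prefixes)
{
	assert(path && pathlen >= 0);
	if (!prefixes)
		return 0;	/* no pathspec given: nothing is pruned */

	for (; prefixes->path; prefixes++) {
		int len = prefixes->len;

		/* compare only the part both strings have */
		if (len > pathlen)
			len = pathlen;
		if (!memcmp(path, prefixes->path, len))
			return 0;	/* possibly inside, or on the way to, this prefix */
	}
	return 1;
}
```

Because the comparison is this loose, it can keep paths that an exact pathspec match would later reject, which is why the comment calls the pruning inexact.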

builtin-add: simplify (and increase accuracy of) exclude handling Previously, the code would always set up the excludes, and then manually pick through the pathspec we were given, assuming that non-added but existing paths were just ignored. This was mostly correct, but would erroneously mark a totally empty directory as 'ignored'. Instead, we now use the collect_ignored option of dir_struct, which unambiguously tells us whether a path was ignored. This simplifies the code, and means empty directories are now just not mentioned at all. Furthermore, we now conditionally ask dir_struct to respect excludes, depending on whether the '-f' flag has been set. This means we don't have to pick through the result, checking for an 'ignored' flag; ignored entries were either added or not in the first place. We can safely get rid of the special 'ignored' flags to dir_entry, which were not used anywhere else. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-12 23:42:14 +02:00			`static int in_pathspec(const char *path, int len, const struct path_simplify *simplify)`
			`{`
			`if (simplify) {`
			`for (; simplify->path; simplify++) {`
			`if (len == simplify->len`
			`&& !memcmp(path, simplify->path, len))`
			`return 1;`
			`}`
			`}`
			`return 0;`
			`}`

Fix directory scanner to correctly ignore files without d_type On Fri, 19 Oct 2007, Todd T. Fries wrote: > If DT_UNKNOWN exists, then we have to do a stat() of some form to > find out the right type. That happened in the case of a pathname that was ignored, and we did not ask for "dir->show_ignored". That test used to be together with the "DTYPE(de) != DT_DIR", but splitting the two tests up means that we can do that (common) test before we even bother to calculate the real dtype. Of course, that optimization only matters for systems that don't have, or don't fill in DTYPE properly. I also clarified the real relationship between "exclude" and "dir->show_ignored". It used to do if (exclude != dir->show_ignored) { .. which wasn't exactly obvious, because it triggers for two different cases: - the path is marked excluded, but we are not interested in ignored files: ignore it - the path is not excluded, but we are interested in ignored files: ignore it unless it's a directory, in which case we might have ignored files inside the directory and need to recurse into it). so this splits them into those two cases, since the first case doesn't even care about the type. I also made a the DT_UNKNOWN case a separate helper function, and added some commentary to the cases. Linus Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-19 19:59:22 +02:00			`static int get_dtype(struct dirent *de, const char *path)`
			`{`
gitignore: lazily find dtype When we process "foo/" entries in gitignore files on a system that does not have d_type member in "struct dirent", the earlier implementation ran lstat(2) separately when matching with entries that came from the command line, in-tree .gitignore files, and $GIT_DIR/info/excludes file. This optimizes it by delaying the lstat(2) call until it becomes absolutely necessary. The initial idea for this change was by Jeff King, but I optimized it further to pass pointers to around. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-01 05:23:25 +01:00			`int dtype = de ? DTYPE(de) : DT_UNKNOWN;`
			`struct stat st;`

			`if (dtype != DT_UNKNOWN)`
			`return dtype;`
			`if (lstat(path, &st))`
			`return dtype;`
			`if (S_ISREG(st.st_mode))`
			`return DT_REG;`
			`if (S_ISDIR(st.st_mode))`
			`return DT_DIR;`
			`if (S_ISLNK(st.st_mode))`
			`return DT_LNK;`
			`return dtype;`
			`}`
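The fallback in `get_dtype()` can be tried outside git with a small sketch. The names `entry_kind` and `resolve_kind` are hypothetical and not part of git; the sketch assumes only POSIX `lstat()`, trusts a caller-supplied hint when one is known, and stats the path otherwise.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical stand-ins for DT_UNKNOWN, DT_REG, DT_DIR and DT_LNK. */
enum entry_kind { KIND_UNKNOWN, KIND_REG, KIND_DIR, KIND_LNK };

/*
 * Return the hint if it is already known; otherwise ask lstat().
 * Anything that is not a regular file, directory or symlink (or that
 * cannot be stat'ed) stays KIND_UNKNOWN.
 */
static enum entry_kind resolve_kind(enum entry_kind hint, const char *path)
{
	struct stat st;

	assert(path != NULL);
	if (hint != KIND_UNKNOWN)
		return hint;
	if (lstat(path, &st))
		return KIND_UNKNOWN;
	if (S_ISREG(st.st_mode))
		return KIND_REG;
	if (S_ISDIR(st.st_mode))
		return KIND_DIR;
	if (S_ISLNK(st.st_mode))
		return KIND_LNK;
	return KIND_UNKNOWN;
}
```

Calling `resolve_kind(KIND_UNKNOWN, ".")` stats the current directory, while a known hint such as `KIND_REG` is returned without touching the filesystem, which is the point of the optimization.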

			`/*`
			`* Read a directory tree. We currently ignore anything but`
			`* directories, regular files and symlinks. That's because git`
			`* doesn't handle them at all yet. Maybe that will change some`
			`* day.`
			`*`
			`* Also, we ignore the name ".git" (even if it is not a directory).`
			`* That likely will not change.`
			`*/`
Optimize directory listing with pathspec limiter. The way things are set up, you can now pass a "pathspec" to the "read_directory()" function. If you pass NULL, it acts exactly like it used to do (read everything). If you pass a non-NULL pointer, it will simplify it into a "these are the prefixes without any special characters", and stop any readdir() early if the path in question doesn't match any of the prefixes. NOTE! This does not obviate the need for the caller to do the exact pathspec match later. It's a first-level filter on "read_directory()", but it does not do the full pathspec thing. Maybe it should. But in the meantime, builtin-add.c really does need to do first read_directory(dir, .., pathspec); if (pathspec) prune_directory(dir, pathspec, baselen); ie the "prune_directory()" part will do the exact pathspec pruning, while the "read_directory()" will use the pathspec just to do some quick high-level pruning of the directories it will recurse into. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-31 05:39:30 +02:00			`static int read_directory_recursive(struct dir_struct *dir, const char *path, const char *base, int baselen, int check_only, const struct path_simplify *simplify)`
			`{`
			`DIR *fdir = opendir(path);`
			`int contents = 0;`

			`if (fdir) {`
			`struct dirent *de;`
Use PATH_MAX instead of MAXPATHLEN According to sys/paramh.h it's a "BSD name" for values defined in <limits.h>. Besides PATH_MAX seems to be more commonly used. Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-26 16:09:17 +02:00			`char fullname[PATH_MAX + 1];`
			`memcpy(fullname, base, baselen);`

			`while ((de = readdir(fdir)) != NULL) {`
			`int len, dtype;`
dir.c: Omit non-excluded directories with dir->show_ignored This makes "git-ls-files --others --directory --ignored" behave as documented and consequently also fixes "git-clean -d -X". Previously, git-clean would remove non-excluded directories even when using the -X option. Signed-off-by: Michael Spang <mspang@uwaterloo.ca> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-07 04:35:04 +02:00			`int exclude;`

			`if ((de->d_name[0] == '.') &&`
			`(de->d_name[1] == 0 ||`
			`!strcmp(de->d_name + 1, ".") ||`
			`!strcmp(de->d_name + 1, "git")))`
			`continue;`
			`len = strlen(de->d_name);`
Avoid overflowing name buffer in deep directory structures This just makes sure that when we do a read_directory(), we check that the filename fits in the buffer we allocated (with a bit of slop) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-10 06:13:58 +02:00			`/* Ignore overly long pathnames! */`
			`if (len + baselen + 8 > sizeof(fullname))`
			`continue;`
			`memcpy(fullname + baselen, de->d_name, len+1);`
			`if (simplify_away(fullname, baselen + len, simplify))`
			`continue;`

			`dtype = DTYPE(de);`
			`exclude = excluded(dir, fullname, &dtype);`
builtin-add: simplify (and increase accuracy of) exclude handling Previously, the code would always set up the excludes, and then manually pick through the pathspec we were given, assuming that non-added but existing paths were just ignored. This was mostly correct, but would erroneously mark a totally empty directory as 'ignored'. Instead, we now use the collect_ignored option of dir_struct, which unambiguously tells us whether a path was ignored. This simplifies the code, and means empty directories are now just not mentioned at all. Furthermore, we now conditionally ask dir_struct to respect excludes, depending on whether the '-f' flag has been set. This means we don't have to pick through the result, checking for an 'ignored' flag; ignored entries were either added or not in the first place. We can safely get rid of the special 'ignored' flags to dir_entry, which were not used anywhere else. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-12 23:42:14 +02:00			`if (exclude && dir->collect_ignored`
			`&& in_pathspec(fullname, baselen + len, simplify))`
dir_struct: add collect_ignored option When set, this option will cause read_directory to keep track of which entries were ignored. While this shouldn't affect functionality in most cases, it can make warning messages to the user much more useful. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:50 +02:00			`dir_add_ignored(dir, fullname, baselen + len);`

			`/*`
			`* Excluded? If we don't explicitly want to show`
			`* ignored files, ignore it`
			`*/`
			`if (exclude && !dir->show_ignored)`
			`continue;`

			`if (dtype == DT_UNKNOWN)`
			`dtype = get_dtype(de, fullname);`

			`/*`
			`* Do we want to see just the ignored files?`
			`* We still need to recurse into directories,`
			`* even if we don't ignore them, since the`
			`* directory may contain files that we do..`
			`*/`
			`if (!exclude && dir->show_ignored) {`
			`if (dtype != DT_DIR)`
Revert "read_directory: show_both option." This reverts commit 4888c534099012d71d24051deb5b14319747bd1a. 2006-12-29 19:08:19 +01:00			`continue;`
			`}`

			`switch (dtype) {`
			`default:`
			`continue;`
			`case DT_DIR:`
			`memcpy(fullname + baselen + len, "/", 2);`
			`len++;`
			`switch (treat_directory(dir, fullname, baselen + len, simplify)) {`
			`case show_directory:`
			`if (exclude != dir->show_ignored)`
			`continue;`
			`break;`
			`case recurse_into_directory:`
			`contents += read_directory_recursive(dir,`
			`fullname, fullname, baselen + len, 0, simplify);`
			`continue;`
			`case ignore_directory:`
			`continue;`
			`}`
			`break;`
			`case DT_REG:`
			`case DT_LNK:`
			`break;`
			`}`
			`contents++;`
runstatus: do not recurse into subdirectories if not needed This speeds up the case when you run git-status, having an untracked subdirectory containing huge amounts of files. Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-28 02:44:30 +02:00			`if (check_only)`
			`goto exit_early;`
			`else`
Fix 'git add' with .gitignore When '*.ig' is ignored, and you have two files f.ig and d.ig/foo in the working tree, $ git add . correctly ignored f.ig but failed to ignore d.ig/foo. This was caused by a thinko in an earlier commit 4888c534, when we tried to allow adding otherwise ignored files. After reverting that commit, this takes a much simpler approach. When we have an unmatched pathspec that talks about an existing pathname, we know it is an ignored path the user tried to add, so we include it in the set of paths directory walker returned. This does not let you say "git add -f D" on an ignored directory D and add everything under D. People can submit a patch to further allow it if they want to, but I think it is a saner behaviour to require explicit paths to be spelled out in such a case. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-29 20:01:31 +01:00			`dir_add_name(dir, fullname, baselen + len);`
			`}`
			`exit_early:`
			`closedir(fdir);`
			`}`

			`return contents;`
			`}`
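The two early `exclude` / `dir->show_ignored` tests in the loop above can be condensed into one predicate. This is a minimal sketch with a hypothetical helper name, `skip_entry`, that does not exist in git; it covers only those two tests and not the later `show_directory` check.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when the traversal should skip an entry.
 *   excluded      - the entry matched an exclude pattern
 *   show_ignored  - the caller only wants ignored entries
 *   is_dir        - the entry is a directory
 */
static bool skip_entry(bool excluded, bool show_ignored, bool is_dir)
{
	/* Excluded, and ignored entries were not requested: skip. */
	if (excluded && !show_ignored)
		return true;
	/*
	 * Not excluded, but only ignored entries were requested: skip,
	 * unless it is a directory that may contain ignored files.
	 */
	if (!excluded && show_ignored && !is_dir)
		return true;
	return false;
}
```

Only the second case needs to know the entry type, which is why the real loop can postpone the `get_dtype()` call until after the first test.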

			`static int cmp_name(const void *p1, const void *p2)`
			`{`
			`const struct dir_entry *e1 = *(const struct dir_entry **)p1;`
			`const struct dir_entry *e2 = *(const struct dir_entry **)p2;`

			`return cache_name_compare(e1->name, e1->len,`
			`e2->name, e2->len);`
			`}`
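A comparator like `cmp_name()` is meant for `qsort()` over an array of pointers, so each argument must be dereferenced once before use. The sketch below illustrates that pattern under stated assumptions: `struct entry`, `cmp_entry` and `sort_entries` are hypothetical names, and plain `strcmp()` stands in for `cache_name_compare()`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type; git's struct dir_entry carries a length as well. */
struct entry {
	const char *name;
};

/*
 * qsort() passes pointers to the array elements. The elements here are
 * themselves pointers, so each argument is a "struct entry **" in disguise
 * and has to be dereferenced once to reach the entry.
 */
static int cmp_entry(const void *p1, const void *p2)
{
	const struct entry *e1 = *(const struct entry *const *)p1;
	const struct entry *e2 = *(const struct entry *const *)p2;

	return strcmp(e1->name, e2->name);
}

/* Sort an array of nr entry pointers by name. */
static void sort_entries(struct entry **entries, size_t nr)
{
	qsort(entries, nr, sizeof(*entries), cmp_entry);
}
```

Sorting pointers rather than the structures themselves keeps each swap cheap and leaves the entries at stable addresses.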

			`/*`
			`* Return the length of the "simple" part of a path match limiter.`
			`*/`
			`static int simple_length(const char *match)`
			`{`
			`const char special[256] = {`
			`[0] = 1, ['?'] = 1,`
			`['\\'] = 1, ['*'] = 1,`
			`['['] = 1`
			`};`
			`int len = -1;`

			`for (;;) {`
			`unsigned char c = *match++;`
			`len++;`
			`if (special[c])`
			`return len;`
			`}`
			`}`
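The scan in simple_length() can be tried outside git with a small standalone sketch. The name simple_prefix_length is hypothetical, chosen here to avoid clashing with the static function above; the table of special characters is copied from it, and the loop is written with an index instead of a running pointer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical standalone variant of simple_length(): returns the number
 * of leading bytes of 'match' that contain no glob metacharacter.
 */
static int simple_prefix_length(const char *match)
{
	/* NUL ends the scan; '?', '\\', '*' and '[' start a glob construct. */
	static const char special[256] = {
		[0] = 1, ['?'] = 1,
		['\\'] = 1, ['*'] = 1,
		['['] = 1
	};
	int len = 0;

	assert(match != NULL);
	/* Cast to unsigned char so bytes >= 0x80 index the table safely. */
	while (!special[(unsigned char)match[len]])
		len++;
	return len;
}
```

For a pathspec such as "src/*.c" the literal prefix is "src/", so only directories matching that prefix would need to be read.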

			`static struct path_simplify *create_simplify(const char **pathspec)`
			`{`
			`int nr, alloc = 0;`
			`struct path_simplify *simplify = NULL;`

			`if (!pathspec)`
			`return NULL;`

			`for (nr = 0 ; ; nr++) {`
			`const char *match;`
			`if (nr >= alloc) {`
			`alloc = alloc_nr(alloc);`
			`simplify = xrealloc(simplify, alloc * sizeof(*simplify));`
			`}`
			`match = *pathspec++;`
			`if (!match)`
			`break;`
			`simplify[nr].path = match;`
			`simplify[nr].len = simple_length(match);`
			`}`
			`simplify[nr].path = NULL;`
			`simplify[nr].len = 0;`
			`return simplify;`
			`}`

			`static void free_simplify(struct path_simplify *simplify)`
			`{`
			`free(simplify);`
			`}`

			`int read_directory(struct dir_struct *dir, const char *path, const char *base, int baselen, const char **pathspec)`
			`{`
			`struct path_simplify *simplify = create_simplify(pathspec);`

			`read_directory_recursive(dir, path, base, baselen, 0, simplify);`
			`free_simplify(simplify);`
			`qsort(dir->entries, dir->nr, sizeof(struct dir_entry *), cmp_name);`
			`qsort(dir->ignored, dir->ignored_nr, sizeof(struct dir_entry *), cmp_name);`
			`return dir->nr;`
			`}`

			`int file_exists(const char *f)`
			`{`
			`struct stat sb;`
			`return lstat(f, &sb) == 0;`
			`}`

			`/*`
			`* get_relative_cwd() gets the prefix of the current working directory`
			`* relative to 'dir'. If we are not inside 'dir', it returns NULL.`
			`*`
			`* As a convenience, it also returns NULL if 'dir' is already NULL. The`
			`* reason for this behaviour is that it is natural for functions returning`
			`* directory names to return NULL to say "this directory does not exist"`
			`* or "this directory is invalid". These cases are usually handled the`
			`* same as if the cwd is not inside 'dir' at all, so get_relative_cwd()`
			`* returns NULL for both of them.`
			`*`
			`* Most notably, get_relative_cwd(buffer, size, get_git_work_tree())`
			`* unifies the handling of "outside work tree" with "no work tree at all".`
			`*/`
			`char *get_relative_cwd(char *buffer, int size, const char *dir)`
			`{`
			`char *cwd = buffer;`

			`if (!dir)`
			`return NULL;`
			`if (!getcwd(buffer, size))`
			`die("can't find the current directory: %s", strerror(errno));`

			`if (!is_absolute_path(dir))`
			`dir = make_absolute_path(dir);`

			`while (*dir && *dir == *cwd) {`
			`dir++;`
			`cwd++;`
			`}`
			`if (*dir)`
			`return NULL;`
			`if (*cwd == '/')`
			`return cwd + 1;`
			`return cwd;`
			`}`
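The prefix comparison described in the comment above can be exercised without touching the real working directory. The following is a minimal sketch, assuming both paths are already absolute; relative_to and relative_to_examples are hypothetical names, and the getcwd() and make_absolute_path() steps are left out on purpose.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns the part of 'cwd' below 'dir', or NULL
 * when 'dir' is NULL or is not a leading part of 'cwd'.
 */
static const char *relative_to(const char *cwd, const char *dir)
{
	if (!dir)
		return NULL;
	/* Walk both strings while they agree and 'dir' has bytes left. */
	while (*dir && *dir == *cwd) {
		dir++;
		cwd++;
	}
	if (*dir)
		return NULL;	/* 'dir' was not fully consumed: no match */
	if (*cwd == '/')
		return cwd + 1;	/* skip the separator after the prefix */
	return cwd;
}

/* Usage checks covering the cases named in the comment above. */
static void relative_to_examples(void)
{
	assert(strcmp(relative_to("/repo/src", "/repo"), "src") == 0);
	assert(strcmp(relative_to("/repo", "/repo"), "") == 0);
	assert(relative_to("/other", "/repo") == NULL);
	assert(relative_to("/repo", NULL) == NULL);
}
```

Returning NULL for a NULL 'dir' lets a caller treat "no such directory" and "not inside it" with a single check.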

			`int is_inside_dir(const char *dir)`
			`{`
			`char buffer[PATH_MAX];`
			`return get_relative_cwd(buffer, sizeof(buffer), dir) != NULL;`
			`}`

			`int remove_dir_recursively(struct strbuf *path, int only_empty)`
			`{`
			`DIR *dir = opendir(path->buf);`
			`struct dirent *e;`
			`int ret = 0, original_len = path->len, len;`

			`if (!dir)`
			`return -1;`
			`if (path->buf[original_len - 1] != '/')`
			`strbuf_addch(path, '/');`

			`len = path->len;`
			`while ((e = readdir(dir)) != NULL) {`
			`struct stat st;`
			`if ((e->d_name[0] == '.') &&`
			`((e->d_name[1] == 0) ||`
			`((e->d_name[1] == '.') && e->d_name[2] == 0)))`
			`continue; /* "." and ".." */`

			`strbuf_setlen(path, len);`
			`strbuf_addstr(path, e->d_name);`
			`if (lstat(path->buf, &st))`
			`; /* fall thru */`
			`else if (S_ISDIR(st.st_mode)) {`
			`if (!remove_dir_recursively(path, only_empty))`
			`continue; /* happy */`
			`} else if (!only_empty && !unlink(path->buf))`
			`continue; /* happy, too */`

			`/* path too long, stat fails, or non-directory still exists */`
			`ret = -1;`
			`break;`
			`}`
			`closedir(dir);`

			`strbuf_setlen(path, original_len);`
			`if (!ret)`
			`ret = rmdir(path->buf);`
			`return ret;`
			`}`

			`void setup_standard_excludes(struct dir_struct *dir)`
			`{`
			`const char *path;`

			`dir->exclude_per_dir = ".gitignore";`
			`path = git_path("info/exclude");`
			`if (!access(path, R_OK))`
			`add_excludes_from_file(dir, path);`
			`if (excludes_file && !access(excludes_file, R_OK))`
			`add_excludes_from_file(dir, excludes_file);`
			`}`