mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-16 14:04:52 +01:00

1664 lines

40 KiB

C

Raw Normal View History

libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`/*`
			`* This handles recursive filename detection with exclude`
			`* files, index knowledge etc..`
			`*`
Improve documentation and comments regarding directory traversal API traversal API has a few potentially confusing properties. These comments clarify a few key aspects and will hopefully make it easier to understand for other newcomers in the future. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:21 +01:00			`* See Documentation/technical/api-directory-listing.txt`
			`*`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`* Copyright (C) Linus Torvalds, 2005-2006`
			`* Junio Hamano, 2005-2006`
			`*/`
			`#include "cache.h"`
			`#include "dir.h"`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`#include "refs.h"`
Support "**" wildcard in .gitignore and .gitattributes Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:26:02 +02:00			`#include "wildmatch.h"`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00
Optimize directory listing with pathspec limiter. The way things are set up, you can now pass a "pathspec" to the "read_directory()" function. If you pass NULL, it acts exactly like it used to do (read everything). If you pass a non-NULL pointer, it will simplify it into a "these are the prefixes without any special characters", and stop any readdir() early if the path in question doesn't match any of the prefixes. NOTE! This does not obviate the need for the caller to do the exact pathspec match later. It's a first-level filter on "read_directory()", but it does not do the full pathspec thing. Maybe it should. But in the meantime, builtin-add.c really does need to do first read_directory(dir, .., pathspec); if (pathspec) prune_directory(dir, pathspec, baselen); ie the "prune_directory()" part will do the exact pathspec pruning, while the "read_directory()" will use the pathspec just to do some quick high-level pruning of the directories it will recurse into. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-31 05:39:30 +02:00			`struct path_simplify {`
			`int len;`
			`const char *path;`
			`};`

dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`/*`
			`* Tells read_directory_recursive how a file or directory should be treated.`
			`* Values are ordered by significance, e.g. if a directory contains both`
			`* excluded and untracked files, it is listed as untracked because`
			`* path_untracked > path_excluded.`
			`*/`
			`enum path_treatment {`
			`path_none = 0,`
			`path_recurse,`
			`path_excluded,`
			`path_untracked`
			`};`

			`static enum path_treatment read_directory_recursive(struct dir_struct *dir,`
			`const char *path, int len,`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`int check_only, const struct path_simplify *simplify);`
Avoid doing extra 'lstat()'s for d_type if we have an up-to-date cache entry On filesystems without d_type, we can look at the cache entry first. Doing an lstat() can be expensive. Reported by Dmitry Potapov for Cygwin. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-09 04:31:49 +02:00			`static int get_dtype(struct dirent de, const char path, int len);`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00
Add string comparison functions that respect the ignore_case variable. Multiple locations within this patch series alter a case sensitive string comparison call such as strcmp() to be a call to a string comparison call that selects case comparison based on the global ignore_case variable. Behaviorally, when core.ignorecase=false, the _icase() versions are functionally equivalent to their C runtime counterparts. When core.ignorecase=true, the _icase() versions perform a case insensitive comparison. Like Linus' earlier ignorecase patch, these may ignore filename conventions on certain file systems. By isolating filename comparisons to certain functions, support for those filename conventions may be more easily met. Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-03 11:56:41 +02:00			`/* helper string functions with support for the ignore_case flag */`
			`int strcmp_icase(const char a, const char b)`
			`{`
			`return ignore_case ? strcasecmp(a, b) : strcmp(a, b);`
			`}`

			`int strncmp_icase(const char a, const char b, size_t count)`
			`{`
			`return ignore_case ? strncasecmp(a, b, count) : strncmp(a, b, count);`
			`}`

			`int fnmatch_icase(const char pattern, const char string, int flags)`
			`{`
			`return fnmatch(pattern, string, flags \| (ignore_case ? FNM_CASEFOLD : 0));`
			`}`

pathspec: do exact comparison on the leading non-wildcard part Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-11-24 05:33:49 +01:00			`inline int git_fnmatch(const char pattern, const char string,`
			`int flags, int prefix)`
			`{`
			`int fnm_flags = 0;`
			`if (flags & GFNM_PATHNAME)`
			`fnm_flags \|= FNM_PATHNAME;`
			`if (prefix > 0) {`
			`if (strncmp(pattern, string, prefix))`
			`return FNM_NOMATCH;`
			`pattern += prefix;`
			`string += prefix;`
			`}`
pathspec: apply ".c" optimization from exclude When a pattern contains only a single asterisk as wildcard, e.g. "foobar", after literally comparing the leading part "foo" with the string, we can compare the tail of the string and make sure it matches "bar", instead of running fnmatch() on "bar" against the remainder of the string. -O2 build on linux-2.6, without the patch: $ time git rev-list --quiet HEAD -- '.c' real 0m40.770s user 0m40.290s sys 0m0.256s With the patch $ time ~/w/git/git rev-list --quiet HEAD -- '.c' real 0m34.288s user 0m33.997s sys 0m0.205s The above command is not supposed to be widely popular. It's chosen because it exercises pathspec matching a lot. The point is it cuts down matching time for popular patterns like .c, which could be used as pathspec in other places. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-11-24 05:33:50 +01:00			`if (flags & GFNM_ONESTAR) {`
			`int pattern_len = strlen(++pattern);`
			`int string_len = strlen(string);`
			`return string_len < pattern_len \|\|`
			`strcmp(pattern,`
			`string + string_len - pattern_len);`
			`}`
pathspec: do exact comparison on the leading non-wildcard part Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-11-24 05:33:49 +01:00			`return fnmatch(pattern, string, fnm_flags);`
			`}`

dir.c::match_basename(): pay attention to the length of string parameters The function takes two counted strings (<basename, basenamelen> and <pattern, patternlen>) as parameters, together with prefix (the length of the prefix in pattern that is to be matched literally without globbing against the basename) and EXC_* flags that tells it how to match the pattern against the basename. However, it did not pay attention to the length of these counted strings. Update them to do the following: * When the entire pattern is to be matched literally, the pattern matches the basename only when the lengths of them are the same, and they match up to that length. * When the pattern is "" followed by a string to be matched literally, make sure that the basenamelen is equal or longer than the "literal" part of the pattern, and the tail of the basename string matches that literal part. Otherwise, use the new fnmatch_icase_mem helper to make sure we only lookmake sure we use only look at the counted part of the strings. Because these counted strings are full strings most of the time, we check for termination to avoid unnecessary allocation. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-03-28 22:47:28 +01:00			`static int fnmatch_icase_mem(const char *pattern, int patternlen,`
			`const char *string, int stringlen,`
			`int flags)`
			`{`
			`int match_status;`
			`struct strbuf pat_buf = STRBUF_INIT;`
			`struct strbuf str_buf = STRBUF_INIT;`
			`const char *use_pat = pattern;`
			`const char *use_str = string;`

			`if (pattern[patternlen]) {`
			`strbuf_add(&pat_buf, pattern, patternlen);`
			`use_pat = pat_buf.buf;`
			`}`
			`if (string[stringlen]) {`
			`strbuf_add(&str_buf, string, stringlen);`
			`use_str = str_buf.buf;`
			`}`

Merge branch 'jc/directory-attrs-regression-fix' Fix 1.8.1.x regression that stopped matching "dir" (without trailing slash) to a directory "dir". * jc/directory-attrs-regression-fix: t: check that a pattern without trailing slash matches a directory dir.c::match_pathname(): pay attention to the length of string parameters dir.c::match_pathname(): adjust patternlen when shifting pattern dir.c::match_basename(): pay attention to the length of string parameters attr.c::path_matches(): special case paths that end with a slash attr.c::path_matches(): the basename is part of the pathname 2013-04-03 18:34:04 +02:00			`if (ignore_case)`
			`flags \|= WM_CASEFOLD;`
			`match_status = wildmatch(use_pat, use_str, flags, NULL);`
dir.c::match_basename(): pay attention to the length of string parameters The function takes two counted strings (<basename, basenamelen> and <pattern, patternlen>) as parameters, together with prefix (the length of the prefix in pattern that is to be matched literally without globbing against the basename) and EXC_* flags that tells it how to match the pattern against the basename. However, it did not pay attention to the length of these counted strings. Update them to do the following: * When the entire pattern is to be matched literally, the pattern matches the basename only when the lengths of them are the same, and they match up to that length. * When the pattern is "" followed by a string to be matched literally, make sure that the basenamelen is equal or longer than the "literal" part of the pattern, and the tail of the basename string matches that literal part. Otherwise, use the new fnmatch_icase_mem helper to make sure we only lookmake sure we use only look at the counted part of the strings. Because these counted strings are full strings most of the time, we check for termination to avoid unnecessary allocation. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-03-28 22:47:28 +01:00
			`strbuf_release(&pat_buf);`
			`strbuf_release(&str_buf);`

			`return match_status;`
			`}`

rename pathspec_prefix() to common_prefix() and move to dir.[ch] Also make common_prefix_len() static as this refactoring makes dir.c itself the only caller of this helper function. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-09-04 12:42:01 +02:00			`static size_t common_prefix_len(const char **pathspec)`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00			`{`
consolidate pathspec_prefix and common_prefix The implementation from pathspec_prefix (slightly modified) replaces the current common_prefix, because it also respects glob characters. Based on a patch by Clemens Buchacher. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-09-06 21:32:30 +02:00			`const char n, first;`
			`size_t max = 0;`
add global --literal-pathspecs option Git takes pathspec arguments in many places to limit the scope of an operation. These pathspecs are treated not as literal paths, but as glob patterns that can be fed to fnmatch. When a user is giving a specific pattern, this is a nice feature. However, when programatically providing pathspecs, it can be a nuisance. For example, to find the latest revision which modified "$foo", one can use "git rev-list -- $foo". But if "$foo" contains glob characters (e.g., "f"), it will erroneously match more entries than desired. The caller needs to quote the characters in $foo, and even then, the results may not be exactly the same as with a literal pathspec. For instance, the depth checks in match_pathspec_depth do not kick in if we match via fnmatch. This patch introduces a global command-line option (i.e., one for "git" itself, not for specific commands) to turn this behavior off. It also has a matching environment variable, which can make it easier if you are a script or porcelain interface that is going to issue many such commands. This option cannot turn off globbing for particular pathspecs. That could eventually be done with a ":(noglob)" magic pathspec prefix. However, that level of granularity is more cumbersome to use for many cases, and doing ":(noglob)" right would mean converting the whole codebase to use "struct pathspec", as the usual "const char *pathspec" cannot represent extra per-item flags. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-19 23:37:30 +01:00			`int literal = limit_pathspec_to_literal();`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00
			`if (!pathspec)`
consolidate pathspec_prefix and common_prefix The implementation from pathspec_prefix (slightly modified) replaces the current common_prefix, because it also respects glob characters. Based on a patch by Clemens Buchacher. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-09-06 21:32:30 +02:00			`return max;`

			`first = *pathspec;`
			`while ((n = *pathspec++)) {`
			`size_t i, len = 0;`
			`for (i = 0; first == n \|\| i < max; i++) {`
			`char c = n[i];`
add global --literal-pathspecs option Git takes pathspec arguments in many places to limit the scope of an operation. These pathspecs are treated not as literal paths, but as glob patterns that can be fed to fnmatch. When a user is giving a specific pattern, this is a nice feature. However, when programatically providing pathspecs, it can be a nuisance. For example, to find the latest revision which modified "$foo", one can use "git rev-list -- $foo". But if "$foo" contains glob characters (e.g., "f"), it will erroneously match more entries than desired. The caller needs to quote the characters in $foo, and even then, the results may not be exactly the same as with a literal pathspec. For instance, the depth checks in match_pathspec_depth do not kick in if we match via fnmatch. This patch introduces a global command-line option (i.e., one for "git" itself, not for specific commands) to turn this behavior off. It also has a matching environment variable, which can make it easier if you are a script or porcelain interface that is going to issue many such commands. This option cannot turn off globbing for particular pathspecs. That could eventually be done with a ":(noglob)" magic pathspec prefix. However, that level of granularity is more cumbersome to use for many cases, and doing ":(noglob)" right would mean converting the whole codebase to use "struct pathspec", as the usual "const char *pathspec" cannot represent extra per-item flags. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-19 23:37:30 +01:00			`if (!c \|\| c != first[i] \|\| (!literal && is_glob_special(c)))`
consolidate pathspec_prefix and common_prefix The implementation from pathspec_prefix (slightly modified) replaces the current common_prefix, because it also respects glob characters. Based on a patch by Clemens Buchacher. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-09-06 21:32:30 +02:00			`break;`
			`if (c == '/')`
			`len = i + 1;`
			`}`
			`if (first == n \|\| len < max) {`
			`max = len;`
			`if (!max)`
			`break;`
			`}`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00			`}`
consolidate pathspec_prefix and common_prefix The implementation from pathspec_prefix (slightly modified) replaces the current common_prefix, because it also respects glob characters. Based on a patch by Clemens Buchacher. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-09-06 21:32:30 +02:00			`return max;`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00			`}`

rename pathspec_prefix() to common_prefix() and move to dir.[ch] Also make common_prefix_len() static as this refactoring makes dir.c itself the only caller of this helper function. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-09-04 12:42:01 +02:00			`/*`
			`* Returns a copy of the longest leading path common among all`
			`* pathspecs.`
			`*/`
			`char common_prefix(const char *pathspec)`
			`{`
			`unsigned long len = common_prefix_len(pathspec);`

			`return len ? xmemdupz(*pathspec, len) : NULL;`
			`}`

Add 'fill_directory()' helper function for directory traversal Most of the users of "read_directory()" actually want a much simpler interface than the whole complex (but rather powerful) one. In fact 'git add' had already largely abstracted out the core interface issues into a private "fill_directory()" function that was largely applicable almost as-is to a number of callers. Yes, 'git add' wants to do some extra work of its own, specific to the add semantics, but we can easily split that out, and use the core as a generic function. This function does exactly that, and now that much simplified 'fill_directory()' function can be shared with a number of callers, while also ensuring that the rather more complex calling conventions of read_directory() are used by fewer call-sites. This also makes the 'common_prefix()' helper function private to dir.c, since all callers are now in that file. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-14 22:22:36 +02:00			`int fill_directory(struct dir_struct dir, const char *pathspec)`
			`{`
consolidate pathspec_prefix and common_prefix The implementation from pathspec_prefix (slightly modified) replaces the current common_prefix, because it also respects glob characters. Based on a patch by Clemens Buchacher. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-09-06 21:32:30 +02:00			`size_t len;`
Add 'fill_directory()' helper function for directory traversal Most of the users of "read_directory()" actually want a much simpler interface than the whole complex (but rather powerful) one. In fact 'git add' had already largely abstracted out the core interface issues into a private "fill_directory()" function that was largely applicable almost as-is to a number of callers. Yes, 'git add' wants to do some extra work of its own, specific to the add semantics, but we can easily split that out, and use the core as a generic function. This function does exactly that, and now that much simplified 'fill_directory()' function can be shared with a number of callers, while also ensuring that the rather more complex calling conventions of read_directory() are used by fewer call-sites. This also makes the 'common_prefix()' helper function private to dir.c, since all callers are now in that file. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-14 22:22:36 +02:00
			`/*`
			`* Calculate common prefix for the pathspec, and`
			`* use that to optimize the directory walk`
			`*/`
consolidate pathspec_prefix and common_prefix The implementation from pathspec_prefix (slightly modified) replaces the current common_prefix, because it also respects glob characters. Based on a patch by Clemens Buchacher. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-09-06 21:32:30 +02:00			`len = common_prefix_len(pathspec);`
Add 'fill_directory()' helper function for directory traversal Most of the users of "read_directory()" actually want a much simpler interface than the whole complex (but rather powerful) one. In fact 'git add' had already largely abstracted out the core interface issues into a private "fill_directory()" function that was largely applicable almost as-is to a number of callers. Yes, 'git add' wants to do some extra work of its own, specific to the add semantics, but we can easily split that out, and use the core as a generic function. This function does exactly that, and now that much simplified 'fill_directory()' function can be shared with a number of callers, while also ensuring that the rather more complex calling conventions of read_directory() are used by fewer call-sites. This also makes the 'common_prefix()' helper function private to dir.c, since all callers are now in that file. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-14 22:22:36 +02:00
			`/* Read the directory and prune it */`
dir: simplify fill_directory() Now that read_directory_recursive() (reached through read_directory()) respects the string length limit we provide, we don't need to create a NUL-limited copy of the common prefix anymore. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-11 16:59:25 +02:00			`read_directory(dir, pathspec ? *pathspec : "", len, pathspec);`
Simplify read_directory[_recursive]() arguments Stop the insanity with separate 'path' and 'base' arguments that must match. We don't need that crazy interface any more, since we cleaned up handling of 'path' in commit da4b3e8c28b1dc2b856d2555ac7bb47ab712598c. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-09 04:24:39 +02:00			`return len;`
Add 'fill_directory()' helper function for directory traversal Most of the users of "read_directory()" actually want a much simpler interface than the whole complex (but rather powerful) one. In fact 'git add' had already largely abstracted out the core interface issues into a private "fill_directory()" function that was largely applicable almost as-is to a number of callers. Yes, 'git add' wants to do some extra work of its own, specific to the add semantics, but we can easily split that out, and use the core as a generic function. This function does exactly that, and now that much simplified 'fill_directory()' function can be shared with a number of callers, while also ensuring that the rather more complex calling conventions of read_directory() are used by fewer call-sites. This also makes the 'common_prefix()' helper function private to dir.c, since all callers are now in that file. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-14 22:22:36 +02:00			`}`

tree_entry_interesting(): support depth limit This is needed to replace pathspec_matches() in builtin/grep.c. max_depth == -1 means infinite depth. Depth limit is only effective when pathspec.recursive == 1. When pathspec.recursive == 0, the behavior depends on match functions: non-recursive for tree_entry_interesting() and recursive for match_pathspec{,_depth} Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:44 +01:00			`int within_depth(const char *name, int namelen,`
			`int depth, int max_depth)`
			`{`
			`const char cp = name, cpe = name + namelen;`

			`while (cp < cpe) {`
			`if (*cp++ != '/')`
			`continue;`
			`depth++;`
			`if (depth > max_depth)`
			`return 0;`
			`}`
			`return 1;`
			`}`

match_pathspec() -- return how well the spec matched This updates the return value from match_pathspec() so that the caller can tell cases between exact match, leading pathname match (i.e. file "foo/bar" matches a pathspec "foo"), or filename glob match. This can be used to prevent "rm dir" from removing "dir/file" without explicitly asking for recursive behaviour with -r flag, for example. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-25 12:09:52 +01:00			`/*`
dir.c: Fix two minor grammatical errors in comments Signed-off-by: Allan Caffee <allan.caffee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-04 19:37:30 +02:00			`* Does 'match' match the given name?`
match_pathspec() -- return how well the spec matched This updates the return value from match_pathspec() so that the caller can tell cases between exact match, leading pathname match (i.e. file "foo/bar" matches a pathspec "foo"), or filename glob match. This can be used to prevent "rm dir" from removing "dir/file" without explicitly asking for recursive behaviour with -r flag, for example. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-25 12:09:52 +01:00			`* A match is found if`
			`*`
			`* (1) the 'match' string is leading directory of 'name', or`
			`* (2) the 'match' string is a wildcard and matches 'name', or`
			`* (3) the 'match' string is exactly the same as 'name'.`
			`*`
			`* and the return value tells which case it was.`
			`*`
			`* It returns 0 when there is no match.`
			`*/`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00			`static int match_one(const char match, const char name, int namelen)`
			`{`
			`int matchlen;`
add global --literal-pathspecs option Git takes pathspec arguments in many places to limit the scope of an operation. These pathspecs are treated not as literal paths, but as glob patterns that can be fed to fnmatch. When a user is giving a specific pattern, this is a nice feature. However, when programatically providing pathspecs, it can be a nuisance. For example, to find the latest revision which modified "$foo", one can use "git rev-list -- $foo". But if "$foo" contains glob characters (e.g., "f"), it will erroneously match more entries than desired. The caller needs to quote the characters in $foo, and even then, the results may not be exactly the same as with a literal pathspec. For instance, the depth checks in match_pathspec_depth do not kick in if we match via fnmatch. This patch introduces a global command-line option (i.e., one for "git" itself, not for specific commands) to turn this behavior off. It also has a matching environment variable, which can make it easier if you are a script or porcelain interface that is going to issue many such commands. This option cannot turn off globbing for particular pathspecs. That could eventually be done with a ":(noglob)" magic pathspec prefix. However, that level of granularity is more cumbersome to use for many cases, and doing ":(noglob)" right would mean converting the whole codebase to use "struct pathspec", as the usual "const char *pathspec" cannot represent extra per-item flags. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-19 23:37:30 +01:00			`int literal = limit_pathspec_to_literal();`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00
			`/* If the match was just the prefix, we matched */`
Optimize match_pathspec() to avoid fnmatch() "git add " is actually fundamentally different from "git add .", and yeah, you should generally use the latter. The reason? The argument list is actually something different from what you think it is. For git, it's a "pathspec", so what actualy happens is that in both* cases, it will really traverse the whole tree, and then match every file it finds against the pathspec. So think of the arguments not as a file list, but as a random bunch of patterns to match against the files you have! Which is why the cost is actually approximately O(nm), where "n" is the size of the working tree, and "m" is the number of pathspecs. So the reason "git add ." is fast is actually that "m" in that case is just 1 (just one trivial pattern), and then "git add " is slow because "m" is large (lots of complicated patterns). In both cases, 'n' is the same (== the whole set of files in your working tree). Anyway, here's a trivial patch that doesn't change this fundamental fact, but that avoids doing anything expensive until we've done some cheap initial tests. It may or may not help your test-case, but it's pretty simple and it matches the other git optimizations in this area (ie "conceptually handle the general case, but optimize the simple cases where we can exit early") Notice how this patch doesn' actually change the fundamental O(n^2) behaviour, but it makes it much cheaper by generally avoiding the expensive 'fnmatch' and 'strlen/strncmp' when they are obviously not needed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-19 23:22:38 +02:00			`if (!*match)`
match_pathspec() -- return how well the spec matched This updates the return value from match_pathspec() so that the caller can tell cases between exact match, leading pathname match (i.e. file "foo/bar" matches a pathspec "foo"), or filename glob match. This can be used to prevent "rm dir" from removing "dir/file" without explicitly asking for recursive behaviour with -r flag, for example. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-25 12:09:52 +01:00			`return MATCHED_RECURSIVELY;`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00
Add case insensitivity support when using git ls-files When mydir/filea.txt is added, mydir/ is renamed to MyDir/, and MyDir/fileb.txt is added, running git ls-files mydir only shows mydir/filea.txt. Running git ls-files MyDir shows MyDir/fileb.txt. Running git ls-files mYdIR shows nothing. With this patch running git ls-files for mydir, MyDir, and mYdIR shows mydir/filea.txt and MyDir/fileb.txt. Wildcards are not handled case insensitively in this patch. Example: MyDir/aBc/file.txt is added. git ls-files MyDir/a* works fine, but git ls-files mydir/a* does not. Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-03 11:56:44 +02:00			`if (ignore_case) {`
			`for (;;) {`
			`unsigned char c1 = tolower(*match);`
			`unsigned char c2 = tolower(*name);`
add global --literal-pathspecs option Git takes pathspec arguments in many places to limit the scope of an operation. These pathspecs are treated not as literal paths, but as glob patterns that can be fed to fnmatch. When a user is giving a specific pattern, this is a nice feature. However, when programatically providing pathspecs, it can be a nuisance. For example, to find the latest revision which modified "$foo", one can use "git rev-list -- $foo". But if "$foo" contains glob characters (e.g., "f"), it will erroneously match more entries than desired. The caller needs to quote the characters in $foo, and even then, the results may not be exactly the same as with a literal pathspec. For instance, the depth checks in match_pathspec_depth do not kick in if we match via fnmatch. This patch introduces a global command-line option (i.e., one for "git" itself, not for specific commands) to turn this behavior off. It also has a matching environment variable, which can make it easier if you are a script or porcelain interface that is going to issue many such commands. This option cannot turn off globbing for particular pathspecs. That could eventually be done with a ":(noglob)" magic pathspec prefix. However, that level of granularity is more cumbersome to use for many cases, and doing ":(noglob)" right would mean converting the whole codebase to use "struct pathspec", as the usual "const char *pathspec" cannot represent extra per-item flags. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-19 23:37:30 +01:00			`if (c1 == '\0' \|\| (!literal && is_glob_special(c1)))`
Add case insensitivity support when using git ls-files When mydir/filea.txt is added, mydir/ is renamed to MyDir/, and MyDir/fileb.txt is added, running git ls-files mydir only shows mydir/filea.txt. Running git ls-files MyDir shows MyDir/fileb.txt. Running git ls-files mYdIR shows nothing. With this patch running git ls-files for mydir, MyDir, and mYdIR shows mydir/filea.txt and MyDir/fileb.txt. Wildcards are not handled case insensitively in this patch. Example: MyDir/aBc/file.txt is added. git ls-files MyDir/a* works fine, but git ls-files mydir/a* does not. Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-03 11:56:44 +02:00			`break;`
			`if (c1 != c2)`
			`return 0;`
			`match++;`
			`name++;`
			`namelen--;`
			`}`
			`} else {`
			`for (;;) {`
			`unsigned char c1 = *match;`
			`unsigned char c2 = *name;`
add global --literal-pathspecs option Git takes pathspec arguments in many places to limit the scope of an operation. These pathspecs are treated not as literal paths, but as glob patterns that can be fed to fnmatch. When a user is giving a specific pattern, this is a nice feature. However, when programatically providing pathspecs, it can be a nuisance. For example, to find the latest revision which modified "$foo", one can use "git rev-list -- $foo". But if "$foo" contains glob characters (e.g., "f"), it will erroneously match more entries than desired. The caller needs to quote the characters in $foo, and even then, the results may not be exactly the same as with a literal pathspec. For instance, the depth checks in match_pathspec_depth do not kick in if we match via fnmatch. This patch introduces a global command-line option (i.e., one for "git" itself, not for specific commands) to turn this behavior off. It also has a matching environment variable, which can make it easier if you are a script or porcelain interface that is going to issue many such commands. This option cannot turn off globbing for particular pathspecs. That could eventually be done with a ":(noglob)" magic pathspec prefix. However, that level of granularity is more cumbersome to use for many cases, and doing ":(noglob)" right would mean converting the whole codebase to use "struct pathspec", as the usual "const char *pathspec" cannot represent extra per-item flags. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-19 23:37:30 +01:00			`if (c1 == '\0' \|\| (!literal && is_glob_special(c1)))`
Add case insensitivity support when using git ls-files When mydir/filea.txt is added, mydir/ is renamed to MyDir/, and MyDir/fileb.txt is added, running git ls-files mydir only shows mydir/filea.txt. Running git ls-files MyDir shows MyDir/fileb.txt. Running git ls-files mYdIR shows nothing. With this patch running git ls-files for mydir, MyDir, and mYdIR shows mydir/filea.txt and MyDir/fileb.txt. Wildcards are not handled case insensitively in this patch. Example: MyDir/aBc/file.txt is added. git ls-files MyDir/a* works fine, but git ls-files mydir/a* does not. Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-03 11:56:44 +02:00			`break;`
			`if (c1 != c2)`
			`return 0;`
			`match++;`
			`name++;`
			`namelen--;`
			`}`
Optimize match_pathspec() to avoid fnmatch() "git add " is actually fundamentally different from "git add .", and yeah, you should generally use the latter. The reason? The argument list is actually something different from what you think it is. For git, it's a "pathspec", so what actualy happens is that in both* cases, it will really traverse the whole tree, and then match every file it finds against the pathspec. So think of the arguments not as a file list, but as a random bunch of patterns to match against the files you have! Which is why the cost is actually approximately O(nm), where "n" is the size of the working tree, and "m" is the number of pathspecs. So the reason "git add ." is fast is actually that "m" in that case is just 1 (just one trivial pattern), and then "git add " is slow because "m" is large (lots of complicated patterns). In both cases, 'n' is the same (== the whole set of files in your working tree). Anyway, here's a trivial patch that doesn't change this fundamental fact, but that avoids doing anything expensive until we've done some cheap initial tests. It may or may not help your test-case, but it's pretty simple and it matches the other git optimizations in this area (ie "conceptually handle the general case, but optimize the simple cases where we can exit early") Notice how this patch doesn' actually change the fundamental O(n^2) behaviour, but it makes it much cheaper by generally avoiding the expensive 'fnmatch' and 'strlen/strncmp' when they are obviously not needed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-19 23:22:38 +02:00			`}`

Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00			`/*`
			`* If we don't match the matchstring exactly,`
			`* we need to match by fnmatch`
			`*/`
Optimize match_pathspec() to avoid fnmatch() "git add " is actually fundamentally different from "git add .", and yeah, you should generally use the latter. The reason? The argument list is actually something different from what you think it is. For git, it's a "pathspec", so what actualy happens is that in both* cases, it will really traverse the whole tree, and then match every file it finds against the pathspec. So think of the arguments not as a file list, but as a random bunch of patterns to match against the files you have! Which is why the cost is actually approximately O(nm), where "n" is the size of the working tree, and "m" is the number of pathspecs. So the reason "git add ." is fast is actually that "m" in that case is just 1 (just one trivial pattern), and then "git add " is slow because "m" is large (lots of complicated patterns). In both cases, 'n' is the same (== the whole set of files in your working tree). Anyway, here's a trivial patch that doesn't change this fundamental fact, but that avoids doing anything expensive until we've done some cheap initial tests. It may or may not help your test-case, but it's pretty simple and it matches the other git optimizations in this area (ie "conceptually handle the general case, but optimize the simple cases where we can exit early") Notice how this patch doesn' actually change the fundamental O(n^2) behaviour, but it makes it much cheaper by generally avoiding the expensive 'fnmatch' and 'strlen/strncmp' when they are obviously not needed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-19 23:22:38 +02:00			`matchlen = strlen(match);`
add global --literal-pathspecs option Git takes pathspec arguments in many places to limit the scope of an operation. These pathspecs are treated not as literal paths, but as glob patterns that can be fed to fnmatch. When a user is giving a specific pattern, this is a nice feature. However, when programatically providing pathspecs, it can be a nuisance. For example, to find the latest revision which modified "$foo", one can use "git rev-list -- $foo". But if "$foo" contains glob characters (e.g., "f"), it will erroneously match more entries than desired. The caller needs to quote the characters in $foo, and even then, the results may not be exactly the same as with a literal pathspec. For instance, the depth checks in match_pathspec_depth do not kick in if we match via fnmatch. This patch introduces a global command-line option (i.e., one for "git" itself, not for specific commands) to turn this behavior off. It also has a matching environment variable, which can make it easier if you are a script or porcelain interface that is going to issue many such commands. This option cannot turn off globbing for particular pathspecs. That could eventually be done with a ":(noglob)" magic pathspec prefix. However, that level of granularity is more cumbersome to use for many cases, and doing ":(noglob)" right would mean converting the whole codebase to use "struct pathspec", as the usual "const char *pathspec" cannot represent extra per-item flags. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-19 23:37:30 +01:00			`if (strncmp_icase(match, name, matchlen)) {`
			`if (literal)`
			`return 0;`
Add case insensitivity support when using git ls-files When mydir/filea.txt is added, mydir/ is renamed to MyDir/, and MyDir/fileb.txt is added, running git ls-files mydir only shows mydir/filea.txt. Running git ls-files MyDir shows MyDir/fileb.txt. Running git ls-files mYdIR shows nothing. With this patch running git ls-files for mydir, MyDir, and mYdIR shows mydir/filea.txt and MyDir/fileb.txt. Wildcards are not handled case insensitively in this patch. Example: MyDir/aBc/file.txt is added. git ls-files MyDir/a* works fine, but git ls-files mydir/a* does not. Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-03 11:56:44 +02:00			`return !fnmatch_icase(match, name, 0) ? MATCHED_FNMATCH : 0;`
add global --literal-pathspecs option Git takes pathspec arguments in many places to limit the scope of an operation. These pathspecs are treated not as literal paths, but as glob patterns that can be fed to fnmatch. When a user is giving a specific pattern, this is a nice feature. However, when programatically providing pathspecs, it can be a nuisance. For example, to find the latest revision which modified "$foo", one can use "git rev-list -- $foo". But if "$foo" contains glob characters (e.g., "f"), it will erroneously match more entries than desired. The caller needs to quote the characters in $foo, and even then, the results may not be exactly the same as with a literal pathspec. For instance, the depth checks in match_pathspec_depth do not kick in if we match via fnmatch. This patch introduces a global command-line option (i.e., one for "git" itself, not for specific commands) to turn this behavior off. It also has a matching environment variable, which can make it easier if you are a script or porcelain interface that is going to issue many such commands. This option cannot turn off globbing for particular pathspecs. That could eventually be done with a ":(noglob)" magic pathspec prefix. However, that level of granularity is more cumbersome to use for many cases, and doing ":(noglob)" right would mean converting the whole codebase to use "struct pathspec", as the usual "const char *pathspec" cannot represent extra per-item flags. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-19 23:37:30 +01:00			`}`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00
git clean: Don't automatically remove directories when run within subdirectory When git clean is run from a subdirectory it should follow the normal policy and only remove directories if they are passed in as a pathspec, or -d is specified. The fix is to send len which could be shorter than ent->len because we have stripped the trailing '/' that read_directory adds. Additionaly match_one() was modified to allow a name[] that is not NUL terminated. This allows us to check if the name matched the pathspec exactly instead of recursively. Signed-off-by: Shawn Bohrer <shawn.bohrer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-04-15 05:14:09 +02:00			`if (namelen == matchlen)`
match_pathspec() -- return how well the spec matched This updates the return value from match_pathspec() so that the caller can tell cases between exact match, leading pathname match (i.e. file "foo/bar" matches a pathspec "foo"), or filename glob match. This can be used to prevent "rm dir" from removing "dir/file" without explicitly asking for recursive behaviour with -r flag, for example. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-25 12:09:52 +01:00			`return MATCHED_EXACTLY;`
			`if (match[matchlen-1] == '/' \|\| name[matchlen] == '/')`
			`return MATCHED_RECURSIVELY;`
			`return 0;`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00			`}`

match_pathspec() -- return how well the spec matched This updates the return value from match_pathspec() so that the caller can tell cases between exact match, leading pathname match (i.e. file "foo/bar" matches a pathspec "foo"), or filename glob match. This can be used to prevent "rm dir" from removing "dir/file" without explicitly asking for recursive behaviour with -r flag, for example. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-25 12:09:52 +01:00			`/*`
dir.c: improve docs for match_pathspec() and match_pathspec_depth() Fix a grammatical issue in the description of these functions, and make it more obvious how and why seen[] can be reused across multiple invocations. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:06 +01:00			`* Given a name and a list of pathspecs, returns the nature of the`
			`* closest (i.e. most specific) match of the name to any of the`
			`* pathspecs.`
			`*`
			`* The caller typically calls this multiple times with the same`
			`* pathspec and seen[] array but with different name/namelen`
			`* (e.g. entries from the index) and is interested in seeing if and`
			`* how each pathspec matches all the names it calls this function`
			`* with. A mark is left in the seen[] array for each pathspec element`
			`* indicating the closest type of match that element achieved, so if`
			`* seen[n] remains zero after multiple invocations, that means the nth`
			`* pathspec did not match any names, which could indicate that the`
			`* user mistyped the nth pathspec.`
match_pathspec() -- return how well the spec matched This updates the return value from match_pathspec() so that the caller can tell cases between exact match, leading pathname match (i.e. file "foo/bar" matches a pathspec "foo"), or filename glob match. This can be used to prevent "rm dir" from removing "dir/file" without explicitly asking for recursive behaviour with -r flag, for example. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-25 12:09:52 +01:00			`*/`
remove pathspec_match, use match_pathspec instead Both versions have the same functionality. This removes any redundancy. This also adds makes two extensions to match_pathspec: - If pathspec is NULL, return 1. This reflects the behavior of git commands, for which no paths usually means "match all paths". - If seen is NULL, do not use it. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-14 15:54:35 +01:00			`int match_pathspec(const char *pathspec, const char name, int namelen,`
			`int prefix, char *seen)`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00			`{`
remove pathspec_match, use match_pathspec instead Both versions have the same functionality. This removes any redundancy. This also adds makes two extensions to match_pathspec: - If pathspec is NULL, return 1. This reflects the behavior of git commands, for which no paths usually means "match all paths". - If seen is NULL, do not use it. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-14 15:54:35 +01:00			`int i, retval = 0;`

			`if (!pathspec)`
			`return 1;`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00
			`name += prefix;`
			`namelen -= prefix;`

remove pathspec_match, use match_pathspec instead Both versions have the same functionality. This removes any redundancy. This also adds makes two extensions to match_pathspec: - If pathspec is NULL, return 1. This reflects the behavior of git commands, for which no paths usually means "match all paths". - If seen is NULL, do not use it. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-14 15:54:35 +01:00			`for (i = 0; pathspec[i] != NULL; i++) {`
match_pathspec() -- return how well the spec matched This updates the return value from match_pathspec() so that the caller can tell cases between exact match, leading pathname match (i.e. file "foo/bar" matches a pathspec "foo"), or filename glob match. This can be used to prevent "rm dir" from removing "dir/file" without explicitly asking for recursive behaviour with -r flag, for example. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-25 12:09:52 +01:00			`int how;`
remove pathspec_match, use match_pathspec instead Both versions have the same functionality. This removes any redundancy. This also adds makes two extensions to match_pathspec: - If pathspec is NULL, return 1. This reflects the behavior of git commands, for which no paths usually means "match all paths". - If seen is NULL, do not use it. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-14 15:54:35 +01:00			`const char *match = pathspec[i] + prefix;`
			`if (seen && seen[i] == MATCHED_EXACTLY)`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00			`continue;`
match_pathspec() -- return how well the spec matched This updates the return value from match_pathspec() so that the caller can tell cases between exact match, leading pathname match (i.e. file "foo/bar" matches a pathspec "foo"), or filename glob match. This can be used to prevent "rm dir" from removing "dir/file" without explicitly asking for recursive behaviour with -r flag, for example. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-25 12:09:52 +01:00			`how = match_one(match, name, namelen);`
			`if (how) {`
			`if (retval < how)`
			`retval = how;`
remove pathspec_match, use match_pathspec instead Both versions have the same functionality. This removes any redundancy. This also adds makes two extensions to match_pathspec: - If pathspec is NULL, return 1. This reflects the behavior of git commands, for which no paths usually means "match all paths". - If seen is NULL, do not use it. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-14 15:54:35 +01:00			`if (seen && seen[i] < how)`
			`seen[i] = how;`
Move pathspec matching from builtin-add.c into dir.c I'll use it for builtin-rm.c too. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-20 01:07:51 +02:00			`}`
			`}`
			`return retval;`
			`}`

pathspec: add match_pathspec_depth() match_pathspec_depth() is a clone of match_pathspec() except that it can take depth limit. Computation is a bit lighter compared to match_pathspec() because it's usually precomputed and stored in struct pathspec. In long term, match_pathspec() and match_one() should be removed in favor of this function. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:48 +01:00			`/*`
			`* Does 'match' match the given name?`
			`* A match is found if`
			`*`
			`* (1) the 'match' string is leading directory of 'name', or`
			`* (2) the 'match' string is a wildcard and matches 'name', or`
			`* (3) the 'match' string is exactly the same as 'name'.`
			`*`
			`* and the return value tells which case it was.`
			`*`
			`* It returns 0 when there is no match.`
			`*/`
			`static int match_pathspec_item(const struct pathspec_item *item, int prefix,`
			`const char *name, int namelen)`
			`{`
			`/* name/namelen has prefix cut off by caller */`
			`const char *match = item->match + prefix;`
			`int matchlen = item->len - prefix;`

			`/* If the match was just the prefix, we matched */`
			`if (!*match)`
			`return MATCHED_RECURSIVELY;`

			`if (matchlen <= namelen && !strncmp(match, name, matchlen)) {`
			`if (matchlen == namelen)`
			`return MATCHED_EXACTLY;`

			`if (match[matchlen-1] == '/' \|\| name[matchlen] == '/')`
			`return MATCHED_RECURSIVELY;`
			`}`

pathspec: do exact comparison on the leading non-wildcard part Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-11-24 05:33:49 +01:00			`if (item->nowildcard_len < item->len &&`
pathspec: apply ".c" optimization from exclude When a pattern contains only a single asterisk as wildcard, e.g. "foobar", after literally comparing the leading part "foo" with the string, we can compare the tail of the string and make sure it matches "bar", instead of running fnmatch() on "bar" against the remainder of the string. -O2 build on linux-2.6, without the patch: $ time git rev-list --quiet HEAD -- '.c' real 0m40.770s user 0m40.290s sys 0m0.256s With the patch $ time ~/w/git/git rev-list --quiet HEAD -- '.c' real 0m34.288s user 0m33.997s sys 0m0.205s The above command is not supposed to be widely popular. It's chosen because it exercises pathspec matching a lot. The point is it cuts down matching time for popular patterns like .c, which could be used as pathspec in other places. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-11-24 05:33:50 +01:00			`!git_fnmatch(match, name,`
			`item->flags & PATHSPEC_ONESTAR ? GFNM_ONESTAR : 0,`
			`item->nowildcard_len - prefix))`
pathspec: add match_pathspec_depth() match_pathspec_depth() is a clone of match_pathspec() except that it can take depth limit. Computation is a bit lighter compared to match_pathspec() because it's usually precomputed and stored in struct pathspec. In long term, match_pathspec() and match_one() should be removed in favor of this function. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:48 +01:00			`return MATCHED_FNMATCH;`

			`return 0;`
			`}`

			`/*`
dir.c: improve docs for match_pathspec() and match_pathspec_depth() Fix a grammatical issue in the description of these functions, and make it more obvious how and why seen[] can be reused across multiple invocations. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:06 +01:00			`* Given a name and a list of pathspecs, returns the nature of the`
			`* closest (i.e. most specific) match of the name to any of the`
			`* pathspecs.`
			`*`
			`* The caller typically calls this multiple times with the same`
			`* pathspec and seen[] array but with different name/namelen`
			`* (e.g. entries from the index) and is interested in seeing if and`
			`* how each pathspec matches all the names it calls this function`
			`* with. A mark is left in the seen[] array for each pathspec element`
			`* indicating the closest type of match that element achieved, so if`
			`* seen[n] remains zero after multiple invocations, that means the nth`
			`* pathspec did not match any names, which could indicate that the`
			`* user mistyped the nth pathspec.`
pathspec: add match_pathspec_depth() match_pathspec_depth() is a clone of match_pathspec() except that it can take depth limit. Computation is a bit lighter compared to match_pathspec() because it's usually precomputed and stored in struct pathspec. In long term, match_pathspec() and match_one() should be removed in favor of this function. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:48 +01:00			`*/`
			`int match_pathspec_depth(const struct pathspec *ps,`
			`const char *name, int namelen,`
			`int prefix, char *seen)`
			`{`
			`int i, retval = 0;`

			`if (!ps->nr) {`
			`if (!ps->recursive \|\| ps->max_depth == -1)`
			`return MATCHED_RECURSIVELY;`

			`if (within_depth(name, namelen, 0, ps->max_depth))`
			`return MATCHED_EXACTLY;`
			`else`
			`return 0;`
			`}`

			`name += prefix;`
			`namelen -= prefix;`

			`for (i = ps->nr - 1; i >= 0; i--) {`
			`int how;`
			`if (seen && seen[i] == MATCHED_EXACTLY)`
			`continue;`
			`how = match_pathspec_item(ps->items+i, prefix, name, namelen);`
			`if (ps->recursive && ps->max_depth != -1 &&`
			`how && how != MATCHED_FNMATCH) {`
			`int len = ps->items[i].len;`
			`if (name[len] == '/')`
			`len++;`
			`if (within_depth(name+len, namelen-len, 0, ps->max_depth))`
			`how = MATCHED_EXACTLY;`
			`else`
			`how = 0;`
			`}`
			`if (how) {`
			`if (retval < how)`
			`retval = how;`
			`if (seen && seen[i] < how)`
			`seen[i] = how;`
			`}`
			`}`
			`return retval;`
			`}`

dir.c: get rid of the wildcard symbol set in no_wildcard() Elsewhere in this file is_glob_special() is also used to check for wildcards, which is defined in ctype. Make no_wildcard() also use this function (indirectly via simple_length()) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-06-07 09:53:35 +02:00			`/*`
			`* Return the length of the "simple" part of a path match limiter.`
			`*/`
			`static int simple_length(const char *match)`
			`{`
			`int len = -1;`

			`for (;;) {`
			`unsigned char c = *match++;`
			`len++;`
			`if (c == '\0' \|\| is_glob_special(c))`
			`return len;`
			`}`
			`}`

Speedup scanning for excluded files. Try to avoid a lot of work scanning for excluded files, by caching some more information when setting up the exclusion data structure. Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and reduces the amount of instructions executed (as measured by valgrind) by a factor of 2. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-28 21:27:13 +01:00			`static int no_wildcard(const char *string)`
			`{`
dir.c: get rid of the wildcard symbol set in no_wildcard() Elsewhere in this file is_glob_special() is also used to check for wildcards, which is defined in ctype. Make no_wildcard() also use this function (indirectly via simple_length()) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-06-07 09:53:35 +02:00			`return string[simple_length(string)] == '\0';`
Speedup scanning for excluded files. Try to avoid a lot of work scanning for excluded files, by caching some more information when setting up the exclusion data structure. Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and reduces the amount of instructions executed (as measured by valgrind) by a factor of 2. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-28 21:27:13 +01:00			`}`

attr: more matching optimizations from .gitignore .gitattributes and .gitignore share the same pattern syntax but has separate matching implementation. Over the years, ignore's implementation accumulates more optimizations while attr's stays the same. This patch reuses the core matching functions that are also used by excluded_from_list. excluded_from_list and path_matches can't be merged due to differences in exclude and attr, for example: * "!pattern" syntax is forbidden in .gitattributes. As an attribute can be unset (i.e. set to a special value "false") or made back to unspecified (i.e. not even set to "false"), "!pattern attr" is unclear which one it means. * we support attaching attributes to directories, but git-core internally does not currently make use of attributes on directories. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:39 +02:00			`void parse_exclude_pattern(const char **pattern,`
			`int *patternlen,`
			`int *flags,`
			`int *nowildcardlen)`
gitignore: make pattern parsing code a separate function This function can later be reused by attr.c. Also turn to_exclude field into a flag. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:38 +02:00			`{`
			`const char p = pattern;`
			`size_t i, len;`

			`*flags = 0;`
			`if (*p == '!') {`
			`*flags \|= EXC_FLAG_NEGATIVE;`
			`p++;`
			`}`
			`len = strlen(p);`
			`if (len && p[len - 1] == '/') {`
			`len--;`
			`*flags \|= EXC_FLAG_MUSTBEDIR;`
			`}`
			`for (i = 0; i < len; i++) {`
			`if (p[i] == '/')`
			`break;`
			`}`
			`if (i == len)`
			`*flags \|= EXC_FLAG_NODIR;`
			`*nowildcardlen = simple_length(p);`
			`/*`
			`* we should have excluded the trailing slash from 'p' too,`
			`* but that's one more allocation. Instead just make sure`
			`* nowildcardlen does not exceed real patternlen`
			`*/`
			`if (*nowildcardlen > len)`
			`*nowildcardlen = len;`
			`if (p == '' && no_wildcard(p + 1))`
			`*flags \|= EXC_FLAG_ENDSWITH;`
			`*pattern = p;`
			`*patternlen = len;`
			`}`

libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`void add_exclude(const char string, const char base,`
dir.c: keep track of where patterns came from For exclude patterns read in from files, the filename is stored in the exclude list, and the originating line number is stored in the individual exclude (counting starting at 1). For exclude patterns provided on the command line, a string describing the source of the patterns is stored in the exclude list, and the sequence number assigned to each exclude pattern is negative, with counting starting at -1. So for example the 2nd pattern provided via --exclude would be numbered -2. This allows any future consumers of that data to easily distinguish between exclude patterns from files vs. from the CLI. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:04 +01:00			`int baselen, struct exclude_list *el, int srcpos)`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`{`
gitignore(5): Allow "foo/" in ignore list to match directory "foo" A pattern "foo/" in the exclude list did not match directory "foo", but a pattern "foo" did. This attempts to extend the exclude mechanism so that it would while not matching a regular file or a symbolic link "foo". In order to differentiate a directory and non directory, this passes down the type of path being checked to excluded() function. A downside is that the recursive directory walk may need to run lstat(2) more often on systems whose "struct dirent" do not give the type of the entry; earlier it did not have to do so for an excluded path, but we now need to figure out if a path is a directory before deciding to exclude it. This is especially bad because an idea similar to the earlier CE_UPTODATE optimization to reduce number of lstat(2) calls would by definition not apply to the codepaths involved, as (1) directories will not be registered in the index, and (2) excluded paths will not be in the index anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-31 10:17:48 +01:00			`struct exclude *x;`
gitignore: make pattern parsing code a separate function This function can later be reused by attr.c. Also turn to_exclude field into a flag. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:38 +02:00			`int patternlen;`
			`int flags;`
			`int nowildcardlen;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00
gitignore: make pattern parsing code a separate function This function can later be reused by attr.c. Also turn to_exclude field into a flag. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:38 +02:00			`parse_exclude_pattern(&string, &patternlen, &flags, &nowildcardlen);`
			`if (flags & EXC_FLAG_MUSTBEDIR) {`
gitignore(5): Allow "foo/" in ignore list to match directory "foo" A pattern "foo/" in the exclude list did not match directory "foo", but a pattern "foo" did. This attempts to extend the exclude mechanism so that it would while not matching a regular file or a symbolic link "foo". In order to differentiate a directory and non directory, this passes down the type of path being checked to excluded() function. A downside is that the recursive directory walk may need to run lstat(2) more often on systems whose "struct dirent" do not give the type of the entry; earlier it did not have to do so for an excluded path, but we now need to figure out if a path is a directory before deciding to exclude it. This is especially bad because an idea similar to the earlier CE_UPTODATE optimization to reduce number of lstat(2) calls would by definition not apply to the codepaths involved, as (1) directories will not be registered in the index, and (2) excluded paths will not be in the index anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-31 10:17:48 +01:00			`char *s;`
gitignore: make pattern parsing code a separate function This function can later be reused by attr.c. Also turn to_exclude field into a flag. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:38 +02:00			`x = xmalloc(sizeof(*x) + patternlen + 1);`
Fix a bunch of pointer declarations (codestyle) Essentially; s/type* /type */ as per the coding guidelines. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-01 11:06:36 +02:00			`s = (char *)(x+1);`
gitignore: make pattern parsing code a separate function This function can later be reused by attr.c. Also turn to_exclude field into a flag. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:38 +02:00			`memcpy(s, string, patternlen);`
			`s[patternlen] = '\0';`
gitignore(5): Allow "foo/" in ignore list to match directory "foo" A pattern "foo/" in the exclude list did not match directory "foo", but a pattern "foo" did. This attempts to extend the exclude mechanism so that it would while not matching a regular file or a symbolic link "foo". In order to differentiate a directory and non directory, this passes down the type of path being checked to excluded() function. A downside is that the recursive directory walk may need to run lstat(2) more often on systems whose "struct dirent" do not give the type of the entry; earlier it did not have to do so for an excluded path, but we now need to figure out if a path is a directory before deciding to exclude it. This is especially bad because an idea similar to the earlier CE_UPTODATE optimization to reduce number of lstat(2) calls would by definition not apply to the codepaths involved, as (1) directories will not be registered in the index, and (2) excluded paths will not be in the index anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-31 10:17:48 +01:00			`x->pattern = s;`
			`} else {`
			`x = xmalloc(sizeof(*x));`
			`x->pattern = string;`
			`}`
gitignore: make pattern parsing code a separate function This function can later be reused by attr.c. Also turn to_exclude field into a flag. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:38 +02:00			`x->patternlen = patternlen;`
			`x->nowildcardlen = nowildcardlen;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`x->base = base;`
			`x->baselen = baselen;`
gitignore(5): Allow "foo/" in ignore list to match directory "foo" A pattern "foo/" in the exclude list did not match directory "foo", but a pattern "foo" did. This attempts to extend the exclude mechanism so that it would while not matching a regular file or a symbolic link "foo". In order to differentiate a directory and non directory, this passes down the type of path being checked to excluded() function. A downside is that the recursive directory walk may need to run lstat(2) more often on systems whose "struct dirent" do not give the type of the entry; earlier it did not have to do so for an excluded path, but we now need to figure out if a path is a directory before deciding to exclude it. This is especially bad because an idea similar to the earlier CE_UPTODATE optimization to reduce number of lstat(2) calls would by definition not apply to the codepaths involved, as (1) directories will not be registered in the index, and (2) excluded paths will not be in the index anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-31 10:17:48 +01:00			`x->flags = flags;`
dir.c: keep track of where patterns came from For exclude patterns read in from files, the filename is stored in the exclude list, and the originating line number is stored in the individual exclude (counting starting at 1). For exclude patterns provided on the command line, a string describing the source of the patterns is stored in the exclude list, and the sequence number assigned to each exclude pattern is negative, with counting starting at -1. So for example the 2nd pattern provided via --exclude would be numbered -2. This allows any future consumers of that data to easily distinguish between exclude patterns from files vs. from the CLI. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:04 +01:00			`x->srcpos = srcpos;`
dir.c: rename cryptic 'which' variable to more consistent name 'el' is only slightly less cryptic, but is already used as the variable name for a struct exclude_list pointer in numerous other places, so this reduces the number of cryptic variable names in use by one :-) Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:22 +01:00			`ALLOC_GROW(el->excludes, el->nr + 1, el->alloc);`
			`el->excludes[el->nr++] = x;`
dir.c: keep track of where patterns came from For exclude patterns read in from files, the filename is stored in the exclude list, and the originating line number is stored in the individual exclude (counting starting at 1). For exclude patterns provided on the command line, a string describing the source of the patterns is stored in the exclude list, and the sequence number assigned to each exclude pattern is negative, with counting starting at -1. So for example the 2nd pattern provided via --exclude would be numbered -2. This allows any future consumers of that data to easily distinguish between exclude patterns from files vs. from the CLI. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:04 +01:00			`x->el = el;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`

Read .gitignore from index if it is skip-worktree This adds index as a prerequisite for directory listing (with exclude). At the moment directory listing is used by "git clean", "git add", "git ls-files" and "git status"/"git commit" and unpack_trees()-related commands. These commands have been checked/modified to populate index before doing directory listing. add_excludes_from_file() does not enable this feature, because it is used to read .git/info/exclude and some explicit files specified by "git ls-files". Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-08-20 15:47:01 +02:00			`static void read_skip_worktree_file_from_index(const char path, size_t *size)`
			`{`
			`int pos, len;`
			`unsigned long sz;`
			`enum object_type type;`
			`void *data;`
			`struct index_state *istate = &the_index;`

			`len = strlen(path);`
			`pos = index_name_pos(istate, path, len);`
			`if (pos < 0)`
			`return NULL;`
			`if (!ce_skip_worktree(istate->cache[pos]))`
			`return NULL;`
			`data = read_sha1_file(istate->cache[pos]->sha1, &type, &sz);`
			`if (!data \|\| type != OBJ_BLOB) {`
			`free(data);`
			`return NULL;`
			`}`
			`*size = xsize_t(sz);`
			`return data;`
			`}`

dir.c: rename free_excludes() to clear_exclude_list() It is clearer to use a 'clear_' prefix for functions which empty and deallocate the contents of a data structure without freeing the structure itself, and a 'free_' prefix for functions which also free the structure itself. http://article.gmane.org/gmane.comp.version-control.git/206128 Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:29 +01:00			`/*`
			`* Frees memory within el which was allocated for exclude patterns and`
			`* the file buffer. Does not free el itself.`
			`*/`
			`void clear_exclude_list(struct exclude_list *el)`
dir.c: add free_excludes() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-26 19:17:44 +01:00			`{`
			`int i;`

			`for (i = 0; i < el->nr; i++)`
			`free(el->excludes[i]);`
			`free(el->excludes);`
dir.c: use a single struct exclude_list per source of excludes Previously each exclude_list could potentially contain patterns from multiple sources. For example dir->exclude_list[EXC_FILE] would typically contain patterns from .git/info/exclude and core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain patterns from multiple per-directory .gitignore files during directory traversal (i.e. when dir->exclude_stack was more than one item deep). We split these composite exclude_lists up into three groups of exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each exclude_list now contains patterns from a single source. This will allow us to cleanly track the origin of each pattern simply by adding a src field to struct exclude_list, rather than to struct exclude, which would make memory management of the source string tricky in the EXC_DIRS case where its contents are dynamically generated. Similarly, by moving the filebuf member from struct exclude_stack to struct exclude_list, it allows us to track and subsequently free memory buffers allocated during the parsing of all exclude files, rather than only tracking buffers allocated for files in the EXC_DIRS group. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:03 +01:00			`free(el->filebuf);`
dir.c: add free_excludes() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-26 19:17:44 +01:00
			`el->nr = 0;`
			`el->excludes = NULL;`
dir.c: use a single struct exclude_list per source of excludes Previously each exclude_list could potentially contain patterns from multiple sources. For example dir->exclude_list[EXC_FILE] would typically contain patterns from .git/info/exclude and core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain patterns from multiple per-directory .gitignore files during directory traversal (i.e. when dir->exclude_stack was more than one item deep). We split these composite exclude_lists up into three groups of exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each exclude_list now contains patterns from a single source. This will allow us to cleanly track the origin of each pattern simply by adding a src field to struct exclude_list, rather than to struct exclude, which would make memory management of the source string tricky in the EXC_DIRS case where its contents are dynamically generated. Similarly, by moving the filebuf member from struct exclude_stack to struct exclude_list, it allows us to track and subsequently free memory buffers allocated during the parsing of all exclude files, rather than only tracking buffers allocated for files in the EXC_DIRS group. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:03 +01:00			`el->filebuf = NULL;`
dir.c: add free_excludes() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-26 19:17:44 +01:00			`}`

dir.c: export excluded_1() and add_excludes_from_file_1() These functions are used to handle .gitignore. They are now exported so that sparse checkout can reuse. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-08-20 15:47:04 +02:00			`int add_excludes_from_file_to_list(const char *fname,`
			`const char *base,`
			`int baselen,`
dir.c: rename cryptic 'which' variable to more consistent name 'el' is only slightly less cryptic, but is already used as the variable name for a struct exclude_list pointer in numerous other places, so this reduces the number of cryptic variable names in use by one :-) Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:22 +01:00			`struct exclude_list *el,`
dir.c: export excluded_1() and add_excludes_from_file_1() These functions are used to handle .gitignore. They are now exported so that sparse checkout can reuse. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-08-20 15:47:04 +02:00			`int check_index)`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`{`
Use fstat instead of fseek Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-08-28 01:55:46 +02:00			`struct stat st;`
dir.c: keep track of where patterns came from For exclude patterns read in from files, the filename is stored in the exclude list, and the originating line number is stored in the individual exclude (counting starting at 1). For exclude patterns provided on the command line, a string describing the source of the patterns is stored in the exclude list, and the sequence number assigned to each exclude pattern is negative, with counting starting at -1. So for example the 2nd pattern provided via --exclude would be numbered -2. This allows any future consumers of that data to easily distinguish between exclude patterns from files vs. from the CLI. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:04 +01:00			`int fd, i, lineno = 1;`
dir.c: squelch false uninitialized memory warning GCC 4.4.4 on MacOS incorrectly warns about potential use of uninitialized memory. Signed-off-by: Pat Notz <patnotz@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-09-16 22:53:22 +02:00			`size_t size = 0;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`char buf, entry;`

			`fd = open(fname, O_RDONLY);`
Read .gitignore from index if it is skip-worktree This adds index as a prerequisite for directory listing (with exclude). At the moment directory listing is used by "git clean", "git add", "git ls-files" and "git status"/"git commit" and unpack_trees()-related commands. These commands have been checked/modified to populate index before doing directory listing. add_excludes_from_file() does not enable this feature, because it is used to read .git/info/exclude and some explicit files specified by "git ls-files". Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-08-20 15:47:01 +02:00			`if (fd < 0 \|\| fstat(fd, &st) < 0) {`
gitignore: report access errors of exclude files When we try to access gitignore files, we check for their existence with a call to "access". We silently ignore missing files. However, if a file is not readable, this may be a configuration error; let's warn the user. For $GIT_DIR/info/excludes or core.excludesfile, we can just use access_or_warn. However, for per-directory files we actually try to open them, so we must add a custom warning. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-21 08:26:07 +02:00			`if (errno != ENOENT)`
warn_on_inaccessible(): a helper to warn on inaccessible paths The previous series introduced warnings to multiple places, but it could become tiring to see the warning on the same path over and over again during a single run of Git. Making just one function responsible for issuing this warning, we could later choose to keep track of which paths we issued a warning (it would involve a hash table of paths after running them through real_path() or something) in order to reduce noise. Right now we do not know if the noise reduction is necessary, but it still would be a good code reduction/sharing anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-21 23:52:07 +02:00			`warn_on_inaccessible(fname);`
Read .gitignore from index if it is skip-worktree This adds index as a prerequisite for directory listing (with exclude). At the moment directory listing is used by "git clean", "git add", "git ls-files" and "git status"/"git commit" and unpack_trees()-related commands. These commands have been checked/modified to populate index before doing directory listing. add_excludes_from_file() does not enable this feature, because it is used to read .git/info/exclude and some explicit files specified by "git ls-files". Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-08-20 15:47:01 +02:00			`if (0 <= fd)`
			`close(fd);`
			`if (!check_index \|\|`
			`(buf = read_skip_worktree_file_from_index(fname, &size)) == NULL)`
			`return -1;`
Fix memory corruption when .gitignore does not end by \n Commit b5041c5 (Avoid writing to buffer in add_excludes_from_file_1()) tried not to append '\n' at the end because the next commit may return a buffer that does not have extra space for that. Unfortunately it left this assignment in the loop: buf[i - (i && buf[i-1] == '\r')] = 0; that can corrupt memory if "buf" is not '\n' terminated. But even if it does not corrupt memory, the last line would not be NULL-terminated, leading to errors later inside add_exclude(). This patch fixes it by reverting the faulty commit and make sure "buf" is always \n terminated. While at it, free unused memory properly. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-20 15:09:16 +01:00			`if (size == 0) {`
			`free(buf);`
			`return 0;`
			`}`
			`if (buf[size-1] != '\n') {`
			`buf = xrealloc(buf, size+1);`
			`buf[size++] = '\n';`
			`}`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`
Read .gitignore from index if it is skip-worktree This adds index as a prerequisite for directory listing (with exclude). At the moment directory listing is used by "git clean", "git add", "git ls-files" and "git status"/"git commit" and unpack_trees()-related commands. These commands have been checked/modified to populate index before doing directory listing. add_excludes_from_file() does not enable this feature, because it is used to read .git/info/exclude and some explicit files specified by "git ls-files". Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-08-20 15:47:01 +02:00			`else {`
			`size = xsize_t(st.st_size);`
			`if (size == 0) {`
			`close(fd);`
			`return 0;`
			`}`
Fix memory corruption when .gitignore does not end by \n Commit b5041c5 (Avoid writing to buffer in add_excludes_from_file_1()) tried not to append '\n' at the end because the next commit may return a buffer that does not have extra space for that. Unfortunately it left this assignment in the loop: buf[i - (i && buf[i-1] == '\r')] = 0; that can corrupt memory if "buf" is not '\n' terminated. But even if it does not corrupt memory, the last line would not be NULL-terminated, leading to errors later inside add_exclude(). This patch fixes it by reverting the faulty commit and make sure "buf" is always \n terminated. While at it, free unused memory properly. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-20 15:09:16 +01:00			`buf = xmalloc(size+1);`
Read .gitignore from index if it is skip-worktree This adds index as a prerequisite for directory listing (with exclude). At the moment directory listing is used by "git clean", "git add", "git ls-files" and "git status"/"git commit" and unpack_trees()-related commands. These commands have been checked/modified to populate index before doing directory listing. add_excludes_from_file() does not enable this feature, because it is used to read .git/info/exclude and some explicit files specified by "git ls-files". Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-08-20 15:47:01 +02:00			`if (read_in_full(fd, buf, size) != size) {`
Fix memory corruption when .gitignore does not end by \n Commit b5041c5 (Avoid writing to buffer in add_excludes_from_file_1()) tried not to append '\n' at the end because the next commit may return a buffer that does not have extra space for that. Unfortunately it left this assignment in the loop: buf[i - (i && buf[i-1] == '\r')] = 0; that can corrupt memory if "buf" is not '\n' terminated. But even if it does not corrupt memory, the last line would not be NULL-terminated, leading to errors later inside add_exclude(). This patch fixes it by reverting the faulty commit and make sure "buf" is always \n terminated. While at it, free unused memory properly. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-20 15:09:16 +01:00			`free(buf);`
Read .gitignore from index if it is skip-worktree This adds index as a prerequisite for directory listing (with exclude). At the moment directory listing is used by "git clean", "git add", "git ls-files" and "git status"/"git commit" and unpack_trees()-related commands. These commands have been checked/modified to populate index before doing directory listing. add_excludes_from_file() does not enable this feature, because it is used to read .git/info/exclude and some explicit files specified by "git ls-files". Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-08-20 15:47:01 +02:00			`close(fd);`
			`return -1;`
			`}`
Fix memory corruption when .gitignore does not end by \n Commit b5041c5 (Avoid writing to buffer in add_excludes_from_file_1()) tried not to append '\n' at the end because the next commit may return a buffer that does not have extra space for that. Unfortunately it left this assignment in the loop: buf[i - (i && buf[i-1] == '\r')] = 0; that can corrupt memory if "buf" is not '\n' terminated. But even if it does not corrupt memory, the last line would not be NULL-terminated, leading to errors later inside add_exclude(). This patch fixes it by reverting the faulty commit and make sure "buf" is always \n terminated. While at it, free unused memory properly. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-20 15:09:16 +01:00			`buf[size++] = '\n';`
Read .gitignore from index if it is skip-worktree This adds index as a prerequisite for directory listing (with exclude). At the moment directory listing is used by "git clean", "git add", "git ls-files" and "git status"/"git commit" and unpack_trees()-related commands. These commands have been checked/modified to populate index before doing directory listing. add_excludes_from_file() does not enable this feature, because it is used to read .git/info/exclude and some explicit files specified by "git ls-files". Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-08-20 15:47:01 +02:00			`close(fd);`
Fix a memory leak Signed-off-by: Li Hong <leehong@pku.edu.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-12-16 05:53:26 +01:00			`}`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00
dir.c: use a single struct exclude_list per source of excludes Previously each exclude_list could potentially contain patterns from multiple sources. For example dir->exclude_list[EXC_FILE] would typically contain patterns from .git/info/exclude and core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain patterns from multiple per-directory .gitignore files during directory traversal (i.e. when dir->exclude_stack was more than one item deep). We split these composite exclude_lists up into three groups of exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each exclude_list now contains patterns from a single source. This will allow us to cleanly track the origin of each pattern simply by adding a src field to struct exclude_list, rather than to struct exclude, which would make memory management of the source string tricky in the EXC_DIRS case where its contents are dynamically generated. Similarly, by moving the filebuf member from struct exclude_stack to struct exclude_list, it allows us to track and subsequently free memory buffers allocated during the parsing of all exclude files, rather than only tracking buffers allocated for files in the EXC_DIRS group. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:03 +01:00			`el->filebuf = buf;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`entry = buf;`
Fix memory corruption when .gitignore does not end by \n Commit b5041c5 (Avoid writing to buffer in add_excludes_from_file_1()) tried not to append '\n' at the end because the next commit may return a buffer that does not have extra space for that. Unfortunately it left this assignment in the loop: buf[i - (i && buf[i-1] == '\r')] = 0; that can corrupt memory if "buf" is not '\n' terminated. But even if it does not corrupt memory, the last line would not be NULL-terminated, leading to errors later inside add_exclude(). This patch fixes it by reverting the faulty commit and make sure "buf" is always \n terminated. While at it, free unused memory properly. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-20 15:09:16 +01:00			`for (i = 0; i < size; i++) {`
			`if (buf[i] == '\n') {`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`if (entry != buf + i && entry[0] != '#') {`
			`buf[i - (i && buf[i-1] == '\r')] = 0;`
dir.c: keep track of where patterns came from For exclude patterns read in from files, the filename is stored in the exclude list, and the originating line number is stored in the individual exclude (counting starting at 1). For exclude patterns provided on the command line, a string describing the source of the patterns is stored in the exclude list, and the sequence number assigned to each exclude pattern is negative, with counting starting at -1. So for example the 2nd pattern provided via --exclude would be numbered -2. This allows any future consumers of that data to easily distinguish between exclude patterns from files vs. from the CLI. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:04 +01:00			`add_exclude(entry, base, baselen, el, lineno);`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`
dir.c: keep track of where patterns came from For exclude patterns read in from files, the filename is stored in the exclude list, and the originating line number is stored in the individual exclude (counting starting at 1). For exclude patterns provided on the command line, a string describing the source of the patterns is stored in the exclude list, and the sequence number assigned to each exclude pattern is negative, with counting starting at -1. So for example the 2nd pattern provided via --exclude would be numbered -2. This allows any future consumers of that data to easily distinguish between exclude patterns from files vs. from the CLI. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:04 +01:00			`lineno++;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`entry = buf + i + 1;`
			`}`
			`}`
			`return 0;`
			`}`

dir.c: keep track of where patterns came from For exclude patterns read in from files, the filename is stored in the exclude list, and the originating line number is stored in the individual exclude (counting starting at 1). For exclude patterns provided on the command line, a string describing the source of the patterns is stored in the exclude list, and the sequence number assigned to each exclude pattern is negative, with counting starting at -1. So for example the 2nd pattern provided via --exclude would be numbered -2. This allows any future consumers of that data to easily distinguish between exclude patterns from files vs. from the CLI. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:04 +01:00			`struct exclude_list add_exclude_list(struct dir_struct dir,`
			`int group_type, const char *src)`
dir.c: use a single struct exclude_list per source of excludes Previously each exclude_list could potentially contain patterns from multiple sources. For example dir->exclude_list[EXC_FILE] would typically contain patterns from .git/info/exclude and core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain patterns from multiple per-directory .gitignore files during directory traversal (i.e. when dir->exclude_stack was more than one item deep). We split these composite exclude_lists up into three groups of exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each exclude_list now contains patterns from a single source. This will allow us to cleanly track the origin of each pattern simply by adding a src field to struct exclude_list, rather than to struct exclude, which would make memory management of the source string tricky in the EXC_DIRS case where its contents are dynamically generated. Similarly, by moving the filebuf member from struct exclude_stack to struct exclude_list, it allows us to track and subsequently free memory buffers allocated during the parsing of all exclude files, rather than only tracking buffers allocated for files in the EXC_DIRS group. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:03 +01:00			`{`
			`struct exclude_list *el;`
			`struct exclude_list_group *group;`

			`group = &dir->exclude_list_group[group_type];`
			`ALLOC_GROW(group->el, group->nr + 1, group->alloc);`
			`el = &group->el[group->nr++];`
			`memset(el, 0, sizeof(*el));`
dir.c: keep track of where patterns came from For exclude patterns read in from files, the filename is stored in the exclude list, and the originating line number is stored in the individual exclude (counting starting at 1). For exclude patterns provided on the command line, a string describing the source of the patterns is stored in the exclude list, and the sequence number assigned to each exclude pattern is negative, with counting starting at -1. So for example the 2nd pattern provided via --exclude would be numbered -2. This allows any future consumers of that data to easily distinguish between exclude patterns from files vs. from the CLI. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:04 +01:00			`el->src = src;`
dir.c: use a single struct exclude_list per source of excludes Previously each exclude_list could potentially contain patterns from multiple sources. For example dir->exclude_list[EXC_FILE] would typically contain patterns from .git/info/exclude and core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain patterns from multiple per-directory .gitignore files during directory traversal (i.e. when dir->exclude_stack was more than one item deep). We split these composite exclude_lists up into three groups of exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each exclude_list now contains patterns from a single source. This will allow us to cleanly track the origin of each pattern simply by adding a src field to struct exclude_list, rather than to struct exclude, which would make memory management of the source string tricky in the EXC_DIRS case where its contents are dynamically generated. Similarly, by moving the filebuf member from struct exclude_stack to struct exclude_list, it allows us to track and subsequently free memory buffers allocated during the parsing of all exclude files, rather than only tracking buffers allocated for files in the EXC_DIRS group. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:03 +01:00			`return el;`
			`}`

			`/*`
			`* Used to set up core.excludesfile and .git/info/exclude lists.`
			`*/`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`void add_excludes_from_file(struct dir_struct dir, const char fname)`
			`{`
dir.c: use a single struct exclude_list per source of excludes Previously each exclude_list could potentially contain patterns from multiple sources. For example dir->exclude_list[EXC_FILE] would typically contain patterns from .git/info/exclude and core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain patterns from multiple per-directory .gitignore files during directory traversal (i.e. when dir->exclude_stack was more than one item deep). We split these composite exclude_lists up into three groups of exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each exclude_list now contains patterns from a single source. This will allow us to cleanly track the origin of each pattern simply by adding a src field to struct exclude_list, rather than to struct exclude, which would make memory management of the source string tricky in the EXC_DIRS case where its contents are dynamically generated. Similarly, by moving the filebuf member from struct exclude_stack to struct exclude_list, it allows us to track and subsequently free memory buffers allocated during the parsing of all exclude files, rather than only tracking buffers allocated for files in the EXC_DIRS group. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:03 +01:00			`struct exclude_list *el;`
dir.c: keep track of where patterns came from For exclude patterns read in from files, the filename is stored in the exclude list, and the originating line number is stored in the individual exclude (counting starting at 1). For exclude patterns provided on the command line, a string describing the source of the patterns is stored in the exclude list, and the sequence number assigned to each exclude pattern is negative, with counting starting at -1. So for example the 2nd pattern provided via --exclude would be numbered -2. This allows any future consumers of that data to easily distinguish between exclude patterns from files vs. from the CLI. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:04 +01:00			`el = add_exclude_list(dir, EXC_FILE, fname);`
dir.c: use a single struct exclude_list per source of excludes Previously each exclude_list could potentially contain patterns from multiple sources. For example dir->exclude_list[EXC_FILE] would typically contain patterns from .git/info/exclude and core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain patterns from multiple per-directory .gitignore files during directory traversal (i.e. when dir->exclude_stack was more than one item deep). We split these composite exclude_lists up into three groups of exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each exclude_list now contains patterns from a single source. This will allow us to cleanly track the origin of each pattern simply by adding a src field to struct exclude_list, rather than to struct exclude, which would make memory management of the source string tricky in the EXC_DIRS case where its contents are dynamically generated. Similarly, by moving the filebuf member from struct exclude_stack to struct exclude_list, it allows us to track and subsequently free memory buffers allocated during the parsing of all exclude files, rather than only tracking buffers allocated for files in the EXC_DIRS group. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:03 +01:00			`if (add_excludes_from_file_to_list(fname, "", 0, el, 0) < 0)`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`die("cannot use %s as an exclude file", fname);`
			`}`

attr: more matching optimizations from .gitignore .gitattributes and .gitignore share the same pattern syntax but has separate matching implementation. Over the years, ignore's implementation accumulates more optimizations while attr's stays the same. This patch reuses the core matching functions that are also used by excluded_from_list. excluded_from_list and path_matches can't be merged due to differences in exclude and attr, for example: * "!pattern" syntax is forbidden in .gitattributes. As an attribute can be unset (i.e. set to a special value "false") or made back to unspecified (i.e. not even set to "false"), "!pattern attr" is unclear which one it means. * we support attaching attributes to directories, but git-core internally does not currently make use of attributes on directories. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:39 +02:00			`int match_basename(const char *basename, int basenamelen,`
			`const char *pattern, int prefix, int patternlen,`
			`int flags)`
exclude: split basename matching code into a separate function Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:35 +02:00			`{`
			`if (prefix == patternlen) {`
dir.c::match_basename(): pay attention to the length of string parameters The function takes two counted strings (<basename, basenamelen> and <pattern, patternlen>) as parameters, together with prefix (the length of the prefix in pattern that is to be matched literally without globbing against the basename) and EXC_* flags that tells it how to match the pattern against the basename. However, it did not pay attention to the length of these counted strings. Update them to do the following: * When the entire pattern is to be matched literally, the pattern matches the basename only when the lengths of them are the same, and they match up to that length. * When the pattern is "" followed by a string to be matched literally, make sure that the basenamelen is equal or longer than the "literal" part of the pattern, and the tail of the basename string matches that literal part. Otherwise, use the new fnmatch_icase_mem helper to make sure we only lookmake sure we use only look at the counted part of the strings. Because these counted strings are full strings most of the time, we check for termination to avoid unnecessary allocation. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-03-28 22:47:28 +01:00			`if (patternlen == basenamelen &&`
			`!strncmp_icase(pattern, basename, basenamelen))`
exclude: split basename matching code into a separate function Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:35 +02:00			`return 1;`
			`} else if (flags & EXC_FLAG_ENDSWITH) {`
dir.c::match_basename(): pay attention to the length of string parameters The function takes two counted strings (<basename, basenamelen> and <pattern, patternlen>) as parameters, together with prefix (the length of the prefix in pattern that is to be matched literally without globbing against the basename) and EXC_* flags that tells it how to match the pattern against the basename. However, it did not pay attention to the length of these counted strings. Update them to do the following: * When the entire pattern is to be matched literally, the pattern matches the basename only when the lengths of them are the same, and they match up to that length. * When the pattern is "" followed by a string to be matched literally, make sure that the basenamelen is equal or longer than the "literal" part of the pattern, and the tail of the basename string matches that literal part. Otherwise, use the new fnmatch_icase_mem helper to make sure we only lookmake sure we use only look at the counted part of the strings. Because these counted strings are full strings most of the time, we check for termination to avoid unnecessary allocation. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-03-28 22:47:28 +01:00			`/* "literal" matching against "fooliteral" /`
exclude: split basename matching code into a separate function Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:35 +02:00			`if (patternlen - 1 <= basenamelen &&`
dir.c::match_basename(): pay attention to the length of string parameters The function takes two counted strings (<basename, basenamelen> and <pattern, patternlen>) as parameters, together with prefix (the length of the prefix in pattern that is to be matched literally without globbing against the basename) and EXC_* flags that tells it how to match the pattern against the basename. However, it did not pay attention to the length of these counted strings. Update them to do the following: * When the entire pattern is to be matched literally, the pattern matches the basename only when the lengths of them are the same, and they match up to that length. * When the pattern is "" followed by a string to be matched literally, make sure that the basenamelen is equal or longer than the "literal" part of the pattern, and the tail of the basename string matches that literal part. Otherwise, use the new fnmatch_icase_mem helper to make sure we only lookmake sure we use only look at the counted part of the strings. Because these counted strings are full strings most of the time, we check for termination to avoid unnecessary allocation. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-03-28 22:47:28 +01:00			`!strncmp_icase(pattern + 1,`
			`basename + basenamelen - (patternlen - 1),`
			`patternlen - 1))`
exclude: split basename matching code into a separate function Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:35 +02:00			`return 1;`
			`} else {`
dir.c::match_basename(): pay attention to the length of string parameters The function takes two counted strings (<basename, basenamelen> and <pattern, patternlen>) as parameters, together with prefix (the length of the prefix in pattern that is to be matched literally without globbing against the basename) and EXC_* flags that tells it how to match the pattern against the basename. However, it did not pay attention to the length of these counted strings. Update them to do the following: * When the entire pattern is to be matched literally, the pattern matches the basename only when the lengths of them are the same, and they match up to that length. * When the pattern is "" followed by a string to be matched literally, make sure that the basenamelen is equal or longer than the "literal" part of the pattern, and the tail of the basename string matches that literal part. Otherwise, use the new fnmatch_icase_mem helper to make sure we only lookmake sure we use only look at the counted part of the strings. Because these counted strings are full strings most of the time, we check for termination to avoid unnecessary allocation. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-03-28 22:47:28 +01:00			`if (fnmatch_icase_mem(pattern, patternlen,`
			`basename, basenamelen,`
			`0) == 0)`
exclude: split basename matching code into a separate function Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:35 +02:00			`return 1;`
			`}`
			`return 0;`
			`}`

attr: more matching optimizations from .gitignore .gitattributes and .gitignore share the same pattern syntax but has separate matching implementation. Over the years, ignore's implementation accumulates more optimizations while attr's stays the same. This patch reuses the core matching functions that are also used by excluded_from_list. excluded_from_list and path_matches can't be merged due to differences in exclude and attr, for example: * "!pattern" syntax is forbidden in .gitattributes. As an attribute can be unset (i.e. set to a special value "false") or made back to unspecified (i.e. not even set to "false"), "!pattern attr" is unclear which one it means. * we support attaching attributes to directories, but git-core internally does not currently make use of attributes on directories. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:39 +02:00			`int match_pathname(const char *pathname, int pathlen,`
			`const char *base, int baselen,`
			`const char *pattern, int prefix, int patternlen,`
			`int flags)`
exclude: split pathname matching code into a separate function Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:37 +02:00			`{`
			`const char *name;`
			`int namelen;`

			`/*`
			`* match with FNM_PATHNAME; the pattern has base implicitly`
			`* in front of it.`
			`*/`
			`if (*pattern == '/') {`
			`pattern++;`
dir.c::match_pathname(): adjust patternlen when shifting pattern If we receive a pattern that starts with "/", we shift it forward to avoid looking at the "/" part. Since the prefix and patternlen parameters are counts of what is in the pattern, we must decrement them as we increment the pointer. We remembered to handle prefix, but not patternlen. This didn't cause any bugs, though, because the patternlen parameter is not actually used. Since it will be used in future patches, let's correct this oversight. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-03-28 22:47:47 +01:00			`patternlen--;`
exclude: split pathname matching code into a separate function Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:37 +02:00			`prefix--;`
			`}`

			`/*`
			`* baselen does not count the trailing slash. base[] may or`
			`* may not end with a trailing slash though.`
			`*/`
			`if (pathlen < baselen + 1 \|\|`
			`(baselen && pathname[baselen] != '/') \|\|`
			`strncmp_icase(pathname, base, baselen))`
			`return 0;`

			`namelen = baselen ? pathlen - baselen - 1 : pathlen;`
			`name = pathname + pathlen - namelen;`

			`if (prefix) {`
			`/*`
			`* if the non-wildcard part is longer than the`
			`* remaining pathname, surely it cannot match.`
			`*/`
			`if (prefix > namelen)`
			`return 0;`

			`if (strncmp_icase(pattern, name, prefix))`
			`return 0;`
			`pattern += prefix;`
dir.c::match_pathname(): pay attention to the length of string parameters This function takes two counted strings: a <pattern, patternlen> pair and a <pathname, pathlen> pair. But we end up feeding the result to fnmatch, which expects NUL-terminated strings. We can fix this by calling the fnmatch_icase_mem function, which handles re-allocating into a NUL-terminated string if necessary. While we're at it, we can avoid even calling fnmatch in some cases. In addition to patternlen, we get "prefix", the size of the pattern that contains no wildcard characters. We do a straight match of the prefix part first, and then use fnmatch to cover the rest. But if there are no wildcards in the pattern at all, we do not even need to call fnmatch; we would simply be comparing two empty strings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-03-28 22:48:21 +01:00			`patternlen -= prefix;`
exclude: split pathname matching code into a separate function Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:37 +02:00			`name += prefix;`
			`namelen -= prefix;`
dir.c::match_pathname(): pay attention to the length of string parameters This function takes two counted strings: a <pattern, patternlen> pair and a <pathname, pathlen> pair. But we end up feeding the result to fnmatch, which expects NUL-terminated strings. We can fix this by calling the fnmatch_icase_mem function, which handles re-allocating into a NUL-terminated string if necessary. While we're at it, we can avoid even calling fnmatch in some cases. In addition to patternlen, we get "prefix", the size of the pattern that contains no wildcard characters. We do a straight match of the prefix part first, and then use fnmatch to cover the rest. But if there are no wildcards in the pattern at all, we do not even need to call fnmatch; we would simply be comparing two empty strings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-03-28 22:48:21 +01:00
			`/*`
			`* If the whole pattern did not have a wildcard,`
			`* then our prefix match is all we need; we`
			`* do not need to call fnmatch at all.`
			`*/`
			`if (!patternlen && !namelen)`
			`return 1;`
exclude: split pathname matching code into a separate function Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:37 +02:00			`}`

dir.c::match_pathname(): pay attention to the length of string parameters This function takes two counted strings: a <pattern, patternlen> pair and a <pathname, pathlen> pair. But we end up feeding the result to fnmatch, which expects NUL-terminated strings. We can fix this by calling the fnmatch_icase_mem function, which handles re-allocating into a NUL-terminated string if necessary. While we're at it, we can avoid even calling fnmatch in some cases. In addition to patternlen, we get "prefix", the size of the pattern that contains no wildcard characters. We do a straight match of the prefix part first, and then use fnmatch to cover the rest. But if there are no wildcards in the pattern at all, we do not even need to call fnmatch; we would simply be comparing two empty strings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-03-28 22:48:21 +01:00			`return fnmatch_icase_mem(pattern, patternlen,`
			`name, namelen,`
Merge branch 'jc/directory-attrs-regression-fix' Fix 1.8.1.x regression that stopped matching "dir" (without trailing slash) to a directory "dir". * jc/directory-attrs-regression-fix: t: check that a pattern without trailing slash matches a directory dir.c::match_pathname(): pay attention to the length of string parameters dir.c::match_pathname(): adjust patternlen when shifting pattern dir.c::match_basename(): pay attention to the length of string parameters attr.c::path_matches(): special case paths that end with a slash attr.c::path_matches(): the basename is part of the pathname 2013-04-03 18:34:04 +02:00			`WM_PATHNAME) == 0;`
exclude: split pathname matching code into a separate function Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:37 +02:00			`}`

dir.c: refactor is_excluded_from_list() The excluded function uses a new helper function called last_exclude_matching_from_list() to perform the inner loop over all of the exclude patterns. The helper just tells us whether the path is included, excluded, or undecided. However, it may be useful to know _which_ pattern was triggered. So let's pass out the entire exclude match, which contains the status information we were already passing out. Further patches can make use of this. This is a modified forward port of a patch from 2009 by Jeff King: http://article.gmane.org/gmane.comp.version-control.git/108815 Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:26 +01:00			`/*`
			`* Scan the given exclude list in reverse to see whether pathname`
			`* should be ignored. The first match (i.e. the last on the list), if`
			`* any, determines the fate. Returns the exclude_list element which`
			`* matched, or NULL for undecided.`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`*/`
dir.c: refactor is_excluded_from_list() The excluded function uses a new helper function called last_exclude_matching_from_list() to perform the inner loop over all of the exclude patterns. The helper just tells us whether the path is included, excluded, or undecided. However, it may be useful to know _which_ pattern was triggered. So let's pass out the entire exclude match, which contains the status information we were already passing out. Further patches can make use of this. This is a modified forward port of a patch from 2009 by Jeff King: http://article.gmane.org/gmane.comp.version-control.git/108815 Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:26 +01:00			`static struct exclude last_exclude_matching_from_list(const char pathname,`
			`int pathlen,`
			`const char *basename,`
			`int *dtype,`
			`struct exclude_list *el)`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`{`
			`int i;`

Unindent excluded_from_list() Return early if el->nr == 0. Unindent one more level for FNM_PATHNAME code block as this block is getting complex and may need more indentation. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-26 14:31:12 +02:00			`if (!el->nr)`
dir.c: refactor is_excluded_from_list() The excluded function uses a new helper function called last_exclude_matching_from_list() to perform the inner loop over all of the exclude patterns. The helper just tells us whether the path is included, excluded, or undecided. However, it may be useful to know _which_ pattern was triggered. So let's pass out the entire exclude match, which contains the status information we were already passing out. Further patches can make use of this. This is a modified forward port of a patch from 2009 by Jeff King: http://article.gmane.org/gmane.comp.version-control.git/108815 Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:26 +01:00			`return NULL; /* undefined */`
gitignore(5): Allow "foo/" in ignore list to match directory "foo" A pattern "foo/" in the exclude list did not match directory "foo", but a pattern "foo" did. This attempts to extend the exclude mechanism so that it would while not matching a regular file or a symbolic link "foo". In order to differentiate a directory and non directory, this passes down the type of path being checked to excluded() function. A downside is that the recursive directory walk may need to run lstat(2) more often on systems whose "struct dirent" do not give the type of the entry; earlier it did not have to do so for an excluded path, but we now need to figure out if a path is a directory before deciding to exclude it. This is especially bad because an idea similar to the earlier CE_UPTODATE optimization to reduce number of lstat(2) calls would by definition not apply to the codepaths involved, as (1) directories will not be registered in the index, and (2) excluded paths will not be in the index anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-31 10:17:48 +01:00
Unindent excluded_from_list() Return early if el->nr == 0. Unindent one more level for FNM_PATHNAME code block as this block is getting complex and may need more indentation. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-26 14:31:12 +02:00			`for (i = el->nr - 1; 0 <= i; i--) {`
			`struct exclude *x = el->excludes[i];`
exclude: split pathname matching code into a separate function Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:37 +02:00			`const char *exclude = x->pattern;`
			`int prefix = x->nowildcardlen;`
Unindent excluded_from_list() Return early if el->nr == 0. Unindent one more level for FNM_PATHNAME code block as this block is getting complex and may need more indentation. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-26 14:31:12 +02:00
			`if (x->flags & EXC_FLAG_MUSTBEDIR) {`
			`if (*dtype == DT_UNKNOWN)`
			`*dtype = get_dtype(NULL, pathname, pathlen);`
			`if (*dtype != DT_DIR)`
			`continue;`
			`}`
gitignore(5): Allow "foo/" in ignore list to match directory "foo" A pattern "foo/" in the exclude list did not match directory "foo", but a pattern "foo" did. This attempts to extend the exclude mechanism so that it would while not matching a regular file or a symbolic link "foo". In order to differentiate a directory and non directory, this passes down the type of path being checked to excluded() function. A downside is that the recursive directory walk may need to run lstat(2) more often on systems whose "struct dirent" do not give the type of the entry; earlier it did not have to do so for an excluded path, but we now need to figure out if a path is a directory before deciding to exclude it. This is especially bad because an idea similar to the earlier CE_UPTODATE optimization to reduce number of lstat(2) calls would by definition not apply to the codepaths involved, as (1) directories will not be registered in the index, and (2) excluded paths will not be in the index anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-31 10:17:48 +01:00
Unindent excluded_from_list() Return early if el->nr == 0. Unindent one more level for FNM_PATHNAME code block as this block is getting complex and may need more indentation. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-26 14:31:12 +02:00			`if (x->flags & EXC_FLAG_NODIR) {`
exclude: split basename matching code into a separate function Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:35 +02:00			`if (match_basename(basename,`
			`pathlen - (basename - pathname),`
			`exclude, prefix, x->patternlen,`
			`x->flags))`
dir.c: refactor is_excluded_from_list() The excluded function uses a new helper function called last_exclude_matching_from_list() to perform the inner loop over all of the exclude patterns. The helper just tells us whether the path is included, excluded, or undecided. However, it may be useful to know _which_ pattern was triggered. So let's pass out the entire exclude match, which contains the status information we were already passing out. Further patches can make use of this. This is a modified forward port of a patch from 2009 by Jeff King: http://article.gmane.org/gmane.comp.version-control.git/108815 Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:26 +01:00			`return x;`
Unindent excluded_from_list() Return early if el->nr == 0. Unindent one more level for FNM_PATHNAME code block as this block is getting complex and may need more indentation. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-26 14:31:12 +02:00			`continue;`
			`}`

exclude: split pathname matching code into a separate function Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-10-15 08:24:37 +02:00			`assert(x->baselen == 0 \|\| x->base[x->baselen - 1] == '/');`
			`if (match_pathname(pathname, pathlen,`
			`x->base, x->baselen ? x->baselen - 1 : 0,`
			`exclude, prefix, x->patternlen, x->flags))`
dir.c: refactor is_excluded_from_list() The excluded function uses a new helper function called last_exclude_matching_from_list() to perform the inner loop over all of the exclude patterns. The helper just tells us whether the path is included, excluded, or undecided. However, it may be useful to know _which_ pattern was triggered. So let's pass out the entire exclude match, which contains the status information we were already passing out. Further patches can make use of this. This is a modified forward port of a patch from 2009 by Jeff King: http://article.gmane.org/gmane.comp.version-control.git/108815 Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:26 +01:00			`return x;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`
dir.c: refactor is_excluded_from_list() The excluded function uses a new helper function called last_exclude_matching_from_list() to perform the inner loop over all of the exclude patterns. The helper just tells us whether the path is included, excluded, or undecided. However, it may be useful to know _which_ pattern was triggered. So let's pass out the entire exclude match, which contains the status information we were already passing out. Further patches can make use of this. This is a modified forward port of a patch from 2009 by Jeff King: http://article.gmane.org/gmane.comp.version-control.git/108815 Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:26 +01:00			`return NULL; /* undecided */`
			`}`

			`/*`
			`* Scan the list and let the last match determine the fate.`
			`* Return 1 for exclude, 0 for include and -1 for undecided.`
			`*/`
			`int is_excluded_from_list(const char *pathname,`
			`int pathlen, const char basename, int dtype,`
			`struct exclude_list *el)`
			`{`
			`struct exclude *exclude;`
			`exclude = last_exclude_matching_from_list(pathname, pathlen, basename, dtype, el);`
			`if (exclude)`
			`return exclude->flags & EXC_FLAG_NEGATIVE ? 0 : 1;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`return -1; /* undecided */`
			`}`

dir.c: factor out parts of last_exclude_matching for later reuse Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:11:02 +02:00			`static struct exclude last_exclude_matching_from_lists(struct dir_struct dir,`
			`const char pathname, int pathlen, const char basename,`
			`int *dtype_p)`
			`{`
			`int i, j;`
			`struct exclude_list_group *group;`
			`struct exclude *exclude;`
			`for (i = EXC_CMDL; i <= EXC_FILE; i++) {`
			`group = &dir->exclude_list_group[i];`
			`for (j = group->nr - 1; j >= 0; j--) {`
			`exclude = last_exclude_matching_from_list(`
			`pathname, pathlen, basename, dtype_p,`
			`&group->el[j]);`
			`if (exclude)`
			`return exclude;`
			`}`
			`}`
			`return NULL;`
			`}`

dir.c: move prep_exclude Move prep_exclude in preparation for the next patch. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:11:37 +02:00			`/*`
			`* Loads the per-directory exclude list for the substring of base`
			`* which has a char length of baselen.`
			`*/`
			`static void prep_exclude(struct dir_struct dir, const char base, int baselen)`
			`{`
			`struct exclude_list_group *group;`
			`struct exclude_list *el;`
			`struct exclude_stack *stk = NULL;`
			`int current;`

			`group = &dir->exclude_list_group[EXC_DIRS];`

			`/* Pop the exclude lists from the EXCL_DIRS exclude_list_group`
			`* which originate from directories not in the prefix of the`
			`* path being checked. */`
			`while ((stk = dir->exclude_stack) != NULL) {`
			`if (stk->baselen <= baselen &&`
			`!strncmp(dir->basebuf, base, stk->baselen))`
			`break;`
			`el = &group->el[dir->exclude_stack->exclude_ix];`
			`dir->exclude_stack = stk->prev;`
dir.c: unify is_excluded and is_path_excluded APIs The is_excluded and is_path_excluded APIs are very similar, except for a few noteworthy differences: is_excluded doesn't handle ignored directories, results for paths within ignored directories are incorrect. This is probably based on the premise that recursive directory scans should stop at ignored directories, which is no longer true (in certain cases, read_directory_recursive currently calls is_excluded and is_path_excluded to get correct ignored state). is_excluded caches parsed .gitignore files of the last directory in struct dir_struct. If the directory changes, it finds a common parent directory and is very careful to drop only as much state as necessary. On the other hand, is_excluded will also read and parse .gitignore files in already ignored directories, which are completely irrelevant. is_path_excluded correctly handles ignored directories by checking if any component in the path is excluded. As it uses is_excluded internally, this unfortunately forces is_excluded to drop and re-read all .gitignore files, as there is no common parent directory for the root dir. is_path_excluded tracks state in a separate struct path_exclude_check, which is essentially a wrapper of dir_struct with two more fields. However, as is_path_excluded also modifies dir_struct, it is not possible to e.g. use multiple path_exclude_check structures with the same dir_struct in parallel. The additional structure just unnecessarily complicates the API. Teach is_excluded / prep_exclude about ignored directories: whenever entering a new directory, first check if the entire directory is excluded. Remember the excluded state in dir_struct. Don't traverse into already ignored directories (i.e. don't read irrelevant .gitignore files). Directories could also be excluded by exclude patterns specified on the command line or .git/info/exclude, so we cannot simply skip prep_exclude entirely if there's no .gitignore file name (dir_struct.exclude_per_dir). Move this check to just before actually reading the file. is_path_excluded is now equivalent to is_excluded, so we can simply redirect to it (the public API is cleaned up in the next patch). The performance impact of the additional ignored check per directory is hardly noticeable when reading directories recursively (e.g. 'git status'). However, performance of git commands using the is_path_excluded API (e.g. 'git ls-files --cached --ignored --exclude-standard') is greatly improved as this no longer re-reads .gitignore files on each call. Here's some performance data from the linux and WebKit repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true): \| ls-files -ci \| status \| status --ignored \| linux \| WebKit \| linux \| WebKit \| linux \| WebKit -------+-------+--------+-------+--------+-------+--------- before \| 0.506 \| 6.539 \| 0.212 \| 1.555 \| 0.323 \| 2.541 after \| 0.080 \| 1.191 \| 0.218 \| 1.583 \| 0.321 \| 2.579 gain \| 6.325 \| 5.490 \| 0.972 \| 0.982 \| 1.006 \| 0.985 Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:12:14 +02:00			`dir->exclude = NULL;`
dir.c: move prep_exclude Move prep_exclude in preparation for the next patch. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:11:37 +02:00			`free((char )el->src); / see strdup() below */`
			`clear_exclude_list(el);`
			`free(stk);`
			`group->nr--;`
			`}`

dir.c: unify is_excluded and is_path_excluded APIs The is_excluded and is_path_excluded APIs are very similar, except for a few noteworthy differences: is_excluded doesn't handle ignored directories, results for paths within ignored directories are incorrect. This is probably based on the premise that recursive directory scans should stop at ignored directories, which is no longer true (in certain cases, read_directory_recursive currently calls is_excluded and is_path_excluded to get correct ignored state). is_excluded caches parsed .gitignore files of the last directory in struct dir_struct. If the directory changes, it finds a common parent directory and is very careful to drop only as much state as necessary. On the other hand, is_excluded will also read and parse .gitignore files in already ignored directories, which are completely irrelevant. is_path_excluded correctly handles ignored directories by checking if any component in the path is excluded. As it uses is_excluded internally, this unfortunately forces is_excluded to drop and re-read all .gitignore files, as there is no common parent directory for the root dir. is_path_excluded tracks state in a separate struct path_exclude_check, which is essentially a wrapper of dir_struct with two more fields. However, as is_path_excluded also modifies dir_struct, it is not possible to e.g. use multiple path_exclude_check structures with the same dir_struct in parallel. The additional structure just unnecessarily complicates the API. Teach is_excluded / prep_exclude about ignored directories: whenever entering a new directory, first check if the entire directory is excluded. Remember the excluded state in dir_struct. Don't traverse into already ignored directories (i.e. don't read irrelevant .gitignore files). Directories could also be excluded by exclude patterns specified on the command line or .git/info/exclude, so we cannot simply skip prep_exclude entirely if there's no .gitignore file name (dir_struct.exclude_per_dir). Move this check to just before actually reading the file. is_path_excluded is now equivalent to is_excluded, so we can simply redirect to it (the public API is cleaned up in the next patch). The performance impact of the additional ignored check per directory is hardly noticeable when reading directories recursively (e.g. 'git status'). However, performance of git commands using the is_path_excluded API (e.g. 'git ls-files --cached --ignored --exclude-standard') is greatly improved as this no longer re-reads .gitignore files on each call. Here's some performance data from the linux and WebKit repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true): \| ls-files -ci \| status \| status --ignored \| linux \| WebKit \| linux \| WebKit \| linux \| WebKit -------+-------+--------+-------+--------+-------+--------- before \| 0.506 \| 6.539 \| 0.212 \| 1.555 \| 0.323 \| 2.541 after \| 0.080 \| 1.191 \| 0.218 \| 1.583 \| 0.321 \| 2.579 gain \| 6.325 \| 5.490 \| 0.972 \| 0.982 \| 1.006 \| 0.985 Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:12:14 +02:00			`/* Skip traversing into sub directories if the parent is excluded */`
			`if (dir->exclude)`
			`return;`

dir.c: move prep_exclude Move prep_exclude in preparation for the next patch. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:11:37 +02:00			`/* Read from the parent directories and push them down. */`
			`current = stk ? stk->baselen : -1;`
			`while (current < baselen) {`
			`struct exclude_stack stk = xcalloc(1, sizeof(stk));`
			`const char *cp;`

			`if (current < 0) {`
			`cp = base;`
			`current = 0;`
			`}`
			`else {`
			`cp = strchr(base + current + 1, '/');`
			`if (!cp)`
			`die("oops in prep_exclude");`
			`cp++;`
			`}`
			`stk->prev = dir->exclude_stack;`
			`stk->baselen = cp - base;`
dir.c: unify is_excluded and is_path_excluded APIs The is_excluded and is_path_excluded APIs are very similar, except for a few noteworthy differences: is_excluded doesn't handle ignored directories, results for paths within ignored directories are incorrect. This is probably based on the premise that recursive directory scans should stop at ignored directories, which is no longer true (in certain cases, read_directory_recursive currently calls is_excluded and is_path_excluded to get correct ignored state). is_excluded caches parsed .gitignore files of the last directory in struct dir_struct. If the directory changes, it finds a common parent directory and is very careful to drop only as much state as necessary. On the other hand, is_excluded will also read and parse .gitignore files in already ignored directories, which are completely irrelevant. is_path_excluded correctly handles ignored directories by checking if any component in the path is excluded. As it uses is_excluded internally, this unfortunately forces is_excluded to drop and re-read all .gitignore files, as there is no common parent directory for the root dir. is_path_excluded tracks state in a separate struct path_exclude_check, which is essentially a wrapper of dir_struct with two more fields. However, as is_path_excluded also modifies dir_struct, it is not possible to e.g. use multiple path_exclude_check structures with the same dir_struct in parallel. The additional structure just unnecessarily complicates the API. Teach is_excluded / prep_exclude about ignored directories: whenever entering a new directory, first check if the entire directory is excluded. Remember the excluded state in dir_struct. Don't traverse into already ignored directories (i.e. don't read irrelevant .gitignore files). Directories could also be excluded by exclude patterns specified on the command line or .git/info/exclude, so we cannot simply skip prep_exclude entirely if there's no .gitignore file name (dir_struct.exclude_per_dir). Move this check to just before actually reading the file. is_path_excluded is now equivalent to is_excluded, so we can simply redirect to it (the public API is cleaned up in the next patch). The performance impact of the additional ignored check per directory is hardly noticeable when reading directories recursively (e.g. 'git status'). However, performance of git commands using the is_path_excluded API (e.g. 'git ls-files --cached --ignored --exclude-standard') is greatly improved as this no longer re-reads .gitignore files on each call. Here's some performance data from the linux and WebKit repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true): \| ls-files -ci \| status \| status --ignored \| linux \| WebKit \| linux \| WebKit \| linux \| WebKit -------+-------+--------+-------+--------+-------+--------- before \| 0.506 \| 6.539 \| 0.212 \| 1.555 \| 0.323 \| 2.541 after \| 0.080 \| 1.191 \| 0.218 \| 1.583 \| 0.321 \| 2.579 gain \| 6.325 \| 5.490 \| 0.972 \| 0.982 \| 1.006 \| 0.985 Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:12:14 +02:00			`stk->exclude_ix = group->nr;`
			`el = add_exclude_list(dir, EXC_DIRS, NULL);`
dir.c: move prep_exclude Move prep_exclude in preparation for the next patch. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:11:37 +02:00			`memcpy(dir->basebuf + current, base + current,`
			`stk->baselen - current);`
dir.c: unify is_excluded and is_path_excluded APIs The is_excluded and is_path_excluded APIs are very similar, except for a few noteworthy differences: is_excluded doesn't handle ignored directories, results for paths within ignored directories are incorrect. This is probably based on the premise that recursive directory scans should stop at ignored directories, which is no longer true (in certain cases, read_directory_recursive currently calls is_excluded and is_path_excluded to get correct ignored state). is_excluded caches parsed .gitignore files of the last directory in struct dir_struct. If the directory changes, it finds a common parent directory and is very careful to drop only as much state as necessary. On the other hand, is_excluded will also read and parse .gitignore files in already ignored directories, which are completely irrelevant. is_path_excluded correctly handles ignored directories by checking if any component in the path is excluded. As it uses is_excluded internally, this unfortunately forces is_excluded to drop and re-read all .gitignore files, as there is no common parent directory for the root dir. is_path_excluded tracks state in a separate struct path_exclude_check, which is essentially a wrapper of dir_struct with two more fields. However, as is_path_excluded also modifies dir_struct, it is not possible to e.g. use multiple path_exclude_check structures with the same dir_struct in parallel. The additional structure just unnecessarily complicates the API. Teach is_excluded / prep_exclude about ignored directories: whenever entering a new directory, first check if the entire directory is excluded. Remember the excluded state in dir_struct. Don't traverse into already ignored directories (i.e. don't read irrelevant .gitignore files). Directories could also be excluded by exclude patterns specified on the command line or .git/info/exclude, so we cannot simply skip prep_exclude entirely if there's no .gitignore file name (dir_struct.exclude_per_dir). Move this check to just before actually reading the file. is_path_excluded is now equivalent to is_excluded, so we can simply redirect to it (the public API is cleaned up in the next patch). The performance impact of the additional ignored check per directory is hardly noticeable when reading directories recursively (e.g. 'git status'). However, performance of git commands using the is_path_excluded API (e.g. 'git ls-files --cached --ignored --exclude-standard') is greatly improved as this no longer re-reads .gitignore files on each call. Here's some performance data from the linux and WebKit repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true): \| ls-files -ci \| status \| status --ignored \| linux \| WebKit \| linux \| WebKit \| linux \| WebKit -------+-------+--------+-------+--------+-------+--------- before \| 0.506 \| 6.539 \| 0.212 \| 1.555 \| 0.323 \| 2.541 after \| 0.080 \| 1.191 \| 0.218 \| 1.583 \| 0.321 \| 2.579 gain \| 6.325 \| 5.490 \| 0.972 \| 0.982 \| 1.006 \| 0.985 Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:12:14 +02:00
			`/* Abort if the directory is excluded */`
			`if (stk->baselen) {`
			`int dt = DT_DIR;`
			`dir->basebuf[stk->baselen - 1] = 0;`
			`dir->exclude = last_exclude_matching_from_lists(dir,`
			`dir->basebuf, stk->baselen - 1,`
			`dir->basebuf + current, &dt);`
			`dir->basebuf[stk->baselen - 1] = '/';`
			`if (dir->exclude) {`
			`dir->basebuf[stk->baselen] = 0;`
			`dir->exclude_stack = stk;`
			`return;`
			`}`
			`}`

			`/* Try to read per-directory file unless path is too long */`
			`if (dir->exclude_per_dir &&`
			`stk->baselen + strlen(dir->exclude_per_dir) < PATH_MAX) {`
			`strcpy(dir->basebuf + stk->baselen,`
			`dir->exclude_per_dir);`
			`/*`
			`* dir->basebuf gets reused by the traversal, but we`
			`* need fname to remain unchanged to ensure the src`
			`* member of each struct exclude correctly`
			`* back-references its source file. Other invocations`
			`* of add_exclude_list provide stable strings, so we`
			`* strdup() and free() here in the caller.`
			`*/`
			`el->src = strdup(dir->basebuf);`
			`add_excludes_from_file_to_list(dir->basebuf,`
			`dir->basebuf, stk->baselen, el, 1);`
			`}`
dir.c: move prep_exclude Move prep_exclude in preparation for the next patch. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:11:37 +02:00			`dir->exclude_stack = stk;`
			`current = stk->baselen;`
			`}`
			`dir->basebuf[baselen] = '\0';`
			`}`

dir.c: refactor is_excluded() In a similar way to the previous commit, this extracts a new helper function last_exclude_matching() which returns the last exclude_list element which matched, or NULL if no match was found. is_excluded() becomes a wrapper around this, and just returns 0 or 1 depending on whether any matching exclude_list element was found. This allows callers to find out _why_ a given path was excluded, rather than just whether it was or not, paving the way for a new git sub-command which allows users to test their exclude lists from the command line. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:27 +01:00			`/*`
			`* Loads the exclude lists for the directory containing pathname, then`
			`* scans all exclude lists to determine whether pathname is excluded.`
			`* Returns the exclude_list element which matched, or NULL for`
			`* undecided.`
			`*/`
dir.c: replace is_path_excluded with now equivalent is_excluded API Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:12:57 +02:00			`struct exclude last_exclude_matching(struct dir_struct dir,`
dir.c: refactor is_excluded() In a similar way to the previous commit, this extracts a new helper function last_exclude_matching() which returns the last exclude_list element which matched, or NULL if no match was found. is_excluded() becomes a wrapper around this, and just returns 0 or 1 depending on whether any matching exclude_list element was found. This allows callers to find out _why_ a given path was excluded, rather than just whether it was or not, paving the way for a new git sub-command which allows users to test their exclude lists from the command line. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:27 +01:00			`const char *pathname,`
			`int *dtype_p)`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`{`
			`int pathlen = strlen(pathname);`
Speedup scanning for excluded files. Try to avoid a lot of work scanning for excluded files, by caching some more information when setting up the exclusion data structure. Speeds up 'git runstatus' on a repository containing the Qt sources by 30% and reduces the amount of instructions executed (as measured by valgrind) by a factor of 2. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-10-28 21:27:13 +01:00			`const char *basename = strrchr(pathname, '/');`
			`basename = (basename) ? basename+1 : pathname;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00
per-directory-exclude: lazily read .gitignore files Operations that walk directories or trees, which potentially need to consult the .gitignore files, used to always try to open the .gitignore file every time they entered a new directory, even when they ended up not needing to call excluded() function to see if a path in the directory is ignored. This was done by push/pop exclude_per_directory() functions that managed the data in a stack. This changes the directory walking API to remove the need to call these two functions. Instead, the directory walk data structure caches the data used by excluded() function the last time, and lazily reuses it as much as possible. Among the data the last check used, the ones from deeper directories that the path we are checking is outside are discarded, data from the common leading directories are reused, and then the directories between the common directory and the directory the path being checked is in are checked for .gitignore file. This is very similar to the way gitattributes are handled. This API change also fixes "ls-files -c -i", which called excluded() without setting up the gitignore data via the old push/pop functions. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-29 11:17:44 +01:00			`prep_exclude(dir, pathname, basename-pathname);`
dir.c: use a single struct exclude_list per source of excludes Previously each exclude_list could potentially contain patterns from multiple sources. For example dir->exclude_list[EXC_FILE] would typically contain patterns from .git/info/exclude and core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain patterns from multiple per-directory .gitignore files during directory traversal (i.e. when dir->exclude_stack was more than one item deep). We split these composite exclude_lists up into three groups of exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each exclude_list now contains patterns from a single source. This will allow us to cleanly track the origin of each pattern simply by adding a src field to struct exclude_list, rather than to struct exclude, which would make memory management of the source string tricky in the EXC_DIRS case where its contents are dynamically generated. Similarly, by moving the filebuf member from struct exclude_stack to struct exclude_list, it allows us to track and subsequently free memory buffers allocated during the parsing of all exclude files, rather than only tracking buffers allocated for files in the EXC_DIRS group. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:03 +01:00
dir.c: unify is_excluded and is_path_excluded APIs The is_excluded and is_path_excluded APIs are very similar, except for a few noteworthy differences: is_excluded doesn't handle ignored directories, results for paths within ignored directories are incorrect. This is probably based on the premise that recursive directory scans should stop at ignored directories, which is no longer true (in certain cases, read_directory_recursive currently calls is_excluded and is_path_excluded to get correct ignored state). is_excluded caches parsed .gitignore files of the last directory in struct dir_struct. If the directory changes, it finds a common parent directory and is very careful to drop only as much state as necessary. On the other hand, is_excluded will also read and parse .gitignore files in already ignored directories, which are completely irrelevant. is_path_excluded correctly handles ignored directories by checking if any component in the path is excluded. As it uses is_excluded internally, this unfortunately forces is_excluded to drop and re-read all .gitignore files, as there is no common parent directory for the root dir. is_path_excluded tracks state in a separate struct path_exclude_check, which is essentially a wrapper of dir_struct with two more fields. However, as is_path_excluded also modifies dir_struct, it is not possible to e.g. use multiple path_exclude_check structures with the same dir_struct in parallel. The additional structure just unnecessarily complicates the API. Teach is_excluded / prep_exclude about ignored directories: whenever entering a new directory, first check if the entire directory is excluded. Remember the excluded state in dir_struct. Don't traverse into already ignored directories (i.e. don't read irrelevant .gitignore files). Directories could also be excluded by exclude patterns specified on the command line or .git/info/exclude, so we cannot simply skip prep_exclude entirely if there's no .gitignore file name (dir_struct.exclude_per_dir). Move this check to just before actually reading the file. is_path_excluded is now equivalent to is_excluded, so we can simply redirect to it (the public API is cleaned up in the next patch). The performance impact of the additional ignored check per directory is hardly noticeable when reading directories recursively (e.g. 'git status'). However, performance of git commands using the is_path_excluded API (e.g. 'git ls-files --cached --ignored --exclude-standard') is greatly improved as this no longer re-reads .gitignore files on each call. Here's some performance data from the linux and WebKit repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true): \| ls-files -ci \| status \| status --ignored \| linux \| WebKit \| linux \| WebKit \| linux \| WebKit -------+-------+--------+-------+--------+-------+--------- before \| 0.506 \| 6.539 \| 0.212 \| 1.555 \| 0.323 \| 2.541 after \| 0.080 \| 1.191 \| 0.218 \| 1.583 \| 0.321 \| 2.579 gain \| 6.325 \| 5.490 \| 0.972 \| 0.982 \| 1.006 \| 0.985 Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:12:14 +02:00			`if (dir->exclude)`
			`return dir->exclude;`

dir.c: factor out parts of last_exclude_matching for later reuse Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:11:02 +02:00			`return last_exclude_matching_from_lists(dir, pathname, pathlen,`
			`basename, dtype_p);`
dir.c: refactor is_excluded() In a similar way to the previous commit, this extracts a new helper function last_exclude_matching() which returns the last exclude_list element which matched, or NULL if no match was found. is_excluded() becomes a wrapper around this, and just returns 0 or 1 depending on whether any matching exclude_list element was found. This allows callers to find out _why_ a given path was excluded, rather than just whether it was or not, paving the way for a new git sub-command which allows users to test their exclude lists from the command line. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:27 +01:00			`}`

			`/*`
			`* Loads the exclude lists for the directory containing pathname, then`
			`* scans all exclude lists to determine whether pathname is excluded.`
			`* Returns 1 if true, otherwise 0.`
			`*/`
dir.c: replace is_path_excluded with now equivalent is_excluded API Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:12:57 +02:00			`int is_excluded(struct dir_struct dir, const char pathname, int *dtype_p)`
dir.c: refactor is_excluded() In a similar way to the previous commit, this extracts a new helper function last_exclude_matching() which returns the last exclude_list element which matched, or NULL if no match was found. is_excluded() becomes a wrapper around this, and just returns 0 or 1 depending on whether any matching exclude_list element was found. This allows callers to find out _why_ a given path was excluded, rather than just whether it was or not, paving the way for a new git sub-command which allows users to test their exclude lists from the command line. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-27 03:32:27 +01:00			`{`
			`struct exclude *exclude =`
			`last_exclude_matching(dir, pathname, dtype_p);`
			`if (exclude)`
			`return exclude->flags & EXC_FLAG_NEGATIVE ? 0 : 1;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`return 0;`
			`}`

Style: place opening brace of a function definition at column 1 Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-09 00:35:32 +01:00			`static struct dir_entry dir_entry_new(const char pathname, int len)`
			`{`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`struct dir_entry *ent;`

			`ent = xmalloc(sizeof(*ent) + len + 1);`
			`ent->len = len;`
			`memcpy(ent->name, pathname, len);`
			`ent->name[len] = 0;`
Fix 'git add' with .gitignore When '*.ig' is ignored, and you have two files f.ig and d.ig/foo in the working tree, $ git add . correctly ignored f.ig but failed to ignore d.ig/foo. This was caused by a thinko in an earlier commit 4888c534, when we tried to allow adding otherwise ignored files. After reverting that commit, this takes a much simpler approach. When we have an unmatched pathspec that talks about an existing pathname, we know it is an ignored path the user tried to add, so we include it in the set of paths directory walker returned. This does not let you say "git add -f D" on an ignored directory D and add everything under D. People can submit a patch to further allow it if they want to, but I think it is a saner behaviour to require explicit paths to be spelled out in such a case. Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-12-29 20:01:31 +01:00			`return ent;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`

dir.c: make dir_add_name() and dir_add_ignored() static These functions are not used by any other file. Signed-off-by: Nanako Shiraishi <nanako3@lavabit.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2008-10-02 12:14:23 +02:00			`static struct dir_entry dir_add_name(struct dir_struct dir, const char *pathname, int len)`
refactor dir_add_name This is in preparation for keeping two entry lists in the dir object. This patch adds and uses the ALLOC_GROW() macro, which implements the commonly used idiom of growing a dynamic array using the alloc_nr function (not just in dir.c, but everywhere). We also move creation of a dir_entry to dir_entry_new. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:44 +02:00			`{`
dir.c: git-clean -d -X: don't delete tracked directories The notion of "ignored tracked" directories introduced in 721ac4ed "dir.c: Make git-status --ignored more consistent" has a few unwanted side effects: - git-clean -d -X: deletes ignored tracked directories. git-clean should never delete tracked content. - git-ls-files --ignored --other --directory: lists ignored tracked directories instead of "other" directories. - git-status --ignored: lists ignored tracked directories while contained files may be listed as modified. Paths listed by git-status should be disjoint (except in long format where a path may be listed in both the staged and unstaged section). Additionally, the current behaviour violates documentation in gitignore(5) ("Specifies intentionally untracked files to ignore") and Documentation/ technical/api-directory-listing.txt ("DIR_SHOW_OTHER_DIRECTORIES: Include a directory that is not tracked."). In dir.c::treat_directory, remove the special handling of ignored tracked directories, so that the DIR_SHOW_OTHER_DIRECTORIES flag only affects "other" (i.e. untracked) directories. In dir.c::dir_add_name, check that added paths are untracked even if DIR_SHOW_IGNORED is set. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:10:05 +02:00			`if (cache_name_exists(pathname, len, ignore_case))`
refactor dir_add_name This is in preparation for keeping two entry lists in the dir object. This patch adds and uses the ALLOC_GROW() macro, which implements the commonly used idiom of growing a dynamic array using the alloc_nr function (not just in dir.c, but everywhere). We also move creation of a dir_entry to dir_entry_new. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:44 +02:00			`return NULL;`

Fix ALLOC_GROW calls with obsolete semantics ALLOC_GROW now expects the 'nr' argument to be "how much you want" and not "how much you have". This fixes all cases where we weren't previously adding anything to the 'nr'. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-17 00:43:40 +02:00			`ALLOC_GROW(dir->entries, dir->nr+1, dir->alloc);`
refactor dir_add_name This is in preparation for keeping two entry lists in the dir object. This patch adds and uses the ALLOC_GROW() macro, which implements the commonly used idiom of growing a dynamic array using the alloc_nr function (not just in dir.c, but everywhere). We also move creation of a dir_entry to dir_entry_new. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:44 +02:00			`return dir->entries[dir->nr++] = dir_entry_new(pathname, len);`
			`}`

git add: Add the "--ignore-missing" option for the dry run Sometimes it is useful to know if a file or directory will be ignored before it is added to the work tree. An example is "git submodule add", where it would be really nice to be able to fail with an appropriate error message before the submodule is cloned and checked out. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-07-10 00:18:38 +02:00			`struct dir_entry dir_add_ignored(struct dir_struct dir, const char *pathname, int len)`
dir_struct: add collect_ignored option When set, this option will cause read_directory to keep track of which entries were ignored. While this shouldn't effect functionality in most cases, it can make warning messages to the user much more useful. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:50 +02:00			`{`
git-add: no need for -f when resolving a conflict in already tracked path When a path F that matches ignore pattern has a conflict, "git add F" insisted the -f option be given, which did not make sense. It would have required -f when the path was originally added, but when resolving a conflict, it already is tracked. So this should work (and does): $ echo file >.gitignore $ echo content >file $ git add -f file ;# need -f because we are adding new path $ echo more content >>file $ git add file ;# don't need -f; it is not actually an "other" file This is handled under the hood by the COLLECT_IGNORED option to read_directory. When that code finds an ignored file, it checks the index to make sure it is not actually a tracked file. However, the test it uses does not take into account unmerged entries, and considers them to still be ignored. "git ls-files" uses a more elaborate test and gets the right answer and the same test should be used here. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-05-30 23:54:18 +02:00			`if (!cache_name_is_other(pathname, len))`
dir_struct: add collect_ignored option When set, this option will cause read_directory to keep track of which entries were ignored. While this shouldn't effect functionality in most cases, it can make warning messages to the user much more useful. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:50 +02:00			`return NULL;`

Fix ALLOC_GROW calls with obsolete semantics ALLOC_GROW now expects the 'nr' argument to be "how much you want" and not "how much you have". This fixes all cases where we weren't previously adding anything to the 'nr'. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-17 00:43:40 +02:00			`ALLOC_GROW(dir->ignored, dir->ignored_nr+1, dir->ignored_alloc);`
dir_struct: add collect_ignored option When set, this option will cause read_directory to keep track of which entries were ignored. While this shouldn't effect functionality in most cases, it can make warning messages to the user much more useful. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:50 +02:00			`return dir->ignored[dir->ignored_nr++] = dir_entry_new(pathname, len);`
			`}`

Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`enum exist_status {`
			`index_nonexistent = 0,`
			`index_directory,`
enums: omit trailing comma for portability Without this patch at least IBM VisualAge C 5.0 (I have 5.0.2) on AIX 5.1 fails to compile git. enum style is inconsistent already, with some enums declared on one line, some over 3 lines with the enum values all on the middle line, sometimes with 1 enum value per line... and independently of that the trailing comma is sometimes present and other times absent, often mixing with/without trailing comma styles in a single file, and sometimes in consecutive enum declarations. Clearly, omitting the comma is the more portable style, and this patch changes all enum declarations to use the portable omitted dangling comma style consistently. Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-14 11:31:35 +02:00			`index_gitdir`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`};`

Add case insensitivity support for directories when using git status When using a case preserving but case insensitive file system, directory case can differ but still refer to the same physical directory. git status reports the directory with the alternate case as an Untracked file. (That is, when mydir/filea.txt is added to the repository and then the directory on disk is renamed from mydir/ to MyDir/, git status shows MyDir/ as being untracked.) Support has been added in name-hash.c for hashing directories with a terminating slash into the name hash. When index_name_exists() is called with a directory (a name with a terminating slash), the name is not found via the normal cache_name_compare() call, but it is found in the slow_same_name() function. Additionally, in dir.c, directory_exists_in_index_icase() allows newly added directories deeper in the directory chain to be identified. Ultimately, it would be better if the file list was read in case insensitive alphabetical order from disk, but this change seems to suffice for now. The end result is the directory is looked up in a case insensitive manner and does not show in the Untracked files list. Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-03 11:56:43 +02:00			`/*`
			`* Do not use the alphabetically stored index to look up`
			`* the directory name; instead, use the case insensitive`
			`* name hash.`
			`*/`
			`static enum exist_status directory_exists_in_index_icase(const char *dirname, int len)`
			`{`
			`struct cache_entry *ce = index_name_exists(&the_index, dirname, len + 1, ignore_case);`
			`unsigned char endchar;`

			`if (!ce)`
			`return index_nonexistent;`
			`endchar = ce->name[len];`

			`/*`
			`* The cache_entry structure returned will contain this dirname`
			`* and possibly additional path components.`
			`*/`
			`if (endchar == '/')`
			`return index_directory;`

			`/*`
			`* If there are no additional path components, then this cache_entry`
			`* represents a submodule. Submodules, despite being directories,`
			`* are stored in the cache without a closing slash.`
			`*/`
			`if (!endchar && S_ISGITLINK(ce->ce_mode))`
			`return index_gitdir;`

			`/* This should never be hit, but it exists just in case. */`
			`return index_nonexistent;`
			`}`

Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`/*`
			`* The index sorts alphabetically by entry name, which`
			`* means that a gitlink sorts as '\0' at the end, while`
			`* a directory (which is defined not as an entry, but as`
			`* the files it contains) will sort with the '/' at the`
			`* end.`
			`*/`
			`static enum exist_status directory_exists_in_index(const char *dirname, int len)`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`{`
Add case insensitivity support for directories when using git status When using a case preserving but case insensitive file system, directory case can differ but still refer to the same physical directory. git status reports the directory with the alternate case as an Untracked file. (That is, when mydir/filea.txt is added to the repository and then the directory on disk is renamed from mydir/ to MyDir/, git status shows MyDir/ as being untracked.) Support has been added in name-hash.c for hashing directories with a terminating slash into the name hash. When index_name_exists() is called with a directory (a name with a terminating slash), the name is not found via the normal cache_name_compare() call, but it is found in the slow_same_name() function. Additionally, in dir.c, directory_exists_in_index_icase() allows newly added directories deeper in the directory chain to be identified. Ultimately, it would be better if the file list was read in case insensitive alphabetical order from disk, but this change seems to suffice for now. The end result is the directory is looked up in a case insensitive manner and does not show in the Untracked files list. Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-10-03 11:56:43 +02:00			`int pos;`

			`if (ignore_case)`
			`return directory_exists_in_index_icase(dirname, len);`

			`pos = cache_name_pos(dirname, len);`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`if (pos < 0)`
			`pos = -pos-1;`
			`while (pos < active_nr) {`
			`struct cache_entry *ce = active_cache[pos++];`
			`unsigned char endchar;`

			`if (strncmp(ce->name, dirname, len))`
			`break;`
			`endchar = ce->name[len];`
			`if (endchar > '/')`
			`break;`
			`if (endchar == '/')`
			`return index_directory;`
Make on-disk index representation separate from in-core one This converts the index explicitly on read and write to its on-disk format, allowing the in-core format to contain more flags, and be simpler. In particular, the in-core format is now host-endian (as opposed to the on-disk one that is network endian in order to be able to be shared across machines) and as a result we can dispense with all the htonl/ntohl on accesses to the cache_entry fields. This will make it easier to make use of various temporary flags that do not exist in the on-disk format. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 2008-01-15 01:03:17 +01:00			`if (!endchar && S_ISGITLINK(ce->ce_mode))`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`return index_gitdir;`
			`}`
			`return index_nonexistent;`
			`}`

			`/*`
			`* When we find a directory when traversing the filesystem, we`
			`* have three distinct cases:`
			`*`
			`* - ignore it`
			`* - see it as a directory`
			`* - recurse into it`
			`*`
			`* and which one we choose depends on a combination of existing`
			`* git index contents and the flags passed into the directory`
			`* traversal routine.`
			`*`
			`* Case 1: If we already have entries in the index under that`
dir.c: git-clean -d -X: don't delete tracked directories The notion of "ignored tracked" directories introduced in 721ac4ed "dir.c: Make git-status --ignored more consistent" has a few unwanted side effects: - git-clean -d -X: deletes ignored tracked directories. git-clean should never delete tracked content. - git-ls-files --ignored --other --directory: lists ignored tracked directories instead of "other" directories. - git-status --ignored: lists ignored tracked directories while contained files may be listed as modified. Paths listed by git-status should be disjoint (except in long format where a path may be listed in both the staged and unstaged section). Additionally, the current behaviour violates documentation in gitignore(5) ("Specifies intentionally untracked files to ignore") and Documentation/ technical/api-directory-listing.txt ("DIR_SHOW_OTHER_DIRECTORIES: Include a directory that is not tracked."). In dir.c::treat_directory, remove the special handling of ignored tracked directories, so that the DIR_SHOW_OTHER_DIRECTORIES flag only affects "other" (i.e. untracked) directories. In dir.c::dir_add_name, check that added paths are untracked even if DIR_SHOW_IGNORED is set. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:10:05 +02:00			`* directory name, we always recurse into the directory to see`
			`* all the files.`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`*`
			`* Case 2: If we already have that directory name as a gitlink,`
			`* we always continue to see it as a gitlink, regardless of whether`
			`* there is an actual git directory there or not (it might not`
			`* be checked out as a subproject!)`
			`*`
			`* Case 3: if we didn't have it in the index previously, we`
			`* have a few sub-cases:`
			`*`
			`* (a) if "show_other_directories" is true, we show it as`
			`* just a directory, unless "hide_empty_directories" is`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`* also true, in which case we need to check if it contains any`
			`* untracked and / or ignored files.`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`* (b) if it looks like a git directory, and we don't have`
rename dirlink to gitlink. Unify naming of plumbing dirlink/gitlink concept: git ls-files -z '*.[ch]' \| xargs -0 perl -pi -e 's/dirlink/gitlink/g;' -e 's/DIRLNK/GITLINK/g;' Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-05-21 22:08:28 +02:00			`* 'no_gitlinks' set we treat it as a gitlink, and show it`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`* as a directory.`
			`* (c) otherwise, we recurse into it.`
			`*/`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`static enum path_treatment treat_directory(struct dir_struct *dir,`
dir.c: Make git-status --ignored more consistent The current behavior of git-status is inconsistent and misleading. Especially when used with --untracked-files=all option: - files ignored in untracked directories will be missing from status output. - untracked files in committed yet ignored directories are also missing. - with --untracked-files=normal, untracked directories that contains only ignored files are dropped too. Make the behavior more consistent across all possible use cases: - "--ignored --untracked-files=normal" doesn't show each specific files but top directory. It instead shows untracked directories that only contains ignored files, and ignored tracked directories with untracked files. - "--ignored --untracked-files=all" shows all ignored files, either because it's in an ignored directory (tracked or untracked), or because the file is explicitly ignored. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-30 15:39:00 +01:00			`const char *dirname, int len, int exclude,`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`const struct path_simplify *simplify)`
			`{`
			`/* The "len-1" is to strip the final '/' */`
			`switch (directory_exists_in_index(dirname, len-1)) {`
			`case index_directory:`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`return path_recurse;`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00
			`case index_gitdir:`
Turn the flags in struct dir_struct into a single variable By having flags represented as bits in the new member variable 'flags', it will be easier to use parse_options when dir_struct is involved. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-16 13:20:25 +01:00			`if (dir->flags & DIR_SHOW_OTHER_DIRECTORIES)`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`return path_none;`
			`return path_untracked;`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00
			`case index_nonexistent:`
Turn the flags in struct dir_struct into a single variable By having flags represented as bits in the new member variable 'flags', it will be easier to use parse_options when dir_struct is involved. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-16 13:20:25 +01:00			`if (dir->flags & DIR_SHOW_OTHER_DIRECTORIES)`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`break;`
Turn the flags in struct dir_struct into a single variable By having flags represented as bits in the new member variable 'flags', it will be easier to use parse_options when dir_struct is involved. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-02-16 13:20:25 +01:00			`if (!(dir->flags & DIR_NO_GITLINKS)) {`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`unsigned char sha1[20];`
			`if (resolve_gitlink_ref(dirname, "HEAD", sha1) == 0)`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`return path_untracked;`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`}`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`return path_recurse;`
Teach directory traversal about subprojects This is the promised cleaned-up version of teaching directory traversal (ie the "read_directory()" logic) about subprojects. That makes "git add" understand to add/update subprojects. It now knows to look at the index file to see if a directory is marked as a subproject, and use that as information as whether it should be recursed into or not. It also generally cleans up the handling of directory entries when traversing the working tree, by splitting up the decision-making process into small functions of their own, and adding a fair number of comments. Finally, it teaches "add_file_to_cache()" that directory names can have slashes at the end, since the directory traversal adds them to make the difference between a file and a directory clear (it always did that, but my previous too-ugly-to-apply subproject patch had a totally different path for subproject directories and avoided the slash for that case). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-04-11 23:49:44 +02:00			`}`

			`/* This is the "show_other_directories" case */`
dir.c: Make git-status --ignored more consistent The current behavior of git-status is inconsistent and misleading. Especially when used with --untracked-files=all option: - files ignored in untracked directories will be missing from status output. - untracked files in committed yet ignored directories are also missing. - with --untracked-files=normal, untracked directories that contains only ignored files are dropped too. Make the behavior more consistent across all possible use cases: - "--ignored --untracked-files=normal" doesn't show each specific files but top directory. It instead shows untracked directories that only contains ignored files, and ignored tracked directories with untracked files. - "--ignored --untracked-files=all" shows all ignored files, either because it's in an ignored directory (tracked or untracked), or because the file is explicitly ignored. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-30 15:39:00 +01:00
dir.c: git-ls-files --directories: don't hide empty directories 'git-ls-files --ignored --directories' hides empty directories even though --no-empty-directory was not specified. Treat the DIR_HIDE_EMPTY_DIRECTORIES flag independently from DIR_SHOW_IGNORED to make all git-ls-files options work as expected. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:08:02 +02:00			`if (!(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES))`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`return exclude ? path_excluded : path_untracked;`

			`return read_directory_recursive(dir, dirname, len, 1, simplify);`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`

Optimize directory listing with pathspec limiter. The way things are set up, you can now pass a "pathspec" to the "read_directory()" function. If you pass NULL, it acts exactly like it used to do (read everything). If you pass a non-NULL pointer, it will simplify it into a "these are the prefixes without any special characters", and stop any readdir() early if the path in question doesn't match any of the prefixes. NOTE! This does not obviate the need for the caller to do the exact pathspec match later. It's a first-level filter on "read_directory()", but it does not do the full pathspec thing. Maybe it should. But in the meantime, builtin-add.c really does need to do first read_directory(dir, .., pathspec); if (pathspec) prune_directory(dir, pathspec, baselen); ie the "prune_directory()" part will do the exact pathspec pruning, while the "read_directory()" will use the pathspec just to do some quick high-level pruning of the directories it will recurse into. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-31 05:39:30 +02:00			`/*`
			`* This is an inexact early pruning of any recursive directory`
			`* reading - if the path cannot possibly be in the pathspec,`
			`* return true, and we'll skip it early.`
			`*/`
			`static int simplify_away(const char path, int pathlen, const struct path_simplify simplify)`
			`{`
			`if (simplify) {`
			`for (;;) {`
			`const char *match = simplify->path;`
			`int len = simplify->len;`

			`if (!match)`
			`break;`
			`if (len > pathlen)`
			`len = pathlen;`
			`if (!memcmp(path, match, len))`
			`return 0;`
			`simplify++;`
			`}`
			`return 1;`
			`}`
			`return 0;`
			`}`

dir: fix COLLECT_IGNORED on excluded prefixes As we walk the directory tree, if we see an ignored path, we want to add it to the ignored list only if it matches any pathspec that we were given. We used to check for the pathspec to appear explicitly. E.g., if we see "subdir/file" and it is excluded, we check to see if we have "subdir/file" in our pathspec. However, this interacts badly with the optimization to avoid recursing into ignored subdirectories. If "subdir" as a whole is ignored, then we never recurse, and consider only whether "subdir" itself is in our pathspec. It would not match a pathspec of "subdir/file" explicitly, even though it is the reason that subdir/file would be excluded. This manifests itself to the user as "git add subdir/file" failing to correctly note that the pathspec was ignored. This patch extends the in_pathspec logic to include prefix directory case. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-03-11 08:15:43 +01:00			`/*`
			`* This function tells us whether an excluded path matches a`
			`* list of "interesting" pathspecs. That is, whether a path matched`
			`* by any of the pathspecs could possibly be ignored by excluding`
			`* the specified path. This can happen if:`
			`*`
			`* 1. the path is mentioned explicitly in the pathspec`
			`*`
			`* 2. the path is a directory prefix of some element in the`
			`* pathspec`
			`*/`
			`static int exclude_matches_pathspec(const char *path, int len,`
			`const struct path_simplify *simplify)`
builtin-add: simplify (and increase accuracy of) exclude handling Previously, the code would always set up the excludes, and then manually pick through the pathspec we were given, assuming that non-added but existing paths were just ignored. This was mostly correct, but would erroneously mark a totally empty directory as 'ignored'. Instead, we now use the collect_ignored option of dir_struct, which unambiguously tells us whether a path was ignored. This simplifies the code, and means empty directories are now just not mentioned at all. Furthermore, we now conditionally ask dir_struct to respect excludes, depending on whether the '-f' flag has been set. This means we don't have to pick through the result, checking for an 'ignored' flag; ignored entries were either added or not in the first place. We can safely get rid of the special 'ignored' flags to dir_entry, which were not used anywhere else. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-12 23:42:14 +02:00			`{`
			`if (simplify) {`
			`for (; simplify->path; simplify++) {`
			`if (len == simplify->len`
			`&& !memcmp(path, simplify->path, len))`
			`return 1;`
dir: fix COLLECT_IGNORED on excluded prefixes As we walk the directory tree, if we see an ignored path, we want to add it to the ignored list only if it matches any pathspec that we were given. We used to check for the pathspec to appear explicitly. E.g., if we see "subdir/file" and it is excluded, we check to see if we have "subdir/file" in our pathspec. However, this interacts badly with the optimization to avoid recursing into ignored subdirectories. If "subdir" as a whole is ignored, then we never recurse, and consider only whether "subdir" itself is in our pathspec. It would not match a pathspec of "subdir/file" explicitly, even though it is the reason that subdir/file would be excluded. This manifests itself to the user as "git add subdir/file" failing to correctly note that the pathspec was ignored. This patch extends the in_pathspec logic to include prefix directory case. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-03-11 08:15:43 +01:00			`if (len < simplify->len`
			`&& simplify->path[len] == '/'`
			`&& !memcmp(path, simplify->path, len))`
			`return 1;`
builtin-add: simplify (and increase accuracy of) exclude handling Previously, the code would always set up the excludes, and then manually pick through the pathspec we were given, assuming that non-added but existing paths were just ignored. This was mostly correct, but would erroneously mark a totally empty directory as 'ignored'. Instead, we now use the collect_ignored option of dir_struct, which unambiguously tells us whether a path was ignored. This simplifies the code, and means empty directories are now just not mentioned at all. Furthermore, we now conditionally ask dir_struct to respect excludes, depending on whether the '-f' flag has been set. This means we don't have to pick through the result, checking for an 'ignored' flag; ignored entries were either added or not in the first place. We can safely get rid of the special 'ignored' flags to dir_entry, which were not used anywhere else. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonas Fonseca <fonseca@diku.dk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-12 23:42:14 +02:00			`}`
			`}`
			`return 0;`
			`}`

Avoid using 'lstat()' to figure out directories If we have an up-to-date index entry for a file in that directory, we can know that the directories leading up to that file must be directories. No need to do an lstat() on the directory. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-09 22:14:28 +02:00			`static int get_index_dtype(const char *path, int len)`
			`{`
			`int pos;`
			`struct cache_entry *ce;`

			`ce = cache_name_exists(path, len, 0);`
			`if (ce) {`
			`if (!ce_uptodate(ce))`
			`return DT_UNKNOWN;`
			`if (S_ISGITLINK(ce->ce_mode))`
			`return DT_DIR;`
			`/*`
			`* Nobody actually cares about the`
			`* difference between DT_LNK and DT_REG`
			`*/`
			`return DT_REG;`
			`}`

			`/* Try to look it up as a directory */`
			`pos = cache_name_pos(path, len);`
			`if (pos >= 0)`
			`return DT_UNKNOWN;`
			`pos = -pos-1;`
			`while (pos < active_nr) {`
			`ce = active_cache[pos++];`
			`if (strncmp(ce->name, path, len))`
			`break;`
			`if (ce->name[len] > '/')`
			`break;`
			`if (ce->name[len] < '/')`
			`continue;`
			`if (!ce_uptodate(ce))`
			`break; /* continue? */`
			`return DT_DIR;`
			`}`
			`return DT_UNKNOWN;`
			`}`

Avoid doing extra 'lstat()'s for d_type if we have an up-to-date cache entry On filesystems without d_type, we can look at the cache entry first. Doing an lstat() can be expensive. Reported by Dmitry Potapov for Cygwin. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-09 04:31:49 +02:00			`static int get_dtype(struct dirent de, const char path, int len)`
Fix directory scanner to correctly ignore files without d_type On Fri, 19 Oct 2007, Todd T. Fries wrote: > If DT_UNKNOWN exists, then we have to do a stat() of some form to > find out the right type. That happened in the case of a pathname that was ignored, and we did not ask for "dir->show_ignored". That test used to be together with the "DTYPE(de) != DT_DIR", but splitting the two tests up means that we can do that (common) test before we even bother to calculate the real dtype. Of course, that optimization only matters for systems that don't have, or don't fill in DTYPE properly. I also clarified the real relationship between "exclude" and "dir->show_ignored". It used to do if (exclude != dir->show_ignored) { .. which wasn't exactly obvious, because it triggers for two different cases: - the path is marked excluded, but we are not interested in ignored files: ignore it - the path is not excluded, but we are interested in ignored files: ignore it unless it's a directory, in which case we might have ignored files inside the directory and need to recurse into it). so this splits them into those two cases, since the first case doesn't even care about the type. I also made a the DT_UNKNOWN case a separate helper function, and added some commentary to the cases. Linus Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-19 19:59:22 +02:00			`{`
gitignore: lazily find dtype When we process "foo/" entries in gitignore files on a system that does not have d_type member in "struct dirent", the earlier implementation ran lstat(2) separately when matching with entries that came from the command line, in-tree .gitignore files, and $GIT_DIR/info/excludes file. This optimizes it by delaying the lstat(2) call until it becomes absolutely necessary. The initial idea for this change was by Jeff King, but I optimized it further to pass pointers to around. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-02-01 05:23:25 +01:00			`int dtype = de ? DTYPE(de) : DT_UNKNOWN;`
Fix directory scanner to correctly ignore files without d_type On Fri, 19 Oct 2007, Todd T. Fries wrote: > If DT_UNKNOWN exists, then we have to do a stat() of some form to > find out the right type. That happened in the case of a pathname that was ignored, and we did not ask for "dir->show_ignored". That test used to be together with the "DTYPE(de) != DT_DIR", but splitting the two tests up means that we can do that (common) test before we even bother to calculate the real dtype. Of course, that optimization only matters for systems that don't have, or don't fill in DTYPE properly. I also clarified the real relationship between "exclude" and "dir->show_ignored". It used to do if (exclude != dir->show_ignored) { .. which wasn't exactly obvious, because it triggers for two different cases: - the path is marked excluded, but we are not interested in ignored files: ignore it - the path is not excluded, but we are interested in ignored files: ignore it unless it's a directory, in which case we might have ignored files inside the directory and need to recurse into it). so this splits them into those two cases, since the first case doesn't even care about the type. I also made a the DT_UNKNOWN case a separate helper function, and added some commentary to the cases. Linus Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-19 19:59:22 +02:00			`struct stat st;`

			`if (dtype != DT_UNKNOWN)`
			`return dtype;`
Avoid using 'lstat()' to figure out directories If we have an up-to-date index entry for a file in that directory, we can know that the directories leading up to that file must be directories. No need to do an lstat() on the directory. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-09 22:14:28 +02:00			`dtype = get_index_dtype(path, len);`
			`if (dtype != DT_UNKNOWN)`
			`return dtype;`
			`if (lstat(path, &st))`
Fix directory scanner to correctly ignore files without d_type On Fri, 19 Oct 2007, Todd T. Fries wrote: > If DT_UNKNOWN exists, then we have to do a stat() of some form to > find out the right type. That happened in the case of a pathname that was ignored, and we did not ask for "dir->show_ignored". That test used to be together with the "DTYPE(de) != DT_DIR", but splitting the two tests up means that we can do that (common) test before we even bother to calculate the real dtype. Of course, that optimization only matters for systems that don't have, or don't fill in DTYPE properly. I also clarified the real relationship between "exclude" and "dir->show_ignored". It used to do if (exclude != dir->show_ignored) { .. which wasn't exactly obvious, because it triggers for two different cases: - the path is marked excluded, but we are not interested in ignored files: ignore it - the path is not excluded, but we are interested in ignored files: ignore it unless it's a directory, in which case we might have ignored files inside the directory and need to recurse into it). so this splits them into those two cases, since the first case doesn't even care about the type. I also made a the DT_UNKNOWN case a separate helper function, and added some commentary to the cases. Linus Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2007-10-19 19:59:22 +02:00			`return dtype;`
			`if (S_ISREG(st.st_mode))`
			`return DT_REG;`
			`if (S_ISDIR(st.st_mode))`
			`return DT_DIR;`
			`if (S_ISLNK(st.st_mode))`
			`return DT_LNK;`
			`return dtype;`
			`}`

read_directory(): further split treat_path() The next caller I'll be adding won't have an access to struct dirent because it won't be reading from a directory stream. Split the main part of the function further into a separate function to make it usable by a caller without passing a dirent as long as it knows what type is feeding the function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 05:56:16 +01:00			`static enum path_treatment treat_one_path(struct dir_struct *dir,`
dir: convert to strbuf The functions read_directory_recursive() and treat_leading_path() both use buffers sized to fit PATH_MAX characters. The latter can be made to overrun its buffer, e.g. like this: $ a=0123456789abcdef $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ git add $a/a Instead of trying to add a check and potentionally forgetting to address similar cases, convert the involved functions and their helpers to use struct strbuf. The patch is suprisingly large because the helpers treat_path() and treat_one_path() modify the buffer as well and thus need to be converted, too. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-01 13:25:24 +02:00			`struct strbuf *path,`
read_directory(): further split treat_path() The next caller I'll be adding won't have an access to struct dirent because it won't be reading from a directory stream. Split the main part of the function further into a separate function to make it usable by a caller without passing a dirent as long as it knows what type is feeding the function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 05:56:16 +01:00			`const struct path_simplify *simplify,`
			`int dtype, struct dirent *de)`
read_directory_recursive(): refactor handling of a single path into a separate function Primarily because I want to reuse it in a separate function later, but this de-dents a huge function by one tabstop which by itself is an improvement as well. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 04:14:07 +01:00			`{`
dir.c: git-status: avoid is_excluded checks for tracked files Checking if a file is in the index is much faster (hashtable lookup) than checking if the file is excluded (linear search over exclude patterns). Skip is_excluded checks for files: move the cache_name_exists check from treat_file to treat_one_path and return early if the file is tracked. This can safely be done as all other code paths also return path_ignored for tracked files, and dir_add_ignored skips tracked files as well. There's just one line left in treat_file, so move this to treat_one_path as well. Here's some performance data for git-status from the linux and WebKit repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true): \| status \| status --ignored \| linux \| WebKit \| linux \| WebKit -------+-------+--------+-------+--------- before \| 0.218 \| 1.583 \| 0.321 \| 2.579 after \| 0.156 \| 0.988 \| 0.202 \| 1.279 gain \| 1.397 \| 1.602 \| 1.589 \| 2.016 Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:13:35 +02:00			`int exclude;`
			`if (dtype == DT_UNKNOWN)`
			`dtype = get_dtype(de, path->buf, path->len);`

			`/* Always exclude indexed files */`
			`if (dtype != DT_DIR &&`
			`cache_name_exists(path->buf, path->len, ignore_case))`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`return path_none;`
dir.c: git-status: avoid is_excluded checks for tracked files Checking if a file is in the index is much faster (hashtable lookup) than checking if the file is excluded (linear search over exclude patterns). Skip is_excluded checks for files: move the cache_name_exists check from treat_file to treat_one_path and return early if the file is tracked. This can safely be done as all other code paths also return path_ignored for tracked files, and dir_add_ignored skips tracked files as well. There's just one line left in treat_file, so move this to treat_one_path as well. Here's some performance data for git-status from the linux and WebKit repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true): \| status \| status --ignored \| linux \| WebKit \| linux \| WebKit -------+-------+--------+-------+--------- before \| 0.218 \| 1.583 \| 0.321 \| 2.579 after \| 0.156 \| 0.988 \| 0.202 \| 1.279 gain \| 1.397 \| 1.602 \| 1.589 \| 2.016 Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:13:35 +02:00
			`exclude = is_excluded(dir, path->buf, &dtype);`
read_directory_recursive(): refactor handling of a single path into a separate function Primarily because I want to reuse it in a separate function later, but this de-dents a huge function by one tabstop which by itself is an improvement as well. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 04:14:07 +01:00
			`/*`
			`* Excluded? If we don't explicitly want to show`
			`* ignored files, ignore it`
			`*/`
dir.c: git-status --ignored: don't scan the work tree twice 'git-status --ignored' still scans the work tree twice to collect untracked and ignored files, respectively. fill_directory / read_directory already supports collecting untracked and ignored files in a single directory scan. However, the DIR_COLLECT_IGNORED flag to enable this has some git-add specific side-effects (e.g. it doesn't recurse into ignored directories, so listing ignored files with --untracked=all doesn't work). The DIR_SHOW_IGNORED flag doesn't list untracked files and returns ignored files in dir_struct.entries[] (instead of dir_struct.ignored[] as DIR_COLLECT_IGNORED). DIR_SHOW_IGNORED is used all throughout git. We don't want to break the existing API, so lets introduce a new flag DIR_SHOW_IGNORED_TOO that lists untracked as well as ignored files similar to DIR_COLLECT_FILES, but will recurse into sub-directories based on the other flags as DIR_SHOW_IGNORED does. In dir.c::read_directory_recursive, add ignored files to either dir_struct.entries[] or dir_struct.ignored[] based on the flags. Also move the DIR_COLLECT_IGNORED case here so that filling result lists is in a common place. In wt-status.c::wt_status_collect_untracked, use the new flag and read results from dir_struct.ignored[]. Remove the extra fill_directory call. builtin/check-ignore.c doesn't call fill_directory, setting the git-add specific DIR_COLLECT_IGNORED flag has no effect here. Remove for clarity. Update API documentation to reflect the changes. Performance: with this patch, 'git-status --ignored' is typically as fast as 'git-status'. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:15:03 +02:00			`if (exclude && !(dir->flags & (DIR_SHOW_IGNORED\|DIR_SHOW_IGNORED_TOO)))`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`return path_excluded;`
read_directory_recursive(): refactor handling of a single path into a separate function Primarily because I want to reuse it in a separate function later, but this de-dents a huge function by one tabstop which by itself is an improvement as well. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 04:14:07 +01:00
			`switch (dtype) {`
			`default:`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`return path_none;`
read_directory_recursive(): refactor handling of a single path into a separate function Primarily because I want to reuse it in a separate function later, but this de-dents a huge function by one tabstop which by itself is an improvement as well. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 04:14:07 +01:00			`case DT_DIR:`
dir: convert to strbuf The functions read_directory_recursive() and treat_leading_path() both use buffers sized to fit PATH_MAX characters. The latter can be made to overrun its buffer, e.g. like this: $ a=0123456789abcdef $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ git add $a/a Instead of trying to add a check and potentionally forgetting to address similar cases, convert the involved functions and their helpers to use struct strbuf. The patch is suprisingly large because the helpers treat_path() and treat_one_path() modify the buffer as well and thus need to be converted, too. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-01 13:25:24 +02:00			`strbuf_addch(path, '/');`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`return treat_directory(dir, path->buf, path->len, exclude,`
			`simplify);`
read_directory_recursive(): refactor handling of a single path into a separate function Primarily because I want to reuse it in a separate function later, but this de-dents a huge function by one tabstop which by itself is an improvement as well. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 04:14:07 +01:00			`case DT_REG:`
			`case DT_LNK:`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`return exclude ? path_excluded : path_untracked;`
read_directory_recursive(): refactor handling of a single path into a separate function Primarily because I want to reuse it in a separate function later, but this de-dents a huge function by one tabstop which by itself is an improvement as well. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 04:14:07 +01:00			`}`
			`}`

read_directory(): further split treat_path() The next caller I'll be adding won't have an access to struct dirent because it won't be reading from a directory stream. Split the main part of the function further into a separate function to make it usable by a caller without passing a dirent as long as it knows what type is feeding the function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 05:56:16 +01:00			`static enum path_treatment treat_path(struct dir_struct *dir,`
			`struct dirent *de,`
dir: convert to strbuf The functions read_directory_recursive() and treat_leading_path() both use buffers sized to fit PATH_MAX characters. The latter can be made to overrun its buffer, e.g. like this: $ a=0123456789abcdef $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ git add $a/a Instead of trying to add a check and potentionally forgetting to address similar cases, convert the involved functions and their helpers to use struct strbuf. The patch is suprisingly large because the helpers treat_path() and treat_one_path() modify the buffer as well and thus need to be converted, too. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-01 13:25:24 +02:00			`struct strbuf *path,`
read_directory(): further split treat_path() The next caller I'll be adding won't have an access to struct dirent because it won't be reading from a directory stream. Split the main part of the function further into a separate function to make it usable by a caller without passing a dirent as long as it knows what type is feeding the function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 05:56:16 +01:00			`int baselen,`
dir: convert to strbuf The functions read_directory_recursive() and treat_leading_path() both use buffers sized to fit PATH_MAX characters. The latter can be made to overrun its buffer, e.g. like this: $ a=0123456789abcdef $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ git add $a/a Instead of trying to add a check and potentionally forgetting to address similar cases, convert the involved functions and their helpers to use struct strbuf. The patch is suprisingly large because the helpers treat_path() and treat_one_path() modify the buffer as well and thus need to be converted, too. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-01 13:25:24 +02:00			`const struct path_simplify *simplify)`
read_directory(): further split treat_path() The next caller I'll be adding won't have an access to struct dirent because it won't be reading from a directory stream. Split the main part of the function further into a separate function to make it usable by a caller without passing a dirent as long as it knows what type is feeding the function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 05:56:16 +01:00			`{`
			`int dtype;`

			`if (is_dot_or_dotdot(de->d_name) \|\| !strcmp(de->d_name, ".git"))`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`return path_none;`
dir: convert to strbuf The functions read_directory_recursive() and treat_leading_path() both use buffers sized to fit PATH_MAX characters. The latter can be made to overrun its buffer, e.g. like this: $ a=0123456789abcdef $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ git add $a/a Instead of trying to add a check and potentionally forgetting to address similar cases, convert the involved functions and their helpers to use struct strbuf. The patch is suprisingly large because the helpers treat_path() and treat_one_path() modify the buffer as well and thus need to be converted, too. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-01 13:25:24 +02:00			`strbuf_setlen(path, baselen);`
			`strbuf_addstr(path, de->d_name);`
			`if (simplify_away(path->buf, path->len, simplify))`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`return path_none;`
read_directory(): further split treat_path() The next caller I'll be adding won't have an access to struct dirent because it won't be reading from a directory stream. Split the main part of the function further into a separate function to make it usable by a caller without passing a dirent as long as it knows what type is feeding the function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 05:56:16 +01:00
			`dtype = DTYPE(de);`
dir: convert to strbuf The functions read_directory_recursive() and treat_leading_path() both use buffers sized to fit PATH_MAX characters. The latter can be made to overrun its buffer, e.g. like this: $ a=0123456789abcdef $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ git add $a/a Instead of trying to add a check and potentionally forgetting to address similar cases, convert the involved functions and their helpers to use struct strbuf. The patch is suprisingly large because the helpers treat_path() and treat_one_path() modify the buffer as well and thus need to be converted, too. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-01 13:25:24 +02:00			`return treat_one_path(dir, path, simplify, dtype, de);`
read_directory(): further split treat_path() The next caller I'll be adding won't have an access to struct dirent because it won't be reading from a directory stream. Split the main part of the function further into a separate function to make it usable by a caller without passing a dirent as long as it knows what type is feeding the function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 05:56:16 +01:00			`}`

libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`/*`
			`* Read a directory tree. We currently ignore anything but`
			`* directories, regular files and symlinks. That's because git`
			`* doesn't handle them at all yet. Maybe that will change some`
			`* day.`
			`*`
			`* Also, we ignore the name ".git" (even if it is not a directory).`
			`* That likely will not change.`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`*`
			`* Returns the most significant path_treatment value encountered in the scan.`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`*/`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`static enum path_treatment read_directory_recursive(struct dir_struct *dir,`
read_directory_recursive(): refactor handling of a single path into a separate function Primarily because I want to reuse it in a separate function later, but this de-dents a huge function by one tabstop which by itself is an improvement as well. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 04:14:07 +01:00			`const char *base, int baselen,`
			`int check_only,`
			`const struct path_simplify *simplify)`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`{`
dir: respect string length argument of read_directory_recursive() A directory name is passed to read_directory_recursive() as a length-limited string, through the parameters base and baselen. Suprisingly, base must be a NUL-terminated string as well, as it is passed to opendir(), ignoring baselen. Fix this by postponing the call to opendir() until the length-limted string is added to a strbuf, which provides a NUL in the right place. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-11 16:53:07 +02:00			`DIR *fdir;`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`enum path_treatment state, subdir_state, dir_state = path_none;`
read_directory_recursive: reduce one indentation level Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:11 +02:00			`struct dirent *de;`
Merge branch 'rs/maint-dir-strbuf' into rs/dir-strbuf By René Scharfe * rs/maint-dir-strbuf: dir: convert to strbuf 2012-05-08 18:43:40 +02:00			`struct strbuf path = STRBUF_INIT;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00
Merge branch 'rs/maint-dir-strbuf' into rs/dir-strbuf By René Scharfe * rs/maint-dir-strbuf: dir: convert to strbuf 2012-05-08 18:43:40 +02:00			`strbuf_add(&path, base, baselen);`
read_directory_recursive: reduce one indentation level Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:11 +02:00
dir: respect string length argument of read_directory_recursive() A directory name is passed to read_directory_recursive() as a length-limited string, through the parameters base and baselen. Suprisingly, base must be a NUL-terminated string as well, as it is passed to opendir(), ignoring baselen. Fix this by postponing the call to opendir() until the length-limted string is added to a strbuf, which provides a NUL in the right place. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-11 16:53:07 +02:00			`fdir = opendir(path.len ? path.buf : ".");`
			`if (!fdir)`
			`goto out;`

read_directory_recursive: reduce one indentation level Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:11 +02:00			`while ((de = readdir(fdir)) != NULL) {`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`/* check how the file or directory should be treated */`
			`state = treat_path(dir, de, &path, baselen, simplify);`
			`if (state > dir_state)`
			`dir_state = state;`

			`/* recurse into subdir if instructed by treat_path */`
			`if (state == path_recurse) {`
			`subdir_state = read_directory_recursive(dir, path.buf,`
dir.c: git-status --ignored: don't list files in ignored directories 'git-status --ignored' lists both the ignored directory and the ignored files if the files are in a tracked sub directory. When recursing into sub directories in read_directory_recursive, pass on the check_only parameter so that we don't accidentally add the files. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:06:30 +02:00			`path.len, check_only, simplify);`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`if (subdir_state > dir_state)`
			`dir_state = subdir_state;`
			`}`

			`if (check_only) {`
			`/* abort early if maximum state has been reached */`
			`if (dir_state == path_untracked)`
			`break;`
			`/* skip the dir_add_* part */`
read_directory_recursive: reduce one indentation level Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:11 +02:00			`continue;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00
			`/* add the path to the appropriate result list */`
			`switch (state) {`
			`case path_excluded:`
			`if (dir->flags & DIR_SHOW_IGNORED)`
			`dir_add_name(dir, path.buf, path.len);`
dir.c: git-status --ignored: don't scan the work tree twice 'git-status --ignored' still scans the work tree twice to collect untracked and ignored files, respectively. fill_directory / read_directory already supports collecting untracked and ignored files in a single directory scan. However, the DIR_COLLECT_IGNORED flag to enable this has some git-add specific side-effects (e.g. it doesn't recurse into ignored directories, so listing ignored files with --untracked=all doesn't work). The DIR_SHOW_IGNORED flag doesn't list untracked files and returns ignored files in dir_struct.entries[] (instead of dir_struct.ignored[] as DIR_COLLECT_IGNORED). DIR_SHOW_IGNORED is used all throughout git. We don't want to break the existing API, so lets introduce a new flag DIR_SHOW_IGNORED_TOO that lists untracked as well as ignored files similar to DIR_COLLECT_FILES, but will recurse into sub-directories based on the other flags as DIR_SHOW_IGNORED does. In dir.c::read_directory_recursive, add ignored files to either dir_struct.entries[] or dir_struct.ignored[] based on the flags. Also move the DIR_COLLECT_IGNORED case here so that filling result lists is in a common place. In wt-status.c::wt_status_collect_untracked, use the new flag and read results from dir_struct.ignored[]. Remove the extra fill_directory call. builtin/check-ignore.c doesn't call fill_directory, setting the git-add specific DIR_COLLECT_IGNORED flag has no effect here. Remove for clarity. Update API documentation to reflect the changes. Performance: with this patch, 'git-status --ignored' is typically as fast as 'git-status'. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:15:03 +02:00			`else if ((dir->flags & DIR_SHOW_IGNORED_TOO) \|\|`
			`((dir->flags & DIR_COLLECT_IGNORED) &&`
			`exclude_matches_pathspec(path.buf, path.len,`
			`simplify)))`
			`dir_add_ignored(dir, path.buf, path.len);`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`break;`

			`case path_untracked:`
			`if (!(dir->flags & DIR_SHOW_IGNORED))`
			`dir_add_name(dir, path.buf, path.len);`
dir: respect string length argument of read_directory_recursive() A directory name is passed to read_directory_recursive() as a length-limited string, through the parameters base and baselen. Suprisingly, base must be a NUL-terminated string as well, as it is passed to opendir(), ignoring baselen. Fix this by postponing the call to opendir() until the length-limted string is added to a strbuf, which provides a NUL in the right place. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-11 16:53:07 +02:00			`break;`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00
			`default:`
			`break;`
			`}`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`
read_directory_recursive: reduce one indentation level Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-10-24 08:36:11 +02:00			`closedir(fdir);`
dir: respect string length argument of read_directory_recursive() A directory name is passed to read_directory_recursive() as a length-limited string, through the parameters base and baselen. Suprisingly, base must be a NUL-terminated string as well, as it is passed to opendir(), ignoring baselen. Fix this by postponing the call to opendir() until the length-limted string is added to a strbuf, which provides a NUL in the right place. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-11 16:53:07 +02:00			`out:`
Merge branch 'rs/maint-dir-strbuf' into rs/dir-strbuf By René Scharfe * rs/maint-dir-strbuf: dir: convert to strbuf 2012-05-08 18:43:40 +02:00			`strbuf_release(&path);`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`return dir_state;`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`}`

			`static int cmp_name(const void p1, const void p2)`
			`{`
			`const struct dir_entry e1 = (const struct dir_entry **)p1;`
			`const struct dir_entry e2 = (const struct dir_entry **)p2;`

			`return cache_name_compare(e1->name, e1->len,`
			`e2->name, e2->len);`
			`}`

Optimize directory listing with pathspec limiter. The way things are set up, you can now pass a "pathspec" to the "read_directory()" function. If you pass NULL, it acts exactly like it used to do (read everything). If you pass a non-NULL pointer, it will simplify it into a "these are the prefixes without any special characters", and stop any readdir() early if the path in question doesn't match any of the prefixes. NOTE! This does not obviate the need for the caller to do the exact pathspec match later. It's a first-level filter on "read_directory()", but it does not do the full pathspec thing. Maybe it should. But in the meantime, builtin-add.c really does need to do first read_directory(dir, .., pathspec); if (pathspec) prune_directory(dir, pathspec, baselen); ie the "prune_directory()" part will do the exact pathspec pruning, while the "read_directory()" will use the pathspec just to do some quick high-level pruning of the directories it will recurse into. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-31 05:39:30 +02:00			`static struct path_simplify create_simplify(const char *pathspec)`
			`{`
			`int nr, alloc = 0;`
			`struct path_simplify *simplify = NULL;`

			`if (!pathspec)`
			`return NULL;`

			`for (nr = 0 ; ; nr++) {`
			`const char *match;`
			`if (nr >= alloc) {`
			`alloc = alloc_nr(alloc);`
			`simplify = xrealloc(simplify, alloc * sizeof(*simplify));`
			`}`
			`match = *pathspec++;`
			`if (!match)`
			`break;`
			`simplify[nr].path = match;`
			`simplify[nr].len = simple_length(match);`
			`}`
			`simplify[nr].path = NULL;`
			`simplify[nr].len = 0;`
			`return simplify;`
			`}`

			`static void free_simplify(struct path_simplify *simplify)`
			`{`
Avoid unnecessary "if-before-free" tests. This change removes all obvious useless if-before-free tests. E.g., it replaces code like this: if (some_expression) free (some_expression); with the now-equivalent: free (some_expression); It is equivalent not just because POSIX has required free(NULL) to work for a long time, but simply because it has worked for so long that no reasonable porting target fails the test. Here's some evidence from nearly 1.5 years ago: http://www.winehq.org/pipermail/wine-patches/2006-October/031544.html FYI, the change below was prepared by running the following: git ls-files -z \| xargs -0 \ perl -0x3b -pi -e \ 's/\bif\s\(\s(\S+?)(?:\s!=\sNULL)?\s\)\s+(free\s\(\s\1\s\))/$2/s' Note however, that it doesn't handle brace-enclosed blocks like "if (x) { free (x); }". But that's ok, since there were none like that in git sources. Beware: if you do use the above snippet, note that it can produce syntactically invalid C code. That happens when the affected "if"-statement has a matching "else". E.g., it would transform this if (x) free (x); else foo (); into this: free (x); else foo (); There were none of those here, either. If you're interested in automating detection of the useless tests, you might like the useless-if-before-free script in gnulib: [it does detect brace-enclosed free statements, and has a --name=S option to make it detect free-like functions with different names] http://git.sv.gnu.org/gitweb/?p=gnulib.git;a=blob;f=build-aux/useless-if-before-free Addendum: Remove one more (in imap-send.c), spotted by Jean-Luc Herren <jlh@gmx.ch>. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-01-31 18:26:32 +01:00			`free(simplify);`
Optimize directory listing with pathspec limiter. The way things are set up, you can now pass a "pathspec" to the "read_directory()" function. If you pass NULL, it acts exactly like it used to do (read everything). If you pass a non-NULL pointer, it will simplify it into a "these are the prefixes without any special characters", and stop any readdir() early if the path in question doesn't match any of the prefixes. NOTE! This does not obviate the need for the caller to do the exact pathspec match later. It's a first-level filter on "read_directory()", but it does not do the full pathspec thing. Maybe it should. But in the meantime, builtin-add.c really does need to do first read_directory(dir, .., pathspec); if (pathspec) prune_directory(dir, pathspec, baselen); ie the "prune_directory()" part will do the exact pathspec pruning, while the "read_directory()" will use the pathspec just to do some quick high-level pruning of the directories it will recurse into. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-31 05:39:30 +02:00			`}`

ls-files: fix overeager pathspec optimization Given pathspecs that share a common prefix, ls-files optimized its call into recursive directory reader by starting at the common prefix directory. If you have a directory "t" with an untracked file "t/junk" in it, but the top-level .gitignore file told us to ignore "t/", this resulted in: $ git ls-files -o --exclude-standard $ git ls-files -o --exclude-standard t/ t/junk $ git ls-files -o --exclude-standard t/junk t/junk $ cd t && git ls-files -o --exclude-standard junk We could argue that you are overriding the ignore file by giving a patchspec that matches or being in that directory, but it is somewhat unexpected. Worse yet, these behave differently: $ git ls-files -o --exclude-standard t/ . $ git ls-files -o --exclude-standard t/ t/junk This patch changes the optimization so that it notices when the common prefix directory that it starts reading from is an ignored one. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 08:05:41 +01:00			`static int treat_leading_path(struct dir_struct *dir,`
			`const char *path, int len,`
			`const struct path_simplify *simplify)`
			`{`
dir: convert to strbuf The functions read_directory_recursive() and treat_leading_path() both use buffers sized to fit PATH_MAX characters. The latter can be made to overrun its buffer, e.g. like this: $ a=0123456789abcdef $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ git add $a/a Instead of trying to add a check and potentionally forgetting to address similar cases, convert the involved functions and their helpers to use struct strbuf. The patch is suprisingly large because the helpers treat_path() and treat_one_path() modify the buffer as well and thus need to be converted, too. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-01 13:25:24 +02:00			`struct strbuf sb = STRBUF_INIT;`
			`int baselen, rc = 0;`
ls-files: fix overeager pathspec optimization Given pathspecs that share a common prefix, ls-files optimized its call into recursive directory reader by starting at the common prefix directory. If you have a directory "t" with an untracked file "t/junk" in it, but the top-level .gitignore file told us to ignore "t/", this resulted in: $ git ls-files -o --exclude-standard $ git ls-files -o --exclude-standard t/ t/junk $ git ls-files -o --exclude-standard t/junk t/junk $ cd t && git ls-files -o --exclude-standard junk We could argue that you are overriding the ignore file by giving a patchspec that matches or being in that directory, but it is somewhat unexpected. Worse yet, these behave differently: $ git ls-files -o --exclude-standard t/ . $ git ls-files -o --exclude-standard t/ t/junk This patch changes the optimization so that it notices when the common prefix directory that it starts reading from is an ignored one. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 08:05:41 +01:00			`const char *cp;`
dir.c: make 'git-status --ignored' work within leading directories 'git-status --ignored path/' doesn't list ignored files and directories within 'path' if some component of 'path' is classified as untracked. Disable the DIR_SHOW_OTHER_DIRECTORIES flag while traversing leading directories. This prevents treat_leading_path() with DIR_SHOW_IGNORED flag from aborting at the top level untracked directory. As a side effect, this also eliminates a recursive directory scan per leading directory level, as treat_directory() can no longer call read_directory_recursive() when called from treat_leading_path(). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:09:25 +02:00			`int old_flags = dir->flags;`
ls-files: fix overeager pathspec optimization Given pathspecs that share a common prefix, ls-files optimized its call into recursive directory reader by starting at the common prefix directory. If you have a directory "t" with an untracked file "t/junk" in it, but the top-level .gitignore file told us to ignore "t/", this resulted in: $ git ls-files -o --exclude-standard $ git ls-files -o --exclude-standard t/ t/junk $ git ls-files -o --exclude-standard t/junk t/junk $ cd t && git ls-files -o --exclude-standard junk We could argue that you are overriding the ignore file by giving a patchspec that matches or being in that directory, but it is somewhat unexpected. Worse yet, these behave differently: $ git ls-files -o --exclude-standard t/ . $ git ls-files -o --exclude-standard t/ t/junk This patch changes the optimization so that it notices when the common prefix directory that it starts reading from is an ignored one. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 08:05:41 +01:00
			`while (len && path[len - 1] == '/')`
			`len--;`
			`if (!len)`
			`return 1;`
			`baselen = 0;`
dir.c: make 'git-status --ignored' work within leading directories 'git-status --ignored path/' doesn't list ignored files and directories within 'path' if some component of 'path' is classified as untracked. Disable the DIR_SHOW_OTHER_DIRECTORIES flag while traversing leading directories. This prevents treat_leading_path() with DIR_SHOW_IGNORED flag from aborting at the top level untracked directory. As a side effect, this also eliminates a recursive directory scan per leading directory level, as treat_directory() can no longer call read_directory_recursive() when called from treat_leading_path(). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:09:25 +02:00			`dir->flags &= ~DIR_SHOW_OTHER_DIRECTORIES;`
ls-files: fix overeager pathspec optimization Given pathspecs that share a common prefix, ls-files optimized its call into recursive directory reader by starting at the common prefix directory. If you have a directory "t" with an untracked file "t/junk" in it, but the top-level .gitignore file told us to ignore "t/", this resulted in: $ git ls-files -o --exclude-standard $ git ls-files -o --exclude-standard t/ t/junk $ git ls-files -o --exclude-standard t/junk t/junk $ cd t && git ls-files -o --exclude-standard junk We could argue that you are overriding the ignore file by giving a patchspec that matches or being in that directory, but it is somewhat unexpected. Worse yet, these behave differently: $ git ls-files -o --exclude-standard t/ . $ git ls-files -o --exclude-standard t/ t/junk This patch changes the optimization so that it notices when the common prefix directory that it starts reading from is an ignored one. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 08:05:41 +01:00			`while (1) {`
			`cp = path + baselen + !!baselen;`
			`cp = memchr(cp, '/', path + len - cp);`
			`if (!cp)`
			`baselen = len;`
			`else`
			`baselen = cp - path;`
dir: convert to strbuf The functions read_directory_recursive() and treat_leading_path() both use buffers sized to fit PATH_MAX characters. The latter can be made to overrun its buffer, e.g. like this: $ a=0123456789abcdef $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ git add $a/a Instead of trying to add a check and potentionally forgetting to address similar cases, convert the involved functions and their helpers to use struct strbuf. The patch is suprisingly large because the helpers treat_path() and treat_one_path() modify the buffer as well and thus need to be converted, too. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-01 13:25:24 +02:00			`strbuf_setlen(&sb, 0);`
			`strbuf_add(&sb, path, baselen);`
			`if (!is_directory(sb.buf))`
			`break;`
			`if (simplify_away(sb.buf, sb.len, simplify))`
			`break;`
			`if (treat_one_path(dir, &sb, simplify,`
dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:14:22 +02:00			`DT_DIR, NULL) == path_none)`
dir: convert to strbuf The functions read_directory_recursive() and treat_leading_path() both use buffers sized to fit PATH_MAX characters. The latter can be made to overrun its buffer, e.g. like this: $ a=0123456789abcdef $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ git add $a/a Instead of trying to add a check and potentionally forgetting to address similar cases, convert the involved functions and their helpers to use struct strbuf. The patch is suprisingly large because the helpers treat_path() and treat_one_path() modify the buffer as well and thus need to be converted, too. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-01 13:25:24 +02:00			`break; /* do not recurse into it */`
			`if (len <= baselen) {`
			`rc = 1;`
			`break; /* finished checking */`
			`}`
ls-files: fix overeager pathspec optimization Given pathspecs that share a common prefix, ls-files optimized its call into recursive directory reader by starting at the common prefix directory. If you have a directory "t" with an untracked file "t/junk" in it, but the top-level .gitignore file told us to ignore "t/", this resulted in: $ git ls-files -o --exclude-standard $ git ls-files -o --exclude-standard t/ t/junk $ git ls-files -o --exclude-standard t/junk t/junk $ cd t && git ls-files -o --exclude-standard junk We could argue that you are overriding the ignore file by giving a patchspec that matches or being in that directory, but it is somewhat unexpected. Worse yet, these behave differently: $ git ls-files -o --exclude-standard t/ . $ git ls-files -o --exclude-standard t/ t/junk This patch changes the optimization so that it notices when the common prefix directory that it starts reading from is an ignored one. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 08:05:41 +01:00			`}`
dir: convert to strbuf The functions read_directory_recursive() and treat_leading_path() both use buffers sized to fit PATH_MAX characters. The latter can be made to overrun its buffer, e.g. like this: $ a=0123456789abcdef $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ git add $a/a Instead of trying to add a check and potentionally forgetting to address similar cases, convert the involved functions and their helpers to use struct strbuf. The patch is suprisingly large because the helpers treat_path() and treat_one_path() modify the buffer as well and thus need to be converted, too. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-01 13:25:24 +02:00			`strbuf_release(&sb);`
dir.c: make 'git-status --ignored' work within leading directories 'git-status --ignored path/' doesn't list ignored files and directories within 'path' if some component of 'path' is classified as untracked. Disable the DIR_SHOW_OTHER_DIRECTORIES flag while traversing leading directories. This prevents treat_leading_path() with DIR_SHOW_IGNORED flag from aborting at the top level untracked directory. As a side effect, this also eliminates a recursive directory scan per leading directory level, as treat_directory() can no longer call read_directory_recursive() when called from treat_leading_path(). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-15 21:09:25 +02:00			`dir->flags = old_flags;`
dir: convert to strbuf The functions read_directory_recursive() and treat_leading_path() both use buffers sized to fit PATH_MAX characters. The latter can be made to overrun its buffer, e.g. like this: $ a=0123456789abcdef $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ a=$a$a$a$a$a$a$a$a $ git add $a/a Instead of trying to add a check and potentionally forgetting to address similar cases, convert the involved functions and their helpers to use struct strbuf. The patch is suprisingly large because the helpers treat_path() and treat_one_path() modify the buffer as well and thus need to be converted, too. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-05-01 13:25:24 +02:00			`return rc;`
ls-files: fix overeager pathspec optimization Given pathspecs that share a common prefix, ls-files optimized its call into recursive directory reader by starting at the common prefix directory. If you have a directory "t" with an untracked file "t/junk" in it, but the top-level .gitignore file told us to ignore "t/", this resulted in: $ git ls-files -o --exclude-standard $ git ls-files -o --exclude-standard t/ t/junk $ git ls-files -o --exclude-standard t/junk t/junk $ cd t && git ls-files -o --exclude-standard junk We could argue that you are overriding the ignore file by giving a patchspec that matches or being in that directory, but it is somewhat unexpected. Worse yet, these behave differently: $ git ls-files -o --exclude-standard t/ . $ git ls-files -o --exclude-standard t/ t/junk This patch changes the optimization so that it notices when the common prefix directory that it starts reading from is an ignored one. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 08:05:41 +01:00			`}`

Simplify read_directory[_recursive]() arguments Stop the insanity with separate 'path' and 'base' arguments that must match. We don't need that crazy interface any more, since we cleaned up handling of 'path' in commit da4b3e8c28b1dc2b856d2555ac7bb47ab712598c. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-09 04:24:39 +02:00			`int read_directory(struct dir_struct dir, const char path, int len, const char **pathspec)`
Optimize directory listing with pathspec limiter. The way things are set up, you can now pass a "pathspec" to the "read_directory()" function. If you pass NULL, it acts exactly like it used to do (read everything). If you pass a non-NULL pointer, it will simplify it into a "these are the prefixes without any special characters", and stop any readdir() early if the path in question doesn't match any of the prefixes. NOTE! This does not obviate the need for the caller to do the exact pathspec match later. It's a first-level filter on "read_directory()", but it does not do the full pathspec thing. Maybe it should. But in the meantime, builtin-add.c really does need to do first read_directory(dir, .., pathspec); if (pathspec) prune_directory(dir, pathspec, baselen); ie the "prune_directory()" part will do the exact pathspec pruning, while the "read_directory()" will use the pathspec just to do some quick high-level pruning of the directories it will recurse into. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-31 05:39:30 +02:00			`{`
add: refuse to add working tree items beyond symlinks This is the same fix for the issue of adding "sym/path" when "sym" is a symblic link that points at a directory "dir" with "path" in it. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-04 09:52:37 +02:00			`struct path_simplify *simplify;`
Clean up git-ls-file directory walking library interface This moves the code to add the per-directory ignore files for the base directory into the library routine. That not only allows us to turn the function push_exclude_per_directory() static again, it also simplifies the library interface a lot (the caller no longer needs to worry about any of the per-directory exclude files at all). Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:46:16 +02:00
Simplify read_directory[_recursive]() arguments Stop the insanity with separate 'path' and 'base' arguments that must match. We don't need that crazy interface any more, since we cleaned up handling of 'path' in commit da4b3e8c28b1dc2b856d2555ac7bb47ab712598c. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-09 04:24:39 +02:00			`if (has_symlink_leading_path(path, len))`
add: refuse to add working tree items beyond symlinks This is the same fix for the issue of adding "sym/path" when "sym" is a symblic link that points at a directory "dir" with "path" in it. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2008-08-04 09:52:37 +02:00			`return dir->nr;`

			`simplify = create_simplify(pathspec);`
ls-files: fix overeager pathspec optimization Given pathspecs that share a common prefix, ls-files optimized its call into recursive directory reader by starting at the common prefix directory. If you have a directory "t" with an untracked file "t/junk" in it, but the top-level .gitignore file told us to ignore "t/", this resulted in: $ git ls-files -o --exclude-standard $ git ls-files -o --exclude-standard t/ t/junk $ git ls-files -o --exclude-standard t/junk t/junk $ cd t && git ls-files -o --exclude-standard junk We could argue that you are overriding the ignore file by giving a patchspec that matches or being in that directory, but it is somewhat unexpected. Worse yet, these behave differently: $ git ls-files -o --exclude-standard t/ . $ git ls-files -o --exclude-standard t/ t/junk This patch changes the optimization so that it notices when the common prefix directory that it starts reading from is an ignored one. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-01-09 08:05:41 +01:00			`if (!len \|\| treat_leading_path(dir, path, len, simplify))`
			`read_directory_recursive(dir, path, len, 0, simplify);`
Optimize directory listing with pathspec limiter. The way things are set up, you can now pass a "pathspec" to the "read_directory()" function. If you pass NULL, it acts exactly like it used to do (read everything). If you pass a non-NULL pointer, it will simplify it into a "these are the prefixes without any special characters", and stop any readdir() early if the path in question doesn't match any of the prefixes. NOTE! This does not obviate the need for the caller to do the exact pathspec match later. It's a first-level filter on "read_directory()", but it does not do the full pathspec thing. Maybe it should. But in the meantime, builtin-add.c really does need to do first read_directory(dir, .., pathspec); if (pathspec) prune_directory(dir, pathspec, baselen); ie the "prune_directory()" part will do the exact pathspec pruning, while the "read_directory()" will use the pathspec just to do some quick high-level pruning of the directories it will recurse into. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2007-03-31 05:39:30 +02:00			`free_simplify(simplify);`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`qsort(dir->entries, dir->nr, sizeof(struct dir_entry *), cmp_name);`
dir_struct: add collect_ignored option When set, this option will cause read_directory to keep track of which entries were ignored. While this shouldn't effect functionality in most cases, it can make warning messages to the user much more useful. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-06-11 15:39:50 +02:00			`qsort(dir->ignored, dir->ignored_nr, sizeof(struct dir_entry *), cmp_name);`
libify git-ls-files directory traversal This moves the core directory traversal and filename exclusion logic into the general git library, making it available for other users directly. If we ever want to do "git commit" or "git add" as a built-in (and we do), we want to be able to handle most of git-ls-files as a library. NOTE! Not all of git-ls-files is libified by this. The index matching and pathspec prefix calculation is still in ls-files.c, but this is a big part of it. Signed-off-by: Linus Torvalds <torvalds@osdl.org> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-05-17 04:02:14 +02:00			`return dir->nr;`
			`}`
git-commit.sh: convert run_status to a C builtin This creates a new git-runstatus which should do roughly the same thing as the run_status function from git-commit.sh. Except for color support, the main focus has been to keep the output identical, so that it can be verified as correct and then used as a C platform for other improvements to the status printing code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-08 10:05:34 +02:00
dir.c: minor clean-up Replace handcrafted reallocation with ALLOC_GROW(). Reindent "file_exists()" helper function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-29 10:11:46 +01:00			`int file_exists(const char *f)`
git-commit.sh: convert run_status to a C builtin This creates a new git-runstatus which should do roughly the same thing as the run_status function from git-commit.sh. Except for color support, the main focus has been to keep the output identical, so that it can be verified as correct and then used as a C platform for other improvements to the status printing code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-08 10:05:34 +02:00			`{`
dir.c: minor clean-up Replace handcrafted reallocation with ALLOC_GROW(). Reindent "file_exists()" helper function. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-29 10:11:46 +01:00			`struct stat sb;`
file_exists(): dangling symlinks do exist This function is used to see if a path given by the user does exist on the filesystem. A symbolic link that does not point anywhere does exist but running stat() on it would yield an error, and it incorrectly said it does not exist. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-18 10:58:16 +01:00			`return lstat(f, &sb) == 0;`
git-commit.sh: convert run_status to a C builtin This creates a new git-runstatus which should do roughly the same thing as the run_status function from git-commit.sh. Except for color support, the main focus has been to keep the output identical, so that it can be verified as correct and then used as a C platform for other improvements to the status printing code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <junkio@cox.net> 2006-09-08 10:05:34 +02:00			`}`
Add functions get_relative_cwd() and is_inside_dir() The function get_relative_cwd() works just as getcwd(), only that it takes an absolute path as additional parameter, returning the prefix of the current working directory relative to the given path. If the cwd is no subdirectory of the given path, it returns NULL. is_inside_dir() is just a trivial wrapper over get_relative_cwd(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-01 02:29:17 +02:00
			`/*`
setup: return correct prefix if worktree is '/' The same old problem reappears after setup code is reworked. We tend to assume there is at least one path component in a path and forget that path can be simply '/'. Reported-by: Matthijs Kooijman <matthijs@stdin.nl> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-26 10:04:24 +01:00			`* Given two normalized paths (a trailing slash is ok), if subdir is`
			`* outside dir, return -1. Otherwise return the offset in subdir that`
			`* can be used as relative path to dir.`
Add functions get_relative_cwd() and is_inside_dir() The function get_relative_cwd() works just as getcwd(), only that it takes an absolute path as additional parameter, returning the prefix of the current working directory relative to the given path. If the cwd is no subdirectory of the given path, it returns NULL. is_inside_dir() is just a trivial wrapper over get_relative_cwd(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-01 02:29:17 +02:00			`*/`
setup: return correct prefix if worktree is '/' The same old problem reappears after setup code is reworked. We tend to assume there is at least one path component in a path and forget that path can be simply '/'. Reported-by: Matthijs Kooijman <matthijs@stdin.nl> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-26 10:04:24 +01:00			`int dir_inside_of(const char subdir, const char dir)`
Add functions get_relative_cwd() and is_inside_dir() The function get_relative_cwd() works just as getcwd(), only that it takes an absolute path as additional parameter, returning the prefix of the current working directory relative to the given path. If the cwd is no subdirectory of the given path, it returns NULL. is_inside_dir() is just a trivial wrapper over get_relative_cwd(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-01 02:29:17 +02:00			`{`
setup: return correct prefix if worktree is '/' The same old problem reappears after setup code is reworked. We tend to assume there is at least one path component in a path and forget that path can be simply '/'. Reported-by: Matthijs Kooijman <matthijs@stdin.nl> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-26 10:04:24 +01:00			`int offset = 0;`
Add functions get_relative_cwd() and is_inside_dir() The function get_relative_cwd() works just as getcwd(), only that it takes an absolute path as additional parameter, returning the prefix of the current working directory relative to the given path. If the cwd is no subdirectory of the given path, it returns NULL. is_inside_dir() is just a trivial wrapper over get_relative_cwd(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-01 02:29:17 +02:00
setup: return correct prefix if worktree is '/' The same old problem reappears after setup code is reworked. We tend to assume there is at least one path component in a path and forget that path can be simply '/'. Reported-by: Matthijs Kooijman <matthijs@stdin.nl> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-26 10:04:24 +01:00			`assert(dir && subdir && dir && subdir);`
Add functions get_relative_cwd() and is_inside_dir() The function get_relative_cwd() works just as getcwd(), only that it takes an absolute path as additional parameter, returning the prefix of the current working directory relative to the given path. If the cwd is no subdirectory of the given path, it returns NULL. is_inside_dir() is just a trivial wrapper over get_relative_cwd(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-01 02:29:17 +02:00
setup: return correct prefix if worktree is '/' The same old problem reappears after setup code is reworked. We tend to assume there is at least one path component in a path and forget that path can be simply '/'. Reported-by: Matthijs Kooijman <matthijs@stdin.nl> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-26 10:04:24 +01:00			`while (dir && subdir && dir == subdir) {`
Add functions get_relative_cwd() and is_inside_dir() The function get_relative_cwd() works just as getcwd(), only that it takes an absolute path as additional parameter, returning the prefix of the current working directory relative to the given path. If the cwd is no subdirectory of the given path, it returns NULL. is_inside_dir() is just a trivial wrapper over get_relative_cwd(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-01 02:29:17 +02:00			`dir++;`
setup: return correct prefix if worktree is '/' The same old problem reappears after setup code is reworked. We tend to assume there is at least one path component in a path and forget that path can be simply '/'. Reported-by: Matthijs Kooijman <matthijs@stdin.nl> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-26 10:04:24 +01:00			`subdir++;`
			`offset++;`
get_cwd_relative(): do not misinterpret suffix as subdirectory If the current working directory is the same as the work tree path plus a suffix, e.g. 'work' and 'work-xyz', then the suffix '-xyz' would be interpreted as a subdirectory of 'work'. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-22 13:13:05 +02:00			`}`
setup: return correct prefix if worktree is '/' The same old problem reappears after setup code is reworked. We tend to assume there is at least one path component in a path and forget that path can be simply '/'. Reported-by: Matthijs Kooijman <matthijs@stdin.nl> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-26 10:04:24 +01:00
			`/* hel[p]/me vs hel[l]/yeah */`
			`if (dir && subdir)`
			`return -1;`

			`if (!*subdir)`
			`return !dir ? offset : -1; / same dir */`

			`/* foo/[b]ar vs foo/[] */`
			`if (is_dir_sep(dir[-1]))`
			`return is_dir_sep(subdir[-1]) ? offset : -1;`

			`/* foo[/]bar vs foo[] */`
			`return is_dir_sep(*subdir) ? offset + 1 : -1;`
Add functions get_relative_cwd() and is_inside_dir() The function get_relative_cwd() works just as getcwd(), only that it takes an absolute path as additional parameter, returning the prefix of the current working directory relative to the given path. If the cwd is no subdirectory of the given path, it returns NULL. is_inside_dir() is just a trivial wrapper over get_relative_cwd(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-01 02:29:17 +02:00			`}`

			`int is_inside_dir(const char *dir)`
			`{`
Kill off get_relative_cwd() Function dir_inside_of() does something similar (correctly), but looks easier to understand and does not bundle cwd to its business. Given get_relative_cwd's only user is is_inside_dir, we can kill it for good. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-03-26 10:04:25 +01:00			`char cwd[PATH_MAX];`
			`if (!dir)`
			`return 0;`
			`if (!getcwd(cwd, sizeof(cwd)))`
			`die_errno("can't find the current directory");`
			`return dir_inside_of(cwd, dir) >= 0;`
Add functions get_relative_cwd() and is_inside_dir() The function get_relative_cwd() works just as getcwd(), only that it takes an absolute path as additional parameter, returning the prefix of the current working directory relative to the given path. If the cwd is no subdirectory of the given path, it returns NULL. is_inside_dir() is just a trivial wrapper over get_relative_cwd(). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-08-01 02:29:17 +02:00			`}`
Introduce remove_dir_recursively() There was a function called remove_empty_dir_recursive() buried in refs.c. Expose a slightly enhanced version in dir.h: it can now optionally remove a non-empty directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-28 17:28:54 +02:00
Allow cloning to an existing empty directory The die() message updated accordingly. The previous behaviour was to only allow cloning when the destination directory doesn't exist. [jc: added trivial tests] Signed-off-by: Alexander Potashev <aspotashev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-11 13:19:12 +01:00			`int is_empty_dir(const char *path)`
			`{`
			`DIR *dir = opendir(path);`
			`struct dirent *e;`
			`int ret = 1;`

			`if (!dir)`
			`return 0;`

			`while ((e = readdir(dir)) != NULL)`
			`if (!is_dot_or_dotdot(e->d_name)) {`
			`ret = 0;`
			`break;`
			`}`

			`closedir(dir);`
			`return ret;`
			`}`

clean: preserve nested git worktree in subdirectories remove_dir_recursively() has a check to avoid removing the directory it was asked to remove without recursing into it and report success when the directory is the top level of a working tree of a nested git repository, to protect such a repository from "clean -f" (without double -f). If a working tree of a nested git repository is in a subdirectory of a toplevel project, however, this protection did not apply by mistake; we forgot to pass the REMOVE_DIR_KEEP_NESTED_GIT down to the recursive removal codepath. This requires us to also teach the higher level not to remove the directory it is asked to remove, when the recursed invocation did not remove the directory it was asked to remove due to a nested git repository, as it is not an error to leave the parent directories of such a nested repository. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 09:04:12 +01:00			`static int remove_dir_recurse(struct strbuf path, int flag, int kept_up)`
Introduce remove_dir_recursively() There was a function called remove_empty_dir_recursive() buried in refs.c. Expose a slightly enhanced version in dir.h: it can now optionally remove a non-empty directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-28 17:28:54 +02:00			`{`
clean: require double -f options to nuke nested git repository and work tree When you have an embedded git work tree in your work tree (be it an orphaned submodule, or an independent checkout of an unrelated project), "git clean -d -f" blindly descended into it and removed everything. This is rarely what the user wants. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-01 00:33:45 +02:00			`DIR *dir;`
Introduce remove_dir_recursively() There was a function called remove_empty_dir_recursive() buried in refs.c. Expose a slightly enhanced version in dir.h: it can now optionally remove a non-empty directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-28 17:28:54 +02:00			`struct dirent *e;`
clean: preserve nested git worktree in subdirectories remove_dir_recursively() has a check to avoid removing the directory it was asked to remove without recursing into it and report success when the directory is the top level of a working tree of a nested git repository, to protect such a repository from "clean -f" (without double -f). If a working tree of a nested git repository is in a subdirectory of a toplevel project, however, this protection did not apply by mistake; we forgot to pass the REMOVE_DIR_KEEP_NESTED_GIT down to the recursive removal codepath. This requires us to also teach the higher level not to remove the directory it is asked to remove, when the recursed invocation did not remove the directory it was asked to remove due to a nested git repository, as it is not an error to leave the parent directories of such a nested repository. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 09:04:12 +01:00			`int ret = 0, original_len = path->len, len, kept_down = 0;`
clean: require double -f options to nuke nested git repository and work tree When you have an embedded git work tree in your work tree (be it an orphaned submodule, or an independent checkout of an unrelated project), "git clean -d -f" blindly descended into it and removed everything. This is rarely what the user wants. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-01 00:33:45 +02:00			`int only_empty = (flag & REMOVE_DIR_EMPTY_ONLY);`
remove_dir_recursively(): Add flag for skipping removal of toplevel dir Add the REMOVE_DIR_KEEP_TOPLEVEL flag to remove_dir_recursively() for deleting everything inside the given directory, but _not_ the given directory itself. Note that this does not pass the REMOVE_DIR_KEEP_NESTED_GIT flag, if set, to the recursive invocations of remove_dir_recursively(). It is likely to be a a bug that has been present since REMOVE_DIR_KEEP_NESTED_GIT was introduced (a0f4afb), but this commit keeps the same behaviour for now. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 15:58:54 +01:00			`int keep_toplevel = (flag & REMOVE_DIR_KEEP_TOPLEVEL);`
clean: require double -f options to nuke nested git repository and work tree When you have an embedded git work tree in your work tree (be it an orphaned submodule, or an independent checkout of an unrelated project), "git clean -d -f" blindly descended into it and removed everything. This is rarely what the user wants. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-01 00:33:45 +02:00			`unsigned char submodule_head[20];`
Introduce remove_dir_recursively() There was a function called remove_empty_dir_recursive() buried in refs.c. Expose a slightly enhanced version in dir.h: it can now optionally remove a non-empty directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-28 17:28:54 +02:00
clean: require double -f options to nuke nested git repository and work tree When you have an embedded git work tree in your work tree (be it an orphaned submodule, or an independent checkout of an unrelated project), "git clean -d -f" blindly descended into it and removed everything. This is rarely what the user wants. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-01 00:33:45 +02:00			`if ((flag & REMOVE_DIR_KEEP_NESTED_GIT) &&`
clean: preserve nested git worktree in subdirectories remove_dir_recursively() has a check to avoid removing the directory it was asked to remove without recursing into it and report success when the directory is the top level of a working tree of a nested git repository, to protect such a repository from "clean -f" (without double -f). If a working tree of a nested git repository is in a subdirectory of a toplevel project, however, this protection did not apply by mistake; we forgot to pass the REMOVE_DIR_KEEP_NESTED_GIT down to the recursive removal codepath. This requires us to also teach the higher level not to remove the directory it is asked to remove, when the recursed invocation did not remove the directory it was asked to remove due to a nested git repository, as it is not an error to leave the parent directories of such a nested repository. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 09:04:12 +01:00			`!resolve_gitlink_ref(path->buf, "HEAD", submodule_head)) {`
clean: require double -f options to nuke nested git repository and work tree When you have an embedded git work tree in your work tree (be it an orphaned submodule, or an independent checkout of an unrelated project), "git clean -d -f" blindly descended into it and removed everything. This is rarely what the user wants. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-01 00:33:45 +02:00			`/* Do not descend and nuke a nested git work tree. */`
clean: preserve nested git worktree in subdirectories remove_dir_recursively() has a check to avoid removing the directory it was asked to remove without recursing into it and report success when the directory is the top level of a working tree of a nested git repository, to protect such a repository from "clean -f" (without double -f). If a working tree of a nested git repository is in a subdirectory of a toplevel project, however, this protection did not apply by mistake; we forgot to pass the REMOVE_DIR_KEEP_NESTED_GIT down to the recursive removal codepath. This requires us to also teach the higher level not to remove the directory it is asked to remove, when the recursed invocation did not remove the directory it was asked to remove due to a nested git repository, as it is not an error to leave the parent directories of such a nested repository. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 09:04:12 +01:00			`if (kept_up)`
			`*kept_up = 1;`
clean: require double -f options to nuke nested git repository and work tree When you have an embedded git work tree in your work tree (be it an orphaned submodule, or an independent checkout of an unrelated project), "git clean -d -f" blindly descended into it and removed everything. This is rarely what the user wants. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-01 00:33:45 +02:00			`return 0;`
clean: preserve nested git worktree in subdirectories remove_dir_recursively() has a check to avoid removing the directory it was asked to remove without recursing into it and report success when the directory is the top level of a working tree of a nested git repository, to protect such a repository from "clean -f" (without double -f). If a working tree of a nested git repository is in a subdirectory of a toplevel project, however, this protection did not apply by mistake; we forgot to pass the REMOVE_DIR_KEEP_NESTED_GIT down to the recursive removal codepath. This requires us to also teach the higher level not to remove the directory it is asked to remove, when the recursed invocation did not remove the directory it was asked to remove due to a nested git repository, as it is not an error to leave the parent directories of such a nested repository. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 09:04:12 +01:00			`}`
clean: require double -f options to nuke nested git repository and work tree When you have an embedded git work tree in your work tree (be it an orphaned submodule, or an independent checkout of an unrelated project), "git clean -d -f" blindly descended into it and removed everything. This is rarely what the user wants. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-01 00:33:45 +02:00
clean: preserve nested git worktree in subdirectories remove_dir_recursively() has a check to avoid removing the directory it was asked to remove without recursing into it and report success when the directory is the top level of a working tree of a nested git repository, to protect such a repository from "clean -f" (without double -f). If a working tree of a nested git repository is in a subdirectory of a toplevel project, however, this protection did not apply by mistake; we forgot to pass the REMOVE_DIR_KEEP_NESTED_GIT down to the recursive removal codepath. This requires us to also teach the higher level not to remove the directory it is asked to remove, when the recursed invocation did not remove the directory it was asked to remove due to a nested git repository, as it is not an error to leave the parent directories of such a nested repository. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 09:04:12 +01:00			`flag &= ~REMOVE_DIR_KEEP_TOPLEVEL;`
clean: require double -f options to nuke nested git repository and work tree When you have an embedded git work tree in your work tree (be it an orphaned submodule, or an independent checkout of an unrelated project), "git clean -d -f" blindly descended into it and removed everything. This is rarely what the user wants. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-07-01 00:33:45 +02:00			`dir = opendir(path->buf);`
remove_dir_recursively(): Add flag for skipping removal of toplevel dir Add the REMOVE_DIR_KEEP_TOPLEVEL flag to remove_dir_recursively() for deleting everything inside the given directory, but _not_ the given directory itself. Note that this does not pass the REMOVE_DIR_KEEP_NESTED_GIT flag, if set, to the recursive invocations of remove_dir_recursively(). It is likely to be a a bug that has been present since REMOVE_DIR_KEEP_NESTED_GIT was introduced (a0f4afb), but this commit keeps the same behaviour for now. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 15:58:54 +01:00			`if (!dir) {`
clean: preserve nested git worktree in subdirectories remove_dir_recursively() has a check to avoid removing the directory it was asked to remove without recursing into it and report success when the directory is the top level of a working tree of a nested git repository, to protect such a repository from "clean -f" (without double -f). If a working tree of a nested git repository is in a subdirectory of a toplevel project, however, this protection did not apply by mistake; we forgot to pass the REMOVE_DIR_KEEP_NESTED_GIT down to the recursive removal codepath. This requires us to also teach the higher level not to remove the directory it is asked to remove, when the recursed invocation did not remove the directory it was asked to remove due to a nested git repository, as it is not an error to leave the parent directories of such a nested repository. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 09:04:12 +01:00			`/* an empty dir could be removed even if it is unreadble */`
remove_dir_recursively(): Add flag for skipping removal of toplevel dir Add the REMOVE_DIR_KEEP_TOPLEVEL flag to remove_dir_recursively() for deleting everything inside the given directory, but _not_ the given directory itself. Note that this does not pass the REMOVE_DIR_KEEP_NESTED_GIT flag, if set, to the recursive invocations of remove_dir_recursively(). It is likely to be a a bug that has been present since REMOVE_DIR_KEEP_NESTED_GIT was introduced (a0f4afb), but this commit keeps the same behaviour for now. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 15:58:54 +01:00			`if (!keep_toplevel)`
			`return rmdir(path->buf);`
			`else`
			`return -1;`
			`}`
Introduce remove_dir_recursively() There was a function called remove_empty_dir_recursive() buried in refs.c. Expose a slightly enhanced version in dir.h: it can now optionally remove a non-empty directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-28 17:28:54 +02:00			`if (path->buf[original_len - 1] != '/')`
			`strbuf_addch(path, '/');`

			`len = path->len;`
			`while ((e = readdir(dir)) != NULL) {`
			`struct stat st;`
add is_dot_or_dotdot inline function A new inline function is_dot_or_dotdot is used to check if the directory name is either "." or "..". It returns a non-zero value if the given string is "." or "..". It's applicable to a lot of Git source code. Signed-off-by: Alexander Potashev <aspotashev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-01-10 13:07:50 +01:00			`if (is_dot_or_dotdot(e->d_name))`
			`continue;`
Introduce remove_dir_recursively() There was a function called remove_empty_dir_recursive() buried in refs.c. Expose a slightly enhanced version in dir.h: it can now optionally remove a non-empty directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-28 17:28:54 +02:00
			`strbuf_setlen(path, len);`
			`strbuf_addstr(path, e->d_name);`
			`if (lstat(path->buf, &st))`
			`; /* fall thru */`
			`else if (S_ISDIR(st.st_mode)) {`
clean: preserve nested git worktree in subdirectories remove_dir_recursively() has a check to avoid removing the directory it was asked to remove without recursing into it and report success when the directory is the top level of a working tree of a nested git repository, to protect such a repository from "clean -f" (without double -f). If a working tree of a nested git repository is in a subdirectory of a toplevel project, however, this protection did not apply by mistake; we forgot to pass the REMOVE_DIR_KEEP_NESTED_GIT down to the recursive removal codepath. This requires us to also teach the higher level not to remove the directory it is asked to remove, when the recursed invocation did not remove the directory it was asked to remove due to a nested git repository, as it is not an error to leave the parent directories of such a nested repository. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 09:04:12 +01:00			`if (!remove_dir_recurse(path, flag, &kept_down))`
Introduce remove_dir_recursively() There was a function called remove_empty_dir_recursive() buried in refs.c. Expose a slightly enhanced version in dir.h: it can now optionally remove a non-empty directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-28 17:28:54 +02:00			`continue; /* happy */`
			`} else if (!only_empty && !unlink(path->buf))`
			`continue; /* happy, too */`

			`/* path too long, stat fails, or non-directory still exists */`
			`ret = -1;`
			`break;`
			`}`
			`closedir(dir);`

			`strbuf_setlen(path, original_len);`
clean: preserve nested git worktree in subdirectories remove_dir_recursively() has a check to avoid removing the directory it was asked to remove without recursing into it and report success when the directory is the top level of a working tree of a nested git repository, to protect such a repository from "clean -f" (without double -f). If a working tree of a nested git repository is in a subdirectory of a toplevel project, however, this protection did not apply by mistake; we forgot to pass the REMOVE_DIR_KEEP_NESTED_GIT down to the recursive removal codepath. This requires us to also teach the higher level not to remove the directory it is asked to remove, when the recursed invocation did not remove the directory it was asked to remove due to a nested git repository, as it is not an error to leave the parent directories of such a nested repository. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 09:04:12 +01:00			`if (!ret && !keep_toplevel && !kept_down)`
Introduce remove_dir_recursively() There was a function called remove_empty_dir_recursive() buried in refs.c. Expose a slightly enhanced version in dir.h: it can now optionally remove a non-empty directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-28 17:28:54 +02:00			`ret = rmdir(path->buf);`
clean: preserve nested git worktree in subdirectories remove_dir_recursively() has a check to avoid removing the directory it was asked to remove without recursing into it and report success when the directory is the top level of a working tree of a nested git repository, to protect such a repository from "clean -f" (without double -f). If a working tree of a nested git repository is in a subdirectory of a toplevel project, however, this protection did not apply by mistake; we forgot to pass the REMOVE_DIR_KEEP_NESTED_GIT down to the recursive removal codepath. This requires us to also teach the higher level not to remove the directory it is asked to remove, when the recursed invocation did not remove the directory it was asked to remove due to a nested git repository, as it is not an error to leave the parent directories of such a nested repository. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 09:04:12 +01:00			`else if (kept_up)`
			`/*`
			`* report the uplevel that it is not an error that we`
			`* did not rmdir() our directory.`
			`*/`
			`*kept_up = !ret;`
Introduce remove_dir_recursively() There was a function called remove_empty_dir_recursive() buried in refs.c. Expose a slightly enhanced version in dir.h: it can now optionally remove a non-empty directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-09-28 17:28:54 +02:00			`return ret;`
			`}`
core.excludesfile clean-up There are inconsistencies in the way commands currently handle the core.excludesfile configuration variable. The problem is the variable is too new to be noticed by anything other than git-add and git-status. * git-ls-files does not notice any of the "ignore" files by default, as it predates the standardized set of ignore files. The calling scripts established the convention to use .git/info/exclude, .gitignore, and later core.excludesfile. * git-add and git-status know about it because they call add_excludes_from_file() directly with their own notion of which standard set of ignore files to use. This is just a stupid duplication of code that need to be updated every time the definition of the standard set of ignore files is changed. * git-read-tree takes --exclude-per-directory=<gitignore>, not because the flexibility was needed. Again, this was because the option predates the standardization of the ignore files. * git-merge-recursive uses hardcoded per-directory .gitignore and nothing else. git-clean (scripted version) does not honor core.* because its call to underlying ls-files does not know about it. git-clean in C (parked in 'pu') doesn't either. We probably could change git-ls-files to use the standard set when no excludes are specified on the command line and ignore processing was asked, or something like that, but that will be a change in semantics and might break people's scripts in a subtle way. I am somewhat reluctant to make such a change. On the other hand, I think it makes perfect sense to fix git-read-tree, git-merge-recursive and git-clean to follow the same rule as other commands. I do not think of a valid use case to give an exclude-per-directory that is nonstandard to read-tree command, outside a "negative" test in the t1004 test script. This patch is the first step to untangle this mess. The next step would be to teach read-tree, merge-recursive and clean (in C) to use setup_standard_excludes(). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 09:05:00 +01:00
clean: preserve nested git worktree in subdirectories remove_dir_recursively() has a check to avoid removing the directory it was asked to remove without recursing into it and report success when the directory is the top level of a working tree of a nested git repository, to protect such a repository from "clean -f" (without double -f). If a working tree of a nested git repository is in a subdirectory of a toplevel project, however, this protection did not apply by mistake; we forgot to pass the REMOVE_DIR_KEEP_NESTED_GIT down to the recursive removal codepath. This requires us to also teach the higher level not to remove the directory it is asked to remove, when the recursed invocation did not remove the directory it was asked to remove due to a nested git repository, as it is not an error to leave the parent directories of such a nested repository. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-15 09:04:12 +01:00			`int remove_dir_recursively(struct strbuf *path, int flag)`
			`{`
			`return remove_dir_recurse(path, flag, NULL);`
			`}`

core.excludesfile clean-up There are inconsistencies in the way commands currently handle the core.excludesfile configuration variable. The problem is the variable is too new to be noticed by anything other than git-add and git-status. * git-ls-files does not notice any of the "ignore" files by default, as it predates the standardized set of ignore files. The calling scripts established the convention to use .git/info/exclude, .gitignore, and later core.excludesfile. * git-add and git-status know about it because they call add_excludes_from_file() directly with their own notion of which standard set of ignore files to use. This is just a stupid duplication of code that need to be updated every time the definition of the standard set of ignore files is changed. * git-read-tree takes --exclude-per-directory=<gitignore>, not because the flexibility was needed. Again, this was because the option predates the standardization of the ignore files. * git-merge-recursive uses hardcoded per-directory .gitignore and nothing else. git-clean (scripted version) does not honor core.* because its call to underlying ls-files does not know about it. git-clean in C (parked in 'pu') doesn't either. We probably could change git-ls-files to use the standard set when no excludes are specified on the command line and ignore processing was asked, or something like that, but that will be a change in semantics and might break people's scripts in a subtle way. I am somewhat reluctant to make such a change. On the other hand, I think it makes perfect sense to fix git-read-tree, git-merge-recursive and git-clean to follow the same rule as other commands. I do not think of a valid use case to give an exclude-per-directory that is nonstandard to read-tree command, outside a "negative" test in the t1004 test script. This patch is the first step to untangle this mess. The next step would be to teach read-tree, merge-recursive and clean (in C) to use setup_standard_excludes(). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 09:05:00 +01:00			`void setup_standard_excludes(struct dir_struct *dir)`
			`{`
			`const char *path;`
Let core.excludesfile default to $XDG_CONFIG_HOME/git/ignore To use the feature of core.excludesfile, the user needs: 1. to create such a file, 2. and add configuration variable to point at it. Instead, we can make this a one-step process by choosing a default value which points to a filename in the user's $HOME, that is unlikely to already exist on the system, and only use the presence of the file as a cue that the user wants to use that feature. And we use "${XDG_CONFIG_HOME:-$HOME/.config/git}/ignore" as such a file, in the same directory as the newly added configuration file ("${XDG_CONFIG_HOME:-$HOME/.config/git}/config). The use of this directory is in line with XDG specification as a location to store such application specific files. Signed-off-by: Huynh Khoi Nguyen Nguyen <Huynh-Khoi-Nguyen.Nguyen@ensimag.imag.fr> Signed-off-by: Valentin Duperray <Valentin.Duperray@ensimag.imag.fr> Signed-off-by: Franck Jonas <Franck.Jonas@ensimag.imag.fr> Signed-off-by: Lucien Kong <Lucien.Kong@ensimag.imag.fr> Signed-off-by: Thomas Nguy <Thomas.Nguy@ensimag.imag.fr> Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-06-22 11:03:24 +02:00			`char *xdg_path;`
core.excludesfile clean-up There are inconsistencies in the way commands currently handle the core.excludesfile configuration variable. The problem is the variable is too new to be noticed by anything other than git-add and git-status. * git-ls-files does not notice any of the "ignore" files by default, as it predates the standardized set of ignore files. The calling scripts established the convention to use .git/info/exclude, .gitignore, and later core.excludesfile. * git-add and git-status know about it because they call add_excludes_from_file() directly with their own notion of which standard set of ignore files to use. This is just a stupid duplication of code that need to be updated every time the definition of the standard set of ignore files is changed. * git-read-tree takes --exclude-per-directory=<gitignore>, not because the flexibility was needed. Again, this was because the option predates the standardization of the ignore files. * git-merge-recursive uses hardcoded per-directory .gitignore and nothing else. git-clean (scripted version) does not honor core.* because its call to underlying ls-files does not know about it. git-clean in C (parked in 'pu') doesn't either. We probably could change git-ls-files to use the standard set when no excludes are specified on the command line and ignore processing was asked, or something like that, but that will be a change in semantics and might break people's scripts in a subtle way. I am somewhat reluctant to make such a change. On the other hand, I think it makes perfect sense to fix git-read-tree, git-merge-recursive and git-clean to follow the same rule as other commands. I do not think of a valid use case to give an exclude-per-directory that is nonstandard to read-tree command, outside a "negative" test in the t1004 test script. This patch is the first step to untangle this mess. The next step would be to teach read-tree, merge-recursive and clean (in C) to use setup_standard_excludes(). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 09:05:00 +01:00
			`dir->exclude_per_dir = ".gitignore";`
			`path = git_path("info/exclude");`
Let core.excludesfile default to $XDG_CONFIG_HOME/git/ignore To use the feature of core.excludesfile, the user needs: 1. to create such a file, 2. and add configuration variable to point at it. Instead, we can make this a one-step process by choosing a default value which points to a filename in the user's $HOME, that is unlikely to already exist on the system, and only use the presence of the file as a cue that the user wants to use that feature. And we use "${XDG_CONFIG_HOME:-$HOME/.config/git}/ignore" as such a file, in the same directory as the newly added configuration file ("${XDG_CONFIG_HOME:-$HOME/.config/git}/config). The use of this directory is in line with XDG specification as a location to store such application specific files. Signed-off-by: Huynh Khoi Nguyen Nguyen <Huynh-Khoi-Nguyen.Nguyen@ensimag.imag.fr> Signed-off-by: Valentin Duperray <Valentin.Duperray@ensimag.imag.fr> Signed-off-by: Franck Jonas <Franck.Jonas@ensimag.imag.fr> Signed-off-by: Lucien Kong <Lucien.Kong@ensimag.imag.fr> Signed-off-by: Thomas Nguy <Thomas.Nguy@ensimag.imag.fr> Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-06-22 11:03:24 +02:00			`if (!excludes_file) {`
			`home_config_paths(NULL, &xdg_path, "ignore");`
			`excludes_file = xdg_path;`
			`}`
gitignore: report access errors of exclude files When we try to access gitignore files, we check for their existence with a call to "access". We silently ignore missing files. However, if a file is not readable, this may be a configuration error; let's warn the user. For $GIT_DIR/info/excludes or core.excludesfile, we can just use access_or_warn. However, for per-directory files we actually try to open them, so we must add a custom warning. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-21 08:26:07 +02:00			`if (!access_or_warn(path, R_OK))`
core.excludesfile clean-up There are inconsistencies in the way commands currently handle the core.excludesfile configuration variable. The problem is the variable is too new to be noticed by anything other than git-add and git-status. * git-ls-files does not notice any of the "ignore" files by default, as it predates the standardized set of ignore files. The calling scripts established the convention to use .git/info/exclude, .gitignore, and later core.excludesfile. * git-add and git-status know about it because they call add_excludes_from_file() directly with their own notion of which standard set of ignore files to use. This is just a stupid duplication of code that need to be updated every time the definition of the standard set of ignore files is changed. * git-read-tree takes --exclude-per-directory=<gitignore>, not because the flexibility was needed. Again, this was because the option predates the standardization of the ignore files. * git-merge-recursive uses hardcoded per-directory .gitignore and nothing else. git-clean (scripted version) does not honor core.* because its call to underlying ls-files does not know about it. git-clean in C (parked in 'pu') doesn't either. We probably could change git-ls-files to use the standard set when no excludes are specified on the command line and ignore processing was asked, or something like that, but that will be a change in semantics and might break people's scripts in a subtle way. I am somewhat reluctant to make such a change. On the other hand, I think it makes perfect sense to fix git-read-tree, git-merge-recursive and git-clean to follow the same rule as other commands. I do not think of a valid use case to give an exclude-per-directory that is nonstandard to read-tree command, outside a "negative" test in the t1004 test script. This patch is the first step to untangle this mess. The next step would be to teach read-tree, merge-recursive and clean (in C) to use setup_standard_excludes(). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 09:05:00 +01:00			`add_excludes_from_file(dir, path);`
gitignore: report access errors of exclude files When we try to access gitignore files, we check for their existence with a call to "access". We silently ignore missing files. However, if a file is not readable, this may be a configuration error; let's warn the user. For $GIT_DIR/info/excludes or core.excludesfile, we can just use access_or_warn. However, for per-directory files we actually try to open them, so we must add a custom warning. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-08-21 08:26:07 +02:00			`if (excludes_file && !access_or_warn(excludes_file, R_OK))`
core.excludesfile clean-up There are inconsistencies in the way commands currently handle the core.excludesfile configuration variable. The problem is the variable is too new to be noticed by anything other than git-add and git-status. * git-ls-files does not notice any of the "ignore" files by default, as it predates the standardized set of ignore files. The calling scripts established the convention to use .git/info/exclude, .gitignore, and later core.excludesfile. * git-add and git-status know about it because they call add_excludes_from_file() directly with their own notion of which standard set of ignore files to use. This is just a stupid duplication of code that need to be updated every time the definition of the standard set of ignore files is changed. * git-read-tree takes --exclude-per-directory=<gitignore>, not because the flexibility was needed. Again, this was because the option predates the standardization of the ignore files. * git-merge-recursive uses hardcoded per-directory .gitignore and nothing else. git-clean (scripted version) does not honor core.* because its call to underlying ls-files does not know about it. git-clean in C (parked in 'pu') doesn't either. We probably could change git-ls-files to use the standard set when no excludes are specified on the command line and ignore processing was asked, or something like that, but that will be a change in semantics and might break people's scripts in a subtle way. I am somewhat reluctant to make such a change. On the other hand, I think it makes perfect sense to fix git-read-tree, git-merge-recursive and git-clean to follow the same rule as other commands. I do not think of a valid use case to give an exclude-per-directory that is nonstandard to read-tree command, outside a "negative" test in the t1004 test script. This patch is the first step to untangle this mess. The next step would be to teach read-tree, merge-recursive and clean (in C) to use setup_standard_excludes(). Signed-off-by: Junio C Hamano <gitster@pobox.com> 2007-11-14 09:05:00 +01:00			`add_excludes_from_file(dir, excludes_file);`
			`}`
Add remove_path: a function to remove as much as possible of a path The function has two potential users which both managed to get wrong their implementations (the one in builtin-rm.c one has a memleak, and builtin-merge-recursive.c scribles over its const argument). Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2008-09-27 00:56:46 +02:00
			`int remove_path(const char *name)`
			`{`
			`char *slash;`

rm: do not complain about d/f conflicts during deletion If we used to have an index entry "d/f", but "d" has been replaced by a non-directory entry, the user may still want to run "git rm" to delete the stale index entry. They could use "git rm --cached" to just touch the index, but "git rm" should also work: we explicitly try to handle the case that the file has already been removed from the working tree. However, because unlinking "d/f" in this case will not yield ENOENT, but rather ENOTDIR, we do not notice that the file is already gone. Instead, we report it as an error. The simple solution is to treat ENOTDIR in this case exactly like ENOENT; all we want to know is whether the file is already gone, and if a leading path is no longer a directory, then by definition the sub-path is gone. Reported-by: jpinheiro <7jpinheiro@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-04 21:03:35 +02:00			`if (unlink(name) && errno != ENOENT && errno != ENOTDIR)`
Add remove_path: a function to remove as much as possible of a path The function has two potential users which both managed to get wrong their implementations (the one in builtin-rm.c one has a memleak, and builtin-merge-recursive.c scribles over its const argument). Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2008-09-27 00:56:46 +02:00			`return -1;`

			`slash = strrchr(name, '/');`
			`if (slash) {`
			`char *dirs = xstrdup(name);`
			`slash = dirs + (slash - name);`
			`do {`
			`*slash = '\0';`
rm: fix bug in recursive subdirectory removal If we remove a path in a/deep/subdirectory, we should try to remove as many trailing components as possible (i.e., subdirectory, then deep, then a). However, the test for the return value of rmdir was reversed, so we only ever deleted at most one level. The fix is in remove_path, so "apply" and "merge-recursive" also are fixed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-02-19 06:57:21 +01:00			`} while (rmdir(dirs) == 0 && (slash = strrchr(dirs, '/')));`
Add remove_path: a function to remove as much as possible of a path The function has two potential users which both managed to get wrong their implementations (the one in builtin-rm.c one has a memleak, and builtin-merge-recursive.c scribles over its const argument). Signed-off-by: Alex Riesen <raa.lkml@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> 2008-09-27 00:56:46 +02:00			`free(dirs);`
			`}`
			`return 0;`
			`}`

tree_entry_interesting(): fix depth limit with overlapping pathspecs Suppose we have two pathspecs 'a' and 'a/b' (both are dirs) and depth limit 1. In current code, pathspecs are checked in input order. When 'a/b' is checked against pathspec 'a', it fails depth limit and therefore is excluded, although it should match 'a/b' pathspec. This patch reorders all pathspecs alphabetically, then teaches tree_entry_interesting() to check against the deepest pathspec first, so depth limit of a shallower pathspec won't affect a deeper one. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:45 +01:00			`static int pathspec_item_cmp(const void a_, const void b_)`
			`{`
			`struct pathspec_item a, b;`

			`a = (struct pathspec_item *)a_;`
			`b = (struct pathspec_item *)b_;`
			`return strcmp(a->match, b->match);`
			`}`

Add struct pathspec The old pathspec structure remains as pathspec.raw[]. New things are stored in pathspec.items[]. There's no guarantee that the pathspec order in raw[] is exactly as in items[]. raw[] is external (source) data and is untouched by pathspec manipulation functions. It eases migration from old const char ** to this new struct. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:36 +01:00			`int init_pathspec(struct pathspec pathspec, const char *paths)`
			`{`
			`const char **p = paths;`
			`int i;`

			`memset(pathspec, 0, sizeof(*pathspec));`
			`if (!p)`
			`return 0;`
			`while (*p)`
			`p++;`
			`pathspec->raw = paths;`
			`pathspec->nr = p - paths;`
			`if (!pathspec->nr)`
			`return 0;`

			`pathspec->items = xmalloc(sizeof(struct pathspec_item)*pathspec->nr);`
			`for (i = 0; i < pathspec->nr; i++) {`
			`struct pathspec_item *item = pathspec->items+i;`
			`const char *path = paths[i];`

			`item->match = path;`
			`item->len = strlen(path);`
pathspec: apply ".c" optimization from exclude When a pattern contains only a single asterisk as wildcard, e.g. "foobar", after literally comparing the leading part "foo" with the string, we can compare the tail of the string and make sure it matches "bar", instead of running fnmatch() on "bar" against the remainder of the string. -O2 build on linux-2.6, without the patch: $ time git rev-list --quiet HEAD -- '.c' real 0m40.770s user 0m40.290s sys 0m0.256s With the patch $ time ~/w/git/git rev-list --quiet HEAD -- '.c' real 0m34.288s user 0m33.997s sys 0m0.205s The above command is not supposed to be widely popular. It's chosen because it exercises pathspec matching a lot. The point is it cuts down matching time for popular patterns like .c, which could be used as pathspec in other places. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-11-24 05:33:50 +01:00			`item->flags = 0;`
Merge branch 'jk/pathspec-literal' Allow scripts to feed literal paths to commands that take pathspecs, by disabling wildcard globbing. * jk/pathspec-literal: add global --literal-pathspecs option Conflicts: dir.c 2013-01-06 08:42:07 +01:00			`if (limit_pathspec_to_literal()) {`
			`item->nowildcard_len = item->len;`
			`} else {`
			`item->nowildcard_len = simple_length(path);`
			`if (item->nowildcard_len < item->len) {`
			`pathspec->has_wildcard = 1;`
			`if (path[item->nowildcard_len] == '*' &&`
			`no_wildcard(path + item->nowildcard_len + 1))`
			`item->flags \|= PATHSPEC_ONESTAR;`
			`}`
pathspec: apply ".c" optimization from exclude When a pattern contains only a single asterisk as wildcard, e.g. "foobar", after literally comparing the leading part "foo" with the string, we can compare the tail of the string and make sure it matches "bar", instead of running fnmatch() on "bar" against the remainder of the string. -O2 build on linux-2.6, without the patch: $ time git rev-list --quiet HEAD -- '.c' real 0m40.770s user 0m40.290s sys 0m0.256s With the patch $ time ~/w/git/git rev-list --quiet HEAD -- '.c' real 0m34.288s user 0m33.997s sys 0m0.205s The above command is not supposed to be widely popular. It's chosen because it exercises pathspec matching a lot. The point is it cuts down matching time for popular patterns like .c, which could be used as pathspec in other places. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-11-24 05:33:50 +01:00			`}`
Add struct pathspec The old pathspec structure remains as pathspec.raw[]. New things are stored in pathspec.items[]. There's no guarantee that the pathspec order in raw[] is exactly as in items[]. raw[] is external (source) data and is untouched by pathspec manipulation functions. It eases migration from old const char ** to this new struct. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:36 +01:00			`}`
tree_entry_interesting(): fix depth limit with overlapping pathspecs Suppose we have two pathspecs 'a' and 'a/b' (both are dirs) and depth limit 1. In current code, pathspecs are checked in input order. When 'a/b' is checked against pathspec 'a', it fails depth limit and therefore is excluded, although it should match 'a/b' pathspec. This patch reorders all pathspecs alphabetically, then teaches tree_entry_interesting() to check against the deepest pathspec first, so depth limit of a shallower pathspec won't affect a deeper one. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:45 +01:00
			`qsort(pathspec->items, pathspec->nr,`
			`sizeof(struct pathspec_item), pathspec_item_cmp);`

Add struct pathspec The old pathspec structure remains as pathspec.raw[]. New things are stored in pathspec.items[]. There's no guarantee that the pathspec order in raw[] is exactly as in items[]. raw[] is external (source) data and is untouched by pathspec manipulation functions. It eases migration from old const char ** to this new struct. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-12-15 16:02:36 +01:00			`return 0;`
			`}`

			`void free_pathspec(struct pathspec *pathspec)`
			`{`
			`free(pathspec->items);`
			`pathspec->items = NULL;`
			`}`
add global --literal-pathspecs option Git takes pathspec arguments in many places to limit the scope of an operation. These pathspecs are treated not as literal paths, but as glob patterns that can be fed to fnmatch. When a user is giving a specific pattern, this is a nice feature. However, when programatically providing pathspecs, it can be a nuisance. For example, to find the latest revision which modified "$foo", one can use "git rev-list -- $foo". But if "$foo" contains glob characters (e.g., "f"), it will erroneously match more entries than desired. The caller needs to quote the characters in $foo, and even then, the results may not be exactly the same as with a literal pathspec. For instance, the depth checks in match_pathspec_depth do not kick in if we match via fnmatch. This patch introduces a global command-line option (i.e., one for "git" itself, not for specific commands) to turn this behavior off. It also has a matching environment variable, which can make it easier if you are a script or porcelain interface that is going to issue many such commands. This option cannot turn off globbing for particular pathspecs. That could eventually be done with a ":(noglob)" magic pathspec prefix. However, that level of granularity is more cumbersome to use for many cases, and doing ":(noglob)" right would mean converting the whole codebase to use "struct pathspec", as the usual "const char *pathspec" cannot represent extra per-item flags. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-12-19 23:37:30 +01:00
			`int limit_pathspec_to_literal(void)`
			`{`
			`static int flag = -1;`
			`if (flag < 0)`
			`flag = git_env_bool(GIT_LITERAL_PATHSPECS_ENVIRONMENT, 0);`
			`return flag;`
			`}`
Merge branch 'as/check-ignore' Add a new command "git check-ignore" for debugging .gitignore files. The variable names may want to get cleaned up but that can be done in-tree. * as/check-ignore: clean.c, ls-files.c: respect encapsulation of exclude_list_groups t0008: avoid brace expansion add git-check-ignore sub-command setup.c: document get_pathspec() add.c: extract new die_if_path_beyond_symlink() for reuse add.c: extract check_path_for_gitlink() from treat_gitlinks() for reuse pathspec.c: rename newly public functions for clarity add.c: move pathspec matchers into new pathspec.c for reuse add.c: remove unused argument from validate_pathspec() dir.c: improve docs for match_pathspec() and match_pathspec_depth() dir.c: provide clear_directory() for reclaiming dir_struct memory dir.c: keep track of where patterns came from dir.c: use a single struct exclude_list per source of excludes Conflicts: builtin/ls-files.c dir.c 2013-01-24 06:19:10 +01:00
dir.c: provide clear_directory() for reclaiming dir_struct memory By the end of a directory traversal, a dir_struct instance will typically contains pointers to various data structures on the heap. clear_directory() provides a convenient way to reclaim that memory. Signed-off-by: Adam Spiers <git@adamspiers.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-01-06 17:58:05 +01:00			`/*`
			`* Frees memory within dir which was allocated for exclude lists and`
			`* the exclude_stack. Does not free dir itself.`
			`*/`
			`void clear_directory(struct dir_struct *dir)`
			`{`
			`int i, j;`
			`struct exclude_list_group *group;`
			`struct exclude_list *el;`
			`struct exclude_stack *stk;`

			`for (i = EXC_CMDL; i <= EXC_FILE; i++) {`
			`group = &dir->exclude_list_group[i];`
			`for (j = 0; j < group->nr; j++) {`
			`el = &group->el[j];`
			`if (i == EXC_DIRS)`
			`free((char *)el->src);`
			`clear_exclude_list(el);`
			`}`
			`free(group->el);`
			`}`

			`stk = dir->exclude_stack;`
			`while (stk) {`
			`struct exclude_stack *prev = stk->prev;`
			`free(stk);`
			`stk = prev;`
			`}`
			`}`