Git takes pathspec arguments in many places to limit the
scope of an operation. These pathspecs are treated not as
literal paths, but as glob patterns that can be fed to
fnmatch. When a user is giving a specific pattern, this is a
nice feature.
However, when programatically providing pathspecs, it can be
a nuisance. For example, to find the latest revision which
modified "$foo", one can use "git rev-list -- $foo". But if
"$foo" contains glob characters (e.g., "f*"), it will
erroneously match more entries than desired. The caller
needs to quote the characters in $foo, and even then, the
results may not be exactly the same as with a literal
pathspec. For instance, the depth checks in
match_pathspec_depth do not kick in if we match via fnmatch.
This patch introduces a global command-line option (i.e.,
one for "git" itself, not for specific commands) to turn
this behavior off. It also has a matching environment
variable, which can make it easier if you are a script or
porcelain interface that is going to issue many such
commands.
This option cannot turn off globbing for particular
pathspecs. That could eventually be done with a ":(noglob)"
magic pathspec prefix. However, that level of granularity is
more cumbersome to use for many cases, and doing ":(noglob)"
right would mean converting the whole codebase to use
"struct pathspec", as the usual "const char **pathspec"
cannot represent extra per-item flags.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a pattern contains only a single asterisk as wildcard,
e.g. "foo*bar", after literally comparing the leading part "foo" with
the string, we can compare the tail of the string and make sure it
matches "bar", instead of running fnmatch() on "*bar" against the
remainder of the string.
-O2 build on linux-2.6, without the patch:
$ time git rev-list --quiet HEAD -- '*.c'
real 0m40.770s
user 0m40.290s
sys 0m0.256s
With the patch
$ time ~/w/git/git rev-list --quiet HEAD -- '*.c'
real 0m34.288s
user 0m33.997s
sys 0m0.205s
The above command is not supposed to be widely popular. It's chosen
because it exercises pathspec matching a lot. The point is it cuts
down matching time for popular patterns like *.c, which could be used
as pathspec in other places.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We mark pathspec with wildcards with the field use_wildcard. We
could do better by saving the length of the non-wildcard part, which
can be used for optimizations such as f9f6e2c (exclude: do strcmp as
much as possible before fnmatch - 2012-06-07).
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Start laying the foundation to build the "wildmatch" after we can
agree on its desired semantics.
* nd/attr-match-optim-more:
attr: more matching optimizations from .gitignore
gitignore: make pattern parsing code a separate function
exclude: split pathname matching code into a separate function
exclude: fix a bug in prefix compare optimization
exclude: split basename matching code into a separate function
exclude: stricten a length check in EXC_FLAG_ENDSWITH case
.gitattributes and .gitignore share the same pattern syntax but has
separate matching implementation. Over the years, ignore's
implementation accumulates more optimizations while attr's stays the
same.
This patch reuses the core matching functions that are also used by
excluded_from_list. excluded_from_list and path_matches can't be
merged due to differences in exclude and attr, for example:
* "!pattern" syntax is forbidden in .gitattributes. As an attribute
can be unset (i.e. set to a special value "false") or made back to
unspecified (i.e. not even set to "false"), "!pattern attr" is unclear
which one it means.
* we support attaching attributes to directories, but git-core
internally does not currently make use of attributes on
directories.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function can later be reused by attr.c. Also turn to_exclude
field into a flag.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "namelen" becomes zero at this stage, we have matched the fixed
part, but whether it actually matches the pattern still depends on the
pattern in "exclude". As demonstrated in t3001, path "three/a.3"
exists and it matches the "three/a.3" part in pattern "three/a.3[abc]",
but that does not mean a true match.
Don't be too optimistic and let fnmatch() do the job.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This block of code deals with the "basename" part only, which has the
length of "pathlen - (basename - pathname)". Stricten the length check
and remove "pathname" from the main expression to avoid confusion.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* commit 'f9f6e2c':
exclude: do strcmp as much as possible before fnmatch
dir.c: get rid of the wildcard symbol set in no_wildcard()
Unindent excluded_from_list()
The previous series introduced warnings to multiple places, but it
could become tiring to see the warning on the same path over and
over again during a single run of Git. Making just one function
responsible for issuing this warning, we could later choose to keep
track of which paths we issued a warning (it would involve a hash
table of paths after running them through real_path() or something)
in order to reduce noise.
Right now we do not know if the noise reduction is necessary, but it
still would be a good code reduction/sharing anyway.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we try to access gitignore files, we check for their
existence with a call to "access". We silently ignore
missing files. However, if a file is not readable, this may
be a configuration error; let's warn the user.
For $GIT_DIR/info/excludes or core.excludesfile, we can just
use access_or_warn. However, for per-directory files we
actually try to open them, so we must add a custom warning.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Finishing touches to the XDG support (new feature for 1.7.12) and
tests.
* mm/config-xdg:
t1306: check that XDG_CONFIG_HOME works
ignore: make sure we have an xdg path before using it
attr: make sure we have an xdg path before using it
test-lib.sh: unset XDG_CONFIG_HOME
Commit e3ebc35 (config: fix several access(NULL) calls, 2012-07-12) was
fixing access(NULL) calls when trying to access $HOME/.config/git/config,
but missed the ones when trying to access $HOME/.config/git/ignore. Fix
and test this.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git ls-files --exclude=t -i" did not consider anything under t/ as
excluded, as it did not pay attention to exclusion of leading paths
while walking the index. Other two users of excluded() are also
updated.
* jc/ls-files-i-dir:
dir.c: make excluded() file scope static
unpack-trees.c: use path_excluded() in check_ok_to_remove()
builtin/add.c: use path_excluded()
path_excluded(): update API to less cache-entry centric
ls-files -i: micro-optimize path_excluded()
ls-files -i: pay attention to exclusion of leading paths
Teach git to read various information from $XDG_CONFIG_HOME/git/ to allow
the user to avoid cluttering $HOME.
* mm/config-xdg:
config: write to $XDG_CONFIG_HOME/git/config file when appropriate
Let core.attributesfile default to $XDG_CONFIG_HOME/git/attributes
Let core.excludesfile default to $XDG_CONFIG_HOME/git/ignore
config: read (but not write) from $XDG_CONFIG_HOME/git/config file
Attempt to optimize matching with an exclude pattern with a deep
directory hierarchy by taking the part that specifies leading path
without wildcard literally.
To use the feature of core.excludesfile, the user needs:
1. to create such a file,
2. and add configuration variable to point at it.
Instead, we can make this a one-step process by choosing a default value
which points to a filename in the user's $HOME, that is unlikely to
already exist on the system, and only use the presence of the file as a
cue that the user wants to use that feature.
And we use "${XDG_CONFIG_HOME:-$HOME/.config/git}/ignore" as such a
file, in the same directory as the newly added configuration file
("${XDG_CONFIG_HOME:-$HOME/.config/git}/config). The use of this
directory is in line with XDG specification as a location to store
such application specific files.
Signed-off-by: Huynh Khoi Nguyen Nguyen <Huynh-Khoi-Nguyen.Nguyen@ensimag.imag.fr>
Signed-off-by: Valentin Duperray <Valentin.Duperray@ensimag.imag.fr>
Signed-off-by: Franck Jonas <Franck.Jonas@ensimag.imag.fr>
Signed-off-by: Lucien Kong <Lucien.Kong@ensimag.imag.fr>
Signed-off-by: Thomas Nguy <Thomas.Nguy@ensimag.imag.fr>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git ls-files --exclude=t -i" did not consider anything under t/
as excluded, as it did not pay attention to exclusion of leading
paths while walking the index. Other two users of excluded() are
also updated.
* jc/ls-files-i-dir:
dir.c: make excluded() file scope static
unpack-trees.c: use path_excluded() in check_ok_to_remove()
builtin/add.c: use path_excluded()
path_excluded(): update API to less cache-entry centric
ls-files -i: micro-optimize path_excluded()
ls-files -i: pay attention to exclusion of leading paths
this also avoids calling fnmatch() if the non-wildcard prefix is
longer than basename
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Elsewhere in this file is_glob_special() is also used to check for
wildcards, which is defined in ctype. Make no_wildcard() also use this
function (indirectly via simple_length())
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was stupid of me to make the API too much cache-entry specific;
the caller may want to check arbitrary pathname without having a
corresponding cache-entry to see if a path is ignored.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we know a caller that does not recurse is calling us in the index
order, we can remember the last directory we found to be excluded
and see if the path we are looking at is still inside it, in which
case we can just answer that it is excluded.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git ls-files --exclude=t/ -i" does not show paths in directory t/
that have been added to the index, but it should.
The excluded() API was designed for callers who walk the tree from
the top, checking each level of the directory hierarchy as it
descends if it is excluded, and not even bothering to recurse into
an excluded directory. This would allow us optimize for a common
case by not having to check if the exclude pattern "foo/" matches
when looking at "foo/bar", because the caller should have noticed
that "foo" is excluded and did not even bother to read "foo/bar"
out of opendir()/readdir() to call it.
The code for "ls-files -i" however walks the index linearly, feeding
paths without checking if the leading directory is already excluded.
Introduce a helper function path_excluded() to let this caller
properly call excluded() check for higher hierarchies as necessary.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Return early if el->nr == 0. Unindent one more level for FNM_PATHNAME
code block as this block is getting complex and may need more
indentation.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that read_directory_recursive() (reached through read_directory())
respects the string length limit we provide, we don't need to create a
NUL-limited copy of the common prefix anymore.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A directory name is passed to read_directory_recursive() as a
length-limited string, through the parameters base and baselen.
Suprisingly, base must be a NUL-terminated string as well, as it is
passed to opendir(), ignoring baselen.
Fix this by postponing the call to opendir() until the length-limted
string is added to a strbuf, which provides a NUL in the right place.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The functions read_directory_recursive() and treat_leading_path() both
use buffers sized to fit PATH_MAX characters. The latter can be made to
overrun its buffer, e.g. like this:
$ a=0123456789abcdef
$ a=$a$a$a$a$a$a$a$a
$ a=$a$a$a$a$a$a$a$a
$ a=$a$a$a$a$a$a$a$a
$ git add $a/a
Instead of trying to add a check and potentionally forgetting to address
similar cases, convert the involved functions and their helpers to use
struct strbuf. The patch is suprisingly large because the helpers
treat_path() and treat_one_path() modify the buffer as well and thus need
to be converted, too.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
remove_dir_recursively() has a check to avoid removing the directory it
was asked to remove without recursing into it and report success when the
directory is the top level of a working tree of a nested git repository,
to protect such a repository from "clean -f" (without double -f). If a
working tree of a nested git repository is in a subdirectory of a toplevel
project, however, this protection did not apply by mistake; we forgot to
pass the REMOVE_DIR_KEEP_NESTED_GIT down to the recursive removal
codepath.
This requires us to also teach the higher level not to remove the
directory it is asked to remove, when the recursed invocation did not
remove the directory it was asked to remove due to a nested git
repository, as it is not an error to leave the parent directories of such
a nested repository.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add the REMOVE_DIR_KEEP_TOPLEVEL flag to remove_dir_recursively() for
deleting everything inside the given directory, but _not_ the given
directory itself.
Note that this does not pass the REMOVE_DIR_KEEP_NESTED_GIT flag, if set,
to the recursive invocations of remove_dir_recursively(). It is likely to
be a a bug that has been present since REMOVE_DIR_KEEP_NESTED_GIT was
introduced (a0f4afb), but this commit keeps the same behaviour for now.
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also make common_prefix_len() static as this refactoring makes dir.c
itself the only caller of this helper function.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The implementation from pathspec_prefix (slightly modified) replaces the
current common_prefix, because it also respects glob characters.
Based on a patch by Clemens Buchacher.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/struct-pathspec:
pathspec: rename per-item field has_wildcard to use_wildcard
Improve tree_entry_interesting() handling code
Convert read_tree{,_recursive} to support struct pathspec
Reimplement read_tree_recursive() using tree_entry_interesting()
As the point of the last change is to allow use of strings as
literals no matter what characters are in them, "has_wildcard"
does not match what we use this field for anymore.
It is used to decide if the wildcard matching should be used, so
rename it to match the usage better.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As a last ditch effort, try rmdir(2) when we cannot read the directory
to be removed. It may be an empty directory that we can remove without
any permission, as long as we can modify its parent directory.
Noticed by Linus.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Function dir_inside_of() does something similar (correctly), but looks
easier to understand and does not bundle cwd to its business. Given
get_relative_cwd's only user is is_inside_dir, we can kill it for
good.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The same old problem reappears after setup code is reworked. We tend
to assume there is at least one path component in a path and forget
that path can be simply '/'.
Reported-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the make_*_path functions so it's clearer what they do, in
particlar make clear what the differnce between make_absolute_path and
make_nonrelative_path is by renaming them real_path and absolute_path
respectively. make_relative_path has an understandable name and is
renamed to relative_path to maintain the name convention.
The function calls have been replaced 1-to-1 in their usage.
Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
match_pathspec_depth() is a clone of match_pathspec() except that it
can take depth limit. Computation is a bit lighter compared to
match_pathspec() because it's usually precomputed and stored in struct
pathspec.
In long term, match_pathspec() and match_one() should be removed in
favor of this function.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
never_interesting optimization is disabled if there is any wildcard
pathspec, even if it only matches exactly on trees.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Suppose we have two pathspecs 'a' and 'a/b' (both are dirs) and depth
limit 1. In current code, pathspecs are checked in input order. When
'a/b' is checked against pathspec 'a', it fails depth limit and
therefore is excluded, although it should match 'a/b' pathspec.
This patch reorders all pathspecs alphabetically, then teaches
tree_entry_interesting() to check against the deepest pathspec first,
so depth limit of a shallower pathspec won't affect a deeper one.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is needed to replace pathspec_matches() in builtin/grep.c.
max_depth == -1 means infinite depth. Depth limit is only effective
when pathspec.recursive == 1. When pathspec.recursive == 0, the
behavior depends on match functions: non-recursive for
tree_entry_interesting() and recursive for match_pathspec{,_depth}
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The old pathspec structure remains as pathspec.raw[]. New things are
stored in pathspec.items[]. There's no guarantee that the pathspec
order in raw[] is exactly as in items[].
raw[] is external (source) data and is untouched by pathspec
manipulation functions. It eases migration from old const char ** to
this new struct.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* nd/maint-fix-add-typo-detection:
Revert "excluded_1(): support exclude files in index"
unpack-trees: fix sparse checkout's "unable to match directories"
unpack-trees: move all skip-worktree checks back to unpack_trees()
dir.c: add free_excludes()
cache.h: realign and use (1 << x) form for CE_* constants
* jj/icase-directory:
Support case folding in git fast-import when core.ignorecase=true
Support case folding for git add when core.ignorecase=true
Add case insensitivity support when using git ls-files
Add case insensitivity support for directories when using git status
Case insensitivity support for .gitignore via core.ignorecase
Add string comparison functions that respect the ignore_case variable.
Makefile & configure: add a NO_FNMATCH_CASEFOLD flag
Makefile & configure: add a NO_FNMATCH flag
Conflicts:
Makefile
config.mak.in
configure.ac
fast-import.c
This reverts commit c84de70781.
The commit provided a workaround for matching directories in
index. But it is no longer needed.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 490544b (get_cwd_relative(): do not misinterpret suffix as
subdirectory) handles case where:
dir = "/path/work";
cwd = "/path/work-xyz";
When it comes to the end of get_cwd_relative(), dir is at '\0' and cwd
is at '-'. The rest of cwd, "-xyz", clearly cannot be the relative
path from dir to cwd. However there is another case where:
dir = "/"; /* or even "c:/" */
cwd = "/path/to/here";
In this special case, while *cwd == 'p', which is not a path
separator, the rest of cwd, "path/to/here", can be returned as a
relative path from dir to cwd.
Handle this case and make t1509 pass again.
Reported-by: Albert Strasheim <fullung@gmail.com>
Reported-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit c84de70 (excluded_1(): support exclude files in index -
2009-08-20) tries to work around the fact that there is no
directory/file information in index entries, therefore
EXC_FLAG_MUSTBEDIR match would fail.
Unfortunately the workaround is flawed. This fixes it.
Reported-by: Thomas Rinderknecht <thomasr@sailguy.org>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When mydir/filea.txt is added, mydir/ is renamed to MyDir/, and
MyDir/fileb.txt is added, running git ls-files mydir only shows
mydir/filea.txt. Running git ls-files MyDir shows MyDir/fileb.txt.
Running git ls-files mYdIR shows nothing.
With this patch running git ls-files for mydir, MyDir, and mYdIR shows
mydir/filea.txt and MyDir/fileb.txt.
Wildcards are not handled case insensitively in this patch. Example:
MyDir/aBc/file.txt is added. git ls-files MyDir/a* works fine, but git
ls-files mydir/a* does not.
Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using a case preserving but case insensitive file system, directory
case can differ but still refer to the same physical directory. git
status reports the directory with the alternate case as an Untracked
file. (That is, when mydir/filea.txt is added to the repository and
then the directory on disk is renamed from mydir/ to MyDir/, git status
shows MyDir/ as being untracked.)
Support has been added in name-hash.c for hashing directories with a
terminating slash into the name hash. When index_name_exists() is called
with a directory (a name with a terminating slash), the name is not
found via the normal cache_name_compare() call, but it is found in the
slow_same_name() function.
Additionally, in dir.c, directory_exists_in_index_icase() allows newly
added directories deeper in the directory chain to be identified.
Ultimately, it would be better if the file list was read in case
insensitive alphabetical order from disk, but this change seems to
suffice for now.
The end result is the directory is looked up in a case insensitive
manner and does not show in the Untracked files list.
Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is especially beneficial when using Windows and Perforce and the
git-p4 bridge. Internally, Perforce preserves a given file's full path
including its case at the time it was added to the Perforce repository.
When syncing a file down via Perforce, missing directories are created,
if necessary, using the case as stored with the filename. Unfortunately,
two files in the same directory can have differing cases for their
respective paths, such as /diRa/file1.c and /DirA/file2.c. Depending on
sync order, DirA/ may get created instead of diRa/.
It is possible to handle directory names in a case insensitive manner
without this patch, but it is highly inconvenient, requiring each
character to be specified like so: [Bb][Uu][Ii][Ll][Dd]. With this patch, the
gitignore exclusions honor the core.ignorecase=true configuration
setting and make the process less error prone. The above is specified
like so: Build
Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Multiple locations within this patch series alter a case sensitive
string comparison call such as strcmp() to be a call to a string
comparison call that selects case comparison based on the global
ignore_case variable. Behaviorally, when core.ignorecase=false, the
*_icase() versions are functionally equivalent to their C runtime
counterparts. When core.ignorecase=true, the *_icase() versions perform
a case insensitive comparison.
Like Linus' earlier ignorecase patch, these may ignore filename
conventions on certain file systems. By isolating filename comparisons
to certain functions, support for those filename conventions may be more
easily met.
Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
GCC 4.4.4 on MacOS incorrectly warns about potential use of uninitialized memory.
Signed-off-by: Pat Notz <patnotz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sometimes it is useful to know if a file or directory will be ignored
before it is added to the work tree. An example is "git submodule add",
where it would be really nice to be able to fail with an appropriate
error message before the submodule is cloned and checked out.
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* gv/portable:
test-lib: use DIFF definition from GIT-BUILD-OPTIONS
build: propagate $DIFF to scripts
Makefile: Tru64 portability fix
Makefile: HP-UX 10.20 portability fixes
Makefile: HPUX11 portability fixes
Makefile: SunOS 5.6 portability fix
inline declaration does not work on AIX
Allow disabling "inline"
Some platforms lack socklen_t type
Make NO_{INET_NTOP,INET_PTON} configured independently
Makefile: some platforms do not have hstrerror anywhere
git-compat-util.h: some platforms with mmap() lack MAP_FAILED definition
test_cmp: do not use "diff -u" on platforms that lack one
fixup: do not unconditionally disable "diff -u"
tests: use "test_cmp", not "diff", when verifying the result
Do not use "diff" found on PATH while building and installing
enums: omit trailing comma for portability
Makefile: -lpthread may still be necessary when libc has only pthread stubs
Rewrite dynamic structure initializations to runtime assignment
Makefile: pass CPPFLAGS through to fllow customization
Conflicts:
Makefile
wt-status.h
common_prefix() scans backwards from the far end of each 'next'
pathspec, starting from 'len', shortening the 'prefix' using 'path' as
a reference.
However, there is a small opportunity for an out-of-bounds access
because len is unconditionally set to prefix-1 after a "direct match"
test failed. This means that if 'next' is shorter than prefix+2, we
read past it.
Instead of a minimal fix, simplify the loop: scan *forward* over the
'next' entry, remembering the last '/' where it matched the prefix
known so far. This is far easier to read and also has the advantage
that we only scan over each entry once.
Acked-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Without this patch at least IBM VisualAge C 5.0 (I have 5.0.2) on AIX
5.1 fails to compile git.
enum style is inconsistent already, with some enums declared on one
line, some over 3 lines with the enum values all on the middle line,
sometimes with 1 enum value per line... and independently of that the
trailing comma is sometimes present and other times absent, often
mixing with/without trailing comma styles in a single file, and
sometimes in consecutive enum declarations.
Clearly, omitting the comma is the more portable style, and this patch
changes all enum declarations to use the portable omitted dangling
comma style consistently.
Signed-off-by: Gary V. Vaughan <gary@thewrittenword.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the current working directory is the same as the work tree path
plus a suffix, e.g. 'work' and 'work-xyz', then the suffix '-xyz'
would be interpreted as a subdirectory of 'work'.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jk/maint-add-ignored-dir:
tests for "git add ignored-dir/file" without -f
dir: fix COLLECT_IGNORED on excluded prefixes
t0050: mark non-working test as such
As we walk the directory tree, if we see an ignored path, we
want to add it to the ignored list only if it matches any
pathspec that we were given. We used to check for the
pathspec to appear explicitly. E.g., if we see "subdir/file"
and it is excluded, we check to see if we have "subdir/file"
in our pathspec.
However, this interacts badly with the optimization to avoid
recursing into ignored subdirectories. If "subdir" as a
whole is ignored, then we never recurse, and consider only
whether "subdir" itself is in our pathspec. It would not
match a pathspec of "subdir/file" explicitly, even though it
is the reason that subdir/file would be excluded.
This manifests itself to the user as "git add subdir/file"
failing to correctly note that the pathspec was ignored.
This patch extends the in_pathspec logic to include prefix
directory case.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we remove a path in a/deep/subdirectory, we should try to
remove as many trailing components as possible (i.e.,
subdirectory, then deep, then a). However, the test for the
return value of rmdir was reversed, so we only ever deleted
at most one level.
The fix is in remove_path, so "apply" and "merge-recursive"
also are fixed.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit b5041c5 (Avoid writing to buffer in add_excludes_from_file_1())
tried not to append '\n' at the end because the next commit
may return a buffer that does not have extra space for that.
Unfortunately it left this assignment in the loop:
buf[i - (i && buf[i-1] == '\r')] = 0;
that can corrupt memory if "buf" is not '\n' terminated. But even if
it does not corrupt memory, the last line would not be
NULL-terminated, leading to errors later inside add_exclude().
This patch fixes it by reverting the faulty commit and make
sure "buf" is always \n terminated.
While at it, free unused memory properly.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/ls-files-ignored-pathspec:
ls-files: fix overeager pathspec optimization
read_directory(): further split treat_path()
read_directory_recursive(): refactor handling of a single path into a separate function
t3001: test ls-files -o ignored/dir
* nd/sparse: (25 commits)
t7002: test for not using external grep on skip-worktree paths
t7002: set test prerequisite "external-grep" if supported
grep: do not do external grep on skip-worktree entries
commit: correctly respect skip-worktree bit
ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
tests: rename duplicate t1009
sparse checkout: inhibit empty worktree
Add tests for sparse checkout
read-tree: add --no-sparse-checkout to disable sparse checkout support
unpack-trees(): ignore worktree check outside checkout area
unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
unpack-trees.c: generalize verify_* functions
unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
Introduce "sparse checkout"
dir.c: export excluded_1() and add_excludes_from_file_1()
excluded_1(): support exclude files in index
unpack-trees(): carry skip-worktree bit over in merged_entry()
Read .gitignore from index if it is skip-worktree
Avoid writing to buffer in add_excludes_from_file_1()
...
Conflicts:
.gitignore
Documentation/config.txt
Documentation/git-update-index.txt
Makefile
entry.c
t/t7002-grep.sh
Given pathspecs that share a common prefix, ls-files optimized its call
into recursive directory reader by starting at the common prefix
directory.
If you have a directory "t" with an untracked file "t/junk" in it, but the
top-level .gitignore file told us to ignore "t/", this resulted in:
$ git ls-files -o --exclude-standard
$ git ls-files -o --exclude-standard t/
t/junk
$ git ls-files -o --exclude-standard t/junk
t/junk
$ cd t && git ls-files -o --exclude-standard
junk
We could argue that you are overriding the ignore file by giving a
patchspec that matches or being in that directory, but it is somewhat
unexpected. Worse yet, these behave differently:
$ git ls-files -o --exclude-standard t/ .
$ git ls-files -o --exclude-standard t/
t/junk
This patch changes the optimization so that it notices when the common
prefix directory that it starts reading from is an ignored one.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The next caller I'll be adding won't have an access to struct dirent
because it won't be reading from a directory stream. Split the main
part of the function further into a separate function to make it usable
by a caller without passing a dirent as long as it knows what type is
feeding the function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Primarily because I want to reuse it in a separate function later,
but this de-dents a huge function by one tabstop which by itself is
an improvement as well.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions are used to handle .gitignore. They are now exported
so that sparse checkout can reuse.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Index does not really have "directories", attempts to match "foo/"
against index will fail unless someone tries to reconstruct directories
from a list of file.
Observing that dtype in this function can never be NULL (otherwise
it would segfault), dtype NULL will be used to say "hey.. you are
matching against index" and behave properly.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This adds index as a prerequisite for directory listing (with
exclude). At the moment directory listing is used by "git clean",
"git add", "git ls-files" and "git status"/"git commit" and
unpack_trees()-related commands. These commands have been
checked/modified to populate index before doing directory listing.
add_excludes_from_file() does not enable this feature, because it
is used to read .git/info/exclude and some explicit files specified
by "git ls-files".
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next patch, the buffer that is being used within
add_excludes_from_file_1() comes from another function and does not
have extra space to put \n at the end.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you have an embedded git work tree in your work tree (be it
an orphaned submodule, or an independent checkout of an unrelated
project), "git clean -d -f" blindly descended into it and removed
everything. This is rarely what the user wants.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we have an up-to-date index entry for a file in that directory, we
can know that the directories leading up to that file must be
directories. No need to do an lstat() on the directory.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On filesystems without d_type, we can look at the cache entry first.
Doing an lstat() can be expensive.
Reported by Dmitry Potapov for Cygwin.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop the insanity with separate 'path' and 'base' arguments that must
match. We don't need that crazy interface any more, since we cleaned up
handling of 'path' in commit da4b3e8c28.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of the users of "read_directory()" actually want a much simpler
interface than the whole complex (but rather powerful) one.
In fact 'git add' had already largely abstracted out the core interface
issues into a private "fill_directory()" function that was largely
applicable almost as-is to a number of callers. Yes, 'git add' wants to
do some extra work of its own, specific to the add semantics, but we can
easily split that out, and use the core as a generic function.
This function does exactly that, and now that much simplified
'fill_directory()' function can be shared with a number of callers,
while also ensuring that the rather more complex calling conventions of
read_directory() are used by fewer call-sites.
This also makes the 'common_prefix()' helper function private to dir.c,
since all callers are now in that file.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change calls to die(..., strerror(errno)) to use the new die_errno().
In the process, also make slight style adjustments: at least state
_something_ about the function that failed (instead of just printing
the pathname), and put paths in single quotes.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a path F that matches ignore pattern has a conflict, "git add F"
insisted the -f option be given, which did not make sense. It would have
required -f when the path was originally added, but when resolving a
conflict, it already is tracked.
So this should work (and does):
$ echo file >.gitignore
$ echo content >file
$ git add -f file ;# need -f because we are adding new path
$ echo more content >>file
$ git add file ;# don't need -f; it is not actually an "other" file
This is handled under the hood by the COLLECT_IGNORED option to
read_directory. When that code finds an ignored file, it checks the
index to make sure it is not actually a tracked file. However, the test
it uses does not take into account unmerged entries, and considers them
to still be ignored. "git ls-files" uses a more elaborate test and gets
the right answer and the same test should be used here.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Right now we pass two different pathnames ('path' and 'base') down to
read_directory_recursive(), and the only real reason for that is that we
want to allow an empty 'base' parameter, but when we do so, we need the
pathname to "opendir()" to be "." rather than the empty string.
And rather than handle that confusion in the caller, we can just fix
read_directory_recursive() to handle the case of an empty path itself,
by just passing opendir() a "." ourselves if the path is empty.
This would allow us to then drop one of the pathnames entirely from the
calling convention, but rather than do that, we'll start separating them
out as a "filesystem pathname" (the one we use for filesystem accesses)
and a "git internal base name" (which is the name that we use for git
internally).
That will eventually allow us to do things like handle different
encodings (eg the filesystem pathnames might be Latin1, while git itself
would use UTF-8 for filename information).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
improve error message in config.c
t4018-diff-funcname: add cpp xfuncname pattern to syntax test
Work around BSD whose typeof(tv.tv_sec) != time_t
git-am.txt: reword extra headers in message body
git-am.txt: Use date or value instead of time or timestamp
git-am.txt: add an 'a', say what 'it' is, simplify a sentence
dir.c: Fix two minor grammatical errors in comments
git-svn: fix a sloppy Getopt::Long usage
Essentially; s/type* /type */ as per the coding guidelines.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mv/parseopt-ls-files:
ls-files: fix broken --no-empty-directory
t3000: use test_cmp instead of diff
parse-opt: migrate builtin-ls-files.
Turn the flags in struct dir_struct into a single variable
Conflicts:
builtin-ls-files.c
t/t3000-ls-files-others.sh
* kb/checkout-optim:
Revert "lstat_cache(): print a warning if doing ping-pong between cache types"
checkout bugfix: use stat.mtime instead of stat.ctime in two places
Makefile: Set compiler switch for USE_NSEC
Create USE_ST_TIMESPEC and turn it on for Darwin
Not all systems use st_[cm]tim field for ns resolution file timestamp
Record ns-timestamps if possible, but do not use it without USE_NSEC
write_index(): update index_state->timestamp after flushing to disk
verify_uptodate(): add ce_uptodate(ce) test
make USE_NSEC work as expected
fix compile error when USE_NSEC is defined
check_updates(): effective removal of cache entries marked CE_REMOVE
lstat_cache(): print a warning if doing ping-pong between cache types
show_patch_diff(): remove a call to fstat()
write_entry(): use fstat() instead of lstat() when file is open
write_entry(): cleanup of some duplicated code
create_directories(): remove some memcpy() and strchr() calls
unlink_entry(): introduce schedule_dir_for_removal()
lstat_cache(): swap func(length, string) into func(string, length)
lstat_cache(): generalise longest_match_lstat_cache()
lstat_cache(): small cleanup and optimisation
By having flags represented as bits in the new member variable 'flags',
it will be easier to use parse_options when dir_struct is involved.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>