mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-10-31 22:37:54 +01:00

Author	SHA1	Message	Date
David Turner	2e5910f276	untracked-cache: fix subdirectory handling Previously, some calls lookup_untracked would pass a full path. But lookup_untracked assumes that the portion of the path up to and including to the untracked_cache_dir has been removed. So lookup_untracked would be looking in the untracked_cache for 'foo' for 'foo/bar' (instead of just looking for 'bar'). This would cause untracked cache corruption. Instead, treat_directory learns to track the base length of the parent directory, so that only the last path component is passed to lookup_untracked. Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-08-19 10:40:24 -07:00
Junio C Hamano	1c09276e34	Merge branch 'dt/untracked-sparse' Allow untracked cache (experimental) to be used when sparse checkout (experimental) is also in use. * dt/untracked-sparse: untracked-cache: support sparse checkout	2015-08-17 15:07:52 -07:00
Jeff King	f932729cc7	memoize common git-path "constant" files One of the most common uses of git_path() is to pass a constant, like git_path("MERGE_MSG"). This has two drawbacks: 1. The return value is a static buffer, and the lifetime is dependent on other calls to git_path, etc. 2. There's no compile-time checking of the pathname. This is OK for a one-off (after all, we have to spell it correctly at least once), but many of these constant strings appear throughout the code. This patch introduces a series of functions to "memoize" these strings, which are essentially globals for the lifetime of the program. We compute the value once, take ownership of the buffer, and return the cached value for subsequent calls. cache.h provides a helper macro for defining these functions as one-liners, and defines a few common ones for global use. Using a macro is a little bit gross, but it does nicely document the purpose of the functions. If we need to touch them all later (e.g., because we learned how to change the git_dir variable at runtime, and need to invalidate all of the stored values), it will be much easier to have the complete list. Note that the shared-global functions have separate, manual declarations. We could do something clever with the macros (e.g., expand it to a declaration in some places, and a declaration _and_ a definition in path.c). But there aren't that many, and it's probably better to stay away from too-magical macros. Likewise, if we abandon the C preprocessor in favor of generating these with a script, we could get much fancier. E.g., normalizing "FOO/BAR-BAZ" into "git_path_foo_bar_baz". But the small amount of saved typing is probably not worth the resulting confusion to readers who want to grep for the function's definition. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-08-10 15:37:14 -07:00
Junio C Hamano	8e699cdb9f	Merge branch 'cb/uname-in-untracked' An experimental "untracked cache" feature used uname(2) in a slightly unportable way. * cb/uname-in-untracked: untracked: fix detection of uname(2) failure	2015-08-03 11:01:26 -07:00
David Turner	7687252f3f	untracked-cache: support sparse checkout Remove a check that would disable the untracked cache for sparse checkouts. Add tests that ensure that the untracked cache works with sparse checkouts -- specifically considering the case that a file foo/bar is checked out, but foo/.gitignore is not. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-07-31 10:56:18 -07:00
Charles Bailey	100e433741	untracked: fix detection of uname(2) failure According to POSIX specification uname(2) must return -1 on failure and a non-negative value on success. Although many implementations do return 0 on success it is valid to return any positive value for success. In particular, Solaris returns 1. Signed-off-by: Charles Bailey <cbailey32@bloomberg.net> Reviewed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-07-17 14:39:59 -07:00
Junio C Hamano	dfb67594e9	Merge branch 'rs/janitorial' into maint Code clean-up. * rs/janitorial: dir: remove unused variable sb clean: remove unused variable buf use file_exists() to check if a file exists in the worktree	2015-06-16 14:33:47 -07:00
Junio C Hamano	e9f767ecee	Merge branch 'jc/gitignore-precedence' into maint core.excludesfile (defaulting to $XDG_HOME/git/ignore) is supposed to be overridden by repository-specific .git/info/exclude file, but the order was swapped from the beginning. This belatedly fixes it. * jc/gitignore-precedence: ignore: info/exclude should trump core.excludesfile	2015-06-05 12:00:13 -07:00
Junio C Hamano	d9c82fa7a7	Merge branch 'pt/xdg-config-path' into maint Code clean-up for xdg configuration path support. * pt/xdg-config-path: path.c: remove home_config_paths() git-config: replace use of home_config_paths() git-commit: replace use of home_config_paths() credential-store.c: replace home_config_paths() with xdg_config_home() dir.c: replace home_config_paths() with xdg_config_home() attr.c: replace home_config_paths() with xdg_config_home() path.c: implement xdg_config_home() t0302: "unreadable" test needs POSIXPERM t0302: test credential-store support for XDG_CONFIG_HOME git-credential-store: support XDG_CONFIG_HOME git-credential-store: support multiple credential files	2015-06-05 12:00:04 -07:00
Junio C Hamano	4ba5bb5531	Merge branch 'rs/janitorial' Code clean-up. * rs/janitorial: dir: remove unused variable sb clean: remove unused variable buf use file_exists() to check if a file exists in the worktree	2015-06-01 12:45:15 -07:00
Junio C Hamano	38ccaf93bb	Merge branch 'nd/untracked-cache' Teach the index to optionally remember already seen untracked files to speed up "git status" in a working tree with tons of cruft. * nd/untracked-cache: (24 commits) git-status.txt: advertisement for untracked cache untracked cache: guard and disable on system changes mingw32: add uname() t7063: tests for untracked cache update-index: test the system before enabling untracked cache update-index: manually enable or disable untracked cache status: enable untracked cache untracked-cache: temporarily disable with $GIT_DISABLE_UNTRACKED_CACHE untracked cache: mark index dirty if untracked cache is updated untracked cache: print stats with $GIT_TRACE_UNTRACKED_STATS untracked cache: avoid racy timestamps read-cache.c: split racy stat test to a separate function untracked cache: invalidate at index addition or removal untracked cache: load from UNTR index extension untracked cache: save to an index extension ewah: add convenient wrapper ewah_serialize_strbuf() untracked cache: don't open non-existent .gitignore untracked cache: mark what dirs should be recursed/saved untracked cache: record/validate dir mtime and reuse cached output untracked cache: make a wrapper around {open,read,close}dir() ...	2015-05-26 13:24:46 -07:00
René Scharfe	22570b68e3	dir: remove unused variable sb It had never been used. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-20 13:50:22 -07:00
Junio C Hamano	20cf8b548e	Merge branch 'jc/gitignore-precedence' core.excludesfile (defaulting to $XDG_HOME/git/ignore) is supposed to be overridden by repository-specific .git/info/exclude file, but the order was swapped from the beginning. This belatedly fixes it. * jc/gitignore-precedence: ignore: info/exclude should trump core.excludesfile	2015-05-19 13:17:51 -07:00
Junio C Hamano	8a1d89745d	Merge branch 'cn/bom-in-gitignore' into maint Teach the codepaths that read .gitignore and .gitattributes files that these files encoded in UTF-8 may have UTF-8 BOM marker at the beginning; this makes it in line with what we do for configuration files already. * cn/bom-in-gitignore: attr: skip UTF8 BOM at the beginning of the input file config: use utf8_bom[] from utf.[ch] in git_parse_source() utf8-bom: introduce skip_utf8_bom() helper add_excludes_from_file: clarify the bom skipping logic dir: allow a BOM at the beginning of exclude files	2015-05-13 14:05:51 -07:00
Junio C Hamano	558e5a8c40	Merge branch 'pt/xdg-config-path' Code clean-up for xdg configuration path support. * pt/xdg-config-path: path.c: remove home_config_paths() git-config: replace use of home_config_paths() git-commit: replace use of home_config_paths() credential-store.c: replace home_config_paths() with xdg_config_home() dir.c: replace home_config_paths() with xdg_config_home() attr.c: replace home_config_paths() with xdg_config_home() path.c: implement xdg_config_home()	2015-05-11 14:24:01 -07:00
Paul Tan	2845ce7ff1	dir.c: replace home_config_paths() with xdg_config_home() Since only the xdg excludes file path is required, simplify the code by replacing use of home_config_paths() with xdg_config_home(). Signed-off-by: Paul Tan <pyokagan@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-06 11:33:17 -07:00
Junio C Hamano	2e1dfd62dc	Merge branch 'cn/bom-in-gitignore' Teach the codepaths that read .gitignore and .gitattributes files that these files encoded in UTF-8 may have UTF-8 BOM marker at the beginning; this makes it in line with what we do for configuration files already. * cn/bom-in-gitignore: attr: skip UTF8 BOM at the beginning of the input file config: use utf8_bom[] from utf.[ch] in git_parse_source() utf8-bom: introduce skip_utf8_bom() helper add_excludes_from_file: clarify the bom skipping logic dir: allow a BOM at the beginning of exclude files	2015-05-05 21:00:34 -07:00
Junio C Hamano	099d2d86a8	ignore: info/exclude should trump core.excludesfile $GIT_DIR/info/exclude and core.excludesfile (which falls back to $XDG_HOME/git/ignore) are both ways to override the ignore pattern lists given by the project in .gitignore files. The former, which is per-repository personal preference, should take precedence over the latter, which is a personal preference default across different repositories that are accessed from that machine. The existing documentation also agrees. However, the precedence order was screwed up between these two from the very beginning when `896bdfa2` (add: Support specifying an excludes file with a configuration variable, 2007-02-27) introduced core.excludesfile variable. Noticed-by: Yohei Endo <yoheie@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-04-22 14:31:49 -07:00
Junio C Hamano	dde843e737	utf8-bom: introduce skip_utf8_bom() helper With the recent change to ignore the UTF8 BOM at the beginning of .gitignore files, we now have two codepaths that do such a skipping (the other one is for reading the configuration files). Introduce utf8_bom[] constant string and skip_utf8_bom() helper and teach .gitignore code how to use it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-04-16 11:35:06 -07:00
Junio C Hamano	cb0abea870	add_excludes_from_file: clarify the bom skipping logic Even though the previous step shifts where the "entry" begins, we still iterate over the original buf[], which may begin with the UTF-8 BOM we are supposed to be skipping. At the end of the first line, the code grabs the contents of it starting at "entry", so there is nothing wrong per-se, but the logic looks really confused. Instead, move the buf pointer and shrink its size, to truly pretend that UTF-8 BOM did not exist in the input. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-04-16 11:26:29 -07:00
Carlos Martín Nieto	245e1c196d	dir: allow a BOM at the beginning of exclude files Some text editors like Notepad or LibreOffice write an UTF-8 BOM in order to indicate that the file is Unicode text rather than whatever the current locale would indicate. If someone uses such an editor to edit a gitignore file, we are left with those three bytes at the beginning of the file. If we do not skip them, we will attempt to match a filename with the BOM as prefix, which won't match the files the user is expecting. Signed-off-by: Carlos Martín Nieto <cmn@elego.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-04-16 10:17:04 -07:00
Junio C Hamano	ab0fb57aac	Merge branch 'jc/report-path-error-to-dir' into maint Code clean-up. * jc/report-path-error-to-dir: report_path_error(): move to dir.c	2015-03-31 14:53:08 -07:00
Junio C Hamano	574ee8ae86	Merge branch 'jc/report-path-error-to-dir' Code clean-up. * jc/report-path-error-to-dir: report_path_error(): move to dir.c	2015-03-26 11:57:13 -07:00
Junio C Hamano	777c55a616	report_path_error(): move to dir.c The expected call sequence is for the caller to use match_pathspec() repeatedly on a set of pathspecs, accumulating the "hits" in a separate array, and then call this function to diagnose a pathspec that never matched anything, as that can indicate a typo from the command line, e.g. "git commit Maekfile". Many builtin commands use this function from builtin/ls-files.c, which is not a very healthy arrangement. ls-files might have been the first command to feel the need for such a helper, but the need is shared by everybody who uses the "match and then report" pattern. Move it to dir.c where match_pathspec() is defined. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-24 14:12:10 -07:00
Nguyễn Thái Ngọc Duy	1e8fef609e	untracked cache: guard and disable on system changes If the user enables untracked cache, then - move worktree to an unsupported filesystem - or simply upgrade OS - or move the whole (portable) disk from one machine to another - or access a shared fs from another machine there's no guarantee that untracked cache can still function properly. Record the worktree location and OS footprint in the cache. If it changes, err on the safe side and disable the cache. The user can 'update-index --untracked-cache' again to make sure all conditions are met. This adds a new requirement that setup_git_directory* must be called before read_cache() because we need worktree location by then, or the cache is dropped. This change does not cover all bases, you can fool it if you try hard. The point is to stop accidents. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: brian m. carlson <sandals@crustytoothpaste.net> Helped-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:18 -07:00
Nguyễn Thái Ngọc Duy	76e6b090a0	untracked-cache: temporarily disable with $GIT_DISABLE_UNTRACKED_CACHE This can be used to double check if results with untracked cache are correctly, compared to vanilla version. Untracked cache remains in index, but not used. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:17 -07:00
Nguyễn Thái Ngọc Duy	1bbb3dba3f	untracked cache: mark index dirty if untracked cache is updated Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:17 -07:00
Nguyễn Thái Ngọc Duy	c9ccb5d327	untracked cache: print stats with $GIT_TRACE_UNTRACKED_STATS This could be used to verify correct behavior in tests Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:17 -07:00
Nguyễn Thái Ngọc Duy	ed4efab1b1	untracked cache: avoid racy timestamps When a directory is updated within the same second that its timestamp is last saved, we cannot realize the directory has been updated by checking timestamps. Assume the worst (something is update). See `29e4d36` (Racy GIT - 2005-12-20) for more information. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:17 -07:00
Nguyễn Thái Ngọc Duy	e931371a8f	untracked cache: invalidate at index addition or removal Ideally we should implement untracked_cache_remove_from_index() and untracked_cache_add_to_index() so that they update untracked cache right away instead of invalidating it and wait for read_directory() next time to deal with it. But that may need some more work in unpack-trees.c. So stay simple as the first step. The new call in add_index_entry_with_check() may look strange because new calls usually stay close to cache_tree_invalidate_path(). We do it a bit later than c_t_i_p() in this function because if it's about replacing the entry with the same name, we don't care (but cache-tree does). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:16 -07:00
Nguyễn Thái Ngọc Duy	f9e6c64958	untracked cache: load from UNTR index extension Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:16 -07:00
Nguyễn Thái Ngọc Duy	83c094ad0d	untracked cache: save to an index extension Helped-by: Stefan Beller <sbeller@google.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:16 -07:00
Nguyễn Thái Ngọc Duy	27b099ae87	untracked cache: don't open non-existent .gitignore This cuts down a signficant number of open(.gitignore) because most directories usually don't have .gitignore files. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:16 -07:00
Nguyễn Thái Ngọc Duy	26cb0182b8	untracked cache: mark what dirs should be recursed/saved If we redo this thing in a functional style, we would have one struct untracked_dir as input tree and another as output. The input is used for verification. The output is a brand new tree, reflecting current worktree. But that means recreate a lot of dir nodes even if a lot could be shared between input and output trees in good cases. So we go with the messy but efficient way, combining both input and output trees into one. We need a way to know which node in this combined tree belongs to the output. This is the purpose of this "recurse" flag. "valid" bit can't be used for this because it's about data of the node except the subdirs. When we invalidate a directory, we want to keep cached data of the subdirs intact even though we don't really know what subdir still exists (yet). Then we check worktree to see what actual subdir remains on disk. Those will have 'recurse' bit set again. If cached data for those are still valid, we may be able to avoid computing exclude files for them. Those subdirs that are deleted will have 'recurse' remained clear and their 'valid' bits do not matter. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:16 -07:00
Nguyễn Thái Ngọc Duy	91a2288b5f	untracked cache: record/validate dir mtime and reuse cached output The main readdir loop in read_directory_recursive() is replaced with a new one that checks if cached results of a directory is still valid. If a file is added or removed from the index, the containing directory is invalidated (but not its subdirs). If directory's mtime is changed, the same happens. If a .gitignore is updated, the containing directory and all subdirs are invalidated recursively. If dir_struct#flags or other conditions change, the cache is ignored. If a directory is invalidated, we opendir/readdir/closedir and run the exclude machinery on that directory listing as usual. If untracked cache is also enabled, we'll update the cache along the way. If a directory is validated, we simply pull the untracked listing out from the cache. The cache also records the list of direct subdirs that we have to recurse in. Fully excluded directories are seen as "untracked files". In the best case when no dirs are invalidated, read_directory() becomes a series of stat(dir), open(.gitignore), fstat(), read(), close() and optionally hash_sha1_file() For comparison, standard read_directory() is a sequence of opendir(), readdir(), open(.gitignore), fstat(), read(), close(), the expensive last_exclude_matching() and closedir(). We already try not to open(.gitignore) if we know it does not exist, so open/fstat/read/close sequence does not apply to every directory. The sequence could be reduced further, as noted in prep_exclude() in another patch. So in theory, the entire best-case read_directory sequence could be reduced to a series of stat() and nothing else. This is not a silver bullet approach. When you compile a C file, for example, the old .o file is removed and a new one with the same name created, effectively invalidating the containing directory's cache (but not its subdirectories). If your build process touches every directory, this cache adds extra overhead for nothing, so it's a good idea to separate generated files from tracked files.. Editors may use the same strategy for saving files. And of course you're out of luck running your repo on an unsupported filesystem and/or operating system. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:15 -07:00
Nguyễn Thái Ngọc Duy	cf7c61484f	untracked cache: make a wrapper around {open,read,close}dir() This allows us to feed different info to read_directory_recursive() based on untracked cache in the next patch. Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:15 -07:00
Nguyễn Thái Ngọc Duy	5ebf79ad4b	untracked cache: invalidate dirs recursively if .gitignore changes It's easy to see that if an existing .gitignore changes, its SHA-1 would be different and invalidate_gitignore() is called. If .gitignore is removed, add_excludes() will treat it like an empty .gitignore, which again should invalidate the cached directory data. if .gitignore is added, lookup_untracked() already fills initial .gitignore SHA-1 as "empty file", so again invalidate_gitignore() is called. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:15 -07:00
Nguyễn Thái Ngọc Duy	ccad261f07	untracked cache: initial untracked cache validation Make sure the starting conditions and all global exclude files are good to go. If not, either disable untracked cache completely, or wipe out the cache and start fresh. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:15 -07:00
Nguyễn Thái Ngọc Duy	0dcb8d7fe0	untracked cache: record .gitignore information and dir hierarchy The idea is if we can capture all input and (non-rescursive) output of read_directory_recursive(), and can verify later that all the input is the same, then the second r_d_r() should produce the same output as in the first run. The requirement for this to work is stat info of a directory MUST change if an entry is added to or removed from that directory (and should not change often otherwise). If your OS and filesystem do not meet this requirement, untracked cache is not for you. Most file systems on nix should be fine. On Windows, NTFS is fine while FAT may not be [1] even though FAT on Linux seems to be fine. The list of input of r_d_r() is in the big comment block in dir.h. In short, the output of a directory (not counting subdirs) mainly depends on stat info of the directory in question, all .gitignore leading to it and the check_only flag when r_d_r() is called recursively. This patch records all this info (and the output) as r_d_r() runs. Two hash_sha1_file() are required for $GIT_DIR/info/exclude and core.excludesfile unless their stat data matches. hash_sha1_file() is only needed when .gitignore files in the worktree are modified, otherwise their SHA-1 in index is used (see the previous patch). We could store stat data for .gitignore files so we don't have to rehash them if their content is different from index, but I think .gitignore files are rarely modified, so not worth extra cache data (and hashing penalty read-cache.c:verify_hdr(), as we will be storing this as an index extension). The implication is, if you change .gitignore, you better add it to the index soon or you lose all the benefit of untracked cache because a modified .gitignore invalidates all subdirs recursively. This is especially bad for .gitignore at root. This cached output is about untracked files only, not ignored files because the number of tracked files is usually small, so small cache overhead, while the number of ignored files could go really high (e.g. .o files mixing with source code). [1] "Description of NTFS date and time stamps for files and folders" http://support.microsoft.com/kb/299648 Helped-by: Torsten Bögershausen <tboegi@web.de> Helped-by: David Turner <dturner@twopensource.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:14 -07:00
Nguyễn Thái Ngọc Duy	55fe6f51f4	dir.c: optionally compute sha-1 of a .gitignore file This is not used anywhere yet. But the goal is to compare quickly if a .gitignore file has changed when we have the SHA-1 of both old (cached somewhere) and new (from index or a tree) versions. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-12 13:45:08 -07:00
Junio C Hamano	1758d236a2	Merge branch 'nd/dir-prep-exclude-cleanup' Code clean-up. * nd/dir-prep-exclude-cleanup: dir.c: remove the second declaration of "stk" in prep_exclude()	2014-10-24 15:00:05 -07:00
Nguyễn Thái Ngọc Duy	03e11a715b	dir.c: remove the second declaration of "stk" in prep_exclude() This "stk" shadows the first declaration at the top. There's currently no bad effect. But let's avoid it. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-21 11:22:00 -07:00
Junio C Hamano	f655651e09	Merge branch 'rs/strbuf-getcwd' Reduce the use of fixed sized buffer passed to getcwd() calls by introducing xgetcwd() helper. * rs/strbuf-getcwd: use strbuf_add_absolute_path() to add absolute paths abspath: convert absolute_path() to strbuf use xgetcwd() to set $GIT_DIR use xgetcwd() to get the current directory or die wrapper: add xgetcwd() abspath: convert real_path_internal() to strbuf abspath: use strbuf_getcwd() to remember original working directory setup: convert setup_git_directory_gently_1 et al. to strbuf unix-sockets: use strbuf_getcwd() strbuf: add strbuf_getcwd()	2014-09-02 13:28:44 -07:00
René Scharfe	56b9f6e738	use xgetcwd() to get the current directory or die Convert several calls of getcwd() and die() to use xgetcwd() instead. This way we get rid of fixed-size buffers (which can be too small depending on the used file system) and gain consistent error messages. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-26 11:06:06 -07:00
Nguyễn Thái Ngọc Duy	aceb9429b3	prep_exclude: remove the artificial PATH_MAX limit This fixes a segfault in git-status with long paths on Windows, where PATH_MAX is only 260. This also fixes the problem of silently ignoring .gitignore if the full path exceeds PATH_MAX. Now add_excludes_from_file() will report if it gets ENAMETOOLONG. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-14 15:24:34 -07:00
Nguyễn Thái Ngọc Duy	d961baa846	dir.c: coding style fix Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-07-14 15:24:34 -07:00
Jeremiah Mahler	ccdd4a0f3c	cleanup duplicate name_compare() functions We often represent our strings as a counted string, i.e. a pair of the pointer to the beginning of the string and its length, and the string may not be NUL terminated to that length. To compare a pair of such counted strings, unpack-trees.c and read-cache.c implement their own name_compare() functions identically. In addition, the cache_name_compare() function in read-cache.c is nearly identical. The only difference is when one string is the prefix of the other string, in which case name_compare() returns -1/+1 to show which one is longer, and cache_name_compare() returns the difference of the lengths to show the same information. Unify these three functions by using the implementation from cache_name_compare(). This does not make any difference to the existing and future callers, as they must be paying attention only to the sign of the returned value (and not the magnitude) because the original implementations of these two functions return values returned by memcmp(3) when the one string is not a prefix of the other string, and the only thing memcmp(3) guarantees its callers is the sign of the returned value, not the magnitude. Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-20 10:12:14 -07:00
Pasha Bolokhov	e61a6c1d82	dir.c:trim_trailing_spaces(): fix for " \ " sequence Discard the unnecessary 'nr_spaces' variable, remove 'strlen()' and improve the 'if' structure. Switch to pointers instead of integers to control the loop. Slightly more rare occurrences of 'text \ ' with a backslash in between spaces are handled correctly. Namely, the code in `7e2e4b37` (dir: ignore trailing spaces in exclude patterns, 2014-02-09) does not reset 'last_space' when a backslash is encountered and the above line stays intact as a result. Add a test at the end of t/t0008-ignores.sh to exhibit this behavior. Signed-off-by: Pasha Bolokhov <pasha.bolokhov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-02 15:48:48 -07:00
Junio C Hamano	8ba87adad6	Merge branch 'cb/aix' * cb/aix: tests: don't rely on strerror text when testing rmdir failure dir.c: make git_fnmatch() not inline	2014-04-03 12:38:38 -07:00
Charles Bailey	1f26ce615a	dir.c: make git_fnmatch() not inline Now that it calls a static inline function, it cannot be an inline definition with external linkage. Remove inline and make it an external definition. Signed-off-by: Charles Bailey <cbailey32@bloomberg.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-31 11:50:15 -07:00
Junio C Hamano	fe9122a352	Merge branch 'dd/use-alloc-grow' Replace open-coded reallocation with ALLOC_GROW() macro. * dd/use-alloc-grow: sha1_file.c: use ALLOC_GROW() in pretend_sha1_file() read-cache.c: use ALLOC_GROW() in add_index_entry() builtin/mktree.c: use ALLOC_GROW() in append_to_tree() attr.c: use ALLOC_GROW() in handle_attr_line() dir.c: use ALLOC_GROW() in create_simplify() reflog-walk.c: use ALLOC_GROW() replace_object.c: use ALLOC_GROW() in register_replace_object() patch-ids.c: use ALLOC_GROW() in add_commit() diffcore-rename.c: use ALLOC_GROW() diff.c: use ALLOC_GROW() commit.c: use ALLOC_GROW() in register_commit_graft() cache-tree.c: use ALLOC_GROW() in find_subtree() bundle.c: use ALLOC_GROW() in add_to_ref_list() builtin/pack-objects.c: use ALLOC_GROW() in check_pbase_path()	2014-03-18 13:50:21 -07:00
Junio C Hamano	650c90a185	Merge branch 'nd/no-more-fnmatch' We started using wildmatch() in place of fnmatch(3); complete the process and stop using fnmatch(3). * nd/no-more-fnmatch: actually remove compat fnmatch source code stop using fnmatch (either native or compat) Revert "test-wildmatch: add "perf" command to compare wildmatch and fnmatch" use wildmatch() directly without fnmatch() wrapper	2014-03-14 14:25:31 -07:00
Junio C Hamano	dfcd354cdf	Merge branch 'nd/gitignore-trailing-whitespace' Trailing whitespaces in .gitignore files, unless they are quoted for fnmatch(3), e.g. "path\ ", are warned and ignored. Strictly speaking, this is a backward incompatible change, but very unlikely to bite any sane user and adjusting should be obvious and easy. * nd/gitignore-trailing-whitespace: t0008: skip trailing space test on Windows dir: ignore trailing spaces in exclude patterns dir: warn about trailing spaces in exclude patterns	2014-03-14 14:23:37 -07:00
Dmitry S. Dolzhenko	9af49f822b	dir.c: use ALLOC_GROW() in create_simplify() Signed-off-by: Dmitry S. Dolzhenko <dmitrys.dolzhenko@yandex.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-03 14:54:29 -08:00
Nguyễn Thái Ngọc Duy	ae8d082421	pathspec: pass directory indicator to match_pathspec_item() This patch activates the DO_MATCH_DIRECTORY code in m_p_i(), which makes "git diff HEAD submodule/" and "git diff HEAD submodule" produce the same output. Previously only the version without trailing slash returns the difference (if any). That's the effect of new ce_path_match(). dir_path_match() is not executed by the new tests. And it should not introduce regressions. Previously if path "dir/" is passed in with pathspec "dir/", they obviously match. With new dir_path_match(), the path becomes _directory_ "dir" vs pathspec "dir/", which is not executed by the old code path in m_p_i(). The new code path is executed and produces the same result. The other case is pathspec "dir" and path "dir/" is now turned to "dir" (with DO_MATCH_DIRECTORY). Still the same result before or after the patch. So why change? Because of the next patch about clean.c. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 14:37:19 -08:00
Nguyễn Thái Ngọc Duy	68690fdd0b	match_pathspec: match pathspec "foo/" against directory "foo" Currently we do support matching pathspec "foo/" against directory "foo". That is because match_pathspec() has no way to tell "foo" is a directory and matching "foo/" against _file_ "foo" is wrong. The callers can now tell match_pathspec if "foo" is a directory, we could make an exception for this case. Code is not executed though because no callers pass the flag yet. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 14:37:19 -08:00
Nguyễn Thái Ngọc Duy	42b0874a7e	dir.c: prepare match_pathspec_item for taking more flags Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 14:37:19 -08:00
Nguyễn Thái Ngọc Duy	854b09592c	pathspec: rename match_pathspec_depth() to match_pathspec() A long time ago, for some reason I was not happy with match_pathspec(). I created a better version, match_pathspec_depth() that was suppose to replace match_pathspec() eventually. match_pathspec() has finally been gone since 6 months ago. Use the shorter name for match_pathspec_depth(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 14:37:14 -08:00
Nguyễn Thái Ngọc Duy	eb07894fe0	use wildmatch() directly without fnmatch() wrapper Make it clear that we don't use fnmatch() anymore. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-20 14:15:46 -08:00
Nguyễn Thái Ngọc Duy	7e2e4b37d3	dir: ignore trailing spaces in exclude patterns Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-10 11:49:53 -08:00
Nguyễn Thái Ngọc Duy	16402b992e	dir: warn about trailing spaces in exclude patterns Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-10 11:49:53 -08:00
Junio C Hamano	d0956cfa8e	Merge branch 'mh/safe-create-leading-directories' Code clean-up and protection against concurrent write access to the ref namespace. * mh/safe-create-leading-directories: rename_tmp_log(): on SCLD_VANISHED, retry rename_tmp_log(): limit the number of remote_empty_directories() attempts rename_tmp_log(): handle a possible mkdir/rmdir race rename_ref(): extract function rename_tmp_log() remove_dir_recurse(): handle disappearing files and directories remove_dir_recurse(): tighten condition for removing unreadable dir lock_ref_sha1_basic(): if locking fails with ENOENT, retry lock_ref_sha1_basic(): on SCLD_VANISHED, retry safe_create_leading_directories(): add new error value SCLD_VANISHED cmd_init_db(): when creating directories, handle errors conservatively safe_create_leading_directories(): introduce enum for return values safe_create_leading_directories(): always restore slash at end of loop safe_create_leading_directories(): split on first of multiple slashes safe_create_leading_directories(): rename local variable safe_create_leading_directories(): add explicit "slash" pointer safe_create_leading_directories(): reduce scope of local variable safe_create_leading_directories(): fix format of "if" chaining	2014-01-27 10:45:33 -08:00
Michael Haggerty	863808cd1a	remove_dir_recurse(): handle disappearing files and directories If a file or directory that we are trying to remove disappears (e.g., because another process has pruned it), do not consider it an error. However, if REMOVE_DIR_KEEP_TOPLEVEL is set, and the toplevel directory is missing, then consider it an error (like before). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-21 13:46:47 -08:00
Michael Haggerty	ecb2c282c0	remove_dir_recurse(): tighten condition for removing unreadable dir If opendir() fails on the top-level directory, it makes sense to try to delete it anyway--but only if the failure was due to EACCES. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-21 13:46:32 -08:00
Nguyễn Thái Ngọc Duy	ef79b1f870	Support pathspec magic :(exclude) and its short form :! Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-06 13:00:39 -08:00
Eric Sunshine	de372b1b46	dir: revert work-around for retired dangerous behavior directory_exists_in_index_icase() dangerously assumed that it could access one character beyond the end of its directory argument, and that that character would unconditionally be '/'. `2eac2a4c` (ls-files -k: a directory only can be killed if the index has a non-directory, 2013-08-15) added a caller which did not respect this undocumented assumption, and `680be044` (dir.c::test_one_path(): work around directory_exists_in_index_icase() breakage, 2013-08-23) added a work-around which temporarily appends a '/' before invoking directory_exists_in_index_icase(). Since the dangerous behavior of directory_exists_in_index_icase() has been eliminated, the work-around is now redundant, so retire it (but not the tests added by the same commit). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-09-17 10:08:27 -07:00
Eric Sunshine	d28eec2673	name-hash: stop storing trailing '/' on paths in index_state.dir_hash When `5102c617` (Add case insensitivity support for directories when using git status, 2010-10-03) added directories to the name-hash there was only a single hash table in which both real cache entries and leading directory prefixes were registered. To distinguish between the two types of entries, directories were stored with a trailing '/'. `2092678c` (name-hash.c: fix endless loop with core.ignorecase=true, 2013-02-28), however, moved directories to a separate hash table (index_state.dir_hash) but retained the (now) redundant trailing '/', thus callers continue to bear the burden of ensuring the slash's presence before searching the index for a directory. Eliminate this redundancy by storing paths in the dir-hash without the trailing '/'. An important benefit of this change is that it eliminates undocumented and dangerous behavior of dir.c:directory_exists_in_index_icase() in which it assumes not only that it can validly access one character beyond the end of its incoming directory argument, but also that that character will unconditionally be a '/'. This perilous behavior was "tolerated" because the string passed in by its lone caller always had a '/' in that position, however, things broke [1] when `2eac2a4c` (ls-files -k: a directory only can be killed if the index has a non-directory, 2013-08-15) added a new caller which failed to respect the undocumented assumption. [1]: http://thread.gmane.org/gmane.comp.version-control.git/232727 Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-09-17 10:08:07 -07:00
Eric Sunshine	ebbd7439b1	employ new explicit "exists in index?" API Each caller of index_name_exists() knows whether it is looking for a directory or a file, and can avoid the unnecessary indirection of index_name_exists() by instead calling index_dir_exists() or index_file_exists() directly. Invoking the appropriate search function explicitly will allow a subsequent patch to relieve callers of the artificial burden of having to add a trailing '/' to the pathname given to index_dir_exists(). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-09-17 10:07:37 -07:00
Junio C Hamano	4c4d9d9b65	Merge branch 'jc/ls-files-killed-optim' "git ls-files -k" needs to crawl only the part of the working tree that may overlap the paths in the index to find killed files, but shared code with the logic to find all the untracked files, which made it unnecessarily inefficient. * jc/ls-files-killed-optim: dir.c::test_one_path(): work around directory_exists_in_index_icase() breakage t3010: update to demonstrate "ls-files -k" optimization pitfalls ls-files -k: a directory only can be killed if the index has a non-directory dir.c: use the cache_* macro to access the current index	2013-09-11 15:03:28 -07:00
Junio C Hamano	b02f5aeda6	Merge branch 'jl/submodule-mv' "git mv A B" when moving a submodule A does "the right thing", inclusing relocating its working tree and adjusting the paths in the .gitmodules file. * jl/submodule-mv: (53 commits) rm: delete .gitmodules entry of submodules removed from the work tree mv: update the path entry in .gitmodules for moved submodules submodule.c: add .gitmodules staging helper functions mv: move submodules using a gitfile mv: move submodules together with their work trees rm: do not set a variable twice without intermediate reading. t6131 - skip tests if on case-insensitive file system parse_pathspec: accept :(icase)path syntax pathspec: support :(glob) syntax pathspec: make --literal-pathspecs disable pathspec magic pathspec: support :(literal) syntax for noglob pathspec kill limit_pathspec_to_literal() as it's only used by parse_pathspec() parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN parse_pathspec: make sure the prefix part is wildcard-free rename field "raw" to "_raw" in struct pathspec tree-diff: remove the use of pathspec's raw[] in follow-rename codepath remove match_pathspec() in favor of match_pathspec_depth() remove init_pathspec() in favor of parse_pathspec() remove diff_tree_{setup,release}_paths convert common_prefix() to use struct pathspec ...	2013-09-09 14:36:15 -07:00
Eric Sunshine	680be044d9	dir.c::test_one_path(): work around directory_exists_in_index_icase() breakage directory_exists_in_index() takes pathname and its length, but its helper function directory_exists_in_index_icase() reads one byte beyond the end of the pathname and expects there to be a '/'. This needs to be fixed, as that one-byte-beyond-the-end location may not even be readable, possibly by not registering directories to name hashes with trailing slashes. In the meantime, update the new caller added recently to treat_one_path() to make sure that the path buffer it gives the function is one byte longer than the path it is asking the function about by appending a slash to it. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-08-23 16:26:59 -07:00
Junio C Hamano	2eac2a4cc4	ls-files -k: a directory only can be killed if the index has a non-directory "ls-files -o" and "ls-files -k" both traverse the working tree down to find either all untracked paths or those that will be "killed" (removed from the working tree to make room) when the paths recorded in the index are checked out. It is necessary to traverse the working tree fully when enumerating all the "other" paths, but when we are only interested in "killed" paths, we can take advantage of the fact that paths that do not overlap with entries in the index can never be killed. The treat_one_path() helper function, which is called during the recursive traversal, is the ideal place to implement an optimization. When we are looking at a directory P in the working tree, there are three cases: (1) P exists in the index. Everything inside the directory P in the working tree needs to go when P is checked out from the index. (2) P does not exist in the index, but there is P/Q in the index. We know P will stay a directory when we check out the contents of the index, but we do not know yet if there is a directory P/Q in the working tree to be killed, so we need to recurse. (3) P does not exist in the index, and there is no P/Q in the index to require P to be a directory, either. Only in this case, we know that everything inside P will not be killed without recursing. Note that this helper is called by treat_leading_path() that decides if we need to traverse only subdirectories of a single common leading directory, which is essential for this optimization to be correct. This caller checks each level of the leading path component from shallower directory to deeper ones, and that is what allows us to only check if the path appears in the index. If the call to treat_one_path() weren't there, given a path P/Q/R, the real traversal may start from directory P/Q/R, even when the index records P as a regular file, and we would end up having to check if any leading subpath in P/Q/R, e.g. P, appears in the index. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-08-15 13:50:34 -07:00
Junio C Hamano	7126102742	dir.c: use the cache_* macro to access the current index These codepaths always start from the_index and use index_* functions, but there is no reason to do so. Use the compatibility cache_* macro to access the current in-core index like everybody else. While at it, fix typo in the comment for a function to check if a path within a directory appears in the index. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-08-15 12:08:45 -07:00
Junio C Hamano	d3aeb31dc4	Merge branch 'nd/const-struct-cache-entry' * nd/const-struct-cache-entry: Convert "struct cache_entry *" to "const ..." wherever possible	2013-07-22 11:24:01 -07:00
Nguyễn Thái Ngọc Duy	93d9353716	parse_pathspec: accept :(icase)path syntax Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 12:14:38 -07:00
Nguyễn Thái Ngọc Duy	bd30c2e484	pathspec: support :(glob) syntax :(glob)path differs from plain pathspec that it uses wildmatch with WM_PATHNAME while the other uses fnmatch without FNM_PATHNAME. The difference lies in how '' (and '*') is processed. With the introduction of :(glob) and :(literal) and their global options --[no]glob-pathspecs, the user can: - make everything literal by default via --noglob-pathspecs --literal-pathspecs cannot be used for this purpose as it disables _all_ pathspec magic. - individually turn on globbing with :(glob) - make everything globbing by default via --glob-pathspecs - individually turn off globbing with :(literal) The implication behind this is, there is no way to gain the default matching behavior (i.e. fnmatch without FNM_PATHNAME). You either get new globbing or literal. The old fnmatch behavior is considered deprecated and discouraged to use. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:10 -07:00
Nguyễn Thái Ngọc Duy	5c6933d201	pathspec: support :(literal) syntax for noglob pathspec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy	341003e715	kill limit_pathspec_to_literal() as it's only used by parse_pathspec() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy	b3920bbdc5	rename field "raw" to "_raw" in struct pathspec This patch is essentially no-op. It helps catching new use of this field though. This field is introduced as an intermediate step for the pathspec conversion and will be removed eventually. At this stage no more access sites should be introduced. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy	84b8b5d1fa	remove match_pathspec() in favor of match_pathspec_depth() match_pathspec_depth was created to replace match_pathspec (see `61cf282` (pathspec: add match_pathspec_depth() - 2010-12-15). It took more than two years, but the replacement finally happens :-) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy	9a08727443	remove init_pathspec() in favor of parse_pathspec() While at there, move free_pathspec() to pathspec.c Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy	827f4d6c21	convert common_prefix() to use struct pathspec The code now takes advantage of nowildcard_len field. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:09 -07:00
Nguyễn Thái Ngọc Duy	7327d3d1b7	convert {read,fill}_directory to take struct pathspec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:08 -07:00
Nguyễn Thái Ngọc Duy	8f4f8f4579	guard against new pathspec magic in pathspec matching code GUARD_PATHSPEC() marks pathspec-sensitive code, basically all those that touch anything in 'struct pathspec' except fields "nr" and "original". GUARD_PATHSPEC() is not supposed to fail. It's mainly to help the designers catch unsupported codepaths. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:07 -07:00
Nguyễn Thái Ngọc Duy	6330a17199	parse_pathspec: add special flag for max_depth feature match_pathspec_depth() and tree_entry_interesting() check max_depth field in order to support "git grep --max-depth". The feature activation is tied to "recursive" field, which led to some unwanted activation, e.g. `5c8eeb8` (diff-index: enable recursive pathspec matching in unpack_trees - 2012-01-15). This patch decouples the activation from "recursive" field, puts it in "magic" field instead. This makes sure that only "git grep" can activate this feature. And because parse_pathspec knows when the feature is not used, it does not need to sort pathspec (required for max_depth to work correctly). A small win for non-grep cases. Even though a new magic flag is introduced, no magic syntax is. The magic can be only enabled by parse_pathspec() caller. We might someday want to support ":(maxdepth:10)src." It all depends on actual use cases. max_depth feature cannot be enabled via init_pathspec() anymore. But that's ok because init_pathspec() is on its way to /dev/null. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:06 -07:00
Nguyễn Thái Ngọc Duy	d2ce133195	parse_pathspec: save original pathspec for reporting We usually use pathspec_item's match field for pathspec error reporting. However "match" (or "raw") does not show the magic part, which will play more important role later on. Preserve exact user input for reporting. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:06 -07:00
Nguyễn Thái Ngọc Duy	87323bdace	add parse_pathspec() that converts cmdline args to struct pathspec Currently to fill a struct pathspec, we do: const char **paths; paths = get_pathspec(prefix, argv); ... init_pathspec(&pathspec, paths); "paths" can only carry bare strings, which loses information from command line arguments such as pathspec magic or the prefix part's length for each argument. parse_pathspec() is introduced to combine the two calls into one. The plan is gradually replace all get_pathspec() and init_pathspec() with parse_pathspec(). get_pathspec() now becomes a thin wrapper of parse_pathspec(). parse_pathspec() allows the caller to reject the pathspec magics that it does not support. When a new pathspec magic is introduced, we can enable it per command after making sure that all underlying code has no problem with the new magic. "flags" parameter is currently unused. But it would allow callers to pass certain instructions to parse_pathspec, for example forcing literal pathspec when no magic is used. With the introduction of parse_pathspec, there are now two functions that can initialize struct pathspec: init_pathspec and parse_pathspec. Any semantic changes in struct pathspec must be reflected in both functions. init_pathspec() will be phased out in favor of parse_pathspec(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:06 -07:00
Nguyễn Thái Ngọc Duy	64acde94ef	move struct pathspec and related functions to pathspec.[ch] Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:06 -07:00
Nguyễn Thái Ngọc Duy	9c5e6c802c	Convert "struct cache_entry " to "const ..." wherever possible I attempted to make index_state->cache[] a "const struct cache_entry " to find out how existing entries in index are modified and where. The question I have is what do we do if we really need to keep track of on-disk changes in the index. The result is - diff-lib.c: setting CE_UPTODATE - name-hash.c: setting CE_HASHED - preload-index.c, read-cache.c, unpack-trees.c and builtin/update-index: obvious - entry.c: write_entry() may refresh the checked out entry via fill_stat_cache_info(). This causes "non-const struct cache_entry " in builtin/apply.c, builtin/checkout-index.c and builtin/checkout.c - builtin/ls-files.c: --with-tree changes stagemask and may set CE_UPDATE Of these, write_entry() and its call sites are probably most interesting because it modifies on-disk info. But this is stat info and can be retrieved via refresh, at least for porcelain commands. Other just uses ce_flags for local purposes. So, keeping track of "dirty" entries is just a matter of setting a flag in index modification functions exposed by read-cache.c. Except unpack-trees, the rest of the code base does not do anything funny behind read-cache's back. The actual patch is less valueable than the summary above. But if anyone wants to re-identify the above sites. Applying this patch, then this: diff --git a/cache.h b/cache.h index 430d021..1692891 100644 --- a/cache.h +++ b/cache.h @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode) #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1) struct index_state { - struct cache_entry cache; + const struct cache_entry cache; unsigned int version; unsigned int cache_nr, cache_alloc, cache_changed; struct string_list *resolve_undo; will help quickly identify them without bogus warnings. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-09 09:12:48 -07:00
Junio C Hamano	26c986e118	treat_directory(): do not declare submodules to be untracked When the working tree walker encounters a directory, it asks the function treat_directory() if it should descend into it, show it as an untracked directory, or do something else. When the directory is the top of the submodule working tree, we used to say "That is an untracked directory", which was bogus. It is an entity that is tracked in the index of the repository we are looking at, and that is not to be descended into it. Return path_none, not path_untracked, to report that. The existing case that path_untracked is returned for a newly discovered submodule that is not tracked in the index (this only happens when DIR_NO_GITLINKS option is not used) is unchanged, but that is exactly because the submodule is not tracked in the index. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-01 14:23:24 -07:00
Junio C Hamano	3684101a65	Merge branch 'kb/status-ignored-optim-2' Fix 1.8.3 regressions in the .gitignore path exclusion logic. * kb/status-ignored-optim-2: dir.c: fix ignore processing within not-ignored directories	2013-06-03 12:58:56 -07:00
Karsten Blees	c3c327deea	dir.c: fix ignore processing within not-ignored directories As of `95c6f271` "dir.c: unify is_excluded and is_path_excluded APIs", the is_excluded API no longer recurses into directories that match an ignore pattern, and returns the directory's ignored state for all contained paths. This is OK for normal ignore patterns, i.e. ignoring a directory affects the entire contents recursively. Unfortunately, this also "works" for negated ignore patterns ('!dir'), i.e. the entire contents is "not-ignored" recursively, regardless of ignore patterns that match the contents directly. In prep_exclude, skip recursing into a directory only if it is really ignored (i.e. the ignore pattern is not negated). Signed-off-by: Karsten Blees <blees@dcon.de> Tested-by: Øystein Walle <oystwa@gmail.com> Reviewed-by: Duy Nguyen <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-02 14:54:38 -07:00
Junio C Hamano	7ebb906ddd	Merge branch 'jn/config-ignore-inaccessible' When $HOME is misconfigured to point at an unreadable directory, we used to complain and die. This loosens the check. * jn/config-ignore-inaccessible: config: allow inaccessible configuration under $HOME	2013-05-29 14:30:10 -07:00
Karsten Blees	0aaf62b6e0	dir.c: git-status --ignored: don't scan the work tree twice 'git-status --ignored' still scans the work tree twice to collect untracked and ignored files, respectively. fill_directory / read_directory already supports collecting untracked and ignored files in a single directory scan. However, the DIR_COLLECT_IGNORED flag to enable this has some git-add specific side-effects (e.g. it doesn't recurse into ignored directories, so listing ignored files with --untracked=all doesn't work). The DIR_SHOW_IGNORED flag doesn't list untracked files and returns ignored files in dir_struct.entries[] (instead of dir_struct.ignored[] as DIR_COLLECT_IGNORED). DIR_SHOW_IGNORED is used all throughout git. We don't want to break the existing API, so lets introduce a new flag DIR_SHOW_IGNORED_TOO that lists untracked as well as ignored files similar to DIR_COLLECT_FILES, but will recurse into sub-directories based on the other flags as DIR_SHOW_IGNORED does. In dir.c::read_directory_recursive, add ignored files to either dir_struct.entries[] or dir_struct.ignored[] based on the flags. Also move the DIR_COLLECT_IGNORED case here so that filling result lists is in a common place. In wt-status.c::wt_status_collect_untracked, use the new flag and read results from dir_struct.ignored[]. Remove the extra fill_directory call. builtin/check-ignore.c doesn't call fill_directory, setting the git-add specific DIR_COLLECT_IGNORED flag has no effect here. Remove for clarity. Update API documentation to reflect the changes. Performance: with this patch, 'git-status --ignored' is typically as fast as 'git-status'. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-15 12:36:42 -07:00
Karsten Blees	defd7c7b37	dir.c: git-status --ignored: don't scan the work tree three times 'git-status --ignored' recursively scans directories up to three times: 1. To collect untracked files. 2. To collect ignored files. 3. When collecting ignored files, to check that an untracked directory that potentially contains ignored files doesn't also contain untracked files (i.e. isn't already listed as untracked). Let's get rid of case 3 first. Currently, read_directory_recursive returns a boolean whether a directory contains the requested files or not (actually, it returns the number of files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies what we're looking for. To be able to test for both untracked and ignored files in a single scan, we need to return a bit more info, and the result must be independent of the DIR_SHOW_IGNORED flag. Reuse the path_treatment enum as return value of read_directory_recursive. Split path_handled in two separate values path_excluded and path_untracked that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't need an extra value path_untracked_and_excluded, as directories with both untracked and ignored files should be listed as untracked. Rename path_ignored to path_none for clarity (i.e. "don't treat that path" in contrast to "the path is ignored and should be treated according to DIR_SHOW_IGNORED"). Replace enum directory_treatment with path_treatment. That's just another enum with the same meaning, no need to translate back and forth. In treat_directory, get rid of the extra read_directory_recursive call and all the DIR_SHOW_IGNORED-specific code. In read_directory_recursive, decide whether to dir_add_name path_excluded or path_untracked paths based on the DIR_SHOW_IGNORED flag. The return value of read_directory_recursive is the maximum path_treatment of all files and sub-directories. In the check_only case, abort when we've reached the most significant value (path_untracked). Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-15 12:34:01 -07:00
Karsten Blees	8aaf8d7728	dir.c: git-status: avoid is_excluded checks for tracked files Checking if a file is in the index is much faster (hashtable lookup) than checking if the file is excluded (linear search over exclude patterns). Skip is_excluded checks for files: move the cache_name_exists check from treat_file to treat_one_path and return early if the file is tracked. This can safely be done as all other code paths also return path_ignored for tracked files, and dir_add_ignored skips tracked files as well. There's just one line left in treat_file, so move this to treat_one_path as well. Here's some performance data for git-status from the linux and WebKit repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true): \| status \| status --ignored \| linux \| WebKit \| linux \| WebKit -------+-------+--------+-------+--------- before \| 0.218 \| 1.583 \| 0.321 \| 2.579 after \| 0.156 \| 0.988 \| 0.202 \| 1.279 gain \| 1.397 \| 1.602 \| 1.589 \| 2.016 Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-15 12:34:01 -07:00
Karsten Blees	b07bc8c8c3	dir.c: replace is_path_excluded with now equivalent is_excluded API Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-15 12:34:01 -07:00
Karsten Blees	95c6f27164	dir.c: unify is_excluded and is_path_excluded APIs The is_excluded and is_path_excluded APIs are very similar, except for a few noteworthy differences: is_excluded doesn't handle ignored directories, results for paths within ignored directories are incorrect. This is probably based on the premise that recursive directory scans should stop at ignored directories, which is no longer true (in certain cases, read_directory_recursive currently calls is_excluded and is_path_excluded to get correct ignored state). is_excluded caches parsed .gitignore files of the last directory in struct dir_struct. If the directory changes, it finds a common parent directory and is very careful to drop only as much state as necessary. On the other hand, is_excluded will also read and parse .gitignore files in already ignored directories, which are completely irrelevant. is_path_excluded correctly handles ignored directories by checking if any component in the path is excluded. As it uses is_excluded internally, this unfortunately forces is_excluded to drop and re-read all .gitignore files, as there is no common parent directory for the root dir. is_path_excluded tracks state in a separate struct path_exclude_check, which is essentially a wrapper of dir_struct with two more fields. However, as is_path_excluded also modifies dir_struct, it is not possible to e.g. use multiple path_exclude_check structures with the same dir_struct in parallel. The additional structure just unnecessarily complicates the API. Teach is_excluded / prep_exclude about ignored directories: whenever entering a new directory, first check if the entire directory is excluded. Remember the excluded state in dir_struct. Don't traverse into already ignored directories (i.e. don't read irrelevant .gitignore files). Directories could also be excluded by exclude patterns specified on the command line or .git/info/exclude, so we cannot simply skip prep_exclude entirely if there's no .gitignore file name (dir_struct.exclude_per_dir). Move this check to just before actually reading the file. is_path_excluded is now equivalent to is_excluded, so we can simply redirect to it (the public API is cleaned up in the next patch). The performance impact of the additional ignored check per directory is hardly noticeable when reading directories recursively (e.g. 'git status'). However, performance of git commands using the is_path_excluded API (e.g. 'git ls-files --cached --ignored --exclude-standard') is greatly improved as this no longer re-reads .gitignore files on each call. Here's some performance data from the linux and WebKit repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true): \| ls-files -ci \| status \| status --ignored \| linux \| WebKit \| linux \| WebKit \| linux \| WebKit -------+-------+--------+-------+--------+-------+--------- before \| 0.506 \| 6.539 \| 0.212 \| 1.555 \| 0.323 \| 2.541 after \| 0.080 \| 1.191 \| 0.218 \| 1.583 \| 0.321 \| 2.579 gain \| 6.325 \| 5.490 \| 0.972 \| 0.982 \| 1.006 \| 0.985 Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-15 12:34:00 -07:00
Karsten Blees	6cd5c582dc	dir.c: move prep_exclude Move prep_exclude in preparation for the next patch. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-15 12:34:00 -07:00
Karsten Blees	46aa2f95d2	dir.c: factor out parts of last_exclude_matching for later reuse Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-15 12:34:00 -07:00

1 2 3 4 5 ...