mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-14 13:13:01 +01:00

Author	SHA1	Message	Date
Junio C Hamano	8336832ad9	Merge branch 'nd/reset-intent-to-add' * nd/reset-intent-to-add: reset: support "--mixed --intent-to-add" mode	2014-02-27 14:01:40 -08:00
Junio C Hamano	cbaeafc325	Merge branch 'nd/submodule-pathspec-ending-with-slash' Allow "git cmd path/", when the 'path' is where a submodule is bound to the top-level working tree, to match 'path', despite the extra and unnecessary trailing slash. * nd/submodule-pathspec-ending-with-slash: clean: use cache_name_is_other() clean: replace match_pathspec() with dir_path_match() pathspec: pass directory indicator to match_pathspec_item() match_pathspec: match pathspec "foo/" against directory "foo" dir.c: prepare match_pathspec_item for taking more flags pathspec: rename match_pathspec_depth() to match_pathspec() pathspec: convert some match_pathspec_depth() to dir_path_match() pathspec: convert some match_pathspec_depth() to ce_path_match()	2014-02-27 14:01:15 -08:00
Junio C Hamano	156d6ed922	Merge branch 'bk/refresh-missing-ok-in-merge-recursive' Allow "merge-recursive" to work in an empty (temporary) working tree again when there are renames involved, correcting an old regression in 1.7.7 era. * bk/refresh-missing-ok-in-merge-recursive: merge-recursive.c: tolerate missing files while refreshing index read-cache.c: extend make_cache_entry refresh flag with options read-cache.c: refactor --ignore-missing implementation t3030-merge-recursive: test known breakage with empty work tree	2014-02-27 14:01:14 -08:00
Nguyễn Thái Ngọc Duy	429bb40abd	pathspec: convert some match_pathspec_depth() to ce_path_match() This helps reduce the number of match_pathspec_depth() call sites and show how match_pathspec_depth() is used. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 14:36:52 -08:00
Brad King	257627268a	read-cache.c: extend make_cache_entry refresh flag with options Convert the make_cache_entry boolean 'refresh' argument to a more general 'refresh_options' argument. Pass the value through to the underlying refresh_cache_ent call. Add option CE_MATCH_REFRESH to enable stat refresh. Update call sites to use the new signature. Signed-off-by: Brad King <brad.king@kitware.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 14:31:17 -08:00
Brad King	2e2e7ec1ef	read-cache.c: refactor --ignore-missing implementation Move lstat ENOENT handling from refresh_index to refresh_cache_ent and activate it with a new CE_MATCH_IGNORE_MISSING option. This will allow other call paths into refresh_cache_ent to use the feature. Signed-off-by: Brad King <brad.king@kitware.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 14:31:10 -08:00
Thomas Gummerer	3c09d6845d	read-cache: add index.version config variable Add a config variable that allows setting the default index version when initializing a new index file. Similar to the GIT_INDEX_VERSION environment variable this only affects new index files. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 13:33:17 -08:00
Thomas Gummerer	136347d718	introduce GIT_INDEX_VERSION environment variable Respect a GIT_INDEX_VERSION environment variable, when a new index is initialized. Setting the environment variable will not cause existing index files to be converted to another format, but will only affect newly written index files. This can be used to initialize repositories with index-v4. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 09:48:40 -08:00
Nguyễn Thái Ngọc Duy	b4b313f94a	reset: support "--mixed --intent-to-add" mode When --mixed is used, entries could be removed from index if the target ref does not have them. When "reset" is used in preparation for commit splitting (in a dirty worktree), it could be hard to track which files need to be added back. The new option --intent-to-add simplifies it by marking all removed files intent-to-add. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>	2014-02-05 16:44:51 -08:00
Jeff King	c3d8da571f	read-cache: use get_be32 instead of hand-rolled ntoh_l Commit `d60c49c` (read-cache.c: allow unaligned mapping of the index file, 2012-04-03) introduced helpers to access unaligned data. However, we already have get_be32, which has a few advantages: 1. It's already written, so we avoid duplication. 2. It's probably faster, since it does the endian conversion and the alignment fix at the same time. 3. The get_be32 code is well-tested, having been in block-sha1 for a long time. By contrast, our custom helpers were probably almost never used, since the user needed to manually define a macro to enable them. We have to add a get_be16 implementation to the existing get_be32, but that is very simple to do. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-23 14:03:48 -08:00
Karsten Blees	5699d17ee0	read-cache.c: fix memory leaks caused by removed cache entries When cache_entry structs are removed from index_state.cache, they are not properly freed. Freeing those entries wasn't possible before because we couldn't remove them from index_state.name_hash. Now that we _do_ remove the entries from name_hash, we can also free them. Add 'free(cache_entry)' to all call sites of name-hash.c::remove_name_hash in read-cache.c (we could free() directly in remove_name_hash(), but name-hash.c isn't concerned with cache_entry allocation at all). Accessing a cache_entry after removing it from the index is now no longer allowed, as the memory has been freed. The following functions need minor fixes (typically by copying ce->name before use): - builtin/rm.c::cmd_rm - builtin/update-index.c::do_reupdate - read-cache.c::read_index_unmerged - resolve-undo.c::unmerge_index_entry_at Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-11-18 13:04:25 -08:00
Karsten Blees	419a597f64	name-hash.c: remove cache entries instead of marking them CE_UNHASHED The new hashmap implementation supports remove, so really remove unused cache entries from the name hashmap instead of just marking them. The CE_UNHASHED flag and CE_STATE_MASK are no longer needed. Keep the CE_HASHED flag to prevent adding entries twice. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-11-18 13:04:24 -08:00
Junio C Hamano	d6a58b7773	Merge branch 'es/name-hash-no-trailing-slash-in-dirs' Clean up the internal of the name-hash mechanism used to work around case insensitivity on some filesystems to cleanly fix a long-standing API glitch where the caller of cache_name_exists() that ask about a directory with a counted string was required to have '/' at one location past the end of the string. * es/name-hash-no-trailing-slash-in-dirs: dir: revert work-around for retired dangerous behavior name-hash: stop storing trailing '/' on paths in index_state.dir_hash employ new explicit "exists in index?" API name-hash: refactor polymorphic index_name_exists()	2013-10-17 15:55:16 -07:00
Junio C Hamano	541dc4dfa0	Merge branch 'jk/write-broken-index-with-nul-sha1' Earlier we started rejecting an attempt to add 0{40} object name to the index and to tree objects, but it sometimes is necessary to allow so to be able to use tools like filter-branch to correct such broken tree objects. * jk/write-broken-index-with-nul-sha1: write_index: optionally allow broken null sha1s	2013-09-17 11:40:27 -07:00
Eric Sunshine	d28eec2673	name-hash: stop storing trailing '/' on paths in index_state.dir_hash When `5102c617` (Add case insensitivity support for directories when using git status, 2010-10-03) added directories to the name-hash there was only a single hash table in which both real cache entries and leading directory prefixes were registered. To distinguish between the two types of entries, directories were stored with a trailing '/'. `2092678c` (name-hash.c: fix endless loop with core.ignorecase=true, 2013-02-28), however, moved directories to a separate hash table (index_state.dir_hash) but retained the (now) redundant trailing '/', thus callers continue to bear the burden of ensuring the slash's presence before searching the index for a directory. Eliminate this redundancy by storing paths in the dir-hash without the trailing '/'. An important benefit of this change is that it eliminates undocumented and dangerous behavior of dir.c:directory_exists_in_index_icase() in which it assumes not only that it can validly access one character beyond the end of its incoming directory argument, but also that that character will unconditionally be a '/'. This perilous behavior was "tolerated" because the string passed in by its lone caller always had a '/' in that position, however, things broke [1] when `2eac2a4c` (ls-files -k: a directory only can be killed if the index has a non-directory, 2013-08-15) added a new caller which failed to respect the undocumented assumption. [1]: http://thread.gmane.org/gmane.comp.version-control.git/232727 Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-09-17 10:08:07 -07:00
Eric Sunshine	ebbd7439b1	employ new explicit "exists in index?" API Each caller of index_name_exists() knows whether it is looking for a directory or a file, and can avoid the unnecessary indirection of index_name_exists() by instead calling index_dir_exists() or index_file_exists() directly. Invoking the appropriate search function explicitly will allow a subsequent patch to relieve callers of the artificial burden of having to add a trailing '/' to the pathname given to index_dir_exists(). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-09-17 10:07:37 -07:00
Junio C Hamano	b0d974d6d9	Merge branch 'tg/index-struct-sizes' The code that reads from a region that mmaps an on-disk index assumed that "int"/"short" are always 32/16 bits. * tg/index-struct-sizes: read-cache: use fixed width integer types	2013-09-09 14:50:38 -07:00
Junio C Hamano	b02f5aeda6	Merge branch 'jl/submodule-mv' "git mv A B" when moving a submodule A does "the right thing", including relocating its working tree and adjusting the paths in the .gitmodules file. * jl/submodule-mv: (53 commits) rm: delete .gitmodules entry of submodules removed from the work tree mv: update the path entry in .gitmodules for moved submodules submodule.c: add .gitmodules staging helper functions mv: move submodules using a gitfile mv: move submodules together with their work trees rm: do not set a variable twice without intermediate reading. t6131 - skip tests if on case-insensitive file system parse_pathspec: accept :(icase)path syntax pathspec: support :(glob) syntax pathspec: make --literal-pathspecs disable pathspec magic pathspec: support :(literal) syntax for noglob pathspec kill limit_pathspec_to_literal() as it's only used by parse_pathspec() parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN parse_pathspec: make sure the prefix part is wildcard-free rename field "raw" to "_raw" in struct pathspec tree-diff: remove the use of pathspec's raw[] in follow-rename codepath remove match_pathspec() in favor of match_pathspec_depth() remove init_pathspec() in favor of parse_pathspec() remove diff_tree_{setup,release}_paths convert common_prefix() to use struct pathspec ...	2013-09-09 14:36:15 -07:00
Jeff King	83bd7437ca	write_index: optionally allow broken null sha1s Commit `4337b58` (do not write null sha1s to on-disk index, 2012-07-28) added a safety check preventing git from writing null sha1s into the index. The intent was to catch errors in other parts of the code that might let such an entry slip into the index (or worse, a tree). Some existing repositories may have invalid trees that contain null sha1s already, though. Until `4337b58`, a common way to clean this up would be to use git-filter-branch's index-filter to repair such broken entries. That now fails when filter-branch tries to write out the index. Introduce a GIT_ALLOW_NULL_SHA1 environment variable to relax this check and make it easier to recover from such a history. It is tempting to not involve filter-branch in this commit at all, and instead require the user to manually invoke GIT_ALLOW_NULL_SHA1=1 git filter-branch ... to perform an index-filter on a history with trees with null sha1s. That would be slightly safer, but requires some specialized knowledge from the user. So let's set the GIT_ALLOW_NULL_SHA1 variable automatically when checking out the to-be-filtered trees. Advice on using filter-branch to remove such entries already exists on places like stackoverflow, and this patch makes it Just Work again on recent versions of git. Further commands that touch the index will still notice and fail, unless they actually remove the broken entries. A filter-branch whose filters do not touch the index at all will not error out (since we complain of the null sha1 only on writing, not when making a tree out of the index), but this is acceptable, as we still print a loud warning, so the problem is unlikely to go unnoticed. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-08-28 20:54:43 -07:00
Thomas Gummerer	7800c1ebcc	read-cache: use fixed width integer types Use the fixed width integer types uint16_t and uint32_t for on-disk structures; unsigned short and unsigned int do not have a guaranteed size. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-08-20 12:29:42 -07:00
Ondřej Bílka	98e023dea4	many small typofixes Signed-off-by: Ondřej Bílka <neleai@seznam.cz> Reviewed-by: Marc Branchaud <marcnarc@xiplink.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-29 12:32:25 -07:00
Junio C Hamano	65ed8684c4	Merge branch 'rs/discard-index-discard-array' into maint * rs/discard-index-discard-array: read-cache: free cache in discard_index read-cache: add simple performance test	2013-07-19 10:39:01 -07:00
Nguyễn Thái Ngọc Duy	9b2d61499b	convert refresh_index to take struct pathspec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-15 10:56:08 -07:00
Nguyễn Thái Ngọc Duy	9c5e6c802c	Convert "struct cache_entry *" to "const ..." wherever possible I attempted to make index_state->cache[] a "const struct cache_entry **" to find out how existing entries in index are modified and where. The question I have is what do we do if we really need to keep track of on-disk changes in the index. The result is - diff-lib.c: setting CE_UPTODATE - name-hash.c: setting CE_HASHED - preload-index.c, read-cache.c, unpack-trees.c and builtin/update-index: obvious - entry.c: write_entry() may refresh the checked out entry via fill_stat_cache_info(). This causes "non-const struct cache_entry *" in builtin/apply.c, builtin/checkout-index.c and builtin/checkout.c - builtin/ls-files.c: --with-tree changes stagemask and may set CE_UPDATE Of these, write_entry() and its call sites are probably most interesting because it modifies on-disk info. But this is stat info and can be retrieved via refresh, at least for porcelain commands. Other just uses ce_flags for local purposes. So, keeping track of "dirty" entries is just a matter of setting a flag in index modification functions exposed by read-cache.c. Except unpack-trees, the rest of the code base does not do anything funny behind read-cache's back. The actual patch is less valuable than the summary above. But if anyone wants to re-identify the above sites. Applying this patch, then this: diff --git a/cache.h b/cache.h index 430d021..1692891 100644 --- a/cache.h +++ b/cache.h @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode) #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1) struct index_state { - struct cache_entry **cache; + const struct cache_entry **cache; unsigned int version; unsigned int cache_nr, cache_alloc, cache_changed; struct string_list *resolve_undo; will help quickly identify them without bogus warnings. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-07-09 09:12:48 -07:00
Junio C Hamano	ac5611a1cc	Merge branch 'fc/do-not-use-the-index-in-add-to-index' into maint * fc/do-not-use-the-index-in-add-to-index: read-cache: trivial style cleanups read-cache: fix wrong 'the_index' usage	2013-07-03 15:40:38 -07:00
Junio C Hamano	079424a2cf	Merge branch 'mh/ref-races' "git pack-refs" that races with new ref creation or deletion have been susceptible to lossage of refs under right conditions, which has been tightened up. * mh/ref-races: for_each_ref: load all loose refs before packed refs get_packed_ref_cache: reload packed-refs file when it changes add a stat_validity struct Extract a struct stat_data from cache_entry packed_ref_cache: increment refcount when locked do_for_each_entry(): increment the packed refs cache refcount refs: manage lifetime of packed refs cache via reference counting refs: implement simple transactions for the packed-refs file refs: wrap the packed refs cache in a level of indirection pack_refs(): split creation of packed refs and entry writing repack_without_ref(): split list curation and entry writing	2013-06-30 15:40:05 -07:00
Junio C Hamano	08bcd774f4	Merge branch 'rs/discard-index-discard-array' * rs/discard-index-discard-array: read-cache: free cache in discard_index read-cache: add simple performance test	2013-06-20 16:02:30 -07:00
Michael Haggerty	3861253224	add a stat_validity struct It can sometimes be useful to know whether a path in the filesystem has been updated without going to the work of opening and re-reading its content. We trust the stat() information on disk already to handle index updates, and we can use the same trick here. This patch introduces a "stat_validity" struct which encapsulates the concept of checking the stat-freshness of a file. It is implemented on top of "struct stat_data" to reuse the logic about which stat entries to trust for a particular platform, but hides the complexity behind two simple functions: check and update. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-20 15:50:17 -07:00
Michael Haggerty	c21d39d7c7	Extract a struct stat_data from cache_entry Add public functions fill_stat_data() and match_stat_data() to work with it. This infrastructure will later be used to check the validity of other types of file. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-20 15:50:17 -07:00
Junio C Hamano	6bf2227b92	Merge branch 'fc/do-not-use-the-index-in-add-to-index' * fc/do-not-use-the-index-in-add-to-index: read-cache: trivial style cleanups read-cache: fix wrong 'the_index' usage	2013-06-11 13:30:28 -07:00
René Scharfe	a0fc4db01d	read-cache: free cache in discard_index discard_cache doesn't have to free the array of cache entries, because the next call of read_cache can simply reuse it, as they all operate on the global variable the_index. discard_index on the other hand does have to free it, because it can be used e.g. with index_state variables on the stack, in which case a missing free would cause an unrecoverable leak. This patch releases the memory and removes a comment that was relevant for discard_cache but has become outdated. Since discard_cache is just a wrapper around discard_index nowadays, we lose the optimization that avoids reallocation of that array within loops of read_cache and discard_cache. That doesn't cause a performance regression for me, however (HEAD = this patch, HEAD^ = master + p0002): Test // HEAD^ HEAD ---------------\\----------------------------------------------------- 0002.1: read_ca// 1000 times 0.62(0.58+0.04) 0.61(0.58+0.02) -1.6% Suggested-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-09 17:03:01 -07:00
Felipe Contreras	c4aa3167fe	read-cache: trivial style cleanups Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-03 10:10:38 -07:00
Felipe Contreras	582eb8536b	read-cache: fix wrong 'the_index' usage We are dealing with the 'istate' index, not 'the_index'. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-03 10:10:25 -07:00
René Scharfe	21a6b9fa42	read-cache: mark cache_entry pointers const ie_match_stat and ie_modified only dereference their struct cache_entry pointers for reading. Add const to the parameter declaration here and do the same for the static helper function used by them, as it's the same there as well. This allows callers to pass in const pointers. Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-06-02 15:31:12 -07:00
Junio C Hamano	4b35b007a6	Merge branch 'lf/read-blob-data-from-index' Reduce duplicated code between convert.c and attr.c. * lf/read-blob-data-from-index: convert.c: remove duplicate code read_blob_data_from_index(): optionally return the size of blob data attr.c: extract read_index_data() as read_blob_data_from_index()	2013-04-21 18:39:45 -07:00
Lukas Fleischer	ff36682505	read_blob_data_from_index(): optionally return the size of blob data This allows for optionally getting the size of the returned data and will be used in a follow-up patch. Signed-off-by: Lukas Fleischer <git@cryptocrack.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-17 09:51:47 -07:00
Lukas Fleischer	29fb37b272	attr.c: extract read_index_data() as read_blob_data_from_index() Extract the read_index_data() function from attr.c and move it to read-cache.c; rename it to read_blob_data_from_index() and update the function signature of it to align better with index/cache API functions. This allows for reusing the function in convert.c later. Signed-off-by: Lukas Fleischer <git@cryptocrack.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-04-17 09:49:11 -07:00
Junio C Hamano	c81e2c61b3	Merge branch 'kb/name-hash' into maint-1.8.1 * kb/name-hash: name-hash.c: fix endless loop with core.ignorecase=true	2013-04-03 08:44:54 -07:00
Junio C Hamano	c044bed8f0	Merge branch 'kb/name-hash' The code to keep track of what directory names are known to Git on platforms with case insensitive filesystems can get confused upon a hash collision between these pathnames and looped forever. * kb/name-hash: name-hash.c: fix endless loop with core.ignorecase=true	2013-04-01 08:59:53 -07:00
Junio C Hamano	865e99b5fd	Merge branch 'nd/doc-index-format' Update the index format documentation to mention the v4 format. * nd/doc-index-format: update-index: list supported idx versions and their features read-cache.c: use INDEX_FORMAT_{LB,UB} in verify_hdr() index-format.txt: mention of v4 is missing in some places	2013-03-19 12:15:14 -07:00
Karsten Blees	2092678cd5	name-hash.c: fix endless loop with core.ignorecase=true With core.ignorecase=true, name-hash.c builds a case insensitive index of all tracked directories. Currently, the existing cache entry structures are added multiple times to the same hashtable (with different name lengths and hash codes). However, there's only one dir_next pointer, which gets completely messed up in case of hash collisions. In the worst case, this causes an endless loop if ce == ce->dir_next (see t7062). Use a separate hashtable and separate structures for the directory index so that each directory entry has its own next pointer. Use reference counting to track which directory entry contains files. There are only slight changes to the name-hash.c API: - new free_name_hash() used by read_cache.c::discard_index() - remove_name_hash() takes an additional index_state parameter - index_name_exists() for a directory (trailing '/') may return a cache entry that has been removed (CE_UNHASHED). This is not a problem as the return value is only used to check if the directory exists (dir.c) or to normalize casing of directory names (read-cache.c). Getting rid of cache_entry.dir_next reduces memory consumption, especially with core.ignorecase=false (which doesn't use that member at all). With core.ignorecase=true, building the directory index is slightly faster as we add / check the parent directory first (instead of going through all directory levels for each file in the index). E.g. with WebKit (~200k files, ~7k dirs), time spent in lazy_init_name_hash is reduced from 176ms to 130ms. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-02-27 23:29:04 -08:00
Nguyễn Thái Ngọc Duy	b82a7b5bbc	read-cache.c: use INDEX_FORMAT_{LB,UB} in verify_hdr() `9d22778` (read-cache.c: write prefix-compressed names in the index - 2012-04-04) defined these. Interestingly, they were not used by read-cache.c, or anywhere in that patch. They were used in builtin/update-index.c later for checking supported index versions. Use them here too. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-02-22 12:48:41 -08:00
Robin Rosenberg	c08e4d5b5c	Enable minimal stat checking Specifically the fields uid, gid, ctime, ino and dev are set to zero by JGit. Other implementations, e.g. Git in cygwin, are allegedly also somewhat incompatible with Git For Windows and on *nix platforms the resolution of the timestamps may differ. Any stat checking by git will then need to check content, which may be very slow, particularly on Windows. Since mtime and size are typically enough we should allow the user to tell git to avoid checking these fields if they are set to zero in the index. This change introduces a core.checkstat config option where the user can select to check all fields (default), or just size and the whole second part of mtime (minimal). Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-01-22 09:33:16 -08:00
Junio C Hamano	357e9c69c9	read-cache.c: mark a private file-scope symbol as static Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-09-15 22:58:21 -07:00
Junio C Hamano	3b753148b6	Merge branch 'jk/maint-null-in-trees' We do not want a link to 0{40} object stored anywhere in our objects. * jk/maint-null-in-trees: fsck: detect null sha1 in tree entries do not write null sha1s to on-disk index diff: do not use null sha1 as a sentinel value	2012-08-27 11:54:28 -07:00
Junio C Hamano	d0ae7e2e71	Merge branch 'nd/index-errno' Assignments to errno before calling system functions that used to matter in the old code were left behind after the code structure changed sufficiently to make them useless. * nd/index-errno: read_index_from: remove bogus errno assignments	2012-08-22 11:51:42 -07:00
Nguyễn Thái Ngọc Duy	57d84f8d93	read_index_from: remove bogus errno assignments These assignments come from the very first commit `e83c516` (Initial revision of "git", the information manager from hell - 2005-04-07). Back then we did not die() when errors happened so correct errno was required. Since `5d1a5c0` ([PATCH] Better error reporting for "git status" - 2005-10-01), read_index_from() learned to die rather than just return -1 and these assignments became irrelevant. Remove them. While at it, move die_errno() next to xmmap() call because it's the mmap's error code that we care about. Otherwise if close(fd); fails, it could overwrite mmap's errno. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-08-06 10:01:21 -07:00
Jeff King	4337b5856f	do not write null sha1s to on-disk index We should never need to write the null sha1 into an index entry (short of the 1 in 2^160 chance that somebody actually has content that hashes to it). If we attempt to do so, it is much more likely that it is a bug, since we use the null sha1 as a sentinel value to mean "not valid". The presence of null sha1s in the index (which can come from, among other things, "update-index --cacheinfo", or by reading a corrupted tree) can cause problems for later readers, because they cannot distinguish the literal null sha1 from its use as a sentinel value. For example, "git diff-files" on such an entry would make it appear as if it is stat-dirty, and until recently, the diff code assumed such an entry meant that we should be diffing a working tree file rather than a blob. Ideally, we would stop such entries from entering even our in-core index. However, we do sometimes legitimately add entries with null sha1s in order to represent these sentinel situations; simply forbidding them in add_index_entry breaks a lot of the existing code. However, we can at least make sure that our in-core sentinel representation never makes it to disk. To be thorough, we will test an attempt to add both a blob and a submodule entry. In the former case, we might run into problems anyway because we will be missing the blob object. But in the latter case, we do not enforce connectivity across gitlink entries, making this our only point of enforcement. The current implementation does not care which type of entry we are seeing, but testing both cases helps future-proof the test suite in case that changes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-07-29 15:13:36 -07:00
Junio C Hamano	30ea575876	Merge branch 'tg/ce-namelen-field' Split lower bits of ce_flags field and creates a new ce_namelen field in the in-core index structure. * tg/ce-namelen-field: Strip namelen out of ce_flags into a ce_namelen field	2012-07-23 20:55:21 -07:00
Junio C Hamano	8fc824f397	Merge branch 'tg/maint-cache-name-compare' Even though the index can record pathnames longer than 1<<12 bytes, in some places we were not comparing them in full, potentially replacing index entries instead of adding. * tg/maint-cache-name-compare: cache_name_compare(): do not truncate while comparing paths	2012-07-15 21:40:18 -07:00
Thomas Gummerer	b60e188c51	Strip namelen out of ce_flags into a ce_namelen field Strip the name length from the ce_flags field and move it into its own ce_namelen field in struct cache_entry. This will both give us a tiny bit of a performance enhancement when working with long pathnames and is a refactoring for more readability of the code. It enhances readability, by making it more clear what is a flag, and where the length is stored, and makes it clear which functions use stages in comparisons and which only use the length. It also makes CE_NAMEMASK private, so that users don't mistakenly write the name length in the flags. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-07-11 09:42:45 -07:00
Junio C Hamano	01388518c3	Merge branch 'tg/maint-cache-name-compare' into tg/ce-namelen-field * tg/maint-cache-name-compare: cache_name_compare(): do not truncate while comparing paths	2012-07-11 09:40:25 -07:00
Junio C Hamano	d5f53338ab	cache_name_compare(): do not truncate while comparing paths We failed to use ce_namelen() equivalent and instead only compared up to the CE_NAMEMASK bytes by mistake. Adding an overlong path that shares the same common prefix as an existing entry in the index did not add a new entry, but instead replaced the existing one, as the result. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-07-11 09:25:56 -07:00
Thomas Gummerer	68c4f6a577	Replace strlen() with ce_namelen() Replace strlen(ce->name) with ce_namelen() in a couple of places which gives us some additional bits of performance. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-07-08 19:49:34 -07:00
Junio C Hamano	d4a5d872c0	Merge branch 'jc/index-v4' Trivially shrinks the on-disk size of the index file to save both I/O and checksum overhead. The topic should give a solid base to build on further updates, with the code refactoring in its earlier parts, and the backward compatibility mechanism in its later parts. * jc/index-v4: index-v4: document the entry format unpack-trees: preserve the index file version of original update-index: upgrade/downgrade on-disk index version read-cache.c: write prefix-compressed names in the index read-cache.c: read prefix-compressed names in index on-disk version v4 read-cache.c: move code to copy incore to ondisk cache to a helper function read-cache.c: move code to copy ondisk to incore cache to a helper function read-cache.c: report the header version we do not understand read-cache.c: make create_from_disk() report number of bytes it consumed read-cache.c: allow unaligned mapping of the index file cache.h: hide on-disk index details varint: make it available outside the context of pack	2012-05-02 13:51:13 -07:00
Junio C Hamano	9d227781b6	read-cache.c: write prefix-compressed names in the index Teach the code to write the index in the v4 on-disk format. Record the format version of the on-disk index we read from in the index_state, and use the format when writing the new index out. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-04-04 09:57:49 -07:00
Junio C Hamano	6c9cd161d9	read-cache.c: read prefix-compressed names in index on-disk version v4 Because the entries are sorted by path, adjacent entries in the index tend to share the leading components of them, and it makes sense to only store the differences in later entries. In the v4 on-disk format of the index, each on-disk cache entry stores the number of bytes to be stripped from the end of the previous name, and the bytes to append to the result, to come up with its name. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-04-03 16:24:46 -07:00
Junio C Hamano	f136f7bfe8	read-cache.c: move code to copy incore to ondisk cache to a helper function This makes the change in a later patch look less scary. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-04-03 16:24:46 -07:00
Junio C Hamano	3fc22b5331	read-cache.c: move code to copy ondisk to incore cache to a helper function This makes the change in a later patch look less scary. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-04-03 16:24:46 -07:00
Junio C Hamano	0136bac9b8	read-cache.c: report the header version we do not understand Instead of just saying "bad index version", report the value we read from the disk. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-04-03 16:24:45 -07:00
Junio C Hamano	936f53d055	read-cache.c: make create_from_disk() report number of bytes it consumed The function is the one that is reading from the data stream. It only is natural to make it responsible for reporting this number, not the caller. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-04-03 16:24:45 -07:00
Junio C Hamano	d60c49c2d7	read-cache.c: allow unaligned mapping of the index file Both the on-disk formats v2 and v3 pad the "name" field to a multiple of eight to make sure that various quantities in network long/short type can be accessed with ntohl/ntohs without having to worry about alignment, but this forces us to waste disk I/O bandwidth. Introduce ntoh_s()/ntoh_l() macros that the callers can use as if they were the regular ntohs()/ntohl() on a field that may not be aligned correctly. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-04-03 16:24:45 -07:00
Junio C Hamano	db3b313c84	cache.h: hide on-disk index details The on-disk format of the index file is a detail whose implementation is neatly encapsulated in read-cache.c; there is no need to expose it to the general public that includes the cache.h header file. Also add a prominent mark to read-cache.c to delineate the parts that deal with the index file I/O routines from the remainder of the file. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-04-03 16:24:45 -07:00
Jeff King	f8582cad8d	make is_empty_blob_sha1 available everywhere The read-cache implementation defines this static function, but it is a generally useful concept in git. Let's give the empty blob the same treatment as the empty tree, providing both hex and binary forms of the sha1. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-03-23 13:52:13 -07:00
Junio C Hamano	3d1f148c33	refresh_index: do not show unmerged path that is outside pathspec When running "git add --refresh <pathspec>", we incorrectly showed the path that is unmerged even if it is outside the specified pathspec, even though we did honor pathspec and refreshed only the paths that matched. Note that this change does not affect "git update-index --refresh"; for hysterical raisins, it does not take a pathspec (it takes real paths) and more importantly its command line options are parsed and executed one by one as they are encountered, so "git update-index --refresh foo" means "first refresh the index, and then update the entry 'foo' by hashing the contents in file 'foo'", not "refresh only entry 'foo'". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-02-17 10:11:05 -08:00
Junio C Hamano	ef87690b27	Merge branch 'rs/allocate-cache-entry-individually' * rs/allocate-cache-entry-individually: cache.h: put single NUL at end of struct cache_entry read-cache.c: allocate index entries individually Conflicts: read-cache.c	2011-12-09 13:36:56 -08:00
Jeff King	73b7eae60c	refresh_index: make porcelain output more specific If you have a deleted file and a porcelain refreshes the cache, we print: Unstaged changes after reset: M file This is technically correct, in that the file is modified, but it's friendlier to the user if we further differentiate the case of a deleted file (especially because this output looks a lot like "diff --name-status", which would also make the distinction). Similarly, we can distinguish typechanges ("T") and intent-to-add files ("A"), both of which appear as just "M" in the current output. The plumbing output for all cases remains "needs update" for historical compatibility. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-11-18 11:55:58 -08:00
Jeff King	4bd4e73093	refresh_index: rename format variables When refreshing the index, for modified (or unmerged) files we will print "needs update" (or "needs merge") for plumbing, or a line similar to the output from "diff --name-status" for porcelain. The variables holding which type of message to show are named after the plumbing messages. However, as we begin to differentiate more cases at the porcelain level (with the plumbing message staying the same), that naming scheme will become awkward. Instead, name the variables after which case we found (modified or unmerged), not what we will output. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-11-18 11:55:05 -08:00
Jeff King	d05e697010	read-cache: let refresh_cache_ent pass up changed flags This will enable refresh_cache to differentiate more cases of modification (such as typechange) when telling the user what isn't fresh. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-11-18 11:53:46 -08:00
René Scharfe	debed2a629	read-cache.c: allocate index entries individually The code to estimate the in-memory size of the index based on its on-disk representation is subtly wrong for certain architecture-dependent struct layouts. Instead of fixing it, replace the code to keep the index entries in a single large block of memory and allocate each entry separately instead. This is both simpler and more flexible, as individual entries can now be freed. Actually using that added flexibility is left for a later patch. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-10-26 15:25:59 -07:00
René Scharfe	8f41c07f90	read-cache.c: fix index memory allocation estimate_cache_size() tries to guess how much memory is needed for the in-memory representation of an index file. It does that by using the file size, the number of entries and the difference of the sizes of the on-disk and in-memory structs -- without having to check the length of the name of each entry, which varies for each entry, but their sums are the same no matter the representation. Except there can be a difference. First of all, the size is really calculated by ce_size and ondisk_ce_size based on offsetof(..., name), not sizeof, which can be different. And entries are padded with 1 to 8 NULs at the end (after the variable name) to make their total length a multiple of eight. So in order to allocate enough memory to hold the index, change the delta calculation to be based on offsetof(..., name) and round up to the next multiple of eight. On a 32-bit Linux, this delta was used before: sizeof(struct cache_entry) == 72 sizeof(struct ondisk_cache_entry) == 64 --- 8 The actual difference for an entry with a filename length of one was, however (the definitions are in cache.h): offsetof(struct cache_entry, name) == 72 offsetof(struct ondisk_cache_entry, name) == 62 ce_size == (72 + 1 + 8) & ~7 == 80 ondisk_ce_size == (62 + 1 + 8) & ~7 == 64 --- 16 So eight bytes less had been allocated for such entries. The new formula yields the correct delta: (72 - 62 + 7) & ~7 == 16 Reported-by: John Hsing <tsyj2007@gmail.com> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-10-26 14:35:16 -07:00
Junio C Hamano	1952e102b7	Merge branch 'maint' * maint: whitespace: have SP on both sides of an assignment "=" update-ref: whitespace fix	2011-08-25 16:00:07 -07:00
Junio C Hamano	cd2b8ae983	whitespace: have SP on both sides of an assignment "=" I've deliberately excluded the borrowed code in compat/nedmalloc directory. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-08-25 14:47:07 -07:00
Junio C Hamano	033c2dc436	Merge branch 'ef/maint-win-verify-path' * ef/maint-win-verify-path: verify_dotfile(): do not assume '/' is the path separator verify_path(): simplify check at the directory boundary verify_path: consider dos drive prefix real_path: do not assume '/' is the path separator A Windows path starting with a backslash is absolute	2011-06-29 17:09:17 -07:00
Theo Niessink	e0f530ff8a	verify_dotfile(): do not assume '/' is the path separator verify_dotfile() currently assumes that the path separator is '/', but on Windows it can also be '\\', so use is_dir_sep() instead. Signed-off-by: Theo Niessink <theo@taletn.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-06-08 16:34:38 -07:00
Junio C Hamano	3bdf09c7f5	verify_path(): simplify check at the directory boundary We simply want to say "At a directory boundary, be careful with a name that begins with a dot, forbid a name that ends with the boundary character or has duplicated boundary characters". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-06-07 12:22:51 -07:00
Erik Faye-Lund	56948cb6aa	verify_path: consider dos drive prefix If someone manages to create a repo with a 'C:' entry in the root-tree, files can be written outside of the working-dir. This opens up a can-of-worms of exploits. Fix it by explicitly checking for a dos drive prefix when verifying a path. While we're at it, make sure that paths beginning with '\' are considered absolute as well. Noticed-by: Theo Niessink <theo@taletn.com> Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-05-27 10:59:18 -07:00
Junio C Hamano	c4ce46fc7a	index_fd(): turn write_object and format_check arguments into one flag The "format_check" parameter tucked after the existing parameters is too ugly an afterthought to live in any reasonable API. Combine it with the other boolean parameter "write_object" into a single "flags" parameter. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-05-09 11:58:19 -07:00
Junio C Hamano	44ec754dc7	Merge branch 'jc/index-update-if-able' into maint * jc/index-update-if-able: update $GIT_INDEX_FILE when there are racily clean entries diff/status: refactor opportunistic index update	2011-04-03 12:33:05 -07:00
Junio C Hamano	149971badc	Merge branch 'jc/index-update-if-able' * jc/index-update-if-able: update $GIT_INDEX_FILE when there are racily clean entries diff/status: refactor opportunistic index update	2011-03-26 20:13:16 -07:00
Junio C Hamano	483fbe2b7c	update $GIT_INDEX_FILE when there are racily clean entries Traditional "opportunistic index update" done by read-only "diff" and "status" was about updating cached lstat(2) information in the index for the next round. We missed another obvious optimization opportunity: when there are racily clean entries that will cease to be racily clean by updating $GIT_INDEX_FILE. Detect that case and write $GIT_INDEX_FILE out to give it a newer timestamp. Noticed by Lasse Makholm by stracing "git status" in a fresh checkout and counting the number of open(2) calls. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-03-21 14:49:46 -07:00
Junio C Hamano	ccdc4ec304	diff/status: refactor opportunistic index update When we had to refresh the index internally before running diff or status, we opportunistically updated the $GIT_INDEX_FILE so that later invocation of git can use the lstat(2) we already did in this invocation. Make them share a helper function to do so. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-03-21 12:43:10 -07:00
Junio C Hamano	fc7ae9c156	Merge branch 'nd/hash-object-sanity' * nd/hash-object-sanity: Make hash-object more robust against malformed objects Conflicts: cache.h	2011-02-27 21:58:30 -08:00
Junio C Hamano	d5c87a802d	Merge branch 'nd/struct-pathspec' * nd/struct-pathspec: (22 commits) t6004: add pathspec globbing test for log family t7810: overlapping pathspecs and depth limit grep: drop pathspec_matches() in favor of tree_entry_interesting() grep: use writable strbuf from caller for grep_tree() grep: use match_pathspec_depth() for cache/worktree grepping grep: convert to use struct pathspec Convert ce_path_match() to use match_pathspec_depth() Convert ce_path_match() to use struct pathspec struct rev_info: convert prune_data to struct pathspec pathspec: add match_pathspec_depth() tree_entry_interesting(): optimize wildcard matching when base is matched tree_entry_interesting(): support wildcard matching tree_entry_interesting(): fix depth limit with overlapping pathspecs tree_entry_interesting(): support depth limit tree_entry_interesting(): refactor into separate smaller functions diff-tree: convert base+baselen to writable strbuf glossary: define pathspec Move tree_entry_interesting() to tree-walk.c and export it tree_entry_interesting(): remove dependency on struct diff_options Convert struct diff_options to use struct pathspec ...	2011-02-27 21:17:36 -08:00
Jonathan Nieder	046613c546	update-index --refresh --porcelain: add missing const Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-02-22 16:51:21 -08:00
Nguyễn Thái Ngọc Duy	c879daa237	Make hash-object more robust against malformed objects Commits, trees and tags have structure. Don't let users feed git with malformed ones. Sooner or later git will die() when encountering them. Note that this patch does not check semantics. A tree that points to non-existent objects is perfectly OK (and should be so, users may choose to add commit first, then its associated tree for example). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-02-07 15:05:25 -08:00
Nguyễn Thái Ngọc Duy	898bbd9fb4	Convert ce_path_match() to use match_pathspec_depth() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-02-03 14:08:30 -08:00
Nguyễn Thái Ngọc Duy	eb9cb55b94	Convert ce_path_match() to use struct pathspec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-02-03 14:08:30 -08:00
Junio C Hamano	5e738ae820	Merge branch 'jj/icase-directory' * jj/icase-directory: Support case folding in git fast-import when core.ignorecase=true Support case folding for git add when core.ignorecase=true Add case insensitivity support when using git ls-files Add case insensitivity support for directories when using git status Case insensitivity support for .gitignore via core.ignorecase Add string comparison functions that respect the ignore_case variable. Makefile & configure: add a NO_FNMATCH_CASEFOLD flag Makefile & configure: add a NO_FNMATCH flag Conflicts: Makefile config.mak.in configure.ac fast-import.c	2010-12-03 16:10:34 -08:00
Joshua Jensen	dc1ae70487	Support case folding for git add when core.ignorecase=true When MyDir/ABC/filea.txt is added to Git, the disk directory MyDir/ABC/ is renamed to mydir/aBc/, and then mydir/aBc/fileb.txt is added, the index will contain MyDir/ABC/filea.txt and mydir/aBc/fileb.txt. Although the earlier portions of this patch series account for those differences in case, this patch makes the pathing consistent by folding the case of newly added files against the first file added with that path. In read-cache.c's add_to_index(), the index_name_exists() support used for git status's case insensitive directory lookups is used to find the proper directory case according to what the user already checked in. That is, MyDir/ABC/'s case is used to alter the stored path for fileb.txt to MyDir/ABC/fileb.txt (instead of mydir/aBc/fileb.txt). This is especially important when cloning a repository to a case sensitive file system. MyDir/ABC/ and mydir/aBc/ exist in the same directory on a Windows machine, but on Linux, the files exist in two separate directories. The update to add_to_index(), in effect, treats a Windows file system as case sensitive by making path case consistent. Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-10-06 11:19:59 -07:00
Jonathan Nieder	59efba64ac	core: Stop leaking ondisk_cache_entrys Noticed with valgrind. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-08-11 09:57:43 -07:00
Shawn O. Pearce	b659b49bb0	Correct spelling of 'REUC' extension The new dircache extension CACHE_EXT_RESOLVE_UNDO, whose value is 0x52455543, is actually the ASCII sequence 'REUC', not the ASCII sequence 'REUN'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-02-02 09:54:34 -08:00
Junio C Hamano	125fd98434	Make ce_uptodate() trustworthy again The rule has always been that a cache entry that is ce_uptodate(ce) means that we already have checked the work tree entity and we know there is no change in the work tree compared to the index, and nobody should have to double check. Note that false ce_uptodate(ce) does not mean it is known to be dirty---it only means we don't know if it is clean. There are a few codepaths (refresh-index and preload-index are among them) that mark a cache entry as up-to-date based solely on the return value from ie_match_stat(); this function uses lstat() to see if the work tree entity has been touched, and for a submodule entry, if its HEAD points at the same commit as the commit recorded in the index of the superproject (a submodule that is not even cloned is considered clean). A submodule is no longer considered unmodified merely because its HEAD matches the index of the superproject these days, in order to prevent people from forgetting to commit in the submodule and updating the superproject index with the new submodule commit, before committing the state in the superproject. However, the patch to do so didn't update the codepath that marks cache entries up-to-date based on the updated definition and instead worked around it by saying "we don't trust the return value of ce_uptodate() for submodules." This makes ce_uptodate() trustworthy again by not marking submodule entries up-to-date. The next step _could_ be to introduce a few "in-core" flag bits to cache_entry structure to record "this entry is _known_ to be dirty", call is_submodule_modified() from ie_match_stat(), and use these new bits to avoid running this rather expensive check more than once, but that can be a separate patch. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-01-24 00:15:29 -08:00
Linus Torvalds	fb7d3f32b2	Remove diff machinery dependency from read-cache Exal Sibeaz pointed out that some git files are way too big, and that add_files_to_cache() brings in all the diff machinery to any git binary that needs the basic git SHA1 object operations from read-cache.c. Which is pretty much all of them. It's doubly silly, since add_files_to_cache() is only used by builtin programs (add, checkout and commit), so it's fairly easily fixed by just moving the thing to builtin-add.c, and avoiding the dependency entirely. I initially argued to Exal that it would probably be best to try to depend on smart compilers and linkers, but after spending some time trying to make -ffunction-sections work and giving up, I think Exal was right, and the fix is to just do some trivial cleanups like this. This trivial cleanup results in pretty stunning file size differences. The diff machinery really is mostly used by just the builtin programs, and you have things like these trivial before-and-after numbers: -rwxr-xr-x 1 torvalds torvalds 1727420 2010-01-21 10:53 git-hash-object -rwxrwxr-x 1 torvalds torvalds 940265 2010-01-21 11:16 git-hash-object Now, I'm not saying that 940kB is good either, but that's mostly all the debug information - you can see the real code with 'size': text data bss dec hex filename 418675 3920 127408 550003 86473 git-hash-object (before) 230650 2288 111728 344666 5425a git-hash-object (after) ie we have a nice 24% size reduction from this trivial cleanup. It's not just that one file either. I get: [torvalds@nehalem git]$ du -s /home/torvalds/libexec/git-core 45640 /home/torvalds/libexec/git-core (before) 33508 /home/torvalds/libexec/git-core (after) so we're talking 12MB of diskspace here. 
(Of course, stripping all the binaries brings the 33MB down to 9MB, so the whole debug information thing is still the bulk of it all, but that's a separate issue entirely) Now, I'm sure there are other things we should do, and changing our compiler flags from -O2 to -Os would bring the text size down by an additional almost 20%, but this thing Exal pointed out seems to be some good low-hanging fruit. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-01-21 17:05:13 -08:00
Junio C Hamano	6751e0471d	Merge branch 'jc/cache-unmerge' * jc/cache-unmerge: rerere forget path: forget recorded resolution rerere: refactor rerere logic to make it independent from I/O rerere: remove silly 1024-byte line limit resolve-undo: teach "update-index --unresolve" to use resolve-undo info resolve-undo: "checkout -m path" uses resolve-undo information resolve-undo: allow plumbing to clear the information resolve-undo: basic tests resolve-undo: record resolved conflicts in a new index extension section builtin-merge.c: use standard active_cache macros Conflicts: builtin-ls-files.c builtin-merge.c builtin-rerere.c	2010-01-20 14:46:35 -08:00
Junio C Hamano	56eb8b43eb	Merge branch 'jc/symbol-static' * jc/symbol-static: date.c: mark file-local function static Replace parse_blob() with an explanatory comment symlinks.c: remove unused functions object.c: remove unused functions strbuf.c: remove unused function sha1_file.c: remove unused function mailmap.c: remove unused function utf8.c: mark file-local function static submodule.c: mark file-local function static quote.c: mark file-local function static remote-curl.c: mark file-local function static read-cache.c: mark file-local functions static parse-options.c: mark file-local function static entry.c: mark file-local function static http.c: mark file-local functions static pretty.c: mark file-local function static builtin-rev-list.c: mark file-local function static bisect.c: mark file-local function static	2010-01-20 14:37:25 -08:00
Junio C Hamano	dc96c5ee70	Merge branch 'cc/reset-more' * cc/reset-more: t7111: check that reset options work as described in the tables Documentation: reset: add some missing tables Fix bit assignment for CE_CONFLICTED "reset --merge": fix unmerged case reset: use "unpack_trees()" directly instead of "git read-tree" reset: add a few tests for "git reset --merge" Documentation: reset: add some tables to describe the different options reset: improve mixed reset error message when in a bare repo	2010-01-13 11:58:56 -08:00
Junio C Hamano	73d66323ac	Merge branch 'nd/sparse' * nd/sparse: (25 commits) t7002: test for not using external grep on skip-worktree paths t7002: set test prerequisite "external-grep" if supported grep: do not do external grep on skip-worktree entries commit: correctly respect skip-worktree bit ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID tests: rename duplicate t1009 sparse checkout: inhibit empty worktree Add tests for sparse checkout read-tree: add --no-sparse-checkout to disable sparse checkout support unpack-trees(): ignore worktree check outside checkout area unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout unpack-trees.c: generalize verify_* functions unpack-trees(): add CE_WT_REMOVE to remove on worktree alone Introduce "sparse checkout" dir.c: export excluded_1() and add_excludes_from_file_1() excluded_1(): support exclude files in index unpack-trees(): carry skip-worktree bit over in merged_entry() Read .gitignore from index if it is skip-worktree Avoid writing to buffer in add_excludes_from_file_1() ... Conflicts: .gitignore Documentation/config.txt Documentation/git-update-index.txt Makefile entry.c t/t7002-grep.sh	2010-01-13 11:58:34 -08:00
Junio C Hamano	87b29e5a5a	read-cache.c: mark file-local functions static Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-01-12 01:06:08 -08:00
Junio C Hamano	e11d7b5969	"reset --merge": fix unmerged case Commit `9e8ecea` (Add 'merge' mode to 'git reset', 2008-12-01) disallowed "git reset --merge" when there were unmerged entries. But it wished if unmerged entries were reset as if --hard (instead of --merge) has been used. This makes sense because all "mergy" operations make sure that any path involved in the merge does not have local modifications before starting, so resetting such a path away won't lose any information. The previous commit changed the behavior of --merge to accept resetting unmerged entries if they are reset to a different state than HEAD, but it did not reset the changes in the work tree, leaving the conflict markers in the resulting file in the work tree. Fix it by doing three things: - Update the documentation to match the wish of original "reset --merge" better, namely, "An unmerged entry is a sign that the path didn't have any local modification and can be safely reset to whatever the new HEAD records"; - Update read_index_unmerged(), which reads the index file into the cache while dropping any higher-stage entries down to stage #0, not to copy the object name from the higher stage entry. The code used to take the object name from a stage entry ("base" if you happened to have stage #1, or "ours" if both sides added, etc.), which essentially meant that you are getting random results depending on what the merge did. The _only_ reason we want to keep a previously unmerged entry in the index at stage #0 is so that we don't forget the fact that we have corresponding file in the work tree in order to be able to remove it when the tree we are resetting to does not have the path. In order to differentiate such an entry from ordinary cache entry, the cache entry added by read_index_unmerged() is marked as CE_CONFLICTED. - Update merged_entry() and deleted_entry() so that they pay attention to cache entries marked as CE_CONFLICTED. 
They are previously unmerged entries, and the files in the work tree that correspond to them are reset away by oneway_merge() to the version from the tree we are resetting to. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-01-03 16:01:05 -08:00
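
Commit 6c9cd161d9 above describes how a v4 on-disk entry recovers its name: strip some bytes from the end of the previous entry's name, then append the stored bytes. A minimal sketch of that step follows. The function name `v4_expand_name` and its signature are hypothetical and are not taken from read-cache.c, and the sketch takes the strip count as a plain integer rather than decoding it from the on-disk encoding.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not git's API): rebuild an entry name from the
 * previous entry's name. `strip` bytes are dropped from the end of
 * `prev`, then `suffix` is appended. Returns 0 on success, or -1 if
 * the strip count exceeds the previous name's length or the result
 * (including its NUL) does not fit in `out`.
 */
int v4_expand_name(const char *prev, size_t strip,
		   const char *suffix, char *out, size_t outsz)
{
	size_t prev_len = strlen(prev);
	size_t suffix_len = strlen(suffix);
	size_t keep;

	if (strip > prev_len)
		return -1;
	keep = prev_len - strip;
	if (keep + suffix_len + 1 > outsz)
		return -1;

	memcpy(out, prev, keep);                    /* shared leading part */
	memcpy(out + keep, suffix, suffix_len + 1); /* new tail, with NUL */
	return 0;
}
```

For example, with a previous name of "Documentation/git-add.txt", a strip count of 7 and the suffix "am.txt", the helper produces "Documentation/git-am.txt".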

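The size arithmetic quoted in commit 8f41c07f90 above can be checked with a short sketch. The two function names are hypothetical. The offsets 72 and 62 are the 32-bit Linux values from the commit message and will differ on other platforms.

```c
#include <stddef.h>

/*
 * Size of one entry as the commit message computes it: name offset
 * plus name length plus 8, rounded down to a multiple of eight. This
 * pads the name with 1 to 8 NULs.
 */
size_t padded_entry_size(size_t name_offset, size_t name_len)
{
	return (name_offset + name_len + 8) & ~(size_t)7;
}

/*
 * Per-entry delta per the corrected formula: the difference of the
 * two name offsets, rounded up to the next multiple of eight.
 */
size_t per_entry_delta(size_t incore_offset, size_t ondisk_offset)
{
	return (incore_offset - ondisk_offset + 7) & ~(size_t)7;
}
```

With the commit's numbers, `padded_entry_size(72, 1)` is 80 and `padded_entry_size(62, 1)` is 64, a difference of 16, which matches `per_entry_delta(72, 62)`. For other name lengths the actual difference can be smaller (8 for a three-byte name), so in this sketch the rounded-up delta acts as a per-entry upper bound.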