mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-19 23:44:50 +01:00

Author	SHA1	Message	Date
Junio C Hamano	380a742679	Merge branch 'lt/case-insensitive' * lt/case-insensitive: Make git-add behave more sensibly in a case-insensitive environment When adding files to the index, add support for case-independent matches Make unpack-tree update removed files before any updated files Make branch merging aware of underlying case-insensitive filsystems Add 'core.ignorecase' option Make hash_name_lookup able to do case-independent lookups Make "index_name_exists()" return the cache_entry it found Move name hashing functions into a file of its own Make unpack_trees_options bit flags actual bitfields	2008-05-10 18:14:28 -07:00
Junio C Hamano	31a3c6bb45	Merge branch 'db/learn-HEAD' * db/learn-HEAD: Make ls-remote http://... list HEAD, like for git://... Make walker.fetch_ref() take a struct ref.	2008-05-08 20:06:23 -07:00
Junio C Hamano	e2e2defc14	Merge branch 'lh/git-file' * lh/git-file: Teach GIT-VERSION-GEN about the .git file Teach git-submodule.sh about the .git file Teach resolve_gitlink_ref() about the .git file Add platform-independent .git "symlink"	2008-05-05 19:16:16 -07:00
Heikki Orsila	0104ca09e3	Make read_in_full() and write_in_full() consistent with xread() and xwrite() xread() and xwrite() return ssize_t values as their native POSIX counterparts read(2) and write(2). To be consistent, read_in_full() and write_in_full() should also return ssize_t values. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-04-29 23:11:57 -07:00
Daniel Barkalow	be885d96fe	Make ls-remote http://... list HEAD, like for git://... This makes a struct ref able to represent a symref, and makes http.c able to recognize one, and makes transport.c look for "HEAD" as a ref in the list, and makes it dereference symrefs for the resulting ref, if any. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-04-26 17:36:18 -07:00
Heikki Orsila	06cbe85503	Make core.sharedRepository more generic git init --shared=0xxx, where '0xxx' is an octal number, will create a repository with file modes set to '0xxx'. Users with a safe umask value (0077) can use this option to force file modes. For example, '0640' is a group-readable but not group-writable regardless of user's umask value. Values compatible with old Git versions are written as they were before, for compatibility reasons. That is, "1" for "group" and "2" for "everybody". "git config core.sharedRepository 0xxx" is also handled. Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-04-16 18:23:54 -07:00
Junio C Hamano	a53f2ec617	git_config_bool_or_int() This new function can be used by config parsers to tell if a variable is simply set, set to 1, or set to "true". Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-04-12 18:38:40 -07:00
Lars Hjemli	b44ebb19e3	Add platform-independent .git "symlink" This patch allows .git to be a regular textfile containing the path of the real git directory (prefixed with "gitdir: "), which can be useful on platforms lacking support for real symlinks. Signed-off-by: Lars Hjemli <hjemli@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-04-09 01:22:50 -07:00
Linus Torvalds	1102952b45	Make git-add behave more sensibly in a case-insensitive environment This expands on the previous patch, and allows "git add" to sanely handle a filename that has changed case, keeping the case in the index constant, and avoiding aliases. In particular, if you have an index entry called "File", but the checked-out tree is case-corrupted and has an entry called "file" instead, doing a git add . (or naming "file" explicitly) will automatically notice that we have an alias, and will replace the name "file" with the existing index capitalization (ie "File"). However, if we actually have both a file called "File" and one called "file", and they don't have the same lstat() information (ie we're on a case-sensitive filesystem but have the "core.ignorecase" flag set), we will error out if we try to add them both. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-04-09 01:22:25 -07:00
Linus Torvalds	0a9b88b7de	Add 'core.ignorecase' option ..and start using it for directory entry traversal (ie "git status" will not consider entries that match an existing entry case-insensitively to be a new file) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-04-09 01:22:25 -07:00
Linus Torvalds	cd2fef59ed	Make hash_name_lookup able to do case-independent lookups Right now nobody uses it, but "index_name_exists()" gets a flag so you can enable it on a case-by-case basis. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-04-09 01:22:25 -07:00
Linus Torvalds	df292c791a	Make "index_name_exists()" return the cache_entry it found This allows verify_absent() in unpack_trees() to use the hash chains rather than looking it up using the binary search. Perhaps more importantly, it's also going to be useful for the next phase, where we actually start looking at the cache entry when we do case-insensitive lookups and checking the result. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-04-09 01:22:25 -07:00
Linus Torvalds	96872bc200	Move name hashing functions into a file of its own It's really totally separate functionality, and if we want to start doing case-insensitive hash lookups, I'd rather do it when it's separated out. It also renames "remove_index_entry()" to "remove_name_hash()", because that really describes the thing better. It doesn't actually remove the index entry, that's done by "remove_index_entry_at()", which is something very different, despite the similarity in names. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-04-09 01:22:25 -07:00
Junio C Hamano	b85997d14d	Merge branch 'lt/unpack-trees' * lt/unpack-trees: unpack_trees(): fix diff-index regression. traverse_trees_recursive(): propagate merge errors up unpack_trees(): minor memory leak fix in unused destination index Make 'unpack_trees()' have a separate source and destination index Make 'unpack_trees()' take the index to work on as an argument Add 'const' where appropriate to index handling functions Fix tree-walking compare_entry() in the presense of --prefix Move 'unpack_trees()' over to 'traverse_trees()' interface Make 'traverse_trees()' traverse conflicting DF entries in parallel Add return value to 'traverse_tree()' callback Make 'traverse_tree()' use linked structure rather than 'const char *base' Add 'df_name_compare()' helper function	2008-03-11 22:13:44 -07:00
Junio C Hamano	5d921e2931	Merge branch 'jc/cherry-pick' (early part) * 'jc/cherry-pick' (early part): expose a helper function peel_to_type(). merge-recursive: split low-level merge functions out. Conflicts: Makefile builtin-merge-recursive.c sha1_name.c	2008-03-11 02:05:12 -07:00
Linus Torvalds	d1f128b050	Add 'const' where appropriate to index handling functions This is in an effort to make the source index of 'unpack_trees()' as being const, and thus making the compiler help us verify that we only access it for reading. The constification also extended to some of the hashing helpers that get called indirectly. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-03-09 00:43:48 -08:00
Linus Torvalds	0ab9e1e8cd	Add 'df_name_compare()' helper function This new helper is identical to base_name_compare(), except it compares conflicting directory/file entries as equal in order to help handling DF conflicts (thus the name). Note that while a directory name compares as equal to a regular file with the new helper, they then individually compare _differently_ to a filename that has a dot after the basename (because '\0' < '.' < '/'). So a directory called "foo/" will compare equal to a file "foo", even though "foo.c" will compare after "foo" and before "foo/" This will be used by routines that want to traverse the git namespace but then handle conflicting entries together when possible. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-03-09 00:43:46 -08:00
Junio C Hamano	eadbcd498a	Merge branch 'mk/maint-parse-careful' * mk/maint-parse-careful: receive-pack: use strict mode for unpacking objects index-pack: introduce checking mode unpack-objects: prevent writing of inconsistent objects unpack-object: cache for non written objects add common fsck error printing function builtin-fsck: move common object checking code to fsck.c builtin-fsck: reports missing parent commits Remove unused object-ref code builtin-fsck: move away from object-refs to fsck_walk add generic, type aware object chain walker Conflicts: Makefile builtin-fsck.c	2008-03-02 15:11:07 -08:00
Junio C Hamano	60b188a984	Merge branch 'js/branch-track' * js/branch-track: doc: documentation update for the branch track changes branch: optionally setup branch.*.merge from upstream local branches Conflicts: Documentation/config.txt Documentation/git-branch.txt Documentation/git-checkout.txt builtin-branch.c cache.h t/t7201-co.sh	2008-02-27 13:02:57 -08:00
Junio C Hamano	5a4d707a6d	Merge branch 'db/checkout' * db/checkout: (21 commits) checkout: error out when index is unmerged even with -m checkout: show progress when checkout takes long time while switching branches Add merge-subtree back checkout: updates to tracking report builtin-checkout.c: Remove unused prefix arguments in switch_branches path checkout: work from a subdirectory checkout: tone down the "forked status" diagnostic messages Clean up reporting differences on branch switch builtin-checkout.c: fix possible usage segfault checkout: notice when the switched branch is behind or forked Build in checkout Move code to clean up after a branch change to branch.c Library function to check for unmerged index entries Use diff -u instead of diff in t7201 Move create_branch into a library file Build-in merge-recursive Add "skip_unmerged" option to unpack_trees. Discard "deleted" cache entries after using them to update the working tree Send unpack-trees debugging output to stderr Add flag to make unpack_trees() not print errors. ... Conflicts: Makefile	2008-02-27 12:53:26 -08:00
Junio C Hamano	d87aa32935	Merge branch 'jk/help-alias' * jk/help-alias: help: respect aliases make alias lookup a public, procedural function help: use parseopt	2008-02-27 11:55:43 -08:00
Junio C Hamano	2db511fdbd	Merge branch 'maint' * maint: Documentation/git-am.txt: Pass -r in the example invocation of rm -f .dotest timezone_names[]: fixed the tz offset for New Zealand. filter-branch documentation: non-zero exit status in command abort the filter rev-parse: fix potential bus error with --parseopt option spec handling Use a single implementation and API for copy_file() Documentation/git-filter-branch: add a new msg-filter example Correct fast-export file mode strings to match fast-import standard	2008-02-26 00:14:22 -08:00
Martin Koegler	355885d531	add generic, type aware object chain walker The requirements are: * it may not crash on NULL pointers * a callback function is needed, as index-pack/unpack-objects need to do different things * the type information is needed to check the expected <-> real type and print better error messages Signed-off-by: Martin Koegler <mkoegler@auto.tuwien.ac.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-25 23:57:34 -08:00
Daniel Barkalow	1468bd4783	Use a single implementation and API for copy_file() Originally by Kristian Hï¿œgsberg; I fixed the conversion of rerere, which had a different API. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-25 13:06:49 -08:00
Jeff King	94351118c0	make alias lookup a public, procedural function This converts git_config_alias to the public alias_lookup function. Because of the nature of our config parser, we still have to rely on setting static data. However, that interface is wrapped so that you can just say value = alias_lookup(key); Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-24 18:31:49 -08:00
Junio C Hamano	e38f892d18	Merge branch 'jc/apply-whitespace' * jc/apply-whitespace: ws_fix_copy(): move the whitespace fixing function to ws.c apply: do not barf on patch with too large an offset core.whitespace: cr-at-eol git-apply --whitespace=fix: fix whitespace fuzz introduced by previous run builtin-apply.c: pass ws_rule down to match_fragment() builtin-apply.c: move copy_wsfix() function a bit higher. builtin-apply.c: do not feed copy_wsfix() leading '+' builtin-apply.c: simplify calling site to apply_line() builtin-apply.c: clean-up apply_one_fragment() builtin-apply.c: mark common context lines in lineinfo structure. builtin-apply.c: optimize match_beginning/end processing a bit. builtin-apply.c: make it more line oriented builtin-apply.c: push match-beginning/end logic down builtin-apply.c: restructure "offset" matching builtin-apply.c: refactor small part that matches context	2008-02-24 17:23:17 -08:00
Junio C Hamano	fe3403c320	ws_fix_copy(): move the whitespace fixing function to ws.c This is used by git-apply but we can use it elsewhere by slightly generalizing it. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-23 16:59:16 -08:00
Linus Torvalds	eb7a2f1d50	Use helper function for copying index entry information We used to just memcpy() the index entry when we copied the stat() and SHA1 hash information, which worked well enough back when the index entry was just an exact bit-for-bit representation of the information on disk. However, these days we actually have various management information in the cache entry too, and we should be careful to not overwrite it when we copy the stat information from another index entry. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-22 21:24:47 -08:00
Linus Torvalds	d070e3a31b	Name hash fixups: export (and rename) remove_hash_entry This makes the name hash removal function (which really just sets the bit that disables lookups of it) available to external routines, and makes read_cache_unmerged() use it when it drops an unmerged entry from the index. It's renamed to remove_index_entry(), and we drop the (unused) 'istate' argument. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-22 21:24:47 -08:00
Linus Torvalds	a22c637124	Fix name re-hashing semantics We handled the case of removing and re-inserting cache entries badly, which is something that merging commonly needs to do (removing the different stages, and then re-inserting one of them as the merged state). We even had a rather ugly special case for this failure case, where replace_index_entry() basically turned itself into a no-op if the new and the old entries were the same, exactly because the hash routines didn't handle it on their own. So what this patch does is to not just have the UNHASHED bit, but a HASHED bit too, and when you insert an entry into the name hash, that involves: - clear the UNHASHED bit, because now it's valid again for lookup (which is really all that UNHASHED meant) - if we're being lazy, we're done here (but we still want to clear the UNHASHED bit regardless of lazy mode, since we can become unlazy later, and so we need the UNHASHED bit to always be set correctly, even if we never actually insert the entry into the hash list) - if it was already hashed, we just leave it on the list - otherwise mark it HASHED and insert it into the list this all means that unhashing and rehashing a name all just works automatically. Obviously, you cannot change the name of an entry (that would be a serious bug), but nothing can validly do that anyway (you'd have to allocate a new struct cache_entry anyway since the name length could change), so that's not a new limitation. The code actually gets simpler in many ways, although the lazy hashing does mean that there are a few odd cases (ie something can be marked unhashed even though it was never on the hash in the first place, and isn't actually marked hashed!). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-22 21:24:47 -08:00
Jay Soffian	9ed36cfa35	branch: optionally setup branch..merge from upstream local branches "git branch" and "git checkout -b" now honor --track option even when the upstream branch is local. Previously --track was silently ignored when forking from a local branch. Also the command did not error out when --track was explicitly asked for but the forked point specified was not an existing branch (i.e. when there is no way to set up the tracking configuration), but now it correctly does. The configuration setting branch.autosetupmerge can now be set to "always", which is equivalent to using --track from the command line. Setting branch.autosetupmerge to "true" will retain the former behavior of only setting up branch..merge for remote upstream branches. Includes test cases for the new functionality. Signed-off-by: Jay Soffian <jaysoffian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-19 21:17:45 -08:00
Junio C Hamano	8177631547	expose a helper function peel_to_type(). This helper function is the core of "$object^{type}" parser. Now it is made available to callers outside sha1_name.c	2008-02-18 00:51:05 -08:00
Junio C Hamano	2ac4b4b222	Merge branch 'sp/safecrlf' * sp/safecrlf: safecrlf: Add mechanism to warn about irreversible crlf conversions	2008-02-16 17:59:20 -08:00
Junio C Hamano	987e315a6b	Merge branch 'jc/gitignore-ends-with-slash' * jc/gitignore-ends-with-slash: gitignore: lazily find dtype gitignore(5): Allow "foo/" in ignore list to match directory "foo"	2008-02-16 17:57:06 -08:00
Junio C Hamano	fef1c4c0a0	Merge branch 'jk/noetcconfig' * jk/noetcconfig: fix config reading in tests allow suppressing of global and system config Conflicts: cache.h	2008-02-16 17:56:51 -08:00
Junio C Hamano	d5558581d2	Merge branch 'maint' * maint: commit: discard index after setting up partial commit filter-branch: handle filenames that need quoting diff: Fix miscounting of --check output hg-to-git: fix parent analysis mailinfo: feed only one line to handle_filter() for QP input diff.c: add "const" qualifier to "char cmd" member of "struct ll_diff_driver" Add "const" qualifier to "char excludes_file". Add "const" qualifier to "char editor_program". Add "const" qualifier to "char pager_program". config: add 'git_config_string' to refactor string config variables. diff.c: remove useless check for value != NULL fast-import: check return value from unpack_entry() Validate nicknames of remote branches to prohibit confusing ones diff.c: replace a 'strdup' with 'xstrdup'. diff.c: fixup garding of config parser from value=NULL	2008-02-16 00:20:37 -08:00
Christian Couder	dfb068be8d	Add "const" qualifier to "char *excludes_file". Also use "git_config_string" to simplify "config.c" code where "excludes_file" is set. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-15 21:24:54 -08:00
Christian Couder	ee9601e6be	Add "const" qualifier to "char *editor_program". Also use "git_config_string" to simplify "config.c" code where "editor_program" is set. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-15 21:24:53 -08:00
Christian Couder	872da32d80	Add "const" qualifier to "char *pager_program". Also use "git_config_string" to simplify "config.c" code where "pager_program" is set. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-15 21:24:53 -08:00
Christian Couder	ea5105a5e3	config: add 'git_config_string' to refactor string config variables. In many places we just check if a value from the config file is not NULL, then we duplicate it and return 0. This patch introduces the new 'git_config_string' function to do that. This function is also used to refactor some code in 'config.c'. Refactoring other files is left for other patches. Also not all the code in "config.c" is refactored, because the function takes a "const char *" as its first parameter, but in many places a "char " is used instead of a "const char ". (And C does not allow using a "char " instead of a "const char *" without a warning.) Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-15 21:24:53 -08:00
Junio C Hamano	e0197c9aae	Merge branch 'lt/in-core-index' * lt/in-core-index: lazy index hashing Create pathname-based hash-table lookup into index read-cache.c: introduce is_racy_timestamp() helper read-cache.c: fix a couple more CE_REMOVE conversion Also use unpack_trees() in do_diff_cache() Make run_diff_index() use unpack_trees(), not read_tree() Avoid running lstat(2) on the same cache entry. index: be careful when handling long names Make on-disk index representation separate from in-core one	2008-02-11 16:46:20 -08:00
Junio C Hamano	40ea4ed903	Add config_error_nonbool() helper function This is used to report misconfigured configuration file that does not give any value to a non-boolean variable, e.g. [section] var It is perfectly fine to say it if the section.var is a boolean (it means true), but if a variable expects a string value it should be flagged as a configuration error. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-11 13:11:36 -08:00
Daniel Barkalow	94a5728cfb	Library function to check for unmerged index entries It's small, but it was in three places already, so it should be in the library. Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>	2008-02-09 23:16:51 -08:00
Jeff King	ab88c36321	allow suppressing of global and system config The GIT_CONFIG_NOGLOBAL and GIT_CONFIG_NOSYSTEM environment variables are magic undocumented switches that can be used to ensure a totally clean environment. This is necessary for running reliable tests, since those config files may contain settings that change the outcome of tests. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-06 14:52:23 -08:00
Steffen Prohaska	21e5ad50fc	safecrlf: Add mechanism to warn about irreversible crlf conversions CRLF conversion bears a slight chance of corrupting data. autocrlf=true will convert CRLF to LF during commit and LF to CRLF during checkout. A file that contains a mixture of LF and CRLF before the commit cannot be recreated by git. For text files this is the right thing to do: it corrects line endings such that we have only LF line endings in the repository. But for binary files that are accidentally classified as text the conversion can corrupt data. If you recognize such corruption early you can easily fix it by setting the conversion type explicitly in .gitattributes. Right after committing you still have the original file in your work tree and this file is not yet corrupted. You can explicitly tell git that this file is binary and git will handle the file appropriately. Unfortunately, the desired effect of cleaning up text files with mixed line endings and the undesired effect of corrupting binary files cannot be distinguished. In both cases CRLFs are removed in an irreversible way. For text files this is the right thing to do because CRLFs are line endings, while for binary files converting CRLFs corrupts data. This patch adds a mechanism that can either warn the user about an irreversible conversion or can even refuse to convert. The mechanism is controlled by the variable core.safecrlf, with the following values: - false: disable safecrlf mechanism - warn: warn about irreversible conversions - true: refuse irreversible conversions The default is to warn. Users are only affected by this default if core.autocrlf is set. But the current default of git is to leave core.autocrlf unset, so users will not see warnings unless they deliberately chose to activate the autocrlf mechanism. The safecrlf mechanism's details depend on the git command. The general principles when safecrlf is active (not false) are: - we warn/error out if files in the work tree can modified in an irreversible way without giving the user a chance to backup the original file. - for read-only operations that do not modify files in the work tree we do not not print annoying warnings. There are exceptions. Even though... - "git add" itself does not touch the files in the work tree, the next checkout would, so the safety triggers; - "git apply" to update a text file with a patch does touch the files in the work tree, but the operation is about text files and CRLF conversion is about fixing the line ending inconsistencies, so the safety does not trigger; - "git diff" itself does not touch the files in the work tree, it is often run to inspect the changes you intend to next "git add". To catch potential problems early, safety triggers. The concept of a safety check was originally proposed in a similar way by Linus Torvalds. Thanks to Dimitry Potapov for insisting on getting the naked LF/autocrlf=true case right. Signed-off-by: Steffen Prohaska <prohaska@zib.de>	2008-02-06 13:07:28 -08:00
Junio C Hamano	d6b8fc303b	gitignore(5): Allow "foo/" in ignore list to match directory "foo" A pattern "foo/" in the exclude list did not match directory "foo", but a pattern "foo" did. This attempts to extend the exclude mechanism so that it would while not matching a regular file or a symbolic link "foo". In order to differentiate a directory and non directory, this passes down the type of path being checked to excluded() function. A downside is that the recursive directory walk may need to run lstat(2) more often on systems whose "struct dirent" do not give the type of the entry; earlier it did not have to do so for an excluded path, but we now need to figure out if a path is a directory before deciding to exclude it. This is especially bad because an idea similar to the earlier CE_UPTODATE optimization to reduce number of lstat(2) calls would by definition not apply to the codepaths involved, as (1) directories will not be registered in the index, and (2) excluded paths will not be in the index anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-05 00:46:49 -08:00
Junio C Hamano	b2979ff599	core.whitespace: cr-at-eol This new error mode allows a line to have a carriage return at the end of the line when checking and fixing trailing whitespace errors. Some people like to keep CRLF line ending recorded in the repository, and still want to take advantage of the automated trailing whitespace stripping. We still show ^M in the diff output piped to "less" to remind them that they do have the CR at the end, but these carriage return characters at the end are no longer flagged as errors. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-02-05 00:38:41 -08:00
Junio C Hamano	9cb76b8cdc	lazy index hashing This delays the hashing of index names until it becomes necessary for the first time. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-01-22 23:01:13 -08:00
Linus Torvalds	cf558704fb	Create pathname-based hash-table lookup into index This creates a hash index of every single file added to the index. Right now that hash index isn't actually used for much: I implemented a "cache_name_exists()" function that uses it to efficiently look up a filename in the index without having to do the O(logn) binary search, but quite frankly, that's not why this patch is interesting. No, the whole and only reason to create the hash of the filenames in the index is that by modifying the hash function, you can fairly easily do things like making it always hash equivalent names into the same bucket. That, in turn, means that suddenly questions like "does this name exist in the index under an _equivalent_ name?" becomes much much cheaper. Guiding principles behind this patch: - it shouldn't be too costly. In fact, my primary goal here was to actually speed up "git commit" with a fully populated kernel tree, by being faster at checking whether a file already existed in the index. I did succeed, but only barely: Best before: [torvalds@woody linux]$ time git commit > /dev/null real 0m0.255s user 0m0.168s sys 0m0.088s Best after: [torvalds@woody linux]$ time ~/git/git commit > /dev/null real 0m0.233s user 0m0.144s sys 0m0.088s so some things are actually faster (~8%). Caveat: that's really the best case. Other things are invariably going to be slightly slower, since we populate that index cache, and quite frankly, few things really use it to look things up. That said, the cost is really quite small. The worst case is probably doing a "git ls-files", which will do very little except puopulate the index, and never actually looks anything up in it, just lists it. Before: [torvalds@woody linux]$ time git ls-files > /dev/null real 0m0.016s user 0m0.016s sys 0m0.000s After: [torvalds@woody linux]$ time ~/git/git ls-files > /dev/null real 0m0.021s user 0m0.012s sys 0m0.008s and while the thing has really gotten relatively much slower, we're still talking about something almost unmeasurable (eg 5ms). And that really should be pretty much the worst case. So we lose 5ms on one "benchmark", but win 22ms on another. Pick your poison - this patch has the advantage that it will _likely_ speed up the cases that are complex and expensive more than it slows down the cases that are already so fast that nobody cares. But if you look at relative speedups/slowdowns, it doesn't look so good. - It should be simple and clean The code may be a bit subtle (the reasons I do hash removal the way I do etc), but it re-uses the existing hash.c files, so it really is fairly small and straightforward apart from a few odd details. Now, this patch on its own doesn't really do much, but I think it's worth looking at, if only because if done correctly, the name hashing really can make an improvement to the whole issue of "do we have a filename that looks like this in the index already". And at least it gets real testing by being used even by default (ie there is a real use-case for it even without any insane filesystems). NOTE NOTE NOTE! The current hash is a joke. I'm ashamed of it, I'm just not ashamed of it enough to really care. I took all the numbers out of my nether regions - I'm sure it's good enough that it works in practice, but the whole point was that you can make a really much fancier hash that hashes characters not directly, but by their upper-case value or something like that, and thus you get a case-insensitive hash, while still keeping the name and the index itself totally case sensitive. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-01-22 21:46:30 -08:00
Junio C Hamano	eadb583134	Avoid running lstat(2) on the same cache entry. Aside from the lstat(2) done for work tree files, there are quite many lstat(2) calls in refname dwimming codepath. This patch is not about reducing them. * It adds a new ce_flag, CE_UPTODATE, that is meant to mark the cache entries that record a regular file blob that is up to date in the work tree. If somebody later walks the index and wants to see if the work tree has changes, they do not have to be checked with lstat(2) again. * fill_stat_cache_info() marks the cache entry it just added with CE_UPTODATE. This has the effect of marking the paths we write out of the index and lstat(2) immediately as "no need to lstat -- we know it is up-to-date", from quite a lot fo callers: - git-apply --index - git-update-index - git-checkout-index - git-add (uses add_file_to_index()) - git-commit (ditto) - git-mv (ditto) * refresh_cache_ent() also marks the cache entry that are clean with CE_UPTODATE. * write_index is changed not to write CE_UPTODATE out to the index file, because CE_UPTODATE is meant to be transient only in core. For the same reason, CE_UPDATE is not written to prevent an accident from happening. Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-01-21 12:44:31 -08:00

1 2 3 4 5 ...