mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-10-30 13:57:54 +01:00

Author	SHA1	Message	Date
Michael Haggerty	3bc581b940	refs: introduce an iterator interface Currently, the API for iterating over references is via a family of for_each_ref()-type functions that invoke a callback function for each selected reference. All of these eventually call do_for_each_ref(), which knows how to do one thing: iterate in parallel through two ref_caches, one for loose and one for packed refs, giving loose references precedence over packed refs. This is rather complicated code, and is quite specialized to the files backend. It also requires callers to encapsulate their work into a callback function, which often means that they have to define and use a "cb_data" struct to manage their context. The current design is already bursting at the seams, and will become even more awkward in the upcoming world of multiple reference storage backends: * Per-worktree vs. shared references are currently handled via a kludge in git_path() rather than iterating over each part of the reference namespace separately and merging the results. This kludge will cease to work when we have multiple reference storage backends. * The current scheme is inflexible. What if we sometimes want to bypass the ref_cache, or use it only for packed or only for loose refs? What if we want to store symbolic refs in one type of storage backend and non-symbolic ones in another? In the future, each reference backend will need to define its own way of iterating over references. The crux of the problem with the current design is that it is impossible to compose for_each_ref()-style iterations, because the flow of control is owned by the for_each_ref() function. There is nothing that a caller can do but iterate through all references in a single burst, so there is no way for it to interleave references from multiple backends and present the result to the rest of the world as a single compound backend. This commit introduces a new iteration primitive for references: a ref_iterator. A ref_iterator is a polymorphic object that a reference storage backend can be asked to instantiate. There are three functions that can be applied to a ref_iterator: * ref_iterator_advance(): move to the next reference in the iteration * ref_iterator_abort(): end the iteration before it is exhausted * ref_iterator_peel(): peel the reference currently being looked at Iterating using a ref_iterator leaves the flow of control in the hands of the caller, which means that ref_iterators from multiple sources (e.g., loose and packed refs) can be composed and presented to the world as a single compound ref_iterator. It also means that the backend code for implementing reference iteration will sometimes be more complicated. For example, the cache_ref_iterator (which iterates over a ref_cache) can't use the C stack to recurse; instead, it must manage its own stack internally as explicit data structures. There is also a lot of boilerplate connected with object-oriented programming in C. Eventually, end-user callers will be able to be written in a more natural way—managing their own flow of control rather than having to work via callbacks. Since there will only be a few reference backends but there are many consumers of this API, this is a good tradeoff. More importantly, we gain composability, and especially the possibility of writing interchangeable parts that can work with any ref_iterator. For example, merge_ref_iterator implements a generic way of merging the contents of any two ref_iterators. It is used to merge loose + packed refs as part of the implementation of the files_ref_iterator. But it will also be possible to use it to merge other pairs of reference sources (e.g., per-worktree vs. shared refs). Another example is prefix_ref_iterator, which can be used to trim a prefix off the front of reference names before presenting them to the caller (e.g., "refs/heads/master" -> "master"). In this patch, we introduce the iterator abstraction and many utilities, and implement a reference iterator for the files ref storage backend. (I've written several other obvious utilities, for example a generic way to filter references being iterated over. These will probably be useful in the future. But they are not needed for this patch series, so I am not including them at this time.) In a moment we will rewrite do_for_each_ref() to work via reference iterators (allowing some special-purpose code to be discarded), and do something similar for reflogs. In future patch series, we will expose the ref_iterator abstraction in the public refs API so that callers can use it directly. Implementation note: I tried abstracting this a layer further to allow generic iterators (over arbitrary types of objects) and generic utilities like a generic merge_iterator. But the implementation in C was very cumbersome, involving (in my opinion) too much boilerplate and too much unsafe casting, some of which would have had to be done on the caller side. However, I did put a few iterator-related constants in a top-level header file, iterator.h, as they will be useful in a moment to implement iteration over directory trees and possibly other types of iterators in the future. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-20 11:38:20 -07:00
Michael Haggerty	a873924483	ref_resolves_to_object(): new function Extract new function ref_resolves_to_object() from entry_resolves_to_object(). It can be used even if there is no ref_entry at hand. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-20 11:38:19 -07:00
Michael Haggerty	ffeef64231	entry_resolves_to_object(): rename function from ref_resolves_to_object() Free up the old name for a more general purpose. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-20 11:38:19 -07:00
Michael Haggerty	2eed2780f0	get_ref_cache(): only create an instance if there is a submodule If there is not a nonbare repository where a submodule is supposedly located, then don't instantiate a ref_cache for it. The analogous check can be removed from resolve_gitlink_ref(). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-20 11:38:19 -07:00
Michael Haggerty	c5f04dddb6	delete_refs(): add a flags argument This will be useful for passing REF_NODEREF through. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-20 11:38:18 -07:00
Michael Haggerty	4633a846f5	refs: use name "prefix" consistently In the context of the for_each_ref() functions, call the prefix that references must start with "prefix". (In some places it was called "base".) This is clearer, and also prevents confusion with another planned use of the word "base". Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-20 11:38:18 -07:00
Michael Haggerty	067622b0e8	do_for_each_ref(): move docstring to the header file Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-20 11:38:17 -07:00
Michael Haggerty	7a418f3a17	lock_ref_sha1_basic(): only handle REF_NODEREF mode Now lock_ref_sha1_basic() is only called with flags==REF_NODEREF. So we don't have to handle other cases anymore. This enables several simplifications, the most interesting of which come from the fact that ref_lock::orig_ref_name is now always the same as ref_lock::ref_name: * Remove ref_lock::orig_ref_name * Remove local variable orig_refname from lock_ref_sha1_basic() * ref_name can be initialize once and its value reused * commit_ref_update() never has to write to the reflog for lock->orig_ref_name Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:50 +02:00
Michael Haggerty	5d9b2de4ef	commit_ref_update(): remove the flags parameter commit_ref_update() is now only called with flags=0. So remove the flags parameter entirely. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:50 +02:00
Michael Haggerty	6e30b2f652	lock_ref_for_update(): don't resolve symrefs If a transaction includes a non-NODEREF update to a symbolic reference, we don't have to look it up in lock_ref_for_update(). The reference will be dereferenced anyway when the split-off update is processed. This change requires that we store a backpointer from the split-off update to its parent update, for two reasons: * We still want to report the original reference name in error messages. So if an error occurs when checking the split-off update's old_sha1, walk the parent_update pointers back to find the original reference name, and report that one. * We still need to write the old_sha1 of the symref to its reflog. So after we read the split-off update's reference value, walk the parent_update pointers back and fill in their old_sha1 fields. Aside from eliminating unnecessary reads, this change fixes a subtle (though not very serious) race condition: in the old code, the old_sha1 of the symref was resolved before the reference that it pointed at was locked. So it was possible that the old_sha1 value logged to the symref's reflog could be wrong if another process changed the downstream reference before it was locked. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:50 +02:00
Michael Haggerty	8169d0d06a	lock_ref_for_update(): don't re-read non-symbolic references Before the previous patch, our first read of the reference happened before the reference was locked, so we couldn't trust its value and had to read it again. But now that our first read of the reference happens after acquiring the lock, there is no need to read it a second time. So move the read_ref_full() call into the (update->type & REF_ISSYMREF) block. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:50 +02:00
Michael Haggerty	92b1551b1d	refs: resolve symbolic refs first Before committing ref updates, split symbolic ref updates into two parts: an update to the underlying ref, and a log-only update to the symbolic ref. This ensures that both references are locked correctly during the transaction, including while their reflogs are updated. Similarly, if the reference pointed to by HEAD is modified directly, add a separate log-only update to HEAD, rather than leaving the job of updating HEAD's reflog to commit_ref_update(). This change ensures that HEAD is locked correctly while its reflog is being modified, as well as being cheaper (HEAD only needs to be resolved once). This makes use of a new function, lock_raw_ref(), which is analogous to read_raw_ref(), but acquires a lock on the reference before reading it. This change still has two problems: * There are redundant read_ref_full() reference lookups. * It is still possible to get incorrect reflogs for symbolic references if there is a concurrent update by another process, since the old_oid of a symref is determined before the lock on the pointed-to ref is held. Both problems will soon be fixed. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> WIP	2016-06-13 11:23:50 +02:00
Michael Haggerty	8415d24746	unlock_ref(): move definition higher in the file This avoids the need for a forward declaration in the next patch. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
Michael Haggerty	165056b2fc	lock_ref_for_update(): new function Extract a new function, lock_ref_for_update(), from ref_transaction_commit(). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
Michael Haggerty	71564516de	add_update(): initialize the whole ref_update Change add_update() to initialize all of the fields in the new ref_update object. Rename the function to ref_transaction_add_update(), and increase its visibility to all of the refs-related code. All of this makes the function more useful for other future callers. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
Michael Haggerty	3a8af7be8f	verify_refname_available(): adjust constness in declaration The two string_list arguments can be const. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
David Turner	12fd3496d1	refs: don't dereference on rename When renaming refs, don't dereference either the origin or the destination before renaming. The origin does not need to be dereferenced because it is presently forbidden to rename symbolic refs. Not dereferencing the destination fixes a bug where renaming on top of a broken symref would use the pointed-to ref name for the moved reflog. Add a test for the reflog bug. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
David Turner	d99aa884df	refs: allow log-only updates The refs infrastructure learns about log-only ref updates, which only update the reflog. Later, we will use this to separate symbolic reference resolution from ref updating. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
Michael Haggerty	5a563d4ad1	ref_transaction_commit(): correctly report close_ref() failure Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
Michael Haggerty	c52ce248d6	ref_transaction_create(): disallow recursive pruning It is nonsensical (and a little bit dangerous) to use REF_ISPRUNING without REF_NODEREF. Forbid it explicitly. Change the one REF_ISPRUNING caller to pass REF_NODEREF too. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
Michael Haggerty	0568c8e9dc	refs: make error messages more consistent * Always start error messages with a lower-case letter. * Always enclose reference names in single quotes. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
Michael Haggerty	bcb497d0f8	lock_ref_sha1_basic(): remove unneeded local variable resolve_ref_unsafe() can cope with being called with NULL passed to its flags argument. So lock_ref_sha1_basic() can just hand its own type parameter through. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
Michael Haggerty	cf596442c6	read_raw_ref(): move docstring to header file Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
Michael Haggerty	bb462b0028	read_raw_ref(): improve docstring Among other things, document the (important!) requirement that input refname be checked for safety before calling this function. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
Michael Haggerty	92b380931e	read_raw_ref(): rename symref argument to referent After all, it doesn't hold the symbolic reference, but rather the reference referred to. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
Michael Haggerty	fa96ea1b88	read_raw_ref(): clear *type at start of function This is more convenient and less error-prone for callers. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
Michael Haggerty	3a0b6b9aba	read_raw_ref(): rename flags argument to type This will hopefully reduce confusion with the "flags" arguments that are used in many functions in this module as an input parameter to choose how the function should operate. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:49 +02:00
Michael Haggerty	efe472813d	ref_transaction_commit(): remove local variables n and updates These microoptimizations don't make a significant difference in speed. And they cause problems if somebody ever wants to modify the function to add updates to a transaction as part of processing it, as will happen shortly. Make the same changes in initial_ref_transaction_commit(). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-06-13 11:23:26 +02:00
Michael Haggerty	e711b1af2e	rename_ref(): remove unneeded local variable Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-05-05 16:37:30 +02:00
Michael Haggerty	76fc394d50	commit_ref_update(): write error message to *err, not stderr Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-05-05 16:37:30 +02:00
Michael Haggerty	e167a5673e	read_raw_ref(): don't get confused by an empty directory Even if there is an empty directory where we look for the loose version of a reference, check for a packed reference before giving up. This fixes the failing test that was introduced two commits ago. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-05-05 16:37:04 +02:00
Michael Haggerty	5387c0d883	commit_ref(): if there is an empty dir in the way, delete it Part of the bug revealed in the last commit is that resolve_ref_unsafe() incorrectly returns EISDIR if it finds a directory in the place where it is looking for a loose reference, even if the corresponding packed reference exists. lock_ref_sha1_basic() notices the bogus EISDIR, and use it as an indication that it should call remove_empty_directories() and call resolve_ref_unsafe() again. But resolve_ref_unsafe() shouldn't report EISDIR in this case. If we would simply make that change, then remove_empty_directories() wouldn't get called anymore, and the empty directory would get in the way when commit_ref() calls commit_lock_file() to rename the lockfile into place. So instead of relying on lock_ref_sha1_basic() to delete empty directories, teach commit_ref(), just before calling commit_lock_file(), to check whether a directory is in the way, and if so, try to delete it. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>	2016-05-05 16:31:17 +02:00
Junio C Hamano	edc2f715bd	Merge branch 'dt/pre-refs-backend' Code restructuring around the "refs" area to prepare for pluggable refs backends. * dt/pre-refs-backend: (24 commits) refs: on symref reflog expire, lock symref not referrent refs: move resolve_ref_unsafe into common code show_head_ref(): check the result of resolve_ref_namespace() check_aliased_update(): check that dst_name is non-NULL checkout_paths(): remove unneeded flag variable cmd_merge(): remove unneeded flag variable fsck_head_link(): remove unneeded flag variable read_raw_ref(): change flags parameter to unsigned int files-backend: inline resolve_ref_1() into resolve_ref_unsafe() read_raw_ref(): manage own scratch space files-backend: break out ref reading resolve_ref_1(): eliminate local variable "bad_name" resolve_ref_1(): reorder code resolve_ref_1(): eliminate local variable resolve_ref_unsafe(): ensure flags is always set resolve_ref_unsafe(): use for loop to count up to MAXDEPTH resolve_missing_loose_ref(): simplify semantics t1430: improve test coverage of deletion of badly-named refs t1430: test for-each-ref in the presence of badly-named refs t1430: don't rely on symbolic-ref for creating broken symrefs ...	2016-04-25 15:17:15 -07:00
David Turner	41d796ed5c	refs: on symref reflog expire, lock symref not referrent When locking a symbolic ref to expire a reflog, lock the symbolic ref (using REF_NODEREF) instead of its referent. Add a test for this. Signed-off-by: David Turner <dturner@twopensource.com> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:35:46 -07:00
David Turner	2d0663b216	refs: move resolve_ref_unsafe into common code Now that resolve_ref_unsafe's only interaction with the backend is through read_raw_ref, we can move it into the common code. Later, we'll replace read_raw_ref with a backend function. Signed-off-by: David Turner <dturner@twopensource.com> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:35:41 -07:00
Michael Haggerty	89e8238965	read_raw_ref(): change flags parameter to unsigned int read_raw_ref() is going to be part of the vtable for reference backends, so clean up its interface to use "unsigned int flags" rather than "int flags". Its caller still uses signed int for its flags arguments. But changing that would touch a lot of code, so leave it for now. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:35:31 -07:00
Michael Haggerty	8c346fb1d7	files-backend: inline resolve_ref_1() into resolve_ref_unsafe() resolve_ref_unsafe() wasn't doing anything useful anymore. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:35:29 -07:00
Michael Haggerty	42a38cf788	read_raw_ref(): manage own scratch space Instead of creating scratch space in resolve_ref_unsafe() and passing it down through resolve_ref_1 to read_raw_ref(), teach read_raw_ref() to manage its own scratch space. This reduces coupling across the functions at the cost of some extra allocations. Also, when read_raw_ref() is implemented for different reference backends, the other implementations might have different scratch space requirements. Note that we now preserve errno across the calls to strbuf_release(), which calls free() and can thus theoretically overwrite errno. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:35:26 -07:00
David Turner	7048653a73	files-backend: break out ref reading Refactor resolve_ref_1 in terms of a new function read_raw_ref, which is responsible for reading ref data from the ref storage. Later, we will make read_raw_ref a pluggable backend function, and make resolve_ref_unsafe common. Signed-off-by: David Turner <dturner@twopensource.com> Helped-by: Duy Nguyen <pclouds@gmail.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:35:24 -07:00
Michael Haggerty	afbe782fa3	resolve_ref_1(): eliminate local variable "bad_name" We can use (*flags & REF_BAD_NAME) for that purpose. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:35:22 -07:00
Michael Haggerty	e6702e570b	resolve_ref_1(): reorder code There is no need to adjust *flags if we're just about to fail. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:35:21 -07:00
Michael Haggerty	90c28ae11c	resolve_ref_1(): eliminate local variable In place of `buf`, use `refname`, which is anyway a better description of what is being pointed at. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:35:19 -07:00
Michael Haggerty	a70a93b794	resolve_ref_unsafe(): ensure flags is always set If the caller passes flags==NULL, then set it to point at a local scratch variable. This removes the need for a lot of "if (flags)" guards in resolve_ref_1() and resolve_missing_loose_ref(). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:35:17 -07:00
Michael Haggerty	37da4227b2	resolve_ref_unsafe(): use for loop to count up to MAXDEPTH The loop's there anyway; we might as well use it. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:35:15 -07:00
Michael Haggerty	419c6f4c76	resolve_missing_loose_ref(): simplify semantics Make resolve_missing_loose_ref() only responsible for looking up a packed reference, without worrying about whether we want to read or write the reference and without setting errno on failure. Move the other logic to the caller. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:35:13 -07:00
David Turner	937705901b	refs: move for_each_ref functions into common code Make do_for_each_ref take a submodule as an argument instead of a ref_cache. Since all for_each_ref functions are defined in terms of do_for_each_ref, we can then move them into the common code. Later, we can simply make do_for_each_ref into a backend function. Signed-off-by: David Turner <dturner@twopensource.com> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:34:55 -07:00
David Turner	2bf68ed5aa	refs: move head_ref{,_submodule} to the common code These don't use any backend-specific functions. These were previously defined in terms of the do_head_ref helper function, but since they are otherwise identical, we don't need that function. Signed-off-by: David Turner <dturner@twopensource.com> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-10 11:34:41 -07:00
Kazuki Yamaguchi	18eb3a9ce7	set_worktree_head_symref(): fix error message Emit an informative error when failed to hold lock of HEAD. `2233066e` (refs: add a new function set_worktree_head_symref, 2016-03-27) added set_worktree_head_symref(), but this is missing a call to unable_to_lock_message() after hold_lock_file_for_update() fails, so it emits an empty error message: % git branch -m oldname newname error: error: HEAD of working tree /path/to/wt is not updated fatal: Branch renamed to newname, but HEAD is not updated! Thanks to Eric Sunshine for pointing this out. Signed-off-by: Kazuki Yamaguchi <k@rhe.jp> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-08 10:26:23 -07:00
Kazuki Yamaguchi	2233066e77	refs: add a new function set_worktree_head_symref Add a new function set_worktree_head_symref, to update HEAD symref for the specified worktree. To update HEAD of a linked working tree, create_symref("worktrees/$work_tree/HEAD", "refs/heads/$branch", msg) could be used. However when it comes to updating HEAD of the main working tree, it is unusable because it uses $GIT_DIR for worktree-specific symrefs (HEAD). The new function takes git_dir (real directory) as an argument, and updates HEAD of the working tree. This function will be used when renaming a branch. Signed-off-by: Kazuki Yamaguchi <k@rhe.jp> Acked-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-04 12:57:21 -07:00
Junio C Hamano	11529ecec9	Merge branch 'jk/tighten-alloc' Update various codepaths to avoid manually-counted malloc(). * jk/tighten-alloc: (22 commits) ewah: convert to REALLOC_ARRAY, etc convert ewah/bitmap code to use xmalloc diff_populate_gitlink: use a strbuf transport_anonymize_url: use xstrfmt git-compat-util: drop mempcpy compat code sequencer: simplify memory allocation of get_message test-path-utils: fix normalize_path_copy output buffer size fetch-pack: simplify add_sought_entry fast-import: simplify allocation in start_packfile write_untracked_extension: use FLEX_ALLOC helper prepare_{git,shell}_cmd: use argv_array use st_add and st_mult for allocation size computation convert trivial cases to FLEX_ARRAY macros use xmallocz to avoid size arithmetic convert trivial cases to ALLOC_ARRAY convert manual allocations to argv_array argv-array: add detach function add helpers for allocating flex-array structs harden REALLOC_ARRAY and xcalloc against size_t overflow tree-diff: catch integer overflow in combine_diff_path allocation ...	2016-02-26 13:37:16 -08:00
Jeff King	96ffc06f72	convert trivial cases to FLEX_ARRAY macros Using FLEX_ARRAY macros reduces the amount of manual computation size we have to do. It also ensures we don't overflow size_t, and it makes sure we write the same number of bytes that we allocated. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-22 14:51:09 -08:00
Junio C Hamano	ad25723e69	Merge branch 'jk/ref-cache-non-repository-optim' The underlying machinery used by "ls-files -o" and other commands have been taught not to create empty submodule ref cache for a directory that is not a submodule. This removes a ton of wasted CPU cycles. * jk/ref-cache-non-repository-optim: resolve_gitlink_ref: ignore non-repository paths clean: make is_git_repository a public function	2016-02-03 14:16:07 -08:00
Jeff King	a2d5156c2b	resolve_gitlink_ref: ignore non-repository paths When we want to look up a submodule ref, we use get_ref_cache(path) to find or auto-create its ref cache. But if we feed a path that isn't actually a git repository, we blindly create the ref cache, and then may die deeper in the code when we try to access it. This is a problem because many callers speculatively feed us a path that looks vaguely like a repository, and expect us to tell them when it is not. This patch teaches resolve_gitlink_ref to reject non-repository paths without creating a ref_cache. This avoids the die(), and also performs better if you have a large number of these faux-submodule directories (because the ref_cache lookup is linear, under the assumption that there won't be a large number of submodules). To accomplish this, we also break get_ref_cache into two pieces: the lookup and auto-creation (the latter is lumped into create_ref_cache). This lets us first cheaply ask our cache "is it a submodule we know about?" If so, we can avoid repeating our filesystem lookup. So lookups of real submodules are not penalized; they examine the submodule's .git directory only once. The test in t3000 demonstrates a case where this improves correctness (we used to just die). The new perf case in p7300 shows off the speed improvement in an admittedly pathological repository: Test HEAD^ HEAD ---------------------------------------------------------------- 7300.4: ls-files -o 66.97(66.15+0.87) 0.33(0.08+0.24) -99.5% Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-25 11:42:13 -08:00
Jeff King	2859dcd4c8	lock_ref_sha1_basic: handle REF_NODEREF with invalid refs We sometimes call lock_ref_sha1_basic with REF_NODEREF to operate directly on a symbolic ref. This is used, for example, to move to a detached HEAD, or when updating the contents of HEAD via checkout or symbolic-ref. However, the first step of the function is to resolve the refname to get the "old" sha1, and we do so without telling resolve_ref_unsafe() that we are only interested in the symref. As a result, we may detect a problem there not with the symref itself, but with something it points to. The real-world example I found (and what is used in the test suite) is a HEAD pointing to a ref that cannot exist, because it would cause a directory/file conflict with other existing refs. This situation is somewhat broken, of course, as trying to _commit_ on that HEAD would fail. But it's not explicitly forbidden, and we should be able to move away from it. However, neither "git checkout" nor "git symbolic-ref" can do so. We try to take the lock on HEAD, which is pointing to a non-existent ref. We bail from resolve_ref_unsafe() with errno set to EISDIR, and the lock code thinks we are attempting to create a d/f conflict. Of course we're not. The problem is that the lock code has no idea what level we were at when we got EISDIR, so trying to diagnose or remove empty directories for HEAD is not useful. To make things even more complicated, we only get EISDIR in the loose-ref case. If the refs are packed, the resolution may "succeed", giving us the pointed-to ref in "refname", but a null oid. Later, we say "ah, the null oid means we are creating; let's make sure there is room for it", but mistakenly check against the _resolved_ refname, not the original. We can fix this by making two tweaks: 1. Call resolve_ref_unsafe() with RESOLVE_REF_NO_RECURSE when REF_NODEREF is set. This means any errors we get will be from the orig_refname, and we can act accordingly. We already do this in the REF_DELETING case, but we should do it for update, too. 2. If we do get a "refname" return from resolve_ref_unsafe(), even with RESOLVE_REF_NO_RECURSE it may be the name of the ref pointed-to by a symref. We already normalize this back to orig_refname before taking the lockfile, but we need to do so before the null_oid check. While we're rearranging the REF_NODEREF handling, we can also bump the initialization of lflags to the top of the function, where we are setting up other flags. This saves us from having yet another conditional block on REF_NODEREF just to set it later. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-13 09:05:42 -08:00
Jeff King	6294dcb49f	lock_ref_sha1_basic: always fill old_oid while holding lock Our basic strategy for taking a ref lock is: 1. Create $ref.lock to take the lock 2. Read the ref again while holding the lock (during which time we know that nobody else can be updating it). 3. Compare the value we read to the expected "old_sha1" The value we read in step (2) is returned to the caller via the lock->old_oid field, who may use it for other purposes (such as writing a reflog). If we have no "old_sha1" (i.e., we are unconditionally taking the lock), then we obviously must omit step 3. But we _also_ omit step 2. This seems like a nice optimization, but it means that the caller sees only whatever was left in lock->old_oid from previous calls to resolve_ref_unsafe(), which happened outside of the lock. We can demonstrate this race pretty easily. Imagine you have three commits, $one, $two, and $three. One script just flips between $one and $two, without providing an old-sha1: while true; do git update-ref -m one refs/heads/foo $one git update-ref -m two refs/heads/foo $two done Meanwhile, another script tries to set the value to $three, also not using an old-sha1: while true; do git update-ref -m three refs/heads/foo $three done If these run simultaneously, we'll see a lot of lock contention, but each of the writes will succeed some of the time. The reflog may record movements between any of the three refs, but we would expect it to provide a consistent log: the "from" field of each log entry should be the same as the "to" field of the previous one. But if we check this: perl -alne ' print "mismatch on line $." if defined $last && $F[0] ne $last; $last = $F[1]; ' .git/logs/refs/heads/foo we'll see many mismatches. Why? Because sometimes, in the time between lock_ref_sha1_basic filling lock->old_oid via resolve_ref_unsafe() and it taking the lock, there may be a complete write by another process. And the "from" field in our reflog entry will be wrong, and will refer to an older value. This is probably quite rare in practice. It requires writers which do not provide an old-sha1 value, and it is a very quick race. However, it is easy to fix: we simply perform step (2), the read-under-lock, whether we have an old-sha1 or not. Then the value we hand back to the caller is always atomic. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-13 09:05:30 -08:00
Jeff King	396da8f7a0	create_symref: write reflog while holding lock We generally hold a lock on the matching ref while writing to its reflog; this prevents two simultaneous writers from clobbering each other's reflog lines (it does not even have to be two symref updates; because we don't hold the lock, we could race with somebody writing to the pointed-to ref via HEAD, for example). We can fix this by writing the reflog before we commit the lockfile. This runs the risk of writing the reflog but failing the final rename(), but at least we now err on the same side as the rest of the ref code. Noticed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-12-29 10:34:25 -08:00
Jeff King	370e5ad65e	create_symref: use existing ref-lock code The create_symref() function predates the existence of "struct lock_file", let alone the more recent "struct ref_lock". Instead, it just does its own manual dot-locking. Besides being more code, this has a few downsides: - if git is interrupted while holding the lock, we don't clean up the lockfile - we don't do the usual directory/filename conflict check. So you can sometimes create a symref "refs/heads/foo/bar", even if "refs/heads/foo" exists (namely, if the refs are packed and we do not hit the d/f conflict in the filesystem). This patch refactors create_symref() to use the "struct ref_lock" interface, which handles both of these things. There are a few bonus cleanups that come along with it: - we leaked ref_path in some error cases - the symref contents were stored in a fixed-size buffer, putting an artificial (albeit large) limitation on the length of the refname. We now write through fprintf, and handle refnames of any size. - we called adjust_shared_perm only after the file was renamed into place, creating a potential race with readers in a shared repository. The lockfile code now handles this when creating the lockfile, making it atomic. - the legacy prefer_symlink_refs path did not do any locking at all. Admittedly, it is not atomic from a reader's perspective (as it unlinks and re-creates the symlink to overwrite), but at least it cannot conflict with other writers now. - the result of this patch is hopefully more readable. It eliminates three goto labels. Two were for error checking that is now simplified, and the third was to reach shared code that has been pulled into its own function. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-12-29 10:33:31 -08:00
Jeff King	b9badadd06	create_symref: modernize variable names Once upon a time, create_symref() was used only to point HEAD at a branch name, and the variable names reflect that (e.g., calling the path git_HEAD). However, it is much more generic these days (and has been for some time). Let's update the variable names to make it easier to follow: - `ref_target` is now just `refname`. This is closer to the `ref` that is already in `cache.h`, but with the extra twist that "name" makes it clear this is the name and not a ref struct. Dropping "target" hopefully makes it clear that we are talking about the symref itself, not what it points to. - `git_HEAD` is now `ref_path`; the on-disk path corresponding to `ref`. - `refs_heads_master` is now just `target`; i.e., what the symref points at. This term also matches what is in the symlink(2) manpage (at least on Linux). - the buffer to hold the symref file's contents was simply called `ref`. It's now `buf` (admittedly also generic, but at least not actively introducing confusion with the other variable holding the refname). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-12-29 10:33:09 -08:00
Junio C Hamano	e0048d3e0d	Merge branch 'sg/lock-file-commit-error' Cosmetic improvement to lock-file error messages. * sg/lock-file-commit-error: Make error message after failing commit_lock_file() less confusing	2015-12-11 10:40:55 -08:00
David Turner	0845122c39	refs: break out ref conflict checks Create new function find_descendant_ref, to hold one of the ref conflict checks used in verify_refname_available. Multiple backends will need this function, so move it to the common code. Also move rename_ref_available to the common code, because alternate backends might need it and it has no files-backend-specific code. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 04:52:01 -05:00
David Turner	5f3c3a4e6f	files_log_ref_write: new function Because HEAD and stash are per-worktree, every refs backend needs to go through the files backend to write these refs. So create a new function, files_log_ref_write, and add it to refs/refs-internal.h. Later, we will use this to handle reflog updates for per-worktree symbolic refs (HEAD). Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 04:52:01 -05:00
Michael Haggerty	7bd9bcf372	refs: split filesystem-based refs code into a new file As another step in the move to pluggable reference backends, move the code that is specific to the filesystem-based reference backend (i.e., the current system of storing references as loose and packed files) into a separate file, refs/files-backend.c. Aside from a tiny bit of file header boilerplate, this commit only moves a subset of the code verbatim from refs.c to the new file, as can easily be verified using patience diff: git diff --patience $commit^:refs.c $commit:refs.c git diff --patience $commit^:refs.c $commit:refs/files-backend.c Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 04:52:01 -05:00
Michael Haggerty	4cb77009e1	refs/refs-internal.h: new header file There are a number of constants, structs, and static functions defined in refs.c and treated as private to the references module. But we want to support multiple reference backends within the reference module, and those backends will need access to some heretofore private declarations. We don't want those declarations to be visible to non-refs code, so we don't want to move them to refs.h. Instead, add a new header file, refs/refs-internal.h, that is intended to be included only from within the refs module. Make some functions non-static and move some declarations (and their corresponding docstrings) from refs.c to this file. In a moment we will add more content to the "refs" subdirectory. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 04:52:01 -05:00

1 2 3