Now that we keep track of the packed-refs file metadata, we can detect
when the packed-refs file has been modified since we last read it, and
we do so automatically every time that get_packed_ref_cache() is
called. So there is no need to invalidate the cache automatically
when lock_packed_refs() is called; usually the old copy will still be
valid.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are iterating through the refs using for_each_ref (or
any of its sister functions), we can get into a race
condition with a simultaneous "pack-refs --prune" that looks
like this:
0. We have a large number of loose refs, and a few packed
refs. refs/heads/z/foo is loose, with no matching entry
in the packed-refs file.
1. Process A starts iterating through the refs. It loads
the packed-refs file from disk, then starts lazily
traversing through the loose ref directories.
2. Process B, running "pack-refs --prune", writes out the
new packed-refs file. It then deletes the newly packed
refs, including refs/heads/z/foo.
3. Meanwhile, process A has finally gotten to
refs/heads/z (it traverses alphabetically). It
descends, but finds nothing there. It checks its
cached view of the packed-refs file, but it does not
mention anything in "refs/heads/z/" at all (it predates
the new file written by B in step 2).
The traversal completes successfully without mentioning
refs/heads/z/foo at all (the name, of course, isn't
important; but the more refs you have and the farther down
the alphabetical list a ref is, the more likely it is to hit
the race). If refs/heads/z/foo did exist in the packed refs
file at state 0, we would see an entry for it, but it would
show whatever sha1 the ref had the last time it was packed
(which could be an arbitrarily long time ago).
This can be especially dangerous when process A is "git
prune", as it means our set of reachable tips will be
incomplete, and we may erroneously prune objects reachable
from that tip (the same thing can happen if "repack -ad" is
used, as it simply drops unreachable objects that are
packed).
This patch solves it by loading all of the loose refs for
our traversal into our in-memory cache, and then refreshing
the packed-refs cache. Because a pack-refs writer will
always put the new packed-refs file into place before
starting the prune, we know that any loose refs we fail to
see will either truly be missing, or will have already been
put in the packed-refs file by the time we refresh.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once we read the packed-refs file into memory, we cache it
to save work on future ref lookups. However, our cache may
be out of date with respect to what is on disk if another
process is simultaneously packing the refs. Normally it
is acceptable for us to be a little out of date, since there
is no guarantee whether we read the file before or after the
simultaneous update. However, there is an important special
case: our packed-refs file must be up to date with respect
to any loose refs we read. Otherwise, we risk the following
race condition:
0. There exists a loose ref refs/heads/master.
1. Process A starts and looks up the ref "master". It
first checks $GIT_DIR/master, which does not exist. It
then loads (and caches) the packed-refs file to see if
"master" exists in it, which it does not.
2. Meanwhile, process B runs "pack-refs --all --prune". It
creates a new packed-refs file which contains
refs/heads/master, and removes the loose copy at
$GIT_DIR/refs/heads/master.
3. Process A continues its lookup, and eventually tries
$GIT_DIR/refs/heads/master. It sees that the loose ref
is missing, and falls back to the packed-refs file. But
it examines its cached version, which does not have
refs/heads/master. After trying a few other prefixes,
it reports master as a non-existent ref.
There are many variants (e.g., step 1 may involve process A
looking up another ref entirely, so even a fully qualified
refname can fail). One of the most interesting ones is if
"refs/heads/master" is already packed. In that case process
A will not see it as missing, but rather will report
whatever value happened to be in the packed-refs file before
process B repacked (which might be an arbitrarily old
value).
We can fix this by making sure we reload the packed-refs
file from disk after looking at any loose refs. That's
unacceptably slow, so we can check its stat()-validity as a
proxy, and read it only when it appears to have changed.
Reading the packed-refs file after performing any loose-ref
system calls is sufficient because we know the ordering of
the pack-refs process: it always makes sure the newly
written packed-refs file is installed into place before
pruning any loose refs. As long as those operations by B
appear in their executed order to process A, by the time A
sees the missing loose ref, the new packed-refs file must be
in place.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can sometimes be useful to know whether a path in the
filesystem has been updated without going to the work of
opening and re-reading its content. We trust the stat()
information on disk already to handle index updates, and we
can use the same trick here.
This patch introduces a "stat_validity" struct which
encapsulates the concept of checking the stat-freshness of a
file. It is implemented on top of "struct stat_data" to
reuse the logic about which stat entries to trust for a
particular platform, but hides the complexity behind two
simple functions: check and update.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add public functions fill_stat_data() and match_stat_data() to work
with it. This infrastructure will later be used to check the validity
of other types of file.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Increment the packed_ref_cache reference count while it is locked to
prevent its being freed.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function calls a user-supplied callback function which could do
something that causes the packed refs cache to be invalidated. So
acquire a reference count on the data structure to prevent our copy
from being freed while we are iterating over it.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In struct packed_ref_cache, keep a count of the number of users of the
data structure. Only free the packed ref cache when the reference
count goes to zero rather than when the packed ref cache is cleared.
This mechanism will be used to prevent the cache data structure from
being freed while it is being iterated over.
So far, only the reference in struct ref_cache::packed is counted;
other users will be adjusted in separate commits.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Handle simple transactions for the packed-refs file at the
packed_ref_cache level via new functions lock_packed_refs(),
commit_packed_refs(), and rollback_packed_refs().
Only allow the packed ref cache to be modified (via add_packed_ref())
while the packed refs file is locked.
Change clone to add the new references within a transaction.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we know, we can solve any problem in this manner. In this case,
the problem is to avoid freeing a packed refs cache while somebody is
using it. So add a level of indirection as a prelude to
reference-counting the packed refs cache.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split pack_refs() into multiple passes:
* Iterate over loose refs. For each one that can be turned into a
packed ref, create a corresponding entry in the packed refs cache.
* Write the packed refs to the packed-refs file.
This change isolates the mutation of the packed-refs file to a single
place.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repack_without_ref() function first removes the deleted ref from
the internal packed-refs list, then writes the packed-refs list to
disk, omitting any broken or stale entries. This patch splits that
second step into multiple passes:
* collect the list of refnames that should be deleted from packed_refs
* delete those refnames from the cache
* write the remainder to the packed-refs file
The purpose of this change is to make the "write the remainder" part
reusable.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Define memory ownership and lifetime rules for what for-each-ref
feeds to its callbacks (in short, "you do not own it, so make a
copy if you want to keep it").
* mh/reflife: (25 commits)
refs: document the lifetime of the args passed to each_ref_fn
register_ref(): make a copy of the bad reference SHA-1
exclude_existing(): set existing_refs.strdup_strings
string_list_add_refs_by_glob(): add a comment about memory management
string_list_add_one_ref(): rename first parameter to "refname"
show_head_ref(): rename first parameter to "refname"
show_head_ref(): do not shadow name of argument
add_existing(): do not retain a reference to sha1
do_fetch(): clean up existing_refs before exiting
do_fetch(): reduce scope of peer_item
object_array_entry: fix memory handling of the name field
find_first_merges(): remove unnecessary code
find_first_merges(): initialize merges variable using initializer
fsck: don't put a void*-shaped peg in a char*-shaped hole
object_array_remove_duplicates(): rewrite to reduce copying
revision: use object_array_filter() in implementation of gc_boundary()
object_array: add function object_array_filter()
revision: split some overly-long lines
cmd_diff(): make it obvious which cases are exclusive of each other
cmd_diff(): rename local variable "list" -> "entry"
...
Major update to the revision traversal logic to improve culling of
irrelevant parents while traversing a mergy history.
* kb/full-history-compute-treesame-carefully-2:
revision.c: make default history consider bottom commits
revision.c: don't show all merges for --parents
revision.c: discount side branches when computing TREESAME
revision.c: add BOTTOM flag for commits
simplify-merges: drop merge from irrelevant side branch
simplify-merges: never remove all TREESAME parents
t6012: update test for tweaked full-history traversal
revision.c: Make --full-history consider more merges
Documentation: avoid "uninteresting"
rev-list-options.txt: correct TREESAME for P
t6111: add parents to tests
t6111: allow checking the parents as well
t6111: new TREESAME test set
t6019: test file dropped in -s ours merge
decorate.c: compact table when growing
When a file (or a directory) called HEAD exists in the working tree,
internal calls git svn makes trigger "did you mean a revision or a
path?" ambiguity check.
$ git svn rebase
fatal: ambiguous argument 'HEAD': both revision and filename
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
rev-list --first-parent --pretty=medium HEAD: command returned error: 128
Explicitly disambiguate by adding "--" after the revision.
Signed-off-by: Slava Kardakov <ojab@ojab.ru>
Reviewed-by: Jeff King <peff@peff.net>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Blameview was a quick-and-dirty demonstration of how blame's
incremental output could be used in an interface. These days
one can find much better (and less ugly!) demonstrations in
"git gui blame" and "tig blame".
The only advantage blameview has is that its code is perhaps
simpler to read. However, that is balanced by the fact that
it probably has bugs, as nobody uses it nor has touched the
code in 6 years. An implementor is probably better off just
reading the "incremental output" section of "man git-blame".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the SANITY prerequisite when testing if a temp file can
be created in a read only directory.
Skip the test under CYGWIN, or skip it under Unix/Linux when
it is run as root.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"gitweb" forgot to clear a global variable $search_regexp upon each
request, mistakenly carrying over the previous search to a new one
when used as a persistent CGI.
* cm/gitweb-project-list-persistent-cgi-fix:
gitweb: fix problem causing erroneous project list
* rr/rebase-autostash:
rebase: implement --[no-]autostash and rebase.autostash
rebase --merge: return control to caller, for housekeeping
rebase -i: return control to caller, for housekeeping
am: return control to caller, for housekeeping
rebase: prepare to do generic housekeeping
rebase -i: don't error out if $state_dir already exists
am: tighten a conditional that checks for $dotest
"git cmd <name>", when <name> happens to be a 40-hex string,
directly uses the 40-hex string as an object name, even if a ref
"refs/<some hierarchy>/<name>" exists. This disambiguation order
is unlikely to change, but we should warn about the ambiguity just
like we warn when more than one refs/ hierachies share the same
name.
* nd/warn-ambiguous-object-name:
get_sha1: warn about full or short object names that look like refs
Update the low-level diffcore documentation on -S/-G and --pickaxe-all.
* rr/diffcore-pickaxe-doc:
diffcore-pickaxe doc: document -S and -G properly
diffcore-pickaxe: make error messages more consistent
These days, "git --work-tree=there cmd" without specifying an
explicit --git-dir=here will do the usual discovery, but we had a
description of older behaviour in the documentation.
* cr/git-work-tree-sans-git-dir:
git.txt: remove stale comment regarding GIT_WORK_TREE
Hint users when https:// connection failed to check the certificate.
* mm/mediawiki-https-fail-message:
git-remote-mediawiki: better error message when HTTP(S) access fails
* fc/remote-bzr:
remote-bzr: add fallback check for a partial clone
remote-bzr: reorganize the way 'wanted' works
remote-bzr: trivial cleanups
remote-bzr: change global repo
remote-bzr: delay cloning/pulling
remote-bzr: simplify get_remote_branch()
remote-bzr: fix for files with spaces
remote-bzr: recover from failed clones
Update build for Cygwin 1.[57]. Torsten Bögershausen reports that
this is fine with Cygwin 1.7 ($gmane/225824) so let's try moving it
ahead.
* rj/mingw-cygwin:
cygwin: Remove the CYGWIN_V15_WIN32API build variable
mingw: rename WIN32 cpp macro to GIT_WINDOWS_NATIVE
* rs/unpack-trees-plug-leak:
unpack-trees: free cache_entry array members for merges
diff-lib, read-tree, unpack-trees: mark cache_entry array paramters const
diff-lib, read-tree, unpack-trees: mark cache_entry pointers const
unpack-trees: create working copy of merge entry in merged_entry
unpack-trees: factor out dup_entry
read-cache: mark cache_entry pointers const
cache: mark cache_entry pointers const
When a reflog notation is used for implicit "current branch", we
did not say which branch and worse said "branch ''".
* rr/die-on-missing-upstream:
sha1_name: fix error message for @{<N>}, @{<date>}
sha1_name: fix error message for @{u}
githooks(5) says that "[...]the .sample files are executable by default"
which was not true.
Signed-off-by: Wieland Hoffmann <themineo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Primarily to push out two regression issues that seem to affect many
people, namely, the ".gitignore !directory" bug and "daemon cannot
read from $HOME owned by root" bug.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Returning the SIGALRM handler for SIGINT is not very useful.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A git daemon that starts as "root" and then drops privilege often
leaves $HOME set to that of the root user, which is unreadable by
the daemon process, which was diagnosed as a configuration error.
Make per-user configuration files that are inaccessible due to
EACCES as though these files do not exist to avoid this issue, as
the tightening which was originally meant as an additional security
has annoyed enough sysadmins.
* jn/config-ignore-inaccessible:
config: allow inaccessible configuration under $HOME
Fix recent regression of .gitignore files that list !directory to
mark it not-ignored.
* kb/status-ignored-optim-2:
dir.c: fix ignore processing within not-ignored directories
read_cache already performs the same check and returns immediately if
the cache has already been loaded.
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A note in the beginning of this document describes the behavior already.
This patch just adds where to find the repositories.
Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>