If we are iterating through the refs using for_each_ref (or
any of its sister functions), we can get into a race
condition with a simultaneous "pack-refs --prune" that looks
like this:
0. We have a large number of loose refs, and a few packed
refs. refs/heads/z/foo is loose, with no matching entry
in the packed-refs file.
1. Process A starts iterating through the refs. It loads
the packed-refs file from disk, then starts lazily
traversing through the loose ref directories.
2. Process B, running "pack-refs --prune", writes out the
new packed-refs file. It then deletes the newly packed
refs, including refs/heads/z/foo.
3. Meanwhile, process A has finally gotten to
refs/heads/z (it traverses alphabetically). It
descends, but finds nothing there. It checks its
cached view of the packed-refs file, but it does not
mention anything in "refs/heads/z/" at all (it predates
the new file written by B in step 2).
The traversal completes successfully without mentioning
refs/heads/z/foo at all (the name, of course, isn't
important; but the more refs you have and the farther down
the alphabetical list a ref is, the more likely it is to hit
the race). If refs/heads/z/foo did exist in the packed refs
file at state 0, we would see an entry for it, but it would
show whatever sha1 the ref had the last time it was packed
(which could be an arbitrarily long time ago).
This can be especially dangerous when process A is "git
prune", as it means our set of reachable tips will be
incomplete, and we may erroneously prune objects reachable
from that tip (the same thing can happen if "repack -ad" is
used, as it simply drops unreachable objects that are
packed).
This patch solves it by loading all of the loose refs for
our traversal into our in-memory cache, and then refreshing
the packed-refs cache. Because a pack-refs writer will
always put the new packed-refs file into place before
starting the prune, we know that any loose refs we fail to
see will either truly be missing, or will have already been
put in the packed-refs file by the time we refresh.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once we read the packed-refs file into memory, we cache it
to save work on future ref lookups. However, our cache may
be out of date with respect to what is on disk if another
process is simultaneously packing the refs. Normally it
is acceptable for us to be a little out of date, since there
is no guarantee whether we read the file before or after the
simultaneous update. However, there is an important special
case: our packed-refs file must be up to date with respect
to any loose refs we read. Otherwise, we risk the following
race condition:
0. There exists a loose ref refs/heads/master.
1. Process A starts and looks up the ref "master". It
first checks $GIT_DIR/master, which does not exist. It
then loads (and caches) the packed-refs file to see if
"master" exists in it, which it does not.
2. Meanwhile, process B runs "pack-refs --all --prune". It
creates a new packed-refs file which contains
refs/heads/master, and removes the loose copy at
$GIT_DIR/refs/heads/master.
3. Process A continues its lookup, and eventually tries
$GIT_DIR/refs/heads/master. It sees that the loose ref
is missing, and falls back to the packed-refs file. But
it examines its cached version, which does not have
refs/heads/master. After trying a few other prefixes,
it reports master as a non-existent ref.
There are many variants (e.g., step 1 may involve process A
looking up another ref entirely, so even a fully qualified
refname can fail). One of the most interesting ones is if
"refs/heads/master" is already packed. In that case process
A will not see it as missing, but rather will report
whatever value happened to be in the packed-refs file before
process B repacked (which might be an arbitrarily old
value).
We can fix this by making sure we reload the packed-refs
file from disk after looking at any loose refs. That's
unacceptably slow, so we can check its stat()-validity as a
proxy, and read it only when it appears to have changed.
Reading the packed-refs file after performing any loose-ref
system calls is sufficient because we know the ordering of
the pack-refs process: it always makes sure the newly
written packed-refs file is installed into place before
pruning any loose refs. As long as those operations by B
appear in their executed order to process A, by the time A
sees the missing loose ref, the new packed-refs file must be
in place.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can sometimes be useful to know whether a path in the
filesystem has been updated without going to the work of
opening and re-reading its content. We trust the stat()
information on disk already to handle index updates, and we
can use the same trick here.
This patch introduces a "stat_validity" struct which
encapsulates the concept of checking the stat-freshness of a
file. It is implemented on top of "struct stat_data" to
reuse the logic about which stat entries to trust for a
particular platform, but hides the complexity behind two
simple functions: check and update.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add public functions fill_stat_data() and match_stat_data() to work
with it. This infrastructure will later be used to check the validity
of other types of file.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Increment the packed_ref_cache reference count while it is locked to
prevent its being freed.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function calls a user-supplied callback function which could do
something that causes the packed refs cache to be invalidated. So
acquire a reference count on the data structure to prevent our copy
from being freed while we are iterating over it.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In struct packed_ref_cache, keep a count of the number of users of the
data structure. Only free the packed ref cache when the reference
count goes to zero rather than when the packed ref cache is cleared.
This mechanism will be used to prevent the cache data structure from
being freed while it is being iterated over.
So far, only the reference in struct ref_cache::packed is counted;
other users will be adjusted in separate commits.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Handle simple transactions for the packed-refs file at the
packed_ref_cache level via new functions lock_packed_refs(),
commit_packed_refs(), and rollback_packed_refs().
Only allow the packed ref cache to be modified (via add_packed_ref())
while the packed refs file is locked.
Change clone to add the new references within a transaction.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we know, we can solve any problem in this manner. In this case,
the problem is to avoid freeing a packed refs cache while somebody is
using it. So add a level of indirection as a prelude to
reference-counting the packed refs cache.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Split pack_refs() into multiple passes:
* Iterate over loose refs. For each one that can be turned into a
packed ref, create a corresponding entry in the packed refs cache.
* Write the packed refs to the packed-refs file.
This change isolates the mutation of the packed-refs file to a single
place.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repack_without_ref() function first removes the deleted ref from
the internal packed-refs list, then writes the packed-refs list to
disk, omitting any broken or stale entries. This patch splits that
second step into multiple passes:
* collect the list of refnames that should be deleted from packed_refs
* delete those refnames from the cache
* write the remainder to the packed-refs file
The purpose of this change is to make the "write the remainder" part
reusable.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
submodules with names using UTF-8 need core.precomposeunicode true
under Mac OS X, set it in the test case.
Improve the portability:
- Not all shells on all OS may understand literal UTF-8 strings.
- Use a help variable filled by printf, as we do it in e.g. t0050.
"strange names" can be called UTF-8, rephrase the heading.
While at it, unbreak &&-chain in the test, and use test_config.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
sender is now sanitized, but we didn't sanitize author when checking
whether From: line is needed in the message body.
As a result git started writing duplicate From: lines when author
matched sender and has utf8 characters.
Reported-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Verify that author name is not duplicated if it matches sender, even
if it is in utf8 (the test expects a failure that will be fixed in
the next patch).
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The goal of the patch is to introduce the GNU diff
-B/--ignore-blank-lines as closely as possible. The short option is not
available because it's already used for "break-rewrites".
When this option is used, git-diff will not create hunks that simply
add or remove empty lines, but will still show empty lines
addition/suppression if they are close enough to "valuable" changes.
There are two differences between this option and GNU diff -B option:
- GNU diff doesn't have "--inter-hunk-context", so this must be handled
- The following sequence looks like a bug (context is displayed twice):
$ seq 5 >file1
$ cat <<EOF >file2
change
1
2
3
4
5
change
EOF
$ diff -u -B file1 file2
--- file1 2013-06-08 22:13:04.471517834 +0200
+++ file2 2013-06-08 22:13:23.275517855 +0200
@@ -1,5 +1,7 @@
+change
1
2
+
3
4
5
@@ -3,3 +5,4 @@
3
4
5
+change
So here is a more thorough description of the option:
- real changes are interesting
- blank lines that are close enough (less than context size) to
interesting changes are considered interesting (recursive definition)
- "context" lines are used around each hunk of interesting changes
- If two hunks are separated by less than "inter-hunk-context", they
will be merged into one.
The implementation does the "interesting changes selection" in a single
pass.
Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The following command
$ git cherry-pick --ff b8bb3f
writes the following uninformative message to the reflog
cherry-pick
Improve it to
cherry-pick: fast-forward
Avoid hard-coding "cherry-pick" in fast_forward_to(), so the sequencer
is generic enough to support future actions.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We read loose references in two steps. The code is roughly:
lstat()
if error ENOENT:
loose ref is missing; look for corresponding packed ref
else if S_ISLNK:
readlink()
if error:
report failure
else if S_ISDIR:
report failure
else
open()
if error:
report failure
read()
The problem is that the first filesystem call, to lstat(), is not
atomic with the second filesystem call, to readlink() or open().
Therefore it is possible for another process to change the file
between our two calls, for example:
* If the other process deletes the file, our second call will fail
with ENOENT, which we *should* interpret as "loose ref is missing;
look for corresponding packed ref". This can arise if the other
process is pack-refs; it might have just written a new packed-refs
file containing the old contents of the reference then deleted the
loose ref.
* If the other process changes a symlink into a plain file, our call
to readlink() will fail with EINVAL, which we *should* respond to by
trying to open() and read() the file.
The old code treats the reference as missing in both of these cases,
which is incorrect.
So instead, handle errors more selectively: if the result of
readline()/open() is a failure that is inconsistent with the result of
the previous lstat(), then something is fishy. In this case jump back
and start over again with a fresh call to lstat().
One race is still possible and undetected: another process could
change the file from a regular file into a symlink between the call to
lstat and the call to open(). The open() call would silently follow
the symlink and not know that something is wrong. This situation
could be detected in two ways:
* On systems that support O_NOFOLLOW, pass that option to the open().
* On other systems, call fstat() on the fd returned by open() and make
sure that it agrees with the stat info from the original lstat().
However, we don't use symlinks anymore, so this situation is unlikely.
Moreover, it doesn't appear that treating a symlink as a regular file
would have grave consequences; after all, this is exactly how the code
handles non-relative symlinks. So this commit leaves that race
unaddressed.
Note that this solves only the part of the race within
resolve_ref_unsafe. In the situation described above, we may still be
depending on a cached view of the packed-refs file; that race will be
dealt with in a future patch.
This problem was reported and diagnosed by Jeff King <peff@peff.net>,
and this solution is derived from his patch.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is only one "break" statement within the loop, which jumps to
the code after the loop that handles the case of a file that holds a
SHA-1. So move that code from below the loop into the if statement
where the break was previously located. This makes the logic flow
more local.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The nesting was getting a bit out of hand, and it's about to get
worse.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Curl older than 7.17 (RHEL 4.X provides 7.12 and RHEL 5.X provides
7.15) requires that we manage any strings that we pass to it as
pointers. So, we really shouldn't be modifying this strbuf after we
have passed it to curl.
Our interaction with curl is currently safe (before or after this
patch) since the pointer that is passed to curl is never invalidated;
it is repeatedly rewritten with the same sequence of characters but
the strbuf functions never need to allocate a larger string, so the
same memory buffer is reused.
This "guarantee" of safety is somewhat subtle and could be overlooked
by someone who may want to add a more complex handling of the username
and password. So, let's stop modifying this strbuf after we have
passed it to curl, but also leave a note to describe the assumptions
that have been made about username/password lifetime and to draw
attention to the code.
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When an user wants to filter specific ref using the --refs option,
the pattern needs to match the full ref, e.g. --refs=refs/tags/v1.*.
It'd be convenient to specify a subpath of ref pattern. For
example, --refs=origin/* can find refs/remotes/origin/master by
searching the pattern against its substrings in turn:
refs/remotes/origin/master
remotes/origin/master
origin/master
If it finds a match in a subpath, unambigous part of the ref path will
be removed in the output.
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation and some comments still refer to files in builtin/
as 'builtin-*.[cho]'. Update these to show the correct location.
Signed-off-by: Phil Hord <hordp@cisco.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Assisted-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The section describing "git diff <blob> <blob>" had been placed in a
position that disrupted the statement "This is synonymous to the
previous form".
Reorder to place this form after all the <commit>-using forms, and the
note applying to them. Also mention this form in the initial description
paragraph.
Signed-off-by: Kevin Bracey <kevin@bracey.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 35d2fffd introduced 'git merge --abort' as a synonym to 'git reset
--merge', and added some failing tests in t7611-merge-abort.sh (search
'###' in this file) showing that 'git merge --abort' could not always
recover the pre-merge state.
Still, in many cases, 'git merge --abort' just works, and it is usually
considered that the ability to start a merge with uncommited changes is
an important property of Git.
Weaken the warning by discouraging only merge with /non-trivial/
uncommited changes.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prior to commit fa83a33b, the 'git checkout' DWIMery would create a
new local branch if the specified branch name did not exist and it
matched exactly one ref in the "remotes" namespace. It searched
the "remotes" namespace for matching refs using a simple comparison
of the trailing portion of the remote ref names. This approach
could sometimes produce false positives or negatives.
Since fa83a33b, the DWIMery more strictly excludes the remote name
from the ref comparison by iterating through the remotes that are
configured in the .gitconfig file. This has the side-effect that
any refs that exist in the "remotes" namespace, but do not match
the destination side of any remote refspec, will not be used by
the DWIMery.
This change in behavior breaks the tests in t9802 which relied on
the old behavior of searching all refs in the remotes namespace,
since the git-p4 script does not configure any remotes in the
.gitconfig. Let's work around this in these tests by explicitly
naming the upstream branch to base the new local branch on when
calling 'git checkout'.
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Acked-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The recently introduced tests used uppercase letters to denote
cherry-picks of commits having the corresponding lowercase letter names.
The helper functions also set up tags with the names of the commits.
But this constellation fails on case-insensitive file systems because
there cannot be distinct tags with names that differ only in case.
Use a less subtle convention for the names of cherry-picked commits.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The behavior of "git push --force" is rather clear when it updates only
one remote ref, but running it when pushing several branches can really
be dangerous. Warn the users a bit more and give them the alternative to
push only one branch.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
remote_find_tracking() populates the query struct with an allocated
string in the dst member. So, we do not need to xstrdup() the string,
since we can transfer ownership from the query struct (which will go
out of scope at the end of this function) to our callback struct, but
we must free the string if it will not be used so we will not leak
memory.
Let's do so.
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the new rev-parse --prefix option to process all paths given to the
submodule command, dropping the requirement that it be run from the
top-level of the repository.
Since the interpretation of a relative submodule URL depends on whether
or not "remote.origin.url" is configured, explicitly block relative URLs
in "git submodule add" when not at the top level of the working tree.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes 'git rev-parse' behave as if it were invoked from the
specified subdirectory of a repository, with the difference that any
file paths which it prints are prefixed with the full path from the top
of the working tree.
This is useful for shell scripts where we may want to cd to the top of
the working tree but need to handle relative paths given by the user on
the command line.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When --recursive was added to "submodule foreach" in commit 15fc56a (git
submodule foreach: Add --recursive to recurse into nested submodules,
2009-08-19), the error message when the script returns a non-zero status
was not updated to contain $prefix to show the full path. Fix this.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the indentation to use tabs consistently and start content on the
line after the paren opening a subshell.
Also don't put a space in ">file" and remove ":" from ": >file" to be
consistent with the majority of tests elsewhere.
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It was missed in the option list while mentioned from the general
description. Add it for completeness.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
set_rev_name is a possiblly expensive operation. If a submodule has
changes in it, set_rev_name was called twice.
Move call to set_rev_name so it's only called once, no matter which
codepath is taken.
Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cherry-pick is in progress, 'git status' gives the advice to
run "git commit" to finish the cherry-pick.
However, this won't continue the sequencer, when picking a range of
commits.
Advise users to run "git cherry-pick --continue/--abort"; they work
when picking a single commit as well.
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of needing a wrapper to call the diff/merge command, simply
provide the diff_cmd and merge_cmd functions for user-specified tools in
the same way as we do for built-in tools.
Signed-off-by: John Keeping <john@keeping.me.uk>
Acked-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
make and make test both work when $GIT_DIR isn't .git, but make dist
included a bogus GIT-VERSION-FILE.
Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When running 'make test' from a path such as
.../daily-build/master@bdff0e3a374617dce784f801b97500d9ba2e4705, the
logic in fuzz.sed as generated by t5105-request-pull.sh was backwards,
replacing object names before replacing urls, making the test fail.
Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
rebase has no reason to know about the implementation of the stash. In
the case when applying the autostash results in conflicts, replace the
relevant code in finish_rebase () to simply call 'git stash store'.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
save_stash() contains the logic for doing two potentially independent
operations; the first is preparing the stash merge commit, and the
second is updating the stash ref/ reflog accordingly. While the first
operation is abstracted out into a create_stash() for callers to access
via 'git stash create', the second one is not. Fix this by factoring
out the logic for storing the stash into a store_stash() that callers
can access via 'git stash store'.
Like create, store is not intended for end user interactive use, but for
callers in other scripts. We can simplify the logic in the
rebase.autostash feature using this new subcommand.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If o->merge is set, the struct traverse_info member conflicts is shifted
left in unpack_callback, then passed through traverse_trees_recursive
to unpack_nondirectories, where it is shifted right before use. Stop
the shifting and just pass the conflict bit mask as is. Rename the
member to df_conflicts to prove that it isn't used anywhere else.
Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The option parser for create unnecessarily checks "$1" inside a case
statement that matches "$1" in the first place.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git stash save' can take -p, the short form of --patch, as an option.
Document this.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a note saying that the user probably wants "save" in the create
description. While at it, document that it can optionally take a
message in the synopsis.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace instances of ! test -d with test_path_is_missing.
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>