The previous commit added tests to show that index-pack
correctly bails in unrecoverable situations. There are some
situations where the data could be recovered, but it is not
currently:
1. If we can break the cycle using an object from another
pack via --fix-thin.
2. If we can break the cycle using a duplicate of one of
the objects found in the same pack.
Note that neither of these is particularly high priority; a
delta cycle within a pack should never occur, and we have no
record of even a buggy git implementation creating such a
pack.
However, it's worth adding these tests for two reasons. One,
to document that we do not currently handle the situation,
even though it is possible. And two, to exercise the code
that runs in this situation; even though it fails, by
running it we can confirm that index-pack detects the
situation and aborts, and does not misbehave (e.g., by
following the cycle in an infinite loop).
In both cases, we hit an assert that aborts index-pack.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we receive a broken or malicious pack from a remote, we
will feed it to index-pack. As index-pack processes the
objects as a stream, reconstructing and hashing each object
to get its name, it is not very susceptible to doing the
wrong with bad data (it simply notices that the data is
bogus and aborts).
However, one question raised on the list is whether it could
be susceptible to problems during the delta-resolution
phase. In particular, can a cycle in the packfile deltas
cause us to go into an infinite loop or cause any other
problem?
The answer is no.
We cannot have a cycle of delta-base offsets, because they
go only in one direction (the OFS_DELTA object mentions its
base by an offset towards the beginning of the file, and we
explicitly reject negative offsets).
We can have a cycle of REF_DELTA objects, which refer to
base objects by sha1 name. However, index-pack does not know
these sha1 names ahead of time; it has to reconstruct the
objects to get their names, and it cannot do so if there is
a delta cycle (in other words, it does not even realize
there is a cycle, but only that there are items that cannot
be resolved).
Even though we can reason out that index-pack should handle
this fine, let's add a few tests to make sure it behaves
correctly.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sha1_entry_pos function tries to be smart about
selecting the middle of a range for its binary search by
looking at the value differences between the "lo" and "hi"
constraints. However, it is unable to cope with entries with
duplicate keys in the sorted list.
We may hit a point in the search where both our "lo" and
"hi" point to the same key. In this case, the range of
values between our endpoints is 0, and trying to scale the
difference between our key and the endpoints over that range
is undefined (i.e., divide by zero). The current code
catches this with an "assert(lov < hiv)".
Moreover, after seeing that the first 20 byte of the key are
the same, we will try to establish a value from the 21st
byte. Which is nonsensical.
Instead, we can detect the case that we are in a run of
duplicates, and simply do a final comparison against any one
of them (since they are all the same, it does not matter
which). If the keys match, we have found our entry (or one
of them, anyway). If not, then we know that we do not need
to look further, as we must be in a run of the duplicate
key.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git commit --author=$name" sets the author to one whose name
matches the given string from existing commits, when $name is not in
the "Name <e-mail>" format. However, it does not honor the mailmap
to use the canonical name for the author found this way.
Fix it by telling the logic to find a matching existing author to
honor the mailmap, and use the name and email after applying the
mailmap.
Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
directory_exists_in_index() takes pathname and its length, but its
helper function directory_exists_in_index_icase() reads one byte
beyond the end of the pathname and expects there to be a '/'.
This needs to be fixed, as that one-byte-beyond-the-end location may
not even be readable, possibly by not registering directories to
name hashes with trailing slashes. In the meantime, update the new
caller added recently to treat_one_path() to make sure that the path
buffer it gives the function is one byte longer than the path it is
asking the function about by appending a slash to it.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test-sha1 helper program will run our internal sha1
routines over its input and output the 40-byte hex sha1.
Sometimes, however, it is useful to have the binary 20-byte
sha1 (and it's a pain to convert back in the shell). Let's
add a "-b" option to output the binary version.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clarify documentation for "diff --no-index". State that when not
inside a repository, --no-index is implied and two arguments are
mandatory.
Clarify error message from diff-no-index to inform user that CWD is
not inside a repository and thus two arguments are mandatory.
Signed-off-by: Dale Worley <worley@ariadne.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Old Bash (3.0) which is distributed with RHEL 4.X and other ancient
platforms that are still in wide use, do not have a printf that
supports -v. Neither does Zsh (which is already handled in the code).
As suggested by Junio, let's test whether printf supports the -v
option and store the result. Then later, we can use it to
determine whether 'printf -v' can be used, or whether printf
must be called in a subshell.
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Old Bash (3.0) which is distributed with RHEL 4.X and other ancient
platforms that are still in wide use, does not understand the
array+=() notation. Let's use an explicit assignment to the new array
element which works everywhere, like:
array[${#array[@]}+1]=''
The right-hand side '' is not strictly necessary, but in this case
I think it is more clear.
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The syntax for retrieving the number of elements in an array is:
${#name[@]}
Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "merge.log" config is set, "rebase --preserve-merges" will add
the log lines to the message of the rebased merge commit. A rebase
should not modify a commit message automatically.
Teach "git-rebase" to ignore that configuration by passing
"--no-log" to the git-merge call.
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The search help link was a superscript question mark right next to
a drop-down menu, which looks misaligned and is a cramped and
awkward click target. Remove the superscript tags and add some
spacing to fix these nits. Add a title attribute to provide an
explanatory mouseover.
Signed-off-by: Tony Finch <dot@dotat.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On the repository summary page, leave the owner line out if the
repo does not have an owner, rather than displaying a labelled empty
field. This does not affect the owner column in the projects list
page, which is present unless $omit_owner is true.
Signed-off-by: Tony Finch <dot@dotat.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The rss_logo CSS style has a fixed width which is too narrow for
the string "OPML". Replace the fixed width with horizontal padding
so the text fits with nice margins.
Signed-off-by: Tony Finch <dot@dotat.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the fixed width integer types uint16_t and uint32_t for on-disk
structures; unsigned short and unsigned int do not have a guaranteed
size.
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The deflate loop in bulk-checkin::stream_to_pack expects to get all bytes
from a file that it requests to read in a single function call. But it
used xread(), which does not give that guarantee. Replace it by
read_in_full().
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit 6c642a8786.
The previous commit introduced a size limit on IO chunks on all
platforms. The compat clipped_write() is not needed anymore.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Checking out 2GB or more through an external filter (see test) fails
on Mac OS X 10.8.4 (12E55) for a 64-bit executable with:
error: read from external filter cat failed
error: cannot feed the input to external filter cat
error: cat died of signal 13
error: external filter cat failed 141
error: external filter cat failed
The reason is that read() immediately returns with EINVAL when asked
to read more than 2GB. According to POSIX [1], if the value of
nbyte passed to read() is greater than SSIZE_MAX, the result is
implementation-defined. The write function has the same restriction
[2]. Since OS X still supports running 32-bit executables, the
32-bit limit (SSIZE_MAX = INT_MAX = 2GB - 1) seems to be also
imposed on 64-bit executables under certain conditions. For write,
the problem has been addressed earlier [6c642a].
Address the problem for read() and write() differently, by limiting
size of IO chunks unconditionally on all platforms in xread() and
xwrite(). Large chunks only cause problems, like causing latencies
when killing the process, even if OS X was not buggy. Doing IO in
reasonably sized smaller chunks should have no negative impact on
performance.
The compat wrapper clipped_write() introduced earlier [6c642a] is
not needed anymore. It will be reverted in a separate commit. The
new test catches read and write problems.
Note that 'git add' exits with 0 even if it prints filtering errors
to stderr. The test, therefore, checks stderr. 'git add' should
probably be changed (sometime in another commit) to exit with
nonzero if filtering fails. The test could then be changed to use
test_must_fail.
Thanks to the following people for suggestions and testing:
Johannes Sixt <j6t@kdbg.org>
John Keeping <john@keeping.me.uk>
Jonathan Nieder <jrnieder@gmail.com>
Kyle J. McKay <mackyle@gmail.com>
Linus Torvalds <torvalds@linux-foundation.org>
Torsten Bögershausen <tboegi@web.de>
[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
[2] http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html
[6c642a] commit 6c642a8786
compate/clipped-write.c: large write(2) fails on Mac OS X/XNU
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The condition as it is written in that line has already been checked
in the beginning of the function, which was introduced in
8503ee4 (2007-05-01, Fix read_mailmap to handle a caller uninterested
in repo abbreviation)
Helped-by: Jeff King <peff@peff.net>
Helped-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git fails due to a segmentation fault if a submodule path is empty.
Here is an example .gitmodules that will cause a segmentation fault:
[submodule "foo-module"]
path
url = http://host/repo.git
$ git status
Segmentation fault (core dumped)
This is because the parsing of "submodule.*.path" is not prepared to
see a value-less "true" and assumes that the value is always
non-NULL (parsing of "ignore" has the same problem).
Fix it by checking the NULL-ness of value and complain with
config_error_nonbool().
Signed-off-by: Jharrod LaFon <jlafon@eyesopen.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we had to revert two topics at the last minute, let's have
another (hopefully short) round of rc to make sure the final release
will be sound.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Newlines in the path to a git repository were not an issue for the
git-specific bash prompt before commit efaa0c1532 (bash prompt:
combine 'git rev-parse' executions in the main code path, 2013-06-17),
because the path returned by 'git rev-parse --git-dir' was directly
stored in a variable, and this variable was later always accessed
inside double quotes.
Newlines are not an issue after commit efaa0c1532 either, but it's
more subtle. Since efaa0c1532 we use the following single 'git
rev-parse' execution to query various info about the repository:
git rev-parse --git-dir --is-inside-git-dir \
--is-bare-repository --is-inside-work-tree
The results to these queries are separated by a newline character in
the output, e.g.:
/home/szeder/src/git/.git
false
false
true
A newline in the path to the git repository could potentially break
the parsing of these results and ultimately the bash prompt, unless
the parsing is done right. Commit efaa0c1532 got it right, as I
consciously started parsing 'git rev-parse's output from the end,
where each record is a single line containing either 'true' or 'false'
or, after e3e0b9378b (bash prompt: combine 'git rev-parse' for
detached head, 2013-06-24), the abbreviated commit object name, and
all what remains at the beginning is the path to the git repository,
no matter how many lines it is.
This subtlety really warrants its own test, especially since I didn't
explain it in the log message or in an in-code comment back then, so
add a test to excercise the prompt with newline characters in the path
to the repository. Guard this test with the FUNNYNAMES prerequisite,
because not all filesystems support newlines in filenames. Note that
'git rev-parse --git-dir' prints '.git' or '.' when at the top of the
worktree or the repository, respectively, and only prints the full
path to the repository when in a subdirectory, hence the need for
changing into a subdir in the test.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
180bad3d (rebase -i: respect core.commentchar, 2013-02-11) updated
"rebase -i" to honor core.commentchar but missed one instance of
hard-coded '#' comment character in skip_unnecessary_picks().
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The code sequence ' (1u << i) < hsize && i < 31 ' is a multi step
process, whose first step requires that 'i' is already less that 31,
otherwise the result (1u << i) is undefined (and 'undef_val < hsize'
can therefore be assumed to be 'false'), and so the later test i < 31
can always be optimized away as dead code ('i' is already less than 31,
or the short circuit 'and' applies).
So we need to get rid of that code. One way would be to exchange the
order of the conditions, so the expression 'i < 31 && (1u << i) < hsize'
would remove that optimized unstable code already.
However when checking the previous lines in that function, we can deduce
that 'hsize' must always be smaller than (1u<<31), since 506049c7df
(fix >4GiB source delta assertion failure), because 'entries' is
capped at an upper bound of 0xfffffffeU, so 'hsize' contains a maximum
value of 0x3fffffff, which is smaller than (1u<<31), so the value of
'i' will never be larger than 31 and we can remove that condition
entirely.
Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Acked-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An earlier draft of the previous step used cache_name_exists() to
check the directory we were looking at, which missed the second case
described in its log message. Demonstrate why it is not sufficient.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"ls-files -o" and "ls-files -k" both traverse the working tree down
to find either all untracked paths or those that will be "killed"
(removed from the working tree to make room) when the paths recorded
in the index are checked out. It is necessary to traverse the
working tree fully when enumerating all the "other" paths, but when
we are only interested in "killed" paths, we can take advantage of
the fact that paths that do not overlap with entries in the index
can never be killed.
The treat_one_path() helper function, which is called during the
recursive traversal, is the ideal place to implement an
optimization.
When we are looking at a directory P in the working tree, there are
three cases:
(1) P exists in the index. Everything inside the directory P in
the working tree needs to go when P is checked out from the
index.
(2) P does not exist in the index, but there is P/Q in the index.
We know P will stay a directory when we check out the contents
of the index, but we do not know yet if there is a directory
P/Q in the working tree to be killed, so we need to recurse.
(3) P does not exist in the index, and there is no P/Q in the index
to require P to be a directory, either. Only in this case, we
know that everything inside P will not be killed without
recursing.
Note that this helper is called by treat_leading_path() that decides
if we need to traverse only subdirectories of a single common
leading directory, which is essential for this optimization to be
correct. This caller checks each level of the leading path
component from shallower directory to deeper ones, and that is what
allows us to only check if the path appears in the index. If the
call to treat_one_path() weren't there, given a path P/Q/R, the real
traversal may start from directory P/Q/R, even when the index
records P as a regular file, and we would end up having to check if
any leading subpath in P/Q/R, e.g. P, appears in the index.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These codepaths always start from the_index and use index_*
functions, but there is no reason to do so. Use the compatibility
cache_* macro to access the current in-core index like everybody
else.
While at it, fix typo in the comment for a function to check if a
path within a directory appears in the index.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit cdfd94837b, as it
does not just apply to "@" (and forms with modifiers like @{u}
applied to it), but also affects e.g. "refs/heads/@/foo", which it
shouldn't.
The basic idea of giving a short-hand might be good, and the topic
can be retried later, but let's revert to avoid affecting existing
use cases for now for the upcoming release.
This reverts commit a73653130e, as it
has been reported that "ls-files --killed" is too time-consuming in
a deep directory with too many untracked crufts (e.g. $HOME/.git
tracking only a few files).
We'd need to revisit it later but "ls-files --killed" needs to be
optimized before it happens.
Before overwriting the destination index, first let's discard its
contents.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Tested-by: Лежанкин Иван <abyss.7@gmail.com> wrote:
Reviewed-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A handful of past contributors are recorded with multiple e-mail
addresses, all of which are undeliverable. With a lot of help from
Jonathan, we located all of them except for one person, and a pair
of addresses we suspect belong to a single person but we are not
certain.
Update the found ones with their currently preferred address, and
use the last known address to consolidate contributions by the lost
one.
Helped-by: Stefan Beller <stefanbeller@googlemail.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
- From the beginning of push.c in 755225d, 2006-04-29, "thin" option
was enabled by default but could be turned off with --no-thin.
- Then Shawn changed the default to 0 in favor of saving server
resources in a4503a1, 2007-09-09. --no-thin worked great.
- One day later, in 9b28851 Daniel extracted some code from push.c to
create transport.c. He (probably accidentally) flipped the default
value from 0 to 1 in transport_get().
From then on --no-thin is effectively no-op because git-push still
expects the default value to be false and only calls
transport_set_option() when "thin" variable in push.c is true (which
is unnecessary). Correct the code to respect --no-thin by calling
transport_set_option() in both cases.
receive-pack learns about --reject-thin-pack-for-testing option,
which only is for testing purposes, hence no document update.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-contacts invokes git-blame once for each patch hunk it encounters.
No attempt is made to consolidate invocations for multiple hunks
referencing the same file at the same revision. This can become
expensive quickly.
Reduce the number of git-blame invocations by taking advantage of the
ability to specify multiple -L ranges for a single invocation.
Without this patch, on a randomly chosen range of commits:
% time git-contacts 25fba78d36be6297^..23c339c0f262aad2 >/dev/null
real 0m6.142s
user 0m5.429s
sys 0m0.356s
With this patch:
% time git-contacts 25fba78d36be6297^..23c339c0f262aad2 >/dev/null
real 0m2.285s
user 0m2.093s
sys 0m0.165s
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-contacts invokes git-blame immediately upon encountering a patch
hunk. No attempt is made to consolidate invocations for multiple hunks
referencing the same file at the same revision. This can become
expensive quickly.
Any effort to reduce the number of times git-blame is run will need to
to know in advance which line ranges to blame per file per revision.
Make this information available by collecting all sources as a distinct
step from invoking git-blame. A subsequent patch will utilize the
information to optimize git-blame invocations.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than calling get_blame() with a zero-length hunk only to have it
rejected immediately, perform hunk-length validation earlier in order to
avoid calling get_blame() unnecessarily.
This is a preparatory step to simplify later patches which reduce the
number of git-blame invocations by collecting together all lines to
blame within a single file at a particular revision. By validating the
blame range early, the subsequent patch can more easily avoid adding
empty ranges at collection time.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Encourage new users to use 'log' instead. These days, these
commands are unified and just have different defaults.
'git log' only allowed you to view the log messages and no diffs
when it was added in early June 2005. It was only in early April
2006 that the command learned to take diff options. Because of
this, power users tended to use 'whatchanged' that already existed
since mid May 2005 and supported diff options.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Back when the core tutorial was written, `log` and `whatchanged`
were scripted Porcelains. In the "Inspecting Changes" section that
talks about the plumbing commands in the diff family, it made sense
to use `log` and `whatchanged` as good examples of the use of these
plumbing commands, and because even these scripted Porcelains were
novelty (there wasn't the new end-user tutorial written), it made
some sense to illustrate uses of the `git log` (and `git
whatchanged`) scripted Porcelain commands.
But we no longer have scripted `log` and `whatchanged` to serve as
examples, and this document is not where the end users learn what
`git log` command is about. Stop at briefly mentioning the
possibility of combining rev-list with diff-tree to build your own
log, and leave the end-user documentation of `log` to the new
tutorial and the user manual.
Also resurrect the last version of `git-log`, `git-whatchanged`, and
`git-show` to serve as examples to contrib/examples/ directory.
While at it, remove 'whatchanged' from a list of sample commands
that are affected by GIT_FLUSH environment variable. This is not
meant to be an exhaustive list but as a list of typical ones, and an
old command that is kept primarily for backward compatibility does
not belong to it.
Helped-by: Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Symlink contents in p4 print sometimes have a trailing
new line character, but sometimes it doesn't. git-p4
should only remove the last character if that character
is '\n'.
Signed-off-by: Alex Juncu <ajuncu@ixiacom.com>
Signed-off-by: Alex Badea <abadea@ixiacom.com>
Acked-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the tests, p4d is started without using "internationalized
mode". Make sure this environment variable is unset, otherwise
a mis-matched user setting would break the tests. The error
message would be "Unicode clients require a unicode enabled server."
[pw: use unset, add commit text]
Signed-off-by: Kazuki Saitoh <ksaitoh560@gmail.com>
Signed-off-by: Pete Wyckoff <pw@padd.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>