1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-11-02 07:17:58 +01:00
Commit graph

11473 commits

Author SHA1 Message Date
Nguyễn Thái Ngọc Duy
425a28e0a4 diff-lib: allow ita entries treated as "not yet exist in index"
When comparing the index and the working tree to show which paths are
new, and comparing the tree recorded in the HEAD and the index to see if
committing the contents recorded in the index would result in an empty
commit, we would want the former comparison to say "these are new paths"
and the latter to say "there is no change" for paths that are marked as
intent-to-add.

We made a similar attempt at d95d728a ("diff-lib.c: adjust position of
i-t-a entries in diff", 2015-03-16), which redefined the semantics of
these two comparison modes globally, which was a disaster and had to be
reverted at 78cc1a54 ("Revert "diff-lib.c: adjust position of i-t-a
entries in diff"", 2015-06-23).

To make sure we do not repeat the same mistake, introduce a new internal
diffopt option so that this different semantics can be asked for only by
callers that ask it, while making sure other unaudited callers will get
the same comparison result.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-24 10:47:28 -07:00
Jeff King
614fe01521 test-lib: bail out when "-v" used under "prove"
When there is a TAP harness consuming the output of our test
scripts, the "--verbose" breaks the output by mingling
test command output with TAP. Because the TAP::Harness
module used by "prove" is fairly lenient, this _usually_
works, but it violates the spec, and things get very
confusing if the commands happen to output a line that looks
like TAP (e.g., the word "ok" on its own line).

Let's detect this situation and complain. Just calling
error() isn't great, though; prove will tell us that the
script failed, but the message doesn't make it through to
the user. Instead, we can use the special TAP signal "Bail
out!". This not only shows the message to the user, but
instructs the harness to stop running the tests entirely.
This is exactly what we want here, as the problem is in the
command-line options, and every test script would produce
the same error.

The result looks like this (the first "Bailout called" line
is in red if prove uses color on your terminal):

 $ make GIT_TEST_OPTS='--verbose --tee'
 rm -f -r 'test-results'
 *** prove ***
 Bailout called.  Further testing stopped:  verbose mode forbidden under TAP harness; try --verbose-log
 FAILED--Further testing stopped: verbose mode forbidden under TAP harness; try --verbose-log
 Makefile:39: recipe for target 'prove' failed
 make: *** [prove] Error 255

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-24 09:26:00 -07:00
Jonathan Tan
60ef86a162 trailer: support values folded to multiple lines
Currently, interpret-trailers requires that a trailer be only on 1 line.
For example:

a: first line
   second line

would be interpreted as one trailer line followed by one non-trailer line.

Make interpret-trailers support RFC 822-style folding, treating those
lines as one logical trailer.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-21 11:48:35 -07:00
Jonathan Tan
c463a6b280 trailer: forbid leading whitespace in trailers
Currently, interpret-trailers allows leading whitespace in trailer
lines. This leads to false positives, especially for quoted lines or
bullet lists.

Forbid leading whitespace in trailers.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-21 11:48:35 -07:00
Jonathan Tan
146245063e trailer: allow non-trailers in trailer block
Currently, interpret-trailers requires all lines of a trailer block to
be trailers (or comments) - if not it would not identify that block as a
trailer block, and thus create its own trailer block, inserting a blank
line.  For example:

  echo -e "\nSigned-off-by: x\nnot trailer" |
  git interpret-trailers --trailer "c: d"

would result in:

  Signed-off-by: x
  not trailer

  c: d

Relax the definition of a trailer block to require that the trailers (i)
are all trailers, or (ii) contain at least one Git-generated trailer and
consists of at least 25% trailers.

  Signed-off-by: x
  not trailer
  c: d

(i) is the existing functionality. (ii) allows arbitrary lines to be
included in trailer blocks, like those in [1], and still allow
interpret-trailers to be used.

[1]
e7d316a02f

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-21 11:48:35 -07:00
Jeff King
452320f1f5 test-lib: add --verbose-log option
The "--verbose" option redirects output from arbitrary
test commands to stdout. This is useful for examining the
output manually, like:

  ./t5547-push-quarantine.sh -v | less

But it also means that the output is intermingled with the
TAP directives, which can confuse a TAP parser like "prove".
This has always been a potential problem, but became an
issue recently when one test happened to output the word
"ok" on a line by itself, which prove interprets as a test
success:

  $ prove t5547-push-quarantine.sh :: -v
  t5547-push-quarantine.sh .. 1/? To dest.git
   * [new branch]      HEAD -> master
  To dest.git
   ! [remote rejected] reject -> reject (pre-receive hook declined)
  error: failed to push some refs to 'dest.git'
  fatal: git cat-file d08c8eba97f4e683ece08654c7c8d2ba0c03b129: bad file
  t5547-push-quarantine.sh .. Failed -1/4 subtests

  Test Summary Report
  -------------------
  t5547-push-quarantine.sh (Wstat: 0 Tests: 5 Failed: 0)
    Parse errors: Tests out of sequence.  Found (2) but expected (3)
                  Tests out of sequence.  Found (3) but expected (4)
                  Tests out of sequence.  Found (4) but expected (5)
                  Bad plan.  You planned 4 tests but ran 5.
  Files=1, Tests=5,  0 wallclock secs ( 0.01 usr +  0.01 sys =  0.02 CPU)
  Result: FAIL

One answer is "if it hurts, don't do it", but that's not
quite the whole story. The Travis tests use "--verbose
--tee" so that they can get the benefit of prove's parallel
options, along with a verbose log in case there is a
failure. We just need the verbose output to go to the log,
but keep stdout clean.

Getting this right turns out to be surprisingly difficult.
Here's the progression of alternatives I considered:

 1. Add an option to write verbose output to stderr. This is
    hard to capture, though, because we want each test to
    have its own log (because they're all run in parallel
    and the jumbled output would be useless).

 2. Add an option to write verbose output to a file in
    test-results. This works, but the log is missing all of
    the non-verbose output, which gives context.

 3. Like (2), but teach say_color() to additionally output
    to the log. This mostly works, but misses any output
    that happens outside of the say() functions (which isn't
    a lot, but is a potential maintenance headache).

 4. Like (2), but make the log file the same as the "--tee"
    file. That almost works, but now we have two processes
    opening the same file. That gives us two separate
    descriptors, each with their own idea of the current
    position. They'll each start writing at offset 0, and
    overwrite each other's data.

 5. Like (4), but in each case open the file for appending.
    That atomically positions each write at the end of the
    file.

    It's possible we may still get sheared writes between
    the two processes, but this is already the case when
    writing to stdout. It's not a problem in practice
    because the test harness generally waits for snippets to
    finish before writing the TAP output.

    We can ignore buffering issues with tee, because POSIX
    mandates that it does not buffer. Likewise, POSIX
    specifies "tee -a", so it should be available
    everywhere.

This patch implements option (5), which seems to work well
in practice.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-21 09:54:35 -07:00
Jeff King
925bdc928e test-lib: handle TEST_OUTPUT_DIRECTORY with spaces
We are careful in test_done to handle a results directory
with a space in it, but the "--tee" code path does not.
Doing:

  export TEST_OUTPUT_DIRECTORY='/tmp/path with spaces'
  ./t000-init.sh --tee

results in errors. Let's consistently double-quote our path
variables so that this works.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-21 09:54:34 -07:00
Matthieu Moy
8a420edbd2 t9000-addresses: update expected results after fix
e3fdbcc8e1 (parse_mailboxes: accept extra text after <...> address,
2016-10-13) improved our in-house address parser and made it closer to
Mail::Address. As a consequence, some tests comparing it to
Mail::Address now pass, but e3fdbcc8e1 forgot to update the test.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-21 09:47:51 -07:00
Johannes Schindelin
93b3df6f14 sequencer: start error messages consistently with lower case
Quite a few error messages touched by this developer during the work to
speed up rebase -i started with an upper case letter, violating our
current conventions. Instead of sneaking in this fix (and forgetting
quite a few error messages), let's just have one wholesale patch fixing
all of the error messages in the sequencer.

While at it, the funny "error: Error wrapping up..." was changed to a
less funny, but more helpful, "error: failed to finalize...".

Pointed out by Junio Hamano.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-21 09:32:35 -07:00
Jacob Keller
98985c6911 rev-list: use hdr_termination instead of a always using a newline
When adding support for prefixing output of log and other commands using
--line-prefix, commit 660e113ce1 ("graph: add support for
--line-prefix on all graph-aware output", 2016-08-31) accidentally
broke rev-list --header output.

In order to make the output appear with a line-prefix, the flow was
changed to always use the graph subsystem for display. Unfortunately
the graph flow in rev-list did not use info->hdr_termination as it was
assumed that graph output would never need to putput NULs.

Since we now always use the graph code in order to handle the case of
line-prefix, simply replace putchar('\n') with
putchar(info->hdr_termination) which will correct this issue.

Add a test for the --header case to make sure we don't break it in the
future.

Reported-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-20 14:44:37 -07:00
Junio C Hamano
76e368c378 t3700: fix broken test under !SANITY
An "add --chmod=+x" test recently added by 610d55af0f ("add: modify
already added files when --chmod is given", 2016-09-14) used "xfoo3"
as a test file.  The paths xfoo[1-3] were used by earlier tests for
symbolic links but they were expected to have been removed by the
time the execution reached this new test.

The removal with "git reset --hard" however happened in a pair of
earlier tests, both of which are protected by POSIXPERM,SANITY
prerequisites.  Platforms and test environments that lacked these
would have seen xfoo3 as a leftover symbolic link that points at
somewhere else at this point of the sequence, and the chmod test
would have given a wrong result.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-20 09:19:05 -07:00
Vasco Almeida
87cb7845fe i18n: convert mark error messages for translation
Mark error messages about CRLF for translation.

Update test to reflect changes.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17 14:51:45 -07:00
Vasco Almeida
f25dfb5e8d i18n: apply: mark error message for translation
Update test to reflect changes.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17 14:51:42 -07:00
Pranit Bauva
86009f32bb t0040: convert all possible tests to use test-parse-options --expect
Use "test-parse-options --expect" to rewrite the tests to avoid checking
the whole variable dump by just testing what is required.

This commit is a follow-up to 8ca65aebad ("t0040: convert a few
tests to use test-parse-options --expect", 2016-05-06).

Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17 14:45:01 -07:00
Junio C Hamano
5b4c45af0c Merge branch 'da/mergetool-diff-order'
"git mergetool" learned to honor "-O<orderfile>" to control the
order of paths to present to the end user.

* da/mergetool-diff-order:
  mergetool: honor -O<orderfile>
  mergetool: honor diff.orderFile
  mergetool: move main program flow into a main() function
  mergetool: add copyright
2016-10-17 13:25:21 -07:00
Junio C Hamano
f7300cbfdd Merge branch 'jk/ref-symlink-loop'
A stray symbolic link in $GIT_DIR/refs/ directory could make name
resolution loop forever, which has been corrected.

* jk/ref-symlink-loop:
  files_read_raw_ref: prevent infinite retry loops in general
  files_read_raw_ref: avoid infinite loop on broken symlinks
2016-10-17 13:25:20 -07:00
Junio C Hamano
25ab004c53 Merge branch 'jk/quarantine-received-objects'
In order for the receiving end of "git push" to inspect the
received history and decide to reject the push, the objects sent
from the sending end need to be made available to the hook and
the mechanism for the connectivity check, and this was done
traditionally by storing the objects in the receiving repository
and letting "git gc" to expire it.  Instead, store the newly
received objects in a temporary area, and make them available by
reusing the alternate object store mechanism to them only while we
decide if we accept the check, and once we decide, either migrate
them to the repository or purge them immediately.

* jk/quarantine-received-objects:
  tmp-objdir: do not migrate files starting with '.'
  tmp-objdir: put quarantine information in the environment
  receive-pack: quarantine objects until pre-receive accepts
  tmp-objdir: introduce API for temporary object directories
  check_connected: accept an env argument
2016-10-17 13:25:20 -07:00
Junio C Hamano
dec040192f Merge branch 'jk/alt-odb-cleanup'
Codepaths involved in interacting alternate object store have
been cleaned up.

* jk/alt-odb-cleanup:
  alternates: use fspathcmp to detect duplicates
  sha1_file: always allow relative paths to alternates
  count-objects: report alternates via verbose mode
  fill_sha1_file: write into a strbuf
  alternates: store scratch buffer as strbuf
  fill_sha1_file: write "boring" characters
  alternates: use a separate scratch space
  alternates: encapsulate alt->base munging
  alternates: provide helper for allocating alternate
  alternates: provide helper for adding to alternates list
  link_alt_odb_entry: refactor string handling
  link_alt_odb_entry: handle normalize_path errors
  t5613: clarify "too deep" recursion tests
  t5613: do not chdir in main process
  t5613: whitespace/style cleanups
  t5613: use test_must_fail
  t5613: drop test_valid_repo function
  t5613: drop reachable_via function
2016-10-17 13:25:20 -07:00
Lars Schneider
edcc85814c convert: add filter.<driver>.process option
Git's clean/smudge mechanism invokes an external filter process for
every single blob that is affected by a filter. If Git filters a lot of
blobs then the startup time of the external filter processes can become
a significant part of the overall Git execution time.

In a preliminary performance test this developer used a clean/smudge
filter written in golang to filter 12,000 files. This process took 364s
with the existing filter mechanism and 5s with the new mechanism. See
details here: https://github.com/github/git-lfs/pull/1382

This patch adds the `filter.<driver>.process` string option which, if
used, keeps the external filter process running and processes all blobs
with the packet format (pkt-line) based protocol over standard input and
standard output. The full protocol is explained in detail in
`Documentation/gitattributes.txt`.

A few key decisions:

* The long running filter process is referred to as filter protocol
  version 2 because the existing single shot filter invocation is
  considered version 1.
* Git sends a welcome message and expects a response right after the
  external filter process has started. This ensures that Git will not
  hang if a version 1 filter is incorrectly used with the
  filter.<driver>.process option for version 2 filters. In addition,
  Git can detect this kind of error and warn the user.
* The status of a filter operation (e.g. "success" or "error) is set
  before the actual response and (if necessary!) re-set after the
  response. The advantage of this two step status response is that if
  the filter detects an error early, then the filter can communicate
  this and Git does not even need to create structures to read the
  response.
* All status responses are pkt-line lists terminated with a flush
  packet. This allows us to send other status fields with the same
  protocol in the future.

Helped-by: Martin-Louis Bright <mlbright@gmail.com>
Reviewed-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17 11:45:52 -07:00
Lars Schneider
ed54970324 convert: modernize tests
Use `test_config` to set the config, check that files are empty with
`test_must_be_empty`, compare files with `test_cmp`, and remove spaces
after ">" and "<".

Please note that the "rot13" filter configured in "setup" keeps using
`git config` instead of `test_config` because subsequent tests might
depend on it.

Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17 11:36:49 -07:00
Jeff King
5827a03545 fetch: use "quick" has_sha1_file for tag following
When we auto-follow tags in a fetch, we look at all of the
tags advertised by the remote and fetch ones where we don't
already have the tag, but we do have the object it peels to.
This involves a lot of calls to has_sha1_file(), some of
which we can reasonably expect to fail. Since 45e8a74
(has_sha1_file: re-check pack directory before giving up,
2013-08-30), this may cause many calls to
reprepare_packed_git(), which is potentially expensive.

This has gone unnoticed for several years because it
requires a fairly unique setup to matter:

  1. You need to have a lot of packs on the client side to
     make reprepare_packed_git() expensive (the most
     expensive part is finding duplicates in an unsorted
     list, which is currently quadratic).

  2. You need a large number of tag refs on the server side
     that are candidates for auto-following (i.e., that the
     client doesn't have). Each one triggers a re-read of
     the pack directory.

  3. Under normal circumstances, the client would
     auto-follow those tags and after one large fetch, (2)
     would no longer be true. But if those tags point to
     history which is disconnected from what the client
     otherwise fetches, then it will never auto-follow, and
     those candidates will impact it on every fetch.

So when all three are true, each fetch pays an extra
O(nr_tags * nr_packs^2) cost, mostly in string comparisons
on the pack names. This was exacerbated by 47bf4b0
(prepare_packed_git_one: refactor duplicate-pack check,
2014-06-30) which uses a slightly more expensive string
check, under the assumption that the duplicate check doesn't
happen very often (and it shouldn't; the real problem here
is how often we are calling reprepare_packed_git()).

This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice
accuracy for speed, in cases where we might be racy with a
simultaneous repack. This is similar to the fix in 0eeb077
(index-pack: avoid excessive re-reading of pack directory,
2015-06-09). As with that case, it's OK for has_sha1_file()
occasionally say "no I don't have it" when we do, because
the worst case is not a corruption, but simply that we may
fail to auto-follow a tag that points to it.

Here are results from the included perf script, which sets
up a situation similar to the one described above:

Test            HEAD^               HEAD
----------------------------------------------------------
5550.4: fetch   11.21(10.42+0.78)   0.08(0.04+0.02) -99.3%

Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-14 11:31:32 -07:00
Matthieu Moy
e3fdbcc8e1 parse_mailboxes: accept extra text after <...> address
The test introduced in this commit succeeds without the patch to Git.pm
if Mail::Address is installed, but fails otherwise because our in-house
parser does not accept any text after the email address. They succeed
both with and without Mail::Address after this commit.

Mail::Address accepts extra text and considers it as part of the name,
iff the address is surrounded with <...>. The implementation mimics
this behavior as closely as possible.

This mostly restores the behavior we had before b1c8a11 (send-email:
allow multiple emails using --cc, --to and --bcc, 2015-06-30), but we
keep the possibility to handle comma-separated lists.

Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-14 10:06:09 -07:00
Dennis Kaarsemaker
171c646f8c worktree: allow the main brach of a bare repository to be checked out
In bare repositories, get_worktrees() still returns the main repository,
so git worktree list can show it. ignore it in find_shared_symref so we
can still check out the main branch.

Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-14 09:58:58 -07:00
Jeff King
4f21454b55 merge-base: handle --fork-point without reflog
The --fork-point option looks in the reflog to try to find
where a derived branch forked from a base branch. However,
if the reflog for the base branch is totally empty (as it
commonly is right after cloning, which does not write a
reflog entry), then our for_each_reflog call will not find
any entries, and we will come up with no merge base, even
though there may be one with the current tip of the base.

We can fix this by just adding the current tip to
our list of collected entries.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-12 14:30:16 -07:00
Michael J Gruber
661a180681 gpg-interface: use more status letters
According to gpg2's doc/DETAILS:

    For each signature only one of the codes GOODSIG, BADSIG,
    EXPSIG, EXPKEYSIG, REVKEYSIG or ERRSIG will be emitted.

gpg1 ("classic") behaves the same (although doc/DETAILS differs).

Currently, we parse gpg's status output for GOODSIG, BADSIG and
trust information and translate that into status codes G, B, U, N
for the %G?  format specifier.

git-verify-* returns success in the GOODSIG case only. This is
somewhat in disagreement with gpg, which considers the first 5 of
the 6 above as VALIDSIG, but we err on the very safe side.

Introduce additional status codes E, X, Y, R for ERRSIG, EXPSIG,
EXPKEYSIG, and REVKEYSIG so that a user of %G? gets more information
about the absence of a 'G' on first glance.

Requested-by: Alex <agrambot@gmail.com>
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-12 10:41:59 -07:00
Vasco Almeida
3bb464437e t1512: become resilient to GETTEXT_POISON build
The concerned message was marked for translation by 0c99171
("get_short_sha1: mark ambiguity error for translation", 2016-09-26).

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-12 10:39:01 -07:00
Junio C Hamano
e1eb84cccb Merge branch 'kd/mailinfo-quoted-string' into maint
An author name, that spelled a backslash-quoted double quote in the
human readable part "My \"double quoted\" name", was not unquoted
correctly while applying a patch from a piece of e-mail.

* kd/mailinfo-quoted-string:
  mailinfo: unescape quoted-pair in header fields
  t5100-mailinfo: replace common path prefix with variable
2016-10-11 14:20:32 -07:00
David Aguilar
654311bf6e mergetool: honor -O<orderfile>
Teach mergetool to pass "-O<orderfile>" down to `git diff` when
specified on the command-line.

Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: David Aguilar <davvid@gmail.com>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-11 10:04:31 -07:00
David Aguilar
57937f70a0 mergetool: honor diff.orderFile
Teach mergetool to get the list of files to edit via `diff` so that we
gain support for diff.orderFile.

Suggested-by: Luis Gutierrez <luisgutz@gmail.com>
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: David Aguilar <davvid@gmail.com>
Reviewed-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-11 10:04:27 -07:00
Junio C Hamano
a460ea4a3c Merge branch 'nd/shallow-deepen'
The existing "git fetch --depth=<n>" option was hard to use
correctly when making the history of an existing shallow clone
deeper.  A new option, "--deepen=<n>", has been added to make this
easier to use.  "git clone" also learned "--shallow-since=<date>"
and "--shallow-exclude=<tag>" options to make it easier to specify
"I am interested only in the recent N months worth of history" and
"Give me only the history since that version".

* nd/shallow-deepen: (27 commits)
  fetch, upload-pack: --deepen=N extends shallow boundary by N commits
  upload-pack: add get_reachable_list()
  upload-pack: split check_unreachable() in two, prep for get_reachable_list()
  t5500, t5539: tests for shallow depth excluding a ref
  clone: define shallow clone boundary with --shallow-exclude
  fetch: define shallow boundary with --shallow-exclude
  upload-pack: support define shallow boundary by excluding revisions
  refs: add expand_ref()
  t5500, t5539: tests for shallow depth since a specific date
  clone: define shallow clone boundary based on time with --shallow-since
  fetch: define shallow boundary with --shallow-since
  upload-pack: add deepen-since to cut shallow repos based on time
  shallow.c: implement a generic shallow boundary finder based on rev-list
  fetch-pack: use a separate flag for fetch in deepening mode
  fetch-pack.c: mark strings for translating
  fetch-pack: use a common function for verbose printing
  fetch-pack: use skip_prefix() instead of starts_with()
  upload-pack: move rev-list code out of check_non_tip()
  upload-pack: make check_non_tip() clean things up on error
  upload-pack: tighten number parsing at "deepen" lines
  ...
2016-10-10 14:03:50 -07:00
Junio C Hamano
e6e24c94df Merge branch 'jk/pack-objects-optim-mru'
"git pack-objects" in a repository with many packfiles used to
spend a lot of time looking for/at objects in them; the accesses to
the packfiles are now optimized by checking the most-recently-used
packfile first.

* jk/pack-objects-optim-mru:
  pack-objects: use mru list when iterating over packs
  pack-objects: break delta cycles before delta-search phase
  sha1_file: make packed_object_info public
  provide an initializer for "struct object_info"
2016-10-10 14:03:47 -07:00
Junio C Hamano
b8688adb12 Merge branch 'rs/qsort'
We call "qsort(array, nelem, sizeof(array[0]), fn)", and most of
the time third parameter is redundant.  A new QSORT() macro lets us
omit it.

* rs/qsort:
  show-branch: use QSORT
  use QSORT, part 2
  coccicheck: use --all-includes by default
  remove unnecessary check before QSORT
  use QSORT
  add QSORT
2016-10-10 14:03:46 -07:00
Jeff King
722ff7f876 receive-pack: quarantine objects until pre-receive accepts
When a client pushes objects to us, index-pack checks the
objects themselves and then installs them into place. If we
then reject the push due to a pre-receive hook, we cannot
just delete the packfile; other processes may be depending
on it. We have to do a normal reachability check at this
point via `git gc`.

But such objects may hang around for weeks due to the
gc.pruneExpire grace period. And worse, during that time
they may be exploded from the pack into inefficient loose
objects.

Instead, this patch teaches receive-pack to put the new
objects into a "quarantine" temporary directory. We make
these objects available to the connectivity check and to the
pre-receive hook, and then install them into place only if
it is successful (and otherwise remove them as tempfiles).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 13:54:02 -07:00
Jeff King
ea0fc3b417 alternates: use fspathcmp to detect duplicates
On a case-insensitive filesystem, we should realize that
"a/objects" and "A/objects" are the same path. We already
use fspathcmp() to check against the main object directory,
but until recently we couldn't use it for comparing against
other alternates (because their paths were not
NUL-terminated strings). But now we can, so let's do so.

Note that we also need to adjust count-objects to load the
config, so that it can see the setting of core.ignorecase
(this is required by the test, but is also a general bugfix
for users of count-objects).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 13:52:37 -07:00
Jeff King
087b6d5840 sha1_file: always allow relative paths to alternates
We recursively expand alternates repositories, so that if A
borrows from B which borrows from C, A can see all objects.

For the root object database, we allow relative paths, so A
can point to B as "../B/objects". However, we currently do
not allow relative paths when recursing, so B must use an
absolute path to reach C.

That is an ancient protection from c2f493a (Transitively
read alternatives, 2006-05-07) that tries to avoid adding
the same alternate through two different paths. Since
5bdf0a8 (sha1_file: normalize alt_odb path before comparing
and storing, 2011-09-07), we use a normalized absolute path
for each alt_odb entry.

This means that in most cases the protection is no longer
necessary; we will detect the duplicate no matter how we got
there (but see below).  And it's a good idea to get rid of
it, as it creates an unnecessary complication when setting
up recursive alternates (B has to know that A is going to
borrow from it and make sure to use an absolute path).

Note that our normalization doesn't actually look at the
filesystem, so it can still be fooled by crossing symbolic
links. But that's also true of absolute paths, so it's not a
good reason to disallow only relative paths (it's
potentially a reason to switch to real_path(), but that's a
separate and non-trivial change).

We adjust the test script here to demonstrate that this now
works, and add new tests to show that the normalization does
indeed suppress duplicates.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 13:52:37 -07:00
Jeff King
5fe849d651 count-objects: report alternates via verbose mode
There's no way to get the list of alternates that git
computes internally; our tests only infer it based on which
objects are available. In addition to testing, knowing this
list may be helpful for somebody debugging their alternates
setup.

Let's add it to the "count-objects -v" output. We could give
it a separate flag, but there's not really any need.
"count-objects -v" is already a debugging catch-all for the
object database, its output is easily extensible to new data
items, and printing the alternates is not expensive (we
already had to find them to count the objects).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 13:52:37 -07:00
Jeff King
f0f86fa9f3 t5613: clarify "too deep" recursion tests
These tests are just trying to show that we allow recursion
up to a certain depth, but not past it. But the counting is
a bit non-intuitive, and rather than test at the edge of the
breakage, we test "OK" cases in the middle of the chain.
Let's explain what's going on, and explicitly test the
switch between "OK" and "too deep".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 13:52:36 -07:00
Stefan Beller
3389e78ec8 submodule: ignore trailing slash in relative url
This is similar to the previous patch, though no user reported a bug and
I could not find a regressive behavior.

However it is a good thing to be strict on the output and for that we
always omit a trailing slash.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 13:30:31 -07:00
Stefan Beller
087885049e submodule: ignore trailing slash on superproject URL
Before 63e95beb0 (2016-04-15, submodule: port resolve_relative_url from
shell to C), it did not matter if the superprojects URL had a trailing
slash or not. It was just chopped off as one of the first steps
(The "remoteurl=${remoteurl%/}" near the beginning of
resolve_relative_url(), which was removed in said commit).

When porting this to the C version, an off-by-one error was introduced
and we did not check the actual last character to be a slash, but the
NULL delimiter.

Reintroduce the behavior from before 63e95beb0, to ignore the trailing
slash.

Reported-by: <venv21@gmail.com>
Helped-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 13:30:28 -07:00
Brandon Williams
75a6315f74 ls-files: add pathspec matching for submodules
Pathspecs can be a bit tricky when trying to apply them to submodules.
The main challenge is that the pathspecs will be with respect to the
superproject and not with respect to paths in the submodule.  The
approach this patch takes is to pass in the identical pathspec from the
superproject to the submodule in addition to the submodule-prefix, which
is the path from the root of the superproject to the submodule, and then
we can compare an entry in the submodule prepended with the
submodule-prefix to the pathspec in order to determine if there is a
match.

This patch also permits the pathspec logic to perform a prefix match against
submodules since a pathspec could refer to a file inside of a submodule.
Due to limitations in the wildmatch logic, a prefix match is only done
literally.  If any wildcard character is encountered we'll simply punt
and produce a false positive match.  More accurate matching will be done
once inside the submodule.  This is due to the superproject not knowing
what files could exist in the submodule.

Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 12:14:58 -07:00
Brandon Williams
07c01b9fd9 ls-files: pass through safe options for --recurse-submodules
Pass through some known-safe options when recursing into submodules.
(--cached, -v, -t, -z, --debug, --eol)

Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 12:14:58 -07:00
Brandon Williams
e77aa336f1 ls-files: optionally recurse into submodules
Allow ls-files to recognize submodules in order to retrieve a list of
files from a repository's submodules.  This is done by forking off a
process to recursively call ls-files on all submodules. Use top-level
--super-prefix option to pass a path to the submodule which it can
use to prepend to output or pathspec matching logic.

Signed-off-by: Brandon Williams <bmwill@google.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 12:14:58 -07:00
Jeff King
3f7bd767ed files_read_raw_ref: avoid infinite loop on broken symlinks
Our ref resolution first runs lstat() on any path we try to
look up, because we want to treat symlinks specially (by
resolving them manually and considering them symrefs). But
if the results of `readlink` do _not_ look like a ref, we
fall through to treating it like a normal file, and just
read the contents of the linked path.

Since fcb7c76 (resolve_ref_unsafe(): close race condition
reading loose refs, 2013-06-19), that "normal file" code
path will stat() the file and if we see ENOENT, will jump
back to the lstat(), thinking we've seen inconsistent
results between the two calls. But for a symbolic ref, this
isn't a race: the lstat() found the symlink, and the stat()
is looking at the path it points to. We end up in an
infinite loop calling lstat() and stat().

We can fix this by avoiding the retry-on-inconsistent jump
when we know that we found a symlink. While we're at it,
let's add a comment explaining why the symlink case gets to
this code in the first place; without that, it is not
obvious that the correct solution isn't to avoid the stat()
code path entirely.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10 10:53:16 -07:00
Junio C Hamano
578e6021c0 Merge branch 'rs/c-auto-resets-attributes'
When "%C(auto)" appears at the very beginning of the pretty format
string, it did not need to issue the reset sequence, but it did.

* rs/c-auto-resets-attributes:
  pretty: avoid adding reset for %C(auto) if output is empty
2016-10-06 14:53:12 -07:00
Junio C Hamano
8c98a68981 Merge branch 'vn/revision-shorthand-for-side-branch-log'
"git log rev^..rev" is an often-used revision range specification
to show what was done on a side branch merged at rev.  This has
gained a short-hand "rev^-1".  In general "rev^-$n" is the same as
"^rev^$n rev", i.e. what has happened on other branches while the
history leading to nth parent was looking the other way.

* vn/revision-shorthand-for-side-branch-log:
  revision: new rev^-n shorthand for rev^n..rev
2016-10-06 14:53:10 -07:00
Junio C Hamano
66c22ba6fb Merge branch 'jk/ambiguous-short-object-names'
When given an abbreviated object name that is not (or more
realistically, "no longer") unique, we gave a fatal error
"ambiguous argument".  This error is now accompanied by hints that
lists the objects that begins with the given prefix.  During the
course of development of this new feature, numerous minor bugs were
uncovered and corrected, the most notable one of which is that we
gave "short SHA1 xxxx is ambiguous." twice without good reason.

* jk/ambiguous-short-object-names:
  get_short_sha1: make default disambiguation configurable
  get_short_sha1: list ambiguous objects on error
  for_each_abbrev: drop duplicate objects
  sha1_array: let callbacks interrupt iteration
  get_short_sha1: mark ambiguity error for translation
  get_short_sha1: NUL-terminate hex prefix
  get_short_sha1: refactor init of disambiguation code
  get_short_sha1: parse tags when looking for treeish
  get_sha1: propagate flags to child functions
  get_sha1: avoid repeating ourselves via ONLY_TO_DIE
  get_sha1: detect buggy calls with multiple disambiguators
2016-10-06 14:53:10 -07:00
Junio C Hamano
a17505f262 diff: introduce diff.wsErrorHighlight option
With the preparatory steps, it has become trivial to teach the
system a new diff.wsErrorHighlight configuration that gives the
default value for --ws-error-highlight command line option.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-04 15:49:05 -07:00
Junio C Hamano
f3f5c7f520 t4015: split out the "setup" part of ws-error-highlight test
We'd want to run this same set of test twice, once with the option
and another time with an equivalent configuration setting.  Split
out the step that prepares the test data and expected output and
move the test for the command line option into a separate test.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-04 15:49:05 -07:00
Jeff King
8f2ed26753 t5613: do not chdir in main process
Our usual style when working with subdirectories is to chdir
inside a subshell or to use "git -C", which means we do not
have to constantly return to the main test directory. Let's
convert this old test, which does not follow that style.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-03 14:25:43 -07:00
Jeff King
128a348d04 t5613: whitespace/style cleanups
Our normal test style these days puts the opening quote of
the body on the description line, and indents the body with
a single tab. This ancient test did not follow this.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-03 14:25:43 -07:00
Jeff King
195eb46546 t5613: use test_must_fail
Besides being our normal style, this correctly checks for an
error exit() versus signal death.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-03 14:25:43 -07:00
Jeff King
b28a88f26a t5613: drop test_valid_repo function
This function makes sure that "git fsck" does not report any
errors. But "--full" has been the default since f29cd39
(fsck: default to "git fsck --full", 2009-10-20), and we can
use the exit code (instead of counting the lines) since
e2b4f63 (fsck: exit with non-zero status upon errors,
2007-03-05).

So we can just use "git fsck", which is shorter and more
flexible (e.g., we can use "git -C").

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-03 14:25:43 -07:00
Jeff King
32d8b4226a t5613: drop reachable_via function
This function was never used since its inception in dd05ea1
(test case for transitive info/alternates, 2006-05-07).
Which is just as well, since it mutates the repo state in a
way that would invalidate further tests, without cleaning up
after itself. Let's get rid of it so that nobody is tempted
to use it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-03 14:25:43 -07:00
Junio C Hamano
4e34e20c9f Merge branch 'dt/tree-fsck'
The codepath in "git fsck" to detect malformed tree objects has
been updated not to die but keep going after detecting them.

* dt/tree-fsck:
  fsck: handle bad trees like other errors
  tree-walk: be more specific about corrupt tree errors
2016-10-03 13:30:38 -07:00
Junio C Hamano
fe252ef81a Merge branch 'kd/mailinfo-quoted-string'
An author name, that spelled a backslash-quoted double quote in the
human readable part "My \"double quoted\" name", was not unquoted
correctly while applying a patch from a piece of e-mail.

* kd/mailinfo-quoted-string:
  mailinfo: unescape quoted-pair in header fields
  t5100-mailinfo: replace common path prefix with variable
2016-10-03 13:30:38 -07:00
Junio C Hamano
53eb85e623 Merge branch 'nd/init-core-worktree-in-multi-worktree-world'
"git init" tried to record core.worktree in the repository's
'config' file when GIT_WORK_TREE environment variable was set and
it was different from where GIT_DIR appears as ".git" at its top,
but the logic was faulty when .git is a "gitdir:" file that points
at the real place, causing trouble in working trees that are
managed by "git worktree".  This has been corrected.

* nd/init-core-worktree-in-multi-worktree-world:
  init: kill git_link variable
  init: do not set unnecessary core.worktree
  init: kill set_git_dir_init()
  init: call set_git_dir_init() from within init_db()
  init: correct re-initialization from a linked worktree
2016-10-03 13:30:35 -07:00
Junio C Hamano
347408496a Merge branch 'ik/gitweb-force-highlight'
"gitweb" can spawn "highlight" to show blob contents with
(programming) language-specific syntax highlighting, but only
when the language is known.  "highlight" can however be told
to make the guess itself by giving it "--force" option, which
has been enabled.

* ik/gitweb-force-highlight:
  gitweb: use highlight's shebang detection
  gitweb: remove unused guess_file_syntax() parameter
2016-10-03 13:30:34 -07:00
Junio C Hamano
f4315eed7f Merge branch 'jk/pack-tag-of-tag' into maint
"git pack-objects --include-tag" was taught that when we know that
we are sending an object C, we want a tag B that directly points at
C but also a tag A that points at the tag B.  We used to miss the
intermediate tag B in some cases.

* jk/pack-tag-of-tag:
  pack-objects: walk tag chains for --include-tag
  t5305: simplify packname handling
  t5305: use "git -C"
  t5305: drop "dry-run" of unpack-objects
  t5305: move cleanup into test block
2016-10-03 13:22:13 -07:00
René Scharfe
82b83da8d3 pretty: avoid adding reset for %C(auto) if output is empty
We emit an escape sequence for resetting color and attribute for
%C(auto) to make sure automatic coloring is displayed as intended.
Stop doing that if the output strbuf is empty, i.e. when %C(auto)
appears at the start of the format string, because then there is no
need for a reset and we save a few bytes in the output.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-29 20:44:09 -07:00
Junio C Hamano
4cff50b3fb Merge branch 'jt/mailinfo-fold-in-body-headers'
When "git format-patch --stdout" output is placed as an in-body
header and it uses the RFC2822 header folding, "git am" failed to
put the header line back into a single logical line.  The
underlying "git mailinfo" was taught to handle this properly.

* jt/mailinfo-fold-in-body-headers:
  mailinfo: handle in-body header continuations
  mailinfo: make is_scissors_line take plain char *
  mailinfo: separate in-body header processing
2016-09-29 16:57:12 -07:00
Junio C Hamano
36f64036f6 Merge branch 'tg/add-chmod+x-fix' into maint
"git add --chmod=+x <pathspec>" added recently only toggled the
executable bit for paths that are either new or modified. This has
been corrected to flip the executable bit for all paths that match
the given pathspec.

* tg/add-chmod+x-fix:
  t3700-add: do not check working tree file mode without POSIXPERM
  t3700-add: create subdirectory gently
  add: modify already added files when --chmod is given
  read-cache: introduce chmod_index_entry
  update-index: add test for chmod flags
2016-09-29 16:49:47 -07:00
Junio C Hamano
cec5f0bf80 Merge branch 'rt/rebase-i-broken-insn-advise' into maint
When "git rebase -i" is given a broken instruction, it told the
user to fix it with "--edit-todo", but didn't say what the step
after that was (i.e. "--continue").

* rt/rebase-i-broken-insn-advise:
  rebase -i: improve advice on bad instruction lines
2016-09-29 16:49:46 -07:00
Junio C Hamano
300e95f7df Merge branch 'js/regexec-buf' into maint
Some codepaths in "git diff" used regexec(3) on a buffer that was
mmap(2)ed, which may not have a terminating NUL, leading to a read
beyond the end of the mapped region.  This was fixed by introducing
a regexec_buf() helper that takes a <ptr,len> pair with REG_STARTEND
extension.

* js/regexec-buf:
  regex: use regexec_buf()
  regex: add regexec_buf() that can work on a non NUL-terminated string
  regex: -G<pattern> feeds a non NUL-terminated string to regexec() and fails
2016-09-29 16:49:45 -07:00
Junio C Hamano
d336b67568 Merge branch 'nd/checkout-disambiguation' into maint
"git checkout <word>" does not follow the usual disambiguation
rules when the <word> can be both a rev and a path, to allow
checking out a branch 'foo' in a project that happens to have a
file 'foo' in the working tree without having to disambiguate.
This was poorly documented and the check was incorrect when the
command was run from a subdirectory.

* nd/checkout-disambiguation:
  checkout: fix ambiguity check in subdir
  checkout.txt: document a common case that ignores ambiguation rules
  checkout: add some spaces between code and comment
2016-09-29 16:49:44 -07:00
Junio C Hamano
e25e6f3947 Merge branch 'jk/rebase-i-drop-ident-check' into maint
Even when "git pull --rebase=preserve" (and the underlying "git
rebase --preserve") can complete without creating any new commit
(i.e. fast-forwards), it still insisted on having a usable ident
information (read: user.email is set correctly), which was less
than nice.  As the underlying commands used inside "git rebase"
would fail with a more meaningful error message and advice text
when the bogus ident matters, this extra check was removed.

* jk/rebase-i-drop-ident-check:
  rebase-interactive: drop early check for valid ident
2016-09-29 16:49:41 -07:00
Junio C Hamano
7b7e977b96 Merge branch 'jt/format-patch-base-info-above-sig' into maint
"git format-patch --base=..." feature that was recently added
showed the base commit information after "-- " e-mail signature
line, which turned out to be inconvenient.  The base information
has been moved above the signature line.

* jt/format-patch-base-info-above-sig:
  format-patch: show base info before email signature
2016-09-29 16:49:40 -07:00
Junio C Hamano
08d0f7a531 Merge branch 'ks/perf-build-with-autoconf' into maint
Performance tests done via "t/perf" did not use the same set of
build configuration if the user relied on autoconf generated
configuration.

* ks/perf-build-with-autoconf:
  t/perf/run: copy config.mak.autogen & friends to build area
2016-09-29 16:49:40 -07:00
Junio C Hamano
ef4f0cad4b Merge branch 'rs/xdiff-merge-overlapping-hunks-for-W-context' into maint
"git diff -W" output needs to extend the context backward to
include the header line of the current function and also forward to
include the body of the entire current function up to the header
line of the next one.  This process may have to merge to adjacent
hunks, but the code forgot to do so in some cases.

* rs/xdiff-merge-overlapping-hunks-for-W-context:
  xdiff: fix merging of hunks with -W context and -u context
2016-09-29 16:49:39 -07:00
Junio C Hamano
35ec7fd479 Merge branch 'jk/fix-remote-curl-url-wo-proto' into maint
"git fetch http::/site/path" did not die correctly and segfaulted
instead.

* jk/fix-remote-curl-url-wo-proto:
  remote-curl: handle URLs without protocol
2016-09-29 16:49:38 -07:00
René Scharfe
9ed0d8d6e6 use QSORT
Apply the semantic patch contrib/coccinelle/qsort.cocci to the code
base, replacing calls of qsort(3) with QSORT.  The resulting code is
shorter and supports empty arrays with NULL pointers.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-29 15:42:18 -07:00
Kevin Daudt
f357e5de31 mailinfo: unescape quoted-pair in header fields
rfc2822 has provisions for quoted strings in structured header fields,
but also allows for escaping these with so-called quoted-pairs.

The only thing git currently does is removing exterior quotes, but
quotes within are left alone.

Remove exterior quotes and remove escape characters so that they don't
show up in the author field.

Signed-off-by: Kevin Daudt <me@ikke.info>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-28 13:21:18 -07:00
Kevin Daudt
ee4d679f57 t5100-mailinfo: replace common path prefix with variable
Many tests need to store data in a file, and repeat the same pattern to
refer to that path:

    "$TEST_DIRECTORY"/t5100/

Create a variable that contains this path, and use that instead.

While we're making this change, make sure the quotes are not just around
the variable, but around the entire string to not give the impression
we want shell splitting to affect the other variables.

Signed-off-by: Kevin Daudt <me@ikke.info>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-28 13:16:59 -07:00
David Turner
8354fa3d4c fsck: handle bad trees like other errors
Instead of dying when fsck hits a malformed tree object, log the error
like any other and continue.  Now fsck can tell the user which tree is
bad, too.

Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-27 14:09:10 -07:00
Jeff King
2edffef233 tree-walk: be more specific about corrupt tree errors
When the tree-walker runs into an error, it just calls
die(), and the message is always "corrupt tree file".
However, we are actually covering several cases here; let's
give the user a hint about what happened.

Let's also avoid using the word "corrupt", which makes it
seem like the data bit-rotted on disk. Our sha1 check would
already have found that. These errors are ones of data that
is malformed in the first place.

Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-27 14:08:30 -07:00
Vegard Nossum
8779351dd7 revision: new rev^-n shorthand for rev^n..rev
"git log rev^..rev" is commonly used to show all work done on and merged
from a side branch. This patch introduces a shorthand "rev^-" for this
and additionally allows "rev^-$n" to mean "reachable from rev, excluding
what is reachable from the nth parent of rev". For example, for a
two-parent merge, you can use rev^-2 to get the set of commits which were
made to the main branch while the topic branch was prepared.

Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-27 10:59:28 -07:00
Jeff King
5b33cb1fd7 get_short_sha1: make default disambiguation configurable
When we find ambiguous short sha1s, we may get a
disambiguation rule from our caller's context. But if we
don't, we fall back to treating all sha1s the same, even
though most projects will tend to refer only to commits by
their short sha1s.

This patch introduces a configuration option that lets the
user pick a different fallback (e.g., only commits). It's
possible that we may want to make this the default, but it's
a good idea to start as a config option for two reasons:

  1. It lets people experiment with this and see if it's a
     good idea (i.e., the "tend to" above is an assumption;
     we don't really know if this will break some obscure
     cases).

  2. Even if we do flip the default, it gives people an
     escape hatch if it causes problems (you can sometimes
     override it by asking for "1234^{tree}", but not all
     combinations are possible).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-27 10:29:56 -07:00
Junio C Hamano
104a93a329 Merge branch 'rt/rebase-i-broken-insn-advise'
When "git rebase -i" is given a broken instruction, it told the
user to fix it with "--edit-todo", but didn't say what the step
after that was (i.e. "--continue").

* rt/rebase-i-broken-insn-advise:
  rebase -i: improve advice on bad instruction lines
2016-09-26 16:09:21 -07:00
Junio C Hamano
ebc63580a1 Merge branch 'tg/add-chmod+x-fix'
"git add --chmod=+x <pathspec>" added recently only toggled the
executable bit for paths that are either new or modified. This has
been corrected to flip the executable bit for all paths that match
the given pathspec.

* tg/add-chmod+x-fix:
  t3700-add: do not check working tree file mode without POSIXPERM
  t3700-add: create subdirectory gently
  add: modify already added files when --chmod is given
  read-cache: introduce chmod_index_entry
  update-index: add test for chmod flags
2016-09-26 16:09:20 -07:00
Junio C Hamano
6a67695268 Merge branch 'js/regexec-buf'
Some codepaths in "git diff" used regexec(3) on a buffer that was
mmap(2)ed, which may not have a terminating NUL, leading to a read
beyond the end of the mapped region.  This was fixed by introducing
a regexec_buf() helper that takes a <ptr,len> pair with REG_STARTEND
extension.

* js/regexec-buf:
  regex: use regexec_buf()
  regex: add regexec_buf() that can work on a non NUL-terminated string
  regex: -G<pattern> feeds a non NUL-terminated string to regexec() and fails
2016-09-26 16:09:19 -07:00
Junio C Hamano
31b83f361b Merge branch 'nd/checkout-disambiguation'
"git checkout <word>" does not follow the usual disambiguation
rules when the <word> can be both a rev and a path, to allow
checking out a branch 'foo' in a project that happens to have a
file 'foo' in the working tree without having to disambiguate.
This was poorly documented and the check was incorrect when the
command was run from a subdirectory.

* nd/checkout-disambiguation:
  checkout: fix ambiguity check in subdir
  checkout.txt: document a common case that ignores ambiguation rules
  checkout: add some spaces between code and comment
2016-09-26 16:09:18 -07:00
Junio C Hamano
8969feac7e Merge branch 'va/i18n-more'
Even more i18n.

* va/i18n-more:
  i18n: stash: mark messages for translation
  i18n: notes-merge: mark die messages for translation
  i18n: ident: mark hint for translation
  i18n: i18n: diff: mark die messages for translation
  i18n: connect: mark die messages for translation
  i18n: commit: mark message for translation
2016-09-26 16:09:18 -07:00
Junio C Hamano
e447d3182c Merge branch 'jt/format-patch-rfc'
In some projects, it is common to use "[RFC PATCH]" as the subject
prefix for a patch meant for discussion rather than application.  A
new option "--rfc" was a short-hand for "--subject-prefix=RFC PATCH"
to help the participants of such projects.

* jt/format-patch-rfc:
  format-patch: add "--rfc" for the common case of [RFC PATCH]
2016-09-26 16:09:17 -07:00
Junio C Hamano
b7af6ae5cf Merge branch 'mh/diff-indent-heuristic'
Output from "git diff" can be made easier to read by selecting
which lines are common and which lines are added/deleted
intelligently when the lines before and after the changed section
are the same.  A command line option is added to help with the
experiment to find a good heuristics.

* mh/diff-indent-heuristic:
  blame: honor the diff heuristic options and config
  parse-options: add parse_opt_unknown_cb()
  diff: improve positioning of add/delete blocks in diffs
  xdl_change_compact(): introduce the concept of a change group
  recs_match(): take two xrecord_t pointers as arguments
  is_blank_line(): take a single xrecord_t as argument
  xdl_change_compact(): only use heuristic if group can't be matched
  xdl_change_compact(): fix compaction heuristic to adjust ixo
2016-09-26 16:09:16 -07:00
Junio C Hamano
b3e588a48a Merge branch 'rs/c-auto-resets-attributes'
The pretty-format specifier "%C(auto)" used by the "log" family of
commands to enable coloring of the output is taught to also issue a
color-reset sequence to the output.

* rs/c-auto-resets-attributes:
  pretty: let %C(auto) reset all attributes
2016-09-26 16:09:15 -07:00
Jeff King
1ffa26c461 get_short_sha1: list ambiguous objects on error
When the user gives us an ambiguous short sha1, we print an
error and refuse to resolve it. In some cases, the next step
is for them to feed us more characters (e.g., if they were
retyping or cut-and-pasting from a full sha1). But in other
cases, that might be all they have. For example, an old
commit message may have used a 7-character hex that was
unique at the time, but is now ambiguous.  Git doesn't
provide any information about the ambiguous objects it
found, so it's hard for the user to find out which one they
probably meant.

This patch teaches get_short_sha1() to list the sha1s of the
objects it found, along with a few bits of information that
may help the user decide which one they meant. Here's what
it looks like on git.git:

  $ git rev-parse b2e1
  error: short SHA1 b2e1 is ambiguous
  hint: The candidates are:
  hint:   b2e1196 tag v2.8.0-rc1
  hint:   b2e11d1 tree
  hint:   b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
  hint:   b2e1759 blob
  hint:   b2e18954 blob
  hint:   b2e1895c blob
  fatal: ambiguous argument 'b2e1': unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git <command> [<revision>...] -- [<file>...]'

We show the tagname for tags, and the date and subject for
commits. For trees and blobs, in theory we could dig in the
history to find the paths at which they were present. But
that's very expensive (on the order of 30s for the kernel),
and it's not likely to be all that helpful. Most short
references are to commits, so the useful information is
typically going to be that the object in question _isn't_ a
commit. So it's silly to spend a lot of CPU preemptively
digging up the path; the user can do it themselves if they
really need to.

And of course it's somewhat ironic that we abbreviate the
sha1s in the disambiguation hint. But full sha1s would cause
annoying line wrapping for the commit lines, and presumably
the user is going to just re-issue their command immediately
with the corrected sha1.

We also restrict the list to those that match any
disambiguation hint. E.g.:

  $ git rev-parse b2e1:foo
  error: short SHA1 b2e1 is ambiguous
  hint: The candidates are:
  hint:   b2e1196 tag v2.8.0-rc1
  hint:   b2e11d1 tree
  hint:   b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
  fatal: Invalid object name 'b2e1'.

does not bother reporting the blobs, because they cannot
work as a treeish.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26 11:55:31 -07:00
Jeff King
fad6b9e590 for_each_abbrev: drop duplicate objects
If an object appears multiple times in the object database
(e.g., in both loose and packed form, or in two separate
packs), the disambiguation machinery may see it more than
once. The get_short_sha1() function handles this already,
but for_each_abbrev() blindly fires the callback for each
instance it finds.

We can fix this by collecting the output in a sha1 array and
de-duplicating it.  As a bonus, the sort done for the
de-duplication means that our output will be stable,
regardless of the order in which the objects are found.

Note that the old code normalized the callback's output to
0/1 to store in the 1-bit ds->ambiguous flag (which both
halted the iteration and was returned from the
for_each_abbrev function). Now that we are using sha1_array,
we can return the real value. In practice, it doesn't matter
as the sole caller only ever returns 0.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26 11:46:41 -07:00
Jeff King
16ddcd403b sha1_array: let callbacks interrupt iteration
The callbacks for iterating a sha1_array must have a void
return.  This is unlike our usual for_each semantics, where
a callback may interrupt iteration and have its value
propagated. Let's switch it to the usual form, which will
enable its use in more places (e.g., where we are replacing
an existing iteration with a different data structure).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26 11:46:41 -07:00
Jeff King
5d5def2aa5 get_short_sha1: parse tags when looking for treeish
The treeish disambiguation function tries to peel tags, but
it does so by calling:

  deref_tag(lookup_object(sha1), ...);

This will only work if we have previously looked at the tag
and created a "struct tag" for it. Since parsing revision
arguments typically happens before anything else, this is
usually not the case, and we would fail to peel the tag (we
are lucky that deref_tag() gracefully handles the NULL and
does not segfault).

Instead, we can use parse_object(). Note that this is the
same fix done by 94d75d1 (get_short_sha1(): correctly
disambiguate type-limited abbreviation, 2013-07-01), but
that commit fixed only the committish disambiguator, and
left the bug in the treeish one.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26 11:46:30 -07:00
Jeff King
8a10fea49b get_sha1: propagate flags to child functions
The get_sha1() function is actually implementation by many
sub-functions, but we do not always pass our flags around to
all of those functions. As a result, we may forget that our
caller asked us to resolve with GET_SHA1_QUIETLY and output
messages. The two triggerable cases are:

  1. Resolving treeish:path will resolve the "treeish"
     portion using GET_SHA1_TREEISH, dropping all other
     flags.

  2. The peel_onion() function did not take flags at all
     but recurses to get_sha1_1(), which does.

The solution for both is to bitwise-OR their new flags with
the existing ones (after dropping any mutually exclusive
disambiguation flags).

This bug can trigger with "git rev-parse --quiet", which
asks for quiet resolution. But it can also happen in a more
vanilla code path when we do a follow-up ONLY_TO_DIE
invocation of get_sha1(), and that's what the tests check.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26 11:46:30 -07:00
Jeff King
7243ffdd78 get_sha1: avoid repeating ourselves via ONLY_TO_DIE
When the revision code cannot parse an argument like
"HEAD:foo", it will call maybe_die_on_misspelt_object_name(),
which re-runs get_sha1() with an extra ONLY_TO_DIE flag. We
then spend more effort to generate a better error message.

Unfortunately, a side effect is that our second call may
repeat the same error messages from the original get_sha1()
call. You can see this with:

  $ git show 0017
  error: short SHA1 0017 is ambiguous.
  error: short SHA1 0017 is ambiguous.
  fatal: ambiguous argument '0017': unknown revision or path not in the working tree.
  Use '--' to separate paths from revisions, like this:
  'git <command> [<revision>...] -- [<file>...]'

where the second "error:" line comes from the ONLY_TO_DIE
call.

To fix this, we can make ONLY_TO_DIE imply QUIETLY. This is
a little odd, because the whole point of ONLY_TO_DIE is to
output error messages. But what we want to do is tell the
rest of the get_sha1() code (particularly get_sha1_1()) that
the _regular_ messages should be quiet, but the only-to-die
ones should not.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26 11:46:30 -07:00
Ian Kelling
779a206632 gitweb: use highlight's shebang detection
The "highlight" binary can, in some cases, determine the language type
by the means of file contents, for example the shebang in the first line
for some scripting languages.  Make use of this autodetection for files
which syntax is not known by gitweb.  In that case, pass the blob
contents to "highlight --force"; the parameter is needed to make it
always generate HTML output (which includes HTML-escaping).

Although we now run highlight on files which do not end up highlighted,
performance is virtually unaffected because when we call highlight, it
is used for escaping HTML.  In the case that highlight is used, gitweb
calls sanitize() instead of esc_html(), and the latter is significantly
slower (it does more, being roughly a superset of sanitize()).  Simple
benchmark comparing performance of 'blob' view of files without syntax
highlighting in gitweb before and after this change indicates ±1%
difference in request time for all file types.  Benchmark was performed
on local instance on Debian, using Apache/2.4.23 web server and CGI.

Document the feature and improve syntax highlight documentation, add
test to ensure gitweb doesn't crash when language detection is used.

Signed-off-by: Ian Kelling <ian@iankelling.org>
Acked-by: Jakub Narębski <jnareb@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-25 16:39:11 -07:00
Nguyễn Thái Ngọc Duy
6311cfaf93 init: do not set unnecessary core.worktree
The function needs_work_tree_config() that is called from
create_default_files() is supposed to be fed the path to ".git" that
looks as if it is at the top of the working tree, and decide if that
location matches the actual worktree being used.  This comparison allows
"git init" to decide if core.worktree needs to be recorded in the
working tree.

In the current code, however, we feed the return value from
get_git_dir(), which can be totally different from what the function
expects when "gitdir" file is involved.  Instead of giving the path to
the ".git" at the top of the working tree, we end up feeding the actual
path that the file points at.

This original location of ".git" however is only known to init_db().
Make init_db() save it and have it passed to create_default_files() as a
new parameter, which passes the correct location down to
needs_work_tree_config() to fix this.

Noticed-by: Max Nordlund <max.nordlund@sqore.com>
Helped-by: Michael J Gruber <git@drmicha.warpmail.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-25 16:32:35 -07:00
Nguyễn Thái Ngọc Duy
fe9aa0b22e init: correct re-initialization from a linked worktree
When 'git init' is called from a linked worktree, we treat '.git'
dir (which is $GIT_COMMON_DIR/worktrees/something) as the main
'.git' (i.e. $GIT_COMMON_DIR) and populate the whole repository skeleton
in there. It does not harm anything (*) but it is still wrong.

Since 'git init' calls set_git_dir() at preparation time, which
indirectly calls get_common_dir() and correctly detects multiple
worktree setup, all git_path_buf() calls in create_default_files() will
return correct paths in both single and multiple worktree setups. The
only thing left is copy_templates(), which targets $GIT_DIR, not
$GIT_COMMON_DIR.

Fix that with get_git_common_dir(). This function will return $GIT_DIR
in single-worktree setup, so we don't have to make a special case for
multiple-worktree here.

(*) It does in fact, thanks to another bug. More on that later.

Noticed-by: Max Nordlund <max.nordlund@sqore.com>
Helped-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-25 16:32:35 -07:00
Junio C Hamano
ae1ae600db Merge branch 'jk/rebase-i-drop-ident-check'
Even when "git pull --rebase=preserve" (and the underlying "git
rebase --preserve") can complete without creating any new commit
(i.e. fast-forwards), it still insisted on having a usable ident
information (read: user.email is set correctly), which was less
than nice.  As the underlying commands used inside "git rebase"
would fail with a more meaningful error message and advice text
when the bogus ident matters, this extra check was removed.

* jk/rebase-i-drop-ident-check:
  rebase-interactive: drop early check for valid ident
2016-09-21 15:15:28 -07:00
Junio C Hamano
1fe6f5fb0a Merge branch 'va/i18n'
More i18n.

* va/i18n:
  i18n: update-index: mark warnings for translation
  i18n: show-branch: mark plural strings for translation
  i18n: show-branch: mark error messages for translation
  i18n: receive-pack: mark messages for translation
  notes: spell first word of error messages in lowercase
  i18n: notes: mark error messages for translation
  i18n: merge-recursive: mark verbose message for translation
  i18n: merge-recursive: mark error messages for translation
  i18n: config: mark error message for translation
  i18n: branch: mark option description for translation
  i18n: blame: mark error messages for translation
2016-09-21 15:15:28 -07:00
Junio C Hamano
e8f871a9ce Merge branch 'jt/format-patch-base-info-above-sig'
"git format-patch --base=..." feature that was recently added
showed the base commit information after "-- " e-mail signature
line, which turned out to be inconvenient.  The base information
has been moved above the signature line.

* jt/format-patch-base-info-above-sig:
  format-patch: show base info before email signature
2016-09-21 15:15:27 -07:00
Junio C Hamano
0c5ff91639 Merge branch 'ks/perf-build-with-autoconf'
Performance tests done via "t/perf" did not use the same set of
build configuration if the user relied on autoconf generated
configuration.

* ks/perf-build-with-autoconf:
  t/perf/run: copy config.mak.autogen & friends to build area
2016-09-21 15:15:27 -07:00
Junio C Hamano
4ed38637ec Merge branch 'rs/xdiff-merge-overlapping-hunks-for-W-context'
"git diff -W" output needs to extend the context backward to
include the header line of the current function and also forward to
include the body of the entire current function up to the header
line of the next one.  This process may have to merge to adjacent
hunks, but the code forgot to do so in some cases.

* rs/xdiff-merge-overlapping-hunks-for-W-context:
  xdiff: fix merging of hunks with -W context and -u context
2016-09-21 15:15:26 -07:00
Junio C Hamano
d845d727cb Merge branch 'jk/setup-sequence-update'
There were numerous corner cases in which the configuration files
are read and used or not read at all depending on the directory a
Git command was run, leading to inconsistent behaviour.  The code
to set-up repository access at the beginning of a Git process has
been updated to fix them.

* jk/setup-sequence-update:
  t1007: factor out repeated setup
  init: reset cached config when entering new repo
  init: expand comments explaining config trickery
  config: only read .git/config from configured repos
  test-config: setup git directory
  t1302: use "git -C"
  pager: handle early config
  pager: use callbacks instead of configset
  pager: make pager_program a file-local static
  pager: stop loading git_default_config()
  pager: remove obsolete comment
  diff: always try to set up the repository
  diff: handle --no-index prefixes consistently
  diff: skip implicit no-index check when given --no-index
  patch-id: use RUN_SETUP_GENTLY
  hash-object: always try to set up the git repository
2016-09-21 15:15:24 -07:00
Junio C Hamano
7f109ef54e Merge branch 'ks/pack-objects-bitmap'
Some codepaths in "git pack-objects" were not ready to use an
existing pack bitmap; now they are and as the result they have
become faster.

* ks/pack-objects-bitmap:
  pack-objects: use reachability bitmap index when generating non-stdout pack
  pack-objects: respect --local/--honor-pack-keep/--incremental when bitmap is in use
2016-09-21 15:15:21 -07:00
Junio C Hamano
7889ed25ac Merge branch 'js/cat-file-filters'
Even though "git hash-objects", which is a tool to take an
on-filesystem data stream and put it into the Git object store,
allowed to perform the "outside-world-to-Git" conversions (e.g.
end-of-line conversions and application of the clean-filter), and
it had the feature on by default from very early days, its reverse
operation "git cat-file", which takes an object from the Git object
store and externalize for the consumption by the outside world,
lacked an equivalent mechanism to run the "Git-to-outside-world"
conversion.  The command learned the "--filters" option to do so.

* js/cat-file-filters:
  cat-file: support --textconv/--filters in batch mode
  cat-file --textconv/--filters: allow specifying the path separately
  cat-file: introduce the --filters option
  cat-file: fix a grammo in the man page
2016-09-21 15:15:19 -07:00
Junio C Hamano
07d872434d Merge branch 'jt/accept-capability-advertisement-when-fetching-from-void'
JGit can show a fake ref "capabilities^{}" to "git fetch" when it
does not advertise any refs, but "git fetch" was not prepared to
see such an advertisement.  When the other side disconnects without
giving any ref advertisement, we used to say "there may not be a
repository at that URL", but we may have seen other advertisement
like "shallow" and ".have" in which case we definitely know that a
repository is there.  The code to detect this case has also been
updated.

* jt/accept-capability-advertisement-when-fetching-from-void:
  connect: advertized capability is not a ref
  connect: tighten check for unexpected early hang up
  tests: move test_lazy_prereq JGIT to test-lib.sh
2016-09-21 15:15:18 -07:00
Johannes Sixt
40e0dc17ce t3700-add: do not check working tree file mode without POSIXPERM
A recently introduced test checks the result of 'git status' after
setting the executable bit on a file. This check does not yield the
expected result when the filesystem does not support the executable
bit.

What we care about is that a file added with "--chmod=+x" has
executable bit in the index and that "--chmod=+x" (or any other
options for that matter) does not muck with working tree files.
The former is tested by other existing tests, so let's check the
latter more explicitly and only under POSIXPERM prerequisite.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21 14:09:54 -07:00
Johannes Schindelin
b7d36ffca0 regex: use regexec_buf()
The new regexec_buf() function operates on buffers with an explicitly
specified length, rather than NUL-terminated strings.

We need to use this function whenever the buffer we want to pass to
regexec(3) may have been mmap(2)ed (and is hence not NUL-terminated).

Note: the original motivation for this patch was to fix a bug where
`git diff -G <regex>` would crash. This patch converts more callers,
though, some of which allocated to construct NUL-terminated strings,
or worse, modified buffers to temporarily insert NULs while calling
regexec(3).  By converting them to use regexec_buf(), the code has
become much cleaner.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21 13:56:15 -07:00
Johannes Schindelin
db5dfa3314 regex: -G<pattern> feeds a non NUL-terminated string to regexec() and fails
When our pickaxe code feeds file contents to regexec(), it implicitly
assumes that the file contents are read into implicitly NUL-terminated
buffers (i.e. that we overallocate by 1, appending a single '\0').

This is not so.

In particular when the file contents are simply mmap()ed, we can be
virtually certain that the buffer is preceding uninitialized bytes, or
invalid pages.

Note that the test we add here is known to be flakey: we simply cannot
know whether the byte following the mmap()ed ones is a NUL or not.

Typically, on Linux the test passes. On Windows, it fails virtually
every time due to an access violation (that's a segmentation fault for
you Unix-y people out there). And Windows would be correct: the
regexec() call wants to operate on a regular, NUL-terminated string,
there is no NUL in the mmap()ed memory range, and it is undefined
whether the next byte is even legal to access.

When run with --valgrind it demonstrates quite clearly the breakage, of
course.

Being marked with `test_expect_failure`, this test will sometimes be
declare "TODO fixed", even if it only passes by mistake.

This test case represents a Minimal, Complete and Verifiable Example of
a breakage reported by Chris Sidi.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21 13:56:15 -07:00
Johannes Sixt
b07ad46432 t3700-add: create subdirectory gently
The subdirectory 'sub' is created early in the test file. Later, a test
case removes it during its clean-up actions. However, this test case is
protected by POSIXPERM. Consequently, 'sub' remains when the POSIXPERM
prerequisite is not satisfied. Later, a recently introduced test case
creates 'sub' again. Use -p with mkdir so that it does not fail if 'sub'
already exists.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21 11:05:35 -07:00
Jonathan Tan
6b4b013f18 mailinfo: handle in-body header continuations
Mailinfo currently handles multi-line headers, but it does not handle
multi-line in-body headers. Teach it to handle such headers, for
example, for this input:

  From: author <author@example.com>
  Date: Fri, 9 Jun 2006 00:44:16 -0700
  Subject: a very long
   broken line

  Subject: another very long
   broken line

interpret the in-body subject to be "another very long broken line"
instead of "another very long".

An existing test (t/t5100/msg0015) has an indented line immediately
after an in-body header - it has been modified to reflect the new
functionality.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21 10:23:11 -07:00
Vasco Almeida
c041c6d06a i18n: notes-merge: mark die messages for translation
Update test to reflect changes.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21 10:20:43 -07:00
Josh Triplett
68e83a5b82 format-patch: add "--rfc" for the common case of [RFC PATCH]
Add an alias for --subject-prefix='RFC PATCH', which is used
commonly in some development communities to deserve such a
short-hand.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21 08:58:10 -07:00
Nguyễn Thái Ngọc Duy
b829b9439a checkout: fix ambiguity check in subdir
The two functions in parse_branchname_arg(), verify_non_filename and
check_filename, need correct prefix in order to reconstruct the paths
and check for their existence. With NULL prefix, they just check paths
at top dir instead.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-21 08:44:41 -07:00
Junio C Hamano
294573e6d7 Merge branch 'js/t9903-chaining' into maint
Test fix.

* js/t9903-chaining:
  t9903: fix broken && chain
2016-09-19 13:51:44 -07:00
Junio C Hamano
1e28677e5b Merge branch 'ep/use-git-trace-curl-in-tests' into maint
Update a few tests that used to use GIT_CURL_VERBOSE to use the
newer GIT_TRACE_CURL.

* ep/use-git-trace-curl-in-tests:
  t5551-http-fetch-smart.sh: use the GIT_TRACE_CURL environment var
  t5550-http-fetch-dumb.sh: use the GIT_TRACE_CURL environment var
  test-lib.sh: preserve GIT_TRACE_CURL from the environment
  t5541-http-push-smart.sh: use the GIT_TRACE_CURL environment var
2016-09-19 13:51:41 -07:00
Junio C Hamano
8e26535866 Merge branch 'js/t6026-clean-up' into maint
A test spawned a short-lived background process, which sometimes
prevented the test directory from getting removed at the end of the
script on some platforms.

* js/t6026-clean-up:
  t6026-merge-attr: clean up background process at end of test case
2016-09-19 13:51:41 -07:00
Junio C Hamano
d6645312ff Merge branch 'jc/forbid-symbolic-ref-d-HEAD' into maint
"git symbolic-ref -d HEAD" happily removes the symbolic ref, but
the resulting repository becomes an invalid one.  Teach the command
to forbid removal of HEAD.

* jc/forbid-symbolic-ref-d-HEAD:
  symbolic-ref -d: do not allow removal of HEAD
2016-09-19 13:51:41 -07:00
Junio C Hamano
4c10c31137 Merge branch 'jc/submodule-anchor-git-dir' into maint
Having a submodule whose ".git" repository is somehow corrupt
caused a few commands that recurse into submodules loop forever.

* jc/submodule-anchor-git-dir:
  submodule: avoid auto-discovery in prepare_submodule_repo_env()
2016-09-19 13:51:40 -07:00
Junio C Hamano
79b51ebf6f Merge branch 'jk/test-lib-drop-pid-from-results' into maint
The test framework left the number of tests and success/failure
count in the t/test-results directory, keyed by the name of the
test script plus the process ID.  The latter however turned out not
to serve any useful purpose.  The process ID part of the filename
has been removed.

* jk/test-lib-drop-pid-from-results:
  test-lib: drop PID from test-results/*.count
2016-09-19 13:51:39 -07:00
Junio C Hamano
4af9a7d344 Merge branch 'bc/object-id'
The "unsigned char sha1[20]" to "struct object_id" conversion
continues.  Notable changes in this round includes that ce->sha1,
i.e. the object name recorded in the cache_entry, turns into an
object_id.

It had merge conflicts with a few topics in flight (Christian's
"apply.c split", Dscho's "cat-file --filters" and Jeff Hostetler's
"status --porcelain-v2").  Extra sets of eyes double-checking for
mismerges are highly appreciated.

* bc/object-id:
  builtin/reset: convert to use struct object_id
  builtin/commit-tree: convert to struct object_id
  builtin/am: convert to struct object_id
  refs: add an update_ref_oid function.
  sha1_name: convert get_sha1_mb to struct object_id
  builtin/update-index: convert file to struct object_id
  notes: convert init_notes to use struct object_id
  builtin/rm: convert to use struct object_id
  builtin/blame: convert file to use struct object_id
  Convert read_mmblob to take struct object_id.
  notes-merge: convert struct notes_merge_pair to struct object_id
  builtin/checkout: convert some static functions to struct object_id
  streaming: make stream_blob_to_fd take struct object_id
  builtin: convert textconv_object to use struct object_id
  builtin/cat-file: convert some static functions to struct object_id
  builtin/cat-file: convert struct expand_data to use struct object_id
  builtin/log: convert some static functions to use struct object_id
  builtin/blame: convert struct origin to use struct object_id
  builtin/apply: convert static functions to struct object_id
  cache: convert struct cache_entry to use struct object_id
2016-09-19 13:47:19 -07:00
Junio C Hamano
81358dc238 Merge branch 'cc/apply-am'
"git am" has been taught to make an internal call to "git apply"'s
innards without spawning the latter as a separate process.

* cc/apply-am: (41 commits)
  builtin/am: use apply API in run_apply()
  apply: learn to use a different index file
  apply: pass apply state to build_fake_ancestor()
  apply: refactor `git apply` option parsing
  apply: change error_routine when silent
  usage: add get_error_routine() and get_warn_routine()
  usage: add set_warn_routine()
  apply: don't print on stdout in verbosity_silent mode
  apply: make it possible to silently apply
  apply: use error_errno() where possible
  apply: make some parsing functions static again
  apply: move libified code from builtin/apply.c to apply.{c,h}
  apply: rename and move opt constants to apply.h
  builtin/apply: rename option parsing functions
  builtin/apply: make create_one_file() return -1 on error
  builtin/apply: make try_create_file() return -1 on error
  builtin/apply: make write_out_results() return -1 on error
  builtin/apply: make write_out_one_result() return -1 on error
  builtin/apply: make create_file() return -1 on error
  builtin/apply: make add_index_file() return -1 on error
  ...
2016-09-19 13:47:18 -07:00
Vasco Almeida
f2b93b388c i18n: connect: mark die messages for translation
Mark messages passed to die() in die_initial_contact().

Update test to reflect changes.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-19 10:55:36 -07:00
Vasco Almeida
4fa4b31507 i18n: commit: mark message for translation
Mark message commit_utf8_warn for translation.

Update tests to reflect changes.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-19 10:55:36 -07:00
René Scharfe
c99ad274b1 pretty: let %C(auto) reset all attributes
Reset colors and attributes upon %C(auto) to enable full automatic
control over them; otherwise attributes like bold or reverse could
still be in effect from previous %C placeholders.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-19 10:50:32 -07:00
Michael Haggerty
5b162879e9 blame: honor the diff heuristic options and config
Teach "git blame" and "git annotate" the --compaction-heuristic and
--indent-heuristic options that are now supported by "git diff".

Also teach them to honor the `diff.compactionHeuristic` and
`diff.indentHeuristic` configuration options.

It would be conceivable to introduce separate configuration options for
"blame" and "annotate"; for example `blame.compactionHeuristic` and
`blame.indentHeuristic`. But it would be confusing to users if blame
output is inconsistent with diff output, so it makes more sense for them
to respect the same configuration.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-19 10:25:11 -07:00
Michael Haggerty
433860f3d0 diff: improve positioning of add/delete blocks in diffs
Some groups of added/deleted lines in diffs can be slid up or down,
because lines at the edges of the group are not unique. Picking good
shifts for such groups is not a matter of correctness but definitely has
a big effect on aesthetics. For example, consider the following two
diffs. The first is what standard Git emits:

    --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
    +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
    @@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) {
     }

     if (!$smtp_server) {
    +       $smtp_server = $repo->config('sendemail.smtpserver');
    +}
    +if (!$smtp_server) {
            foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
                    if (-x $_) {
                            $smtp_server = $_;

The following diff is equivalent, but is obviously preferable from an
aesthetic point of view:

    --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
    +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
    @@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) {
            $initial_reply_to =~ s/(^\s+|\s+$)//g;
     }

    +if (!$smtp_server) {
    +       $smtp_server = $repo->config('sendemail.smtpserver');
    +}
     if (!$smtp_server) {
            foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
                    if (-x $_) {

This patch teaches Git to pick better positions for such "diff sliders"
using heuristics that take the positions of nearby blank lines and the
indentation of nearby lines into account.

The existing Git code basically always shifts such "sliders" as far down
in the file as possible. The only exception is when the slider can be
aligned with a group of changed lines in the other file, in which case
Git favors depicting the change as one add+delete block rather than one
add and a slightly offset delete block. This naive algorithm often
yields ugly diffs.

Commit d634d61ed6 improved the situation somewhat by preferring to
position add/delete groups to make their last line a blank line, when
that is possible. This heuristic does more good than harm, but (1) it
can only help if there are blank lines in the right places, and (2)
always picks the last blank line, even if there are others that might be
better. The end result is that it makes perhaps 1/3 as many errors as
the default Git algorithm, but that still leaves a lot of ugly diffs.

This commit implements a new and much better heuristic for picking
optimal "slider" positions using the following approach: First observe
that each hypothetical positioning of a diff slider introduces two
splits: one between the context lines preceding the group and the first
added/deleted line, and the other between the last added/deleted line
and the first line of context following it. It tries to find the
positioning that creates the least bad splits.

Splits are evaluated based only on the presence and locations of nearby
blank lines, and the indentation of lines near the split. Basically, it
prefers to introduce splits adjacent to blank lines, between lines that
are indented less, and between lines with the same level of indentation.
In more detail:

1. It measures the following characteristics of a proposed splitting
   position in a `struct split_measurement`:

   * the number of blank lines above the proposed split
   * whether the line directly after the split is blank
   * the number of blank lines following that line
   * the indentation of the nearest non-blank line above the split
   * the indentation of the line directly below the split
   * the indentation of the nearest non-blank line after that line

2. It combines the measured attributes using a bunch of
   empirically-optimized weighting factors to derive a `struct
   split_score` that measures the "badness" of splitting the text at
   that position.

3. It combines the `split_score` for the top and the bottom of the
   slider at each of its possible positions, and selects the position
   that has the best `split_score`.

I determined the initial set of weighting factors by collecting a corpus
of Git histories from 29 open-source software projects in various
programming languages. I generated many diffs from this corpus, and
determined the best positioning "by eye" for about 6600 diff sliders. I
used about half of the repositories in the corpus (corresponding to
about 2/3 of the sliders) as a training set, and optimized the weights
against this corpus using a crude automated search of the parameter
space to get the best agreement with the manually-determined values.
Then I tested the resulting heuristic against the full corpus. The
results are summarized in the following table, in column `indent-1`:

| repository            | count |      Git 2.9.0 |     compaction | compaction-fixed |       indent-1 |       indent-2 |
| --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- |
| afnetworking          |   109 |    89  (81.7%) |    37  (33.9%) |      37  (33.9%) |     2   (1.8%) |     2   (1.8%) |
| alamofire             |    30 |    18  (60.0%) |    14  (46.7%) |      15  (50.0%) |     0   (0.0%) |     0   (0.0%) |
| angular               |   184 |   127  (69.0%) |    39  (21.2%) |      23  (12.5%) |     5   (2.7%) |     5   (2.7%) |
| animate               |   313 |     2   (0.6%) |     2   (0.6%) |       2   (0.6%) |     2   (0.6%) |     2   (0.6%) |
| ant                   |   380 |   356  (93.7%) |   152  (40.0%) |     148  (38.9%) |    15   (3.9%) |    15   (3.9%) | *
| bugzilla              |   306 |   263  (85.9%) |   109  (35.6%) |      99  (32.4%) |    14   (4.6%) |    15   (4.9%) | *
| corefx                |   126 |    91  (72.2%) |    22  (17.5%) |      21  (16.7%) |     6   (4.8%) |     6   (4.8%) |
| couchdb               |    78 |    44  (56.4%) |    26  (33.3%) |      28  (35.9%) |     6   (7.7%) |     6   (7.7%) | *
| cpython               |   937 |   158  (16.9%) |    50   (5.3%) |      49   (5.2%) |     5   (0.5%) |     5   (0.5%) | *
| discourse             |   160 |    95  (59.4%) |    42  (26.2%) |      36  (22.5%) |    18  (11.2%) |    13   (8.1%) |
| docker                |   307 |   194  (63.2%) |   198  (64.5%) |     253  (82.4%) |     8   (2.6%) |     8   (2.6%) | *
| electron              |   163 |   132  (81.0%) |    38  (23.3%) |      39  (23.9%) |     6   (3.7%) |     6   (3.7%) |
| git                   |   536 |   470  (87.7%) |    73  (13.6%) |      78  (14.6%) |    16   (3.0%) |    16   (3.0%) | *
| gitflow               |   127 |     0   (0.0%) |     0   (0.0%) |       0   (0.0%) |     0   (0.0%) |     0   (0.0%) |
| ionic                 |   133 |    89  (66.9%) |    29  (21.8%) |      38  (28.6%) |     1   (0.8%) |     1   (0.8%) |
| ipython               |   482 |   362  (75.1%) |   167  (34.6%) |     169  (35.1%) |    11   (2.3%) |    11   (2.3%) | *
| junit                 |   161 |   147  (91.3%) |    67  (41.6%) |      66  (41.0%) |     1   (0.6%) |     1   (0.6%) | *
| lighttable            |    15 |     5  (33.3%) |     0   (0.0%) |       2  (13.3%) |     0   (0.0%) |     0   (0.0%) |
| magit                 |    88 |    75  (85.2%) |    11  (12.5%) |       9  (10.2%) |     1   (1.1%) |     0   (0.0%) |
| neural-style          |    28 |     0   (0.0%) |     0   (0.0%) |       0   (0.0%) |     0   (0.0%) |     0   (0.0%) |
| nodejs                |   781 |   649  (83.1%) |   118  (15.1%) |     111  (14.2%) |     4   (0.5%) |     5   (0.6%) | *
| phpmyadmin            |   491 |   481  (98.0%) |    75  (15.3%) |      48   (9.8%) |     2   (0.4%) |     2   (0.4%) | *
| react-native          |   168 |   130  (77.4%) |    79  (47.0%) |      81  (48.2%) |     0   (0.0%) |     0   (0.0%) |
| rust                  |   171 |   128  (74.9%) |    30  (17.5%) |      27  (15.8%) |    16   (9.4%) |    14   (8.2%) |
| spark                 |   186 |   149  (80.1%) |    52  (28.0%) |      52  (28.0%) |     2   (1.1%) |     2   (1.1%) |
| tensorflow            |   115 |    66  (57.4%) |    48  (41.7%) |      48  (41.7%) |     5   (4.3%) |     5   (4.3%) |
| test-more             |    19 |    15  (78.9%) |     2  (10.5%) |       2  (10.5%) |     1   (5.3%) |     1   (5.3%) | *
| test-unit             |    51 |    34  (66.7%) |    14  (27.5%) |       8  (15.7%) |     2   (3.9%) |     2   (3.9%) | *
| xmonad                |    23 |    22  (95.7%) |     2   (8.7%) |       2   (8.7%) |     1   (4.3%) |     1   (4.3%) | *
| --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- |
| totals                |  6668 |  4391  (65.9%) |  1496  (22.4%) |    1491  (22.4%) |   150   (2.2%) |   144   (2.2%) |
| totals (training set) |  4552 |  3195  (70.2%) |  1053  (23.1%) |    1061  (23.3%) |    86   (1.9%) |    88   (1.9%) |
| totals (test set)     |  2116 |  1196  (56.5%) |   443  (20.9%) |     430  (20.3%) |    64   (3.0%) |    56   (2.6%) |

In this table, the numbers are the count and percentage of human-rated
sliders that the corresponding algorithm got *wrong*. The columns are

* "repository" - the name of the repository used. I used the diffs
  between successive non-merge commits on the HEAD branch of the
  corresponding repository.

* "count" - the number of sliders that were human-rated. I chose most,
  but not all, sliders to rate from those among which the various
  algorithms gave different answers.

* "Git 2.9.0" - the default algorithm used by `git diff` in Git 2.9.0.

* "compaction" - the heuristic used by `git diff --compaction-heuristic`
  in Git 2.9.0.

* "compaction-fixed" - the heuristic used by `git diff
  --compaction-heuristic` after the fixes from earlier in this patch
  series. Note that the results are not dramatically different than
  those for "compaction". Both produce non-ideal diffs only about 1/3 as
  often as the default `git diff`.

* "indent-1" - the new `--indent-heuristic` algorithm, using the first
  set of weighting factors, determined as described above.

* "indent-2" - the new `--indent-heuristic` algorithm, using the final
  set of weighting factors, determined as described below.

* `*` - indicates that repo was part of training set used to determine
  the first set of weighting factors.

The fact that the heuristic performed nearly as well on the test set as
on the training set in column "indent-1" is a good indication that the
heuristic was not over-trained. Given that fact, I ran a second round of
optimization, using the entire corpus as the training set. The resulting
set of weights gave the results in column "indent-2". These are the
weights included in this patch.

The final result gives consistently and significantly better results
across the whole corpus than either `git diff` or `git diff
--compaction-heuristic`. It makes only about 1/30 as many errors as the
former and about 1/10 as many errors as the latter. (And a good fraction
of the remaining errors are for diffs that involve weirdly-formatted
code, sometimes apparently machine-generated.)

The tools that were used to do this optimization and analysis, along
with the human-generated data values, are recorded in a separate project
[1].

This patch adds a new command-line option `--indent-heuristic`, and a
new configuration setting `diff.indentHeuristic`, that activate this
heuristic. This interface is only meant for testing purposes, and should
be finalized before including this change in any release.

[1] https://github.com/mhagger/diff-slider-tools

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-19 10:25:11 -07:00
Junio C Hamano
c13f458d86 Merge branch 'jk/fix-remote-curl-url-wo-proto'
"git fetch http::/site/path" did not die correctly and segfaulted
instead.

* jk/fix-remote-curl-url-wo-proto:
  remote-curl: handle URLs without protocol
2016-09-15 14:11:15 -07:00
Junio C Hamano
9883ec2c73 Merge branch 'jk/pack-tag-of-tag'
"git pack-objects --include-tag" was taught that when we know that
we are sending an object C, we want a tag B that directly points at
C but also a tag A that points at the tag B.  We used to miss the
intermediate tag B in some cases.

* jk/pack-tag-of-tag:
  pack-objects: walk tag chains for --include-tag
  t5305: simplify packname handling
  t5305: use "git -C"
  t5305: drop "dry-run" of unpack-objects
  t5305: move cleanup into test block
2016-09-15 14:11:14 -07:00
Kirill Smelkov
cd5c2812b6 t/perf/run: copy config.mak.autogen & friends to build area
Otherwise for people who use autotools-based configure in main worktree,
the performance testing results will be inconsistent as work and build
trees could be using e.g. different optimization levels.

See e.g.

	http://public-inbox.org/git/20160818175222.bmm3ivjheokf2qzl@sigill.intra.peff.net/

for example.

NOTE config.status has to be copied because otherwise without it the build
would want to run reconfigure this way loosing just copied config.mak.autogen.

Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-15 13:41:11 -07:00
Vasco Almeida
8d79589ad6 notes: spell first word of error messages in lowercase
That's the usual style.

Update one test to reflect these changes.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-15 13:17:32 -07:00
Vasco Almeida
e3f54bff43 i18n: blame: mark error messages for translation
Mark error messages for translation passed to die() function.
Change "Cannot" to lowercase following the usual style.

Reflect changes to test by using test_i18ngrep.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-15 13:17:32 -07:00
Thomas Gummerer
610d55af0f add: modify already added files when --chmod is given
When the chmod option was added to git add, it was hooked up to the diff
machinery, meaning that it only works when the version in the index
differs from the version on disk.

As the option was supposed to mirror the chmod option in update-index,
which always changes the mode in the index, regardless of the status of
the file, make sure the option behaves the same way in git add.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-15 12:13:54 -07:00
Josh Triplett
480871e09e format-patch: show base info before email signature
Any text below the "-- " for the email signature gets treated as part of
the signature, and many mail clients will trim it from the quoted text
for a reply.  Move it above the signature, so people can reply to it
more easily.

Similarly, when producing the patch as a MIME attachment, the
original code placed the base info after the attached part, which
would be discarded.  Move the base info to the end of the part,
still inside the part boundary.

Add tests for the exact format of the email signature, and add tests
to ensure that the base info appears before the email signature when
producing a plain-text output, and that it appears before the part
boundary when producing a MIME attachment.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-15 10:07:10 -07:00
René Scharfe
45d2f75f91 xdiff: fix merging of hunks with -W context and -u context
If the function context for a hunk (with -W) reaches the beginning of
the next hunk then we need to merge these two -- otherwise we'd show
some lines twice, which looks strange and even confuses git apply.  We
already do this checking and merging in xdl_emit_diff(), but forget to
consider regular context (with -u or -U).

Fix that by merging hunks already if function context of the first one
touches or overlaps regular context of the second one.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-14 16:07:21 -07:00
Thomas Gummerer
22433ce461 update-index: add test for chmod flags
Currently there is no test checking the expected behaviour when multiple
chmod flags with different arguments are passed.  As argument handling
is not in line with other git commands it's easy to miss and
accidentally change the current behaviour.

While there, fix the argument type of chmod_path, which takes an int,
but had a char passed in.

Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-14 15:03:49 -07:00
Jeff King
4d0efa101b t1007: factor out repeated setup
We have a series of 3 CRLF tests that do exactly the same
(long) setup sequence. Let's pull it out into a common setup
test, which is shorter, more efficient, and will make it
easier to add new tests.

Note that we don't have to worry about cleaning up any of
the setup which was previously per-test; we call pop_repo
after the CRLF tests, which cleans up everything.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13 15:45:45 -07:00
Jeff King
4543926ba8 init: reset cached config when entering new repo
After we copy the templates into place, we re-read the
config in case we copied in a default config file. But since
git_config() is backed by a cache these days, it's possible
that the call will not actually touch the filesystem at all;
we need to tell it that something has changed behind the
scenes.

Note that we also need to reset the shared_repository
config. At first glance, it seems like this should probably
just be folded into git_config_clear(). But unfortunately
that is not quite right. The shared repository value may
come from config, _or_ it may have been set manually. So
only the caller who knows whether or not they set it is the
one who can clear it (and indeed, if you _do_ put it into
git_config_clear(), then many tests fail, as we have to
clear the config cache any time we set a new config
variable).

There are three tests here. The first two actually pass
already, though it's largely luck: they just don't happen to
actually read any config before we enter the new repo.

But the third one does fail without this patch; we look at
core.sharedrepository while creating the directory, but need
to make sure the value from the template config overrides
it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13 15:45:45 -07:00
Jeff King
b9605bc4f2 config: only read .git/config from configured repos
When git_config() runs, it looks in the system, user-wide,
and repo-level config files. It gets the latter by calling
git_pathdup(), which in turn calls get_git_dir(). If we
haven't set up the git repository yet, this may simply
return ".git", and we will look at ".git/config".  This
seems like it would be helpful (presumably we haven't set up
the repository yet, so it tries to find it), but it turns
out to be a bad idea for a few reasons:

  - it's not sufficient, and therefore hides bugs in a
    confusing way. Config will be respected if commands are
    run from the top-level of the working tree, but not from
    a subdirectory.

  - it's not always true that we haven't set up the
    repository _yet_; we may not want to do it at all. For
    instance, if you run "git init /some/path" from inside
    another repository, it should not load config from the
    existing repository.

  - there might be a path ".git/config", but it is not the
    actual repository we would find via setup_git_directory().
    This may happen, e.g., if you are storing a git
    repository inside another git repository, but have
    munged one of the files in such a way that the
    inner repository is not valid (e.g., by removing HEAD).

We have at least two bugs of the second type in git-init,
introduced by ae5f677 (lazily load core.sharedrepository,
2016-03-11). It causes init to use git_configset(), which
loads all of the config, including values from the current
repo (if any).  This shows up in two ways:

  1. If we happen to be in an existing repository directory,
     we'll read and respect core.sharedrepository from it,
     even though it should have no bearing on the new
     repository. A new test in t1301 covers this.

  2. Similarly, if we're in an existing repo that sets
     core.logallrefupdates, that will cause init to fail to
     set it in a newly created repository (because it thinks
     that the user's templates already did so). A new test
     in t0001 covers this.

We also need to adjust an existing test in t1302, which
gives another example of why this patch is an improvement.

That test creates an embedded repository with a bogus
core.repositoryformatversion of "99". It wants to make sure
that we actually stop at the bogus repo rather than
continuing upward to find the outer repo. So it checks that
"git config core.repositoryformatversion" returns 99. But
that only works because we blindly read ".git/config", even
though we _know_ we're in a repository whose vintage we do
not understand.

After this patch, we avoid reading config from the unknown
vintage repository at all, which is a safer choice.  But we
need to tweak the test, since core.repositoryformatversion
will not return 99; it will claim that it could not find the
variable at all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13 15:45:45 -07:00
Jeff King
11ca4bec96 t1302: use "git -C"
This is shorter, and saves a subshell.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13 15:45:45 -07:00
Jeff King
28a4e58021 diff: always try to set up the repository
If we see an explicit "--no-index", we do not bother calling
setup_git_directory_gently() at all. This means that we may
miss out on reading repo-specific config.

It's arguable whether this is correct or not. If we were
designing from scratch, making "git diff --no-index"
completely ignore the repository makes some sense. But we
are nowhere near scratch, so let's look at the existing
behavior:

  1. If you're in the top-level of a repository and run an
     explicit "diff --no-index", the config subsystem falls
     back to reading ".git/config", and we will respect repo
     config.

  2. If you're in a subdirectory of a repository, then we
     still try to read ".git/config", but it generally
     doesn't exist. So "diff --no-index" there does not
     respect repo config.

  3. If you have $GIT_DIR set in the environment, we read
     and respect $GIT_DIR/config,

  4. If you run "git diff /tmp/foo /tmp/bar" to get an
     implicit no-index, we _do_ run the repository setup,
     and set $GIT_DIR (or respect an existing $GIT_DIR
     variable). We find the repo config no matter where we
     started, and respect it.

So we already respect the repository config in a number of
common cases, and case (2) is the only one that does not.
And at least one of our tests, t4034, depends on case (1)
behaving as it does now (though it is just incidental, not
an explicit test for this behavior).

So let's bring case (2) in line with the others by always
running the repository setup, even with an explicit
"--no-index". We shouldn't need to change anything else, as the
implicit case already handles the prefix.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13 15:45:45 -07:00
Jeff King
7d8930d903 diff: handle --no-index prefixes consistently
If we see an explicit "git diff --no-index ../foo ../bar",
then we do not set up the git repository at all (we already
know we are in --no-index mode, so do not have to check "are
we in a repository?"), and hence have no "prefix" within the
repository. A patch generated by this command will have the
filenames "a/../foo" and "b/../bar", no matter which
directory we are in with respect to any repository.

However, in the implicit case, where we notice that the
files are outside the repository, we will have chdir()'d to
the top-level of the repository. We then feed the prefix
back to the diff machinery. As a result, running the same
diff from a subdirectory will result in paths that look like
"a/subdir/../../foo".

Besides being unnecessarily long, this may also be confusing
to the user: they don't care about the subdir or the
repository at all; it's just where they happened to be when
running the command. We should treat this the same as the
explicit --no-index case.

One way to address this would be to chdir() back to the
original path before running our diff. However, that's a bit
hacky, as we would also need to adjust $GIT_DIR, which could
be a relative path from our top-level.

Instead, we can reuse the diff machinery's RELATIVE_NAME
option, which automatically strips off the prefix. Note that
this _also_ restricts the diff to this relative prefix, but
that's OK for our purposes: we queue our own diff pairs
manually, and do not rely on that part of the diff code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13 15:45:45 -07:00
Jeff King
4a73aaaf18 patch-id: use RUN_SETUP_GENTLY
Patch-id does not require a repository because it is just
processing the incoming diff on stdin, but it may look at
git config for keys like patchid.stable.

Even though we do not setup_git_directory(), this works from
the top-level of a repository because we blindly look at
".git/config" in this case. But as the included test
demonstrates, it does not work from a subdirectory.

We can fix it by using RUN_SETUP_GENTLY. We do not take any
filenames from the user on the command line, so there's no
need to adjust them via prefix_filename().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13 15:45:45 -07:00
Jeff King
0e94ee9415 hash-object: always try to set up the git repository
When "hash-object" is run without "-w", we don't need to be
in a git repository at all; we can just hash the object and
write its sha1 to stdout. However, if we _are_ in a git
repository, we would want to know that so we can follow the
normal rules for respecting config, .gitattributes, etc.

This happens to work at the top-level of a git repository
because we blindly read ".git/config", but as the included
test shows, it does not work when you are in a subdirectory.

The solution is to just do a "gentle" setup in this case. We
already take care to use prefix_filename() on any filename
arguments we get (to handle the "-w" case), so we don't need
to do anything extra to handle the side effects of repo
setup.

An alternative would be to specify RUN_SETUP_GENTLY for this
command in git.c, and then die if "-w" is set but we are not
in a repository. However, the error messages generated at
the time of setup_git_directory() are more detailed, so it's
better to find out which mode we are in, and then call the
appropriate function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13 15:45:45 -07:00
Junio C Hamano
930b67ebd7 Merge branch 'ep/use-git-trace-curl-in-tests'
Update a few tests that used to use GIT_CURL_VERBOSE to use the
newer GIT_TRACE_CURL.

* ep/use-git-trace-curl-in-tests:
  t5551-http-fetch-smart.sh: use the GIT_TRACE_CURL environment var
  t5550-http-fetch-dumb.sh: use the GIT_TRACE_CURL environment var
  test-lib.sh: preserve GIT_TRACE_CURL from the environment
  t5541-http-push-smart.sh: use the GIT_TRACE_CURL environment var
2016-09-12 15:34:38 -07:00
Junio C Hamano
ba06991e5f Merge branch 'js/t6026-clean-up'
A test spawned a short-lived background process, which sometimes
prevented the test directory from getting removed at the end of the
script on some platforms.

* js/t6026-clean-up:
  t6026-merge-attr: clean up background process at end of test case
2016-09-12 15:34:37 -07:00
Junio C Hamano
038763c71a Merge branch 'js/t9903-chaining'
* js/t9903-chaining:
  t9903: fix broken && chain
2016-09-12 15:34:37 -07:00
Junio C Hamano
d1de693d0d Merge branch 'jc/forbid-symbolic-ref-d-HEAD'
"git symbolic-ref -d HEAD" happily removes the symbolic ref, but
the resulting repository becomes an invalid one.  Teach the command
to forbid removal of HEAD.

* jc/forbid-symbolic-ref-d-HEAD:
  symbolic-ref -d: do not allow removal of HEAD
2016-09-12 15:34:35 -07:00
Junio C Hamano
293c232ab1 Merge branch 'jc/submodule-anchor-git-dir'
Having a submodule whose ".git" repository is somehow corrupt
caused a few commands that recurse into submodules loop forever.

* jc/submodule-anchor-git-dir:
  submodule: avoid auto-discovery in prepare_submodule_repo_env()
2016-09-12 15:34:34 -07:00
Junio C Hamano
8f6fd086e6 Merge branch 'jk/test-lib-drop-pid-from-results'
The test framework left the number of tests and success/failure
count in the t/test-results directory, keyed by the name of the
test script plus the process ID.  The latter however turned out not
to serve any useful purpose.  The process ID part of the filename
has been removed.

* jk/test-lib-drop-pid-from-results:
  test-lib: drop PID from test-results/*.count
2016-09-12 15:34:33 -07:00
Junio C Hamano
305d7f1339 Merge branch 'jk/diff-submodule-diff-inline'
The "git diff --submodule={short,log}" mechanism has been enhanced
to allow "--submodule=diff" to show the patch between the submodule
commits bound to the superproject.

* jk/diff-submodule-diff-inline:
  diff: teach diff to display submodule difference with an inline diff
  submodule: refactor show_submodule_summary with helper function
  submodule: convert show_submodule_summary to use struct object_id *
  allow do_submodule_path to work even if submodule isn't checked out
  diff: prepare for additional submodule formats
  graph: add support for --line-prefix on all graph-aware output
  diff.c: remove output_prefix_length field
  cache: add empty_tree_oid object and helper function
2016-09-12 15:34:31 -07:00
Kirill Smelkov
645c432d61 pack-objects: use reachability bitmap index when generating non-stdout pack
Starting from 6b8fda2d (pack-objects: use bitmaps when packing objects)
if a repository has bitmap index, pack-objects can nicely speedup
"Counting objects" graph traversal phase. That however was done only for
case when resultant pack is sent to stdout, not written into a file.

The reason here is for on-disk repack by default we want:

- to produce good pack (with bitmap index not-yet-packed objects are
  emitted to pack in suboptimal order).

- to use more robust pack-generation codepath (avoiding possible
  bugs in bitmap code and possible bitmap index corruption).

Jeff King further explains:

    The reason for this split is that pack-objects tries to determine how
    "careful" it should be based on whether we are packing to disk or to
    stdout. Packing to disk implies "git repack", and that we will likely
    delete the old packs after finishing. We want to be more careful (so
    as not to carry forward a corruption, and to generate a more optimal
    pack), and we presumably run less frequently and can afford extra CPU.
    Whereas packing to stdout implies serving a remote via "git fetch" or
    "git push". This happens more frequently (e.g., a server handling many
    fetching clients), and we assume the receiving end takes more
    responsibility for verifying the data.

    But this isn't always the case. One might want to generate on-disk
    packfiles for a specialized object transfer. Just using "--stdout" and
    writing to a file is not optimal, as it will not generate the matching
    pack index.

    So it would be useful to have some way of overriding this heuristic:
    to tell pack-objects that even though it should generate on-disk
    files, it is still OK to use the reachability bitmaps to do the
    traversal.

So we can teach pack-objects to use bitmap index for initial object
counting phase when generating resultant pack file too:

- if we take care to not let it be activated under git-repack:

  See above about repack robustness and not forward-carrying corruption.

- if we know bitmap index generation is not enabled for resultant pack:

  The current code has singleton bitmap_git, so it cannot work
  simultaneously with two bitmap indices.

  We also want to avoid (at least with current implementation)
  generating bitmaps off of bitmaps. The reason here is: when generating
  a pack, not-yet-packed objects will be emitted into pack in
  suboptimal order and added to tail of the bitmap as "extended entries".
  When the resultant pack + some new objects in associated repository
  are in turn used to generate another pack with bitmap, the situation
  repeats: new objects are again not emitted optimally and just added to
  bitmap tail - not in recency order.

  So the pack badness can grow over time when at each step we have
  bitmapped pack + some other objects. That's why we want to avoid
  generating bitmaps off of bitmaps, not to let pack badness grow.

- if we keep pack reuse enabled still only for "send-to-stdout" case:

  Because pack-to-file needs to generate index for destination pack, and
  currently on pack reuse raw entries are directly written out to the
  destination pack by write_reused_pack(), bypassing needed for pack index
  generation bookkeeping done by regular codepath in write_one() and
  friends.

  ( In the future we might teach pack-reuse code about cases when index
    also needs to be generated for resultant pack and remove
    pack-reuse-only-for-stdout limitation )

This way for pack-objects -> file we get nice speedup:

    erp5.git[1] (~230MB) extracted from ~ 5GB lab.nexedi.com backup
    repository managed by git-backup[2] via

    time echo 0186ac99 | git pack-objects --revs erp5pack

before:  37.2s
after:   26.2s

And for `git repack -adb` packed git.git

    time echo 5c589a73 | git pack-objects --revs gitpack

before:   7.1s
after:    3.6s

i.e. it can be 30% - 50% speedup for pack extraction.

git-backup extracts many packs on repositories restoration. That was my
initial motivation for the patch.

[1] https://lab.nexedi.com/nexedi/erp5
[2] https://lab.nexedi.com/kirr/git-backup

NOTE

Jeff also suggests that pack.useBitmaps was probably a mistake to
introduce originally. This way we are not adding another config point,
but instead just always default to-file pack-objects not to use bitmap
index: Tools which need to generate on-disk packs with using bitmap, can
pass --use-bitmap-index explicitly. And git-repack does never pass
--use-bitmap-index, so this way we can be sure regular on-disk repacking
remains robust.

NOTE2

`git pack-objects --stdout >file.pack` + `git index-pack file.pack` is much slower
than `git pack-objects file.pack`. Extracting erp5.git pack from
lab.nexedi.com backup repository:

    $ time echo 0186ac99 | git pack-objects --stdout --revs >erp5pack-stdout.pack

    real    0m22.309s
    user    0m21.148s
    sys     0m0.932s

    $ time git index-pack erp5pack-stdout.pack

    real    0m50.873s   <-- more than 2 times slower than time to generate pack itself!
    user    0m49.300s
    sys     0m1.360s

So the time for

    `pack-object --stdout >file.pack` + `index-pack file.pack`  is  72s,

while

    `pack-objects file.pack` which does both pack and index     is  27s.

And even

    `pack-objects --no-use-bitmap-index file.pack`              is  37s.

Jeff explains:

    The packfile does not carry the sha1 of the objects. A receiving
    index-pack has to compute them itself, including inflating and applying
    all of the deltas.

that's why for `git-backup restore` we want to teach `git pack-objects
file.pack` to use bitmaps instead of using `git pack-objects --stdout
>file.pack` + `git index-pack file.pack`.

NOTE3

The speedup is now tracked via t/perf/p5310-pack-bitmaps.sh

    Test                                    56dfeb62          this tree
    --------------------------------------------------------------------------------
    5310.2: repack to disk                  8.98(8.05+0.29)   9.05(8.08+0.33) +0.8%
    5310.3: simulated clone                 2.02(2.27+0.09)   2.01(2.25+0.08) -0.5%
    5310.4: simulated fetch                 0.81(1.07+0.02)   0.81(1.05+0.04) +0.0%
    5310.5: pack to file                    7.58(7.04+0.28)   7.60(7.04+0.30) +0.3%
    5310.6: pack to file (bitmap)           7.55(7.02+0.28)   3.25(2.82+0.18) -57.0%
    5310.8: clone (partial bitmap)          1.83(2.26+0.12)   1.82(2.22+0.14) -0.5%
    5310.9: pack to file (partial bitmap)   6.86(6.58+0.30)   2.87(2.74+0.20) -58.2%

More context:

    http://marc.info/?t=146792101400001&r=1&w=2
    http://public-inbox.org/git/20160707190917.20011-1-kirr@nexedi.com/T/#t

Cc: Vicent Marti <tanoku@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-12 13:47:41 -07:00
Kirill Smelkov
702d1b9583 pack-objects: respect --local/--honor-pack-keep/--incremental when bitmap is in use
Since 6b8fda2d (pack-objects: use bitmaps when packing objects) there
are two codepaths in pack-objects: with & without using bitmap
reachability index.

However add_object_entry_from_bitmap(), despite its non-bitmapped
counterpart add_object_entry(), in no way does check for whether --local
or --honor-pack-keep or --incremental should be respected. In
non-bitmapped codepath this is handled in want_object_in_pack(), but
bitmapped codepath has simply no such checking at all.

The bitmapped codepath however was allowing to pass in all those options
and with bitmap indices still being used under such conditions -
potentially giving wrong output (e.g. including objects from non-local or
.keep'ed pack).

We can easily fix this by noting the following: when an object comes to
add_object_entry_from_bitmap() it can come for two reasons:

    1. entries coming from main pack covered by bitmap index, and
    2. object coming from, possibly alternate, loose or other packs.

"2" can be already handled by want_object_in_pack() and to cover
"1" we can teach want_object_in_pack() to expect that *found_pack can be
non-NULL, meaning calling client already found object's pack entry.

In want_object_in_pack() we care to start the checks from already found
pack, if we have one, this way determining the answer right away
in case neither --local nor --honour-pack-keep are active. In
particular, as p5310-pack-bitmaps.sh shows (3 consecutive runs), we do
not do harm to served-with-bitmap clones performance-wise:

    Test                      56dfeb62          this tree
    -----------------------------------------------------------------
    5310.2: repack to disk    9.08(8.20+0.25)   9.09(8.14+0.32) +0.1%
    5310.3: simulated clone   1.92(2.12+0.08)   1.93(2.12+0.09) +0.5%
    5310.4: simulated fetch   0.82(1.07+0.04)   0.82(1.06+0.04) +0.0%
    5310.6: partial bitmap    1.96(2.42+0.13)   1.95(2.40+0.15) -0.5%

    Test                      56dfeb62          this tree
    -----------------------------------------------------------------
    5310.2: repack to disk    9.11(8.16+0.32)   9.11(8.19+0.28) +0.0%
    5310.3: simulated clone   1.93(2.14+0.07)   1.92(2.11+0.10) -0.5%
    5310.4: simulated fetch   0.82(1.06+0.04)   0.82(1.04+0.05) +0.0%
    5310.6: partial bitmap    1.95(2.38+0.16)   1.94(2.39+0.14) -0.5%

    Test                      56dfeb62          this tree
    -----------------------------------------------------------------
    5310.2: repack to disk    9.13(8.17+0.31)   9.07(8.13+0.28) -0.7%
    5310.3: simulated clone   1.92(2.13+0.07)   1.91(2.12+0.06) -0.5%
    5310.4: simulated fetch   0.82(1.08+0.03)   0.82(1.08+0.03) +0.0%
    5310.6: partial bitmap    1.96(2.43+0.14)   1.96(2.42+0.14) +0.0%

with delta timings showing they are all within noise from run to run.

In the general case we do not want to call find_pack_entry_one() more than
once, because it is expensive. This patch splits the loop in
want_object_in_pack() into two parts: finding the object and seeing if it
impacts our choice to include it in the pack. We may call the inexpensive
want_found_object() twice, but we will never call find_pack_entry_one() if we
do not need to.

I appreciate help and discussing this change with Junio C Hamano and
Jeff King.

Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-12 13:47:41 -07:00
Johannes Schindelin
321459439e cat-file: support --textconv/--filters in batch mode
With this patch, --batch can be combined with --textconv or --filters.
For this to work, the input needs to have the form

	<object name><single white space><path>

so that the filters can be chosen appropriately.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-11 14:48:15 -07:00
Johannes Schindelin
7bcf341453 cat-file --textconv/--filters: allow specifying the path separately
There are circumstances when it is relatively easy to figure out the
object name for a given path, but not the name of the containing tree.
For example, when looking at a diff generated by Git, the object names
are recorded, but not the revision. As a matter of fact, the revisions
from which the diff was generated may not even exist locally.

In such a case, the user would have to generate a fake revision just to
be able to use --textconv or --filters.

Let's simplify this dramatically, because we do not really need that
revision at all: all we care about is that we know the path. In the
scenario described above, we do know the path, and we just want to
specify it separately from the object name.

Example usage:

	git cat-file --textconv --path=main.c 0f1937fd

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-11 14:48:15 -07:00
Johannes Schindelin
b9e62f6011 cat-file: introduce the --filters option
The --filters option applies the convert_to_working_tree() filter for
the path when showing the contents of a regular file blob object;
the contents are written out as-is for other types of objects.

This feature comes in handy when a 3rd-party tool wants to work with
the contents of files from past revisions as if they had been checked
out, but without detouring via temporary files.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-11 14:47:46 -07:00
Jonathan Tan
eb398797cd connect: advertized capability is not a ref
When cloning an empty repository served by standard git, "git clone" produces
the following reassuring message:

	$ git clone git://localhost/tmp/empty
	Cloning into 'empty'...
	warning: You appear to have cloned an empty repository.
	Checking connectivity... done.

Meanwhile when cloning an empty repository served by JGit, the output is more
haphazard:

	$ git clone git://localhost/tmp/empty
	Cloning into 'empty'...
	Checking connectivity... done.
	warning: remote HEAD refers to nonexistent ref, unable to checkout.

This is a common command to run immediately after creating a remote repository
as preparation for adding content to populate it and pushing. The warning is
confusing and needlessly worrying.

The cause is that, since v3.1.0.201309270735-rc1~22 (Advertise capabilities
with no refs in upload service., 2013-08-08), JGit's ref advertisement includes
a ref named capabilities^{} to advertise its capabilities on, while git's ref
advertisement is empty in this case. This allows the client to learn about the
server's capabilities and is needed, for example, for fetch-by-sha1 to work
when no refs are advertised.

This also affects "ls-remote". For example, against an empty repository served
by JGit:

	$ git ls-remote git://localhost/tmp/empty
	0000000000000000000000000000000000000000        capabilities^{}

Git advertises the same capabilities^{} ref in its ref advertisement for push
but since it never did so for fetch, the client didn't need to handle this
case.  Handle it.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-09 13:40:36 -07:00
Jonathan Tan
63b747ce1a tests: move test_lazy_prereq JGIT to test-lib.sh
This enables JGIT to be used as a prereq in invocations of
test_expect_success (and other functions) in other test scripts.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-09 13:37:52 -07:00
Junio C Hamano
02c6c14d6c Merge branch 'sb/submodule-clone-rr'
"git clone --resurse-submodules --reference $path $URL" is a way to
reduce network transfer cost by borrowing objects in an existing
$path repository when cloning the superproject from $URL; it
learned to also peek into $path for presense of corresponding
repositories of submodules and borrow objects from there when able.

* sb/submodule-clone-rr:
  clone: recursive and reference option triggers submodule alternates
  clone: implement optional references
  clone: clarify option_reference as required
  clone: factor out checking for an alternate path
  submodule--helper update-clone: allow multiple references
  submodule--helper module-clone: allow multiple references
  t7408: merge short tests, factor out testing method
  t7408: modernize style
2016-09-08 21:49:50 -07:00
Junio C Hamano
00d27937bf Merge branch 'jh/status-v2-porcelain'
Enhance "git status --porcelain" output by collecting more data on
the state of the index and the working tree files, which may
further be used to teach git-prompt (in contrib/) to make fewer
calls to git.

* jh/status-v2-porcelain:
  status: unit tests for --porcelain=v2
  test-lib-functions.sh: add lf_to_nul helper
  git-status.txt: describe --porcelain=v2 format
  status: print branch info with --porcelain=v2 --branch
  status: print per-file porcelain v2 status data
  status: collect per-file data for --porcelain=v2
  status: support --porcelain[=<version>]
  status: cleanup API to wt_status_print
  status: rename long-format print routines
2016-09-08 21:49:50 -07:00
Junio C Hamano
d7ed183a91 Merge branch 'rt/help-unknown'
"git nosuchcommand --help" said "No manual entry for gitnosuchcommand",
which was not intuitive, given that "git nosuchcommand" said "git:
'nosuchcommand' is not a git command".

* rt/help-unknown:
  help: make option --help open man pages only for Git commands
  help: introduce option --exclude-guides
2016-09-08 21:49:48 -07:00
Junio C Hamano
da3b6f06e1 Merge branch 'cc/receive-pack-limit'
An incoming "git push" that attempts to push too many bytes can now
be rejected by setting a new configuration variable at the receiving
end.

* cc/receive-pack-limit:
  receive-pack: allow a maximum input size to be specified
  unpack-objects: add --max-input-size=<size> option
  index-pack: add --max-input-size=<size> option
2016-09-08 21:49:47 -07:00
Junio C Hamano
452a9073ba Merge branch 'jk/format-patch-number-singleton-patch-with-cover'
"git format-patch --cover-letter HEAD^" to format a single patch
with a separate cover letter now numbers the output as [PATCH 0/1]
and [PATCH 1/1] by default.

* jk/format-patch-number-singleton-patch-with-cover:
  format-patch: show 0/1 and 1/1 for singleton patch with cover letter
2016-09-08 21:49:47 -07:00
Junio C Hamano
c4071eace9 Merge branch 'jk/delta-base-cache'
The delta-base-cache mechanism has been a key to the performance in
a repository with a tightly packed packfile, but it did not scale
well even with a larger value of core.deltaBaseCacheLimit.

* jk/delta-base-cache:
  t/perf: add basic perf tests for delta base cache
  delta_base_cache: use hashmap.h
  delta_base_cache: drop special treatment of blobs
  delta_base_cache: use list.h for LRU
  release_delta_base_cache: reuse existing detach function
  clear_delta_base_cache_entry: use a more descriptive name
  cache_or_unpack_entry: drop keep_cache parameter
2016-09-08 21:49:46 -07:00
Junio C Hamano
b5abd302ef Merge branch 'sg/reflog-past-root' into maint
A small test clean-up for a topic introduced in v2.9.1 and later.

* sg/reflog-past-root:
  t1410: remove superfluous 'git reflog' from the 'walk past root' test
2016-09-08 21:36:02 -07:00
Junio C Hamano
c0e8b3b444 Merge branch 'bw/mingw-avoid-inheriting-fd-to-lockfile' into maint
The tempfile (hence its user lockfile) API lets the caller to open
a file descriptor to a temporary file, write into it and then
finalize it by first closing the filehandle and then either
removing or renaming the temporary file.  When the process spawns a
subprocess after obtaining the file descriptor, and if the
subprocess has not exited when the attempt to remove or rename is
made, the last step fails on Windows, because the subprocess has
the file descriptor still open.  Open tempfile with O_CLOEXEC flag
to avoid this (on Windows, this is mapped to O_NOINHERIT).

* bw/mingw-avoid-inheriting-fd-to-lockfile:
  mingw: ensure temporary file handles are not inherited by child processes
  t6026-merge-attr: child processes must not inherit index.lock handles
2016-09-08 21:35:56 -07:00
Junio C Hamano
bde42f081e Merge branch 'jk/difftool-command-not-found' into maint
"git difftool" by default ignores the error exit from the backend
commands it spawns, because often they signal that they found
differences by exiting with a non-zero status code just like "diff"
does; the exit status codes 126 and above however are special in
that they are used to signal that the command is not executable,
does not exist, or killed by a signal.  "git difftool" has been
taught to notice these exit status codes.

* jk/difftool-command-not-found:
  difftool: always honor fatal error exit codes
2016-09-08 21:35:54 -07:00
Junio C Hamano
7c96471947 Merge branch 'sb/checkout-explit-detach-no-advice' into maint
"git checkout --detach <branch>" used to give the same advice
message as that is issued when "git checkout <tag>" (or anything
that is not a branch name) is given, but asking with "--detach" is
an explicit enough sign that the user knows what is going on.  The
advice message has been squelched in this case.

* sb/checkout-explit-detach-no-advice:
  checkout: do not mention detach advice for explicit --detach option
2016-09-08 21:35:54 -07:00
Junio C Hamano
69307312d1 Merge branch 'rs/pull-signed-tag' into maint
When "git merge-recursive" works on history with many criss-cross
merges in "verbose" mode, the names the command assigns to the
virtual merge bases could have overwritten each other by unintended
reuse of the same piece of memory.

* rs/pull-signed-tag:
  commit: use FLEX_ARRAY in struct merge_remote_desc
  merge-recursive: fix verbose output for multiple base trees
  commit: factor out set_merge_remote_desc()
  commit: use xstrdup() in get_merge_parent()
2016-09-08 21:35:54 -07:00
Junio C Hamano
86df11b1a4 Merge branch 'js/test-lint-pathname' into maint
The "t/" hierarchy is prone to get an unusual pathname; "make test"
has been taught to make sure they do not contain paths that cannot
be checked out on Windows (and the mechanism can be reusable to
catch pathnames that are not portable to other platforms as need
arises).

* js/test-lint-pathname:
  t/Makefile: ensure that paths are valid on platforms we care
2016-09-08 21:35:54 -07:00
Junio C Hamano
f34d900aa7 Merge branch 'jk/push-force-with-lease-creation' into maint
"git push --force-with-lease" already had enough logic to allow
ensuring that such a push results in creation of a ref (i.e. the
receiving end did not have another push from sideways that would be
discarded by our force-pushing), but didn't expose this possibility
to the users.  It does so now.

* jk/push-force-with-lease-creation:
  t5533: make it pass on case-sensitive filesystems
  push: allow pushing new branches with --force-with-lease
  push: add shorthand for --force-with-lease branch creation
  Documentation/git-push: fix placeholder formatting
2016-09-08 21:35:53 -07:00
Junio C Hamano
f59c6e6ccb Merge branch 'jk/reflog-date' into maint
The reflog output format is documented better, and a new format
--date=unix to report the seconds-since-epoch (without timezone)
has been added.

* jk/reflog-date:
  date: clarify --date=raw description
  date: add "unix" format
  date: document and test "raw-local" mode
  doc/pretty-formats: explain shortening of %gd
  doc/pretty-formats: describe index/time formats for %gd
  doc/rev-list-options: explain "-g" output formats
  doc/rev-list-options: clarify "commit@{Nth}" for "-g" option
2016-09-08 21:35:52 -07:00
Junio C Hamano
7f5885ad2a Merge branch 'jc/renormalize-merge-kill-safer-crlf' into maint
"git merge" with renormalization did not work well with
merge-recursive, due to "safer crlf" conversion kicking in when it
shouldn't.

* jc/renormalize-merge-kill-safer-crlf:
  merge: avoid "safer crlf" during recording of merge results
  convert: unify the "auto" handling of CRLF
2016-09-08 21:35:52 -07:00
Junio C Hamano
faacc8efe5 Merge branch 'jk/common-main' into maint
There are certain house-keeping tasks that need to be performed at
the very beginning of any Git program, and programs that are not
built-in commands had to do them exactly the same way as "git"
potty does.  It was easy to make mistakes in one-off standalone
programs (like test helpers).  A common "main()" function that
calls cmd_main() of individual program has been introduced to
make it harder to make mistakes.

* jk/common-main:
  mingw: declare main()'s argv as const
  common-main: call git_setup_gettext()
  common-main: call restore_sigpipe_to_default()
  common-main: call sanitize_stdfds()
  common-main: call git_extract_argv0_path()
  add an extra level of indirection to main()
2016-09-08 21:35:51 -07:00
Jeff King
d63ed6ef24 remote-curl: handle URLs without protocol
Generally remote-curl would never see a URL that did not
have "proto:" at the beginning, as that is what tells git to
run the "git-remote-proto" helper (and git-remote-http, etc,
are aliases for git-remote-curl).

However, the special syntax "proto::something" will run
git-remote-proto with only "something" as the URL. So a
malformed URL like:

  http::/example.com/repo.git

will feed the URL "/example.com/repo.git" to
git-remote-http. The resulting URL has no protocol, but the
code added by 372370f (http: use credential API to handle
proxy authentication, 2016-01-26) does not handle this case
and segfaults.

For the purposes of this code, we don't really care what the
exact protocol; only whether or not it is https. So let's
just assume that a missing protocol is not, and curl will
handle the real error (which is that the URL is nonsense).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-08 11:23:43 -07:00
brian m. carlson
99d1a9861a cache: convert struct cache_entry to use struct object_id
Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib, plus
the actual change to the struct:

@@
struct cache_entry E1;
@@
- E1.sha1
+ E1.oid.hash

@@
struct cache_entry *E1;
@@
- E1->sha1
+ E1->oid.hash

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 12:59:42 -07:00
Ralf Thielow
37875b4733 rebase -i: improve advice on bad instruction lines
If we found bad instruction lines in the instruction sheet
of interactive rebase, we give the user advice on how to
fix it.  However, we don't tell the user what to do afterwards.
Give the user advice to run 'git rebase --continue' after
the fix.

Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 11:56:05 -07:00
Jeff King
b773ddea2c pack-objects: walk tag chains for --include-tag
When pack-objects is given --include-tag, it peels each tag
ref down to a non-tag object, and if that non-tag object is
going to be packed, we include the tag, too. But what
happens if we have a chain of tags (e.g., tag "A" points to
tag "B", which points to commit "C")?

We'll peel down to "C" and realize that we want to include
tag "A", but we do not ever consider tag "B", leading to a
broken pack (assuming "B" was not otherwise selected).
Instead, we have to walk the whole chain, adding any tags we
find to the pack.

Interestingly, it doesn't seem possible to trigger this
problem with "git fetch", but you can with "git clone
--single-branch". The reason is that we generate the correct
pack when the client explicitly asks for "A" (because we do
a real reachability analysis there), and "fetch" is more
willing to do so. There are basically two cases:

  1. If "C" is already a ref tip, then the client can deduce
     that it needs "A" itself (via find_non_local_tags), and
     will ask for it explicitly rather than relying on the
     include-tag capability. Everything works.

  2. If "C" is not already a ref tip, then we hope for
     include-tag to send us the correct tag. But it doesn't;
     it generates a broken pack. However, the next step is
     to do a follow-up run of find_non_local_tags(),
     followed by fetch_refs() to backfill any tags we
     learned about.

     In the normal case, fetch_refs() calls quickfetch(),
     which does a connectivity check and sees we have no
     new objects to fetch. We just write the refs.

     But for the broken-pack case, the connectivity check
     fails, and quickfetch will follow-up with the remote,
     asking explicitly for each of the ref tips. This picks
     up the missing object in a new pack.

For a regular "git clone", we are similarly OK, because we
explicitly request all of the tag refs, and get a correct
pack. But with "--single-branch", we kick in tag
auto-following via "include-tag", but do _not_ do a
follow-up backfill. We just take whatever the server sent us
via include-tag and write out tag refs for any tag objects
we were sent. So prior to c6807a4 (clone: open a shortcut
for connectivity check, 2013-05-26), we actually claimed the
clone was a success, but the result was silently
corrupted!  Since c6807a4, index-pack's connectivity
check catches this case, and we correctly complain.

The included test directly checks that pack-objects does not
generate a broken pack, but also confirms that "clone
--single-branch" does not hit the bug.

Note that tag chains introduce another interesting question:
if we are packing the tag "B" but not the commit "C", should
"A" be included?

Both before and after this patch, we do not include "A",
because the initial peel_ref() check only knows about the
bottom-most level, "C". To realize that "B" is involved at
all, we would have to switch to an incremental peel, in
which we examine each tagged object, asking if it is being
packed (and including the outer tag if so).

But that runs contrary to the optimizations in peel_ref(),
which avoid accessing the objects at all, in favor of using
the value we pull from packed-refs. It's OK to walk the
whole chain once we know we're going to include the tag (we
have to access it anyway, so the effort is proportional to
the pack we're generating). But for the initial selection,
we have to look at every ref. If we're only packing a few
objects, we'd still have to parse every single referenced
tag object just to confirm that it isn't part of a tag
chain.

This could be addressed if packed-refs stored the complete
tag chain for each peeled ref (in most cases, this would be
the same cost as now, as each "chain" is only a single
link). But given the size of that project, it's out of scope
for this fix (and probably nobody cares enough anyway, as
it's such an obscure situation). This commit limits itself
to just avoiding the creation of a broken pack.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 11:45:31 -07:00
Jeff King
ab5178356c t5305: simplify packname handling
We generate a series of packfiles test-1-$pack,
test-2-$pack, with different properties and then examine
them. However we always store the packname generated by
pack-objects in the variable packname_1. This probably was
meant to be packname_2 in the second test, but it turns out
that it doesn't matter: once we are done with the first
pack, we can just keep using the same $packname variable.

So let's drop the confusing "_1" parameter. At the same
time, let's give test-1 and test-2 more descriptive names,
which can help keep them straight (note that we _could_
likewise overwrite the packfiles in each test, but by using
separate filenames, we are sure that test 2 does not
accidentally use the packfile from test 1).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 11:45:29 -07:00
Jeff King
948a7fd242 t5305: use "git -C"
This test unpacks objects into a separate repository, and
accesses it by setting GIT_DIR in a subshell. We can do the
same thing these days by using "git init <repo>" and "git
-C". In most cases this is shorter, though when there are
multiple commands, we may end up repeating the "-C".

However, this repetition can actually be a good thing. This
patch also fixes a bug introduced by 512477b (tests: use
"env" to run commands with temporary env-var settings,
2014-03-18). That commit essentially converted:

   (GIT_DIR=...; export GIT_DIR
    cmd1 &&
    cmd2)

into:

   (GIT_DIR=... cmd1 &&
    cmd2)

which obviously loses the GIT_DIR setting for cmd2 (we never
noticed the bug because it simply runs "cmd2" in the parent
repo, which means we were simply failing to test anything
interesting). By using "git -C" rather than a subshell, it
becomes quite obvious where each command is supposed to be
running.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 11:45:28 -07:00
Jeff King
2076353f47 t5305: drop "dry-run" of unpack-objects
For each test we do a dry-run of unpack-objects, followed by
a real run, followed by confirming that it contained the
objects we expected. The dry-run is telling us nothing, as
any errors it encounters would be found in the real run.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 11:45:27 -07:00
Jeff King
1962d9fbe3 t5305: move cleanup into test block
We usually try to avoid doing any significant actions
outside of test blocks. Although "rm -rf" is unlikely to
either fail or to generate output, moving these to the
point of use makes it more clear that they are part of the
overall setup of "clone.git".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 11:45:26 -07:00
Elia Pinto
14e24114d9 t5551-http-fetch-smart.sh: use the GIT_TRACE_CURL environment var
Use the new GIT_TRACE_CURL environment variable instead
of the deprecated GIT_CURL_VERBOSE.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 11:41:45 -07:00
Elia Pinto
81590bf77d t5550-http-fetch-dumb.sh: use the GIT_TRACE_CURL environment var
Use the new GIT_TRACE_CURL environment variable instead
of the deprecated GIT_CURL_VERBOSE.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 11:41:42 -07:00
Elia Pinto
4527aa10a6 test-lib.sh: preserve GIT_TRACE_CURL from the environment
Turning on this variable can be useful when debugging http
tests. It can break a few tests in t5541 if not set
to an absolute path but it is not a variable
that the user is likely to have enabled accidentally.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 11:41:40 -07:00
Elia Pinto
4eee6c6ddc t5541-http-push-smart.sh: use the GIT_TRACE_CURL environment var
Use the new GIT_TRACE_CURL environment variable instead
of the deprecated GIT_CURL_VERBOSE.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 11:41:39 -07:00
Johannes Sixt
5babb5bdb3 t6026-merge-attr: clean up background process at end of test case
The process spawned in the hook uses the test's trash directory as CWD.
As long as it is alive, the directory cannot be removed on Windows.
Although the test succeeds, the 'test_done' that follows produces an
error message and leaves the trash directory around. Kill the process
before the test case advances.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 11:40:22 -07:00
Johannes Sixt
c00bfc9d1b t9903: fix broken && chain
We might wonder why our && chain check does not catch this case:
The && chain check uses a strange exit code with the expectation that
the second or later part of a broken && chain would not exit with this
particular code.

This expectation does not work in this case because __git_ps1, being
the first command in the second part of the broken && chain, records
the current exit code, does its work, and finally returns to the caller
with the recorded exit code. This fools our && chain check.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07 11:35:08 -07:00
Junio C Hamano
12cfa792b8 symbolic-ref -d: do not allow removal of HEAD
If you delete the symbolic-ref HEAD from a repository, Git no longer
considers the repository valid, and even "git symbolic-ref HEAD
refs/heads/master" would not be able to recover from that state
(although "git init" can, but that is a sure sign that you are
talking about a "broken" repository).

In the spirit similar to afe5d3d5 ("symbolic ref: refuse non-ref
targets in HEAD", 2009-01-29), forbid removal of HEAD to avoid
corrupting a repository.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-02 09:01:38 -07:00
Junio C Hamano
10f5c52656 submodule: avoid auto-discovery in prepare_submodule_repo_env()
The function is used to set up the environment variable used in a
subprocess we spawn in a submodule directory.  The callers set up a
child_process structure, find the working tree path of one submodule
and set .dir field to it, and then use start_command() API to spawn
the subprocess like "status", "fetch", etc.

When this happens, we expect that the ".git" (either a directory or
a gitfile that points at the real location) in the current working
directory of the subprocess MUST be the repository for the submodule.

If this ".git" thing is a corrupt repository, however, because
prepare_submodule_repo_env() unsets GIT_DIR and GIT_WORK_TREE, the
subprocess will see ".git", thinks it is not a repository, and
attempt to find one by going up, likely to end up in finding the
repository of the superproject.  In some codepaths, this will cause
a command run with the "--recurse-submodules" option to recurse
forever.

By exporting GIT_DIR=.git, disable the auto-discovery logic in the
subprocess, which would instead stop it and report an error.

The test illustrates existing problems in a few callsites of this
function.  Without this fix, "git fetch --recurse-submodules", "git
status" and "git diff" keep recursing forever.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-01 14:01:29 -07:00
Jacob Keller
fd47ae6a5b diff: teach diff to display submodule difference with an inline diff
Teach git-diff and friends a new format for displaying the difference
of a submodule. The new format is an inline diff of the contents of the
submodule between the commit range of the update. This allows the user
to see the actual code change caused by a submodule update.

Add tests for the new format and option.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-31 18:07:10 -07:00
Jacob Keller
99b43a61f2 allow do_submodule_path to work even if submodule isn't checked out
Currently, do_submodule_path will attempt locating the .git directory by
using read_gitfile on <path>/.git. If this fails it just assumes the
<path>/.git is actually a git directory.

This is good because it allows for handling submodules which were cloned
in a regular manner first before being added to the superproject.

Unfortunately this fails if the <path> is not actually checked out any
longer, such as by removing the directory.

Fix this by checking if the directory we found is actually a gitdir. In
the case it is not, attempt to lookup the submodule configuration and
find the name of where it is stored in the .git/modules/ directory of
the superproject.

If we can't locate the submodule configuration, this might occur because
for example a submodule gitlink was added but the corresponding
.gitmodules file was not properly updated.  A die() here would not be
pleasant to the users of submodule diff formats, so instead, modify
do_submodule_path() to return an error code:

 - git_pathdup_submodule() returns NULL when we fail to find a path.
 - strbuf_git_path_submodule() propagates the error code to the caller.

Modify the callers of these functions to check the error code and fail
properly. This ensures we don't attempt to use a bad path that doesn't
match the corresponding submodule.

Because this change fixes add_submodule_odb() to work even if the
submodule is not checked out, update the wording of the submodule log
diff format to correctly display that the submodule is "not initialized"
instead of "not checked out"

Add tests to ensure this change works as expected.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-31 18:07:10 -07:00
Jacob Keller
660e113ce1 graph: add support for --line-prefix on all graph-aware output
Add an extension to git-diff and git-log (and any other graph-aware
displayable output) such that "--line-prefix=<string>" will print the
additional line-prefix on every line of output.

To make this work, we have to fix a few bugs in the graph API that force
graph_show_commit_msg to be used only when you have a valid graph.
Additionally, we extend the default_diff_output_prefix handler to work
even when no graph is enabled.

This is somewhat of a hack on top of the graph API, but I think it
should be acceptable here.

This will be used by a future extension of submodule display which
displays the submodule diff as the actual diff between the pre and post
commit in the submodule project.

Add some tests for both git-log and git-diff to ensure that the prefix
is honored correctly.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-31 18:07:09 -07:00
Junio C Hamano
4762bf36d9 Merge branch 'mh/blame-worktree'
* mh/blame-worktree:
  blame: fix segfault on untracked files
2016-08-31 10:03:50 -07:00
Junio C Hamano
9010077be2 Merge branch 'kw/patch-ids-optim'
* kw/patch-ids-optim:
  p3400: make test script executable
2016-08-31 10:03:49 -07:00
Ralf Thielow
2c6b6d9f7d help: make option --help open man pages only for Git commands
If option --help is passed to a Git command, we try to open
the man page of that command.  However, we do it for both commands
and concepts.  Make sure it is an actual command.

This makes "git <concept> --help" not working anymore, while
"git help <concept>" still works.

Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-30 16:09:41 -07:00
Ralf Thielow
af74128f4a help: introduce option --exclude-guides
Introduce option --exclude-guides to the help command.  With this option
being passed, "git help" will open man pages only for actual commands.

Since we know it is a command, we can use function help_unknown_command
to give the user advice on typos.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-30 16:09:41 -07:00
Jeff King
5c885c1b53 test-lib: drop PID from test-results/*.count
Each test run generates a "count" file in t/test-results
that stores the number of successful, failed, etc tests.
If you run "t1234-foo.sh", that file is named as
"t/test-results/t1234-foo-$$.count"

The addition of the PID there is serving no purpose, and
makes analysis of the count files harder.

The presence of the PID dates back to 2d84e9f (Modify
test-lib.sh to output stats to t/test-results/*,
2008-06-08), but no reasoning is given there. Looking at the
current code, we can see that other files we write to
test-results (like *.exit and *.out) do _not_ have the PID
included. So the presence of the PID does not meaningfully
allow one to store the results from multiple runs anyway.

Moreover, anybody wishing to read the *.count files to
aggregate results has to deal with the presence of multiple
files for a given test (and figure out which one is the most
recent based on their timestamps!). The only consumer of
these files is the aggregate.sh script, which arguably gets
this wrong. If a test is run multiple times, its counts will
appear multiple times in the total (I say arguably only
because the desired semantics aren't documented anywhere,
but I have trouble seeing how this behavior could be
useful).

So let's just drop the PID, which fixes aggregate.sh, and
will make new features based around the count files easier
to write.

Note that since the count-file may already exist (when
re-running a test), we also switch the "cat" from appending
to truncating. The use of append here was pointless in the
first place, as we expected to always write to a unique file.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-30 12:08:58 -07:00
René Scharfe
ba67504fa8 p3400: make test script executable
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-29 12:57:16 -07:00
Thomas Gummerer
bc6b13a7d2 blame: fix segfault on untracked files
Since 3b75ee9 ("blame: allow to blame paths freshly added to the index",
2016-07-16) git blame also looks at the index to determine if there is a
file that was freshly added to the index.

cache_name_pos returns -pos - 1 in case there is no match is found, or
if the name matches, but the entry has a stage other than 0.  As git
blame should work for unmerged files, it uses strcmp to determine
whether the name of the returned position matches, in which case the
file exists, but is merely unmerged, or if the file actually doesn't
exist in the index.

If the repository is empty, or if the file would lexicographically be
sorted as the last file in the repository, -cache_name_pos - 1 is
outside of the length of the active_cache array, causing git blame to
segfault.  Guard against that, and die() normally to restore the old
behaviour.

Reported-by: Simon Ruderich <simon@ruderich.org>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-29 11:57:33 -07:00
Junio C Hamano
3dc01702df Merge branch 'bw/mingw-avoid-inheriting-fd-to-lockfile'
The tempfile (hence its user lockfile) API lets the caller to open
a file descriptor to a temporary file, write into it and then
finalize it by first closing the filehandle and then either
removing or renaming the temporary file.  When the process spawns a
subprocess after obtaining the file descriptor, and if the
subprocess has not exited when the attempt to remove or rename is
made, the last step fails on Windows, because the subprocess has
the file descriptor still open.  Open tempfile with O_CLOEXEC flag
to avoid this (on Windows, this is mapped to O_NOINHERIT).

* bw/mingw-avoid-inheriting-fd-to-lockfile:
  mingw: ensure temporary file handles are not inherited by child processes
  t6026-merge-attr: child processes must not inherit index.lock handles
2016-08-25 13:55:07 -07:00
Jeff King
c08db5a2d0 receive-pack: allow a maximum input size to be specified
Receive-pack feeds its input to either index-pack or
unpack-objects, which will happily accept as many bytes as
a sender is willing to provide. Let's allow an arbitrary
cutoff point where we will stop writing bytes to disk.

Cleaning up what has already been written to disk is a
related problem that is not addressed by this patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-24 12:31:05 -07:00
Jacob Keller
957ed3a56c format-patch: show 0/1 and 1/1 for singleton patch with cover letter
Change the default behavior of git-format-patch to generate numbered
sequence of 0/1 and 1/1 when generating both a cover-letter and a single
patch. This standardizes the cover letter to have 0/N which helps
distinguish the cover letter from the patch itself. Since the behavior
is easily changed via configuration as well as the use of -n and -N this
should be acceptable default behavior.

Add tests for the new default behavior.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-23 15:59:11 -07:00
Jeff King
c7df68cbca t/perf: add basic perf tests for delta base cache
This just shows off the improvements done by the last few
patches, and gives us a baseline for noticing regressions in
the future. Here are the results with linux.git as the perf
"large repo":

Test                origin                HEAD
-------------------------------------------------------------------
0003.1: log --raw   43.41(40.36+2.69)     33.86(30.96+2.41) -22.0%
0003.2: log -S      313.61(309.74+3.78)   298.75(295.58+3.00) -4.7%

(for a large repo, the "log -S" improvements are greater if
you bump the delta base cache limit, but I think it makes
sense to test the "stock" behavior, since that is what most
people will see).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-23 15:26:16 -07:00