When creating a pack using objects that reside in existing packs, we try
to avoid recomputing futile delta between an object (trg) and a candidate
for its base object (src) if they are stored in the same packfile, and trg
is not recorded as a delta already. This heuristics makes sense because it
is likely that we tried to express trg as a delta based on src but it did
not produce a good delta when we created the existing pack.
As the pack heuristics prefer producing delta to remove data, and Linus's
law dictates that the size of a file grows over time, we tend to record
the newest version of the file as inflated, and older ones as delta
against it.
When creating a thin-pack to transfer recent history, it is likely that we
will try to send an object that is recorded in full, as it is newer. But
the heuristics to avoid recomputing futile delta effectively forbids us
from attempting to express such an object as a delta based on another
object. Sending an object in full is often more expensive than sending a
suboptimal delta based on other objects, and it is even more so if we
could use an object we know the receiving end already has (i.e. preferred
base object) as the delta base.
Tweak the recomputation avoidance logic, so that we do not punt on
computing delta against a preferred base object.
The effect of this change can be seen on two simulated upload-pack
workloads. The first is based on 44 reflog entries from my git.git
origin/master reflog, and represents the packs that kernel.org sent me git
updates for the past month or two. The second workload represents much
larger fetches, going from git's v1.0.0 tag to v1.1.0, then v1.1.0 to
v1.2.0, and so on.
The table below shows the average generated pack size and the average CPU
time consumed for each dataset, both before and after the patch:
dataset
| reflog | tags
---------------------------------
before | 53358 | 2750977
size after | 32398 | 2668479
change | -39% | -3%
---------------------------------
before | 0.18 | 1.12
CPU after | 0.18 | 1.15
change | +0% | +3%
This patch makes a much bigger difference for packs with a shorter slice
of history (since its effect is seen at the boundaries of the pack) though
it has some benefit even for larger packs.
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When showing the raw timestamp, we format the numeric
seconds-since-epoch into a buffer, followed by the timezone
string. This string has come straight from the commit
object. A well-formed object should have a timezone string
of only a few bytes, but we could be operating on data
pushed by a malicious user.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we fetch from a remote, we print a status table like:
From url
* [new branch] foo -> origin/foo
We create this table in a static buffer using sprintf. If
the remote refnames are long, they can overflow this buffer
and smash the stack.
Instead, let's use a strbuf to build the string.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The comment on top of stripspace() claims that the buffer
will no longer be NUL-terminated. However, this has not been
the case at least since the move to using strbuf in 2007.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mz/remote-rename:
remote: only update remote-tracking branch if updating refspec
remote rename: warn when refspec was not updated
remote: "rename o foo" should not rename ref "origin/bar"
remote: write correct fetch spec when renaming remote 'remote'
When running git describe --dirty the index should be refreshed. Previously
the cached index would cause describe to think that the index was dirty when,
in reality, it was just stale.
The issue was exposed by python setuptools which hardlinks files into another
directory when building a distribution.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git remote rename' will only update the remote's fetch refspec if it
looks like a default one. If the remote has no default fetch refspec,
as in
[remote "origin"]
url = git://git.kernel.org/pub/scm/git/git.git
fetch = +refs/heads/*:refs/remotes/upstream/*
we would not update the fetch refspec and even if there is a ref
called "refs/remotes/origin/master", we should not rename it, since it
was not created by fetching from the remote.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When renaming a remote, we also try to update the fetch refspec
accordingly, but only if it has the default format. For others, such
as refs/heads/master:refs/heads/origin, we are conservative and leave
it untouched. Let's give the user a warning about refspecs that are
not updated, so he can manually update the config if necessary.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When renaming a remote called 'o' using 'git remote rename o foo', git
should also rename any remote-tracking branches for the remote. This
does happen, but any remote-tracking branches starting with
'refs/remotes/o', such as 'refs/remotes/origin/bar', will also be
renamed (to 'refs/remotes/foorigin/bar' in this case).
Fix it by simply matching one more character, up to the slash
following the remote name.
Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When renaming a remote whose name is contained in a configured fetch
refspec for that remote, we currently replace the first occurrence of
the remote name in the refspec. This is correct in most cases, but
breaks if the remote name occurs in the fetch refspec before the
expected place. For example, we currently change
[remote "remote"]
url = git://git.kernel.org/pub/scm/git/git.git
fetch = +refs/heads/*:refs/remotes/remote/*
into
[remote "origin"]
url = git://git.kernel.org/pub/scm/git/git.git
fetch = +refs/heads/*:refs/origins/remote/*
Reduce the risk of changing incorrect sections of the refspec by
matching the entire ":refs/remotes/<name>/" instead of just "<name>".
Signed-off-by: Martin von Zweigbergk <martin.von.zweigbergk@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It makes no sense to do the - possibly very expensive - call to "rev-list
<new-ref-sha1> --not --all" in check_for_new_submodule_commits() when
there aren't any submodules configured.
Leave check_for_new_submodule_commits() early when no name <-> path
mappings for submodules are found in the configuration. To make that work
reading the configuration had to be moved further up in cmd_fetch(), as
doing that after the actual fetch of the superproject was too late.
Reported-by: Martin Fick <mfick@codeaurora.org>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit ffa69e61d3, reversing
changes made to 4a13c4d148.
Adding a new command line option to receive-pack and feed it from
send-pack is not an acceptable way to add features, as there is no
guarantee that your updated send-pack will be talking to updated
receive-pack. New features need to be added via the capability mechanism
negotiated over the protocol.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When asked if "refs///heads/master" is valid, check-ref-format says "Yes,
it is well formed", and when asked to print canonical form, it shows
"refs/heads/master". This is so that it can be tucked after "$GIT_DIR/"
to form a valid pathname for a loose ref, and we normalize a pathname like
"$GIT_DIR/refs///heads/master" to de-dup the slashes in it.
Similarly, when asked if "/refs/heads/master" is valid, check-ref-format
says "Yes, it is Ok", but the leading slash is not removed when printing,
leading to "$GIT_DIR//refs/heads/master".
Fix it to make sure such leading slashes are removed. Add tests that such
refnames are accepted and normalized correctly.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* cb/maint-quiet-push:
receive-pack: do not overstep command line argument array
propagate --quiet to send-pack/receive-pack
Conflicts:
Documentation/git-receive-pack.txt
Documentation/git-send-pack.txt
Cloning from a local repository blindly copies or hardlinks all the files
under objects/ hierarchy. This results in two issues:
- If the repository cloned has an "objects/info/alternates" file, and the
command line of clone specifies --reference, the ones specified on the
command line get overwritten by the copy from the original repository.
- An entry in a "objects/info/alternates" file can specify the object
stores it borrows objects from as a path relative to the "objects/"
directory. When cloning a repository with such an alternates file, if
the new repository is not sitting next to the original repository, such
relative paths needs to be adjusted so that they can be used in the new
repository.
This updates add_to_alternates_file() to take the path to the alternate
object store, including the "/objects" part at the end (earlier, it was
taking the path to $GIT_DIR and was adding "/objects" itself), as it is
technically possible to specify in objects/info/alternates file the path
of a directory whose name does not end with "/objects".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Also add a test to expose a long-standing bug that is triggered when
cloning with --reference option from a local repository that has its own
alternates. The alternate object stores specified on the command line
are lost, and only alternates copied from the source repository remain.
The bug will be fixed in the next patch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function was not gentle at all to the callers and died without giving
them a chance to deal with possible errors. Rename it to read_gitfile(),
and update all the callers.
As no existing caller needs a true "gently" variant, we do not bother
adding one at this point.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A malicious server can return ACK with non-existent SHA-1 or not a
commit. lookup_commit() in this case may return NULL. Do not let
fetch-pack crash by accessing NULL address in this case.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The first paragraph about flag order is no longer true and is
mentioned in git-checkout-index.txt. The rest is also mentioned in
git-checkout-index.txt.
Remove it and keep uptodate document in one place.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/ls-tree-error:
Ensure git ls-tree exits with a non-zero exit code if read_tree_recursive fails.
Add a test to check that git ls-tree sets non-zero exit code on error.
* jc/zlib-wrap:
zlib: allow feeding more than 4GB in one go
zlib: zlib can only process 4GB at a time
zlib: wrap deflateBound() too
zlib: wrap deflate side of the API
zlib: wrap inflateInit2 used to accept only for gzip format
zlib: wrap remaining calls to direct inflate/inflateEnd
zlib wrapper: refactor error message formatter
The following sequence of commands reveals an issue with error
reporting of relative paths:
$ mkdir sub
$ cd sub
$ git ls-files --error-unmatch ../bbbbb
error: pathspec 'b' did not match any file(s) known to git.
$ git commit --error-unmatch ../bbbbb
error: pathspec 'b' did not match any file(s) known to git.
This bug is visible only if the normalized path (i.e., the relative
path from the repository root) is longer than the prefix.
Otherwise, the code skips over the normalized path and reads from
an unused memory location which still contains a leftover of the
original command line argument.
So instead, use the existing facilities to deal with relative paths
correctly.
Also fix inconsistency between "checkout" and "commit", e.g.
$ cd Documentation
$ git checkout nosuch.txt
error: pathspec 'Documentation/nosuch.txt' did not match...
$ git commit nosuch.txt
error: pathspec 'nosuch.txt' did not match...
by propagating the prefix down the codepath that reports the error.
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many pathnames in a fast-import stream need to be quoted. In
particular:
1. Pathnames at the end of an "M" or "D" line need quoting
if they contain a LF or start with double-quote.
2. Pathnames on a "C" or "R" line need quoting as above,
but also if they contain spaces.
For (1), we weren't quoting at all. For (2), we put
double-quotes around the paths to handle spaces, but ignored
the possibility that they would need further quoting.
This patch checks whether each pathname needs c-style
quoting, and uses it. This is slightly overkill for (1),
which doesn't actually need to quote many characters that
vanilla c-style quoting does. However, it shouldn't hurt, as
any implementation needs to be ready to handle quoted
strings anyway.
In addition to adding a test, we have to tweak a test which
blindly assumed that case (2) would always use
double-quotes, whether it needed to or not.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reflog manpage says:
git reflog [show] [log-options] [<ref>]
the subcommand 'show' is the default "in the absence of any
subcommands". Currently this is only true if the user provided either
at least one option or no additional argument at all. For example:
git reflog master
won't work. Change this by actually calling cmd_log_reflog in
absence of any subcommand.
Signed-off-by: Michael Schubert <mschub@elegosoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Currently, git push --quiet produces some non-error output, e.g.:
$ git push --quiet
Unpacking objects: 100% (3/3), done.
Add the --quiet option to send-pack/receive-pack and pass it to
unpack-objects in the receive-pack codepath and to receive-pack in
the push codepath.
This fixes a bug reported for the fedora git package:
https://bugzilla.redhat.com/show_bug.cgi?id=725593
Reported-by: Jesse Keating <jkeating@redhat.com>
Cc: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the case of a corrupt repository, git ls-tree may report an error but
presently it exits with a code of 0.
This change uses the return code of read_tree_recursive instead.
Improved-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Jon Seymour <jon.seymour@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reset command creates its reflog entry from argv.
However, it does so after having run parse_options, which
means the only thing left in argv is any non-option
arguments. Thus you would end up with confusing reflog
entries like:
$ git reset --hard HEAD^
$ git reset --soft HEAD@{1}
$ git log -2 -g --oneline
8e46cad HEAD@{0}: HEAD@{1}: updating HEAD
1eb9486 HEAD@{1}: HEAD^: updating HEAD
However, we must also consider that some scripts may set
GIT_REFLOG_ACTION before calling reset, and we need to show
their reflog action (with our text appended). For example:
rebase -i (squash): updating HEAD
On top of that, we also set the ORIG_HEAD reflog action
(even though it doesn't generally exist). In that case, the
reset argument is somewhat meaningless, as it has nothing to
do with what's in ORIG_HEAD.
This patch changes the reset reflog code to show:
$GIT_REFLOG_ACTION: updating {HEAD,ORIG_HEAD}
as before, but only if GIT_REFLOG_ACTION is set. Otherwise,
show:
reset: moving to $rev
for HEAD, and:
reset: updating ORIG_HEAD
for ORIG_HEAD (this is still somewhat superfluous, since we
are in the ORIG_HEAD reflog, obviously, but at least we now
mention which command was used to update it).
While we're at it, we can clean up the code a bit:
- Use strbufs to make the message.
- Use the "rev" parameter instead of showing all options.
This makes more sense, since it is the only thing
impacting the writing of the ref.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Because "diff --cached HEAD" showed an incorrect blob object name on the
LHS of the diff, we ended up updating the index entry with bogus value,
not what we read from the tree.
Noticed by John Nowak.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* commit 'v1.7.6': (3211 commits)
Git 1.7.6
completion: replace core.abbrevguard to core.abbrev
Git 1.7.6-rc3
Documentation: git diff --check respects core.whitespace
gitweb: 'pickaxe' and 'grep' features requires 'search' to be enabled
t7810: avoid unportable use of "echo"
plug a few coverity-spotted leaks
builtin/gc.c: add missing newline in message
tests: link shell libraries into valgrind directory
t/Makefile: pass test opts to valgrind target properly
sh-i18n--envsubst.c: do not #include getopt.h
Fix typo: existant->existent
Git 1.7.6-rc2
gitweb: do not misparse nonnumeric content tag files that contain a digit
Git 1.7.6-rc1
fetch: do not leak a refspec
t3703: skip more tests using colons in file names on Windows
gitweb: Fix usability of $prevent_xss
gitweb: Move "Requirements" up in gitweb/INSTALL
gitweb: Describe CSSMIN and JSMIN in gitweb/INSTALL
...
Since 03feddd (git-check-ref-format: reject funny ref names, 2005-10-13),
"git branch -d" can take more than one branch names to remove.
The documentation was correct, but the usage string was not.
Signed-off-by: Junio C Hamano <gitster@pobox.com>