mirror of https://github.com/git/git.git synced 2024-11-17 06:25:13 +01:00
Commit graph

31403 commits

Author SHA1 Message Date
Junio C Hamano
918d4e1c90 revisions: initialize revs->grep_filter using grep_init()
Instead of using the hand-rolled initialization sequence,
use grep_init() to populate the necessary bits.  This opens
the door to allow the calling commands to optionally read
grep.* configuration variables via git_config() if they
want to.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-09 23:21:29 -07:00
Junio C Hamano
c5c31d3381 grep: move pattern-type bits support to top-level grep.[ch]
Switching between -E/-G/-P/-F correctly needs a lot more than just
flipping opt->regflags bit these days, and we have a nice helper
function buried in builtin/grep.c for the sole use of "git grep".

Extract it so that "log --grep" family can also use it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-09 23:21:29 -07:00
Junio C Hamano
7687a0541e grep: move the configuration parsing logic to grep.[ch]
The configuration handling is a library-ish part of this program,
that is not specific to "git grep" command.  It should be reusable
by "log" and others.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-09 16:17:50 -07:00
Junio C Hamano
15fabd1bbd builtin/grep.c: make configuration callback more reusable
The grep_config() function takes one instance of grep_opt as its
callback parameter, and populates it by running git_config().

This has three practical implications:

 - You have to have an instance of grep_opt already when you call
   the configuration, but that is not necessarily always true.  You
   may be trying to initialize the grep_filter member of rev_info,
   but are not ready to call init_revisions() on it yet.

 - It is not easy to enhance grep_config() in such a way to make it
   cascade to other callback functions to grab other variables in
   one call of git_config(); grep_config() can be cascaded into from
   other callbacks, but it has to be at the leaf level of a cascade.

 - If you ever need to use more than one instance of grep_opt, you
   will have to open and read the configuration file(s) every time
   you initialize them.

Rearrange the configuration mechanism and model it after how diff
configuration variables are handled.  An early call to git_config()
reads and remembers the values taken from the configuration in the
default "template", and a separate call to grep_init() uses this
template to instantiate a grep_opt.

The next step will be to move some of this out of this file so that
the other user of the grep machinery (i.e. "log") can use it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-09 16:04:12 -07:00
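The "template" arrangement described in 15fabd1bbd can be sketched as follows. This is a minimal illustration in Python rather than git's C; the `GrepOpt` class, its two fields, and the lower-cased key handling are assumptions made for the example, not git's actual data structures.

```python
from dataclasses import dataclass, replace


@dataclass
class GrepOpt:
    # Hypothetical stand-in for struct grep_opt; only two fields are shown.
    extended_regexp: bool = False
    line_number: bool = False


# The default "template" that remembers values taken from the configuration.
_template = GrepOpt()


def grep_config(key: str, value: str) -> None:
    """Configuration callback: record grep.* values in the template only."""
    global _template
    truthy = value.lower() in ("true", "yes", "on", "1")
    if key == "grep.extendedregexp":
        _template = replace(_template, extended_regexp=truthy)
    elif key == "grep.linenumber":
        _template = replace(_template, line_number=truthy)
    # Unknown keys are ignored, so other callbacks can handle them.


def grep_init() -> GrepOpt:
    """Instantiate a fresh GrepOpt from the template without re-reading config."""
    return replace(_template)
```

Because `grep_config()` needs no `GrepOpt` instance, it can run before any instance exists, and any number of instances can be created later from a single read of the configuration.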
Junio C Hamano
d64383ab14 Merge branch 'maint'
* maint:
  l10n: de.po: fix a few minor typos
2012-10-09 14:23:45 -07:00
Øyvind A. Holm
9979a507c5 configure.ac: Add missing comma to CC_LD_DYNPATH
40bfbde ("build: don't duplicate substitution of make variables",
2012-09-11) by mistake removed a necessary comma at the end of
"CC_LD_DYNPATH=-Wl,rpath," in line 414.

When executing "./configure --with-zlib=PATH", this resulted in

      [...]
      CC xdiff/xhistogram.o
      AR xdiff/lib.a
      LINK git-credential-store
  /usr/bin/ld: bad -rpath option
  collect2: ld returned 1 exit status
  make: *** [git-credential-store] Error 1
  $

during make.

Signed-off-by: Øyvind A. Holm <sunny@sunbase.org>
Acked-by: Stefano Lattarini <stefano.lattarini@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-09 14:19:37 -07:00
Junio C Hamano
7bfffdc8a0 Merge branch 'maint' of git://github.com/git-l10n/git-po into maint
* 'maint' of git://github.com/git-l10n/git-po:
  l10n: de.po: fix a few minor typos
2012-10-09 11:48:53 -07:00
Ben Walton
d4a7ffaae3 tests: "cp -a" is a GNUism
These tests just want a bit-for-bit identical copy; they need
neither -H (no symbolic links are involved) nor -p (no unusual
permission or ownership issues are involved).

Just use "cp -R" instead.

Signed-off-by: Ben Walton <bdwalton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-08 14:37:43 -07:00
Ramkumar Ramachandra
6347e71619 Git url doc: mark ftp/ftps as read-only and deprecate them
It is not even worth mentioning their removal; just discourage
people from using them.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-08 14:18:19 -07:00
Junio C Hamano
4c6c949c7d Git 1.8.0-rc1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-08 11:45:41 -07:00
Junio C Hamano
d519e4594c Merge branch 'jc/maint-t1450-fsck-order-fix'
The fsck test assumed too much about what kind of error it would
detect. The only important thing is that the inconsistency is
detected as an error.

* jc/maint-t1450-fsck-order-fix:
  t1450: the order the objects are checked is undefined
2012-10-08 11:43:10 -07:00
Junio C Hamano
683a820d51 Merge branch 'jc/merge-bases-paint-fix'
"git fmt-merge-msg" (or rather reduce_heads(), an internal helper it
uses) had a severe performance regression; as a result, an empty
"git pull" took forever to finish.

* jc/merge-bases-paint-fix:
  paint_down_to_common(): parse commit before relying on its timestamp
2012-10-08 11:42:15 -07:00
Junio C Hamano
5a333adeb5 Sync with 1.7.12.3
2012-10-08 11:41:21 -07:00
Junio C Hamano
234cd45662 Git 1.7.12.3
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-08 11:40:43 -07:00
Junio C Hamano
ff5702c52d Merge branch 'os/commit-submodule-ignore' into maint
"git status" honored the ignore=dirty settings in .gitmodules but
"git commit" didn't.

* os/commit-submodule-ignore:
  commit: pay attention to submodule.$name.ignore in .gitmodules
2012-10-08 11:34:34 -07:00
Junio C Hamano
25c08907a0 Merge branch 'jk/receive-pack-unpack-error-to-pusher' into maint
"git receive-pack" (the counterpart to "git push") did not give
progress output while processing objects it received to the pusher
when run over the smart-http protocol.

* jk/receive-pack-unpack-error-to-pusher:
  receive-pack: drop "n/a" on unpacker errors
  receive-pack: send pack-processing stderr over sideband
  receive-pack: redirect unpack-objects stdout to /dev/null
2012-10-08 11:34:19 -07:00
Junio C Hamano
9b4030cd98 Merge branch 'rt/maint-clone-single' into maint
A repository created with "git clone --single" had its fetch
refspecs set up just like a clone without "--single", leading the
subsequent "git fetch" to slurp all the other branches, defeating
the whole point of specifying "only this branch".

* rt/maint-clone-single:
  clone --single: limit the fetch refspec to fetched branch
2012-10-08 11:34:02 -07:00
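The difference between the two refspec setups in 9b4030cd98 can be illustrated with a small sketch. The helper name `fetch_refspec` and the exact refspec strings are assumptions for illustration; they follow the usual `refs/remotes/<remote>/` layout rather than being taken from the commit.

```python
from typing import Optional


def fetch_refspec(remote: str, single_branch: Optional[str] = None) -> str:
    """Build a fetch refspec for a clone (hypothetical helper)."""
    if single_branch is None:
        # Ordinary clone: later fetches pick up every branch.
        return f"+refs/heads/*:refs/remotes/{remote}/*"
    # Single-branch clone: later fetches are limited to the one branch.
    return (f"+refs/heads/{single_branch}:"
            f"refs/remotes/{remote}/{single_branch}")
```

With the wildcard form, a later "git fetch" would bring in all other branches, which is the behaviour the fix avoids.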
Junio C Hamano
63c0c2c8a0 Merge branch 'jc/blame-follows-renames' into maint
It was unclear in the documentation for "git blame" that it is
unnecessary for users to use the "--follow" option.

* jc/blame-follows-renames:
  git blame: document that it always follows origin across whole-file renames
2012-10-08 11:33:35 -07:00
Junio C Hamano
6e2035715e Merge branch 'lt/mailinfo-handle-attachment-more-sanely' into maint
A patch attached as application/octet-stream (i.e. not text/*) was
mishandled; its Content-Transfer-Encoding (e.g. base64) was not
correctly honored.

* lt/mailinfo-handle-attachment-more-sanely:
  mailinfo: don't require "text" mime type for attachments
2012-10-08 11:33:00 -07:00
Nguyễn Thái Ngọc Duy
866f5f82b9 gitignore.txt: suggestions how to get literal # or ! at the beginning
We support backslash escape, but we hide the details behind the phrase
"a shell glob suitable for consumption by fnmatch(3)". So it may not
be obvious how one can get literal # or ! at the beginning of pattern.
Add a few lines on how to work around the magic characters.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 16:15:19 -07:00
Florian Achleitner
e99d012a6b Add a test script for remote-svn
Use svnrdump_sim.py to emulate svnrdump without an svn server.
Tests fetching, incremental fetching, fetching from file://,
and the regeneration of fast-import's marks file.

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:17 -07:00
Florian Achleitner
5bfc76b5b2 remote-svn: add marks-file regeneration
fast-import mark files are stored outside the object database and are
therefore not fetched, and can also be lost in other ways.  Marks provide a
svn revision --> git sha1 mapping, while the notes that are attached
to each commit when it is imported provide a git sha1 --> svn revision
mapping.

If the marks file is not available or not plausible, regenerate it by
walking through the notes tree.  The plausibility check tests
if the highest revision in the marks file matches the revision of the
top ref. It doesn't ensure that the mark file is completely correct.
This could only be done with an effort equal to unconditional
regeneration.

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:17 -07:00
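The plausibility check and regeneration described in 5bfc76b5b2 might look roughly like this. It is a sketch under stated assumptions: marks are modelled as a dict from svn revision to git sha1, notes as a dict from git sha1 to note text, and both function names are hypothetical.

```python
from typing import Dict


def marks_plausible(marks: Dict[int, str], top_rev: int) -> bool:
    """Cheap check: does the highest mark match the revision of the top ref?

    This does not prove that every entry in the marks file is correct.
    """
    return bool(marks) and max(marks) == top_rev


def regenerate_marks(notes: Dict[str, str]) -> Dict[int, str]:
    """Rebuild the revision -> sha1 mapping by walking all notes."""
    marks: Dict[int, str] = {}
    for sha1, text in notes.items():
        for line in text.splitlines():
            if line.startswith("Revision-number: "):
                marks[int(line.split(": ", 1)[1])] = sha1
    return marks
```

A caller would use the existing marks when `marks_plausible()` holds and fall back to `regenerate_marks()` otherwise.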
Florian Achleitner
16a7185447 Add a svnrdump-simulator replaying a dump file for testing
To ease testing without depending on a reachable svn server, this
compact python script mimics parts of svnrdump's behaviour.  It
requires the remote url to start with sim://.

Start and end revisions are evaluated.  If the requested revision
doesn't exist, as is the case with an incremental import when no new
commit has been added, it returns 1 (like svnrdump).

To allow using the same dump file for simulating multiple incremental
imports, the highest revision can be limited by setting the environment
variable SVNRMAX to that value. This simulates the situation where
higher revs don't exist yet.

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:17 -07:00
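The revision handling described in 16a7185447 can be sketched as below. This is not the actual simulator script; `revision_range` and its parameters are hypothetical, and only the SVNRMAX variable name comes from the commit message.

```python
import os


def revision_range(start, end, highest_in_dump, env=None):
    """Return the (start, end) revisions to replay, or None if start is missing.

    A None result corresponds to the case where the simulator would
    exit with status 1, like svnrdump.
    """
    env = os.environ if env is None else env
    highest = highest_in_dump
    if "SVNRMAX" in env:
        # Pretend that revisions above SVNRMAX do not exist yet.
        highest = min(highest, int(env["SVNRMAX"]))
    if start > highest:
        return None
    return (start, min(end, highest))
```

Raising SVNRMAX between runs lets the same dump file stand in for a repository that gains revisions over time.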
Florian Achleitner
8e43a1d010 remote-svn: add incremental import
Search for a note attached to the ref to update and read its
'Revision-number:'-line. Start import from the next svn revision.

If there is no next revision in the svn repo, svnrdump terminates with
a message on stderr and a non-zero return value. This looks a little
weird, but there is no other way to know whether there is a new
revision in the svn repo.

On the start of an incremental import, the parent of the first commit
in the fast-import stream is set to the branch name to update. All
following commits specify their parent by a mark number. Previous mark
files are currently not reused.

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:17 -07:00
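How the start revision of an incremental import follows from the note (8e43a1d010) can be shown with a short sketch. The function name is hypothetical, and returning 0 when no note is found is an assumption based on the helper importing from revision 0 on a first fetch.

```python
def next_revision(note_text: str) -> int:
    """Return the svn revision at which the next import should start."""
    for line in note_text.splitlines():
        if line.startswith("Revision-number:"):
            # The note records the last imported revision; continue after it.
            return int(line.split(":", 1)[1]) + 1
    # No revision recorded: assume nothing has been imported yet.
    return 0
```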
Florian Achleitner
8d7cd8eb3b remote-svn: Activate import/export-marks for fast-import
Enable import and export of a marks file by sending the appropriate
feature commands to fast-import before sending data.

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:17 -07:00
Florian Achleitner
a9a55613cb Create a note for every imported commit containing svn metadata
To provide metadata from svn dumps for further processing, e.g.
branch detection, attach a note to each imported commit that stores
additional information.  The notes are currently hard-coded in
refs/notes/svn/revs.  Currently the following lines from the svn dump
are directly accumulated in the note. This can be refined as needed.

 - "Revision-number"
 - "Node-path"
 - "Node-kind"
 - "Node-action"
 - "Node-copyfrom-path"
 - "Node-copyfrom-rev"

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:17 -07:00
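The accumulation of dump header lines into a note (a9a55613cb) amounts to a simple filter. The sketch below assumes the dump headers arrive as "Key: value" strings; `build_note` is a hypothetical name, while the key list is the one given in the commit message.

```python
NOTE_KEYS = (
    "Revision-number",
    "Node-path",
    "Node-kind",
    "Node-action",
    "Node-copyfrom-path",
    "Node-copyfrom-rev",
)


def build_note(dump_lines):
    """Keep only the header lines that are recorded in the note."""
    kept = [line for line in dump_lines if line.split(":", 1)[0] in NOTE_KEYS]
    return "".join(line + "\n" for line in kept)
```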
Dmitry Ivankov
3c23953fb2 vcs-svn: add fast_export_note to create notes
fast_export lacked a method to write notes to the fast-import stream.
Add two new functions: fast_export_note, which is similar to
fast_export_modify, and fast_export_buf_to_data, which can write
inline blobs that don't come from a line_buffer or from delta
application.

To be used like this:

  fast_export_begin_commit("refs/notes/somenotes", ...)
  fast_export_note("refs/heads/master", "inline")
  fast_export_buf_to_data(&data)

or maybe

  fast_export_note("refs/heads/master", sha1)

Signed-off-by: Dmitry Ivankov <divanorama@gmail.com>
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:17 -07:00
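What the two call patterns in 3c23953fb2 put on the wire can be sketched with fast-import's note command, `N`, which takes either `inline` followed by a `data` block or a reference to existing data. The Python helpers below are hypothetical illustrations of that stream syntax, not ports of the C functions.

```python
def note_inline(commit_ish: str, payload: bytes) -> bytes:
    """Emit a note whose content follows inline as a 'data' block."""
    return (b"N inline " + commit_ish.encode() + b"\n"
            + b"data %d\n" % len(payload)
            + payload + b"\n")


def note_by_ref(commit_ish: str, dataref: str) -> bytes:
    """Emit a note that points at an existing blob (sha1 or mark)."""
    return f"N {dataref} {commit_ish}\n".encode()
```

Both commands are only valid inside a commit on the notes ref, which is what the fast_export_begin_commit("refs/notes/somenotes", ...) call in the message sets up.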
Florian Achleitner
f6529de9f4 Allow reading svn dumps from files via file:// urls
For testing as well as for importing large, already available dumps,
it's useful to bypass svnrdump and replay the svndump from a file
directly.

Add support for file:// urls in the remote url, e.g.

  svn::file:///path/to/dump

When the remote helper finds an url starting with file:// it tries to
open that file instead of invoking svnrdump.

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:17 -07:00
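The file:// shortcut in f6529de9f4 is essentially a prefix test on the remote url. A minimal sketch, assuming the helper sees the url after git has stripped the "svn::" part; `dump_source` and its return values are hypothetical.

```python
def dump_source(url: str):
    """Decide where the svn dump comes from for a given remote url."""
    prefix = "file://"
    if url.startswith(prefix):
        # Replay an existing dump file directly; no svnrdump is started.
        return ("file", url[len(prefix):])
    # Anything else is handed to svnrdump.
    return ("svnrdump", url)
```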
Florian Achleitner
271fd1fc2a remote-svn, vcs-svn: Enable fetching to private refs
The reference to update by the fast-import stream is hard-coded.  When
fetching from a remote the remote-helper shall update refs in a
private namespace, i.e. a private subdir of refs/.  This namespace is
defined by the 'refspec' capability, that the remote-helper advertises
as a reply to the 'capabilities' command.

Extend svndump and fast-export to allow passing the target ref.
Update svn-fe to be compatible.

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:17 -07:00
Florian Achleitner
19ba02af47 When debug==1, start fast-import with "--stats" instead of "--quiet"
fast-import prints statistics that could be interesting to the
developer of remote helpers.

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:17 -07:00
Florian Achleitner
271bfd678b Add documentation for the 'bidi-import' capability of remote-helpers
Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:17 -07:00
Florian Achleitner
bfc366d931 Connect fast-import to the remote-helper via pipe, adding 'bidi-import' capability
The fast-import commands 'cat-blob' and 'ls' can be used by
remote-helpers to retrieve information about blobs and trees that
already exist in fast-import's memory. This requires a channel from
fast-import to the remote-helper.

remote-helpers that use these features shall advertise the new
'bidi-import' capability to signal that they require the communication
channel.  When forking fast-import in transport-helper.c connect it to
a dup of the remote-helper's stdin-pipe. The additional file
descriptor is passed to fast-import via its command line
(--cat-blob-fd).  It follows that git and fast-import are connected to
the remote-helper's stdin.

Because git can send multiple commands to the remote-helper on its
stdin, it is required that helpers that advertise 'bidi-import' buffer
all input commands until the batch of 'import' commands is ended by a
newline before sending data to fast-import.  This is to prevent mixing
commands and fast-import responses on the helper's stdin.

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:16 -07:00
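The buffering requirement in bfc366d931 can be illustrated as follows. This is a sketch of the helper side only, written in Python for brevity; `read_import_batch` is a hypothetical name and the stream is any text file object standing in for the helper's stdin.

```python
def read_import_batch(stream):
    """Collect the refs of a batch of 'import' commands.

    Reading stops at the blank line that ends the batch, so nothing is
    sent to fast-import (and no response can arrive on the same stdin)
    before all commands have been consumed.
    """
    refs = []
    while True:
        line = stream.readline()
        if not line or line == "\n":
            break  # end of input or end of the batch
        line = line.rstrip("\n")
        if line.startswith("import "):
            refs.append(line[len("import "):])
    return refs
```

Only after this function returns would the helper start writing the fast-import stream for the collected refs.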
Florian Achleitner
df7428eca4 Add argv_array_detach and argv_array_free_detached
Allow detaching of ownership of the argv_array's contents and add a
function to free those detached argv_arrays later.

This makes it possible to use argv_array efficiently with the existing
struct child_process which only contains a member char **argv.

Add to documentation.

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:16 -07:00
Florian Achleitner
fd871b94f6 Add svndump_init_fd to allow reading dumps from arbitrary FDs
The existing function only allows reading from a filename or from
stdin. Allow passing of a FD and an additional FD for the back report
pipe. This allows us to retrieve the name of the pipe in the caller.

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:16 -07:00
Florian Achleitner
48ea9f955f Add git-remote-testsvn to Makefile
The link-rule is a copy of the standard git$X rule but adds VCSSVN_LIB.
Add executable to .gitignore.

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:16 -07:00
Florian Achleitner
68f64ff8b4 Implement a remote helper for svn in C
Enable basic fetching from subversion repositories. When processing
remote URLs starting with testsvn::, git invokes this remote-helper.
It starts svnrdump to extract revisions from the subversion repository
in the 'dump file format', and converts them to a git-fast-import stream
using the functions of vcs-svn/.

Imported refs are created in a private namespace at
refs/svn/<remote-name>/master.  The revision history is imported
linearly (no branch detection) and completely, i.e. from revision 0 to
HEAD.

The 'bidi-import' capability is used. The remote-helper expects data
from fast-import on its stdin. It buffers a batch of 'import' command
lines in a string_list before starting to process them.

Signed-off-by: Florian Achleitner <florian.achleitner.2.6.31@gmail.com>
Acked-by: David Michael Barr <b@rr-dav.id.au>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-07 14:10:16 -07:00
Jonathan Nieder
dc01f880a5 git-svn: keep leading slash when canonicalizing paths (fallback case)
Subversion's svn_dirent_canonicalize() and svn_path_canonicalize()
APIs keep a leading slash in the return value if one was present on
the argument, which can be useful since it allows relative and
absolute paths to be distinguished.

When git-svn's canonicalize_path() learned to use these functions if
available, its semantics changed in the corresponding way.  Some new
callers rely on the leading slash --- for example, if the slash is
stripped out then _canonicalize_url_ourselves() will transform
"proto://host/path/to/resource" to "proto://hostpath/to/resource".

Unfortunately the fallback _canonicalize_path_ourselves(), used when
the appropriate SVN APIs are not usable, still follows the old
semantics, so if that code path is exercised then it breaks.  Fix it
to follow the new convention.

Noticed by forcing the fallback on and running tests.  Without this
patch, t9101.4 fails:

 Bad URL passed to RA layer: Unable to open an ra_local session to \
 URL: Local URL 'file://homejrnsrcgit-scratch/t/trash%20directory.\
 t9101-git-svn-props/svnrepo' contains unsupported hostname at \
 /home/jrn/src/git-scratch/perl/blib/lib/Git/SVN.pm line 148

With it, the git-svn tests pass again.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-10-05 22:52:52 +00:00
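The effect of keeping or dropping the leading slash (dc01f880a5) is easy to reproduce in a sketch. The functions below are simplified Python stand-ins for the Perl fallbacks, not translations of them; their names and the exact normalisation rules are assumptions.

```python
def canonicalize_path(path: str) -> str:
    """Collapse '//' and '.' components and drop a trailing slash,
    but keep a leading slash if one was present."""
    leading = "/" if path.startswith("/") else ""
    parts = [p for p in path.split("/") if p not in ("", ".")]
    return leading + "/".join(parts)


def canonicalize_url(url: str) -> str:
    """Canonicalize only the path part of a 'scheme://host/path' url."""
    scheme, rest = url.split("://", 1)
    host, slash, path = rest.partition("/")
    return f"{scheme}://{host}{canonicalize_path(slash + path)}"
```

If `canonicalize_path()` stripped the leading slash, as under the old semantics, the host and path would be glued together, giving "proto://hostpath/to/resource".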
Jonathan Nieder
52de6fa2c7 Git::SVN: rename private path field
All users of $gs->{path} should have been converted to use the
accessor by now.  Check our work by renaming the underlying variable
to break callers that try to use it directly.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-10-05 22:48:12 +00:00
Eric Wong
f3045919d1 git-svn: use path accessor for Git::SVN objects
The accessors should improve maintainability and enforce
consistent access to Git::SVN objects.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
2012-10-05 22:48:12 +00:00
Ammon Riley
9478b11968 Make git-svn branch patterns match complete URL
When using the {word,[...]} style of configuration for tags and branches,
it appears the intent is to only match whole path parts, since the words
in the {} pattern are meta-character quoted.

When the pattern word appears in the beginning or middle of the url,
it's matched completely, since the left side, pattern, and (non-empty)
right side are joined together with path separators.

However, when the pattern word appears at the end of the URL, the
right side is an empty pattern, and the resulting regex matches
more than just the specified pattern.

For example, if you specify something along the lines of

    branches = branches/project/{release_1,release_2}

and your repository also contains "branches/project/release_1_2", you
will also get the release_1_2 branch.  By restricting the match regex
with anchors, this is avoided.

Signed-off-by: Ammon Riley <ammon.riley@gmail.com>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-10-05 22:48:12 +00:00
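The over-matching described in 9478b11968 can be reproduced with a small regex sketch. It is written in Python rather than git-svn's Perl, and `branch_regex` is a hypothetical helper that only models a pattern whose {} part is at the end of the path.

```python
import re


def branch_regex(left: str, words, anchored: bool):
    """Build a regex for 'left/{word,word,...}' with quoted words."""
    alternatives = "|".join(re.escape(w) for w in words)
    pattern = re.escape(left) + "/(" + alternatives + ")"
    if anchored:
        # Restrict the match to the complete path.
        pattern = "^" + pattern + "$"
    return re.compile(pattern)


loose = branch_regex("branches/project", ["release_1", "release_2"], False)
strict = branch_regex("branches/project", ["release_1", "release_2"], True)
```

Here `loose` also matches "branches/project/release_1_2" because the empty right side leaves the end open, while `strict` accepts only the two listed branches.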
Robert Luberda
a967cb15d3 t9164: Add missing quotes in test
This fixes `ambiguous redirect' error given by bash.

[ew: fix misspelled test name,
     also eliminate space after ">>" to conform to guidelines]

Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-10-05 22:48:12 +00:00
Steven Walter
14d3ce1120 git-svn.perl: keep processing all commits in parents_exclude
This fixes a bug where git finds the incorrect merge parent.  Consider a
repository with trunk, branch1 of trunk, and branch2 of branch1.
Without this change, git interprets a merge of branch2 into trunk as a
merge of branch1 into trunk.

Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
Reviewed-by: Sam Vilain <sam@vilain.net>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-10-05 22:48:12 +00:00
Steven Walter
f271fad266 git-svn.perl: consider all ranges for a given merge, instead of only tip-by-tip
Consider the case where you have trunk, branch1 of trunk, and branch2 of
branch1.  trunk is merged back into branch2, and then branch2 is
reintegrated into trunk.  The merge of branch2 into trunk will have
svn:mergeinfo property references to both branch1 and branch2.  When
git-svn fetches the commit that merges branch2 (check_cherry_pick),
it is necessary to eliminate the merged contents of branch1 as well as
branch2, or else the merge will be incorrectly ignored as a cherry-pick.

Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
Reviewed-by: Sam Vilain <sam@vilain.net>
Signed-off-by: Eric Wong <normalperson@yhbt.net>
2012-10-05 22:48:12 +00:00
Junio C Hamano
68bdfd7cdc Merge commit 'f9f6e2c' into nd/attr-match-optim-more
* commit 'f9f6e2c':
  exclude: do strcmp as much as possible before fnmatch
  dir.c: get rid of the wildcard symbol set in no_wildcard()
  Unindent excluded_from_list()
2012-10-05 12:45:30 -07:00
Nguyễn Thái Ngọc Duy
4742d136e2 attr: avoid searching for basename on every match
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-05 12:27:48 -07:00
Nguyễn Thái Ngọc Duy
cd6a0b265e attr: avoid strlen() on every match
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-05 12:27:35 -07:00
Jeff King
435c833237 upload-pack: use peel_ref for ref advertisements
When upload-pack advertises refs, we attempt to peel tags
and advertise the peeled version. We currently hand-roll the
tag dereferencing, and use as many optimizations as we can
to avoid loading non-tag objects into memory.

Not only has peel_ref recently learned these optimizations,
too, but it also contains an even more important one: it
has access to the "peeled" data from the pack-refs file.
That means we can avoid not only loading annotated tags
entirely, but also avoid doing any kind of object lookup at
all.

This cut the CPU time to advertise refs by 50% in the
linux-2.6 repo, as measured by:

  echo 0000 | git-upload-pack . >/dev/null

best-of-five, warm cache, objects and refs fully packed:

  [before]             [after]
  real    0m0.026s     real    0m0.013s
  user    0m0.024s     user    0m0.008s
  sys     0m0.000s     sys     0m0.000s

Those numbers are irrelevantly small compared to an actual
fetch. Here's a larger repo (400K refs, of which 12K are
unique, and of which only 107 are unique annotated tags):

  [before]             [after]
  real    0m0.704s     real    0m0.596s
  user    0m0.600s     user    0m0.496s
  sys     0m0.096s     sys     0m0.092s

This shows only a 15% speedup (mostly because it has fewer
actual tags to parse), but a larger absolute value (100ms,
which isn't a lot compared to a real fetch, but this
advertisement happens on every fetch, even if the client is
just finding out they are completely up to date).

In truly pathological cases, where you have a large number
of unique annotated tags, it can make an even bigger
difference. Here are the numbers for a linux-2.6 repository
that has had every seventh commit tagged (so about 50K
tags):

  [before]             [after]
  real    0m0.443s     real    0m0.097s
  user    0m0.416s     user    0m0.080s
  sys     0m0.024s     sys     0m0.012s

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-04 20:34:29 -07:00
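The percentages quoted in 435c833237 can be checked against the "real" timings in the three tables. The snippet is only arithmetic on the numbers given above; the dictionary labels are informal names chosen for this example.

```python
def speedup_percent(before: float, after: float) -> float:
    """Relative reduction in wall-clock time, in percent."""
    return (before - after) / before * 100


# (before, after) "real" times in seconds, copied from the commit message.
timings = {
    "linux-2.6": (0.026, 0.013),
    "400K refs": (0.704, 0.596),
    "every seventh commit tagged": (0.443, 0.097),
}

results = {name: round(speedup_percent(b, a)) for name, (b, a) in timings.items()}
saved_ms = round((0.704 - 0.596) * 1000)
```

This gives roughly 50%, 15% and 78% respectively, and about 108 ms saved in absolute terms for the 400K-ref repository, in line with the "100ms" mentioned in the message.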
Jeff King
6c4a060d7d peel_ref: check object type before loading
The point of peel_ref is to dereference tags; if the base
object is not a tag, then we can return early without even
loading the object into memory.

This patch accomplishes that by checking sha1_object_info
for the type. For a packed object, we can get away with just
looking in the pack index. For a loose object, we only need
to inflate the first couple of header bytes.

This is a bit of a gamble; if we do find a tag object, then
we will end up loading the content anyway, and the extra
lookup will have been wasteful. However, if it is not a tag
object, then we save loading the object entirely. Depending
on the ratio of non-tags to tags in the input, this can be a
minor win or minor loss.

However, it does give us one potential major win: if a ref
points to a large blob (e.g., via an unannotated tag), then
we can avoid looking at it entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-04 20:34:28 -07:00
Jeff King
e6dbffa67b peel_ref: do not return a null sha1
The idea of the peel_ref function is to dereference tag
objects recursively until we hit a non-tag, and return the
sha1. Conceptually, it should return 0 if it is successful
(and fill in the sha1), or -1 if there was nothing to peel.

However, the current behavior is much more confusing. For a
regular loose ref, the behavior is as described above. But
there is an optimization to reuse the peeled-ref value for a
ref that came from a packed-refs file. If we have such a
ref, we return its peeled value, even if that peeled value
is null (indicating that we know the ref definitely does
_not_ peel).

It might seem like such information is useful to the caller,
who would then know not to bother loading and trying to peel
the object. Except that they should not bother loading and
trying to peel the object _anyway_, because that fallback is
already handled by peel_ref. In other words, the whole point
of calling this function is that it handles those details
internally, and you either get a sha1, or you know that it
is not peel-able.

This patch catches the null sha1 case internally and
converts it into a -1 return value (i.e., there is nothing
to peel). This simplifies callers, which do not need to
bother checking themselves.

Two callers are worth noting:

  - in pack-objects, a comment indicates that there is a
    difference between non-peelable tags and unannotated
    tags. But that is not the case (before or after this
    patch). Whether you get a null sha1 has to do with
    internal details of how peel_ref operated.

  - in show-ref, if peel_ref returns a failure, the caller
    tries to decide whether to try peeling manually based on
    whether the REF_ISPACKED flag is set. But this doesn't
    make any sense. If the flag is set, that does not
    necessarily mean the ref came from a packed-refs file
    with the "peeled" extension. But it doesn't matter,
    because even if it didn't, there's no point in trying to
    peel it ourselves, as peel_ref would already have done
    so. In other words, the fallback peeling is guaranteed
    to fail.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-04 20:34:28 -07:00
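The return contract introduced in e6dbffa67b can be sketched like this. The Python model is an assumption made for illustration: packed peeled values are a dict, `peel_object` is a hypothetical callback for the slow path, and `None` plays the role of the C function's -1 return.

```python
NULL_SHA1 = "0" * 40


def peel_ref(ref, packed_peeled, peel_object):
    """Return the peeled sha1 of ref, or None if there is nothing to peel."""
    if ref in packed_peeled:
        peeled = packed_peeled[ref]
        # A null peeled value means "known not to peel": report that as
        # "nothing to peel" instead of handing the null sha1 to the caller.
        return None if peeled == NULL_SHA1 else peeled
    # No cached value: peel the object the slow way.
    return peel_object(ref)
```

Callers then only have to distinguish "got a sha1" from "not peelable", without checking for a null sha1 themselves.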
Jeff King
44da6f69ec peel_ref: use faster deref_tag_noverify
When we are asked to peel a ref to a sha1, we internally call
deref_tag, which will recursively parse each tagged object
until we reach a non-tag. This has the benefit that we will
verify our ability to load and parse the pointed-to object.

However, there is a performance downside: we may not need to
load that object at all (e.g., if we are simply listing
peeled refs), or it may be a large object
that should follow a streaming code path (e.g., an annotated
tag of a large blob).

It makes more sense for peel_ref to choose the fast thing
rather than performing the extra check, for three reasons:

  1. We will already sometimes short-circuit the tag parsing
     in favor of a peeled entry from a packed-refs file. So
     we are already favoring speed in some cases, and it is
     not wise for a caller to rely on peel_ref to detect
     corruption.

  2. We already silently ignore much larger corruptions,
     like a ref that points to a non-existent object, or a
     tag object that exists but is corrupted.

  3. peel_ref is not the right place to check for such a
     database corruption. It is returning only the sha1
     anyway, not the actual object. Any callers which use
     that sha1 to load an object will soon discover the
     corruption anyway, so we are really just pushing back
     the discovery to later in the program.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-04 20:34:28 -07:00