mirror of https://github.com/git/git.git
synced 2024-11-08 02:03:12 +01:00
6859de45a9
Recent versions of git can be slow to fetch repositories with a
large number of refs (or when they already have a large number of
refs). For example, GitHub makes pull-requests available as refs,
which can lead to a large number of available refs. This slowness
goes away when submodule recursion is turned off:

  $ git ls-remote git://github.com/rails/rails.git | wc -l
  3034

  [this takes ~10 seconds of CPU time to complete]
  git fetch --recurse-submodules=no \
    git://github.com/rails/rails.git "refs/*:refs/*"

  [this still isn't done after 10 _minutes_ of pegging the CPU]
  git fetch \
    git://github.com/rails/rails.git "refs/*:refs/*"

You can produce a quicker and simpler test case like this:

  doit() {
    head=`git rev-parse HEAD`
    for i in `seq 1 $1`; do
      echo $head refs/heads/ref$i
    done >.git/packed-refs
    echo "==> $1"
    rm -rf dest
    git init -q --bare dest &&
    (cd dest && time git.compile fetch -q .. refs/*:refs/*)
  }

  rm -rf repo
  git init -q repo && cd repo &&
  >file && git add file && git commit -q -m one

  doit 100
  doit 200
  doit 400
  doit 800
  doit 1600
  doit 3200

Which yields timings like:

  # refs  seconds of CPU
     100    0.06
     200    0.24
     400    0.95
     800    3.39
    1600   13.66
    3200   54.09

Notice that although the number of refs doubles in each trial, the
CPU time spent quadruples.

The problem is that the submodule recursion code works something
like:

  - for each ref we fetch
    - for each commit in git rev-list $new_sha1 --not --all
      - add modified submodules to list
  - fetch any newly referenced submodules

But that means if we fetch N refs, we start N revision walks. Worse,
because we use "--all", the number of refs we must process that
constitute "--all" keeps growing, too. And you end up doing O(N^2)
ref resolutions.

Instead, this patch structures the code like this:

  - for each sha1 we already have
    - add $old_sha1 to list $old
  - for each ref we fetch
    - add $new_sha1 to list $new
  - for each commit in git rev-list $new --not $old
    - add modified submodules to list
  - fetch any newly referenced submodules

This yields timings like:

  # refs  seconds of CPU
     100    0.00
     200    0.04
     400    0.04
     800    0.10
    1600    0.21
    3200    0.39

Note that the amount of effort doubles as the number of refs
doubles. Similarly, the fetch of rails.git takes about as much time
as it does with --recurse-submodules=no.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
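
The difference between the two loop structures in the commit message can be illustrated with a toy cost model. This is a sketch only, not git's code: the function names `resolutions_old` and `resolutions_new` are hypothetical, and it assumes that each revision walk resolves every ref it is told to exclude, and that "--all" grows by one ref after each fetched ref is stored.

```shell
#!/bin/sh
# Toy cost model (an illustration, not git's actual code): count how many
# refs get resolved while checking fetched refs for submodule changes.
#   $1 = number of refs fetched, $2 = number of refs already present

# Old structure: one revision walk per fetched ref, each excluding "--all".
# Assumption: "--all" holds the existing refs plus those stored so far.
resolutions_old() (
    fetched=$1 existing=$2 total=0 i=0
    while [ "$i" -lt "$fetched" ]; do
        total=$((total + existing + i))
        i=$((i + 1))
    done
    echo "$total"
)

# New structure: collect the old tips once and the new tips once, then
# run a single walk over "$new --not $old".
resolutions_new() (
    fetched=$1 existing=$2
    echo $((existing + fetched))
)

# Same ref counts as the doit trials; the destination starts out empty.
for n in 100 200 400 800 1600 3200; do
    printf '%5d  old=%8d  new=%5d\n' "$n" \
        "$(resolutions_old "$n" 0)" "$(resolutions_new "$n" 0)"
done
```

Under these assumptions the old count roughly quadruples each time the number of refs doubles (4950 for 100 refs, 19900 for 200), while the new count only doubles, which matches the shape of the two timing tables. The model says nothing about absolute CPU time.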
||
---|---|---|
block-sha1 | ||
builtin | ||
compat | ||
contrib | ||
Documentation | ||
git-gui | ||
git_remote_helpers | ||
gitk-git | ||
gitweb | ||
perl | ||
po | ||
ppc | ||
t | ||
templates | ||
vcs-svn | ||
xdiff | ||
.gitattributes | ||
.gitignore | ||
.mailmap | ||
abspath.c | ||
aclocal.m4 | ||
advice.c | ||
advice.h | ||
alias.c | ||
alloc.c | ||
archive-tar.c | ||
archive-zip.c | ||
archive.c | ||
archive.h | ||
attr.c | ||
attr.h | ||
base85.c | ||
bisect.c | ||
bisect.h | ||
blob.c | ||
blob.h | ||
branch.c | ||
branch.h | ||
builtin.h | ||
bundle.c | ||
bundle.h | ||
cache-tree.c | ||
cache-tree.h | ||
cache.h | ||
check-builtins.sh | ||
check-racy.c | ||
check_bindir | ||
color.c | ||
color.h | ||
combine-diff.c | ||
command-list.txt | ||
commit.c | ||
commit.h | ||
config.c | ||
config.mak.in | ||
configure.ac | ||
connect.c | ||
convert.c | ||
copy.c | ||
COPYING | ||
csum-file.c | ||
csum-file.h | ||
ctype.c | ||
daemon.c | ||
date.c | ||
decorate.c | ||
decorate.h | ||
delta.h | ||
diff-delta.c | ||
diff-lib.c | ||
diff-no-index.c | ||
diff.c | ||
diff.h | ||
diffcore-break.c | ||
diffcore-delta.c | ||
diffcore-order.c | ||
diffcore-pickaxe.c | ||
diffcore-rename.c | ||
diffcore.h | ||
dir.c | ||
dir.h | ||
editor.c | ||
entry.c | ||
environment.c | ||
exec_cmd.c | ||
exec_cmd.h | ||
fast-import.c | ||
fetch-pack.h | ||
fixup-builtins | ||
fsck.c | ||
fsck.h | ||
generate-cmdlist.sh | ||
gettext.c | ||
gettext.h | ||
git-add--interactive.perl | ||
git-am.sh | ||
git-archimport.perl | ||
git-bisect.sh | ||
git-compat-util.h | ||
git-cvsexportcommit.perl | ||
git-cvsimport.perl | ||
git-cvsserver.perl | ||
git-difftool--helper.sh | ||
git-difftool.perl | ||
git-filter-branch.sh | ||
git-instaweb.sh | ||
git-lost-found.sh | ||
git-merge-octopus.sh | ||
git-merge-one-file.sh | ||
git-merge-resolve.sh | ||
git-mergetool--lib.sh | ||
git-mergetool.sh | ||
git-parse-remote.sh | ||
git-pull.sh | ||
git-quiltimport.sh | ||
git-rebase--am.sh | ||
git-rebase--interactive.sh | ||
git-rebase--merge.sh | ||
git-rebase.sh | ||
git-relink.perl | ||
git-remote-testgit.py | ||
git-repack.sh | ||
git-request-pull.sh | ||
git-send-email.perl | ||
git-sh-i18n.sh | ||
git-sh-setup.sh | ||
git-stash.sh | ||
git-submodule.sh | ||
git-svn.perl | ||
GIT-VERSION-GEN | ||
git-web--browse.sh | ||
git.c | ||
git.spec.in | ||
graph.c | ||
graph.h | ||
grep.c | ||
grep.h | ||
hash.c | ||
hash.h | ||
help.c | ||
help.h | ||
hex.c | ||
http-backend.c | ||
http-fetch.c | ||
http-push.c | ||
http-walker.c | ||
http.c | ||
http.h | ||
ident.c | ||
imap-send.c | ||
INSTALL | ||
levenshtein.c | ||
levenshtein.h | ||
LGPL-2.1 | ||
list-objects.c | ||
list-objects.h | ||
ll-merge.c | ||
ll-merge.h | ||
lockfile.c | ||
log-tree.c | ||
log-tree.h | ||
mailmap.c | ||
mailmap.h | ||
Makefile | ||
match-trees.c | ||
merge-file.c | ||
merge-file.h | ||
merge-recursive.c | ||
merge-recursive.h | ||
name-hash.c | ||
notes-cache.c | ||
notes-cache.h | ||
notes-merge.c | ||
notes-merge.h | ||
notes.c | ||
notes.h | ||
object.c | ||
object.h | ||
pack-check.c | ||
pack-refs.c | ||
pack-refs.h | ||
pack-revindex.c | ||
pack-revindex.h | ||
pack-write.c | ||
pack.h | ||
pager.c | ||
parse-options.c | ||
parse-options.h | ||
patch-delta.c | ||
patch-ids.c | ||
patch-ids.h | ||
path.c | ||
pkt-line.c | ||
pkt-line.h | ||
preload-index.c | ||
pretty.c | ||
progress.c | ||
progress.h | ||
quote.c | ||
quote.h | ||
reachable.c | ||
reachable.h | ||
read-cache.c | ||
README | ||
reflog-walk.c | ||
reflog-walk.h | ||
refs.c | ||
refs.h | ||
RelNotes | ||
remote-curl.c | ||
remote.c | ||
remote.h | ||
replace_object.c | ||
rerere.c | ||
rerere.h | ||
resolve-undo.c | ||
resolve-undo.h | ||
revision.c | ||
revision.h | ||
run-command.c | ||
run-command.h | ||
send-pack.h | ||
server-info.c | ||
setup.c | ||
sh-i18n--envsubst.c | ||
sha1-array.c | ||
sha1-array.h | ||
sha1-lookup.c | ||
sha1-lookup.h | ||
sha1_file.c | ||
sha1_name.c | ||
shallow.c | ||
shell.c | ||
shortlog.h | ||
show-index.c | ||
sideband.c | ||
sideband.h | ||
sigchain.c | ||
sigchain.h | ||
strbuf.c | ||
strbuf.h | ||
string-list.c | ||
string-list.h | ||
submodule.c | ||
submodule.h | ||
symlinks.c | ||
tag.c | ||
tag.h | ||
tar.h | ||
test-chmtime.c | ||
test-ctype.c | ||
test-date.c | ||
test-delta.c | ||
test-dump-cache-tree.c | ||
test-genrandom.c | ||
test-index-version.c | ||
test-line-buffer.c | ||
test-match-trees.c | ||
test-mktemp.c | ||
test-obj-pool.c | ||
test-parse-options.c | ||
test-path-utils.c | ||
test-run-command.c | ||
test-sha1.c | ||
test-sha1.sh | ||
test-sigchain.c | ||
test-string-pool.c | ||
test-subprocess.c | ||
test-svn-fe.c | ||
test-treap.c | ||
thread-utils.c | ||
thread-utils.h | ||
trace.c | ||
transport-helper.c | ||
transport.c | ||
transport.h | ||
tree-diff.c | ||
tree-walk.c | ||
tree-walk.h | ||
tree.c | ||
tree.h | ||
unimplemented.sh | ||
unpack-trees.c | ||
unpack-trees.h | ||
upload-pack.c | ||
url.c | ||
url.h | ||
usage.c | ||
userdiff.c | ||
userdiff.h | ||
utf8.c | ||
utf8.h | ||
walker.c | ||
walker.h | ||
wrap-for-bin.sh | ||
wrapper.c | ||
write_or_die.c | ||
ws.c | ||
wt-status.c | ||
wt-status.h | ||
xdiff-interface.c | ||
xdiff-interface.h | ||
zlib.c |
////////////////////////////////////////////////////////////////

	GIT - the stupid content tracker

////////////////////////////////////////////////////////////////

"git" can mean anything, depending on your mood.

 - random three-letter combination that is pronounceable, and not
   actually used by any common UNIX command. The fact that it is a
   mispronunciation of "get" may or may not be relevant.
 - stupid. contemptible and despicable. simple. Take your pick from the
   dictionary of slang.
 - "global information tracker": you're in a good mood, and it actually
   works for you. Angels sing, and a light suddenly fills the room.
 - "goddamn idiotic truckload of sh*t": when it breaks

Git is a fast, scalable, distributed revision control system with an
unusually rich command set that provides both high-level operations
and full access to internals.

Git is an Open Source project covered by the GNU General Public License.
It was originally written by Linus Torvalds with help of a group of
hackers around the net. It is currently maintained by Junio C Hamano.

Please read the file INSTALL for installation instructions.

See Documentation/gittutorial.txt to get started, then see
Documentation/everyday.txt for a useful minimum set of commands, and
Documentation/git-commandname.txt for documentation of each command.
If git has been correctly installed, then the tutorial can also be
read with "man gittutorial" or "git help tutorial", and the
documentation of each command with "man git-commandname" or "git help
commandname".

CVS users may also want to read Documentation/gitcvs-migration.txt
("man gitcvs-migration" or "git help cvs-migration" if git is
installed).

Many Git online resources are accessible from http://git-scm.com/
including full documentation and Git related tools.

The user discussion and development of Git take place on the Git
mailing list -- everyone is welcome to post bug reports, feature
requests, comments and patches to git@vger.kernel.org. To subscribe
to the list, send an email with just "subscribe git" in the body to
majordomo@vger.kernel.org. The mailing list archives are available at
http://marc.theaimsgroup.com/?l=git and other archival sites.

The messages titled "A note from the maintainer", "What's in
git.git (stable)" and "What's cooking in git.git (topics)" and
the discussion following them on the mailing list give a good
reference for project status, development direction and
remaining tasks.