We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.
* bp/fsmonitor:
fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
fsmonitor: read entirety of watchman output
fsmonitor: MINGW support for watchman integration
fsmonitor: add a performance test
fsmonitor: add a sample integration script for Watchman
fsmonitor: add test cases for fsmonitor extension
split-index: disable the fsmonitor extension when running the split index test
fsmonitor: add a test tool to dump the index extension
update-index: add fsmonitor support to update-index
ls-files: Add support in ls-files to display the fsmonitor valid bit
fsmonitor: add documentation for the fsmonitor extension.
fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
update-index: add a new --force-write-index option
preload-index: add override to enable testing preload-index
bswap: add 64 bit endianness helper get_be64
Add a new perf test for testing the performance of log while computing
OID abbreviations. Using --oneline --raw and --parents options maximizes
the number of OIDs to abbreviate while still spending some time computing
diffs.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test utility (test-drop-caches) that flushes all changes to disk
then drops file system cache on Windows, Linux, and OSX.
Add a perf test (p7519-fsmonitor.sh) for fsmonitor.
By default, the performance test will utilize the Watchman file system
monitor if it is installed. If Watchman is not installed, it will use a
dummy integration script that does not report any new or modified files.
The dummy script has very little overhead which provides optimistic results.
The performance test will also use the untracked cache feature if it is
available as fsmonitor uses it to speed up scanning for untracked files.
There are 4 environment variables that can be used to alter the default
behavior of the performance test:
GIT_PERF_7519_UNTRACKED_CACHE: used to configure core.untrackedCache
GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
GIT_PERF_7519_FSMONITOR: used to configure core.fsmonitor
GIT_PERF_7519_DROP_CACHE: if set, the OS caches are dropped between tests
The big win for using fsmonitor is the elimination of the need to scan the
working directory looking for changed and untracked files. If the file
information is all cached in RAM, the benefits are reduced.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A performance test for writing the index to be able to
determine if changes to allocating ondisk structure help.
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Optimize "what are the object names already taken in an alternate
object database?" query that is used to derive the length of prefix
an object name is uniquely abbreviated to.
* rs/sha1-name-readdir-optim:
sha1_file: guard against invalid loose subdirectory numbers
sha1_file: let for_each_file_in_obj_subdir() handle subdir names
p4205: add perf test script for pretty log formats
sha1_name: cache readdir(3) results in find_short_object_filename()
Add simple performance tests for expanded log format placeholders.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
perf-test update.
* jh/memihash-opt:
p0004: don't error out if test repo is too small
p0004: don't abort if multi-threaded is too slow
p0004: use test_perf
p0004: avoid using pipes
p0004: simplify calls of test-lazy-init-name-hash
When the tested repo has an index.lock file it should be removed. This
file may be present if e.g. git-status previously crashed in that
repo, and it will make a lot of git commands fail. Let's try harder
and remove the lock.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The internal implementation of "git grep" has seen some clean-up.
* ab/grep-preparatory-cleanup: (31 commits)
grep: assert that threading is enabled when calling grep_{lock,unlock}
grep: given --threads with NO_PTHREADS=YesPlease, warn
pack-objects: fix buggy warning about threads
pack-objects & index-pack: add test for --threads warning
test-lib: add a PTHREADS prerequisite
grep: move is_fixed() earlier to avoid forward declaration
grep: change internal *pcre* variable & function names to be *pcre1*
grep: change the internal PCRE macro names to be PCRE1
grep: factor test for \0 in grep patterns into a function
grep: remove redundant regflags assignments
grep: catch a missing enum in switch statement
perf: add a comparison test of log --grep regex engines with -F
perf: add a comparison test of log --grep regex engines
perf: add a comparison test of grep regex engines with -F
perf: add a comparison test of grep regex engines
perf: emit progress output when unpacking & building
perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do
grep: add tests to fix blind spots with \0 patterns
grep: prepare for testing binary regexes containing rx metacharacters
grep: add a test helper function for less verbose -f \0 tests
...
perf-test update.
* jh/memihash-opt:
p0004: don't error out if test repo is too small
p0004: don't abort if multi-threaded is too slow
p0004: use test_perf
p0004: avoid using pipes
p0004: simplify calls of test-lazy-init-name-hash
Add perf-test for wildmatch.
* ab/perf-wildmatch:
perf: add test showing exponential growth in path globbing
perf: add function to setup a fresh test repo
Add a performance comparison test of grep regex engines given fixed
strings.
The current logic in compile_regexp() ignores the engine parameter and
uses kwset() to search for these, so this test shows no difference
between engines right now:
$ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p7821-grep-engines-fixed.sh
[...]
Test this tree
------------------------------------------------
7821.1: fixed grep int 0.56(1.67+0.68)
7821.2: basic grep int 0.57(1.70+0.57)
7821.3: extended grep int 0.59(1.76+0.51)
7821.4: perl grep int 1.08(1.71+0.55)
7821.6: fixed grep uncommon 0.23(0.55+0.50)
7821.7: basic grep uncommon 0.24(0.55+0.50)
7821.8: extended grep uncommon 0.26(0.55+0.52)
7821.9: perl grep uncommon 0.24(0.58+0.47)
7821.11: fixed grep æ 0.36(1.30+0.42)
7821.12: basic grep æ 0.36(1.32+0.40)
7821.13: extended grep æ 0.38(1.30+0.42)
7821.14: perl grep æ 0.35(1.24+0.48)
Only when run with -i via GIT_PERF_7821_GREP_OPTS=' -i' do we avoid
avoid going through the same kwset.[ch] codepath, see the "Even when
-F..." comment in grep.c. This only kicks for the non-ASCII case:
$ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_7821_GREP_OPTS=' -i' ./run p7821-grep-engines-fixed.sh
[...]
Test this tree
---------------------------------------------------
7821.1: fixed grep -i int 0.62(2.10+0.57)
7821.2: basic grep -i int 0.68(1.90+0.61)
7821.3: extended grep -i int 0.78(1.94+0.57)
7821.4: perl grep -i int 0.98(1.78+0.74)
7821.6: fixed grep -i uncommon 0.24(0.44+0.64)
7821.7: basic grep -i uncommon 0.25(0.56+0.54)
7821.8: extended grep -i uncommon 0.27(0.62+0.45)
7821.9: perl grep -i uncommon 0.24(0.59+0.49)
7821.11: fixed grep -i æ 0.30(0.96+0.39)
7821.12: basic grep -i æ 0.27(0.92+0.44)
7821.13: extended grep -i æ 0.28(0.90+0.46)
7821.14: perl grep -i æ 0.28(0.74+0.49)
I'm planning to change how fixed-string searching happens. This test
gives a baseline for comparing performance before & after any such
change.
See commit ("perf: add a comparison test of grep regex engines",
2017-04-19) for details on the machine the above test run was executed
on.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a very basic performance comparison test comparing the POSIX
basic, extended and perl engines.
In theory the "basic" and "extended" engines should be implemented
using the same underlying code with a slightly different pattern
parser, but some implementations may not do this. Jump through some
slight hoops to test both, which is worthwhile since "basic" is the
default.
Running this on an i7 3.4GHz Linux 4.9.0-2 Debian testing against a
checkout of linux.git & latest upstream PCRE, both PCRE and git
compiled with -O3 using gcc 7.1.1:
$ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p7820-grep-engines.sh
[...]
Test this tree
---------------------------------------------------------------
7820.1: basic grep 'how.to' 0.34(1.24+0.53)
7820.2: extended grep 'how.to' 0.33(1.23+0.45)
7820.3: perl grep 'how.to' 0.31(1.05+0.56)
7820.5: basic grep '^how to' 0.32(1.24+0.42)
7820.6: extended grep '^how to' 0.33(1.20+0.44)
7820.7: perl grep '^how to' 0.57(2.67+0.42)
7820.9: basic grep '[how] to' 0.51(2.16+0.45)
7820.10: extended grep '[how] to' 0.49(2.20+0.43)
7820.11: perl grep '[how] to' 0.56(2.60+0.43)
7820.13: basic grep '\(e.t[^ ]*\|v.ry\) rare' 0.66(3.25+0.40)
7820.14: extended grep '(e.t[^ ]*|v.ry) rare' 0.65(3.19+0.46)
7820.15: perl grep '(e.t[^ ]*|v.ry) rare' 1.05(5.74+0.34)
7820.17: basic grep 'm\(ú\|u\)lt.b\(æ\|y\)te' 0.34(1.28+0.47)
7820.18: extended grep 'm(ú|u)lt.b(æ|y)te' 0.34(1.38+0.38)
7820.19: perl grep 'm(ú|u)lt.b(æ|y)te' 0.39(1.56+0.44)
Options can also be passed to git-grep via the GIT_PERF_7820_GREP_OPTS
environment variable. There are various modes such as "-v" that have
very different performance profiles, but handling the combinatorial
explosion of testing all those options would make this script much
more complex and harder to maintain. Instead just add the ability to
do one-shot runs with arbitrary options, e.g.:
$ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_7820_GREP_OPTS=" -i" ./run p7820-grep-engines.sh
[...]
Test this tree
------------------------------------------------------------------
7820.1: basic grep -i 'how.to' 0.49(1.72+0.38)
7820.2: extended grep -i 'how.to' 0.46(1.64+0.42)
7820.3: perl grep -i 'how.to' 0.44(1.45+0.45)
7820.5: basic grep -i '^how to' 0.47(1.76+0.38)
7820.6: extended grep -i '^how to' 0.47(1.70+0.42)
7820.7: perl grep -i '^how to' 0.65(2.72+0.37)
7820.9: basic grep -i '[how] to' 0.86(3.64+0.42)
7820.10: extended grep -i '[how] to' 0.84(3.62+0.46)
7820.11: perl grep -i '[how] to' 0.73(3.06+0.39)
7820.13: basic grep -i '\(e.t[^ ]*\|v.ry\) rare' 1.63(8.13+0.36)
7820.14: extended grep -i '(e.t[^ ]*|v.ry) rare' 1.64(8.01+0.44)
7820.15: perl grep -i '(e.t[^ ]*|v.ry) rare' 1.44(6.88+0.44)
7820.17: basic grep -i 'm\(ú\|u\)lt.b\(æ\|y\)te' 0.66(2.67+0.44)
7820.18: extended grep -i 'm(ú|u)lt.b(æ|y)te' 0.66(2.67+0.43)
7820.19: perl grep -i 'm(ú|u)lt.b(æ|y)te' 0.59(2.31+0.37)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend the t/perf/run output so that in addition to the "Running N
tests" heading currently being emitted, it also emits "Unpacking $rev"
and "Building $rev" when setting up the build/$rev directory & when
building it, respectively.
This makes it easier to see what's going on and what revision is being
tested as the output scrolls by.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a git GIT_PERF_MAKE_COMMAND variable to compliment the existing
GIT_PERF_MAKE_OPTS facility. This allows specifying an arbitrary shell
command to execute instead of 'make'.
This is useful e.g. in cases where the name, semantics or defaults of
a Makefile flag have changed over time. It can even be used to change
the contents of the tree, useful for monkeypatching ancient versions
of git to get them to build.
This opens Pandora's box in some ways, it's now possible to
"jailbreak" the perf environment and e.g. modify the source tree via
this arbitrary instead of just issuing a custom "make" command, such a
command has to be re-entrant in the sense that subsequent perf runs
will re-use the possibly modified tree.
It would be pointless to try to mitigate or work around that caveat in
a tool purely aimed at Git developers, so this change makes no attempt
to do so.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Repositories with less than 4000 entries are always handled using a
single thread, causing test-lazy-init-name-hash --multi to error out.
Don't abort the whole test script in that case, but simply skip the
multi-threaded performance check. We can still use it to compare the
single-threaded speed of different versions in that case.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the single-threaded variant beats the multi-threaded one then we may
have a performance bug, but that doesn't justify aborting the test.
Drop that check; we can compare the results for --single and --multi
using the actual performance tests.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The perf test suite (more specifically: t/perf/aggregate.perl) requires
each test script to write test results into a file, otherwise it aborts
when aggregating. Add actual performance tests with test_perf to allow
p0004 to be run together with other perf scripts.
Calibrate the value for the parameter --count based on the size of the
test repository, in order to get meaningful results with smaller repos
yet still be able to finish the script against huge ones without having
to wait for hours.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return code of commands on the producing end of a pipe is ignored.
Evaluate the outcome of test-lazy-init-name-hash by calling sort
separately.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test library puts helpers into $PATH, so we can simply call them
without specifying their location.
The suffix $X is also not necessary because .exe files on Windows can be
started without specifying their extension, and on other platforms it's
empty anyway.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test showing that runtimes of the wildmatch() function used for
globbing in git grow exponentially in the face of some pathological
globs.
This issue affects both globs matching filenames via e.g. ls-files,
and globs matching refnames via e.g. for-each-ref.
As noted in the test description this is a test to see whether Git
suffers from the issue noted in an article Russ Cox posted today about
common bugs in various glob implementations:
https://research.swtch.com/glob
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a function to setup a fresh test repo via 'git init' to compliment
the existing functions to copy over a normal & large repo.
Some performance tests don't need any existing repository data at all
to be significant, e.g. tests which stress glob matches against single
pathological revisions or files, which I'm about to add in a
subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rebasing onto many changes is interesting, but it's also
interesting to see what happens when rebasing many changes.
And while at it, let's also look at the impact of using a
split index.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git checkout" that handles a lot of paths has been optimized by
reducing the number of unnecessary checks of paths in the
has_dir_name() function.
* jh/add-index-entry-optim:
read-cache: speed up has_dir_name (part 2)
read-cache: speed up has_dir_name (part 1)
read-cache: speed up add_index_entry during checkout
p0006-read-tree-checkout: perf test to time read-tree
read-cache: add strcmp_offset function
The string-list API used a custom reallocation strategy that was
very inefficient, instead of using the usual ALLOC_GROW() macro,
which has been fixed.
* jh/string-list-micro-optim:
string-list: use ALLOC_GROW macro when reallocing string_list
Change the test descriptions from being treated as binary blobs by
perl to being treated as UTF-8. This ensures that e.g. a test
description like "æ" is counted as 1 character, not 2.
I have WIP performance tests for non-ASCII grep patterns on another
topic that are affected by this.
Now instead of:
$ ./run p0000-perf-lib-sanity.sh
[...]
0000.4: export a weird var 0.00(0.00+0.00)
0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś 0.00(0.00+0.00)
0000.7: important variables available in subshells 0.00(0.00+0.00)
[...]
We emit:
[...]
0000.4: export a weird var 0.00(0.00+0.00)
0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś 0.00(0.00+0.00)
0000.7: important variables available in subshells 0.00(0.00+0.00)
[...]
Fixes code originally added in 342e9ef2d9 ("Introduce a performance
testing framework", 2012-02-17).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hotfix for a topic that is already in 'master'.
* jh/memihash-opt:
p0004: make perf test executable
t3008: skip lazy-init test on a single-core box
test-online-cpus: helper to return cpu count
name-hash: fix buffer overrun
Created t/perf/repos/many-files.sh to generate large, but
artificial repositories.
Created t/perf/inflate-repo.sh to alter an EXISTING repo
to have a set of large commits. This can be used to create
a branch with 1M+ files in repositories like git.git or
linux.git, but with more realistic content. It does this
by making multiple copies of the entire worktree in a series
of sub-directories.
The branch name and ballast structure created by both scripts
match, so either script can be used to generate very large
test repositories for the following perf test.
Created t/perf/p0006-read-tree-checkout.sh to measure
performance on various read-tree, checkout, and update-index
operations. This test can run using either normal repos or
ones from the above scripts.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It looks like in 89c3b0ad43 (name-hash: add perf test for lazy_init_name_hash,
2017-03-23) p0004 was not created with the execute unix rights.
Let's fix that.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use ALLOC_GROW() macro when reallocing a string_list array
rather than simply increasing it by 32. This is a performance
optimization.
During status on a very large repo and there are many changes,
a significant percentage of the total run time is spent
reallocing the wt_status.changes array.
This change decreases the time in wt_status_collect_changes_worktree()
from 125 seconds to 45 seconds on my very large repository.
This produced a modest gain on my 1M file artificial repo, but
broke even on linux.git.
Test HEAD^^ HEAD
---------------------------------------------------------------------------------------
0005.2: read-tree status br_ballast (1000001) 8.29(5.62+2.62) 8.22(5.57+2.63) -0.8%
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name-hash used for detecting paths that are different only in
cases (which matter on case insensitive filesystems) has been
optimized to take advantage of multi-threading when it makes sense.
* jh/memihash-opt:
name-hash: add test-lazy-init-name-hash to .gitignore
name-hash: add perf test for lazy_init_name_hash
name-hash: add test-lazy-init-name-hash
name-hash: perf improvement for lazy_init_name_hash
hashmap: document memihash_cont, hashmap_disallow_rehash api
hashmap: add disallow_rehash setting
hashmap: allow memihash computation to be continued
name-hash: specify initial size for istate.dir_hash table
Created t/perf/p0004-lazy-init-name-hash.sh test
to demonstrate correctness and performance gains
with the multithreaded version of lazy_init_name_hash().
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git filter-branch --prune-empty" drops a single-parent commit that
becomes a no-op, but did not drop a root commit whose tree is empty.
* dp/filter-branch-prune-empty:
p7000: add test for filter-branch with --prune-empty
filter-branch: fix --prune-empty on parentless commits
t7003: ensure --prune-empty removes entire branch when applicable
t7003: ensure --prune-empty can prune root commit
It's tempting to say:
./run v1.0.0 HEAD
to see how we've sped up Git over the years. Unfortunately,
this doesn't quite work because versions of Git prior to
v1.7.0 lack bin-wrappers, so our "run" script doesn't
correctly put them in the PATH.
Worse, it means we silently find whatever other "git" is in
the PATH, and produce test results that have no bearing on
what we asked for.
Let's fallback to the main git directory when bin-wrappers
isn't present. Many modern perf scripts won't run with such
an antique version of Git, of course, but at least those
failures are detected and reported (and you're free to write
a limited perf script that works across many versions).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 1a0962dee (t/perf: fix regression in testing older
versions of git, 2016-06-22), we point "$MODERN_GIT" to a
copy of git that matches the t/perf script itself, and which
can be used for tasks outside of the actual timings. This is
needed because the setup done by perf scripts keeps moving
forward in time, and may use features that the older
versions of git we are testing do not have.
That commit used $MODERN_GIT to fix a case where we relied
on the relatively recent --git-path option. But if you go
back further still, there are more problems.
Since 7501b5921 (perf: make the tests work in worktrees,
2016-05-13), we use "git -C", but versions of git older than
44e1e4d67 (git: run in a directory given with -C option,
2013-09-09) don't know about "-C". So testing an old version
of git with a new version of t/perf will fail the setup
step.
We can fix this by using $MODERN_GIT during the setup;
there's no need to use the antique version, since it doesn't
affect the timings. Likewise, we'll adjust the "init"
invocation; antique versions of git called this "init-db".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In p0001, a variable was created in a test_expect_success block to be
used in later test_perf blocks, but was not exported. This caused the
variable to not appear in those blocks (this can be verified by writing
'test -n "$commit"' in those blocks), resulting in a slightly different
invocation than what was intended. Export that variable.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust a perf test to new world order where commands that do
require a repository are really strict about having a repository.
* rs/p5302-create-repositories-before-tests:
p5302: create repositories for index-pack results explicitly
Before 7176a314 (index-pack: complain when --stdin is used outside of a
repo) index-pack silently created a non-existing target directory; now
the command refuses to work unless it's used against a valid repository.
That causes p5302 to fail, which relies on the former behavior. Fix it
by setting up the destinations for its performance tests using git init.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a sort command to test-string-list that reads lines from stdin,
stores them in a string_list and then sorts it. Use it in a simple
perf test script to measure the performance of string_list_sort().
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching from a remote that has many tags that are irrelevant
to branches we are following, we used to waste way too many cycles
when checking if the object pointed at by a tag (that we are not
going to fetch!) exists in our repository too carefully.
* jk/fetch-quick-tag-following:
fetch: use "quick" has_sha1_file for tag following
When we auto-follow tags in a fetch, we look at all of the
tags advertised by the remote and fetch ones where we don't
already have the tag, but we do have the object it peels to.
This involves a lot of calls to has_sha1_file(), some of
which we can reasonably expect to fail. Since 45e8a74
(has_sha1_file: re-check pack directory before giving up,
2013-08-30), this may cause many calls to
reprepare_packed_git(), which is potentially expensive.
This has gone unnoticed for several years because it
requires a fairly unique setup to matter:
1. You need to have a lot of packs on the client side to
make reprepare_packed_git() expensive (the most
expensive part is finding duplicates in an unsorted
list, which is currently quadratic).
2. You need a large number of tag refs on the server side
that are candidates for auto-following (i.e., that the
client doesn't have). Each one triggers a re-read of
the pack directory.
3. Under normal circumstances, the client would
auto-follow those tags and after one large fetch, (2)
would no longer be true. But if those tags point to
history which is disconnected from what the client
otherwise fetches, then it will never auto-follow, and
those candidates will impact it on every fetch.
So when all three are true, each fetch pays an extra
O(nr_tags * nr_packs^2) cost, mostly in string comparisons
on the pack names. This was exacerbated by 47bf4b0
(prepare_packed_git_one: refactor duplicate-pack check,
2014-06-30) which uses a slightly more expensive string
check, under the assumption that the duplicate check doesn't
happen very often (and it shouldn't; the real problem here
is how often we are calling reprepare_packed_git()).
This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice
accuracy for speed, in cases where we might be racy with a
simultaneous repack. This is similar to the fix in 0eeb077
(index-pack: avoid excessive re-reading of pack directory,
2015-06-09). As with that case, it's OK for has_sha1_file()
occasionally say "no I don't have it" when we do, because
the worst case is not a corruption, but simply that we may
fail to auto-follow a tag that points to it.
Here are results from the included perf script, which sets
up a situation similar to the one described above:
Test HEAD^ HEAD
----------------------------------------------------------
5550.4: fetch 11.21(10.42+0.78) 0.08(0.04+0.02) -99.3%
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Performance tests done via "t/perf" did not use the same set of
build configuration if the user relied on autoconf generated
configuration.
* ks/perf-build-with-autoconf:
t/perf/run: copy config.mak.autogen & friends to build area
Some codepaths in "git pack-objects" were not ready to use an
existing pack bitmap; now they are and as the result they have
become faster.
* ks/pack-objects-bitmap:
pack-objects: use reachability bitmap index when generating non-stdout pack
pack-objects: respect --local/--honor-pack-keep/--incremental when bitmap is in use
Otherwise for people who use autotools-based configure in main worktree,
the performance testing results will be inconsistent as work and build
trees could be using e.g. different optimization levels.
See e.g.
http://public-inbox.org/git/20160818175222.bmm3ivjheokf2qzl@sigill.intra.peff.net/
for example.
NOTE config.status has to be copied because otherwise without it the build
would want to run reconfigure this way loosing just copied config.mak.autogen.
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Starting from 6b8fda2d (pack-objects: use bitmaps when packing objects)
if a repository has bitmap index, pack-objects can nicely speedup
"Counting objects" graph traversal phase. That however was done only for
case when resultant pack is sent to stdout, not written into a file.
The reason here is for on-disk repack by default we want:
- to produce good pack (with bitmap index not-yet-packed objects are
emitted to pack in suboptimal order).
- to use more robust pack-generation codepath (avoiding possible
bugs in bitmap code and possible bitmap index corruption).
Jeff King further explains:
The reason for this split is that pack-objects tries to determine how
"careful" it should be based on whether we are packing to disk or to
stdout. Packing to disk implies "git repack", and that we will likely
delete the old packs after finishing. We want to be more careful (so
as not to carry forward a corruption, and to generate a more optimal
pack), and we presumably run less frequently and can afford extra CPU.
Whereas packing to stdout implies serving a remote via "git fetch" or
"git push". This happens more frequently (e.g., a server handling many
fetching clients), and we assume the receiving end takes more
responsibility for verifying the data.
But this isn't always the case. One might want to generate on-disk
packfiles for a specialized object transfer. Just using "--stdout" and
writing to a file is not optimal, as it will not generate the matching
pack index.
So it would be useful to have some way of overriding this heuristic:
to tell pack-objects that even though it should generate on-disk
files, it is still OK to use the reachability bitmaps to do the
traversal.
So we can teach pack-objects to use bitmap index for initial object
counting phase when generating resultant pack file too:
- if we take care to not let it be activated under git-repack:
See above about repack robustness and not forward-carrying corruption.
- if we know bitmap index generation is not enabled for resultant pack:
The current code has singleton bitmap_git, so it cannot work
simultaneously with two bitmap indices.
We also want to avoid (at least with current implementation)
generating bitmaps off of bitmaps. The reason here is: when generating
a pack, not-yet-packed objects will be emitted into pack in
suboptimal order and added to tail of the bitmap as "extended entries".
When the resultant pack + some new objects in associated repository
are in turn used to generate another pack with bitmap, the situation
repeats: new objects are again not emitted optimally and just added to
bitmap tail - not in recency order.
So the pack badness can grow over time when at each step we have
bitmapped pack + some other objects. That's why we want to avoid
generating bitmaps off of bitmaps, not to let pack badness grow.
- if we keep pack reuse enabled still only for "send-to-stdout" case:
Because pack-to-file needs to generate index for destination pack, and
currently on pack reuse raw entries are directly written out to the
destination pack by write_reused_pack(), bypassing needed for pack index
generation bookkeeping done by regular codepath in write_one() and
friends.
( In the future we might teach pack-reuse code about cases when index
also needs to be generated for resultant pack and remove
pack-reuse-only-for-stdout limitation )
This way for pack-objects -> file we get nice speedup:
erp5.git[1] (~230MB) extracted from ~ 5GB lab.nexedi.com backup
repository managed by git-backup[2] via
time echo 0186ac99 | git pack-objects --revs erp5pack
before: 37.2s
after: 26.2s
And for `git repack -adb` packed git.git
time echo 5c589a73 | git pack-objects --revs gitpack
before: 7.1s
after: 3.6s
i.e. it can be 30% - 50% speedup for pack extraction.
git-backup extracts many packs on repositories restoration. That was my
initial motivation for the patch.
[1] https://lab.nexedi.com/nexedi/erp5
[2] https://lab.nexedi.com/kirr/git-backup
NOTE
Jeff also suggests that pack.useBitmaps was probably a mistake to
introduce originally. This way we are not adding another config point,
but instead just always default to-file pack-objects not to use bitmap
index: Tools which need to generate on-disk packs with using bitmap, can
pass --use-bitmap-index explicitly. And git-repack does never pass
--use-bitmap-index, so this way we can be sure regular on-disk repacking
remains robust.
NOTE2
`git pack-objects --stdout >file.pack` + `git index-pack file.pack` is much slower
than `git pack-objects file.pack`. Extracting erp5.git pack from
lab.nexedi.com backup repository:
$ time echo 0186ac99 | git pack-objects --stdout --revs >erp5pack-stdout.pack
real 0m22.309s
user 0m21.148s
sys 0m0.932s
$ time git index-pack erp5pack-stdout.pack
real 0m50.873s <-- more than 2 times slower than time to generate pack itself!
user 0m49.300s
sys 0m1.360s
So the time for
`pack-object --stdout >file.pack` + `index-pack file.pack` is 72s,
while
`pack-objects file.pack` which does both pack and index is 27s.
And even
`pack-objects --no-use-bitmap-index file.pack` is 37s.
Jeff explains:
The packfile does not carry the sha1 of the objects. A receiving
index-pack has to compute them itself, including inflating and applying
all of the deltas.
that's why for `git-backup restore` we want to teach `git pack-objects
file.pack` to use bitmaps instead of using `git pack-objects --stdout
>file.pack` + `git index-pack file.pack`.
NOTE3
The speedup is now tracked via t/perf/p5310-pack-bitmaps.sh
Test 56dfeb62 this tree
--------------------------------------------------------------------------------
5310.2: repack to disk 8.98(8.05+0.29) 9.05(8.08+0.33) +0.8%
5310.3: simulated clone 2.02(2.27+0.09) 2.01(2.25+0.08) -0.5%
5310.4: simulated fetch 0.81(1.07+0.02) 0.81(1.05+0.04) +0.0%
5310.5: pack to file 7.58(7.04+0.28) 7.60(7.04+0.30) +0.3%
5310.6: pack to file (bitmap) 7.55(7.02+0.28) 3.25(2.82+0.18) -57.0%
5310.8: clone (partial bitmap) 1.83(2.26+0.12) 1.82(2.22+0.14) -0.5%
5310.9: pack to file (partial bitmap) 6.86(6.58+0.30) 2.87(2.74+0.20) -58.2%
More context:
http://marc.info/?t=146792101400001&r=1&w=2http://public-inbox.org/git/20160707190917.20011-1-kirr@nexedi.com/T/#t
Cc: Vicent Marti <tanoku@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The delta-base-cache mechanism has been a key to the performance in
a repository with a tightly packed packfile, but it did not scale
well even with a larger value of core.deltaBaseCacheLimit.
* jk/delta-base-cache:
t/perf: add basic perf tests for delta base cache
delta_base_cache: use hashmap.h
delta_base_cache: drop special treatment of blobs
delta_base_cache: use list.h for LRU
release_delta_base_cache: reuse existing detach function
clear_delta_base_cache_entry: use a more descriptive name
cache_or_unpack_entry: drop keep_cache parameter