"git submodule" script does not work well with strange pathnames.
Protect it from a path with slashes in them, at least.
* bw/submodule-with-bs-path:
submodule: prevent backslash expantion in submodule names
The index file has a trailing SHA-1 checksum to detect file
corruption, and historically we checked it every time the index
file is used. Omit the validation during normal use, and instead
verify only in "git fsck".
* jh/verify-index-checksum-only-in-fsck:
read-cache: force_verify_index_checksum
In a 2- and 3-way merge of trees, more than one source trees often
end up sharing an identical subtree; optimize by not reading the
same tree multiple times in such a case.
* jh/unpack-trees-micro-optim:
unpack-trees: avoid duplicate ODB lookups during checkout
The string-list API used a custom reallocation strategy that was
very inefficient, instead of using the usual ALLOC_GROW() macro,
which has been fixed.
* jh/string-list-micro-optim:
string-list: use ALLOC_GROW macro when reallocing string_list
$GIT_DIR may in some cases be normalized with all symlinks resolved
while "gitdir" path expansion in the pattern does not receive the
same treatment, leading to incorrect mismatch. This has been fixed.
* nd/conditional-config-include:
config: resolve symlinks in conditional include's patterns
path.c: and an option to call real_path() in expand_user_path()
Allow the http.postbuffer configuration variable to be set to a
size that can be expressed in size_t, which can be larger than
ulong on some platforms.
* dt/http-postbuffer-can-be-large:
http.postbuffer: allow full range of ssize_t values
"http.proxy" set to an empty string is used to disable the usage of
proxy. We broke this early last year.
* sr/http-proxy-configuration-fix:
http: fix the silent ignoring of proxy misconfiguraion
http: honor empty http.proxy option to bypass proxy
Change the test descriptions from being treated as binary blobs by
perl to being treated as UTF-8. This ensures that e.g. a test
description like "æ" is counted as 1 character, not 2.
I have WIP performance tests for non-ASCII grep patterns on another
topic that are affected by this.
Now instead of:
$ ./run p0000-perf-lib-sanity.sh
[...]
0000.4: export a weird var 0.00(0.00+0.00)
0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś 0.00(0.00+0.00)
0000.7: important variables available in subshells 0.00(0.00+0.00)
[...]
We emit:
[...]
0000.4: export a weird var 0.00(0.00+0.00)
0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś 0.00(0.00+0.00)
0000.7: important variables available in subshells 0.00(0.00+0.00)
[...]
Fixes code originally added in 342e9ef2d9 ("Introduce a performance
testing framework", 2012-02-17).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we complete branch names for "git checkout", we also
complete remote branch names that could trigger the DWIM
behavior. Depending on your workflow and project, this can
be either convenient or annoying.
For instance, my clone of gitster.git contains 74 local
"jk/*" branches, but origin contains another 147. When I
want to checkout a local branch but can't quite remember the
name, tab completion shows me 251 entries. And worse, for a
topic that has been picked up for pu, the upstream branch
name is likely to be similar to mine, leading to a high
probability that I pick the wrong one and accidentally
create a new branch.
This patch adds a way for the user to tell the completion
code not to include DWIM suggestions for checkout. This can
already be done by typing:
git checkout --no-guess jk/<TAB>
but that's rather cumbersome. The downside, of course, is
that you no longer get completion support when you _do_ want
to invoke the DWIM behavior. But depending on your workflow,
that may not be a big loss (for instance, in git.git I am
much more likely to want to detach, so I'd type "git
checkout origin/jk/<TAB>" anyway).
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the completion of "push --delete <remote> <ref>" to complete
refs on that <remote>, not all refs.
Before this cloning git.git and doing "git push --delete origin
p<TAB>" will complete nothing, since a fresh clone of git.git will
have no "pu" branch, whereas origin/p<TAB> will uselessly complete
origin/pu, but fully qualified references aren't accepted by
"--delete".
Now p<TAB> will complete as "pu". The completion of giving --delete
later, e.g. "git push origin --delete p<TAB>" remains unchanged, this
is a bug, but is a general existing limitation of the bash completion,
and not how git-push is documented, so I'm not fixing that case, but
adding a failing TODO test for it.
The testing code was supplied by SZEDER Gábor in
<20170421122832.24617-1-szeder.dev@gmail.com> with minor setup
modifications on my part.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: SZEDER Gábor <szeder.dev@gmail.com>
Test-code-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The old link just redirects to a big index page. I was able
to find a new link for the original document via Google.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The original NIST press release linked here is no longer
available. But it was just a one-page summary of a larger
planning report; we can link to the report and point people
to the executive summary, which contains the same
information.
Ideally we'd cite it with a DOI, but I couldn't dig one up
for this particular document. I found many URLs pointing to
this report, but they all end up redirecting to this one
(and it looks somewhat official).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-archimport has an option to register archives at
mirrors.sourcecontrol.net. The sourcecontrol.net domain
still exists, but that hostname no longer exists.
That means this feature is presumably broken. I'll leave the
examination and modification of that to people who might
actually use archimport. But in the meantime, let's wrap the
reference in the documentation in backticks, which will
avoid turning it into a broken link (and thus polluting
linkchecker results).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The slides for the Linux-mentoring presentation are no
longer available. Let's point to the wayback version of the
page, which works.
Note that the referenced diagram is also available on page
15 of [1]. We could link to that instead, but it's not clear
from the URL scheme ("uploads") whether it's going to stick
around forever.
[1] https://www.linuxfoundation.jp/jp_uploads/seminar20070313/Randy.pdf
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The or.cz version of the Git wiki went away long ago, and
now just redirects to kernel.org.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many sites these days unconditionally redirect http requests
to their https equivalents. Let's make our links https in
the first place to save the client a redirect.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we see an error from split_cmdline(), we exit the
function without freeing the copy of the command string we
made.
This was sort-of introduced by 22e5ae5c8 (connect.c: handle
errors from split_cmdline, 2017-04-10). The leak existed
before that, but before that commit fixed the bug, we could
never trigger this else clause in the first place.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The only caller of this function passes in a static buffer
returned from git_path(). This looks dangerous at first
glance, but turns out to be OK because the first thing we do
is xstrdup() the result.
Let's turn this into a git_pathdup(). That's slightly more
efficient (no extra copy), and makes it easier to audit for
dangerous git_path() invocations.
Since there's only a single caller, let's just set this
default path inside the init function. That makes the memory
ownership clear.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Writing directly into the strbuf avoids a useless copy of
the data, and dropping calls to git_path() makes it easier
to audit for dangerous calls.
Note that git_path() does an implicit strbuf_reset(), but in
each of these cases we were either already doing that reset,
or writing into a fresh strbuf anyway.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's more efficient to use git_pathdup(), as it skips an
extra copy of the path. And by removing some calls to
git_path(), it makes it easier to audit for dangerous uses.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Long ago we added functions like git_path_merge_msg() to
replace the more dangerous git_path("MERGE_MSG"). Over time
some new calls to the latter have crept it. Let's convert
them to use the safer form.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rather than have a variable with a short name that is fed to
git_path(), let's add a helper function that returns the
full path. This avoids the dangerous git_path() function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This avoids using the dangerous git_path(). Right now
there's only one call site (because the writing half is
still part of the shell script), but it may come in handy in
the future as more of bisect is written in C. It also
matches how we access the other BISECT_* files.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When performing an interactive rebase in split-index mode,
the commit message that one should rework when squashing commits
can contain some garbage instead of the usual concatenation of
both of the commit messages.
The code uses git_path() to compute the shared index filename, and
passes it to check_and_freshen_file() as its argument; there is no
guarantee that the rotating pathname buffer passed as argument will
stay valid during the life of this call. Make our own copy before
calling the function and pass the copy as its argument to avoid this
risky pattern.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As explained in the document. This option has an advantage over the
command sequence "git worktree add && git worktree lock": there will be
no gap that somebody can accidentally "prune" the new worktree (or soon,
explicitly "worktree remove" it).
"worktree add" does keep a lock on while it's preparing the worktree.
If --lock is specified, this lock remains after the worktree is created.
Suggested-by: David Taylor <David.Taylor@dell.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hotfix for a topic that is already in 'master'.
* jh/memihash-opt:
p0004: make perf test executable
t3008: skip lazy-init test on a single-core box
test-online-cpus: helper to return cpu count
name-hash: fix buffer overrun
"git p4" used "name-rev HEAD" when it wants to learn what branch is
checked out; it should use "symbolic-ref HEAD".
* ld/p4-current-branch-fix:
git-p4: don't use name-rev to get current branch
git-p4: add read_pipe_text() internal function
git-p4: add failing test for name-rev rather than symbolic-ref
Call clear_pathspec() to release resources immediately before the
cmd_grep() function returns.
* ab/grep-plug-pathspec-leak:
grep: plug a trivial memory leak
Clean up fallouts from recent tightening of the set-up sequence,
where Git barfs when repository information is accessed without
first ensuring that it was started in a repository.
* jk/no-looking-at-dotgit-outside-repo:
test-read-cache: setup git dir
has_sha1_file: don't bother if we are not in a repository
The "submodule" specific field in the ref_store structure is
replaced with a more generic "gitdir" that can later be used also
when dealing with ref_store that represents the set of refs visible
from the other worktrees.
* nd/files-backend-git-dir: (28 commits)
refs.h: add a note about sorting order of for_each_ref_*
t1406: new tests for submodule ref store
t1405: some basic tests on main ref store
t/helper: add test-ref-store to test ref-store functions
refs: delete pack_refs() in favor of refs_pack_refs()
files-backend: avoid ref api targeting main ref store
refs: new transaction related ref-store api
refs: add new ref-store api
refs: rename get_ref_store() to get_submodule_ref_store() and make it public
files-backend: replace submodule_allowed check in files_downcast()
refs: move submodule code out of files-backend.c
path.c: move some code out of strbuf_git_path_submodule()
refs.c: make get_main_ref_store() public and use it
refs.c: kill register_ref_store(), add register_submodule_ref_store()
refs.c: flatten get_ref_store() a bit
refs: rename lookup_ref_store() to lookup_submodule_ref_store()
refs.c: introduce get_main_ref_store()
files-backend: remove the use of git_path()
files-backend: add and use files_ref_path()
files-backend: add and use files_reflog_path()
...
The diff options "--ours", "--theirs" exist for quite some time.
But so far they were not documented. Now they are.
* ah/diff-files-ours-theirs-doc:
diff-files: document --ours etc.
If a patch e-mail had its first paragraph after an in-body header
indented (even after a blank line after the in-body header line),
the indented line was mistook as a continuation of the in-body
header. This has been fixed.
* lt/mailinfo-in-body-header-continuation:
mailinfo: fix in-body header continuations
"git push --recurse-submodules --push-option=<string>" learned to
propagate the push option recursively down to pushes in submodules.
* bw/push-options-recursively-to-submodules:
push: propagate remote and refspec with --recurse-submodules
submodule--helper: add push-check subcommand
remote: expose parse_push_refspec function
push: propagate push-options with --recurse-submodules
push: unmark a local variable as static
Conversion from unsigned char [40] to struct object_id continues.
* bc/object-id:
Documentation: update and rename api-sha1-array.txt
Rename sha1_array to oid_array
Convert sha1_array_for_each_unique and for_each_abbrev to object_id
Convert sha1_array_lookup to take struct object_id
Convert remaining callers of sha1_array_lookup to object_id
Make sha1_array_append take a struct object_id *
sha1-array: convert internal storage for struct sha1_array to object_id
builtin/pull: convert to struct object_id
submodule: convert check_for_new_submodule_commits to object_id
sha1_name: convert disambiguate_hint_fn to take object_id
sha1_name: convert struct disambiguate_state to object_id
test-sha1-array: convert most code to struct object_id
parse-options-cb: convert sha1_array_append caller to struct object_id
fsck: convert init_skiplist to struct object_id
builtin/receive-pack: convert portions to struct object_id
builtin/pull: convert portions to struct object_id
builtin/diff: convert to struct object_id
Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ
Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ
Define new hash-size constants for allocating memory
The output from "git status --short" has been extended to show
various kinds of dirtyness in submodules differently; instead of to
"M" for modified, 'm' and '?' can be shown to signal changes only
to the working tree of the submodule but not the commit that is
checked out.
* sb/submodule-short-status:
submodule.c: correctly handle nested submodules in is_submodule_modified
short status: improve reporting for submodule changes
submodule.c: stricter checking for submodules in is_submodule_modified
submodule.c: port is_submodule_modified to use porcelain 2
submodule.c: convert is_submodule_modified to use strbuf_getwholeline
submodule.c: factor out early loop termination in is_submodule_modified
submodule.c: use argv_array in is_submodule_modified
Teach has_dir_name() to see if the path of the new item
is greater than the last path in the index array before
attempting to search for it.
has_dir_name() is looking for file/directory collisions
in the index and has to consider each sub-directory
prefix in turn. This can cause multiple binary searches
for each path.
During operations like checkout, merge_working_tree()
populates the new index in sorted order, so we expect
to be able to append in many cases.
This commit is part 2 of 2. This commit handles the
additional possible short-cuts as we look at each
sub-directory prefix.
The net-net gains for add_index_entry_with_check() and
both had_dir_name() commits are best seen for very
large repos.
Here are results for an INFLATED version of linux.git
with 1M files.
$ GIT_PERF_REPO=/mnt/test/linux_inflated.git/ ./run upstream/base HEAD ./p0006-read-tree-checkout.sh
Test upstream/base HEAD
0006.2: read-tree br_base br_ballast (1043893) 3.79(3.63+0.15) 2.68(2.52+0.15) -29.3%
0006.3: switch between br_base br_ballast (1043893) 7.55(6.58+0.44) 6.03(4.60+0.43) -20.1%
0006.4: switch between br_ballast br_ballast_plus_1 (1043893) 10.84(9.26+0.59) 8.44(7.06+0.65) -22.1%
0006.5: switch between aliases (1043893) 10.93(9.39+0.58) 10.24(7.04+0.63) -6.3%
Here are results for a synthetic repo with 4.2M files.
$ GIT_PERF_REPO=~/work/gfw/t/perf/repos/gen-many-files-10.4.3.git/ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh
Test HEAD~3 HEAD
0006.2: read-tree br_base br_ballast (4194305) 29.96(19.26+10.50) 23.76(13.42+10.12) -20.7%
0006.3: switch between br_base br_ballast (4194305) 56.95(36.08+16.83) 45.54(25.94+15.68) -20.0%
0006.4: switch between br_ballast br_ballast_plus_1 (4194305) 90.94(51.50+31.52) 78.22(39.39+30.70) -14.0%
0006.5: switch between aliases (4194305) 93.72(51.63+34.09) 77.94(39.00+30.88) -16.8%
Results for medium repos (like linux.git) are mixed and have
more variance (probably do to disk IO unrelated to this test.
$ GIT_PERF_REPO=/mnt/test/linux.git/ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh
Test HEAD~3 HEAD
0006.2: read-tree br_base br_ballast (57994) 0.25(0.21+0.03) 0.20(0.17+0.02) -20.0%
0006.3: switch between br_base br_ballast (57994) 10.67(6.06+2.92) 10.51(5.94+2.91) -1.5%
0006.4: switch between br_ballast br_ballast_plus_1 (57994) 0.59(0.47+0.16) 0.52(0.40+0.13) -11.9%
0006.5: switch between aliases (57994) 0.59(0.44+0.17) 0.51(0.38+0.14) -13.6%
$ GIT_PERF_REPO=/mnt/test/linux.git/ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh
Test HEAD~3 HEAD
0006.2: read-tree br_base br_ballast (57994) 0.24(0.21+0.02) 0.21(0.18+0.02) -12.5%
0006.3: switch between br_base br_ballast (57994) 10.42(5.98+2.91) 10.66(5.86+3.09) +2.3%
0006.4: switch between br_ballast br_ballast_plus_1 (57994) 0.59(0.49+0.13) 0.53(0.37+0.16) -10.2%
0006.5: switch between aliases (57994) 0.59(0.43+0.17) 0.50(0.37+0.14) -15.3%
Results for smaller repos (like git.git) are not significant.
$ ./run HEAD~3 HEAD ./p0006-read-tree-checkout.sh
Test HEAD~3 HEAD
0006.2: read-tree br_base br_ballast (3043) 0.01(0.00+0.00) 0.01(0.00+0.00) +0.0%
0006.3: switch between br_base br_ballast (3043) 0.31(0.17+0.11) 0.29(0.19+0.08) -6.5%
0006.4: switch between br_ballast br_ballast_plus_1 (3043) 0.03(0.02+0.00) 0.03(0.02+0.00) +0.0%
0006.5: switch between aliases (3043) 0.03(0.02+0.00) 0.03(0.02+0.00) +0.0%
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach has_dir_name() to see if the path of the new item
is greater than the last path in the index array before
attempting to search for it.
has_dir_name() is looking for file/directory collisions
in the index and has to consider each sub-directory
prefix in turn. This can cause multiple binary searches
for each path.
During operations like checkout, merge_working_tree()
populates the new index in sorted order, so we expect
to be able to append in many cases.
This commit is part 1 of 2. This commit handles the top
of has_dir_name() and the trivial optimization.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach add_index_entry_with_check() to see if the path
of the new item is greater than the last path in the
index array before attempting to search for it.
During checkout, merge_working_tree() populates the new
index in sorted order, so this change will save a binary
lookups per file. This preserves the original behavior
but simply checks the last element before starting the
search.
This helps performance on very large repositories.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Created t/perf/repos/many-files.sh to generate large, but
artificial repositories.
Created t/perf/inflate-repo.sh to alter an EXISTING repo
to have a set of large commits. This can be used to create
a branch with 1M+ files in repositories like git.git or
linux.git, but with more realistic content. It does this
by making multiple copies of the entire worktree in a series
of sub-directories.
The branch name and ballast structure created by both scripts
match, so either script can be used to generate very large
test repositories for the following perf test.
Created t/perf/p0006-read-tree-checkout.sh to measure
performance on various read-tree, checkout, and update-index
operations. This test can run using either normal repos or
ones from the above scripts.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>