Command line completion script (in contrib/) learned to work better
with the reftable backend.
* sh/completion-with-reftable:
completion: support pseudoref existence checks for reftables
completion: refactor existence checks for pseudorefs
"git fetch --atomic" issued an unnecessary empty error message,
which has been corrected.
cf. <ZX__e7VjyLXIl-uV@tanuki>
* jx/fetch-atomic-error-message-fix:
fetch: no redundant error message for atomic fetch
t5574: test porcelain output of atomic fetch
The code to parse the From e-mail header has been updated to avoid
recursion.
* jk/mailinfo-iterative-unquote-comment:
mailinfo: avoid recursion when unquoting From headers
t5100: make rfc822 comment test more careful
mailinfo: fix out-of-bounds memory reads in unquote_quoted_pair()
Code clean-up for sanity checking of command line options for "git
show-ref".
* rs/show-ref-incompatible-options:
show-ref: use die_for_incompatible_opt3()
Some codepaths did not correctly parse configuration variables
specified with valueless "true", which has been corrected.
* jk/implicit-true:
fsck: handle NULL value when parsing message config
trailer: handle NULL value when parsing trailer-specific config
submodule: handle NULL value when parsing submodule.*.branch
help: handle NULL value for alias.* config
trace2: handle NULL values in tr2_sysenv config callback
setup: handle NULL value when parsing extensions
config: handle NULL value when parsing non-bools
"git bisect reset" has been taught to clean up state files and refs
even when BISECT_START file is gone.
* jk/bisect-reset-fix:
bisect: always clean on reset
"git $cmd --end-of-options --rev -- --path" for some $cmd failed
to interpret "--rev" as a rev, and "--path" as a path. This was
fixed for many programs like "reset" and "checkout".
* jk/end-of-options:
parse-options: decouple "--end-of-options" and "--"
The command line parser for the "log" family of commands was too
loose when parsing certain numbers, e.g., silently ignoring the
extra 'q' in "git log -n 1q" without complaining, which has been
tightened up.
* jc/revision-parse-int:
revision: parse integer arguments to --max-count, --skip, etc., more carefully
The sample pre-commit hook that tries to catch introduction of new
paths that use potentially non-portable characters did not notice
an existing path getting renamed to such a problematic path, when
rename detection was enabled.
* jp/use-diff-index-in-pre-commit-sample:
hooks--pre-commit: detect non-ASCII when renaming
trace2 streams used to record the URLs that potentially embed
authentication material, which has been corrected.
* jh/trace2-redact-auth:
t0212: test URL redacting in EVENT format
t0211: test URL redacting in PERF format
trace2: redact passwords from https:// URLs by default
trace2: fix signature of trace2_def_param() macro
Stale URLs have been updated to their current counterparts (or
archive.org) and HTTP links are replaced with working HTTPS links.
* js/update-urls-in-doc-and-comment:
doc: refer to internet archive
doc: update links for andre-simon.de
doc: switch links to https
doc: update links to current pages
Earlier we stopped relying on commit-graph that (still) records
information about commits that are lost from the object store,
which has negative performance implications. The default has been
flipped to disable this pessimization.
* ps/commit-graph-less-paranoid:
commit-graph: disable GIT_COMMIT_GRAPH_PARANOIA by default
Newer versions of Getopt::Long started giving warnings against our
(ab)use of it in "git send-email". Bump the minimum version
requirement for Perl to 5.8.1 (from September 2002) to allow
simplifying our implementation.
* tz/send-email-negatable-options:
send-email: avoid duplicate specification warnings
perl: bump the required Perl version to 5.8.1 from 5.8.0
The way CI testing used "prove" could lead to running the test
suite twice needlessly, which has been corrected.
* js/ci-discard-prove-state:
ci: avoid running the test suite _twice_
ci: add support for GitLab CI
ci: install test dependencies for linux-musl
ci: squelch warnings when testing with unusable Git repo
ci: unify setup of some environment variables
ci: split out logic to set up failed test artifacts
ci: group installation of Docker dependencies
ci: make grouping setup more generic
ci: reorder definitions for grouping functions
In contrib/completion/git-completion.bash, there are a bunch of
instances where we read pseudorefs, such as HEAD, MERGE_HEAD,
REVERT_HEAD, and others via the filesystem. However, the upcoming
reftable refs backend won't use '.git/HEAD' at all but instead will
write an invalid refname as placeholder for backwards compatibility,
which will break the git-completion script.
Update the '__git_pseudoref_exists' function to:
1. Recognize the placeholder '.git/HEAD' written by the reftable
backend (its content is specified in the reftable specs).
2. If reftable is in use, use 'git rev-parse' to determine whether the
given ref exists.
3. Otherwise, continue to use 'test -f' to check for the ref's filename.
Signed-off-by: Stan Hu <stanhu@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for the reftable backend, this commit introduces a
'__git_pseudoref_exists' function that continues to use 'test -f' to
determine whether a given pseudoref exists in the local filesystem.
Signed-off-by: Stan Hu <stanhu@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If an error occurs during an atomic fetch, a redundant error message
will appear at the end of do_fetch(). It was introduced in b3a804663c
(fetch: make `--atomic` flag cover backfilling of tags, 2022-02-17).
Because a failure message is displayed before setting retcode in the
function do_fetch(), calling error() on the err message at the end of
this function may result in redundant or empty error message to be
displayed.
We can remove the redundant error() function, because we know that
the function ref_transaction_abort() never fails. While we can find a
common pattern for calling ref_transaction_abort() by running command
"git grep -A1 ref_transaction_abort", e.g.:
if (ref_transaction_abort(transaction, &error))
error("abort: %s", error.buf);
Following this pattern, we can tolerate the return value of the function
ref_transaction_abort() being changed in the future. We also delay the
output of the err message to the end of do_fetch() to reduce redundant
code. With these changes, the test case "fetch porcelain output
(atomic)" in t5574 will also be fixed.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test case "fetch porcelain output" checks output of the fetch
command. The error output must be empty with the follow assertion:
test_must_be_empty stderr
But this assertion fails if using atomic fetch. Refactor this test case
to use different fetch options by splitting it into three test cases.
1. "setup for fetch porcelain output".
2. "fetch porcelain output": for non-atomic fetch.
3. "fetch porcelain output (atomic)": for atomic fetch.
Add new command "test_commit ..." in the first test case, so that if we
run these test cases individually (--run=4-6), "git rev-parse HEAD~"
command will work properly. Run the above test cases, we can find that
one test case has a known breakage, as shown below:
ok 4 - setup for fetch porcelain output
ok 5 - fetch porcelain output # TODO known breakage vanished
not ok 6 - fetch porcelain output (atomic) # TODO known breakage
The failed test case has an error message with only the error prompt but
no message body, as follows:
'stderr' is not empty, it contains:
error:
In a later commit, we will fix this issue.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Acked-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our unquote_comment() function is recursive; when it sees a comment
within a comment, like:
(this is an (embedded) comment)
it recurses to handle the inner comment. This is fine for practical use,
but it does mean that you can easily run out of stack space with a
malicious header. For example:
perl -e 'print "From: ", "(" x 2**18;' |
git mailinfo /dev/null /dev/null
segfaults on my system. And since mailinfo is likely to be fed untrusted
input from the Internet (if not by human users, who might recognize a
garbage header, but certainly there are automated systems that apply
patches from a list) it may be possible for an attacker to trigger the
problem.
That said, I don't think there's an interesting security vulnerability
here. All an attacker can do is make it impossible to parse their email
and apply their patch, and there are lots of ways to generate bogus
emails. So it's more of an annoyance than anything.
But it's pretty easy to fix it. The recursion is not helping us preserve
any particular state from each level. The only flag in our parsing is
take_next_literally, and we can never recurse when it is set (since the
start of a new comment implies it was not backslash-escaped). So it is
really only useful for finding the end of the matched pair of
parentheses. We can do that easily with a simple depth counter.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When processing "From" headers in an email, mailinfo "unquotes" quoted
strings and rfc822 parenthesized comments. For quoted strings, we
actually remove the double-quotes, so:
From: "A U Thor" <someone@example.com>
become:
Author: A U Thor
Email: someone@example.com
But for comments, we leave the outer parentheses in place, so:
From: A U (this is a comment) Thor <someone@example.com>
becomes:
Author: A U (this is a comment) Thor
Email: someone@example.com
So what is the comment "unquoting" actually doing? In our code, being in
a comment section has exactly two effects:
1. We'll unquote backslash-escaped characters inside a comment
section.
2. We _won't_ unquote double-quoted strings inside a comment section.
Our test for comments in t5100 checks this:
From: "A U Thor" <somebody@example.com> (this is \(really\) a comment (honestly))
So it is covering (1), but not (2). Let's add in a quoted string to
cover this.
Moreover, because the comment appears at the end of the From header,
there's nothing to confirm that we correctly found the end of the
comment section (and not just the end-of-string). Let's instead move it
to the beginning of the header, which means we can confirm that the
existing quoted string is detected (which will only happen if we know
we've left the comment block).
As expected, the test continues to pass, but this will give us more
confidence as we refactor the code in the next patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When processing a header like a "From" line, mailinfo uses
unquote_quoted_pair() to handle double-quotes and rfc822 parenthesized
comments. It takes a NUL-terminated string on input, and loops over the
"in" pointer until it sees the NUL. When it finds the start of an
interesting block, it delegates to helper functions which also increment
"in", and return the updated pointer.
But there's a bug here: the helpers find the NUL with a post-increment
in the loop condition, like:
while ((c = *in++) != 0)
So when they do see a NUL (rather than the correct termination of the
quote or comment section), they return "in" as one _past_ the NUL
terminator. And thus the outer loop in unquote_quoted_pair() does not
realize we hit the NUL, and keeps reading past the end of the buffer.
We should instead make sure to return "in" positioned at the NUL, so
that the caller knows to stop their loop, too. A hacky way to do this is
to return "in - 1" after leaving the inner loop. But a slightly cleaner
solution is to avoid incrementing "in" until we are sure it contained a
non-NUL byte (i.e., doing it inside the loop body).
The two tests here show off the problem. Since we check the output,
they'll _usually_ report a failure in a normal build, but it depends on
what garbage bytes are found after the heap buffer. Building with
SANITIZE=address reliably notices the problem. The outcome (both the
exit code and the exact bytes) are just what Git happens to produce for
these cases today, and shouldn't be taken as an endorsement. It might be
reasonable to abort on an unterminated string, for example. The priority
for this patch is fixing the out-of-bounds memory access.
Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the standard message for reporting the use of multiple mutually
exclusive options by calling die_for_incompatible_opt3() instead of
rolling our own. This has the benefits of showing only the actually
given options, reducing the number of strings to translate and making
the UI slightly more consistent.
Adjust the test to no longer insist on a specific order of the
reported options, as this implementation detail does not affect the
usefulness of the error message.
Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "rev-list" and other commands in the "log" family, being the
oldest part of the system, use their own custom argument parsers,
and integer values of some options are parsed with atoi(), which
allows a non-digit after the number (e.g., "1q") to be silently
ignored. As a natural consequence, an argument that does not begin
with a digit (e.g., "q") silently becomes zero, too.
Switch to use strtol_i() and parse_timestamp() appropriately to
catch bogus input.
Note that one may naïvely expect that --max-count, --skip, etc., to
only take non-negative values, but we must allow them to also take
negative values, as an escape hatch to countermand a limit set by an
earlier option on the command line; the underlying variables are
initialized to (-1) and "--max-count=-1", for example, is a
legitimate way to reinitialize the limit.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing fsck.*, receive.fsck.*, or fetch.fsck.*, we don't check for
an implicit bool. So any of:
[fsck]
badTree
[receive "fsck"]
badTree
[fetch "fsck"]
badTree
will cause us to segfault. We can fix it with config_error_nonbool() in
the usual way, but we have to make a few more changes to get good error
messages. The problem is that all three spots do:
if (skip_prefix(var, "fsck.", &var))
to match and parse the actual message id. But that means that "var" now
just says "badTree" instead of "receive.fsck.badTree", making the
resulting message confusing. We can fix that by storing the parsed
message id in its own separate variable.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When parsing the "key", "command", and "cmd" trailer config, we just
make a copy of the value string. If we see an implicit bool like:
[trailer "foo"]
key
we'll segfault trying to copy a NULL pointer. We can fix this with the
usual config_error_nonbool() check.
I split this out from the other vanilla cases, because at first glance
it looks like a better fix here would be to move the NULL check out of
the switch statement. But it would change the behavior of other keys
like trailer.*.ifExists, where an implicit bool is interpreted as
EXISTS_DEFAULT.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We record the submodule branch config value as a string, so config that
uses an implicit bool like:
[submodule "foo"]
branch
will cause us to segfault. Note that unlike most other config-parsing
bugs of this class, this can be triggered by parsing a bogus .gitmodules
file (which we might do after cloning a malicious repository).
I don't think the security implications are important, though. It's
always a strict NULL dereference, not an out-of-bounds read or write. So
we should reliably kill the process. That may be annoying, but the
impact is limited to the attacker preventing the victim from
successfully using "git clone --recurse-submodules", etc, on the
malicious repo.
The "branch" entry is the only one with this problem; other strings like
"path" and "url" already check for NULL.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When showing all config with "git help --all", we print the list of
defined aliases. But our config callback to do so does not check for a
NULL value, meaning a config block like:
[alias]
foo
will cause us to segfault. We should detect and complain about this in
the usual way.
Since this command is purely informational (and we aren't trying to run
the alias), we could perhaps just generate a warning and continue. But
this sort of misconfiguration should be pretty rare, and the error
message we will produce points directly to the line of config that needs
to be fixed. So just generating the usual error should be OK.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If you have config with an implicit bool like:
[trace2]
envvars
we'll segfault, as we unconditionally try to xstrdup() the value. We
should instead detect and complain, as a boolean value has no meaning
here. The same is true for every variable in tr2_sysenv_settings (and
this patch covers them all, as we check them in a loop).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "partialclone" extension config records a string, and hence it is an
error to have an implicit bool like:
[extensions]
partialclone
in your config. We should recognize and reject this, rather than
segfaulting (which is the current behavior). Note that it's OK to use
config_error_nonbool() here, even though the return value is an enum. We
explicitly document EXTENSION_ERROR as -1 for compatibility with
error(), etc.
This is the only extension value that has this problem. Most of the
others are bools that interpret this value naturally. The exception is
extensions.objectformat, which does correctly check for NULL.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the config parser sees an "implicit" bool like:
[core]
someVariable
it passes NULL to the config callback. Any callback code which expects a
string must check for NULL. This usually happens via helpers like
git_config_string(), etc, but some custom code forgets to do so and will
segfault.
These are all fairly vanilla cases where the solution is just the usual
pattern of:
if (!value)
return config_error_nonbool(var);
though note that in a few cases we have to split initializers like:
int some_var = initializer();
into:
int some_var;
if (!value)
return config_error_nonbool(var);
some_var = initializer();
There are still some broken instances after this patch, which I'll
address on their own in individual patches after this one.
Reported-by: Carlos Andrés Ramírez Cataño <antaigroupltda@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Usually "bisect reset" cleans up any refs/bisect/ refs, along with
meta-files like .git/BISECT_LOG. But it only does so after deciding that
a bisection is active, which it does by reading BISECT_START. This is
usually fine, but it's possible to get into a confusing state if the
BISECT_START file is gone, but other cruft is left (this might be due to
a bug, or a system crash, etc).
And since "bisect reset" refuses to do anything in this state, the user
has no easy way to clean up the leftover cruft. While another "bisect
start" would clear the state, in the interim it can be annoying, as
other tools (like our bash prompt code) think we are bisecting, and
for-each-ref output may be polluted with refs/bisect/ entries.
Further adding to the confusion is that running "bisect reset $some_ref"
skips the BISECT_START check. So it never realizes that there's no
bisection active and does the cleanup anyway!
So let's just make sure we always do the cleanup, whether we looked at
BISECT_START or not. If the user doesn't give us a commit to reset to,
we'll still say "We are not bisecting" and skip the call to "git
checkout".
Reported-by: Janik Haag <janik@aq0.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we added generic end-of-options support in 51b4594b40
(parse-options: allow --end-of-options as a synonym for "--",
2019-08-06), we made them true synonyms. They both stop option parsing,
and they are both returned in the resulting argv if the KEEP_DASHDASH
flag is used.
The hope was that this would work for all callers:
- most generic callers would not pass KEEP_DASHDASH, and so would just
do the right thing (stop parsing there) without needing to know
anything more.
- callers with KEEP_DASHDASH were generally going to rely on
setup_revisions(), which knew to handle --end-of-options specially
But that turned out miss quite a few cases that pass KEEP_DASHDASH but
do their own manual parsing. For example, "git reset", "git checkout",
and so on want pass KEEP_DASHDASH so they can support:
git reset $revs -- $paths
but of course aren't going to actually do a traversal, so they don't
call setup_revisions(). And those cases currently get confused by
--end-of-options being left in place, like:
$ git reset --end-of-options HEAD
fatal: option '--end-of-options' must come before non-option arguments
We could teach each of these callers to handle the leftover option
explicitly. But let's try to be a bit more clever and see if we can
solve it centrally in parse-options.c.
The bogus assumption here is that KEEP_DASHDASH tells us the caller
wants to see --end-of-options in the result. But really, the callers
which need to know that --end-of-options was reached are those that may
potentially parse more options from argv. In other words, those that
pass the KEEP_UNKNOWN_OPT flag.
If such a caller is aware of --end-of-options (e.g., because they call
setup_revisions() with the result), then this will continue to do the
right thing, treating anything after --end-of-options as a non-option.
And if the caller is not aware of --end-of-options, they are better off
keeping it intact, because either:
1. They are just passing the options along to somebody else anyway, in
which case that somebody would need to know about the
--end-of-options marker.
2. They are going to parse the remainder themselves, at which point
choking on --end-of-options is much better than having it silently
removed. The point is to avoid option injection from untrusted
command line arguments, and bailing is better than quietly treating
the untrusted argument as an option.
This fixes bugs with --end-of-options across several commands, but I've
focused on two in particular here:
- t7102 confirms that "git reset --end-of-options --foo" now works.
This checks two things. One, that we no longer barf on
"--end-of-options" itself (which previously we did, even if the rev
was something vanilla like "HEAD" instead of "--foo"). And two, that
we correctly treat "--foo" as a revision rather than an option.
This fix applies to any other cases which pass KEEP_DASHDASH but not
KEEP_UNKNOWN_OPT, like "git checkout", "git check-attr", "git grep",
etc, which would previously choke on "--end-of-options".
- t9350 shows the opposite case: fast-export passed KEEP_UNKNOWN_OPT
but not KEEP_DASHDASH, but then passed the result on to
setup_revisions(). So it never saw --end-of-options, and would
erroneously parse "fast-export --end-of-options --foo" as having a
"--foo" option. This is now fixed.
Note that this does shut the door for callers which want to know if we
hit end-of-options, but don't otherwise need to keep unknown opts. The
obvious thing here is feeding it to the DWIM verify_filename()
machinery. And indeed, this is a problem even for commands which do
understand --end-of-options already. For example, without this patch,
you get:
$ git log --end-of-options --foo
fatal: option '--foo' must come before non-option arguments
because we refuse to accept "--foo" as a filename (because it starts
with a dash) even though we could know that we saw end-of-options. The
verify_filename() function simply doesn't accept this extra information.
So that is the status quo, and this patch doubles down further on that.
Commands like "git reset" have the same problem, but they won't even
know that parse-options saw --end-of-options! So even if we fixed
verify_filename(), they wouldn't have anything to pass to it.
But in practice I don't think this is a big deal. If you are being
careful enough to use --end-of-options, then you should also be using
"--" to disambiguate and avoid the DWIM behavior in the first place. In
other words, doing:
git log --end-of-options --this-is-a-rev -- --this-is-a-path
works correctly, and will continue to do so. And likewise, with this
patch now:
git reset --end-of-options --this-is-a-rev -- --this-is-a-path
will work, as well.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When diff.renames is turned on, the diff-filter will not return renamed
files (or copied ones with diff.renames=copy) and potential non-ASCII
characters would not be caught by this hook.
Use the plumbing command diff-index instead of the porcelain one to not
be affected by diff.rename.
Signed-off-by: Julian Prein <druckdev@protonmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 7a5d604443 (commit: detect commits that exist in commit-graph but not
in the ODB, 2023-10-31), we have introduced a new object existence check
into `repo_parse_commit_internal()` so that we do not parse commits via
the commit-graph that don't have a corresponding object in the object
database. This new check of course comes with a performance penalty,
which the commit put at around 30% for `git rev-list --topo-order`. But
there are in fact scenarios where the performance regression is even
higher. The following benchmark against linux.git with a fully-build
commit-graph:
Benchmark 1: git.v2.42.1 rev-list --count HEAD
Time (mean ± σ): 658.0 ms ± 5.2 ms [User: 613.5 ms, System: 44.4 ms]
Range (min … max): 650.2 ms … 666.0 ms 10 runs
Benchmark 2: git.v2.43.0-rc1 rev-list --count HEAD
Time (mean ± σ): 1.333 s ± 0.019 s [User: 1.263 s, System: 0.069 s]
Range (min … max): 1.302 s … 1.361 s 10 runs
Summary
git.v2.42.1 rev-list --count HEAD ran
2.03 ± 0.03 times faster than git.v2.43.0-rc1 rev-list --count HEAD
While it's a noble goal to ensure that results are the same regardless
of whether or not we have a potentially stale commit-graph, taking twice
as much time is a tough sell. Furthermore, we can generally assume that
the commit-graph will be updated by git-gc(1) or git-maintenance(1) as
required so that the case where the commit-graph is stale should not at
all be common.
With that in mind, default-disable GIT_COMMIT_GRAPH_PARANOIA and restore
the behaviour and thus performance previous to the mentioned commit. In
order to not be inconsistent, also disable this behaviour by default in
`lookup_commit_in_graph()`, where the object existence check has been
introduced right at its inception via f559d6d45e (revision: avoid
hitting packfiles when commits are in commit-graph, 2021-08-09).
This results in another speedup in commands that end up calling this
function, even though it's less pronounced compared to the above
benchmark. The following has been executed in linux.git with ~1.2
million references:
Benchmark 1: GIT_COMMIT_GRAPH_PARANOIA=true git rev-list --all --no-walk=unsorted
Time (mean ± σ): 2.947 s ± 0.003 s [User: 2.412 s, System: 0.534 s]
Range (min … max): 2.943 s … 2.949 s 3 runs
Benchmark 2: GIT_COMMIT_GRAPH_PARANOIA=false git rev-list --all --no-walk=unsorted
Time (mean ± σ): 2.724 s ± 0.030 s [User: 2.207 s, System: 0.514 s]
Range (min … max): 2.704 s … 2.759 s 3 runs
Summary
GIT_COMMIT_GRAPH_PARANOIA=false git rev-list --all --no-walk=unsorted ran
1.08 ± 0.01 times faster than GIT_COMMIT_GRAPH_PARANOIA=true git rev-list --all --no-walk=unsorted
So whereas 7a5d604443 initially introduced the logic to start doing an
object existence check in `repo_parse_commit_internal()` by default, the
updated logic will now instead cause `lookup_commit_in_graph()` to stop
doing the check by default. This behaviour continues to be tweakable by
the user via the GIT_COMMIT_GRAPH_PARANOIA environment variable.
Note that this requires us to amend some tests to manually turn on the
paranoid checks again. This is because we cause repository corruption by
manually deleting objects which are part of the commit graph already.
These circumstances shouldn't usually happen in repositories.
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These pages are no longer reachable from their original locations,
which makes things difficult for readers. Instead, switch to linking to
the Internet Archive for the content.
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Beyond the fact that it's somewhat traditional to respect sites'
self-identification, it's helpful for links to point to the things
that people expect them to reference. Here that means linking to
specific pages instead of a domain.
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These sites offer https versions of their content.
Using the https versions provides some protection for users.
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's somewhat traditional to respect sites' self-identification.
Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the added tests cases, skip testing the `GIT_TRACE2_REDACT=0` case
because we would need to exactly model the full JSON event stream like
we did in the preceding basic tests and I do not think it is worth it.
Furthermore, the Trace2 routines print the same content in normal, perf,
or event format, and in t0210 and t0211 we already tested the basic
functionality, so no need to repeat it here.
In this test, we use the test-helper to unit test each of the event
messages where URLs can appear and confirm that they are redacted in
each event.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This transmogrifies the test case that was just added to t0210, to also
cover the `GIT_TRACE2_PERF` backend.
Just like t0211, we now have to toggle the `TEST_PASSES_SANITIZE_LEAK`
annotation.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is an unsafe practice to call something like
git clone https://user:password@example.com/
This not only risks leaking the password "over the shoulder" or into the
readline history of the current Unix shell, it also gets logged via
Trace2 if enabled.
Let's at least avoid logging such secrets via Trace2, much like we avoid
logging secrets in `http.c`. Much like the code in `http.c` is guarded
via `GIT_TRACE_REDACT` (defaulting to `true`), we guard the new code via
`GIT_TRACE2_REDACT` (also defaulting to `true`).
The new tests added in this commit uncover leaks in `builtin/clone.c`
and `remote.c`. Therefore we need to turn off
`TEST_PASSES_SANITIZE_LEAK`. The reasons:
- We observed that `the_repository->remote_status` is not released
properly.
- We are using `url...insteadOf` and that runs into a code path where an
allocated URL is replaced with another URL, and the original URL is
never released.
- `remote_states` contains plenty of `struct remote`s whose refspecs
seem to be usually allocated by never released.
More investigation is needed here to identify the exact cause and
proper fixes for these leaks/bugs.
Co-authored-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add `struct key_value_info` argument to `trace2_def_param()`.
In dc90208497 (trace2: plumb config kvi, 2023-06-28) a `kvi`
argument was added to `trace2_def_param_fl()` but the macro
was not up updated. Let's fix that.
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"To dereference" and "to peel" were sometimes used in in-code
comments and documentation but without description in the glossary.
* vd/glossary-dereference-peel:
glossary: add definitions for dereference & peel