Introduce protocol_v2, a new value for 'enum protocol_version'.
Subsequent patches will fill in the implementation of protocol_v2.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Once protocol_v2 is introduced requesting a fetch or a push will need to
be handled differently depending on the protocol version. Store the
protocol version the server is speaking in 'struct git_transport_data'
and use it to determine what to do in the case of a fetch or a push.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to prepare for the addition of protocol_v2 push the protocol
version discovery outside of 'get_remote_heads()'. This will allow for
keeping the logic for processing the reference advertisement for
protocol_v1 and protocol_v0 separate from the logic for protocol_v2.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove code duplication and use the existing 'get_refs_via_connect()'
function to retrieve a remote's heads in 'fetch_refs_via_pack()' and
'git_transport_push()'.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert find_unique_abbrev and find_unique_abbrev_r to each take a
pointer to struct object_id.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the declaration and definition of strbuf_add_unique_abbrev to
make it take a pointer to struct object_id. Predeclare the struct in
strbuf.h, as cache.h includes strbuf.h before it declares the struct,
and otherwise the struct declaration would have the wrong scope.
Apply the following semantic patch, along with the standard object_id
transforms, to adjust the callers:
@@
expression E1, E2, E3;
@@
- strbuf_add_unique_abbrev(E1, E2.hash, E3);
+ strbuf_add_unique_abbrev(E1, &E2, E3);
@@
expression E1, E2, E3;
@@
- strbuf_add_unique_abbrev(E1, E2->hash, E3);
+ strbuf_add_unique_abbrev(E1, E2, E3);
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The machinery to clone & fetch, which in turn involves packing and
unpacking objects, have been told how to omit certain objects using
the filtering mechanism introduced by the jh/object-filtering
topic, and also mark the resulting pack as a promisor pack to
tolerate missing objects, taking advantage of the mechanism
introduced by the jh/fsck-promisors topic.
* jh/partial-clone:
t5616: test bulk prefetch after partial fetch
fetch: inherit filter-spec from partial clone
t5616: end-to-end tests for partial clone
fetch-pack: restore save_commit_buffer after use
unpack-trees: batch fetching of missing blobs
clone: partial clone
partial-clone: define partial clone settings in config
fetch: support filters
fetch: refactor calculation of remote list
fetch-pack: test support excluding large blobs
fetch-pack: add --no-filter
fetch-pack, index-pack, transport: partial clone
upload-pack: add object filtering for partial clone
In preparation for implementing narrow/partial clone, the machinery
for checking object connectivity used by gc and fsck has been
taught that a missing object is OK when it is referenced by a
packfile specially marked as coming from trusted repository that
promises to make them available on-demand and lazily.
* jh/fsck-promisors:
gc: do not repack promisor packfiles
rev-list: support termination at promisor objects
sha1_file: support lazily fetching missing objects
introduce fetch-object: fetch one promisor object
index-pack: refactor writing of .keep files
fsck: support promisor objects as CLI argument
fsck: support referenced promisor objects
fsck: support refs pointing to promisor objects
fsck: introduce partialclone extension
extension.partialclone: introduce partial clone extension
Move the definition of the transport-specific functions provided by
transports, whether declared in transport.c or transport-helper.c, into
an internal header. This means that transport-using code (as opposed to
transport-declaring code) can no longer access these functions (without
importing the internal header themselves), making it clear that they
should use the transport_*() functions instead, and also allowing the
interface between the transport mechanism and an individual transport to
independently evolve.
This is superficially a reversal of commit 824d5776c3 ("Refactor
struct transport_ops inlined into struct transport", 2007-09-19).
However, the scope of the involved variables was neither affected nor
discussed in that commit, and I think that the advantages in making
those functions more private outweigh the advantages described in that
commit's commit message. A minor additional point is that the code has
gotten more complicated since then, in that the function-pointer
variables are potentially mutated twice (once initially and once if
transport_take_over() is invoked), increasing the value of corralling
them into their own struct.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After commit 0d0bac67ce ("transport: drop support for git-over-rsync",
2016-02-01), no transport in Git populates the "push" entry in the
transport vtable. Remove this entry.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce fetch-object, providing the ability to fetch one object from a
promisor remote.
This uses fetch-pack. To do this, the transport mechanism has been
updated with 2 flags, "from-promisor" to indicate that the resulting
pack comes from a promisor remote (and thus should be annotated as such
by index-pack), and "no-dependents" to indicate that only the objects
themselves need to be fetched (but fetching additional objects is
nevertheless safe).
Whenever "no-dependents" is used, fetch-pack will refrain from using any
object flags, because it is most likely invoked as part of a dynamic
object fetch by another Git command (which may itself use object flags).
An alternative to this is to leave fetch-pack alone, and instead update
the allocation of flags so that fetch-pack's flags never overlap with
any others, but this will end up shrinking the number of flags available
to nearly every other Git command (that is, every Git command that
accesses objects), so the approach in this commit was used instead.
This will be tested in a subsequent commit.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert update_ref, refs_update_ref, and write_pseudoref to use struct
object_id. Update the existing callers as well. Remove update_ref_oid,
as it is no longer needed.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows us to get rid of several write-only variables.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This allows us to get rid of some write-only variables, among them seven
SHA1 buffers.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.
* ab/free-and-null:
*.[ch] refactoring: make use of the FREE_AND_NULL() macro
coccinelle: make use of the "expression" FREE_AND_NULL() rule
coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
coccinelle: make use of the "type" FREE_AND_NULL() rule
coccinelle: add a rule to make "type" code use FREE_AND_NULL()
git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.
* bw/config-h:
config: don't implicitly use gitdir or commondir
config: respect commondir
setup: teach discover_git_directory to respect the commondir
config: don't include config.h by default
config: remove git_config_iter
config: create config.h
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the bundle code, plus the sole external user of struct
ref_list_entry, to use struct object_id. Include cache.h from within
bundle.h to provide the definition. Convert some of the hash parsing
code to use parse_oid_hex to avoid needing to hard-code constant values.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git push --recurse-submodules --push-option=<string>" learned to
propagate the push option recursively down to pushes in submodules.
* bw/push-options-recursively-to-submodules:
push: propagate remote and refspec with --recurse-submodules
submodule--helper: add push-check subcommand
remote: expose parse_push_refspec function
push: propagate push-options with --recurse-submodules
push: unmark a local variable as static
Teach "push --recurse-submodules" to propagate, if given a name as remote, the
provided remote and refspec recursively to the pushes performed in the
submodules. The push will therefore only succeed if all submodules have a
remote with such a name configured.
Note that "push --recurse-submodules" with a path or URL as remote will not
propagate the remote or refspec and instead use the default remote and refspec
configured in the submodule, preserving the current behavior.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach push --recurse-submodules to propagate push-options recursively to
the pushes performed in the submodules.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since this structure handles an array of object IDs, rename it to struct
oid_array. Also rename the accessor functions and the initialization
constant.
This commit was produced mechanically by providing non-Documentation
files to the following Perl one-liners:
perl -pi -E 's/struct sha1_array/struct oid_array/g'
perl -pi -E 's/\bsha1_array_/oid_array_/g'
perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g'
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the callers to pass struct object_id by changing the function
declaration and definition and applying the following semantic patch:
@@
expression E1, E2;
@@
- sha1_array_append(E1, E2.hash)
+ sha1_array_append(E1, &E2)
@@
expression E1, E2;
@@
- sha1_array_append(E1, E2->hash)
+ sha1_array_append(E1, E2)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since we will likely be introducing a new hash function at some point,
and that hash function might be longer than 40 hex characters, use the
constant GIT_MAX_HEXSZ, which is designed to be suitable for
allocations, instead of GIT_SHA1_HEXSZ. This will ease the transition
down the line by distinguishing between places where we need to allocate
memory suitable for the largest hash from those where we need to handle
the current hash.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch" that requests a commit by object name, when the other
side does not allow such an request, failed without much
explanation.
* mm/fetch-show-error-message-on-unadvertised-object:
fetch-pack: add specific error for fetching an unadvertised object
fetch_refs_via_pack: call report_unmatched_refs
fetch-pack: move code to report unmatched refs to a function
"git fetch" currently doesn't bother to check that it got all refs it
sought, because the common case of requesting a nonexistent ref triggers
a die() in get_fetch_map. However, there's at least one case that
slipped through: "git fetch REMOTE SHA1" if the server doesn't allow
requests for unadvertised objects. Make fetch_refs_via_pack (which is
on the "git fetch" code path) call report_unmatched_refs so that we at
least get an error message in that case.
Signed-off-by: Matt McCutchen <matt@mattmccutchen.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git update-ref -d" and other operations to delete references did
not leave any entry in HEAD's reflog when the reference being
deleted was the current branch. This is not a problem in practice
because you do not want to delete the branch you are currently on,
but caused renaming of the current branch to something else not to
be logged in a useful way.
* km/delete-ref-reflog-message:
branch: record creation of renamed branch in HEAD's log
rename_ref: replace empty message in HEAD's log
update-ref: pass reflog message to delete_ref()
delete_ref: accept a reflog message argument
There are a few leaf functions in various files that call
resolve_refdup. Convert these functions to use struct object_id
internally to prepare for transitioning resolve_refdup itself.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the current branch is renamed with 'git branch -m/-M' or deleted
with 'git update-ref -m<msg> -d', the event is recorded in HEAD's log
with an empty message. In preparation for adding a more meaningful
message to HEAD's log in these cases, update delete_ref() to take a
message argument and pass it along to ref_transaction_delete().
Modify all callers to pass NULL for the new message argument; no
change in behavior is intended.
Note that this is relevant for HEAD's log but not for the deleted
ref's log, which is currently deleted along with the ref. Even if it
were not, an entry for the deletion wouldn't be present in the deleted
ref's log. files_transaction_commit() writes to the log if
REF_NEEDS_COMMIT or REF_LOG_ONLY are set, but lock_ref_for_update()
doesn't set REF_NEEDS_COMMIT for the deleted ref because REF_DELETING
is set. In contrast, the update for HEAD has REF_LOG_ONLY set by
split_head_update(), resulting in the deletion being logged.
Signed-off-by: Kyle Meyer <kyle@kyleam.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current method for getting the refs from an alternate is
to run upload-pack in the alternate and parse its output
using the normal transport code. This works and is
reasonably short, but it has a very bad memory footprint
when there are a lot of refs in the alternate. There are two
problems:
1. It reads in all of the refs before passing any back to
us. Which means that our peak memory usage has to store
every ref (including duplicates for peeled variants),
even if our callback could determine that some are not
interesting (e.g., because they point to the same sha1
as another ref).
2. It allocates a "struct ref" for each one. Among other
things, this contains 3 separate 20-byte oids, along
with the name and various pointers. That can add up,
especially if the callback is only interested in the
sha1 (which it can store in a sha1_array as just 20
bytes).
On a particularly pathological case, where the alternate had
over 80 million refs pointing to only around 60,000 unique
objects, the peak heap usage of "git clone --reference" grew
to over 25GB.
This patch instead calls git-for-each-ref in the alternate
repository, and passes each line to the callback as we read
it. That drops the peak heap of the same command to 50MB.
I considered and rejected a few alternatives.
We could read all of the refs in the alternate using our own
ref code, just as we do with submodules. However, as memory
footprint is one of the concerns here, we want to avoid
loading those refs into our own memory as a whole.
It's possible that this will be a better technique in the
future when the ref code can more easily iterate without
loading all of packed-refs into memory.
Another option is to keep calling upload-pack, and just
parse its output ourselves in a streaming fashion. Besides
for-each-ref being simpler (we get to define the format
ourselves, and don't have to deal with speaking the git
protocol), it's more flexible for possible future changes.
For instance, it might be useful for the caller to be able
to limit the set of "interesting" alternate refs. The
motivating example is one where many "forks" of a particular
repository share object storage, and the shared storage has
refs for each fork (which is why so many of the refs are
duplicates; each fork has the same tags). A plausible
future optimization would be to ask for the alternate refs
for just _one_ fork (if you had some out-of-band way of
knowing which was the most interesting or important for the
current operation).
Similarly, no callbacks actually care about the symref value
of alternate refs, and as before, this patch ignores them
entirely. However, if we wanted to add them, for-each-ref's
"%(symref)" is going to be more flexible than upload-pack,
because the latter only handles the HEAD symref due to
historical constraints.
There is one potential downside, though: unlike upload-pack,
our for-each-ref command doesn't report the peeled value of
refs. The existing code calls the alternate_ref_fn callback
twice for tags: once for the tag, and once for the peeled
value with the refname set to "ref^{}".
For the callers in fetch-pack, this doesn't matter at all.
We immediately peel each tag down to a commit either way (so
there's a slight improvement, as do not bother passing the
redundant data over the pipe). For the caller in
receive-pack, it means we will not advertise the peeled
values of tags in our alternate. However, we also don't
advertise peeled values for our _own_ tags, so this is
actually making things more consistent.
It's unclear whether receive-pack advertising peeled values
is a win or not. On one hand, giving more information to the
other side may let it omit some objects from the push. On
the other hand, for tags which both sides have, they simply
bloat the advertisement. The upload-pack advertisement of
git.git is about 30% larger than the receive-pack
advertisement due to its peeled information.
This patch omits the peeled information from
for_each_alternate_ref entirely, and leaves it up to the
caller whether they want to dig up the information.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Breaking down the fields in the interface makes it easier to
change the backend of for_each_alternate_ref to something
that doesn't use "struct ref" internally.
The only field that callers actually look at is the oid,
anyway. The refname is kept in the interface as a plausible
thing for future code to want.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a string with ".../objects" pointing to the
alternate object store, and overwrite bits of it to look at
other paths in the (potential) git repository holding it.
This works because the only path we care about is "refs",
which is shorter than "objects".
Using a strbuf to hold the path lets us get rid of some
magic numbers, and makes it more obvious that the memory
operations are safe.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The real_pathdup() function will have removed extra slashes
for us already (on top of the normalize_path() done when we
created the alternate_object_database struct in the first
place).
Incidentally, this also fixes the case where the path is
just "/", which would read off the start of the array.
That doesn't seem possible to trigger in practice, though,
as link_alt_odb_entry() blindly eats trailing slashes,
including a bare "/".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In older versions of git, if real_path() failed to resolve
the alternate object store path, we would die() with an
error. However, since 4ac9006f8 (real_path: have callers use
real_pathdup and strbuf_realpath, 2016-12-12) we use the
real_pathdup() function, which may return NULL. Since we
don't check the return value, we can segfault.
This is hard to trigger in practice, since we check that the
path is accessible before creating the alternate_object_database
struct. But it could be removed racily, or we could see a
transient filesystem error.
We could restore the original behavior by switching back to
xstrdup(real_path()). However, dying is probably not the
best option here. This whole function is best-effort
already; there might not even be a repository around the
shared objects at all. And if the alternate store has gone
away, there are no objects to show.
So let's just quietly return, as we would if we failed to
open "refs/", or if upload-pack failed to start, etc.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git submodule push" learned "--recurse-submodules=only option to
push submodules out without pushing the top-level superproject.
* bw/push-submodule-only:
push: add option to push only submodules
submodules: add RECURSE_SUBMODULES_ONLY value
transport: reformat flag #defines to be more readable
"git grep" has been taught to optionally recurse into submodules.
* bw/grep-recurse-submodules:
grep: search history of moved submodules
grep: enable recurse-submodules to work on <tree> objects
grep: optionally recurse into submodules
grep: add submodules as a grep source type
submodules: load gitmodules file from commit sha1
submodules: add helper to determine if a submodule is initialized
submodules: add helper to determine if a submodule is populated
real_path: canonicalize directory separators in root parts
real_path: have callers use real_pathdup and strbuf_realpath
real_path: create real_pathdup
real_path: convert real_path_internal to strbuf_realpath
real_path: resolve symlinks by hand
"git push --dry-run --recurse-submodule=on-demand" wasn't
"--dry-run" in the submodules.
* bw/push-dry-run:
push: fix --dry-run to not push submodules
push: --dry-run updates submodules when --recurse-submodules=on-demand
The code in "git push" to compute if any commit being pushed in the
superproject binds a commit in a submodule that hasn't been pushed
out was overly inefficient, making it unusable even for a small
project that does not have any submodule but have a reasonable
number of refs.
* hv/submodule-not-yet-pushed-fix:
submodule_needs_pushing(): explain the behaviour when we cannot answer
batch check whether submodule needs pushing into one call
serialize collection of refs that contain submodule changes
serialize collection of changed submodules
Finer-grained control of what protocols are allowed for transports
during clone/fetch/push have been enabled via a new configuration
mechanism.
* bw/transport-protocol-policy:
http: respect protocol.*.allow=user for http-alternates
transport: add from_user parameter to is_transport_allowed
http: create function to get curl allowed protocols
transport: add protocol policy config option
http: always warn if libcurl version is too old
lib-proto-disable: variable name fix
Teach push the --recurse-submodules=only option. This enables push
to recursively push all unpushed submodules while leaving the
superproject unpushed.
This is a desirable feature in a scenario where updates to the
superproject are handled automatically by some other means, perhaps
a tool like Gerrit code review. In this scenario, a developer could
make a change which spans multiple submodules and then push their
commits for code review. Upon completion of the code review, their
commits can be accepted and applied to their respective submodules
while the code review tool can then automatically update the
superproject to the most recent SHA1 of each submodule. This would
reduce the merge conflicts in the superproject that could occur if
multiple people are contributing to the same submodule.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git push --dry-run --recurse-submodule=on-demand" wasn't
"--dry-run" in the submodules.
* bw/push-dry-run:
push: fix --dry-run to not push submodules
push: --dry-run updates submodules when --recurse-submodules=on-demand
The code in "git push" to compute if any commit being pushed in the
superproject binds a commit in a submodule that hasn't been pushed
out was overly inefficient, making it unusable even for a small
project that does not have any submodule but have a reasonable
number of refs.
* hv/submodule-not-yet-pushed-fix:
submodule_needs_pushing(): explain the behaviour when we cannot answer
batch check whether submodule needs pushing into one call
serialize collection of refs that contain submodule changes
serialize collection of changed submodules
Add a from_user parameter to is_transport_allowed() to allow http to be
able to distinguish between protocol restrictions for redirects versus
initial requests. CURLOPT_REDIR_PROTOCOLS can now be set differently
from CURLOPT_PROTOCOLS to disallow use of protocols with the "user"
policy in redirects.
This change allows callers to query if a transport protocol is allowed,
given that the caller knows that the protocol is coming from the user
(1) or not from the user (0) such as redirects in libcurl. If unknown a
-1 should be provided which falls back to reading
`GIT_PROTOCOL_FROM_USER` to determine if the protocol came from the
user.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously the `GIT_ALLOW_PROTOCOL` environment variable was used to
specify a whitelist of protocols to be used in clone/fetch/push
commands. This patch introduces new configuration options for more
fine-grained control for allowing/disallowing protocols. This also has
the added benefit of allowing easier construction of a protocol
whitelist on systems where setting an environment variable is
non-trivial.
Now users can specify a policy to be used for each type of protocol via
the 'protocol.<name>.allow' config option. A default policy for all
unconfigured protocols can be set with the 'protocol.allow' config
option. If no user configured default is made git will allow known-safe
protocols (http, https, git, ssh, file), disallow known-dangerous
protocols (ext), and have a default policy of `user` for all other
protocols.
The supported policies are `always`, `never`, and `user`. The `user`
policy can be used to configure a protocol to be usable when explicitly
used by a user, while disallowing it for commands which run
clone/fetch/push commands without direct user intervention (e.g.
recursive initialization of submodules). Commands which can potentially
clone/fetch/push from untrusted repositories without user intervention
can export `GIT_PROTOCOL_FROM_USER` with a value of '0' to prevent
protocols configured to the `user` policy from being used.
Fix remote-ext tests to use the new config to allow the ext
protocol to be tested.
Based on a patch by Jeff King <peff@peff.net>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Always warn if libcurl version is too old because:
1. Even without a protocol whitelist, newer versions of curl have all
non-standard protocols disabled by default.
2. A future patch will introduce default "known-good" and "known-bad"
protocols which are allowed/disallowed by 'is_transport_allowed'
which older version of libcurl can't respect.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Migrate callers of real_path() who duplicate the retern value to use
real_pathdup or strbuf_realpath.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach push to respect the --dry-run option when configured to
recursively push submodules 'on-demand'. This is done by passing the
--dry-run flag to the child process which performs a push for a
submodules when performing a dry-run.
In order to preserve good user experience, the additional check for
unpushed submodules is skipped during a dry-run when
--recurse-submodules=on-demand. The check is skipped because the submodule
pushes were performed as dry-runs and this check would always fail as the
submodules would still need to be pushed.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are iterating over each pushed ref and want to check whether it
contains changes to submodules. Instead of immediately checking each ref
lets first collect them and then do the check for all of them in one
revision walk.
Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git push" and "git fetch" reports from what old object to what new
object each ref was updated, using abbreviated refnames, and they
attempt to align the columns for this and other pieces of
information. The way these codepaths compute how many display
columns to allocate for the object names portion of this output has
been updated to match the recent "auto scale the default
abbreviation length" change.
* jc/abbrev-auto:
transport: compute summary-width dynamically
transport: allow summary-width to be computed dynamically
fetch: pass summary_width down the callchain
transport: pass summary_width down the callchain
Now all that is left to do is to actually iterate over the refs
and measure the display width needed to show their abbreviation.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now we have identified three callchains that have a set of refs that
they want to show their <old, new> object names in an aligned output,
we can replace their reference to the constant TRANSPORT_SUMMARY_WIDTH
with a helper function call to transport_summary_width() that takes
the set of ref as a parameter. This step does not yet iterate over
the refs and compute, which is left as an exercise to the readers.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The callchain that originates at transport_print_push_status()
eventually hits a single leaf function, print_ref_status(), that is
used to show from what old object to what new object a ref got
updated, and the width of the part that shows old and new object
names used a constant TRANSPORT_SUMMARY_WIDTH.
Teach the callchain to pass the width down from the top instead.
This allows a future enhancement to compute the necessary display
width before calling down this callchain.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Codepaths involved in interacting alternate object store have
been cleaned up.
* jk/alt-odb-cleanup:
alternates: use fspathcmp to detect duplicates
sha1_file: always allow relative paths to alternates
count-objects: report alternates via verbose mode
fill_sha1_file: write into a strbuf
alternates: store scratch buffer as strbuf
fill_sha1_file: write "boring" characters
alternates: use a separate scratch space
alternates: encapsulate alt->base munging
alternates: provide helper for allocating alternate
alternates: provide helper for adding to alternates list
link_alt_odb_entry: refactor string handling
link_alt_odb_entry: handle normalize_path errors
t5613: clarify "too deep" recursion tests
t5613: do not chdir in main process
t5613: whitespace/style cleanups
t5613: use test_must_fail
t5613: drop test_valid_repo function
t5613: drop reachable_via function
The existing "git fetch --depth=<n>" option was hard to use
correctly when making the history of an existing shallow clone
deeper. A new option, "--deepen=<n>", has been added to make this
easier to use. "git clone" also learned "--shallow-since=<date>"
and "--shallow-exclude=<tag>" options to make it easier to specify
"I am interested only in the recent N months worth of history" and
"Give me only the history since that version".
* nd/shallow-deepen: (27 commits)
fetch, upload-pack: --deepen=N extends shallow boundary by N commits
upload-pack: add get_reachable_list()
upload-pack: split check_unreachable() in two, prep for get_reachable_list()
t5500, t5539: tests for shallow depth excluding a ref
clone: define shallow clone boundary with --shallow-exclude
fetch: define shallow boundary with --shallow-exclude
upload-pack: support define shallow boundary by excluding revisions
refs: add expand_ref()
t5500, t5539: tests for shallow depth since a specific date
clone: define shallow clone boundary based on time with --shallow-since
fetch: define shallow boundary with --shallow-since
upload-pack: add deepen-since to cut shallow repos based on time
shallow.c: implement a generic shallow boundary finder based on rev-list
fetch-pack: use a separate flag for fetch in deepening mode
fetch-pack.c: mark strings for translating
fetch-pack: use a common function for verbose printing
fetch-pack: use skip_prefix() instead of starts_with()
upload-pack: move rev-list code out of check_non_tip()
upload-pack: make check_non_tip() clean things up on error
upload-pack: tighten number parsing at "deepen" lines
...
The alternate_object_database struct uses a single buffer
both for storing the path to the alternate, and as a scratch
buffer for forming object names. This is efficient (since
otherwise we'd end up storing the path twice), but it makes
life hard for callers who just want to know the path to the
alternate. They have to remember to stop reading after
"alt->name - alt->base" bytes, and to subtract one for the
trailing '/'.
It would be much simpler if they could simply access a
NUL-terminated path string. We could encapsulate this in a
function which puts a NUL in the scratch buffer and returns
the string, but that opens up questions about the lifetime
of the result. The first time another caller uses the
alternate, the scratch buffer may get other data tacked onto
it.
Let's instead just store the root path separately from the
scratch buffer. There aren't enough alternates being stored
for the duplicated data to matter for performance, and this
keeps things simple and safe for the callers.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The surrounding advice is printed to stderr, but the list of submodules
is not. Make the report consistent by reporting everything to stderr.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git am -3" calls "git merge-recursive" when it needs to fall back
to a three-way merge; this call has been turned into an internal
subroutine call instead of spawning a separate subprocess.
* js/am-3-merge-recursive-direct:
merge-recursive: flush output buffer even when erroring out
merge_trees(): ensure that the callers release output buffer
merge-recursive: offer an option to retain the output in 'obuf'
merge-recursive: write the commit title in one go
merge-recursive: flush output buffer before printing error messages
am -3: use merge_recursive() directly again
merge-recursive: switch to returning errors instead of dying
merge-recursive: handle return values indicating errors
merge-recursive: allow write_tree_from_memory() to error out
merge-recursive: avoid returning a wholesale struct
merge_recursive: abort properly upon errors
prepare the builtins for a libified merge_recursive()
merge-recursive: clarify code in was_tracked()
die(_("BUG")): avoid translating bug messages
die("bug"): report bugs consistently
t5520: verify that `pull --rebase` shows the helpful advice when failing
"git fetch http://user:pass@host/repo..." scrubbed the userinfo
part, but "git push" didn't.
* jk/push-scrub-url:
t5541: fix url scrubbing test when GPG is not set
push: anonymize URL in status output
Call strbuf_add_unique_abbrev() to add abbreviated hashes to strbufs
instead of taking detours through find_unique_abbrev() and its static
buffer. This is shorter and a bit more efficient.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git push" learned to accept and pass extra options to the
receiving end so that hooks can read and react to them.
* sb/push-options:
add a test for push options
push: accept push options
receive-pack: implement advertising and receiving push options
push options: {pre,post}-receive hook learns about push options
The vast majority of error messages in Git's source code which report a
bug use the convention to prefix the message with "BUG:".
As part of cleaning up merge-recursive to stop die()ing except in case of
detected bugs, let's just make the remainder of the bug reports consistent
with the de facto rule.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch http://user:pass@host/repo..." scrubbed the userinfo
part, but "git push" didn't.
* jk/push-scrub-url:
t5541: fix url scrubbing test when GPG is not set
push: anonymize URL in status output
This implements everything that is required on the client side to make use
of push options from the porcelain push command.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 47abd85 (fetch: Strip usernames from url's before
storing them, 2009-04-17) taught fetch to anonymize URLs.
The primary purpose there was to avoid sticking passwords in
merge-commit messages, but as a side effect, we also avoid
printing them to stderr.
The push side does not have the merge-commit problem, but it
probably should avoid printing them to stderr. We can reuse
the same anonymizing function.
Note that for this to come up, the credentials would have to
appear either on the command line or in a git config file,
neither of which is particularly secure. So people _should_
be switching to using credential helpers instead, which
makes this problem go away. But that's no excuse not to
improve the situation for people who for whatever reason end
up using credentials embedded in the URL.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark one printf string and one error string for translation.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In git-fetch, --depth argument is always relative with the latest
remote refs. This makes it a bit difficult to cover this use case,
where the user wants to make the shallow history, say 3 levels
deeper. It would work if remote refs have not moved yet, but nobody
can guarantee that, especially when that use case is performed a
couple months after the last clone or "git fetch --depth". Also,
modifying shallow boundary using --depth does not work well with
clones created by --since or --not.
This patch fixes that. A new argument --deepen=<N> will add <N> more (*)
parent commits to the current history regardless of where remote refs
are.
Have/Want negotiation is still respected. So if remote refs move, the
server will send two chunks: one between "have" and "want" and another
to extend shallow history. In theory, the client could send no "want"s
in order to get the second chunk only. But the protocol does not allow
that. Either you send no want lines, which means ls-remote; or you
have to send at least one want line that carries deep-relative to the
server..
The main work was done by Dongcan Jiang. I fixed it up here and there.
And of course all the bugs belong to me.
(*) We could even support --deepen=<N> where <N> is negative. In that
case we can cut some history from the shallow clone. This operation
(and --depth=<shorter depth>) does not require interaction with remote
side (and more complicated to implement as a result).
Helped-by: Duy Nguyen <pclouds@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The two alternative ways to spell "ssh://" transport have been
deprecated for a long time. The last mention of them has finally
removed from the documentation.
* cn/deprecate-ssh-git-url:
Disown ssh+git and git+ssh
Some people argue that these were silly from the beginning (see
http://thread.gmane.org/gmane.comp.version-control.git/285590/focus=285601
for example), but we have to support them for compatibility.
That doesn't mean we have to show them in the documentation. These
were already left out of the main list, but a reference in the main
manpage was left, so remove that.
Also add a note to discourage their use if anybody goes looking for them
in the source code.
Signed-off-by: Carlos Martín Nieto <cmn@dwim.me>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update various codepaths to avoid manually-counted malloc().
* jk/tighten-alloc: (22 commits)
ewah: convert to REALLOC_ARRAY, etc
convert ewah/bitmap code to use xmalloc
diff_populate_gitlink: use a strbuf
transport_anonymize_url: use xstrfmt
git-compat-util: drop mempcpy compat code
sequencer: simplify memory allocation of get_message
test-path-utils: fix normalize_path_copy output buffer size
fetch-pack: simplify add_sought_entry
fast-import: simplify allocation in start_packfile
write_untracked_extension: use FLEX_ALLOC helper
prepare_{git,shell}_cmd: use argv_array
use st_add and st_mult for allocation size computation
convert trivial cases to FLEX_ARRAY macros
use xmallocz to avoid size arithmetic
convert trivial cases to ALLOC_ARRAY
convert manual allocations to argv_array
argv-array: add detach function
add helpers for allocating flex-array structs
harden REALLOC_ARRAY and xcalloc against size_t overflow
tree-diff: catch integer overflow in combine_diff_path allocation
...
"git fetch" and friends that make network connections can now be
told to only use ipv4 (or ipv6).
* ew/force-ipv4:
connect & http: support -4 and -6 switches for remote operations
This function uses xcalloc and two memcpy calls to
concatenate two strings. We can do this as an xstrfmt
one-liner, and then it is more clear that we are allocating
the correct amount of memory.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:
1. It automatically checks the array-size multiplication
for overflow.
2. It always uses sizeof(*array) for the element-size,
so that it can never go out of sync with the declared
type of the array.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It turns out "git clone" over rsync transport has been broken when
the source repository has packed references for a long time, and
nobody noticed nor complained about it.
* jk/drop-rsync-transport:
transport: drop support for git-over-rsync
Sometimes it is necessary to force IPv4-only or IPv6-only operation
on networks where name lookups may return a non-routable address and
stall remote operations.
The ssh(1) command has an equivalent switches which we may pass when
we run them. There may be old ssh(1) implementations out there
which do not support these switches; they should report the
appropriate error in that case.
rsync support is untouched for now since it is deprecated and
scheduled to be removed.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Reviewed-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-over-rsync protocol is inefficient and broken, and
has been for a long time. It transfers way more objects than
it needs (grabbing all of the remote's "objects/",
regardless of which objects we need). It does its own ad-hoc
parsing of loose and packed refs from the remote, but
doesn't properly override packed refs with loose ones,
leading to garbage results (e.g., expecting the other side
to have an object pointed to by a stale packed-refs entry,
or complaining that the other side has two copies of the
refs[1]).
This latter breakage means that nobody could have
successfully pulled from a moderately active repository
since cd547b4 (fetch/push: readd rsync support, 2007-10-01).
We never made an official deprecation notice in the release
notes for git's rsync protocol, but the tutorial has marked
it as such since 914328a (Update tutorial., 2005-08-30).
And on the mailing list as far back as Oct 2005, we can find
Junio mentioning it as having "been deprecated for quite
some time."[2,3,4]. So it was old news then; cogito had
deprecated the transport in July of 2005[5] (though it did
come back briefly when Linus broke git-http-pull!).
Of course some people professed their love of rsync through
2006, but Linus clarified in his usual gentle manner[6]:
> Thanks! This is why I still use rsync, even though
> everybody and their mother tells me "Linus says rsync is
> deprecated."
No. You're using rsync because you're actively doing
something _wrong_.
The deprecation sentiment was reinforced in 2008, with a
mention that cloning via rsync is broken (with no fix)[7].
Even the commit porting rsync over to C from shell (cd547b4)
lists it as deprecated! So between the 10 years of informal
warnings, and the fact that it has been severely broken
since 2007, it's probably safe to simply remove it without
further deprecation warnings.
[1] http://article.gmane.org/gmane.comp.version-control.git/285101
[2] http://article.gmane.org/gmane.comp.version-control.git/10093
[3] http://article.gmane.org/gmane.comp.version-control.git/17734
[4] http://article.gmane.org/gmane.comp.version-control.git/18911
[5] http://article.gmane.org/gmane.comp.version-control.git/5617
[6] http://article.gmane.org/gmane.comp.version-control.git/19354
[7] http://article.gmane.org/gmane.comp.version-control.git/103635
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
While working in connect.c to perform non-blocking connections,
I noticed calling "git fetch -v" was not causing the progress
messages inside git_tcp_connect_sock to be emitted as I
expected.
Looking at history, it seems connect_setup has never been called
with the verbose parameter. Since transport already has a
"verbose" field, use that field instead of another parameter
in connect_setup.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
More transition from "unsigned char[40]" to "struct object_id".
This needed a few merge fixups, but is mostly disentangled from other
topics.
* bc/object-id:
remote: convert functions to struct object_id
Remove get_object_hash.
Convert struct object to object_id
Add several uses of get_object_hash.
object: introduce get_object_hash macro.
ref_newer: convert to use struct object_id
push_refs_with_export: convert to struct object_id
get_remote_heads: convert to struct object_id
parse_fetch: convert to use struct object_id
add_sought_entry_mem: convert to struct object_id
Convert struct ref to use object_id.
sha1_file: introduce has_object_file helper.
* maint:
http: treat config options sslCAPath and sslCAInfo as paths
Documentation/diff: give --word-diff-regex=. example
filter-branch: deal with object name vs. pathname ambiguity in tree-filter
check-ignore: correct documentation about output
git-p4: clean up after p4 submit failure
git-p4: work with a detached head
git-p4: add option to system() to return subshell status
git-p4: add failing test for submit from detached head
remote-http(s): support SOCKS proxies
t5813: avoid creating urls that break on cygwin
Escape Git's exec path in contrib/rerere-train.sh script
allow hooks to ignore their standard input stream
rebase-i-exec: Allow space in SHELL_PATH
Documentation: make environment variable formatting more consistent
Use struct object_id in three fields in struct ref and convert all the
necessary places that use it.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
Since ec7dbd145 (receive-pack: allow hooks to ignore its
standard input stream) the pre-receive and post-receive
hooks ignore SIGPIPE. Do the same for the remaining hooks
pre-push and post-rewrite, which read from standard input.
The same arguments for ignoring SIGPIPE apply.
Include test by Jeff King which checks that SIGPIPE does not
cause pre-push hook failure. With the use of git update-ref
--stdin it is fast enough to be enabled by default.
Signed-off-by: Clemens Buchacher <clemens.buchacher@intel.com>
Signed-off-by: Jeff King <peff@peff.net>
Many allocations that is manually counted (correctly) that are
followed by strcpy/sprintf have been replaced with a less error
prone constructs such as xstrfmt.
Macintosh-specific breakage was noticed and corrected in this
reroll.
* jk/war-on-sprintf: (70 commits)
name-rev: use strip_suffix to avoid magic numbers
use strbuf_complete to conditionally append slash
fsck: use for_each_loose_file_in_objdir
Makefile: drop D_INO_IN_DIRENT build knob
fsck: drop inode-sorting code
convert strncpy to memcpy
notes: document length of fanout path with a constant
color: add color_set helper for copying raw colors
prefer memcpy to strcpy
help: clean up kfmclient munging
receive-pack: simplify keep_arg computation
avoid sprintf and strcpy with flex arrays
use alloc_ref rather than hand-allocating "struct ref"
color: add overflow checks for parsing colors
drop strcpy in favor of raw sha1_to_hex
use sha1_to_hex_r() instead of strcpy
daemon: use cld->env_array when re-spawning
stat_tracking_info: convert to argv_array
http-push: use an argv_array for setup_revisions
fetch-pack: use argv_array for index-pack / unpack-objects
...
We generate range strings like "1234abcd...5678efab" for use
in the the fetch and push status tables. We use fixed-size
buffers along with strcat to do so. These aren't buggy, as
our manual size computation is correct, but there's nothing
checking that this is so. Let's switch them to strbufs
instead, which are obviously correct, and make it easier to
audit the code base for problematic calls to strcat().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current callers only want to die when their transport is
prohibited. But future callers want to query the mechanism
without dying.
Let's break out a few query functions, and also save the
results in a static list so we don't have to re-parse for
each query.
Based-on-a-patch-by: Blake Burkhart <bburky@bburky.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we are cloning an untrusted remote repository into a
sandbox, we may also want to fetch remote submodules in
order to get the complete view as intended by the other
side. However, that opens us up to attacks where a malicious
user gets us to clone something they would not otherwise
have access to (this is not necessarily a problem by itself,
but we may then act on the cloned contents in a way that
exposes them to the attacker).
Ideally such a setup would sandbox git entirely away from
high-value items, but this is not always practical or easy
to set up (e.g., OS network controls may block multiple
protocols, and we would want to enable some but not others).
We can help this case by providing a way to restrict
particular protocols. We use a whitelist in the environment.
This is more annoying to set up than a blacklist, but
defaults to safety if the set of protocols git supports
grows). If no whitelist is specified, we continue to default
to allowing all protocols (this is an "unsafe" default, but
since the minority of users will want this sandboxing
effect, it is the only sensible one).
A note on the tests: ideally these would all be in a single
test file, but the git-daemon and httpd test infrastructure
is an all-or-nothing proposition rather than a test-by-test
prerequisite. By putting them all together, we would be
unable to test the file-local code on machines without
apache.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The client side codepaths in "git push" have been cleaned up
and the user can request to perform an optional "signed push",
i.e. sign only when the other end accepts signed push.
* db/push-sign-if-asked:
push: add a config option push.gpgSign for default signed pushes
push: support signing pushes iff the server supports it
builtin/send-pack.c: use parse_options API
config.c: rename git_config_maybe_bool_text and export it as git_parse_maybe_bool
transport: remove git_transport_options.push_cert
gitremote-helpers.txt: document pushcert option
Documentation/git-send-pack.txt: document --signed
Documentation/git-send-pack.txt: wrap long synopsis line
Documentation/git-push.txt: document when --signed may fail
All of the callsites covered by this change call write_file() or
write_file_gently() to create a one-liner file. Drop the caller
supplied LF and let these callees to append it as necessary.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All callers except three passed 1 for the "fatal" parameter to ask
this function to die upon error, but to a casual reader of the code,
it was not all obvious what that 1 meant. Instead, split the
function into two based on a common write_file_v() that takes the
flag, introduce write_file_gently() as a new way to attempt creating
a file without dying on error, and make three callers to call it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new flag --sign=true (or --sign=false), which means the same
thing as the original --signed (or --no-signed). Give it a third
value --sign=if-asked to tell push and send-pack to send a push
certificate if and only if the server advertised a push cert nonce.
If not, warn the user that their push may not be as secure as they
thought.
Signed-off-by: Dave Borowitz <dborowitz@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>