1
0
Fork 0
mirror of https://github.com/git/git.git synced 2024-10-28 04:49:43 +01:00
Commit graph

551 commits

Author SHA1 Message Date
Xing Xin
094f78a16a transport.c:🤝 make use of server options from remote
Utilize the `server_options` from the corresponding remote during the
handshake in `transport.c` when Git protocol v2 is detected. This helps
initialize the `server_options` in `transport.h:transport` if no server
options are set for the transport (typically via `--server-option` or
`-o`).

While another potential place to incorporate server options from the
remote is in `transport.c:transport_get`, setting server options for a
transport using a protocol other than v2 could lead to unexpected errors
(see `transport.c:die_if_server_options`).

Relevant tests and documentation have been updated accordingly.

Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08 10:22:08 -07:00
Xing Xin
06708ce180 transport: introduce parse_transport_option() method
Add the `parse_transport_option()` method to parse the `push.pushOption`
configuration. This method will also be used in the next commit to
handle the new `remote.<name>.serverOption` configuration for setting
server options in Git protocol v2.

Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-10-08 10:22:06 -07:00
Patrick Steinhardt
6f54d00439 fetch-pack: fix leaking sought refs
When calling `fetch_pack()` the caller is expected to pass in a set of
sought-after refs that they want to fetch. This array gets massaged to
not contain duplicate entries, which is done by replacing duplicate refs
with `NULL` pointers. This modifies the caller-provided array, and in
case we do unset any pointers the caller now loses track of that ref and
cannot free it anymore.

Now the obvious fix would be to not only unset these pointers, but to
also free their contents. But this doesn't work because callers continue
to use those refs. Another potential solution would be to copy the array
in `fetch_pack()` so that we dont modify the caller-provided one. But
that doesn't work either because the NULL-ness of those entries is used
by callers to skip over ref entries that we didn't even try to fetch in
`report_unmatched_refs()`.

Instead, we make it the responsibility of our callers to duplicate these
arrays as needed. It ain't pretty, but it works to plug the memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 10:24:52 -07:00
Junio C Hamano
8c1c63d525 Merge branch 'ps/leakfixes-part-5'
Even more leak fixes.

* ps/leakfixes-part-5:
  transport: fix leaking negotiation tips
  transport: fix leaking arguments when fetching from bundle
  builtin/fetch: fix leaking transaction with `--atomic`
  remote: fix leaking peer ref when expanding refmap
  remote: fix leaks when matching refspecs
  remote: fix leaking config strings
  builtin/fetch-pack: fix leaking refs
  sideband: fix leaks when configuring sideband colors
  builtin/send-pack: fix leaking refspecs
  transport: fix leaking OID arrays in git:// transport data
  t/helper: fix leaking multi-pack-indices in "read-midx"
  builtin/repack: fix leaks when computing packs to repack
  midx-write: fix leaking hashfile on error cases
  builtin/archive: fix leaking `OPT_FILENAME()` value
  builtin/upload-archive: fix leaking args passed to `write_archive()`
  builtin/merge-tree: fix leaking `-X` strategy options
  pretty: fix leaking key/value separator buffer
  pretty: fix memory leaks when parsing pretty formats
  convert: fix leaks when resetting attributes
  mailinfo: fix leaking header data
2024-09-03 09:15:00 -07:00
Junio C Hamano
5e56a39e6a Merge branch 'ps/config-wo-the-repository'
Use of API functions that implicitly depend on the_repository
object in the config subsystem has been rewritten to pass a
repository object through the callchain.

* ps/config-wo-the-repository:
  config: hide functions using `the_repository` by default
  global: prepare for hiding away repo-less config functions
  config: don't depend on `the_repository` with branch conditions
  config: don't have setters depend on `the_repository`
  config: pass repo to functions that rename or copy sections
  config: pass repo to `git_die_config()`
  config: pass repo to `git_config_get_expiry_in_days()`
  config: pass repo to `git_config_get_expiry()`
  config: pass repo to `git_config_get_max_percent_split_change()`
  config: pass repo to `git_config_get_split_index()`
  config: pass repo to `git_config_get_index_threads()`
  config: expose `repo_config_clear()`
  config: introduce missing setters that take repo as parameter
  path: hide functions using `the_repository` by default
  path: stop relying on `the_repository` in `worktree_git_path()`
  path: stop relying on `the_repository` when reporting garbage
  hooks: remove implicit dependency on `the_repository`
  editor: do not rely on `the_repository` for interactive edits
  path: expose `do_git_common_path()` as `repo_common_pathv()`
  path: expose `do_git_path()` as `repo_git_pathv()`
2024-08-23 09:02:34 -07:00
Patrick Steinhardt
13b23d2da5 transport: fix leaking negotiation tips
We do not free negotiation tips in the transport's smart options. Fix
this by freeing them on disconnect.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
Patrick Steinhardt
7720460ccf transport: fix leaking arguments when fetching from bundle
In `fetch_refs_from_bundle()` we assemble a vector of arguments to pass
to `unbundle()`, but never free it. And in theory we wouldn't have to
because `unbundle()` already knows to free the vector for us. But it
fails to do so when it exits early due to `verify_bundle()` failing.

The calling convention that the arguments are freed by the callee and
not the caller feels somewhat weird. Refactor the code such that it is
instead the responsibility of the caller to free the vector, adapting
the only two callsites where we pass extra arguments. This also fixes
the memory leak.

This memory leak gets hit in t5510, but fixing it isn't sufficient to
make the whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:06 -07:00
Patrick Steinhardt
ca52234183 transport: fix leaking OID arrays in git:// transport data
The transport data for the "git://" protocol contains two OID arrays
that we never free, creating a memory leak. Plug them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-22 09:18:05 -07:00
Patrick Steinhardt
169c979771 hooks: remove implicit dependency on the_repository
We implicitly depend on `the_repository` in our hook subsystem because
we use `strbuf_git_path()` to compute hook paths. Remove this dependency
by accepting a `struct repository` as parameter instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-13 10:01:01 -07:00
Junio C Hamano
448d51d549 transport: fix leak with transport helper URLs
Transport URLs can be prefixed with "foo::", which would tell us that
the transport uses a remote helper called "foo". We extract the helper
name by `xstrndup()`ing the prefix before the double-colons, but never
free that string.

Fix this leak by assigning the result to a separate local variable that
we can then free upon returning.

Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-07 17:38:31 -07:00
Junio C Hamano
125e389470 Merge branch 'xx/bundie-uri-fixes'
When bundleURI interface fetches multiple bundles, Git failed to
take full advantage of all bundles and ended up slurping duplicated
objects.

* xx/bundie-uri-fixes:
  unbundle: extend object verification for fetches
  fetch-pack: expose fsckObjects configuration logic
  bundle-uri: verify oid before writing refs
2024-07-08 14:53:11 -07:00
Junio C Hamano
ca463101c8 Merge branch 'jk/remote-wo-url'
Memory ownership rules for the in-core representation of
remote.*.url configuration values have been straightened out, which
resulted in a few leak fixes and code clarification.

* jk/remote-wo-url:
  remote: drop checks for zero-url case
  remote: always require at least one url in a remote
  t5801: test remote.*.vcs config
  t5801: make remote-testgit GIT_DIR setup more robust
  remote: allow resetting url list
  config: document remote.*.url/pushurl interaction
  remote: simplify url/pushurl selection
  remote: use strvecs to store remote url/pushurl
  remote: transfer ownership of memory in add_url(), etc
  remote: refactor alias_url() memory ownership
  archive: fix check for missing url
2024-07-02 09:59:01 -07:00
Xing Xin
63d903ff52 unbundle: extend object verification for fetches
The existing fetch.fsckObjects and transfer.fsckObjects configurations
were not fully applied to bundle-involved fetches, including direct
bundle fetches and bundle-uri enabled fetches. Furthermore, there was no
object verification support for unbundle.

This commit extends object verification support in `bundle.c:unbundle`
by adding the `VERIFY_BUNDLE_FSCK` option to `verify_bundle_flags`. When
this option is enabled, we append the `--fsck-objects` flag to
`git-index-pack`.

The `VERIFY_BUNDLE_FSCK` option is now used by bundle-involved fetches,
where we use `fetch-pack.c:fetch_pack_fsck_objects` to determine whether
to enable this option for `bundle.c:unbundle`, specifically in:

- `transport.c:fetch_refs_from_bundle` for direct bundle fetches.
- `bundle-uri.c:unbundle_from_file` for bundle-uri enabled fetches.

This addition ensures a consistent logic for object verification during
fetches. Tests have been added to confirm functionality in the scenarios
mentioned above.

Reviewed-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-20 10:30:08 -07:00
Patrick Steinhardt
e7da938570 global: introduce USE_THE_REPOSITORY_VARIABLE macro
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.

It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.

Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from new dependencies on it in future changes,
be it explicit or implicit

For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).

Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "biultin.h" to keep the
required changes at least a little bit more contained.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:33 -07:00
Jeff King
aecd794fca remote: drop checks for zero-url case
Now that the previous commit removed the possibility that a "struct
remote" will ever have zero url fields, we can drop a number of
redundant checks and untriggerable code paths.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:39 -07:00
Jeff King
8e804415fd remote: use strvecs to store remote url/pushurl
Now that the url/pushurl fields of "struct remote" own their strings, we
can switch from bare arrays to strvecs. This has a few advantages:

  - push/clear are now one-liners

  - likewise the free+assigns in alias_all_urls() can use
    strvec_replace()

  - we now use size_t for storage, avoiding possible overflow

  - this will enable some further cleanups in future patches

There's quite a bit of fallout in the code that reads these fields, as
it tends to access these arrays directly. But it's mostly a mechanical
replacement of "url_nr" with "url.nr", and "url[i]" with "url.v[i]",
with a few variations (e.g. "*url" could become "*url.v", but I used
"url.v[0]" for consistency).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 09:34:38 -07:00
Patrick Steinhardt
97613b9cb9 transport-helper: fix leaking helper name
When initializing the transport helper in `transport_get()`, we
allocate the name of the helper. We neither end up transferring
ownership of the name, nor do we free it. The associated memory thus
leaks.

Fix this memory leak by freeing the string at the calling side in
`transport_get()`. `transport_helper_init()` now creates its own copy of
the string and thus can free it as required.

An alterantive way to fix this would be to transfer ownership of the
string passed into `transport_helper_init()`, which would avoid the call
to xstrdup(1). But it does make for a more surprising calling convention
as we do not typically transfer ownership of strings like this.

Mark now-passing tests as leak free.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-27 11:19:57 -07:00
Patrick Steinhardt
2e5c4758b7 cocci: apply rules to rewrite callers of "refs" interfaces
Apply the rules that rewrite callers of "refs" interfaces to explicitly
pass `struct ref_store`. The resulting patch has been applied with the
`--whitespace=fix` option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:59 -07:00
Junio C Hamano
951eafe36f Merge branch 'es/some-up-to-date-messages-must-stay'
Comment updates to help developers not to attempt to modify
messages from plumbing commands that must stay constant.

It might make sense to reassess the plumbing needs every few years,
but that should be done as a separate effort.

* es/some-up-to-date-messages-must-stay:
  messages: mark some strings with "up-to-date" not to touch
2024-01-26 08:54:45 -08:00
Junio C Hamano
80bdaba894 messages: mark some strings with "up-to-date" not to touch
The treewide clean-up of "up-to-date" strings done in 7560f547
(treewide: correct several "up-to-date" to "up to date", 2017-08-23)
deliberately left some out, but unlike the lines that were changed
by the commit, the lines that were deliberately left untouched by
the commit is impossible to ask "git blame" to link back to the
commit that did not touch them.

Let's do the second best thing, leave a short comment near them
explaining why those strings should not be modified or localized.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
[es: make in-code comment more developer-friendly]
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-12 10:20:01 -08:00
Elijah Newren
eea0e59ffb treewide: remove unnecessary includes in source files
Each of these were checked with
   gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).

...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file.  These cases were:
  * builtin/credential-cache.c
  * builtin/pull.c
  * builtin/send-pack.c

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:31 -08:00
Calvin Wan
91c080dff5 git-compat-util: move alloc macros to git-compat-util.h
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for
dynamic array allocation. Moving these macros to git-compat-util.h with
the other alloc macros focuses alloc.[ch] to allocation for Git objects
and additionally allows us to remove inclusions to alloc.h from files
that solely used the above macros.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:42:31 -07:00
Calvin Wan
da9502ff4d treewide: remove unnecessary includes for wrapper.h
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:41:59 -07:00
Elijah Newren
a034e9106f object-store-ll.h: split this header out of object-store.h
The vast majority of files including object-store.h did not need dir.h
nor khash.h.  Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.

After this patch:
    $ git grep -h include..object-store | sort | uniq -c
          2 #include "object-store.h"
        129 #include "object-store-ll.h"

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Junio C Hamano
80d268f309 Merge branch 'jk/protocol-cap-parse-fix'
The code to parse capability list for v0 on-wire protocol fell into
an infinite loop when a capability appears multiple times, which
has been corrected.

* jk/protocol-cap-parse-fix:
  v0 protocol: use size_t for capability length/offset
  t5512: test "ls-remote --heads --symref" filtering with v0 and v2
  t5512: allow any protocol version for filtered symref test
  t5512: add v2 support for "ls-remote --symref" test
  v0 protocol: fix sha1/sha256 confusion for capabilities^{}
  t5512: stop referring to "v1" protocol
  v0 protocol: fix infinite loop when parsing multi-valued capabilities
2023-04-25 13:56:20 -07:00
Junio C Hamano
0807e57807 Merge branch 'en/header-split-cache-h'
Header clean-up.

* en/header-split-cache-h: (24 commits)
  protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
  mailmap, quote: move declarations of global vars to correct unit
  treewide: reduce includes of cache.h in other headers
  treewide: remove double forward declaration of read_in_full
  cache.h: remove unnecessary includes
  treewide: remove cache.h inclusion due to pager.h changes
  pager.h: move declarations for pager.c functions from cache.h
  treewide: remove cache.h inclusion due to editor.h changes
  editor: move editor-related functions and declarations into common file
  treewide: remove cache.h inclusion due to object.h changes
  object.h: move some inline functions and defines from cache.h
  treewide: remove cache.h inclusion due to object-file.h changes
  object-file.h: move declarations for object-file.c functions from cache.h
  treewide: remove cache.h inclusion due to git-zlib changes
  git-zlib: move declarations for git-zlib functions from cache.h
  treewide: remove cache.h inclusion due to object-name.h changes
  object-name.h: move declarations for object-name.c functions from cache.h
  treewide: remove unnecessary cache.h inclusion
  treewide: be explicit about dependence on mem-pool.h
  treewide: be explicit about dependence on oid-array.h
  ...
2023-04-25 13:56:20 -07:00
Jeff King
7ce4c8f752 v0 protocol: use size_t for capability length/offset
When parsing server capabilities, we use "int" to store lengths and
offsets. At first glance this seems like a spot where our parser may be
confused by integer overflow if somebody sent us a malicious response.

In practice these strings are all bounded by the 64k limit of a
pkt-line, so using "int" is OK. However, it makes the code simpler to
audit if they just use size_t everywhere. Note that because we take
these parameters as pointers, this also forces many callers to update
their declared types.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-14 15:08:13 -07:00
Elijah Newren
d812c3b6a0 treewide: remove cache.h inclusion due to object.h changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
Elijah Newren
dabab1d6e6 object-name.h: move declarations for object-name.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Elijah Newren
6c6ddf92d5 treewide: be explicit about dependence on advice.h
Dozens of files made use of advice functions, without explicitly
including advice.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
advice.h if they are using it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Elijah Newren
74ea5c9574 treewide: be explicit about dependence on trace.h & trace2.h
Dozens of files made use of trace and trace2 functions, without
explicitly including trace.h or trace2.h.  This made it more difficult
to find which files could remove a dependence on cache.h.  Make C files
explicitly include trace.h or trace2.h if they are using them.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:08 -07:00
Junio C Hamano
6047b28eb7 Merge branch 'en/header-split-cleanup'
Split key function and data structure definitions out of cache.h to
new header files and adjust the users.

* en/header-split-cleanup:
  csum-file.h: remove unnecessary inclusion of cache.h
  write-or-die.h: move declarations for write-or-die.c functions from cache.h
  treewide: remove cache.h inclusion due to setup.h changes
  setup.h: move declarations for setup.c functions from cache.h
  treewide: remove cache.h inclusion due to environment.h changes
  environment.h: move declarations for environment.c functions from cache.h
  treewide: remove unnecessary includes of cache.h
  wrapper.h: move declarations for wrapper.c functions from cache.h
  path.h: move function declarations for path.c functions from cache.h
  cache.h: remove expand_user_path()
  abspath.h: move absolute path functions from cache.h
  environment: move comment_line_char from cache.h
  treewide: remove unnecessary cache.h inclusion from several sources
  treewide: remove unnecessary inclusion of gettext.h
  treewide: be explicit about dependence on gettext.h
  treewide: remove unnecessary cache.h inclusion from a few headers
2023-04-06 13:38:31 -07:00
Junio C Hamano
72871b198f Merge branch 'ab/remove-implicit-use-of-the-repository'
Code clean-up around the use of the_repository.

* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-06 13:38:30 -07:00
Junio C Hamano
9bc647a2d1 Merge branch 'jk/unused-post-2.40'
More "-Wunused-parameters" code clean-up.

* jk/unused-post-2.40:
  transport: mark unused parameters in fetch_refs_from_bundle()
  http: mark unused parameter in fill_active_slot() callbacks
  http: drop unused parameter from start_object_request()
  mailmap: drop debugging code
2023-04-06 13:38:28 -07:00
Junio C Hamano
e7dca80692 Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h
* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-04 08:25:52 -07:00
Junio C Hamano
f879501ad0 Merge branch 'jk/fix-proto-downgrade-to-v0'
Transports that do not support protocol v2 did not correctly fall
back to protocol v0 under certain conditions, which has been
corrected.

* jk/fix-proto-downgrade-to-v0:
  git_connect(): fix corner cases in downgrading v2 to v0
2023-03-28 10:51:52 -07:00
Ævar Arnfjörð Bjarmason
d850b7a545 cocci: apply the "cache.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"cache.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:36 -07:00
Elijah Newren
32a8f51061 environment.h: move declarations for environment.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren
d5ebb50dcb wrapper.h: move declarations for wrapper.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren
a6dc3d364c treewide: remove unnecessary cache.h inclusion from a few headers
Ever since a64215b6cd ("object.h: stop depending on cache.h; make
cache.h depend on object.h", 2023-02-24), we have a few headers that
could have replaced their include of cache.h with an include of
object.h.  Make that change now.

Some C files had to start including cache.h after this change (or some
smaller header it had brought in), because the C files were depending
on things from cache.h but were only formerly implicitly getting
cache.h through one of these headers being modified in this patch.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:50 -07:00
Jeff King
eaa0fd6584 git_connect(): fix corner cases in downgrading v2 to v0
There's code in git_connect() that checks whether we are doing a push
with protocol_v2, and if so, drops us to protocol_v0 (since we know
how to do v2 only for fetches). But it misses some corner cases:

  1. it checks the "prog" variable, which is actually the path to
     receive-pack on the remote side. By default this is just
     "git-receive-pack", but it could be an arbitrary string (like
     "/path/to/git receive-pack", etc). We'd accidentally stay in v2
     mode in this case.

  2. besides "receive-pack" and "upload-pack", there's one other value
     we'd expect: "upload-archive" for handling "git archive --remote".
     Like receive-pack, this doesn't understand v2, and should use the
     v0 protocol.

In practice, neither of these causes bugs in the real world so far. We
do send a "we understand v2" probe to the server, but since no server
implements v2 for anything but upload-pack, it's simply ignored. But
this would eventually become a problem if we do implement v2 for those
endpoints, as older clients would falsely claim to understand it,
leading to a server response they can't parse.

We can fix (1) by passing in both the program path and the "name" of the
operation. I treat the name as a string here, because that's the pattern
set in transport_connect(), which is one of our callers (we were simply
throwing away the "name" value there before).

We can fix (2) by allowing only known-v2 protocols ("upload-pack"),
rather than blocking unknown ones ("receive-pack" and "upload-archive").
That will mean whoever eventually implements v2 push will have to adjust
this list, but that's reasonable. We'll do the safe, conservative thing
(sticking to v0) by default, and anybody working on v2 will quickly
realize this spot needs to be updated.

The new tests cover the receive-pack and upload-archive cases above, and
re-confirm that we allow v2 with an arbitrary "--upload-pack" path (that
already worked before this patch, of course, but it would be an easy
thing to break if we flipped the allow/block logic without also handling
"name" separately).

Here are a few miscellaneous implementation notes, since I had to do a
little head-scratching to understand who calls what:

  - transport_connect() is called only for git-upload-archive. For
    non-http git remotes, that resolves to the virtual connect_git()
    function (which then calls git_connect(); confused yet?). So
    plumbing through "name" in connect_git() covers that.

  - for regular fetches and pushes, callers use higher-level functions
    like transport_fetch_refs(). For non-http git remotes, that means
    calling git_connect() under the hood via connect_setup(). And that
    uses the "for_push" flag to decide which name to use.

  - likewise, plumbing like fetch-pack and send-pack may call
    git_connect() directly; they each know which name to use.

  - for remote helpers (including http), we already have separate
    parameters for "name" and "exec" (another name for "prog"). In
    process_connect_service(), we feed the "name" to the helper via
    "connect" or "stateless-connect" directives.

    There's also a "servpath" option, which can be used to tell the
    helper about the "exec" path. But no helpers we implement support
    it! For http it would be useless anyway (no reasonable server
    implementation will allow you to send a shell command to run the
    server). In theory it would be useful for more obscure helpers like
    remote-ext, but even there it is not implemented.

    It's tempting to get rid of it simply to reduce confusion, but we
    have publicly documented it since it was added in fa8c097cc9
    (Support remote helpers implementing smart transports, 2009-12-09),
    so it's possible some helper in the wild is using it.

  - So for v2, helpers (again, including http) are mainly used via
    stateless-connect, driven by the main program. But they do still
    need to decide whether to do a v2 probe. And so there's similar
    logic in remote-curl.c's discover_refs() that looks for
    "git-receive-pack". But it's not buggy in the same way. Since it
    doesn't support servpath, it is always dealing with a "service"
    string like "git-receive-pack". And since it doesn't support
    straight "connect", it can't be used for "upload-archive".

    So we could leave that spot alone. But I've updated it here to match
    the logic we're changing in connect_git(). That seems like the least
    confusing thing for somebody who has to touch both of these spots
    later (say, to add v2 push support). I didn't add a new test to make
    sure this doesn't break anything; we already have several tests (in
    t5551 and elsewhere) that make sure we are using v2 over http.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-17 15:15:59 -07:00
Jeff King
b3edf335df transport: mark unused parameters in fetch_refs_from_bundle()
We don't look at the "to_fetch" or "nr_heads" parameters at all. At
first glance this seems like a bug (or at least pessimisation), because
it means we fetch more objects from the bundle than we actually need.
But the bundle does not have any way of computing the set of reachable
objects itself (we'd have to pull all of the objects out to walk them).
And anyway, we've probably already paid most of the cost of grabbing the
objects, since we must copy the bundle locally before accessing it.

So it's perfectly reasonable for the bundle code to just pull everything
into the local object store. Unneeded objects can be dropped later via
gc, etc.

But we should mark these unused parameters as such to avoid the wrath of
-Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-17 14:17:48 -07:00
Elijah Newren
41771fa435 cache.h: remove dependence on hex.h; make other files include it explicitly
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:29 -08:00
Elijah Newren
36bf195890 alloc.h: move ALLOC_GROW() functions from cache.h
This allows us to replace includes of cache.h with includes of the much
smaller alloc.h in many places.  It does mean that we also need to add
includes of alloc.h in a number of C files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:28 -08:00
Junio C Hamano
0903d8bbde Merge branch 'ds/bundle-uri-4'
Bundle URIs part 4.

* ds/bundle-uri-4:
  clone: unbundle the advertised bundles
  bundle-uri: download bundles from an advertised list
  bundle-uri: allow relative URLs in bundle lists
  strbuf: introduce strbuf_strip_file_from_path()
  bundle-uri: serve bundle.* keys from config
  bundle-uri client: add helper for testing server
  transport: rename got_remote_heads
  bundle-uri client: add boolean transfer.bundleURI setting
  clone: request the 'bundle-uri' command when available
  t: create test harness for 'bundle-uri' command
  protocol v2: add server-side "bundle-uri" skeleton
2023-01-02 21:37:18 +09:00
Derrick Stolee
ebc3947955 bundle-uri: allow relative URLs in bundle lists
Bundle providers may want to distribute that data across multiple CDNs.
This might require a change in the base URI, all the way to the domain
name. If all bundles require an absolute URI in their 'uri' value, then
every push to a CDN would require altering the table of contents to
match the expected domain and exact location within it.

Allow a bundle list to specify a relative URI for the bundles. This URI
is based on where the client received the bundle list. For a list
provided in the 'bundle-uri' protocol v2 command, the Git remote URI is
the base URI. Otherwise, the bundle list was provided from an HTTP URI
not using the Git protocol, and that URI is the base URI. This allows
easier distribution of bundle data.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
Ævar Arnfjörð Bjarmason
70b9c10373 bundle-uri client: add helper for testing server
Add a 'test-tool bundle-uri ls-remote' command. This is a thin wrapper
for issuing protocol v2 "bundle-uri" commands to a server, and to the
parsing routines in bundle-uri.c.

In the "git clone" case we'll have already done the handshake(),
but not here. Add an extra case to check for this handshake in
get_bundle_uri() for ease of use for future callers.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
Derrick Stolee
1b759e0cf1 transport: rename got_remote_heads
The 'got_remote_heads' member of 'struct git_transport_data' was used
historically to indicate that the initial server connection was made and
the ref advertisement was returned. With protocol v2, that initial
handshake does not necessarily include the ref advertisement, so this
member is not an accurate name. Thankfully, all uses of the member are
only checking to see if the handshake should take place, not whether or
not some local data has the ref advertisement.

Rename the member to 'finished_handshake' to represent the proper state.
Note that the variable is only set to 1 during the handshake() method.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:24 +09:00
Ævar Arnfjörð Bjarmason
7cce9074a7 bundle-uri client: add boolean transfer.bundleURI setting
The yet-to-be introduced client support for bundle-uri will always
fall back on a full clone, but we'd still like to be able to ignore a
server's bundle-uri advertisement entirely.

The new transfer.bundleURI config option defaults to 'false', but a user
can set it to 'true' to enable checking for bundle URIs from the origin
Git server using protocol v2.

Co-authored-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:23 +09:00
Ævar Arnfjörð Bjarmason
0cfde740f0 clone: request the 'bundle-uri' command when available
Set up all the needed client parts of the 'bundle-uri' protocol v2
command, without actually doing anything with the bundle URIs.

If the server says it supports 'bundle-uri' teach Git to issue the
'bundle-uri' command after the 'ls-refs' during 'git clone'. The
returned key=value pairs are passed to the bundle list code which is
tested using a different ingest mechanism in t5750-bundle-uri-parse.sh.

At this point, Git does nothing with that bundle list. It will not
download any of the bundles. That will come in a later change after
these protocol bits are finalized.

The no-op client is initially used only by 'git clone' to test the basic
functionality, and eventually will bootstrap the initial download of Git
objects during a fresh clone. The bundle URI client will not be
integrated into other fetches until a mechanism is created to select a
subset of bundles for download.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:24:23 +09:00