mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-10-31 06:17:56 +01:00

Author	SHA1	Message	Date
Junio C Hamano	2fb346c06a	Merge branch 'js/packet-read-line-check-null' Some low level protocol codepath could crash when they get an unexpected flush packet, which is now fixed. * js/packet-read-line-check-null: always check for NULL return from packet_read_line() correct error messages for NULL packet_read_line()	2018-02-27 10:33:55 -08:00
Junio C Hamano	6bed209a20	Merge branch 'jh/partial-clone' The machinery to clone & fetch, which in turn involves packing and unpacking objects, have been told how to omit certain objects using the filtering mechanism introduced by the jh/object-filtering topic, and also mark the resulting pack as a promisor pack to tolerate missing objects, taking advantage of the mechanism introduced by the jh/fsck-promisors topic. * jh/partial-clone: t5616: test bulk prefetch after partial fetch fetch: inherit filter-spec from partial clone t5616: end-to-end tests for partial clone fetch-pack: restore save_commit_buffer after use unpack-trees: batch fetching of missing blobs clone: partial clone partial-clone: define partial clone settings in config fetch: support filters fetch: refactor calculation of remote list fetch-pack: test support excluding large blobs fetch-pack: add --no-filter fetch-pack, index-pack, transport: partial clone upload-pack: add object filtering for partial clone	2018-02-13 13:39:04 -08:00
Junio C Hamano	f3d618d2bf	Merge branch 'jh/fsck-promisors' In preparation for implementing narrow/partial clone, the machinery for checking object connectivity used by gc and fsck has been taught that a missing object is OK when it is referenced by a packfile specially marked as coming from trusted repository that promises to make them available on-demand and lazily. * jh/fsck-promisors: gc: do not repack promisor packfiles rev-list: support termination at promisor objects sha1_file: support lazily fetching missing objects introduce fetch-object: fetch one promisor object index-pack: refactor writing of .keep files fsck: support promisor objects as CLI argument fsck: support referenced promisor objects fsck: support refs pointing to promisor objects fsck: introduce partialclone extension extension.partialclone: introduce partial clone extension	2018-02-13 13:39:03 -08:00
Jeff King	bc9d4dc5b0	correct error messages for NULL packet_read_line() The packet_read_line() function dies if it gets an unexpected EOF. It only returns NULL if we get a flush packet (or technically, a zero-length "0004" packet, but nobody is supposed to send those, and they are indistinguishable from a flush in this interface). Let's correct error messages which claim an unexpected EOF; it's really an unexpected flush packet. While we're here, let's also check "!line" instead of "!len" in the second case. The two events should always coincide, but checking "!line" makes it more obvious that we are not about to dereference NULL. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-08 12:37:30 -08:00
Jonathan Tan	a1c6d7c1a7	fetch-pack: restore save_commit_buffer after use In fetch-pack, the global variable save_commit_buffer is set to 0, but not restored to its original value after use. In particular, if show_log() (in log-tree.c) is invoked after fetch_pack() in the same process, show_log() will return before printing out the commit message (because the invocation to get_cached_commit_buffer() returns NULL, because the commit buffer was not saved). I discovered this when attempting to run "git log -S" in a partial clone, triggering the case where revision walking lazily loads missing objects. Therefore, restore save_commit_buffer to its original value after use. An alternative to solve the problem I had is to replace get_cached_commit_buffer() with get_commit_buffer(). That invocation was introduced in commit `a97934d` ("use get_cached_commit_buffer where appropriate", 2014-06-13) to replace "commit->buffer" introduced in commit `3131b71` ("Add "--show-all" revision walker flag for debugging", 2008-02-13). In the latter commit, the commit author seems to be deciding between not showing an unparsed commit at all and showing an unparsed commit without the message (which is what the commit does), and did not mention parsing the unparsed commit, so I prefer to preserve the existing behavior. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-08 09:58:52 -08:00
Jeff Hostetler	640d8b72fe	fetch-pack, index-pack, transport: partial clone Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-08 09:58:51 -08:00
Junio C Hamano	79bafd23a8	Merge branch 'jk/fewer-pack-rescan' Internaly we use 0{40} as a placeholder object name to signal the codepath that there is no such object (e.g. the fast-forward check while "git fetch" stores a new remote-tracking ref says "we know there is no 'old' thing pointed at by the ref, as we are creating it anew" by passing 0{40} for the 'old' side), and expect that a codepath to locate an in-core object to return NULL as a sign that the object does not exist. A look-up for an object that does not exist however is quite costly with a repository with large number of packfiles. This access pattern has been optimized. * jk/fewer-pack-rescan: sha1_file: fast-path null sha1 as a missing object everything_local: use "quick" object existence check p5551: add a script to test fetch pack-dir rescans t/perf/lib-pack: use fast-import checkpoint to create packs p5550: factor out nonsense-pack creation	2017-12-06 09:23:42 -08:00
Jonathan Tan	88e2f9ed8e	introduce fetch-object: fetch one promisor object Introduce fetch-object, providing the ability to fetch one object from a promisor remote. This uses fetch-pack. To do this, the transport mechanism has been updated with 2 flags, "from-promisor" to indicate that the resulting pack comes from a promisor remote (and thus should be annotated as such by index-pack), and "no-dependents" to indicate that only the objects themselves need to be fetched (but fetching additional objects is nevertheless safe). Whenever "no-dependents" is used, fetch-pack will refrain from using any object flags, because it is most likely invoked as part of a dynamic object fetch by another Git command (which may itself use object flags). An alternative to this is to leave fetch-pack alone, and instead update the allocation of flags so that fetch-pack's flags never overlap with any others, but this will end up shrinking the number of flags available to nearly every other Git command (that is, every Git command that accesses objects), so the approach in this commit was used instead. This will be tested in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-05 09:46:05 -08:00
Jeff King	c291293b2e	everything_local: use "quick" object existence check In `b495697b82` (fetch-pack: avoid repeatedly re-scanning pack directory, 2013-01-26), we noticed that everything_local() could waste time trying to find and parse objects which we _expect_ to be missing. The solution was to put has_sha1_file() in front of parse_object() to skip the more-expensive parse attempt. That optimization was negated later when has_sha1_file() learned to do the same re-scan in `45e8a74873` (has_sha1_file: re-check pack directory before giving up, 2013-08-30). We can restore it by using the "quick" flag to tell has_sha1_file (actually has_object_file these days) that we prefer speed to thoroughness for this call. See also the fixes in `5827a0354` and `0eeb077be7` for prior art and discussion on using the "quick" flag for these cases. The recently-added performance regression test in p5551 demonstrates the problem. You can see the original fix: Test b495697b82^ `b495697b82` -------------------------------------------------------- 5551.4: fetch 1.68(1.33+0.35) 0.87(0.69+0.18) -48.2% and then the regression: Test 45e8a74873^ `45e8a74873` --------------------------------------------------------- 5551.4: fetch 0.96(0.77+0.19) 2.55(2.04+0.50) +165.6% and now our fix: Test HEAD^ HEAD -------------------------------------------------------- 5551.4: fetch 7.21(6.58+0.63) 5.47(5.04+0.43) -24.1% You can also see that other things have gotten a lot slower since 2013. We'll deal with those in separate patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-21 11:30:41 +09:00
Jonathan Tan	9e6fabde82	oidmap: map with OID as key This is similar to using the hashmap in hashmap.c, but with an easier-to-use API. In particular, custom entry comparisons no longer need to be written, and lookups can be done without constructing a temporary entry structure. This is implemented as a thin wrapper over the hashmap API. In particular, this means that there is an additional 4-byte overhead due to the fact that the first 4 bytes of the hash is redundantly stored. For now, I'm taking the simpler approach, but if need be, we can reimplement oidmap without affecting the callers significantly. oidset has been updated to use oidmap. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-01 17:18:03 +09:00
Jonathan Tan	0abe14f6a5	pack: move {,re}prepare_packed_git and approximate_object_count Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Junio C Hamano	f31d23a399	Merge branch 'bw/config-h' Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h	2017-06-24 14:28:41 -07:00
Brandon Williams	b2141fc1d2	config: don't include config.h by default Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-15 12:56:22 -07:00
Junio C Hamano	02c531eba2	Merge branch 'jt/fetch-allow-tip-sha1-implicitly' There is no good reason why "git fetch $there $sha1" should fail when the $sha1 names an object at the tip of an advertised ref, even when the other side hasn't enabled allowTipSHA1InWant. * jt/fetch-allow-tip-sha1-implicitly: fetch-pack: always allow fetching of literal SHA1s	2017-05-30 11:16:43 +09:00
Junio C Hamano	6b526ced6f	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: (53 commits) object: convert parse_object* to take struct object_id tree: convert parse_tree_indirect to struct object_id sequencer: convert do_recursive_merge to struct object_id diff-lib: convert do_diff_cache to struct object_id builtin/ls-tree: convert to struct object_id merge: convert checkout_fast_forward to struct object_id sequencer: convert fast_forward_to to struct object_id builtin/ls-files: convert overlay_tree_on_cache to object_id builtin/read-tree: convert to struct object_id sha1_name: convert internals of peel_onion to object_id upload-pack: convert remaining parse_object callers to object_id revision: convert remaining parse_object callers to object_id revision: rename add_pending_sha1 to add_pending_oid http-push: convert process_ls_object and descendants to object_id refs/files-backend: convert many internals to struct object_id refs: convert struct ref_update to use struct object_id ref-filter: convert some static functions to struct object_id Convert struct ref_array_item to struct object_id Convert the verify_pack callback to struct object_id Convert lookup_tag to struct object_id ...	2017-05-29 12:34:43 +09:00
Junio C Hamano	b15667bbdc	Merge branch 'js/larger-timestamps' Some platforms have ulong that is smaller than time_t, and our historical use of ulong for timestamp would mean they cannot represent some timestamp that the platform allows. Invent a separate and dedicated timestamp_t (so that we can distingiuish timestamps and a vanilla ulongs, which along is already a good move), and then declare uintmax_t is the type to be used as the timestamp_t. * js/larger-timestamps: archive-tar: fix a sparse 'constant too large' warning use uintmax_t for timestamps date.c: abort if the system time cannot handle one of our timestamps timestamp_t: a new data type for timestamps PRItime: introduce a new "printf format" for timestamps parse_timestamp(): specify explicitly where we parse timestamps t0006 & t5000: skip "far in the future" test when time_t is too limited t0006 & t5000: prepare for 64-bit timestamps ref-filter: avoid using `unsigned long` for catch-all data type	2017-05-16 11:51:59 +09:00
Jonathan Tan	fdb69d33c4	fetch-pack: always allow fetching of literal SHA1s fetch-pack, when fetching a literal SHA-1 from a server that is not configured with uploadpack.allowtipsha1inwant (or similar), always returns an error message of the form "Server does not allow request for unadvertised object %s". However, it is sometimes the case that such object is advertised. This situation would occur, for example, if a user or a script was provided a SHA-1 instead of a branch or tag name for fetching, and wanted to invoke "git fetch" or "git fetch-pack" using that SHA-1. Teach fetch-pack to also check the SHA-1s of the refs in the received ref advertisement if a literal SHA-1 was given by the user. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-16 10:17:05 +09:00
brian m. carlson	c251c83df2	object: convert parse_object* to take struct object_id Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:58 +09:00
brian m. carlson	bc83266abe	Convert lookup_commit* to struct object_id Convert lookup_commit, lookup_commit_or_die, lookup_commit_reference, and lookup_commit_reference_gently to take struct object_id arguments. Introduce a temporary in parse_object buffer in order to convert this function. This is required since in order to convert parse_object and parse_object_buffer, lookup_commit_reference_gently and lookup_commit_or_die would need to be converted. Not introducing a temporary would therefore require that lookup_commit_or_die take a struct object_id , but lookup_commit would take unsigned char , leaving a confusing and hard-to-use interface. parse_object_buffer will lose this temporary in a later patch. This commit was created with manual changes to commit.c, commit.h, and object.c, plus the following semantic patch: @@ expression E1, E2; @@ - lookup_commit_reference_gently(E1.hash, E2) + lookup_commit_reference_gently(&E1, E2) @@ expression E1, E2; @@ - lookup_commit_reference_gently(E1->hash, E2) + lookup_commit_reference_gently(E1, E2) @@ expression E1; @@ - lookup_commit_reference(E1.hash) + lookup_commit_reference(&E1) @@ expression E1; @@ - lookup_commit_reference(E1->hash) + lookup_commit_reference(E1) @@ expression E1; @@ - lookup_commit(E1.hash) + lookup_commit(&E1) @@ expression E1; @@ - lookup_commit(E1->hash) + lookup_commit(E1) @@ expression E1, E2; @@ - lookup_commit_or_die(E1.hash, E2) + lookup_commit_or_die(&E1, E2) @@ expression E1, E2; @@ - lookup_commit_or_die(E1->hash, E2) + lookup_commit_or_die(E1, E2) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	e92b848cb6	shallow: convert shallow registration functions to object_id Convert register_shallow and unregister_shallow to take struct object_id. register_shallow is a caller of lookup_commit, which we will convert later. It doesn't make sense for the registration and unregistration functions to have incompatible interfaces, so convert them both. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-08 15:12:57 +09:00
brian m. carlson	1b283377b1	fetch-pack: convert to struct object_id Convert all uses of unsigned char [20] to struct object_id. Switch one use of get_sha1_hex to parse_oid_hex to avoid the need for a constant. This change is necessary in order to convert parse_object. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-02 10:46:41 +09:00
Johannes Schindelin	dddbad728c	timestamp_t: a new data type for timestamps Git's source code assumes that unsigned long is at least as precise as time_t. Which is incorrect, and causes a lot of problems, in particular where unsigned long is only 32-bit (notably on Windows, even in 64-bit versions). So let's just use a more appropriate data type instead. In preparation for this, we introduce the new `timestamp_t` data type. By necessity, this is a very, very large patch, as it has to replace all timestamps' data type in one go. As we will use a data type that is not necessarily identical to `time_t`, we need to be very careful to use `time_t` whenever we interact with the system functions, and `timestamp_t` everywhere else. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-27 13:07:39 +09:00
Junio C Hamano	5938454cbc	Merge branch 'dt/xgethostname-nul-termination' gethostname(2) may not NUL terminate the buffer if hostname does not fit; unfortunately there is no easy way to see if our buffer was too small, but at least this will make sure we will not end up using garbage past the end of the buffer. * dt/xgethostname-nul-termination: xgethostname: handle long hostnames use HOST_NAME_MAX to size buffers for gethostname(2)	2017-04-23 22:07:57 -07:00
Junio C Hamano	d2617eb984	Merge branch 'jt/fetch-pack-error-reporting' "git fetch-pack" was not prepared to accept ERR packet that the upload-pack can send with a human-readable error message. It showed the packet contents with ERR prefix, so there was no data loss, but it was redundant to say "ERR" in an error message. * jt/fetch-pack-error-reporting: fetch-pack: show clearer error message upon ERR	2017-04-23 22:07:53 -07:00
Johannes Schindelin	cb71f8bdb5	PRItime: introduce a new "printf format" for timestamps Currently, Git's source code treats all timestamps as if they were unsigned longs. Therefore, it is okay to write "%lu" when printing them. There is a substantial problem with that, though: at least on Windows, time_t is larger than unsigned long, and hence we will want to switch away from the ill-specified `unsigned long` data type. So let's introduce the pseudo format "PRItime" (currently simply being defined to "lu") to make it easier to change the data type used for timestamps. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-23 20:19:15 -07:00
David Turner	5781a9a270	xgethostname: handle long hostnames If the full hostname doesn't fit in the buffer supplied to gethostname, POSIX does not specify whether the buffer will be null-terminated, so to be safe, we should do it ourselves. Introduce new function, xgethostname, which ensures that there is always a \0 at the end of the buffer. Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-18 19:58:04 -07:00
René Scharfe	da25bdb776	use HOST_NAME_MAX to size buffers for gethostname(2) POSIX limits the length of host names to HOST_NAME_MAX. Export the fallback definition from daemon.c and use this constant to make all buffers used with gethostname(2) big enough for any possible result and a terminating NUL. Inspired-by: David Turner <dturner@twosigma.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: David Turner <dturner@twosigma.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-18 19:57:41 -07:00
Jonathan Tan	8e2c7bef03	fetch-pack: show clearer error message upon ERR Currently, fetch-pack prints a confusing error message ("expected ACK/NAK") when the server it's communicating with sends a pkt-line starting with "ERR". Replace it with a less confusing error message. Also update the documentation describing the fetch-pack/upload-pack protocol (pack-protocol.txt) to indicate that "ERR" can be sent in the place of "ACK" or "NAK". In practice, this has been done for quite some time by other Git implementations (e.g. JGit sends "want $id not valid") and by Git itself (since commit `bdb31ea`: "upload-pack: report "not our ref" to client", 2017-02-23) whenever a "want" line references an object that it does not have. (This is uncommon, but can happen if a repository is garbage-collected during a negotiation.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-17 18:51:28 -07:00
brian m. carlson	910650d2f8	Rename sha1_array to oid_array Since this structure handles an array of object IDs, rename it to struct oid_array. Also rename the accessor functions and the initialization constant. This commit was produced mechanically by providing non-Documentation files to the following Perl one-liners: perl -pi -E 's/struct sha1_array/struct oid_array/g' perl -pi -E 's/\bsha1_array_/oid_array_/g' perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g' Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-31 08:33:56 -07:00
brian m. carlson	98a72ddc12	Make sha1_array_append take a struct object_id * Convert the callers to pass struct object_id by changing the function declaration and definition and applying the following semantic patch: @@ expression E1, E2; @@ - sha1_array_append(E1, E2.hash) + sha1_array_append(E1, &E2) @@ expression E1, E2; @@ - sha1_array_append(E1, E2->hash) + sha1_array_append(E1, E2) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-31 08:33:55 -07:00
brian m. carlson	ee3051bd23	sha1-array: convert internal storage for struct sha1_array to object_id Make the internal storage for struct sha1_array use an array of struct object_id internally. Update the users of this struct which inspect its internals. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-28 09:59:34 -07:00
Junio C Hamano	07198afbd1	Merge branch 'mm/fetch-show-error-message-on-unadvertised-object' "git fetch" that requests a commit by object name, when the other side does not allow such an request, failed without much explanation. * mm/fetch-show-error-message-on-unadvertised-object: fetch-pack: add specific error for fetching an unadvertised object fetch_refs_via_pack: call report_unmatched_refs fetch-pack: move code to report unmatched refs to a function	2017-03-14 15:23:18 -07:00
Matt McCutchen	d56583ded6	fetch-pack: add specific error for fetching an unadvertised object Enhance filter_refs (which decides whether a request for an unadvertised object should be sent to the server) to record a new match status on the "struct ref" when a request is not allowed, and have report_unmatched_refs check for this status and print a special error message, "Server does not allow request for unadvertised object". Signed-off-by: Matt McCutchen <matt@mattmccutchen.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-02 11:12:53 -08:00
Matt McCutchen	e860d96bf8	fetch-pack: move code to report unmatched refs to a function Prepare to reuse this code in transport.c for "git fetch". While we're here, internationalize the existing error message. Signed-off-by: Matt McCutchen <matt@mattmccutchen.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-03-02 11:12:53 -08:00
Jeff King	41a078c60b	fetch-pack: cache results of for_each_alternate_ref We may run for_each_alternate_ref() twice, once in find_common() and once in everything_local(). This operation can be expensive, because it involves running a sub-process which must freshly load all of the alternate's refs from disk. Let's cache and reuse the results between the two calls. We can make some optimizations based on the particular use pattern in fetch-pack to keep our memory usage down. The first is that we only care about the sha1s, not the refs themselves. So it's OK to store only the sha1s, and to suppress duplicates. The natural fit would therefore be a sha1_array. However, sha1_array's de-duplication happens only after it has read and sorted all entries. It still stores each duplicate. For an alternate with a large number of refs pointing to the same commits, this is a needless expense. Instead, we'd prefer to eliminate duplicates before putting them in the cache, which implies using a hash. We can further note that fetch-pack will call parse_object() on each alternate sha1. We can therefore keep our cache as a set of pointers to "struct object". That gives us a place to put our "already seen" bit with an optimized hash lookup. And as a bonus, the object stores the sha1 for us, so pointer-to-object is all we need. There are two extra optimizations I didn't do here: - we actually store an array of pointer-to-object. Technically we could just walk the obj_hash table looking for entries with the ALTERNATE flag set (because our use case doesn't care about the order here). But that hash table may be mostly composed of non-ALTERNATE entries, so we'd waste time walking over them. So it would be a slight win in memory use, but a loss in CPU. - the items we pull out of the cache are actual "struct object"s, but then we feed "obj->sha1" to our sub-functions, which promptly call parse_object(). This second parse is cheap, because it starts with lookup_object() and will bail immediately when it sees we've already parsed the object. We could save the extra hash lookup, but it would involve refactoring the functions we call. It may or may not be worth the trouble. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-02-08 15:39:55 -08:00
Jeff King	2429d63a46	for_each_alternate_ref: pass name/oid instead of ref struct Breaking down the fields in the interface makes it easier to change the backend of for_each_alternate_ref to something that doesn't use "struct ref" internally. The only field that callers actually look at is the oid, anyway. The refname is kept in the interface as a plausible thing for future code to want. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-02-08 15:39:55 -08:00
Ralf Thielow	dfbfb9f377	fetch-pack.c: correct command at the beginning of an error message One error message in fetch-pack.c uses 'git fetch_pack' at the beginning which is not a git command. Use 'git fetch-pack' instead. Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-11-11 13:28:39 -08:00
Junio C Hamano	a460ea4a3c	Merge branch 'nd/shallow-deepen' The existing "git fetch --depth=<n>" option was hard to use correctly when making the history of an existing shallow clone deeper. A new option, "--deepen=<n>", has been added to make this easier to use. "git clone" also learned "--shallow-since=<date>" and "--shallow-exclude=<tag>" options to make it easier to specify "I am interested only in the recent N months worth of history" and "Give me only the history since that version". * nd/shallow-deepen: (27 commits) fetch, upload-pack: --deepen=N extends shallow boundary by N commits upload-pack: add get_reachable_list() upload-pack: split check_unreachable() in two, prep for get_reachable_list() t5500, t5539: tests for shallow depth excluding a ref clone: define shallow clone boundary with --shallow-exclude fetch: define shallow boundary with --shallow-exclude upload-pack: support define shallow boundary by excluding revisions refs: add expand_ref() t5500, t5539: tests for shallow depth since a specific date clone: define shallow clone boundary based on time with --shallow-since fetch: define shallow boundary with --shallow-since upload-pack: add deepen-since to cut shallow repos based on time shallow.c: implement a generic shallow boundary finder based on rev-list fetch-pack: use a separate flag for fetch in deepening mode fetch-pack.c: mark strings for translating fetch-pack: use a common function for verbose printing fetch-pack: use skip_prefix() instead of starts_with() upload-pack: move rev-list code out of check_non_tip() upload-pack: make check_non_tip() clean things up on error upload-pack: tighten number parsing at "deepen" lines ...	2016-10-10 14:03:50 -07:00
Junio C Hamano	b8688adb12	Merge branch 'rs/qsort' We call "qsort(array, nelem, sizeof(array[0]), fn)", and most of the time third parameter is redundant. A new QSORT() macro lets us omit it. * rs/qsort: show-branch: use QSORT use QSORT, part 2 coccicheck: use --all-includes by default remove unnecessary check before QSORT use QSORT add QSORT	2016-10-10 14:03:46 -07:00
René Scharfe	9ed0d8d6e6	use QSORT Apply the semantic patch contrib/coccinelle/qsort.cocci to the code base, replacing calls of qsort(3) with QSORT. The resulting code is shorter and supports empty arrays with NULL pointers. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-29 15:42:18 -07:00
Jonathan Tan	06b3d386e0	fetch-pack: do not reset in_vain on non-novel acks The MAX_IN_VAIN mechanism was introduced in commit `f061e5f` ("fetch-pack: give up after getting too many "ack continue"", 2006-05-24) to stop ref negotiation if a number of consecutive "have"s have been sent with no corresponding new acks. This is to stop the client from digging too deep in an irrelevant side branch in vain without ever finding a common ancestor. A use case (as described in that commit) is the scenario in which the local repository has more roots than the remote repository. However, during a negotiation in which stateless RPCs are used, MAX_IN_VAIN will (almost) never trigger (in the more-roots scenario above and others) because in each new request, the client has to inform the server of objects it already has and knows the server has (to remind the server of the state), which the server then acks. Make fetch-pack only consider, as new acks for the purpose of MAX_IN_VAIN, acks for objects for which the client has never received an ack before in this session. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-09-23 12:37:45 -07:00
Jonathan Tan	da470981de	fetch-pack: grow stateless RPC windows exponentially When updating large repositories, the LARGE_FLUSH limit (that is, the limit at which the window growth strategy switches from exponential to linear) is reached quite quickly. Use a conservative exponential growth strategy when that limit is reached instead (and increase LARGE_FLUSH so that there is no regression in window size). This optimization is only applied during stateless RPCs to avoid the issue raised and fixed in commit `44d8dc54` (Fix potential local deadlock during fetch-pack, 2011-03-29). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-19 13:27:22 -07:00
Nguyễn Thái Ngọc Duy	cccf74e2da	fetch, upload-pack: --deepen=N extends shallow boundary by N commits In git-fetch, --depth argument is always relative with the latest remote refs. This makes it a bit difficult to cover this use case, where the user wants to make the shallow history, say 3 levels deeper. It would work if remote refs have not moved yet, but nobody can guarantee that, especially when that use case is performed a couple months after the last clone or "git fetch --depth". Also, modifying shallow boundary using --depth does not work well with clones created by --since or --not. This patch fixes that. A new argument --deepen=<N> will add <N> more () parent commits to the current history regardless of where remote refs are. Have/Want negotiation is still respected. So if remote refs move, the server will send two chunks: one between "have" and "want" and another to extend shallow history. In theory, the client could send no "want"s in order to get the second chunk only. But the protocol does not allow that. Either you send no want lines, which means ls-remote; or you have to send at least one want line that carries deep-relative to the server.. The main work was done by Dongcan Jiang. I fixed it up here and there. And of course all the bugs belong to me. () We could even support --deepen=<N> where <N> is negative. In that case we can cut some history from the shallow clone. This operation (and --depth=<shorter depth>) does not require interaction with remote side (and more complicated to implement as a result). Helped-by: Duy Nguyen <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Dongcan Jiang <dongcan.jiang@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy	a45a260086	fetch: define shallow boundary with --shallow-exclude Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy	508ea88226	fetch: define shallow boundary with --shallow-since Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy	79891cb90a	fetch-pack: use a separate flag for fetch in deepening mode The shallow repo could be deepened or shortened when then user gives --depth. But in future that won't be the only way to deepen/shorten a repo. Stop relying on args->depth in this mode. Future deepening methods can simply set this flag on instead of updating all these if expressions. The new name "deepen" was chosen after the command to define shallow boundary in pack protocol. New commands also follow this tradition. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy	1dd73e20d7	fetch-pack.c: mark strings for translating Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Nguyễn Thái Ngọc Duy	0d789a5bc1	fetch-pack: use a common function for verbose printing This reduces the number of "if (verbose)" which makes it a bit easier to read imo. It also makes it easier to redirect all these printouts, to a file for example. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-13 14:38:16 -07:00
Jeff King	df85757244	fetch-pack: isolate sigpipe in demuxer thread In commit `9ff18fa` (fetch-pack: ignore SIGPIPE in sideband demuxer, 2016-02-24), we started using sigchain_push() to ignore SIGPIPE in the async demuxer thread. However, this is rather clumsy, as it ignores SIGPIPE for the entire process, including the main thread. At the time we didn't have any per-thread signal support, but we now we do. Let's use it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-20 13:33:56 -07:00
Jeff King	9ff18faf2f	fetch-pack: ignore SIGPIPE in sideband demuxer If the other side feeds us a bogus pack, index-pack (or unpack-objects) may die early, before consuming all of its input. As a result, the sideband demuxer may get SIGPIPE (racily, depending on whether our data made it into the pipe buffer or not). If this happens and we are compiled with pthread support, it will take down the main thread, too. This isn't the end of the world, as the main process will just die() anyway when it sees index-pack failed. But it does mean we don't get a chance to say "fatal: index-pack failed" or similar. And it also means that we racily fail t5504, as we sometimes die() and sometimes are killed by SIGPIPE. So let's ignore SIGPIPE while demuxing the sideband. We are already careful to check the return value of write(), so we won't waste time writing to a broken pipe. The caller will notice the error return from the async thread, though in practice we don't even get that far, as we die() as soon as we see that index-pack failed. The non-sideband case is already fine; we let index-pack read straight from the socket, so there is no SIGPIPE at all. Technically the non-threaded async case is also OK without this (the forked async process gets SIGPIPE), but it's not worth distinguishing from the threaded case here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-25 13:51:47 -08:00
brian m. carlson	ed1c9977cb	Remove get_object_hash. Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
brian m. carlson	f2fd0760f6	Convert struct object to object_id struct object is one of the major data structures dealing with object IDs. Convert it to use struct object_id instead of an unsigned char array. Convert get_object_hash to refer to the new member as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
brian m. carlson	7999b2cf77	Add several uses of get_object_hash. Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
brian m. carlson	f4e54d02b8	Convert struct ref to use object_id. Use struct object_id in three fields in struct ref and convert all the necessary places that use it. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>	2015-11-20 08:02:05 -05:00
Jeff King	984a43b902	fetch-pack: use argv_array for index-pack / unpack-objects This cleans up a magic number that must be kept in sync with the rest of the code (the number of argv slots). It also lets us drop some fixed buffers and an sprintf (since we can now use argv_array_pushf). We do still have to keep one fixed buffer for calling gethostname, but at least now the size computations for it are much simpler. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-10-05 11:08:05 -07:00
Jeff King	f932729cc7	memoize common git-path "constant" files One of the most common uses of git_path() is to pass a constant, like git_path("MERGE_MSG"). This has two drawbacks: 1. The return value is a static buffer, and the lifetime is dependent on other calls to git_path, etc. 2. There's no compile-time checking of the pathname. This is OK for a one-off (after all, we have to spell it correctly at least once), but many of these constant strings appear throughout the code. This patch introduces a series of functions to "memoize" these strings, which are essentially globals for the lifetime of the program. We compute the value once, take ownership of the buffer, and return the cached value for subsequent calls. cache.h provides a helper macro for defining these functions as one-liners, and defines a few common ones for global use. Using a macro is a little bit gross, but it does nicely document the purpose of the functions. If we need to touch them all later (e.g., because we learned how to change the git_dir variable at runtime, and need to invalidate all of the stored values), it will be much easier to have the complete list. Note that the shared-global functions have separate, manual declarations. We could do something clever with the macros (e.g., expand it to a declaration in some places, and a declaration _and_ a definition in path.c). But there aren't that many, and it's probably better to stay away from too-magical macros. Likewise, if we abandon the C preprocessor in favor of generating these with a script, we could get much fancier. E.g., normalizing "FOO/BAR-BAZ" into "git_path_foo_bar_baz". But the small amount of saved typing is probably not worth the resulting confusion to readers who want to grep for the function's definition. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-08-10 15:37:14 -07:00
Junio C Hamano	58eb0122f3	Merge branch 'me/fetch-into-shallow-safety' "git fetch --depth=<depth>" and "git clone --depth=<depth>" issued a shallow transfer request even to an upload-pack that does not support the capability. * me/fetch-into-shallow-safety: fetch-pack: check for shallow if depth given	2015-07-01 14:02:33 -07:00
Mike Edgar	eb86a507a1	fetch-pack: check for shallow if depth given When a repository is first fetched as a shallow clone, either by git-clone or by fetching into an empty repo, the server's capabilities are not currently consulted. The client will send shallow requests even if the server does not understand them, and the resulting error may be unhelpful to the user. This change pre-emptively checks so we can exit with a helpful error if necessary. Signed-off-by: Mike Edgar <adgar@google.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-06-17 12:03:58 -07:00
Junio C Hamano	5455ee0573	Merge branch 'bc/object-id' for_each_ref() callback functions were taught to name the objects not with "unsigned char sha1[20]" but with "struct object_id". * bc/object-id: (56 commits) struct ref_lock: convert old_sha1 member to object_id warn_if_dangling_symref(): convert local variable "junk" to object_id each_ref_fn_adapter(): remove adapter rev_list_insert_ref(): remove unneeded arguments rev_list_insert_ref_oid(): new function, taking an object_oid mark_complete(): remove unneeded arguments mark_complete_oid(): new function, taking an object_oid clear_marks(): rewrite to take an object_id argument mark_complete(): rewrite to take an object_id argument send_ref(): convert local variable "peeled" to object_id upload-pack: rewrite functions to take object_id arguments find_symref(): convert local variable "unused" to object_id find_symref(): rewrite to take an object_id argument write_one_ref(): rewrite to take an object_id argument write_refs_to_temp_dir(): convert local variable sha1 to object_id submodule: rewrite to take an object_id argument shallow: rewrite functions to take object_id arguments handle_one_ref(): rewrite to take an object_id argument add_info_ref(): rewrite to take an object_id argument handle_one_reflog(): rewrite to take an object_id argument ...	2015-06-05 12:17:37 -07:00
Michael Haggerty	c38cd1c89d	rev_list_insert_ref(): remove unneeded arguments Now that the function is not being used as an each_ref_sha1_fn, we can delete the unused arguments in its signature. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:39 -07:00
Michael Haggerty	b1b49c6eb6	rev_list_insert_ref_oid(): new function, taking an object_oid This function can be used with for_each_ref() without having to be wrapped. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:39 -07:00
Michael Haggerty	6e20a51a80	mark_complete(): remove unneeded arguments Now that the function is not being used as an each_ref_sha1_fn, we can delete the unused arguments in its signature. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:38 -07:00
Michael Haggerty	f8ee4d8522	mark_complete_oid(): new function, taking an object_oid This function can be used with for_each_ref() without having to be wrapped. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:38 -07:00
Michael Haggerty	c50fb6cee6	clear_marks(): rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:38 -07:00
Michael Haggerty	2b2a5be394	each_ref_fn: change to take an object_id parameter Change typedef each_ref_fn to take a "const struct object_id oid" parameter instead of "const unsigned char sha1". To aid this transition, implement an adapter that can be used to wrap old-style functions matching the old typedef, which is now called "each_ref_sha1_fn"), and make such functions callable via the new interface. This requires the old function and its cb_data to be wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be used as the cb_data for an adapter function, each_ref_fn_adapter(). This is an enormous diff, but most of it consists of simple, mechanical changes to the sites that call any of the "for_each_ref" family of functions. Subsequent to this change, the call sites can be rewritten one by one to use the new interface. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-25 12:19:27 -07:00
Fredrik Medley	68ee628932	upload-pack: optionally allow fetching reachable sha1 With uploadpack.allowReachableSHA1InWant configuration option set on the server side, "git fetch" can make a request with a "want" line that names an object that has not been advertised (likely to have been obtained out of band or from a submodule pointer). Only objects reachable from the branch tips, i.e. the union of advertised branches and branches hidden by transfer.hideRefs, will be processed. Note that there is an associated cost of having to walk back the history to check the reachability. This feature can be used when obtaining the content of a certain commit, for which the sha1 is known, without the need of cloning the whole repository, especially if a shallow fetch is used. Useful cases are e.g. repositories containing large files in the history, fetching only the needed data for a submodule checkout, when sharing a sha1 without telling which exact branch it belongs to and in Gerrit, if you think in terms of commits instead of change numbers. (The Gerrit case has already been solved through allowTipSHA1InWant as every Gerrit change has a ref.) Signed-off-by: Fredrik Medley <fredrik.medley@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-22 18:25:36 -07:00
Fredrik Medley	7199c093ad	upload-pack: prepare to extend allow-tip-sha1-in-want To allow future extensions, e.g. allowing non-tip sha1, replace the boolean allow_tip_sha1_in_want variable with the flag-style allow_request_with_bare_object_name variable. Signed-off-by: Fredrik Medley <fredrik.medley@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-05-22 18:25:35 -07:00
Jeff King	32d0462f8d	fetch-pack: remove dead assignment to ref->new_sha1 In everything_local(), we used to assign the current ref's value found in ref->old_sha1 to ref->new_sha1 when we already have all the necessary objects to complete the history leading to that commit. This copying was broken at `49bb805e` (Do not ask for objects known to be complete., 2005-10-19) and ever since we instead stuffed a random bytes in ref->new_sha1 here. No code complained or failed due to this breakage. It turns out that no code path that comes after this assignment even looks at ref->new_sha1 at all. - The only caller of everything_local(), do_fetch_pack(), returns this list of refs, whose element has bogus new_sha1 values, to its caller. It does not look at the elements itself, but does pass them to find_common, which looks only at the name and old_sha1 fields. - The only caller of do_fetch_pack(), fetch_pack(), returns this list to its caller. It does not look at the elements nor act on them. - One of the two callers of fetch_pack() is cmd_fetch_pack(), the top-level that implements "git fetch-pack". The only thing it looks at in the elements of the returned ref list is the old_sha1 and name fields. - The other caller of fetch_pack() is fetch_refs_via_pack() in the transport layer, which is a helper that implements "git fetch". It only cares about whether the returned list is empty (i.e. failed to fetch anything). Just drop the bogus assignment, that is not even necessary. The remote-tracking refs are updated based on a different list and not using the ref list being manipulated by this code path; the caller do_fetch_pack() created a copy of that real ref list and passed the copy down to this function, and modifying the elements here does not affect anything. Noticed-by: Kyle J. McKay <mackyle@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-19 14:11:52 -07:00
Jeff King	c3c17bf107	filter_ref: make a copy of extra "sought" entries If the server supports allow_tip_sha1_in_want, we add any unmatched raw-sha1 entries in our "sought" list of refs to the list of refs we will ask the other side for. We do so by inserting the original "struct ref" directly into our list, rather than making a copy. This has several problems. The most minor problem is that one cannot ever free the resulting list; it contains structs that are copies of the remote refs (made earlier by fetch_pack) along with sought refs that are referenced elsewhere. But more importantly that we set the ref->next pointer to NULL, chopping off the remainder of any existing list that the ref was a part of. We get the set of "sought" refs in an array rather than a linked list, but that array is often in turn generated from a list. The test modification in t5516 demonstrates this. Rather than fetching just an exact sha1, we fetch that sha1 plus another ref: - we build a linked list of refs to fetch when do_fetch calls get_ref_map; the exact sha1 is first, followed by the named ref ("refs/heads/extra" in this case). - we pass that linked list to transport_fetch_ref, which squashes it into an array of pointers - that array goes to fetch_pack, which calls filter_ref. There we generate the want list from a mix of what the remote side has advertised, and the "sought" entry for the exact sha1. We set the sought entry's "next" pointer to NULL. - after we return from transport_fetch_refs, we then try to update the refs by following the linked list. But our list is now truncated, and we do not update refs/heads/extra at all. We can fix this by making a copy of the ref. There's nothing that fetch_pack does to it that must be reflected in the original "sought" list (and indeed, if that were the case we would have a serious bug, because it is only exact-sha1 entries which are treated this way). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-19 14:11:11 -07:00
Jeff King	b7916422c7	filter_ref: avoid overwriting ref->old_sha1 with garbage If the server supports allow_tip_sha1_in_want, then fetch-pack's filter_refs function tries to check whether a ref is a request for a straight sha1 by running: if (get_sha1_hex(ref->name, ref->old_sha1)) ... I.e., we are using get_sha1_hex to ask "is this ref name a sha1?". If it is true, then the contents of ref->old_sha1 will end up unchanged. But if it is false, then get_sha1_hex makes no guarantees about what it has written. With a ref name like "abcdefoo", we would overwrite 3 bytes of ref->old_sha1 before realizing that it was not a sha1. This is likely not a problem in practice, as anything in refs->name (besides a sha1) will start with "refs/", meaning that we would notice on the first character that there is a problem. Still, we are making assumptions about the state left in the output when get_sha1_hex returns an error (e.g., it could start from the end of the string, or error check the values only once they were placed in the output). It's better to be defensive. We could just check that we have exactly 40 characters of sha1. But let's be even more careful and make sure that we have a 40-char hex refname that matches what is in old_sha1. This is perhaps overly defensive, but spells out our assumptions clearly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-03-19 13:52:54 -07:00
Michael Haggerty	697cc8efd9	lockfile.h: extract new header file for the functions in lockfile.c Move the interface declaration for the functions in lockfile.c from cache.h to a new file, lockfile.h. Add #includes where necessary (and remove some redundant includes of cache.h by files that already include builtin.h). Move the documentation of the lock_file state diagram from lockfile.c to the new header file. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-10-01 13:56:14 -07:00
Junio C Hamano	825fd93767	Merge branch 'rs/child-process-init' Code clean-up. * rs/child-process-init: run-command: inline prepare_run_command_v_opt() run-command: call run_command_v_opt_cd_env() instead of duplicating it run-command: introduce child_process_init() run-command: introduce CHILD_PROCESS_INIT	2014-09-11 10:33:27 -07:00
René Scharfe	d318027932	run-command: introduce CHILD_PROCESS_INIT Most struct child_process variables are cleared using memset first after declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to initialize them statically instead. That's shorter, doesn't require a function call and is slightly more readable (especially given that we already have STRBUF_INIT, ARGV_ARRAY_INIT etc.). Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-20 09:53:37 -07:00
Tanay Abhra	f44af51d13	fetchpack.c: replace `git_config()` with `git_config_get_()` family Use `git_config_get_()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-07 13:33:27 -07:00
Junio C Hamano	e91ae32a01	Merge branch 'jk/skip-prefix' * jk/skip-prefix: http-push: refactor parsing of remote object names imap-send: use skip_prefix instead of using magic numbers use skip_prefix to avoid repeated calculations git: avoid magic number with skip_prefix fetch-pack: refactor parsing in get_ack fast-import: refactor parsing of spaces stat_opt: check extra strlen call daemon: use skip_prefix to avoid magic numbers fast-import: use skip_prefix for parsing input use skip_prefix to avoid repeating strings use skip_prefix to avoid magic numbers transport-helper: avoid reading past end-of-string fast-import: fix read of uninitialized argv memory apply: use skip_prefix instead of raw addition refactor skip_prefix to return a boolean avoid using skip_prefix as a boolean daemon: mark some strings as const parse_diff_color_slot: drop ofs parameter	2014-07-09 11:33:28 -07:00
Jeff King	82e56767aa	fetch-pack: refactor parsing in get_ack There are several uses of the magic number "line+45" when parsing ACK lines from the server, and it's rather unclear why 45 is the correct number. We can make this more clear by keeping a running pointer as we parse, using skip_prefix to jump past the first "ACK ", then adding 40 to jump past get_sha1_hex (which is still magical, but hopefully 40 is less magical to readers of git code). Note that this actually puts us at line+44. The original required some character between the sha1 and further ACK flags (it is supposed to be a space, but we never enforced that). We start our search for flags at line+44, which meanas we are slightly more liberal than the old code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-20 10:45:19 -07:00
Jeff King	ae021d8791	use skip_prefix to avoid magic numbers It's a common idiom to match a prefix and then skip past it with a magic number, like: if (starts_with(foo, "bar")) foo += 3; This is easy to get wrong, since you have to count the prefix string yourself, and there's no compiler check if the string changes. We can use skip_prefix to avoid the magic numbers here. Note that some of these conversions could be much shorter. For example: if (starts_with(arg, "--foo=")) { bar = arg + 6; continue; } could become: if (skip_prefix(arg, "--foo=", &bar)) continue; However, I have left it as: if (skip_prefix(arg, "--foo=", &v)) { bar = v; continue; } to visually match nearby cases which need to actually process the string. Like: if (skip_prefix(arg, "--foo=", &v)) { bar = atoi(v); continue; } Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-20 10:44:45 -07:00
René Scharfe	50e19a8358	Use starts_with() for C strings instead of memcmp() Convert three cases of checking for a constant prefix using memcmp() to starts_with(). This way there is no need for magic string length constants and we avoid running over the end of the string should it be shorter than the prefix. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-06-09 14:38:12 -07:00
Junio C Hamano	b407d40933	Merge branch 'nd/log-show-linear-break' Attempts to show where a single-strand-of-pearls break in "git log" output. * nd/log-show-linear-break: log: add --show-linear-break to help see non-linear history object.h: centralize object flag allocation	2014-04-03 12:38:11 -07:00
Nguyễn Thái Ngọc Duy	208acbfb82	object.h: centralize object flag allocation While the field "flags" is mainly used by the revision walker, it is also used in many other places. Centralize the whole flag allocation to one place for a better overview (and easier to move flags if we have too). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-25 15:09:24 -07:00
Junio C Hamano	3e14384b12	Merge branch 'jk/shallow-update-fix' Serving objects from a shallow repository needs to write a new file to hold the temporary shallow boundaries but it was not cleaned when we exit due to die() or a signal. * jk/shallow-update-fix: shallow: verify shallow file after taking lock shallow: automatically clean up shallow tempfiles shallow: use stat_validity to check for up-to-date file	2014-03-21 12:33:29 -07:00
Jeff King	0179c945fc	shallow: automatically clean up shallow tempfiles We sometimes write tempfiles of the form "shallow_XXXXXX" during fetch/push operations with shallow repositories. Under normal circumstances, we clean up the result when we are done. However, we do no take steps to clean up after ourselves when we exit due to die() or signal death. This patch teaches the tempfile creation code to register handlers to clean up after ourselves. To handle this, we change the ownership semantics of the filename returned by setup_temporary_shallow. It now keeps a copy of the filename itself, and returns only a const pointer to it. We can also do away with explicit tempfile removal in the callers. They all exit not long after finishing with the file, so they can rely on the auto-cleanup, simplifying the code. Note that we keep things simple and maintain only a single filename to be cleaned. This is sufficient for the current caller, but we future-proof it with a die("BUG"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-27 12:07:13 -08:00
Nguyễn Thái Ngọc Duy	ff62eca7d1	fetch-pack: fix deepen shallow over smart http with no-done cap In smart http, upload-pack adds new shallow lines at the beginning of each rpc response. Only shallow lines from the first rpc call are useful. After that they are thrown away. It's designed this way because upload-pack is stateless and has no idea when its shallow lines are helpful or not. So after refs are negotiated with multi_ack_detailed and the server thinks it learned enough, it sends "ACK obj-id ready", terminates the rpc call and waits for the final rpc round. The client sends "done". The server sends another response, which also has shallow lines at the beginning, and the last "ACK obj-id" line. When no-done is active, the last round is cut out, the server sends "ACK obj-id ready" and "ACK obj-id" in the same rpc response. fetch-pack is updated to recognize this and not send "done". However it still tries to consume shallow lines, which are never sent. Update the code, make sure to skip consuming shallow lines when no-done is enabled. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-10 10:21:33 -08:00
Junio C Hamano	f583ace157	Merge branch 'jk/allow-fetch-onelevel-refname' "git clone" would fail to clone from a repository that has a ref directly under "refs/", e.g. "refs/stash", because different validation paths do different things on such a refname. Loosen the client side's validation to allow such a ref. * jk/allow-fetch-onelevel-refname: fetch-pack: do not filter out one-level refs	2014-01-27 10:44:14 -08:00
Junio C Hamano	92251b1b5b	Merge branch 'nd/shallow-clone' Fetching from a shallow-cloned repository used to be forbidden, primarily because the codepaths involved were not carefully vetted and we did not bother supporting such usage. This attempts to allow object transfer out of a shallow-cloned repository in a controlled way (i.e. the receiver become a shallow repository with truncated history). * nd/shallow-clone: (31 commits) t5537: fix incorrect expectation in test case 10 shallow: remove unused code send-pack.c: mark a file-local function static git-clone.txt: remove shallow clone limitations prune: clean .git/shallow after pruning objects clone: use git protocol for cloning shallow repo locally send-pack: support pushing from a shallow clone via http receive-pack: support pushing to a shallow clone via http smart-http: support shallow fetch/clone remote-curl: pass ref SHA-1 to fetch-pack as well send-pack: support pushing to a shallow clone receive-pack: allow pushes that update .git/shallow connected.c: add new variant that runs with --shallow-file add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses receive/send-pack: support pushing from a shallow clone receive-pack: reorder some code in unpack() fetch: add --update-shallow to accept refs that update .git/shallow upload-pack: make sure deepening preserves shallow roots fetch: support fetching from a shallow repository clone: support remote shallow repository ...	2014-01-17 12:21:20 -08:00
Jeff King	4c22408111	fetch-pack: do not filter out one-level refs Currently fetching a one-level ref like "refs/foo" does not work consistently. The outer "git fetch" program filters the list of refs, checking each against check_refname_format. Then it feeds the result to do_fetch_pack to actually negotiate the haves/wants and get the pack. The fetch-pack code does its own filter, and it behaves differently. The fetch-pack filter looks for refs in "refs/", and then feeds everything _after_ the slash (i.e., just "foo") into check_refname_format. But check_refname_format is not designed to look at a partial refname. It complains that the ref has only one component, thinking it is at the root (i.e., alongside "HEAD"), when in reality we just fed it a partial refname. As a result, we omit a ref like "refs/foo" from the pack request, even though "git fetch" then tries to store the resulting ref. If we happen to get the object anyway (e.g., because the ref is contained in another ref we are fetching), then the fetch succeeds. But if it is a unique object, we fail when trying to update "refs/foo". We can fix this by just passing the whole refname into check_refname_format; we know the part we were omitting is "refs/", which is acceptable in a refname. This at least makes the checks consistent with each other. This problem happens most commonly with "refs/stash", which is the only one-level ref in wide use. However, our test does not use "refs/stash", as we may later want to restrict it specifically (not because it is one-level, but because of the semantics of stashes). We may also want to do away with the multiple levels of filtering (which can cause problems when they are out of sync), or even forbid one-level refs entirely. However, those decisions can come later; this fixes the most immediate problem, which is the mismatch between the two. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-15 12:37:24 -08:00
Ramsay Jones	feefdf62c1	shallow: remove unused code Commit `58babfff` ("shallow.c: the 8 steps to select new commits for .git/shallow", 05-12-2013) added a function to implement step 5 of the quoted eight steps, namely 'remove_nonexistent_ours_in_pack()'. This function implements an optional optimization step in the new shallow commit selection algorithm. However, this function has no callers. (The commented out call sites would need to change, in order to provide information required by the function.) Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Acked-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-06 09:05:40 -08:00
Junio C Hamano	ad70448576	Merge branch 'cc/starts-n-ends-with' Remove a few duplicate implementations of prefix/suffix comparison functions, and rename them to starts_with and ends_with. * cc/starts-n-ends-with: replace {pre,suf}fixcmp() with {starts,ends}_with() strbuf: introduce starts_with() and ends_with() builtin/remote: remove postfixcmp() and use suffixcmp() instead environment: normalize use of prefixcmp() by removing " != 0"	2013-12-17 12:02:44 -08:00
Nguyễn Thái Ngọc Duy	48d25cae22	fetch: add --update-shallow to accept refs that update .git/shallow The same steps are done as in when --update-shallow is not given. The only difference is we now add all shallow commits in "ours" and "theirs" to .git/shallow (aka "step 8"). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy	4820a33baa	fetch: support fetching from a shallow repository This patch just put together pieces from the 8 steps patch. We stop at step 7 and reject refs that require new shallow commits. Note that, by rejecting refs that require new shallow commits, we leave dangling objects in the repo, which become "object islands" by the next "git fetch" of the same source. If the first fetch our "ours" set is zero and we do practically nothing at step 7, "ours" is full at the next fetch and we may need to walk through commits for reachability test. Room for improvement. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy	beea4152d9	clone: support remote shallow repository Cloning from a shallow repository does not follow the "8 steps for new .git/shallow" because if it does we need to get through step 6 for all refs. That means commit walking down to the bottom. Instead the rule to create .git/shallow is simpler and, more importantly, cheap: if a shallow commit is found in the pack, it's probably used (i.e. reachable from some refs), so we add it. Others are dropped. One may notice this method seems flawed by the word "probably". A shallow commit may not be reachable from any refs at all if it's attached to an object island (a group of objects that are not reachable by any refs). If that object island is not complete, a new fetch request may send more objects to connect it to some ref. At that time, because we incorrectly installed the shallow commit in this island, the user will not see anything after that commit (fsck is still ok). This is not desired. Given that object islands are rare (C Git never sends such islands for security reasons) and do not really harm the repository integrity, a tradeoff is made to surprise the user occasionally but work faster everyday. A new option --strict could be added later that follows exactly the 8 steps. "git prune" can also learn to remove dangling objects _and_ the shallow commits that are attached to them from .git/shallow. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-10 16:14:17 -08:00
Nguyễn Thái Ngọc Duy	a796ccee51	fetch-pack.c: move shallow update code out of fetch_pack() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-10 16:14:16 -08:00
Nguyễn Thái Ngọc Duy	1a30f5a2f2	shallow.c: extend setup_*_shallow() to accept extra shallow commits Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-10 16:14:16 -08:00
Christian Couder	5955654823	replace {pre,suf}fixcmp() with {starts,ends}_with() Leaving only the function definitions and declarations so that any new topic in flight can still make use of the old functions, replace existing uses of the prefixcmp() and suffixcmp() with new API functions. The change can be recreated by mechanically applying this: $ git grep -l -e prefixcmp -e suffixcmp -- \*.c \| grep -v strbuf\\.c \| xargs perl -pi -e ' s\|!prefixcmp\(\|starts_with\(\|g; s\|prefixcmp\(\|!starts_with\(\|g; s\|!suffixcmp\(\|ends_with\(\|g; s\|suffixcmp\(\|!ends_with\(\|g; ' on the result of preparatory changes in this series. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-12-05 14:13:21 -08:00
Junio C Hamano	5bb62059f2	Merge branch 'jk/robustify-parse-commit' * jk/robustify-parse-commit: checkout: do not die when leaving broken detached HEAD use parse_commit_or_die instead of custom message use parse_commit_or_die instead of segfaulting assume parse_commit checks for NULL commit assume parse_commit checks commit->object.parsed log_tree_diff: die when we fail to parse a commit	2013-12-05 12:54:01 -08:00
Junio C Hamano	832ee79ab8	Merge branch 'jl/pack-transfer-avoid-double-close' The codepath that send_pack() calls pack_objects() mistakenly closed the same file descriptor twice, leading to potentially closing a wrong file descriptor that was opened in the meantime. * jl/pack-transfer-avoid-double-close: Clear fd after closing to avoid double-close error	2013-10-30 12:10:45 -07:00
Jeff King	0064053bd7	assume parse_commit checks commit->object.parsed The parse_commit function will check the "parsed" flag of the object and do nothing if it is set. There is no need for callers to check the flag themselves, and doing so only clutters the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-24 15:43:50 -07:00
Junio C Hamano	6ba0d9551a	Merge branch 'nd/fetch-into-shallow' into maint When there is no sufficient overlap between old and new history during a "git fetch" into a shallow repository, objects that the sending side knows the receiving end has were unnecessarily sent. * nd/fetch-into-shallow: Add testcase for needless objects during a shallow fetch list-objects: mark more commits as edges in mark_edges_uninteresting list-objects: reduce one argument in mark_edges_uninteresting upload-pack: delegate rev walking in shallow fetch to pack-objects shallow: add setup_temporary_shallow() shallow: only add shallow graft points to new shallow file move setup_alternate_shallow and write_shallow_commits to shallow.c	2013-10-23 13:32:17 -07:00
Jens Lindstrom	37cb1dd671	Clear fd after closing to avoid double-close error In send_pack(), clear the fd passed to pack_objects() by setting it to -1, since pack_objects() closes the fd (via a call to run_command()). Likewise, in get_pack(), clear the fd passed to run_command(). Not doing so risks having git_transport_push(), caller of send_pack(), closing the fd again, possibly incorrectly closing some other open file; or similarly with fetch_refs_from_pack(), indirect caller of get_pack(). Signed-off-by: Jens Lindström <jl@opera.com> Acked-by: Jeff King <peff@peff.net> Acked-by: Duy Nguyen <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2013-10-23 09:07:09 -07:00
Jonathan Nieder	40b77322d2	Merge branch 'nd/fetch-pack-error-reporting-fix' * nd/fetch-pack-error-reporting-fix: fetch-pack.c: show correct command name that fails	2013-09-24 23:27:02 -07:00

1 2 3 4 5