This patch adds the possibility to supply a set of non-0 file
descriptors for async process communication instead of the
default-created pipe.
Additionally, we now support bi-directional communiction with the
async procedure, by giving the async function both read and write
file descriptors.
To retain compatiblity and similar "API feel" with start_command,
we require start_async callers to set .out = -1 to get a readable
file descriptor. If either of .in or .out is 0, we supply no file
descriptor to the async process.
[sp: Note: Erik started this patch, and a huge bulk of it is
his work. All bugs were introduced later by Shawn.]
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-remote-curl backend detects if the remote server supports
the git-upload-pack service, and if so, runs git-fetch-pack locally
in a pipe to generate the want/have commands.
The advertisements from the server that were obtained during the
discovery are passed into git-fetch-pack before the POST request
starts, permitting server capability discovery and enablement.
Common objects that are discovered are appended onto the request as
have lines and are sent again on the next request. This allows the
remote side to reinitialize its in-memory list of common objects
during the next request.
Because all requests are relatively short, below git-remote-curl's
1 MiB buffer limit, requests will use the standard Content-Length
header and be valid HTTP/1.0 POST requests. This makes the fetch
client more tolerant of proxy servers which don't support HTTP/1.1
or the chunked transfer encoding.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
CC: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When multi_ack_detailed is enabled the ACK continue messages returned
by the remote upload-pack are broken out to describe the different
states within the peer. This permits the client to better understand
the server's in-memory state.
The fetch-pack/upload-pack protocol now looks like:
NAK
---------------------------------
Always sent in response to "done" if there was no common base
selected from the "have" lines (or no have lines were sent).
* no multi_ack or multi_ack_detailed:
Sent when the client has sent a pkt-line flush ("0000") and
the server has not yet found a common base object.
* either multi_ack or multi_ack_detailed:
Always sent in response to a pkt-line flush.
ACK %s
-----------------------------------
* no multi_ack or multi_ack_detailed:
Sent in response to "have" when the object exists on the remote
side and is therefore an object in common between the peers.
The argument is the SHA-1 of the common object.
* either multi_ack or multi_ack_detailed:
Sent in response to "done" if there are common objects.
The argument is the last SHA-1 determined to be common.
ACK %s continue
-----------------------------------
* multi_ack only:
Sent in response to "have".
The remote side wants the client to consider this object as
common, and immediately stop transmitting additional "have"
lines for objects that are reachable from it. The reason
the client should stop is not given, but is one of the two
cases below available under multi_ack_detailed.
ACK %s common
-----------------------------------
* multi_ack_detailed only:
Sent in response to "have". Both sides have this object.
Like with "ACK %s continue" above the client should stop
sending have lines reachable for objects from the argument.
ACK %s ready
-----------------------------------
* multi_ack_detailed only:
Sent in response to "have".
The client should stop transmitting objects which are reachable
from the argument, and send "done" soon to get the objects.
If the remote side has the specified object, it should
first send an "ACK %s common" message prior to sending
"ACK %s ready".
Clients may still submit additional "have" lines if there are
more side branches for the client to explore that might be added
to the common set and reduce the number of objects to transfer.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 41cb7488 Linus moved this function to connect.c for reuse inside
of the git-clone-pack command. That was 2005, but in 2006 Junio
retired git-clone-pack in commit efc7fa53. Since then the only
caller has been fetch-pack. Since this ACK/NAK exchange is only
used by the fetch-pack/upload-pack protocol we should move it back
to be a private detail of fetch-pack.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This change is being offered as a refactoring to make later
commits in the smart HTTP series easier.
By changing the enabled capabilities to be formatted in a strbuf
it is easier to add a new capability to the set of supported
capabilities.
By formatting the want portion of the request into a strbuf and
writing it as a whole block we can later decide to hold onto
the req_buf (instead of releasing it) to recycle in stateless
communications.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fetch-pack runs the sideband demultiplexer using start_async(). This
facility requires that the asynchronously executed function closes the
output file descriptor (see Documentation/technical/api-run-command.txt).
But the sideband demultiplexer did not do that. This fixes it.
In certain error situations this could lock up a fetch operation on
Windows because the asynchronous function is run in a thread; by not
closing the output fd the reading end never got EOF and waited for more
data indefinitely. On Unix this is not a problem because the asynchronous
function is run in a separate process, which exits after the function ends
and so implicitly closes the output.
Since the pack that is sent over the wire encodes the number of objects in
the stream, during normal operation the reading end knows when the stream
ends and terminates by itself, and does not lock up.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ar/unlink-err:
print unlink(2) errno in copy_or_link_directory
replace direct calls to unlink(2) with unlink_or_warn
Introduce an unlink(2) wrapper which gives warning if unlink failed
If the local receiving repository has disabled the use of delta base
offset, for example to retain compatibility with older versions of
Git that predate OFS_DELTA, we shouldn't ask for ofs-delta support
when we obtain a pack from the remote server.
[ issue noticed by Shawn Pearce ]
Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Essentially; s/type* /type */ as per the coding guidelines.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This helps to notice when something's going wrong, especially on
systems which lock open files.
I used the following criteria when selecting the code for replacement:
- it was already printing a warning for the unlink failures
- it is in a function which already printing something or is
called from such a function
- it is in a static function, returning void and the function is only
called from a builtin main function (cmd_)
- it is in a function which handles emergency exit (signal handlers)
- it is in a function which is obvously cleaning up the lockfiles
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* kb/checkout-optim:
Revert "lstat_cache(): print a warning if doing ping-pong between cache types"
checkout bugfix: use stat.mtime instead of stat.ctime in two places
Makefile: Set compiler switch for USE_NSEC
Create USE_ST_TIMESPEC and turn it on for Darwin
Not all systems use st_[cm]tim field for ns resolution file timestamp
Record ns-timestamps if possible, but do not use it without USE_NSEC
write_index(): update index_state->timestamp after flushing to disk
verify_uptodate(): add ce_uptodate(ce) test
make USE_NSEC work as expected
fix compile error when USE_NSEC is defined
check_updates(): effective removal of cache entries marked CE_REMOVE
lstat_cache(): print a warning if doing ping-pong between cache types
show_patch_diff(): remove a call to fstat()
write_entry(): use fstat() instead of lstat() when file is open
write_entry(): cleanup of some duplicated code
create_directories(): remove some memcpy() and strchr() calls
unlink_entry(): introduce schedule_dir_for_removal()
lstat_cache(): swap func(length, string) into func(string, length)
lstat_cache(): generalise longest_match_lstat_cache()
lstat_cache(): small cleanup and optimisation
Commit e1afca4fd "write_index(): update index_state->timestamp after
flushing to disk" on 2009-02-23 used stat.ctime to record the
timestamp of the index-file. This is wrong, so fix this and use the
correct stat.mtime timestamp instead.
Commit 110c46a909 "Not all systems use st_[cm]tim field for ns
resolution file timestamp" on 2009-03-08, has a similar bug for the
builtin-fetch-pack.c file.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This removes the last parameter of recv_sideband, by which the callers
told which channel bands #2 and #3 should be written to.
Sayeth Shawn Pearce:
The definition of the streams in the current sideband protocol
are rather well defined for the one protocol that uses it,
fetch-pack/receive-pack:
stream #1: pack data
stream #2: stderr messages, progress, meant for tty
stream #3: abort message, remote is dead, goodbye!
Since both callers of the function passed 2 for the parameter, we hereby
remove it and send bands #2 and #3 to stderr explicitly using fprintf.
This has the nice side-effect that these two streams pass through our
ANSI emulation layer on Windows.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Acked-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some codepaths do not still use the ST_[CM]TIME_NSEC() pair of macros
introduced by the previous commit but assumes all systems use st_mtim
and st_ctim fields in "struct stat" to record nanosecond resolution part
of the file timestamps.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These variables were unused and can be removed safely:
builtin-clone.c::cmd_clone(): use_local_hardlinks, use_separate_remote
builtin-fetch-pack.c::find_common(): len
builtin-remote.c::mv(): symref
diff.c::show_stats():show_stats(): total
diffcore-break.c::should_break(): base_size
fast-import.c::validate_raw_date(): date, sign
fsck.c::fsck_tree(): o_sha1, sha1
xdiff-interface.c::parse_num(): read_some
Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Traditionally, the lack of USE_NSEC meant "do not record nor use the
nanosecond resolution part of the file timestamps". To avoid problems on
filesystems that lose the ns part when the metadata is flushed to the disk
and then later read back in, disabling USE_NSEC has been a good idea in
general.
If you are on a filesystem without such an issue, it does not hurt to read
and store them in the cached stat data in the index entries even if your
git is compiled without USE_NSEC. The index left with such a version of
git can be read by git compiled with USE_NSEC and it can make use of the
nanosecond part to optimize the check to see if the path on the filesystem
hsa been modified since we last looked at.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'struct cache' does not have a 'usec' member, but a 'unsigned int
nsec' member. Simmilar 'struct stat' does not have a 'st_mtim.usec'
member, and we should instead use 'st_mtim.tv_nsec'.
Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint:
GIT 1.6.0.5
"git diff <tree>{3,}": do not reverse order of arguments
tag: delete TAG_EDITMSG only on successful tag
gitweb: Make project specific override for 'grep' feature work
http.c: use 'git_config_string' to get 'curl_http_proxy'
fetch-pack: Avoid memcpy() with src==dst
memcpy() may only be used for disjoint memory areas, but when invoked
from cmd_fetch_pack(), we have my_args == &args. (The argument cannot
be removed entirely because transport.c invokes with its own
variable.)
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/maint-co-track:
Enhance hold_lock_file_for_{update,append}() API
demonstrate breakage of detached checkout with symbolic link HEAD
Fix "checkout --track -b newbranch" on detached HEAD
Conflicts:
builtin-commit.c
This changes the "die_on_error" boolean parameter to a mere "flags", and
changes the existing callers of hold_lock_file_for_update/append()
functions to pass LOCK_DIE_ON_ERROR.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jc/alternate-push:
push: receiver end advertises refs from alternate repositories
push: prepare sender to receive extended ref information from the receiver
receive-pack: make it a builtin
is_directory(): a generic helper function
"git push" enhancement allows the receiving end to report not only its own
refs but refs in repositories it borrows from via the alternate object
store mechanism. By telling the sender that objects reachable from these
extra refs are already complete in the receiving end, the number of
objects that need to be transfered can be cut down.
These entries are sent over the wire with string ".have", instead of the
actual names of the refs. This string was chosen so that they are ignored
by older programs at the sending end. If we sent some random but valid
looking refnames for these entries, "matching refs" rule (triggered when
running "git push" without explicit refspecs, where the sender learns what
refs the receiver has, and updates only the ones with the names of the
refs the sender also has) and "delete missing" rule (triggered when "git
push --mirror" is used, where the sender tells the receiver to delete the
refs it itself does not have) would try to update/delete them, which is
not what we want.
This prepares the send-pack (and "push" that runs native protocol) to
accept extended existing ref information and make use of it. The ".have"
entries are excluded from ref matching rules, and are exempt from deletion
rule while pushing with --mirror option, but are still used for pack
generation purposes by providing more "bottom" range commits.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
User notifications are presented as 'git cmd', and code comments
are presented as '"cmd"' or 'git's cmd', rather than 'git-cmd'.
Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some systems (like e.g. OpenSolaris) define pid_t as long,
therefore all our sprintf that use %i/%d cause a compiler warning
beacuse of the implicit long->int cast. To make sure that
we fit the limits, we display pids as PRIuMAX and cast them explicitly
to uintmax_t.
Signed-off-by: David Soria Parra <dsp@php.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is a mechanical conversion of all '*.c' files with:
s/((?:die|error|warning)\("git)-(\S+:)/$1 $2/;
The result was manually inspected and no false positive was found.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When you misuse a git command, you are shown the usage string.
But this is currently shown in the dashed form. So if you just
copy what you see, it will not work, when the dashed form
is no longer supported.
This patch makes git commands show the dash-less version.
For shell scripts that do not specify OPTIONS_SPEC, git-sh-setup.sh
generates a dash-less usage string now.
Signed-off-by: Stephan Beyer <s-beyer@gmx.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When printing valuds of type uint32_t, we should use PRIu32, and should
not assume that it is unsigned int. On 32-bit platforms, it could be
defined as unsigned long. The same caution applies to ntohl().
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the repo is empty, it is obvious that there are no common commits
when fetching from _anywhere_.
So there is no use in saying it in that case, and it can even be
annoying. Therefore suppress the message unilaterally if the repository
is empty prior to the fetch.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Immediately after fetching a pack, we should call reprepare_packed_git() to
make sure the objects in the pack are reachable. Otherwise, we will fail to
look up objects that are present only in the fetched pack.
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_config() only had a function parameter, but no callback data
parameter. This assumes that all callback functions only modify
global variables.
With this patch, every callback gets a void * parameter, and it is hoped
that this will help the libification effort.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When I applied Linus's patch from the list by hand somehow I ended
up reversing the logic by mistake. This fixes it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
f3ec549 (fetch-pack: check parse_commit/object results, 2008-03-03)
broke common ancestor computation by stopping traversal when it sees
an already parsed commit. This should fix it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Before the second fetch-pack connection in the same process, unmark
all of the objects marked in the first connection, in order that we'll
list them as things we have instead of thinking we've already
mentioned them.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* sp/fetch-optim:
Teach git-fetch to exploit server side automatic tag following
Teach fetch-pack/upload-pack about --include-tag
git-pack-objects: Automatically pack annotated tags if object was packed
Teach git-fetch to grab a tag at the same time as a commit
Make git-fetch follow tags we already have objects for sooner
Teach upload-pack to log the received need lines to an fd
Free the path_lists used to find non-local tags in git-fetch
Allow builtin-fetch's find_non_local_tags to append onto a list
Ensure tail pointer gets setup correctly when we fetch HEAD only
Remove unnecessary delaying of free_refs(ref_map) in builtin-fetch
Remove unused variable in builtin-fetch find_non_local_tags
The new protocol extension "include-tag" allows the client side
of the connection (fetch-pack) to request that the server side of the
native git protocol (upload-pack / pack-objects) use --include-tag
as it prepares the packfile, thus ensuring that an annotated tag object
will be included in the resulting packfile if the object it refers to
was also included into the packfile.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mk/maint-parse-careful:
receive-pack: use strict mode for unpacking objects
index-pack: introduce checking mode
unpack-objects: prevent writing of inconsistent objects
unpack-object: cache for non written objects
add common fsck error printing function
builtin-fsck: move common object checking code to fsck.c
builtin-fsck: reports missing parent commits
Remove unused object-ref code
builtin-fsck: move away from object-refs to fsck_walk
add generic, type aware object chain walker
Conflicts:
Makefile
builtin-fsck.c
By setting .in, .out, or .err members of struct child_process to -1, the
callers of start_command() can request that a pipe is allocated that talks
to the child process and one end is returned by replacing -1 with the
file descriptor.
Previously, a flag was set (for .in and .out, but not .err) to signal
finish_command() to close the pipe end that start_command() had handed out,
so it was optional for callers to close the pipe, and many already do so.
Now we make it mandatory to close the pipe.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This shares the connection between getting the remote ref list and
getting objects in the first batch. (A second connection is still used
to follow tags).
When we do not fetch objects (i.e. either ls-remote disconnects after
getting list of refs, or we decide we are already up-to-date), we
clean up the connection properly; otherwise the connection is left
open in need of cleaning up to avoid getting an error message from
the remote end when ssh is used.
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_pack() receives a pair of file descriptors that communicate with
upload-pack at the remote end. In order to support the case where the
side-band demultiplexer runs in a thread, and, hence, in the same process
as the main routine, we must not close the readable file descriptor early.
The handling of the readable fd is changed in the case where upload-pack
supports side-band communication: The old code closed the fd after it was
inherited to the side-band demultiplexer process. Now we do not close it.
The caller (do_fetch_pack) will close it later anyway. The demultiplexer
is the only reader, it does not matter that the fd remains open in the
main process as well as in unpack-objects/index-pack, which inherits it.
The writable fd is not needed in get_pack(), hence, the old code closed
the fd. For symmetry with the readable fd, we now do not close it; the
caller (do_fetch_pack) will close it later anyway. Therefore, the new
behavior is that the channel now remains open during the entire
conversation, but this has no ill effects because upload-pack does not read
from it once it has begun to send the pack data. For the same reason it
does not matter that the writable fd is now inherited to the demultiplexer
and unpack-objects/index-pack processes.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/forkexec:
Use the asyncronous function infrastructure to run the content filter.
Avoid a dup2(2) in apply_filter() - start_command() can do it for us.
t0021-conversion.sh: Test that the clean filter really cleans content.
upload-pack: Run rev-list in an asynchronous function.
upload-pack: Move the revision walker into a separate function.
Use the asyncronous function infrastructure in builtin-fetch-pack.c.
Add infrastructure to run a function asynchronously.
upload-pack: Use start_command() to run pack-objects in create_pack_file().
Have start_command() create a pipe to read the stderr of the child.
Use start_comand() in builtin-fetch-pack.c instead of explicit fork/exec.
Use run_command() to spawn external diff programs instead of fork/exec.
Use start_command() to run content filters instead of explicit fork/exec.
Use start_command() in git_connect() instead of explicit fork/exec.
Change git_connect() to return a struct child_process instead of a pid_t.
Conflicts:
builtin-fetch-pack.c