Add managed "env" array to child_process to clarify the lifetime
rules.
* rs/run-command-env-array:
use env_array member of struct child_process
run-command: add env_array, an optional argv_array for env
Convert users of struct child_process to using the managed env_array for
specifying environment variables instead of supplying an array on the
stack or bringing their own argv_array. This shortens and simplifies
the code and ensures automatically that the allocated memory is freed
after use.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
resolve_ref_unsafe takes a boolean argument for reading (a nonexistent ref
resolves successfully for writing but not for reading). Change this to be
a flags field instead, and pass the new constant RESOLVE_REF_READING when
we want this behaviour.
While at it, swap two of the arguments in the function to put output
arguments at the end. As a nice side effect, this ensures that we can
catch callers that were unaware of the new API so they can be audited.
Give the wrapper functions resolve_refdup and read_ref_full the same
treatment for consistency.
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most struct child_process variables are cleared using memset first after
declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead. That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use `git_config_get_bool()` family instead of `git_config()` to take advantage of
the config-set API which provides a cleaner control flow.
Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use xmemdupz() to allocate the memory, copy the data and make sure to
NUL-terminate the result, all in one step. The resulting code is
shorter, doesn't contain the constants 1 and '\0', and avoids
duplicating function parameters.
For blame, the last copied byte (o->file.ptr[o->file.size]) is always
set to NUL by fake_working_tree_commit() or read_sha1_file(), so no
information is lost by the conversion to using xmemdupz().
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's a common idiom to match a prefix and then skip past it
with a magic number, like:
if (starts_with(foo, "bar"))
foo += 3;
This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes. We can use skip_prefix to avoid the magic
numbers here.
Note that some of these conversions could be much shorter.
For example:
if (starts_with(arg, "--foo=")) {
bar = arg + 6;
continue;
}
could become:
if (skip_prefix(arg, "--foo=", &bar))
continue;
However, I have left it as:
if (skip_prefix(arg, "--foo=", &v)) {
bar = v;
continue;
}
to visually match nearby cases which need to actually
process the string. Like:
if (skip_prefix(arg, "--foo=", &v)) {
bar = atoi(v);
continue;
}
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the webserver responds with "405 Method Not Allowed", it
should tell the client what methods are allowed with the "Allow"
header.
* bc/http-backend-allow-405:
http-backend: provide Allow header for 405
The HTTP 1.1 standard requires an Allow header for 405 Method Not Allowed:
The response MUST include an Allow header containing a list of valid methods
for the requested resource.
So provide such a header when we return a 405 to the user agent.
Signed-off-by: Brian M. Carlson <sandals@crustytoothpaste.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow smart-capable HTTP servers to be restricted via the
GIT_NAMESPACE mechanism when talking with commit-walker clients
(they already do so when talking with smart HTTP clients).
* jk/http-dumb-namespaces:
http-backend: respect GIT_NAMESPACE with dumb clients
Filter the list of refs returned via the dumb HTTP protocol according
to the active namespace, consistent with other clients of the
upload-pack service.
Signed-off-by: John Koleszar <jkoleszar@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is just write_or_die by another name. The one
distinction is that write_or_die will treat EPIPE specially
by suppressing error messages. That's fine, as we die by
SIGPIPE anyway (and in the off chance that it is disabled,
write_or_die will simulate it).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The http-backend program sets default GIT_COMMITTER_NAME and
GIT_COMMITTER_EMAIL variables based on the REMOTE_USER and
REMOTE_ADDR variables provided by the webserver. However, it
unconditionally overwrites any existing GIT_COMMITTER
variables, which may have been customized by site-specific
code in the webserver (or in a script wrapping http-backend).
Let's leave those variables intact if they already exist,
assuming that any such configuration was intentional. There
is a slight chance of a regression if somebody has set
GIT_COMMITTER_* for the entire webserver, not intending it
to leak through http-backend. We could protect against this
by passing the information in alternate variables. However,
it seems unlikely that anyone will care about that
regression, and there is value in the simplicity of using
the common variable names that are used elsewhere in git.
While we're tweaking the environment-handling in
http-backend, let's switch it to use argv_array to handle
the list of variables. That makes the memory management much
simpler.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the skeleton implementation of i18n in Git to one that can show
localized strings to users for our C, Shell and Perl programs using
either GNU libintl or the Solaris gettext implementation.
This new internationalization support is enabled by default. If
gettext isn't available, or if Git is compiled with
NO_GETTEXT=YesPlease, Git falls back on its current behavior of
showing interface messages in English. When using the autoconf script
we'll auto-detect if the gettext libraries are installed and act
appropriately.
This change is somewhat large because as well as adding a C, Shell and
Perl i18n interface we're adding a lot of tests for them, and for
those tests to work we need a skeleton PO file to actually test
translations. A minimal Icelandic translation is included for this
purpose. Icelandic includes multi-byte characters which makes it easy
to test various edge cases, and it's a language I happen to
understand.
The rest of the commit message goes into detail about various
sub-parts of this commit.
= Installation
Gettext .mo files will be installed and looked for in the standard
$(prefix)/share/locale path. GIT_TEXTDOMAINDIR can also be set to
override that, but that's only intended to be used to test Git itself.
= Perl
Perl code that's to be localized should use the new Git::I18n
module. It imports a __ function into the caller's package by default.
Instead of using the high level Locale::TextDomain interface I've
opted to use the low-level (equivalent to the C interface)
Locale::Messages module, which Locale::TextDomain itself uses.
Locale::TextDomain does a lot of redundant work we don't need, and
some of it would potentially introduce bugs. It tries to set the
$TEXTDOMAIN based on package of the caller, and has its own
hardcoded paths where it'll search for messages.
I found it easier just to completely avoid it rather than try to
circumvent its behavior. In any case, this is an issue wholly
internal Git::I18N. Its guts can be changed later if that's deemed
necessary.
See <AANLkTilYD_NyIZMyj9dHtVk-ylVBfvyxpCC7982LWnVd@mail.gmail.com> for
a further elaboration on this topic.
= Shell
Shell code that's to be localized should use the git-sh-i18n
library. It's basically just a wrapper for the system's gettext.sh.
If gettext.sh isn't available we'll fall back on gettext(1) if it's
available. The latter is available without the former on Solaris,
which has its own non-GNU gettext implementation. We also need to
emulate eval_gettext() there.
If neither are present we'll use a dumb printf(1) fall-through
wrapper.
= About libcharset.h and langinfo.h
We use libcharset to query the character set of the current locale if
it's available. I.e. we'll use it instead of nl_langinfo if
HAVE_LIBCHARSET_H is set.
The GNU gettext manual recommends using langinfo.h's
nl_langinfo(CODESET) to acquire the current character set, but on
systems that have libcharset.h's locale_charset() using the latter is
either saner, or the only option on those systems.
GNU and Solaris have a nl_langinfo(CODESET), FreeBSD can use either,
but MinGW and some others need to use libcharset.h's locale_charset()
instead.
=Credits
This patch is based on work by Jeff Epler <jepler@unpythonic.net> who
did the initial Makefile / C work, and a lot of comments from the Git
mailing list, including Jonathan Nieder, Jakub Narebski, Johannes
Sixt, Erik Faye-Lund, Peter Krefting, Junio C Hamano, Thomas Rast and
others.
[jc: squashed a small Makefile fix from Ramsay]
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The size of objects we read from the repository and data we try to put
into the repository are represented in "unsigned long", so that on larger
architectures we can handle objects that weigh more than 4GB.
But the interface defined in zlib.h to communicate with inflate/deflate
limits avail_in (how many bytes of input are we calling zlib with) and
avail_out (how many bytes of output from zlib are we ready to accept)
fields effectively to 4GB by defining their type to be uInt.
In many places in our code, we allocate a large buffer (e.g. mmap'ing a
large loose object file) and tell zlib its size by assigning the size to
avail_in field of the stream, but that will truncate the high octets of
the real size. The worst part of this story is that we often pass around
z_stream (the state object used by zlib) to keep track of the number of
used bytes in input/output buffer by inspecting these two fields, which
practically limits our callchain to the same 4GB limit.
Wrap z_stream in another structure git_zstream that can express avail_in
and avail_out in unsigned long. For now, just die() when the caller gives
a size that cannot be given to a single zlib call. In later patches in the
series, we would make git_inflate() and git_deflate() internally loop to
give callers an illusion that our "improved" version of zlib interface can
operate on a buffer larger than 4GB in one go.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
http-backend.c uses inflateInit2() to tell the library that it wants to
accept only gzip format. Wrap it in a helper function so that readers do
not have to wonder what the magic numbers 15 and 16 are for.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Two callsites in http-backend.c to inflate() and inflateEnd()
were not using git_ prefixed versions. After this, running
$ find all objects -print | xargs nm -ugo | grep inflate
shows only zlib.c makes direct calls to zlib for inflate operation,
except for a singlecall to inflateInit2 in http-backend.c
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* jp/string-list-api-cleanup:
string_list: Fix argument order for string_list_append
string_list: Fix argument order for string_list_lookup
string_list: Fix argument order for string_list_insert_at_index
string_list: Fix argument order for string_list_insert
string_list: Fix argument order for for_each_string_list
string_list: Fix argument order for print_string_list
Update the definition and callers of string_list_lookup to use the
string_list as the first argument. This helps make the string_list
API easier to use by being more consistent.
Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the definition and callers of string_list_insert to use the
string_list as the first argument. This helps make the string_list
API easier to use by being more consistent.
Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/async-thread:
fast-import: die_nicely() back to vsnprintf (reverts part of ebaa79f)
Enable threaded async procedures whenever pthreads is available
Dying in an async procedure should only exit the thread, not the process.
Reimplement async procedures using pthreads
Windows: more pthreads functions
Fix signature of fcntl() compatibility dummy
Make report() from usage.c public as vreportf() and use it.
Modernize t5530-upload-pack-error.
Conflicts:
http-backend.c
The is_url function and url percent-decoding functions were
static, but are generally useful. Let's make them available
to other parts of the code.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If stdout has already been closed by the CGI and die() gets called,
the CGI will fail to write the "Status: 500 Internal Server Error" to
the pipe, which results in die() being called again (via safe_write).
This goes on in an infinite loop until the stack overflows and the
process is killed by SIGSEGV.
Instead set a flag on the first die() invocation and if we came back to
the handler, just die silently, as it only means we failed to report the
failure---we cannot report anything anyway in such a case. This way
failures to write the error messages to the stdout pipe do not result in
an infinite loop.
We also now report on the death to stderr before we report to stdout,
to increase the chances that the cause of the die() invocation will
appear in the server's error log.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fixup! http-backend.c: Don't infinite loop
Now die_webcgi() actually can return during a recursive call into it,
causing
http-backend.c:554: error: 'noreturn' function does return
The only reason we would come back to the die handler is because we
failed during it, so we cannot report anything anyway.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There exist already a number of static functions named 'report', therefore,
the function name was changed.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to how git-daemon checks whether a repository is OK to be
exported, smart-http should also check. This check can be satisfied
in two different ways: the environmental variable GIT_HTTP_EXPORT_ALL
may be set to export all repositories, or the individual repository
may have the file git-daemon-export-ok.
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We already have these checks in many printf-type functions that have
prototypes which are in header files. Add these same checks to
static functions in http-backend.c
Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com>
Acked-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Found with valgrind while looking for Content-Length corruption in
smart http.
Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our Content-Length needs to report an off_t, which could be larger
precision than size_t on this system (e.g. 32 bit binary built with
64 bit large file support).
We also shouldn't be passing a size_t parameter to printf when
we've used PRIuMAX as the format specifier.
Fix both issues by using uintmax_t for the hdr_int() routine,
allowing strbuf's size_t to automatically upcast, and off_t to
always fit.
Also fixed the copy loop we use inside of send_local_file(), we never
actually updated the size variable so we might as well not use it.
Reported-by: Tarmigan <tarmigan+git@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eons ago HPA taught git-daemon how to protect itself from /../
attacks, which Junio brought back into service in d79374c7b5
("daemon.c and path.enter_repo(): revamp path validation").
I did not carry this into git-http-backend as originally we relied
only upon PATH_TRANSLATED, and assumed the HTTP server had done
its access control checks to validate the resolved path was within
a directory permitting access from the remote client. This would
usually be sufficient to protect a server from requests for its
/etc/passwd file by http://host/smart/../etc/passwd sorts of URLs.
However in 917adc0360 Mark Lodato added GIT_PROJECT_ROOT as an
additional method of configuring the CGI. When this environment
variable is used the web server does not generate the final access
path and therefore may blindly pass through "/../etc/passwd"
in PATH_INFO under the assumption that "/../" might have special
meaning to the invoked CGI.
Instead of permitting these sorts of malformed path requests, we
now reject them back at the client, with an error message for the
server log. This matches git-daemon behavior.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
http-backend: Fix symbol clash on AIX 5.3
Mike says:
> > +static void send_file(const char *the_type, const char *name)
> > +{
>
> I think a symbol clash here is responsible for a build breakage in
> next on AIX 5.3:
>
> CC http-backend.o
> http-backend.c:213: error: conflicting types for `send_file'
> /usr/include/sys/socket.h:676: error: previous declaration of `send_file'
> gmake: *** [http-backend.o] Error 1
So we rename the function send_local_file().
Reported-by: Mike Ralphson <mike.ralphson@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some repository owners may wish to enable smart HTTP, but disallow
dumb content serving. Disallowing dumb serving might be because
the owners want to rely upon reachability to control which objects
clients may access from the repository, or they just want to
encourage clients to use the more bandwidth efficient transport.
If http.getanyfile is set to false the backend CGI will return with
'403 Forbidden' when an object file is accessed by a dumb client.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new environment variable, GIT_PROJECT_ROOT, to override the
method of using PATH_TRANSLATED to find the git repository on disk.
This makes it much easier to configure the web server, especially when
the web server's DocumentRoot does not contain the git repositories,
which is the usual case.
Signed-off-by: Mark Lodato <lodatom@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack
are forwarded to the corresponding backend process by directly
executing it and leaving stdin and stdout connected to the invoking
web server. Prior to starting the backend process the HTTP response
headers are sent, thereby freeing the backend from needing to know
about the HTTP protocol.
Requests that are encoded with Content-Encoding: gzip are
automatically inflated before being streamed into the backend.
This is primarily useful for the git-upload-pack backend, which
receives highly repetitive text data from clients that easily
compresses to 50% of its original size.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The git-http-backend CGI can be configured into any Apache server
using ScriptAlias, such as with the following configuration:
LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so
LoadModule alias_module /usr/libexec/apache2/mod_alias.so
ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/
Repositories are accessed via the translated PATH_INFO.
The CGI is backwards compatible with the dumb client, allowing all
older HTTP clients to continue to download repositories which are
managed by the CGI.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>