mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-01 06:47:52 +01:00

704 lines

16 KiB

C

Raw Normal View History

Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`#include "cache.h"`
			`#include "refs.h"`
			`#include "pkt-line.h"`
			`#include "object.h"`
			`#include "tag.h"`
			`#include "exec_cmd.h"`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`#include "run-command.h"`
			`#include "string-list.h"`
make url-related functions reusable The is_url function and url percent-decoding functions were static, but are generally useful. Let's make them available to other parts of the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-23 11:17:55 +02:00			`#include "url.h"`
http-backend: respect existing GIT_COMMITTER_* variables The http-backend program sets default GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL variables based on the REMOTE_USER and REMOTE_ADDR variables provided by the webserver. However, it unconditionally overwrites any existing GIT_COMMITTER variables, which may have been customized by site-specific code in the webserver (or in a script wrapping http-backend). Let's leave those variables intact if they already exist, assuming that any such configuration was intentional. There is a slight chance of a regression if somebody has set GIT_COMMITTER_* for the entire webserver, not intending it to leak through http-backend. We could protect against this by passing the information in alternate variables. However, it seems unlikely that anyone will care about that regression, and there is value in the simplicity of using the common variable names that are used elsewhere in git. While we're tweaking the environment-handling in http-backend, let's switch it to use argv_array to handle the list of variables. That makes the memory management much simpler. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-30 09:01:30 +02:00			`#include "argv-array.h"`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
			`static const char content_type[] = "Content-Type";`
			`static const char content_length[] = "Content-Length";`
			`static const char last_modified[] = "Last-Modified";`
http-backend: Use http.getanyfile to disable dumb HTTP serving Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-05 02:16:37 +01:00			`static int getanyfile = 1;`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`static unsigned long max_request_buffer = 10 * 1024 * 1024;`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`static struct string_list *query_params;`

			`struct rpc_service {`
			`const char *name;`
			`const char *config_name;`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`unsigned buffer_input : 1;`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`signed enabled : 2;`
			`};`

			`static struct rpc_service rpc_service[] = {`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`{ "upload-pack", "uploadpack", 1, 1 },`
			`{ "receive-pack", "receivepack", 0, -1 },`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`};`

			`static struct string_list *get_parameters(void)`
			`{`
			`if (!query_params) {`
			`const char *query = getenv("QUERY_STRING");`

			`query_params = xcalloc(1, sizeof(*query_params));`
			`while (query && *query) {`
make url-related functions reusable The is_url function and url percent-decoding functions were static, but are generally useful. Let's make them available to other parts of the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-23 11:17:55 +02:00			`char *name = url_decode_parameter_name(&query);`
			`char *value = url_decode_parameter_value(&query);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`struct string_list_item *i;`

string_list: Fix argument order for string_list_lookup Update the definition and callers of string_list_lookup to use the string_list as the first argument. This helps make the string_list API easier to use by being more consistent. Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-26 01:41:37 +02:00			`i = string_list_lookup(query_params, name);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`if (!i)`
string_list: Fix argument order for string_list_insert Update the definition and callers of string_list_insert to use the string_list as the first argument. This helps make the string_list API easier to use by being more consistent. Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-26 01:41:35 +02:00			`i = string_list_insert(query_params, name);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`else`
			`free(i->util);`
			`i->util = value;`
			`}`
			`}`
			`return query_params;`
			`}`

			`static const char get_parameter(const char name)`
			`{`
			`struct string_list_item *i;`
string_list: Fix argument order for string_list_lookup Update the definition and callers of string_list_lookup to use the string_list as the first argument. This helps make the string_list API easier to use by being more consistent. Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-26 01:41:37 +02:00			`i = string_list_lookup(get_parameters(), name);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`return i ? i->util : NULL;`
			`}`

http-backend: Let gcc check the format of more printf-type functions. We already have these checks in many printf-type functions that have prototypes which are in header files. Add these same checks to static functions in http-backend.c Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-14 22:10:58 +01:00			`__attribute__((format (printf, 2, 3)))`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`static void format_write(int fd, const char *fmt, ...)`
			`{`
			`static char buffer[1024];`

			`va_list args;`
			`unsigned n;`

			`va_start(args, fmt);`
			`n = vsnprintf(buffer, sizeof(buffer), fmt, args);`
			`va_end(args);`
			`if (n >= sizeof(buffer))`
			`die("protocol error: impossibly long line");`

pkt-line: drop safe_write function This is just write_or_die by another name. The one distinction is that write_or_die will treat EPIPE specially by suppressing error messages. That's fine, as we die by SIGPIPE anyway (and in the off chance that it is disabled, write_or_die will simulate it). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-02-20 21:01:56 +01:00			`write_or_die(fd, buffer, n);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

			`static void http_status(unsigned code, const char *msg)`
			`{`
			`format_write(1, "Status: %u %s\r\n", code, msg);`
			`}`

			`static void hdr_str(const char name, const char value)`
			`{`
			`format_write(1, "%s: %s\r\n", name, value);`
			`}`

http-backend: Fix bad treatment of uintmax_t in Content-Length Our Content-Length needs to report an off_t, which could be larger precision than size_t on this system (e.g. 32 bit binary built with 64 bit large file support). We also shouldn't be passing a size_t parameter to printf when we've used PRIuMAX as the format specifier. Fix both issues by using uintmax_t for the hdr_int() routine, allowing strbuf's size_t to automatically upcast, and off_t to always fit. Also fixed the copy loop we use inside of send_local_file(), we never actually updated the size variable so we might as well not use it. Reported-by: Tarmigan <tarmigan+git@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-12 05:42:41 +01:00			`static void hdr_int(const char *name, uintmax_t value)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
			`format_write(1, "%s: %" PRIuMAX "\r\n", name, value);`
			`}`

			`static void hdr_date(const char *name, unsigned long when)`
			`{`
convert "enum date_mode" into a struct In preparation for adding date modes that may carry extra information beyond the mode itself, this patch converts the date_mode enum into a struct. Most of the conversion is fairly straightforward; we pass the struct as a pointer and dereference the type field where necessary. Locations that declare a date_mode can use a "{}" constructor. However, the tricky case is where we use the enum labels as constants, like: show_date(t, tz, DATE_NORMAL); Ideally we could say: show_date(t, tz, &{ DATE_NORMAL }); but of course C does not allow that. Likewise, we cannot cast the constant to a struct, because we need to pass an actual address. Our options are basically: 1. Manually add a "struct date_mode d = { DATE_NORMAL }" definition to each caller, and pass "&d". This makes the callers uglier, because they sometimes do not even have their own scope (e.g., they are inside a switch statement). 2. Provide a pre-made global "date_normal" struct that can be passed by address. We'd also need "date_rfc2822", "date_iso8601", and so forth. But at least the ugliness is defined in one place. 3. Provide a wrapper that generates the correct struct on the fly. The big downside is that we end up pointing to a single global, which makes our wrapper non-reentrant. But show_date is already not reentrant, so it does not matter. This patch implements 3, along with a minor macro to keep the size of the callers sane. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-25 18:55:02 +02:00			`const char *value = show_date(when, 0, DATE_MODE(RFC2822));`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`hdr_str(name, value);`
			`}`

			`static void hdr_nocache(void)`
			`{`
			`hdr_str("Expires", "Fri, 01 Jan 1980 00:00:00 GMT");`
			`hdr_str("Pragma", "no-cache");`
			`hdr_str("Cache-Control", "no-cache, max-age=0, must-revalidate");`
			`}`

			`static void hdr_cache_forever(void)`
			`{`
			`unsigned long now = time(NULL);`
			`hdr_date("Date", now);`
			`hdr_date("Expires", now + 31536000);`
			`hdr_str("Cache-Control", "public, max-age=31536000");`
			`}`

			`static void end_headers(void)`
			`{`
pkt-line: drop safe_write function This is just write_or_die by another name. The one distinction is that write_or_die will treat EPIPE specially by suppressing error messages. That's fine, as we die by SIGPIPE anyway (and in the off chance that it is disabled, write_or_die will simulate it). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-02-20 21:01:56 +01:00			`write_or_die(1, "\r\n", 2);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: Let gcc check the format of more printf-type functions. We already have these checks in many printf-type functions that have prototypes which are in header files. Add these same checks to static functions in http-backend.c Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-14 22:10:58 +01:00			`__attribute__((format (printf, 1, 2)))`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`static NORETURN void not_found(const char *err, ...)`
			`{`
			`va_list params;`

			`http_status(404, "Not Found");`
			`hdr_nocache();`
			`end_headers();`

			`va_start(params, err);`
			`if (err && *err)`
			`vfprintf(stderr, err, params);`
			`va_end(params);`
			`exit(0);`
			`}`

http-backend: Let gcc check the format of more printf-type functions. We already have these checks in many printf-type functions that have prototypes which are in header files. Add these same checks to static functions in http-backend.c Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-14 22:10:58 +01:00			`__attribute__((format (printf, 1, 2)))`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`static NORETURN void forbidden(const char *err, ...)`
			`{`
			`va_list params;`

			`http_status(403, "Forbidden");`
			`hdr_nocache();`
			`end_headers();`

			`va_start(params, err);`
			`if (err && *err)`
			`vfprintf(stderr, err, params);`
			`va_end(params);`
			`exit(0);`
			`}`

http-backend: Use http.getanyfile to disable dumb HTTP serving Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-05 02:16:37 +01:00			`static void select_getanyfile(void)`
			`{`
			`if (!getanyfile)`
			`forbidden("Unsupported service: getanyfile");`
			`}`

Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`static void send_strbuf(const char type, struct strbuf buf)`
			`{`
			`hdr_int(content_length, buf->len);`
			`hdr_str(content_type, type);`
			`end_headers();`
pkt-line: drop safe_write function This is just write_or_die by another name. The one distinction is that write_or_die will treat EPIPE specially by suppressing error messages. That's fine, as we die by SIGPIPE anyway (and in the off chance that it is disabled, write_or_die will simulate it). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-02-20 21:01:56 +01:00			`write_or_die(1, buf->buf, buf->len);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

Git-aware CGI to provide dumb HTTP transport http-backend: Fix symbol clash on AIX 5.3 Mike says: > > +static void send_file(const char the_type, const char name) > > +{ > > I think a symbol clash here is responsible for a build breakage in > next on AIX 5.3: > > CC http-backend.o > http-backend.c:213: error: conflicting types for `send_file' > /usr/include/sys/socket.h:676: error: previous declaration of `send_file' > gmake: *** [http-backend.o] Error 1 So we rename the function send_local_file(). Reported-by: Mike Ralphson <mike.ralphson@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-09 16:24:33 +01:00			`static void send_local_file(const char the_type, const char name)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
prefer git_pathdup to git_path in some possibly-dangerous cases Because git_path uses a static buffer that is shared with calls to git_path, mkpath, etc, it can be dangerous to assign the result to a variable or pass it to a non-trivial function. The value may change unexpectedly due to other calls. None of the cases changed here has a known bug, but they're worth converting away from git_path because: 1. It's easy to use git_pathdup in these cases. 2. They use constructs (like assignment) that make it hard to tell whether they're safe or not. The extra malloc overhead should be trivial, as an allocation should be an order of magnitude cheaper than a system call (which we are clearly about to make, since we are constructing a filename). The real cost is that we must remember to free the result. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-08-10 11:35:31 +02:00			`char *p = git_pathdup("%s", name);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`size_t buf_alloc = 8192;`
			`char *buf = xmalloc(buf_alloc);`
			`int fd;`
			`struct stat sb;`

			`fd = open(p, O_RDONLY);`
			`if (fd < 0)`
			`not_found("Cannot open '%s': %s", p, strerror(errno));`
			`if (fstat(fd, &sb) < 0)`
			`die_errno("Cannot stat '%s'", p);`

http-backend: Fix bad treatment of uintmax_t in Content-Length Our Content-Length needs to report an off_t, which could be larger precision than size_t on this system (e.g. 32 bit binary built with 64 bit large file support). We also shouldn't be passing a size_t parameter to printf when we've used PRIuMAX as the format specifier. Fix both issues by using uintmax_t for the hdr_int() routine, allowing strbuf's size_t to automatically upcast, and off_t to always fit. Also fixed the copy loop we use inside of send_local_file(), we never actually updated the size variable so we might as well not use it. Reported-by: Tarmigan <tarmigan+git@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-12 05:42:41 +01:00			`hdr_int(content_length, sb.st_size);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`hdr_str(content_type, the_type);`
			`hdr_date(last_modified, sb.st_mtime);`
			`end_headers();`

http-backend: Fix bad treatment of uintmax_t in Content-Length Our Content-Length needs to report an off_t, which could be larger precision than size_t on this system (e.g. 32 bit binary built with 64 bit large file support). We also shouldn't be passing a size_t parameter to printf when we've used PRIuMAX as the format specifier. Fix both issues by using uintmax_t for the hdr_int() routine, allowing strbuf's size_t to automatically upcast, and off_t to always fit. Also fixed the copy loop we use inside of send_local_file(), we never actually updated the size variable so we might as well not use it. Reported-by: Tarmigan <tarmigan+git@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-12 05:42:41 +01:00			`for (;;) {`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`ssize_t n = xread(fd, buf, buf_alloc);`
			`if (n < 0)`
			`die_errno("Cannot read '%s'", p);`
			`if (!n)`
			`break;`
pkt-line: drop safe_write function This is just write_or_die by another name. The one distinction is that write_or_die will treat EPIPE specially by suppressing error messages. That's fine, as we die by SIGPIPE anyway (and in the off chance that it is disabled, write_or_die will simulate it). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-02-20 21:01:56 +01:00			`write_or_die(1, buf, n);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`
			`close(fd);`
			`free(buf);`
prefer git_pathdup to git_path in some possibly-dangerous cases Because git_path uses a static buffer that is shared with calls to git_path, mkpath, etc, it can be dangerous to assign the result to a variable or pass it to a non-trivial function. The value may change unexpectedly due to other calls. None of the cases changed here has a known bug, but they're worth converting away from git_path because: 1. It's easy to use git_pathdup in these cases. 2. They use constructs (like assignment) that make it hard to tell whether they're safe or not. The extra malloc overhead should be trivial, as an allocation should be an order of magnitude cheaper than a system call (which we are clearly about to make, since we are constructing a filename). The real cost is that we must remember to free the result. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-08-10 11:35:31 +02:00			`free(p);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

			`static void get_text_file(char *name)`
			`{`
http-backend: Use http.getanyfile to disable dumb HTTP serving Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-05 02:16:37 +01:00			`select_getanyfile();`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`hdr_nocache();`
Git-aware CGI to provide dumb HTTP transport http-backend: Fix symbol clash on AIX 5.3 Mike says: > > +static void send_file(const char the_type, const char name) > > +{ > > I think a symbol clash here is responsible for a build breakage in > next on AIX 5.3: > > CC http-backend.o > http-backend.c:213: error: conflicting types for `send_file' > /usr/include/sys/socket.h:676: error: previous declaration of `send_file' > gmake: *** [http-backend.o] Error 1 So we rename the function send_local_file(). Reported-by: Mike Ralphson <mike.ralphson@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-09 16:24:33 +01:00			`send_local_file("text/plain", name);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

			`static void get_loose_object(char *name)`
			`{`
http-backend: Use http.getanyfile to disable dumb HTTP serving Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-05 02:16:37 +01:00			`select_getanyfile();`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`hdr_cache_forever();`
Git-aware CGI to provide dumb HTTP transport http-backend: Fix symbol clash on AIX 5.3 Mike says: > > +static void send_file(const char the_type, const char name) > > +{ > > I think a symbol clash here is responsible for a build breakage in > next on AIX 5.3: > > CC http-backend.o > http-backend.c:213: error: conflicting types for `send_file' > /usr/include/sys/socket.h:676: error: previous declaration of `send_file' > gmake: *** [http-backend.o] Error 1 So we rename the function send_local_file(). Reported-by: Mike Ralphson <mike.ralphson@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-09 16:24:33 +01:00			`send_local_file("application/x-git-loose-object", name);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

			`static void get_pack_file(char *name)`
			`{`
http-backend: Use http.getanyfile to disable dumb HTTP serving Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-05 02:16:37 +01:00			`select_getanyfile();`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`hdr_cache_forever();`
Git-aware CGI to provide dumb HTTP transport http-backend: Fix symbol clash on AIX 5.3 Mike says: > > +static void send_file(const char the_type, const char name) > > +{ > > I think a symbol clash here is responsible for a build breakage in > next on AIX 5.3: > > CC http-backend.o > http-backend.c:213: error: conflicting types for `send_file' > /usr/include/sys/socket.h:676: error: previous declaration of `send_file' > gmake: *** [http-backend.o] Error 1 So we rename the function send_local_file(). Reported-by: Mike Ralphson <mike.ralphson@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-09 16:24:33 +01:00			`send_local_file("application/x-git-packed-objects", name);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

			`static void get_idx_file(char *name)`
			`{`
http-backend: Use http.getanyfile to disable dumb HTTP serving Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-05 02:16:37 +01:00			`select_getanyfile();`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`hdr_cache_forever();`
Git-aware CGI to provide dumb HTTP transport http-backend: Fix symbol clash on AIX 5.3 Mike says: > > +static void send_file(const char the_type, const char name) > > +{ > > I think a symbol clash here is responsible for a build breakage in > next on AIX 5.3: > > CC http-backend.o > http-backend.c:213: error: conflicting types for `send_file' > /usr/include/sys/socket.h:676: error: previous declaration of `send_file' > gmake: *** [http-backend.o] Error 1 So we rename the function send_local_file(). Reported-by: Mike Ralphson <mike.ralphson@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-09 16:24:33 +01:00			`send_local_file("application/x-git-packed-objects-toc", name);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend.c: replace `git_config()` with `git_config_get_bool()` family Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-07 18:21:17 +02:00			`static void http_config(void)`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`{`
http-backend.c: replace `git_config()` with `git_config_get_bool()` family Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-07 18:21:17 +02:00			`int i, value = 0;`
			`struct strbuf var = STRBUF_INIT;`
use skip_prefix to avoid magic numbers It's a common idiom to match a prefix and then skip past it with a magic number, like: if (starts_with(foo, "bar")) foo += 3; This is easy to get wrong, since you have to count the prefix string yourself, and there's no compiler check if the string changes. We can use skip_prefix to avoid the magic numbers here. Note that some of these conversions could be much shorter. For example: if (starts_with(arg, "--foo=")) { bar = arg + 6; continue; } could become: if (skip_prefix(arg, "--foo=", &bar)) continue; However, I have left it as: if (skip_prefix(arg, "--foo=", &v)) { bar = v; continue; } to visually match nearby cases which need to actually process the string. Like: if (skip_prefix(arg, "--foo=", &v)) { bar = atoi(v); continue; } Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-18 21:47:50 +02:00
http-backend.c: replace `git_config()` with `git_config_get_bool()` family Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-07 18:21:17 +02:00			`git_config_get_bool("http.getanyfile", &getanyfile);`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`git_config_get_ulong("http.maxrequestbuffer", &max_request_buffer);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
http-backend.c: replace `git_config()` with `git_config_get_bool()` family Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-07 18:21:17 +02:00			`for (i = 0; i < ARRAY_SIZE(rpc_service); i++) {`
			`struct rpc_service *svc = &rpc_service[i];`
			`strbuf_addf(&var, "http.%s", svc->config_name);`
			`if (!git_config_get_bool(var.buf, &value))`
			`svc->enabled = value;`
			`strbuf_reset(&var);`
http-backend: Use http.getanyfile to disable dumb HTTP serving Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-05 02:16:37 +01:00			`}`

http-backend.c: replace `git_config()` with `git_config_get_bool()` family Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-07 18:21:17 +02:00			`strbuf_release(&var);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`}`

			`static struct rpc_service select_service(const char name)`
			`{`
use skip_prefix to avoid magic numbers It's a common idiom to match a prefix and then skip past it with a magic number, like: if (starts_with(foo, "bar")) foo += 3; This is easy to get wrong, since you have to count the prefix string yourself, and there's no compiler check if the string changes. We can use skip_prefix to avoid the magic numbers here. Note that some of these conversions could be much shorter. For example: if (starts_with(arg, "--foo=")) { bar = arg + 6; continue; } could become: if (skip_prefix(arg, "--foo=", &bar)) continue; However, I have left it as: if (skip_prefix(arg, "--foo=", &v)) { bar = v; continue; } to visually match nearby cases which need to actually process the string. Like: if (skip_prefix(arg, "--foo=", &v)) { bar = atoi(v); continue; } Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-18 21:47:50 +02:00			`const char *svc_name;`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`struct rpc_service *svc = NULL;`
			`int i;`

use skip_prefix to avoid magic numbers It's a common idiom to match a prefix and then skip past it with a magic number, like: if (starts_with(foo, "bar")) foo += 3; This is easy to get wrong, since you have to count the prefix string yourself, and there's no compiler check if the string changes. We can use skip_prefix to avoid the magic numbers here. Note that some of these conversions could be much shorter. For example: if (starts_with(arg, "--foo=")) { bar = arg + 6; continue; } could become: if (skip_prefix(arg, "--foo=", &bar)) continue; However, I have left it as: if (skip_prefix(arg, "--foo=", &v)) { bar = v; continue; } to visually match nearby cases which need to actually process the string. Like: if (skip_prefix(arg, "--foo=", &v)) { bar = atoi(v); continue; } Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-18 21:47:50 +02:00			`if (!skip_prefix(name, "git-", &svc_name))`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`forbidden("Unsupported service: '%s'", name);`

			`for (i = 0; i < ARRAY_SIZE(rpc_service); i++) {`
			`struct rpc_service *s = &rpc_service[i];`
use skip_prefix to avoid magic numbers It's a common idiom to match a prefix and then skip past it with a magic number, like: if (starts_with(foo, "bar")) foo += 3; This is easy to get wrong, since you have to count the prefix string yourself, and there's no compiler check if the string changes. We can use skip_prefix to avoid the magic numbers here. Note that some of these conversions could be much shorter. For example: if (starts_with(arg, "--foo=")) { bar = arg + 6; continue; } could become: if (skip_prefix(arg, "--foo=", &bar)) continue; However, I have left it as: if (skip_prefix(arg, "--foo=", &v)) { bar = v; continue; } to visually match nearby cases which need to actually process the string. Like: if (skip_prefix(arg, "--foo=", &v)) { bar = atoi(v); continue; } Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-18 21:47:50 +02:00			`if (!strcmp(s->name, svc_name)) {`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`svc = s;`
			`break;`
			`}`
			`}`

			`if (!svc)`
			`forbidden("Unsupported service: '%s'", name);`

			`if (svc->enabled < 0) {`
			`const char *user = getenv("REMOTE_USER");`
			`svc->enabled = (user && *user) ? 1 : 0;`
			`}`
			`if (!svc->enabled)`
			`forbidden("Service not enabled: '%s'", svc->name);`
			`return svc;`
			`}`

http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`/*`
			`* This is basically strbuf_read(), except that if we`
			`* hit max_request_buffer we die (we'd rather reject a`
			`* maliciously large request than chew up infinite memory).`
			`*/`
			`static ssize_t read_request(int fd, unsigned char **out)`
			`{`
			`size_t len = 0, alloc = 8192;`
			`unsigned char *buf = xmalloc(alloc);`

			`if (max_request_buffer < alloc)`
			`max_request_buffer = alloc;`

			`while (1) {`
			`ssize_t cnt;`

			`cnt = read_in_full(fd, buf + len, alloc - len);`
			`if (cnt < 0) {`
			`free(buf);`
			`return -1;`
			`}`

			`/* partial read from read_in_full means we hit EOF */`
			`len += cnt;`
			`if (len < alloc) {`
			`*out = buf;`
			`return len;`
			`}`

			`/* otherwise, grow and try again (if we can) */`
			`if (alloc == max_request_buffer)`
			`die("request was larger than our maximum size (%lu);"`
			`" try setting GIT_HTTP_MAX_REQUEST_BUFFER",`
			`max_request_buffer);`

			`alloc = alloc_nr(alloc);`
			`if (alloc > max_request_buffer)`
			`alloc = max_request_buffer;`
			`REALLOC_ARRAY(buf, alloc);`
			`}`
			`}`

			`static void inflate_request(const char *prog_name, int out, int buffer_input)`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`{`
zlib: zlib can only process 4GB at a time The size of objects we read from the repository and data we try to put into the repository are represented in "unsigned long", so that on larger architectures we can handle objects that weigh more than 4GB. But the interface defined in zlib.h to communicate with inflate/deflate limits avail_in (how many bytes of input are we calling zlib with) and avail_out (how many bytes of output from zlib are we ready to accept) fields effectively to 4GB by defining their type to be uInt. In many places in our code, we allocate a large buffer (e.g. mmap'ing a large loose object file) and tell zlib its size by assigning the size to avail_in field of the stream, but that will truncate the high octets of the real size. The worst part of this story is that we often pass around z_stream (the state object used by zlib) to keep track of the number of used bytes in input/output buffer by inspecting these two fields, which practically limits our callchain to the same 4GB limit. Wrap z_stream in another structure git_zstream that can express avail_in and avail_out in unsigned long. For now, just die() when the caller gives a size that cannot be given to a single zlib call. In later patches in the series, we would make git_inflate() and git_deflate() internally loop to give callers an illusion that our "improved" version of zlib interface can operate on a buffer larger than 4GB in one go. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 20:52:15 +02:00			`git_zstream stream;`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`unsigned char *full_request = NULL;`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`unsigned char in_buf[8192];`
			`unsigned char out_buf[8192];`
			`unsigned long cnt = 0;`

			`memset(&stream, 0, sizeof(stream));`
zlib: wrap inflateInit2 used to accept only for gzip format http-backend.c uses inflateInit2() to tell the library that it wants to accept only gzip format. Wrap it in a helper function so that readers do not have to wonder what the magic numbers 15 and 16 are for. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 19:45:29 +02:00			`git_inflate_init_gzip_only(&stream);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`while (1) {`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`ssize_t n;`

			`if (buffer_input) {`
			`if (full_request)`
			`n = 0; /* nothing left to read */`
			`else`
			`n = read_request(0, &full_request);`
			`stream.next_in = full_request;`
			`} else {`
			`n = xread(0, in_buf, sizeof(in_buf));`
			`stream.next_in = in_buf;`
			`}`

Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`if (n <= 0)`
			`die("request ended in the middle of the gzip stream");`
			`stream.avail_in = n;`

			`while (0 < stream.avail_in) {`
			`int ret;`

			`stream.next_out = out_buf;`
			`stream.avail_out = sizeof(out_buf);`

zlib: wrap remaining calls to direct inflate/inflateEnd Two callsites in http-backend.c to inflate() and inflateEnd() were not using git_ prefixed versions. After this, running $ find all objects -print \| xargs nm -ugo \| grep inflate shows only zlib.c makes direct calls to zlib for inflate operation, except for a singlecall to inflateInit2 in http-backend.c Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 19:39:27 +02:00			`ret = git_inflate(&stream, Z_NO_FLUSH);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`if (ret != Z_OK && ret != Z_STREAM_END)`
			`die("zlib error inflating request, result %d", ret);`

			`n = stream.total_out - cnt;`
			`if (write_in_full(out, out_buf, n) != n)`
			`die("%s aborted reading request", prog_name);`
			`cnt += n;`

			`if (ret == Z_STREAM_END)`
			`goto done;`
			`}`
			`}`

			`done:`
zlib: wrap remaining calls to direct inflate/inflateEnd Two callsites in http-backend.c to inflate() and inflateEnd() were not using git_ prefixed versions. After this, running $ find all objects -print \| xargs nm -ugo \| grep inflate shows only zlib.c makes direct calls to zlib for inflate operation, except for a singlecall to inflateInit2 in http-backend.c Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 19:39:27 +02:00			`git_inflate_end(&stream);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`close(out);`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`free(full_request);`
			`}`

			`static void copy_request(const char *prog_name, int out)`
			`{`
			`unsigned char *buf;`
			`ssize_t n = read_request(0, &buf);`
			`if (n < 0)`
			`die_errno("error reading request body");`
			`if (write_in_full(out, buf, n) != n)`
			`die("%s aborted reading request", prog_name);`
			`close(out);`
			`free(buf);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`}`

http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`static void run_service(const char **argv, int buffer_input)`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`{`
			`const char *encoding = getenv("HTTP_CONTENT_ENCODING");`
			`const char *user = getenv("REMOTE_USER");`
			`const char *host = getenv("REMOTE_ADDR");`
			`int gzipped_request = 0;`
run-command: introduce CHILD_PROCESS_INIT Most struct child_process variables are cleared using memset first after declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to initialize them statically instead. That's shorter, doesn't require a function call and is slightly more readable (especially given that we already have STRBUF_INIT, ARGV_ARRAY_INIT etc.). Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-19 21:09:35 +02:00			`struct child_process cld = CHILD_PROCESS_INIT;`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`if (encoding && !strcmp(encoding, "gzip"))`
			`gzipped_request = 1;`
			`else if (encoding && !strcmp(encoding, "x-gzip"))`
			`gzipped_request = 1;`

			`if (!user \|\| !*user)`
			`user = "anonymous";`
			`if (!host \|\| !*host)`
			`host = "(none)";`

http-backend: respect existing GIT_COMMITTER_* variables The http-backend program sets default GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL variables based on the REMOTE_USER and REMOTE_ADDR variables provided by the webserver. However, it unconditionally overwrites any existing GIT_COMMITTER variables, which may have been customized by site-specific code in the webserver (or in a script wrapping http-backend). Let's leave those variables intact if they already exist, assuming that any such configuration was intentional. There is a slight chance of a regression if somebody has set GIT_COMMITTER_* for the entire webserver, not intending it to leak through http-backend. We could protect against this by passing the information in alternate variables. However, it seems unlikely that anyone will care about that regression, and there is value in the simplicity of using the common variable names that are used elsewhere in git. While we're tweaking the environment-handling in http-backend, let's switch it to use argv_array to handle the list of variables. That makes the memory management much simpler. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-30 09:01:30 +02:00			`if (!getenv("GIT_COMMITTER_NAME"))`
use env_array member of struct child_process Convert users of struct child_process to using the managed env_array for specifying environment variables instead of supplying an array on the stack or bringing their own argv_array. This shortens and simplifies the code and ensures automatically that the allocated memory is freed after use. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-10-19 13:14:20 +02:00			`argv_array_pushf(&cld.env_array, "GIT_COMMITTER_NAME=%s", user);`
http-backend: respect existing GIT_COMMITTER_* variables The http-backend program sets default GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL variables based on the REMOTE_USER and REMOTE_ADDR variables provided by the webserver. However, it unconditionally overwrites any existing GIT_COMMITTER variables, which may have been customized by site-specific code in the webserver (or in a script wrapping http-backend). Let's leave those variables intact if they already exist, assuming that any such configuration was intentional. There is a slight chance of a regression if somebody has set GIT_COMMITTER_* for the entire webserver, not intending it to leak through http-backend. We could protect against this by passing the information in alternate variables. However, it seems unlikely that anyone will care about that regression, and there is value in the simplicity of using the common variable names that are used elsewhere in git. While we're tweaking the environment-handling in http-backend, let's switch it to use argv_array to handle the list of variables. That makes the memory management much simpler. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-30 09:01:30 +02:00			`if (!getenv("GIT_COMMITTER_EMAIL"))`
use env_array member of struct child_process Convert users of struct child_process to using the managed env_array for specifying environment variables instead of supplying an array on the stack or bringing their own argv_array. This shortens and simplifies the code and ensures automatically that the allocated memory is freed after use. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-10-19 13:14:20 +02:00			`argv_array_pushf(&cld.env_array,`
			`"GIT_COMMITTER_EMAIL=%s@http.%s", user, host);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`cld.argv = argv;`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`if (buffer_input \|\| gzipped_request)`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`cld.in = -1;`
			`cld.git_cmd = 1;`
			`if (start_command(&cld))`
			`exit(1);`

			`close(1);`
			`if (gzipped_request)`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`inflate_request(argv[0], cld.in, buffer_input);`
			`else if (buffer_input)`
			`copy_request(argv[0], cld.in);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`else`
			`close(0);`

			`if (finish_command(&cld))`
			`exit(1);`
			`}`

http-backend: rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:55 +02:00			`static int show_text_ref(const char name, const struct object_id oid,`
			`int flag, void *cb_data)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`const char *name_nons = strip_namespace(name);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`struct strbuf *buf = cb_data;`
http-backend: rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:55 +02:00			`struct object *o = parse_object(oid->hash);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`if (!o)`
			`return 0;`

http-backend: rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:55 +02:00			`strbuf_addf(buf, "%s\t%s\n", oid_to_hex(oid), name_nons);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`if (o->type == OBJ_TAG) {`
			`o = deref_tag(o, name, 0);`
			`if (!o)`
			`return 0;`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`strbuf_addf(buf, "%s\t%s^{}\n", sha1_to_hex(o->sha1),`
			`name_nons);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`
			`return 0;`
			`}`

			`static void get_info_refs(char *arg)`
			`{`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`const char *service_name = get_parameter("service");`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`struct strbuf buf = STRBUF_INIT;`

			`hdr_nocache();`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`if (service_name) {`
			`const char argv[] = {NULL / service name */,`
			`"--stateless-rpc", "--advertise-refs",`
			`".", NULL};`
			`struct rpc_service *svc = select_service(service_name);`

			`strbuf_addf(&buf, "application/x-git-%s-advertisement",`
			`svc->name);`
			`hdr_str(content_type, buf.buf);`
			`end_headers();`

			`packet_write(1, "# service=git-%s\n", svc->name);`
			`packet_flush(1);`

			`argv[0] = svc->name;`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`run_service(argv, 0);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`} else {`
http-backend: Use http.getanyfile to disable dumb HTTP serving Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-05 02:16:37 +01:00			`select_getanyfile();`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`for_each_namespaced_ref(show_text_ref, &buf);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`send_strbuf("text/plain", &buf);`
			`}`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`strbuf_release(&buf);`
			`}`

http-backend: rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:55 +02:00			`static int show_head_ref(const char refname, const struct object_id oid,`
			`int flag, void *cb_data)`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`{`
			`struct strbuf *buf = cb_data;`

			`if (flag & REF_ISSYMREF) {`
show_head_ref(): convert local variable "unused" to object_id Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:56 +02:00			`struct object_id unused;`
refs.c: change resolve_ref_unsafe reading argument to be a flags field resolve_ref_unsafe takes a boolean argument for reading (a nonexistent ref resolves successfully for writing but not for reading). Change this to be a flags field instead, and pass the new constant RESOLVE_REF_READING when we want this behaviour. While at it, swap two of the arguments in the function to put output arguments at the end. As a nice side effect, this ensures that we can catch callers that were unaware of the new API so they can be audited. Give the wrapper functions resolve_refdup and read_ref_full the same treatment for consistency. Signed-off-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-07-15 21:59:36 +02:00			`const char *target = resolve_ref_unsafe(refname,`
			`RESOLVE_REF_READING,`
show_head_ref(): convert local variable "unused" to object_id Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:56 +02:00			`unused.hash, NULL);`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`const char *target_nons = strip_namespace(target);`

			`strbuf_addf(buf, "ref: %s\n", target_nons);`
			`} else {`
http-backend: rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:55 +02:00			`strbuf_addf(buf, "%s\n", oid_to_hex(oid));`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`}`

			`return 0;`
			`}`

			`static void get_head(char *arg)`
			`{`
			`struct strbuf buf = STRBUF_INIT;`

			`select_getanyfile();`
			`head_ref_namespaced(show_head_ref, &buf);`
			`send_strbuf("text/plain", &buf);`
			`strbuf_release(&buf);`
			`}`

Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`static void get_info_packs(char *arg)`
			`{`
			`size_t objdirlen = strlen(get_object_directory());`
			`struct strbuf buf = STRBUF_INIT;`
			`struct packed_git *p;`
			`size_t cnt = 0;`

http-backend: Use http.getanyfile to disable dumb HTTP serving Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-05 02:16:37 +01:00			`select_getanyfile();`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`prepare_packed_git();`
			`for (p = packed_git; p; p = p->next) {`
			`if (p->pack_local)`
			`cnt++;`
			`}`

			`strbuf_grow(&buf, cnt * 53 + 2);`
			`for (p = packed_git; p; p = p->next) {`
			`if (p->pack_local)`
			`strbuf_addf(&buf, "P %s\n", p->pack_name + objdirlen + 6);`
			`}`
			`strbuf_addch(&buf, '\n');`

			`hdr_nocache();`
			`send_strbuf("text/plain; charset=utf-8", &buf);`
			`strbuf_release(&buf);`
			`}`

Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`static void check_content_type(const char *accepted_type)`
			`{`
			`const char *actual_type = getenv("CONTENT_TYPE");`

			`if (!actual_type)`
			`actual_type = "";`

			`if (strcmp(actual_type, accepted_type)) {`
			`http_status(415, "Unsupported Media Type");`
			`hdr_nocache();`
			`end_headers();`
			`format_write(1,`
			`"Expected POST with Content-Type '%s',"`
			`" but received '%s' instead.\n",`
			`accepted_type, actual_type);`
			`exit(0);`
			`}`
			`}`

			`static void service_rpc(char *service_name)`
			`{`
			`const char *argv[] = {NULL, "--stateless-rpc", ".", NULL};`
			`struct rpc_service *svc = select_service(service_name);`
			`struct strbuf buf = STRBUF_INIT;`

			`strbuf_reset(&buf);`
			`strbuf_addf(&buf, "application/x-git-%s-request", svc->name);`
			`check_content_type(buf.buf);`

			`hdr_nocache();`

			`strbuf_reset(&buf);`
			`strbuf_addf(&buf, "application/x-git-%s-result", svc->name);`
			`hdr_str(content_type, buf.buf);`

			`end_headers();`

			`argv[0] = svc->name;`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`run_service(argv, svc->buffer_input);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`strbuf_release(&buf);`
			`}`

http-backend: fix die recursion with custom handler When we die() in http-backend, we call a custom handler that writes an HTTP 500 response to stdout, then reports the error to stderr. Our routines for writing out the HTTP response may themselves die, leading to us entering die() again. When it was originally written, that was OK; our custom handler keeps a variable to notice this and does not recurse. However, since cd163d4 (usage.c: detect recursion in die routines and bail out immediately, 2012-11-14), the main die() implementation detects recursion before we even get to our custom handler, and bails without printing anything useful. We can handle this case by doing two things: 1. Installing a custom die_is_recursing handler that allows us to enter up to one level of recursion. Only the first call to our custom handler will try to write out the error response. So if we die again, that is OK. If we end up dying more than that, it is a sign that we are in an infinite recursion. 2. Reporting the error to stderr before trying to write out the HTTP response. In the current code, if we do die() trying to write out the response, we'll exit immediately from this second die(), and never get a chance to output the original error (which is almost certainly the more interesting one; the second die is just going to be along the lines of "I tried to write to stdout but it was closed"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-15 08:29:27 +02:00			`static int dead;`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`static NORETURN void die_webcgi(const char *err, va_list params)`
			`{`
http-backend: fix die recursion with custom handler When we die() in http-backend, we call a custom handler that writes an HTTP 500 response to stdout, then reports the error to stderr. Our routines for writing out the HTTP response may themselves die, leading to us entering die() again. When it was originally written, that was OK; our custom handler keeps a variable to notice this and does not recurse. However, since cd163d4 (usage.c: detect recursion in die routines and bail out immediately, 2012-11-14), the main die() implementation detects recursion before we even get to our custom handler, and bails without printing anything useful. We can handle this case by doing two things: 1. Installing a custom die_is_recursing handler that allows us to enter up to one level of recursion. Only the first call to our custom handler will try to write out the error response. So if we die again, that is OK. If we end up dying more than that, it is a sign that we are in an infinite recursion. 2. Reporting the error to stderr before trying to write out the HTTP response. In the current code, if we do die() trying to write out the response, we'll exit immediately from this second die(), and never get a chance to output the original error (which is almost certainly the more interesting one; the second die is just going to be along the lines of "I tried to write to stdout but it was closed"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-15 08:29:27 +02:00			`if (dead <= 1) {`
			`vreportf("fatal: ", err, params);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
http-backend: Don't infinite loop during die() If stdout has already been closed by the CGI and die() gets called, the CGI will fail to write the "Status: 500 Internal Server Error" to the pipe, which results in die() being called again (via safe_write). This goes on in an infinite loop until the stack overflows and the process is killed by SIGSEGV. Instead set a flag on the first die() invocation and if we came back to the handler, just die silently, as it only means we failed to report the failure---we cannot report anything anyway in such a case. This way failures to write the error messages to the stdout pipe do not result in an infinite loop. We also now report on the death to stderr before we report to stdout, to increase the chances that the cause of the die() invocation will appear in the server's error log. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> fixup! http-backend.c: Don't infinite loop Now die_webcgi() actually can return during a recursive call into it, causing http-backend.c:554: error: 'noreturn' function does return The only reason we would come back to the die handler is because we failed during it, so we cannot report anything anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-03-22 15:22:04 +01:00			`http_status(500, "Internal Server Error");`
			`hdr_nocache();`
			`end_headers();`
			`}`
			`exit(0); /* we successfully reported a failure ;-) */`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: fix die recursion with custom handler When we die() in http-backend, we call a custom handler that writes an HTTP 500 response to stdout, then reports the error to stderr. Our routines for writing out the HTTP response may themselves die, leading to us entering die() again. When it was originally written, that was OK; our custom handler keeps a variable to notice this and does not recurse. However, since cd163d4 (usage.c: detect recursion in die routines and bail out immediately, 2012-11-14), the main die() implementation detects recursion before we even get to our custom handler, and bails without printing anything useful. We can handle this case by doing two things: 1. Installing a custom die_is_recursing handler that allows us to enter up to one level of recursion. Only the first call to our custom handler will try to write out the error response. So if we die again, that is OK. If we end up dying more than that, it is a sign that we are in an infinite recursion. 2. Reporting the error to stderr before trying to write out the HTTP response. In the current code, if we do die() trying to write out the response, we'll exit immediately from this second die(), and never get a chance to output the original error (which is almost certainly the more interesting one; the second die is just going to be along the lines of "I tried to write to stdout but it was closed"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-15 08:29:27 +02:00			`static int die_webcgi_recursing(void)`
			`{`
			`return dead++ > 1;`
			`}`

http-backend: add GIT_PROJECT_ROOT environment var Add a new environment variable, GIT_PROJECT_ROOT, to override the method of using PATH_TRANSLATED to find the git repository on disk. This makes it much easier to configure the web server, especially when the web server's DocumentRoot does not contain the git repositories, which is the usual case. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:35 +01:00			`static char* getdir(void)`
			`{`
			`struct strbuf buf = STRBUF_INIT;`
			`char *pathinfo = getenv("PATH_INFO");`
			`char *root = getenv("GIT_PROJECT_ROOT");`
			`char *path = getenv("PATH_TRANSLATED");`

			`if (root && *root) {`
			`if (!pathinfo \|\| !*pathinfo)`
			`die("GIT_PROJECT_ROOT is set but PATH_INFO is not");`
http-backend: Protect GIT_PROJECT_ROOT from /../ requests Eons ago HPA taught git-daemon how to protect itself from /../ attacks, which Junio brought back into service in d79374c7b58d ("daemon.c and path.enter_repo(): revamp path validation"). I did not carry this into git-http-backend as originally we relied only upon PATH_TRANSLATED, and assumed the HTTP server had done its access control checks to validate the resolved path was within a directory permitting access from the remote client. This would usually be sufficient to protect a server from requests for its /etc/passwd file by http://host/smart/../etc/passwd sorts of URLs. However in 917adc036086 Mark Lodato added GIT_PROJECT_ROOT as an additional method of configuring the CGI. When this environment variable is used the web server does not generate the final access path and therefore may blindly pass through "/../etc/passwd" in PATH_INFO under the assumption that "/../" might have special meaning to the invoked CGI. Instead of permitting these sorts of malformed path requests, we now reject them back at the client, with an error message for the server log. This matches git-daemon behavior. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-09 20:26:43 +01:00			`if (daemon_avoid_alias(pathinfo))`
			`die("'%s': aliased", pathinfo);`
http-backend: use end_url_with_slash() Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-25 09:21:06 +01:00			`end_url_with_slash(&buf, root);`
http-backend: Protect GIT_PROJECT_ROOT from /../ requests Eons ago HPA taught git-daemon how to protect itself from /../ attacks, which Junio brought back into service in d79374c7b58d ("daemon.c and path.enter_repo(): revamp path validation"). I did not carry this into git-http-backend as originally we relied only upon PATH_TRANSLATED, and assumed the HTTP server had done its access control checks to validate the resolved path was within a directory permitting access from the remote client. This would usually be sufficient to protect a server from requests for its /etc/passwd file by http://host/smart/../etc/passwd sorts of URLs. However in 917adc036086 Mark Lodato added GIT_PROJECT_ROOT as an additional method of configuring the CGI. When this environment variable is used the web server does not generate the final access path and therefore may blindly pass through "/../etc/passwd" in PATH_INFO under the assumption that "/../" might have special meaning to the invoked CGI. Instead of permitting these sorts of malformed path requests, we now reject them back at the client, with an error message for the server log. This matches git-daemon behavior. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-09 20:26:43 +01:00			`if (pathinfo[0] == '/')`
			`pathinfo++;`
http-backend: add GIT_PROJECT_ROOT environment var Add a new environment variable, GIT_PROJECT_ROOT, to override the method of using PATH_TRANSLATED to find the git repository on disk. This makes it much easier to configure the web server, especially when the web server's DocumentRoot does not contain the git repositories, which is the usual case. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:35 +01:00			`strbuf_addstr(&buf, pathinfo);`
			`return strbuf_detach(&buf, NULL);`
			`} else if (path && *path) {`
			`return xstrdup(path);`
			`} else`
			`die("No GIT_PROJECT_ROOT or PATH_TRANSLATED from server");`
			`return NULL;`
			`}`

Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`static struct service_cmd {`
			`const char *method;`
			`const char *pattern;`
			`void (imp)(char );`
			`} services[] = {`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`{"GET", "/HEAD$", get_head},`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{"GET", "/info/refs$", get_info_refs},`
			`{"GET", "/objects/info/alternates$", get_text_file},`
			`{"GET", "/objects/info/http-alternates$", get_text_file},`
			`{"GET", "/objects/info/packs$", get_info_packs},`
			`{"GET", "/objects/[0-9a-f]{2}/[0-9a-f]{38}$", get_loose_object},`
			`{"GET", "/objects/pack/pack-[0-9a-f]{40}\\.pack$", get_pack_file},`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`{"GET", "/objects/pack/pack-[0-9a-f]{40}\\.idx$", get_idx_file},`

			`{"POST", "/git-upload-pack$", service_rpc},`
			`{"POST", "/git-receive-pack$", service_rpc}`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`};`

			`int main(int argc, char **argv)`
			`{`
			`char *method = getenv("REQUEST_METHOD");`
http-backend: add GIT_PROJECT_ROOT environment var Add a new environment variable, GIT_PROJECT_ROOT, to override the method of using PATH_TRANSLATED to find the git repository on disk. This makes it much easier to configure the web server, especially when the web server's DocumentRoot does not contain the git repositories, which is the usual case. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:35 +01:00			`char *dir;`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`struct service_cmd *cmd = NULL;`
			`char *cmd_arg = NULL;`
			`int i;`

i18n: add infrastructure for translating Git with gettext Change the skeleton implementation of i18n in Git to one that can show localized strings to users for our C, Shell and Perl programs using either GNU libintl or the Solaris gettext implementation. This new internationalization support is enabled by default. If gettext isn't available, or if Git is compiled with NO_GETTEXT=YesPlease, Git falls back on its current behavior of showing interface messages in English. When using the autoconf script we'll auto-detect if the gettext libraries are installed and act appropriately. This change is somewhat large because as well as adding a C, Shell and Perl i18n interface we're adding a lot of tests for them, and for those tests to work we need a skeleton PO file to actually test translations. A minimal Icelandic translation is included for this purpose. Icelandic includes multi-byte characters which makes it easy to test various edge cases, and it's a language I happen to understand. The rest of the commit message goes into detail about various sub-parts of this commit. = Installation Gettext .mo files will be installed and looked for in the standard $(prefix)/share/locale path. GIT_TEXTDOMAINDIR can also be set to override that, but that's only intended to be used to test Git itself. = Perl Perl code that's to be localized should use the new Git::I18n module. It imports a __ function into the caller's package by default. Instead of using the high level Locale::TextDomain interface I've opted to use the low-level (equivalent to the C interface) Locale::Messages module, which Locale::TextDomain itself uses. Locale::TextDomain does a lot of redundant work we don't need, and some of it would potentially introduce bugs. It tries to set the $TEXTDOMAIN based on package of the caller, and has its own hardcoded paths where it'll search for messages. I found it easier just to completely avoid it rather than try to circumvent its behavior. In any case, this is an issue wholly internal Git::I18N. Its guts can be changed later if that's deemed necessary. See <AANLkTilYD_NyIZMyj9dHtVk-ylVBfvyxpCC7982LWnVd@mail.gmail.com> for a further elaboration on this topic. = Shell Shell code that's to be localized should use the git-sh-i18n library. It's basically just a wrapper for the system's gettext.sh. If gettext.sh isn't available we'll fall back on gettext(1) if it's available. The latter is available without the former on Solaris, which has its own non-GNU gettext implementation. We also need to emulate eval_gettext() there. If neither are present we'll use a dumb printf(1) fall-through wrapper. = About libcharset.h and langinfo.h We use libcharset to query the character set of the current locale if it's available. I.e. we'll use it instead of nl_langinfo if HAVE_LIBCHARSET_H is set. The GNU gettext manual recommends using langinfo.h's nl_langinfo(CODESET) to acquire the current character set, but on systems that have libcharset.h's locale_charset() using the latter is either saner, or the only option on those systems. GNU and Solaris have a nl_langinfo(CODESET), FreeBSD can use either, but MinGW and some others need to use libcharset.h's locale_charset() instead. =Credits This patch is based on work by Jeff Epler <jepler@unpythonic.net> who did the initial Makefile / C work, and a lot of comments from the Git mailing list, including Jonathan Nieder, Jakub Narebski, Johannes Sixt, Erik Faye-Lund, Peter Krefting, Junio C Hamano, Thomas Rast and others. [jc: squashed a small Makefile fix from Ramsay] Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-11-18 00:14:42 +01:00			`git_setup_gettext();`

Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`git_extract_argv0_path(argv[0]);`
			`set_die_routine(die_webcgi);`
http-backend: fix die recursion with custom handler When we die() in http-backend, we call a custom handler that writes an HTTP 500 response to stdout, then reports the error to stderr. Our routines for writing out the HTTP response may themselves die, leading to us entering die() again. When it was originally written, that was OK; our custom handler keeps a variable to notice this and does not recurse. However, since cd163d4 (usage.c: detect recursion in die routines and bail out immediately, 2012-11-14), the main die() implementation detects recursion before we even get to our custom handler, and bails without printing anything useful. We can handle this case by doing two things: 1. Installing a custom die_is_recursing handler that allows us to enter up to one level of recursion. Only the first call to our custom handler will try to write out the error response. So if we die again, that is OK. If we end up dying more than that, it is a sign that we are in an infinite recursion. 2. Reporting the error to stderr before trying to write out the HTTP response. In the current code, if we do die() trying to write out the response, we'll exit immediately from this second die(), and never get a chance to output the original error (which is almost certainly the more interesting one; the second die is just going to be along the lines of "I tried to write to stdout but it was closed"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-15 08:29:27 +02:00			`set_die_is_recursing_routine(die_webcgi_recursing);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
			`if (!method)`
			`die("No REQUEST_METHOD from server");`
			`if (!strcmp(method, "HEAD"))`
			`method = "GET";`
http-backend: add GIT_PROJECT_ROOT environment var Add a new environment variable, GIT_PROJECT_ROOT, to override the method of using PATH_TRANSLATED to find the git repository on disk. This makes it much easier to configure the web server, especially when the web server's DocumentRoot does not contain the git repositories, which is the usual case. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:35 +01:00			`dir = getdir();`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
			`for (i = 0; i < ARRAY_SIZE(services); i++) {`
			`struct service_cmd *c = &services[i];`
			`regex_t re;`
			`regmatch_t out[1];`

			`if (regcomp(&re, c->pattern, REG_EXTENDED))`
			`die("Bogus regex in service table: %s", c->pattern);`
			`if (!regexec(&re, dir, 1, out, 0)) {`
http-backend: Fix access beyond end of string. Found with valgrind while looking for Content-Length corruption in smart http. Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-14 22:10:57 +01:00			`size_t n;`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
			`if (strcmp(method, c->method)) {`
			`const char *proto = getenv("SERVER_PROTOCOL");`
http-backend: provide Allow header for 405 The HTTP 1.1 standard requires an Allow header for 405 Method Not Allowed: The response MUST include an Allow header containing a list of valid methods for the requested resource. So provide such a header when we return a 405 to the user agent. Signed-off-by: Brian M. Carlson <sandals@crustytoothpaste.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-09-12 02:30:01 +02:00			`if (proto && !strcmp(proto, "HTTP/1.1")) {`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`http_status(405, "Method Not Allowed");`
http-backend: provide Allow header for 405 The HTTP 1.1 standard requires an Allow header for 405 Method Not Allowed: The response MUST include an Allow header containing a list of valid methods for the requested resource. So provide such a header when we return a 405 to the user agent. Signed-off-by: Brian M. Carlson <sandals@crustytoothpaste.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-09-12 02:30:01 +02:00			`hdr_str("Allow", !strcmp(c->method, "GET") ?`
			`"GET, HEAD" : c->method);`
			`} else`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`http_status(400, "Bad Request");`
			`hdr_nocache();`
			`end_headers();`
			`return 0;`
			`}`

			`cmd = c;`
http-backend: Fix access beyond end of string. Found with valgrind while looking for Content-Length corruption in smart http. Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-14 22:10:57 +01:00			`n = out[0].rm_eo - out[0].rm_so;`
use xmemdupz() to allocate copies of strings given by start and length Use xmemdupz() to allocate the memory, copy the data and make sure to NUL-terminate the result, all in one step. The resulting code is shorter, doesn't contain the constants 1 and '\0', and avoids duplicating function parameters. For blame, the last copied byte (o->file.ptr[o->file.size]) is always set to NUL by fake_working_tree_commit() or read_sha1_file(), so no information is lost by the conversion to using xmemdupz(). Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-07-19 17:35:34 +02:00			`cmd_arg = xmemdupz(dir + out[0].rm_so + 1, n - 1);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`dir[out[0].rm_so] = 0;`
			`break;`
			`}`
			`regfree(&re);`
			`}`

			`if (!cmd)`
			`not_found("Request not supported: '%s'", dir);`

			`setup_path();`
			`if (!enter_repo(dir, 0))`
			`not_found("Not a git repository: '%s'", dir);`
Smart-http: check if repository is OK to export before serving it Similar to how git-daemon checks whether a repository is OK to be exported, smart-http should also check. This check can be satisfied in two different ways: the environmental variable GIT_HTTP_EXPORT_ALL may be set to export all repositories, or the individual repository may have the file git-daemon-export-ok. Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-28 22:49:00 +01:00			`if (!getenv("GIT_HTTP_EXPORT_ALL") &&`
			`access("git-daemon-export-ok", F_OK) )`
			`not_found("Repository not exported: '%s'", dir);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
http-backend.c: replace `git_config()` with `git_config_get_bool()` family Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-07 18:21:17 +02:00			`http_config();`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`max_request_buffer = git_env_ulong("GIT_HTTP_MAX_REQUEST_BUFFER",`
			`max_request_buffer);`

Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`cmd->imp(cmd_arg);`
			`return 0;`
			`}`