mirrors/git - Incest Forge: Beyond sex. We incest.

mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-01 14:57:52 +01:00

713 lines

17 KiB

C

Raw Normal View History

Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`#include "cache.h"`
			`#include "refs.h"`
			`#include "pkt-line.h"`
			`#include "object.h"`
			`#include "tag.h"`
			`#include "exec_cmd.h"`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`#include "run-command.h"`
			`#include "string-list.h"`
make url-related functions reusable The is_url function and url percent-decoding functions were static, but are generally useful. Let's make them available to other parts of the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-23 11:17:55 +02:00			`#include "url.h"`
http-backend: respect existing GIT_COMMITTER_* variables The http-backend program sets default GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL variables based on the REMOTE_USER and REMOTE_ADDR variables provided by the webserver. However, it unconditionally overwrites any existing GIT_COMMITTER variables, which may have been customized by site-specific code in the webserver (or in a script wrapping http-backend). Let's leave those variables intact if they already exist, assuming that any such configuration was intentional. There is a slight chance of a regression if somebody has set GIT_COMMITTER_* for the entire webserver, not intending it to leak through http-backend. We could protect against this by passing the information in alternate variables. However, it seems unlikely that anyone will care about that regression, and there is value in the simplicity of using the common variable names that are used elsewhere in git. While we're tweaking the environment-handling in http-backend, let's switch it to use argv_array to handle the list of variables. That makes the memory management much simpler. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-30 09:01:30 +02:00			`#include "argv-array.h"`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
			`static const char content_type[] = "Content-Type";`
			`static const char content_length[] = "Content-Length";`
			`static const char last_modified[] = "Last-Modified";`
http-backend: Use http.getanyfile to disable dumb HTTP serving Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-05 02:16:37 +01:00			`static int getanyfile = 1;`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`static unsigned long max_request_buffer = 10 * 1024 * 1024;`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`static struct string_list *query_params;`

			`struct rpc_service {`
			`const char *name;`
			`const char *config_name;`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`unsigned buffer_input : 1;`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`signed enabled : 2;`
			`};`

			`static struct rpc_service rpc_service[] = {`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`{ "upload-pack", "uploadpack", 1, 1 },`
			`{ "receive-pack", "receivepack", 0, -1 },`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`};`

			`static struct string_list *get_parameters(void)`
			`{`
			`if (!query_params) {`
			`const char *query = getenv("QUERY_STRING");`

			`query_params = xcalloc(1, sizeof(*query_params));`
			`while (query && *query) {`
make url-related functions reusable The is_url function and url percent-decoding functions were static, but are generally useful. Let's make them available to other parts of the code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-05-23 11:17:55 +02:00			`char *name = url_decode_parameter_name(&query);`
			`char *value = url_decode_parameter_value(&query);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`struct string_list_item *i;`

string_list: Fix argument order for string_list_lookup Update the definition and callers of string_list_lookup to use the string_list as the first argument. This helps make the string_list API easier to use by being more consistent. Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-26 01:41:37 +02:00			`i = string_list_lookup(query_params, name);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`if (!i)`
string_list: Fix argument order for string_list_insert Update the definition and callers of string_list_insert to use the string_list as the first argument. This helps make the string_list API easier to use by being more consistent. Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-26 01:41:35 +02:00			`i = string_list_insert(query_params, name);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`else`
			`free(i->util);`
			`i->util = value;`
			`}`
			`}`
			`return query_params;`
			`}`

			`static const char get_parameter(const char name)`
			`{`
			`struct string_list_item *i;`
string_list: Fix argument order for string_list_lookup Update the definition and callers of string_list_lookup to use the string_list as the first argument. This helps make the string_list API easier to use by being more consistent. Signed-off-by: Julian Phillips <julian@quantumfyre.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-06-26 01:41:37 +02:00			`i = string_list_lookup(get_parameters(), name);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`return i ? i->util : NULL;`
			`}`

http-backend: Let gcc check the format of more printf-type functions. We already have these checks in many printf-type functions that have prototypes which are in header files. Add these same checks to static functions in http-backend.c Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com> Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-14 22:10:58 +01:00			`__attribute__((format (printf, 2, 3)))`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`static void format_write(int fd, const char *fmt, ...)`
			`{`
			`static char buffer[1024];`

			`va_list args;`
			`unsigned n;`

			`va_start(args, fmt);`
			`n = vsnprintf(buffer, sizeof(buffer), fmt, args);`
			`va_end(args);`
			`if (n >= sizeof(buffer))`
			`die("protocol error: impossibly long line");`

pkt-line: drop safe_write function This is just write_or_die by another name. The one distinction is that write_or_die will treat EPIPE specially by suppressing error messages. That's fine, as we die by SIGPIPE anyway (and in the off chance that it is disabled, write_or_die will simulate it). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-02-20 21:01:56 +01:00			`write_or_die(fd, buffer, n);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void http_status(struct strbuf hdr, unsigned code, const char msg)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`strbuf_addf(hdr, "Status: %u %s\r\n", code, msg);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void hdr_str(struct strbuf hdr, const char name, const char *value)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`strbuf_addf(hdr, "%s: %s\r\n", name, value);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void hdr_int(struct strbuf hdr, const char name, uintmax_t value)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`strbuf_addf(hdr, "%s: %" PRIuMAX "\r\n", name, value);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

timestamp_t: a new data type for timestamps Git's source code assumes that unsigned long is at least as precise as time_t. Which is incorrect, and causes a lot of problems, in particular where unsigned long is only 32-bit (notably on Windows, even in 64-bit versions). So let's just use a more appropriate data type instead. In preparation for this, we introduce the new `timestamp_t` data type. By necessity, this is a very, very large patch, as it has to replace all timestamps' data type in one go. As we will use a data type that is not necessarily identical to `time_t`, we need to be very careful to use `time_t` whenever we interact with the system functions, and `timestamp_t` everywhere else. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-04-26 21:29:31 +02:00			`static void hdr_date(struct strbuf hdr, const char name, timestamp_t when)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
convert "enum date_mode" into a struct In preparation for adding date modes that may carry extra information beyond the mode itself, this patch converts the date_mode enum into a struct. Most of the conversion is fairly straightforward; we pass the struct as a pointer and dereference the type field where necessary. Locations that declare a date_mode can use a "{}" constructor. However, the tricky case is where we use the enum labels as constants, like: show_date(t, tz, DATE_NORMAL); Ideally we could say: show_date(t, tz, &{ DATE_NORMAL }); but of course C does not allow that. Likewise, we cannot cast the constant to a struct, because we need to pass an actual address. Our options are basically: 1. Manually add a "struct date_mode d = { DATE_NORMAL }" definition to each caller, and pass "&d". This makes the callers uglier, because they sometimes do not even have their own scope (e.g., they are inside a switch statement). 2. Provide a pre-made global "date_normal" struct that can be passed by address. We'd also need "date_rfc2822", "date_iso8601", and so forth. But at least the ugliness is defined in one place. 3. Provide a wrapper that generates the correct struct on the fly. The big downside is that we end up pointing to a single global, which makes our wrapper non-reentrant. But show_date is already not reentrant, so it does not matter. This patch implements 3, along with a minor macro to keep the size of the callers sane. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-06-25 18:55:02 +02:00			`const char *value = show_date(when, 0, DATE_MODE(RFC2822));`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`hdr_str(hdr, name, value);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void hdr_nocache(struct strbuf *hdr)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`hdr_str(hdr, "Expires", "Fri, 01 Jan 1980 00:00:00 GMT");`
			`hdr_str(hdr, "Pragma", "no-cache");`
			`hdr_str(hdr, "Cache-Control", "no-cache, max-age=0, must-revalidate");`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void hdr_cache_forever(struct strbuf *hdr)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
timestamp_t: a new data type for timestamps Git's source code assumes that unsigned long is at least as precise as time_t. Which is incorrect, and causes a lot of problems, in particular where unsigned long is only 32-bit (notably on Windows, even in 64-bit versions). So let's just use a more appropriate data type instead. In preparation for this, we introduce the new `timestamp_t` data type. By necessity, this is a very, very large patch, as it has to replace all timestamps' data type in one go. As we will use a data type that is not necessarily identical to `time_t`, we need to be very careful to use `time_t` whenever we interact with the system functions, and `timestamp_t` everywhere else. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-04-26 21:29:31 +02:00			`timestamp_t now = time(NULL);`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`hdr_date(hdr, "Date", now);`
			`hdr_date(hdr, "Expires", now + 31536000);`
			`hdr_str(hdr, "Cache-Control", "public, max-age=31536000");`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void end_headers(struct strbuf *hdr)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`strbuf_add(hdr, "\r\n", 2);`
			`write_or_die(1, hdr->buf, hdr->len);`
			`strbuf_release(hdr);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`__attribute__((format (printf, 2, 3)))`
			`static NORETURN void not_found(struct strbuf hdr, const char err, ...)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
			`va_list params;`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`http_status(hdr, 404, "Not Found");`
			`hdr_nocache(hdr);`
			`end_headers(hdr);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
			`va_start(params, err);`
			`if (err && *err)`
			`vfprintf(stderr, err, params);`
			`va_end(params);`
			`exit(0);`
			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`__attribute__((format (printf, 2, 3)))`
			`static NORETURN void forbidden(struct strbuf hdr, const char err, ...)`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`{`
			`va_list params;`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`http_status(hdr, 403, "Forbidden");`
			`hdr_nocache(hdr);`
			`end_headers(hdr);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`va_start(params, err);`
			`if (err && *err)`
			`vfprintf(stderr, err, params);`
			`va_end(params);`
			`exit(0);`
			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void select_getanyfile(struct strbuf *hdr)`
http-backend: Use http.getanyfile to disable dumb HTTP serving Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-05 02:16:37 +01:00			`{`
			`if (!getanyfile)`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`forbidden(hdr, "Unsupported service: getanyfile");`
http-backend: Use http.getanyfile to disable dumb HTTP serving Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-05 02:16:37 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void send_strbuf(struct strbuf *hdr,`
			`const char type, struct strbuf buf)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`hdr_int(hdr, content_length, buf->len);`
			`hdr_str(hdr, content_type, type);`
			`end_headers(hdr);`
pkt-line: drop safe_write function This is just write_or_die by another name. The one distinction is that write_or_die will treat EPIPE specially by suppressing error messages. That's fine, as we die by SIGPIPE anyway (and in the off chance that it is disabled, write_or_die will simulate it). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-02-20 21:01:56 +01:00			`write_or_die(1, buf->buf, buf->len);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void send_local_file(struct strbuf hdr, const char the_type,`
			`const char *name)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
prefer git_pathdup to git_path in some possibly-dangerous cases Because git_path uses a static buffer that is shared with calls to git_path, mkpath, etc, it can be dangerous to assign the result to a variable or pass it to a non-trivial function. The value may change unexpectedly due to other calls. None of the cases changed here has a known bug, but they're worth converting away from git_path because: 1. It's easy to use git_pathdup in these cases. 2. They use constructs (like assignment) that make it hard to tell whether they're safe or not. The extra malloc overhead should be trivial, as an allocation should be an order of magnitude cheaper than a system call (which we are clearly about to make, since we are constructing a filename). The real cost is that we must remember to free the result. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-08-10 11:35:31 +02:00			`char *p = git_pathdup("%s", name);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`size_t buf_alloc = 8192;`
			`char *buf = xmalloc(buf_alloc);`
			`int fd;`
			`struct stat sb;`

			`fd = open(p, O_RDONLY);`
			`if (fd < 0)`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`not_found(hdr, "Cannot open '%s': %s", p, strerror(errno));`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`if (fstat(fd, &sb) < 0)`
			`die_errno("Cannot stat '%s'", p);`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`hdr_int(hdr, content_length, sb.st_size);`
			`hdr_str(hdr, content_type, the_type);`
			`hdr_date(hdr, last_modified, sb.st_mtime);`
			`end_headers(hdr);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
http-backend: Fix bad treatment of uintmax_t in Content-Length Our Content-Length needs to report an off_t, which could be larger precision than size_t on this system (e.g. 32 bit binary built with 64 bit large file support). We also shouldn't be passing a size_t parameter to printf when we've used PRIuMAX as the format specifier. Fix both issues by using uintmax_t for the hdr_int() routine, allowing strbuf's size_t to automatically upcast, and off_t to always fit. Also fixed the copy loop we use inside of send_local_file(), we never actually updated the size variable so we might as well not use it. Reported-by: Tarmigan <tarmigan+git@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-12 05:42:41 +01:00			`for (;;) {`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`ssize_t n = xread(fd, buf, buf_alloc);`
			`if (n < 0)`
			`die_errno("Cannot read '%s'", p);`
			`if (!n)`
			`break;`
pkt-line: drop safe_write function This is just write_or_die by another name. The one distinction is that write_or_die will treat EPIPE specially by suppressing error messages. That's fine, as we die by SIGPIPE anyway (and in the off chance that it is disabled, write_or_die will simulate it). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-02-20 21:01:56 +01:00			`write_or_die(1, buf, n);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`
			`close(fd);`
			`free(buf);`
prefer git_pathdup to git_path in some possibly-dangerous cases Because git_path uses a static buffer that is shared with calls to git_path, mkpath, etc, it can be dangerous to assign the result to a variable or pass it to a non-trivial function. The value may change unexpectedly due to other calls. None of the cases changed here has a known bug, but they're worth converting away from git_path because: 1. It's easy to use git_pathdup in these cases. 2. They use constructs (like assignment) that make it hard to tell whether they're safe or not. The extra malloc overhead should be trivial, as an allocation should be an order of magnitude cheaper than a system call (which we are clearly about to make, since we are constructing a filename). The real cost is that we must remember to free the result. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-08-10 11:35:31 +02:00			`free(p);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void get_text_file(struct strbuf hdr, char name)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`select_getanyfile(hdr);`
			`hdr_nocache(hdr);`
			`send_local_file(hdr, "text/plain", name);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void get_loose_object(struct strbuf hdr, char name)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`select_getanyfile(hdr);`
			`hdr_cache_forever(hdr);`
			`send_local_file(hdr, "application/x-git-loose-object", name);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void get_pack_file(struct strbuf hdr, char name)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`select_getanyfile(hdr);`
			`hdr_cache_forever(hdr);`
			`send_local_file(hdr, "application/x-git-packed-objects", name);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void get_idx_file(struct strbuf hdr, char name)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`select_getanyfile(hdr);`
			`hdr_cache_forever(hdr);`
			`send_local_file(hdr, "application/x-git-packed-objects-toc", name);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend.c: replace `git_config()` with `git_config_get_bool()` family Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-07 18:21:17 +02:00			`static void http_config(void)`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`{`
http-backend.c: replace `git_config()` with `git_config_get_bool()` family Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-07 18:21:17 +02:00			`int i, value = 0;`
			`struct strbuf var = STRBUF_INIT;`
use skip_prefix to avoid magic numbers It's a common idiom to match a prefix and then skip past it with a magic number, like: if (starts_with(foo, "bar")) foo += 3; This is easy to get wrong, since you have to count the prefix string yourself, and there's no compiler check if the string changes. We can use skip_prefix to avoid the magic numbers here. Note that some of these conversions could be much shorter. For example: if (starts_with(arg, "--foo=")) { bar = arg + 6; continue; } could become: if (skip_prefix(arg, "--foo=", &bar)) continue; However, I have left it as: if (skip_prefix(arg, "--foo=", &v)) { bar = v; continue; } to visually match nearby cases which need to actually process the string. Like: if (skip_prefix(arg, "--foo=", &v)) { bar = atoi(v); continue; } Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-18 21:47:50 +02:00
http-backend.c: replace `git_config()` with `git_config_get_bool()` family Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-07 18:21:17 +02:00			`git_config_get_bool("http.getanyfile", &getanyfile);`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`git_config_get_ulong("http.maxrequestbuffer", &max_request_buffer);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
http-backend.c: replace `git_config()` with `git_config_get_bool()` family Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-07 18:21:17 +02:00			`for (i = 0; i < ARRAY_SIZE(rpc_service); i++) {`
			`struct rpc_service *svc = &rpc_service[i];`
			`strbuf_addf(&var, "http.%s", svc->config_name);`
			`if (!git_config_get_bool(var.buf, &value))`
			`svc->enabled = value;`
			`strbuf_reset(&var);`
http-backend: Use http.getanyfile to disable dumb HTTP serving Some repository owners may wish to enable smart HTTP, but disallow dumb content serving. Disallowing dumb serving might be because the owners want to rely upon reachability to control which objects clients may access from the repository, or they just want to encourage clients to use the more bandwidth efficient transport. If http.getanyfile is set to false the backend CGI will return with '403 Forbidden' when an object file is accessed by a dumb client. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-05 02:16:37 +01:00			`}`

http-backend.c: replace `git_config()` with `git_config_get_bool()` family Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-07 18:21:17 +02:00			`strbuf_release(&var);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static struct rpc_service select_service(struct strbuf hdr, const char *name)`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`{`
use skip_prefix to avoid magic numbers It's a common idiom to match a prefix and then skip past it with a magic number, like: if (starts_with(foo, "bar")) foo += 3; This is easy to get wrong, since you have to count the prefix string yourself, and there's no compiler check if the string changes. We can use skip_prefix to avoid the magic numbers here. Note that some of these conversions could be much shorter. For example: if (starts_with(arg, "--foo=")) { bar = arg + 6; continue; } could become: if (skip_prefix(arg, "--foo=", &bar)) continue; However, I have left it as: if (skip_prefix(arg, "--foo=", &v)) { bar = v; continue; } to visually match nearby cases which need to actually process the string. Like: if (skip_prefix(arg, "--foo=", &v)) { bar = atoi(v); continue; } Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-18 21:47:50 +02:00			`const char *svc_name;`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`struct rpc_service *svc = NULL;`
			`int i;`

use skip_prefix to avoid magic numbers It's a common idiom to match a prefix and then skip past it with a magic number, like: if (starts_with(foo, "bar")) foo += 3; This is easy to get wrong, since you have to count the prefix string yourself, and there's no compiler check if the string changes. We can use skip_prefix to avoid the magic numbers here. Note that some of these conversions could be much shorter. For example: if (starts_with(arg, "--foo=")) { bar = arg + 6; continue; } could become: if (skip_prefix(arg, "--foo=", &bar)) continue; However, I have left it as: if (skip_prefix(arg, "--foo=", &v)) { bar = v; continue; } to visually match nearby cases which need to actually process the string. Like: if (skip_prefix(arg, "--foo=", &v)) { bar = atoi(v); continue; } Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-18 21:47:50 +02:00			`if (!skip_prefix(name, "git-", &svc_name))`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`forbidden(hdr, "Unsupported service: '%s'", name);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`for (i = 0; i < ARRAY_SIZE(rpc_service); i++) {`
			`struct rpc_service *s = &rpc_service[i];`
use skip_prefix to avoid magic numbers It's a common idiom to match a prefix and then skip past it with a magic number, like: if (starts_with(foo, "bar")) foo += 3; This is easy to get wrong, since you have to count the prefix string yourself, and there's no compiler check if the string changes. We can use skip_prefix to avoid the magic numbers here. Note that some of these conversions could be much shorter. For example: if (starts_with(arg, "--foo=")) { bar = arg + 6; continue; } could become: if (skip_prefix(arg, "--foo=", &bar)) continue; However, I have left it as: if (skip_prefix(arg, "--foo=", &v)) { bar = v; continue; } to visually match nearby cases which need to actually process the string. Like: if (skip_prefix(arg, "--foo=", &v)) { bar = atoi(v); continue; } Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-06-18 21:47:50 +02:00			`if (!strcmp(s->name, svc_name)) {`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`svc = s;`
			`break;`
			`}`
			`}`

			`if (!svc)`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`forbidden(hdr, "Unsupported service: '%s'", name);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`if (svc->enabled < 0) {`
			`const char *user = getenv("REMOTE_USER");`
			`svc->enabled = (user && *user) ? 1 : 0;`
			`}`
			`if (!svc->enabled)`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`forbidden(hdr, "Service not enabled: '%s'", svc->name);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`return svc;`
			`}`

http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`/*`
			`* This is basically strbuf_read(), except that if we`
			`* hit max_request_buffer we die (we'd rather reject a`
			`* maliciously large request than chew up infinite memory).`
			`*/`
			`static ssize_t read_request(int fd, unsigned char **out)`
			`{`
			`size_t len = 0, alloc = 8192;`
			`unsigned char *buf = xmalloc(alloc);`

			`if (max_request_buffer < alloc)`
			`max_request_buffer = alloc;`

			`while (1) {`
			`ssize_t cnt;`

			`cnt = read_in_full(fd, buf + len, alloc - len);`
			`if (cnt < 0) {`
			`free(buf);`
			`return -1;`
			`}`

			`/* partial read from read_in_full means we hit EOF */`
			`len += cnt;`
			`if (len < alloc) {`
			`*out = buf;`
			`return len;`
			`}`

			`/* otherwise, grow and try again (if we can) */`
			`if (alloc == max_request_buffer)`
			`die("request was larger than our maximum size (%lu);"`
			`" try setting GIT_HTTP_MAX_REQUEST_BUFFER",`
			`max_request_buffer);`

			`alloc = alloc_nr(alloc);`
			`if (alloc > max_request_buffer)`
			`alloc = max_request_buffer;`
			`REALLOC_ARRAY(buf, alloc);`
			`}`
			`}`

			`static void inflate_request(const char *prog_name, int out, int buffer_input)`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`{`
zlib: zlib can only process 4GB at a time The size of objects we read from the repository and data we try to put into the repository are represented in "unsigned long", so that on larger architectures we can handle objects that weigh more than 4GB. But the interface defined in zlib.h to communicate with inflate/deflate limits avail_in (how many bytes of input are we calling zlib with) and avail_out (how many bytes of output from zlib are we ready to accept) fields effectively to 4GB by defining their type to be uInt. In many places in our code, we allocate a large buffer (e.g. mmap'ing a large loose object file) and tell zlib its size by assigning the size to avail_in field of the stream, but that will truncate the high octets of the real size. The worst part of this story is that we often pass around z_stream (the state object used by zlib) to keep track of the number of used bytes in input/output buffer by inspecting these two fields, which practically limits our callchain to the same 4GB limit. Wrap z_stream in another structure git_zstream that can express avail_in and avail_out in unsigned long. For now, just die() when the caller gives a size that cannot be given to a single zlib call. In later patches in the series, we would make git_inflate() and git_deflate() internally loop to give callers an illusion that our "improved" version of zlib interface can operate on a buffer larger than 4GB in one go. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 20:52:15 +02:00			`git_zstream stream;`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`unsigned char *full_request = NULL;`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`unsigned char in_buf[8192];`
			`unsigned char out_buf[8192];`
			`unsigned long cnt = 0;`

			`memset(&stream, 0, sizeof(stream));`
zlib: wrap inflateInit2 used to accept only for gzip format http-backend.c uses inflateInit2() to tell the library that it wants to accept only gzip format. Wrap it in a helper function so that readers do not have to wonder what the magic numbers 15 and 16 are for. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 19:45:29 +02:00			`git_inflate_init_gzip_only(&stream);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`while (1) {`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`ssize_t n;`

			`if (buffer_input) {`
			`if (full_request)`
			`n = 0; /* nothing left to read */`
			`else`
			`n = read_request(0, &full_request);`
			`stream.next_in = full_request;`
			`} else {`
			`n = xread(0, in_buf, sizeof(in_buf));`
			`stream.next_in = in_buf;`
			`}`

Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`if (n <= 0)`
			`die("request ended in the middle of the gzip stream");`
			`stream.avail_in = n;`

			`while (0 < stream.avail_in) {`
			`int ret;`

			`stream.next_out = out_buf;`
			`stream.avail_out = sizeof(out_buf);`

zlib: wrap remaining calls to direct inflate/inflateEnd Two callsites in http-backend.c to inflate() and inflateEnd() were not using git_ prefixed versions. After this, running $ find all objects -print \| xargs nm -ugo \| grep inflate shows only zlib.c makes direct calls to zlib for inflate operation, except for a singlecall to inflateInit2 in http-backend.c Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 19:39:27 +02:00			`ret = git_inflate(&stream, Z_NO_FLUSH);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`if (ret != Z_OK && ret != Z_STREAM_END)`
			`die("zlib error inflating request, result %d", ret);`

			`n = stream.total_out - cnt;`
			`if (write_in_full(out, out_buf, n) != n)`
			`die("%s aborted reading request", prog_name);`
			`cnt += n;`

			`if (ret == Z_STREAM_END)`
			`goto done;`
			`}`
			`}`

			`done:`
zlib: wrap remaining calls to direct inflate/inflateEnd Two callsites in http-backend.c to inflate() and inflateEnd() were not using git_ prefixed versions. After this, running $ find all objects -print \| xargs nm -ugo \| grep inflate shows only zlib.c makes direct calls to zlib for inflate operation, except for a singlecall to inflateInit2 in http-backend.c Signed-off-by: Junio C Hamano <gitster@pobox.com> 2011-06-10 19:39:27 +02:00			`git_inflate_end(&stream);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`close(out);`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`free(full_request);`
			`}`

			`static void copy_request(const char *prog_name, int out)`
			`{`
			`unsigned char *buf;`
			`ssize_t n = read_request(0, &buf);`
			`if (n < 0)`
			`die_errno("error reading request body");`
			`if (write_in_full(out, buf, n) != n)`
			`die("%s aborted reading request", prog_name);`
			`close(out);`
			`free(buf);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`}`

http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`static void run_service(const char **argv, int buffer_input)`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`{`
			`const char *encoding = getenv("HTTP_CONTENT_ENCODING");`
			`const char *user = getenv("REMOTE_USER");`
			`const char *host = getenv("REMOTE_ADDR");`
			`int gzipped_request = 0;`
run-command: introduce CHILD_PROCESS_INIT Most struct child_process variables are cleared using memset first after declaration. Provide a macro, CHILD_PROCESS_INIT, that can be used to initialize them statically instead. That's shorter, doesn't require a function call and is slightly more readable (especially given that we already have STRBUF_INIT, ARGV_ARRAY_INIT etc.). Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-19 21:09:35 +02:00			`struct child_process cld = CHILD_PROCESS_INIT;`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`if (encoding && !strcmp(encoding, "gzip"))`
			`gzipped_request = 1;`
			`else if (encoding && !strcmp(encoding, "x-gzip"))`
			`gzipped_request = 1;`

			`if (!user \|\| !*user)`
			`user = "anonymous";`
			`if (!host \|\| !*host)`
			`host = "(none)";`

http-backend: respect existing GIT_COMMITTER_* variables The http-backend program sets default GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL variables based on the REMOTE_USER and REMOTE_ADDR variables provided by the webserver. However, it unconditionally overwrites any existing GIT_COMMITTER variables, which may have been customized by site-specific code in the webserver (or in a script wrapping http-backend). Let's leave those variables intact if they already exist, assuming that any such configuration was intentional. There is a slight chance of a regression if somebody has set GIT_COMMITTER_* for the entire webserver, not intending it to leak through http-backend. We could protect against this by passing the information in alternate variables. However, it seems unlikely that anyone will care about that regression, and there is value in the simplicity of using the common variable names that are used elsewhere in git. While we're tweaking the environment-handling in http-backend, let's switch it to use argv_array to handle the list of variables. That makes the memory management much simpler. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-30 09:01:30 +02:00			`if (!getenv("GIT_COMMITTER_NAME"))`
use env_array member of struct child_process Convert users of struct child_process to using the managed env_array for specifying environment variables instead of supplying an array on the stack or bringing their own argv_array. This shortens and simplifies the code and ensures automatically that the allocated memory is freed after use. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-10-19 13:14:20 +02:00			`argv_array_pushf(&cld.env_array, "GIT_COMMITTER_NAME=%s", user);`
http-backend: respect existing GIT_COMMITTER_* variables The http-backend program sets default GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL variables based on the REMOTE_USER and REMOTE_ADDR variables provided by the webserver. However, it unconditionally overwrites any existing GIT_COMMITTER variables, which may have been customized by site-specific code in the webserver (or in a script wrapping http-backend). Let's leave those variables intact if they already exist, assuming that any such configuration was intentional. There is a slight chance of a regression if somebody has set GIT_COMMITTER_* for the entire webserver, not intending it to leak through http-backend. We could protect against this by passing the information in alternate variables. However, it seems unlikely that anyone will care about that regression, and there is value in the simplicity of using the common variable names that are used elsewhere in git. While we're tweaking the environment-handling in http-backend, let's switch it to use argv_array to handle the list of variables. That makes the memory management much simpler. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2012-03-30 09:01:30 +02:00			`if (!getenv("GIT_COMMITTER_EMAIL"))`
use env_array member of struct child_process Convert users of struct child_process to using the managed env_array for specifying environment variables instead of supplying an array on the stack or bringing their own argv_array. This shortens and simplifies the code and ensures automatically that the allocated memory is freed after use. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-10-19 13:14:20 +02:00			`argv_array_pushf(&cld.env_array,`
			`"GIT_COMMITTER_EMAIL=%s@http.%s", user, host);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`cld.argv = argv;`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`if (buffer_input \|\| gzipped_request)`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`cld.in = -1;`
			`cld.git_cmd = 1;`
			`if (start_command(&cld))`
			`exit(1);`

			`close(1);`
			`if (gzipped_request)`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`inflate_request(argv[0], cld.in, buffer_input);`
			`else if (buffer_input)`
			`copy_request(argv[0], cld.in);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`else`
			`close(0);`

			`if (finish_command(&cld))`
			`exit(1);`
			`}`

http-backend: rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:55 +02:00			`static int show_text_ref(const char name, const struct object_id oid,`
			`int flag, void *cb_data)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`const char *name_nons = strip_namespace(name);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`struct strbuf *buf = cb_data;`
object: convert parse_object* to take struct object_id Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2017-05-07 00:10:38 +02:00			`struct object *o = parse_object(oid);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`if (!o)`
			`return 0;`

http-backend: rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:55 +02:00			`strbuf_addf(buf, "%s\t%s\n", oid_to_hex(oid), name_nons);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`if (o->type == OBJ_TAG) {`
			`o = deref_tag(o, name, 0);`
			`if (!o)`
			`return 0;`
Convert struct object to object_id struct object is one of the major data structures dealing with object IDs. Convert it to use struct object_id instead of an unsigned char array. Convert get_object_hash to refer to the new member as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net> 2015-11-10 03:22:28 +01:00			`strbuf_addf(buf, "%s\t%s^{}\n", oid_to_hex(&o->oid),`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`name_nons);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`
			`return 0;`
			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void get_info_refs(struct strbuf hdr, char arg)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`const char *service_name = get_parameter("service");`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`struct strbuf buf = STRBUF_INIT;`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`hdr_nocache(hdr);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`if (service_name) {`
			`const char argv[] = {NULL / service name */,`
			`"--stateless-rpc", "--advertise-refs",`
			`".", NULL};`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`struct rpc_service *svc = select_service(hdr, service_name);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`strbuf_addf(&buf, "application/x-git-%s-advertisement",`
			`svc->name);`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`hdr_str(hdr, content_type, buf.buf);`
			`end_headers(hdr);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
pkt-line: rename packet_write() to packet_write_fmt() packet_write() should be called packet_write_fmt() because it is a printf-like function that takes a format string as first parameter. packet_write_fmt() should be used for text strings only. Arbitrary binary data should use a new packet_write() function that is introduced in a subsequent patch. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-10-17 01:20:29 +02:00			`packet_write_fmt(1, "# service=git-%s\n", svc->name);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`packet_flush(1);`

			`argv[0] = svc->name;`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`run_service(argv, 0);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`} else {`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`select_getanyfile(hdr);`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`for_each_namespaced_ref(show_text_ref, &buf);`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`send_strbuf(hdr, "text/plain", &buf);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`}`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`strbuf_release(&buf);`
			`}`

http-backend: rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:55 +02:00			`static int show_head_ref(const char refname, const struct object_id oid,`
			`int flag, void *cb_data)`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`{`
			`struct strbuf *buf = cb_data;`

			`if (flag & REF_ISSYMREF) {`
show_head_ref(): convert local variable "unused" to object_id Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:56 +02:00			`struct object_id unused;`
refs.c: change resolve_ref_unsafe reading argument to be a flags field resolve_ref_unsafe takes a boolean argument for reading (a nonexistent ref resolves successfully for writing but not for reading). Change this to be a flags field instead, and pass the new constant RESOLVE_REF_READING when we want this behaviour. While at it, swap two of the arguments in the function to put output arguments at the end. As a nice side effect, this ensures that we can catch callers that were unaware of the new API so they can be audited. Give the wrapper functions resolve_refdup and read_ref_full the same treatment for consistency. Signed-off-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-07-15 21:59:36 +02:00			`const char *target = resolve_ref_unsafe(refname,`
			`RESOLVE_REF_READING,`
show_head_ref(): convert local variable "unused" to object_id Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:56 +02:00			`unused.hash, NULL);`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00
show_head_ref(): check the result of resolve_ref_namespace() Only use the result of resolve_ref_namespace() if it is non-NULL. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-04-07 21:03:09 +02:00			`if (target)`
			`strbuf_addf(buf, "ref: %s\n", strip_namespace(target));`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`} else {`
http-backend: rewrite to take an object_id argument Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-25 20:38:55 +02:00			`strbuf_addf(buf, "%s\n", oid_to_hex(oid));`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`}`

			`return 0;`
			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void get_head(struct strbuf hdr, char arg)`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`{`
			`struct strbuf buf = STRBUF_INIT;`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`select_getanyfile(hdr);`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`head_ref_namespaced(show_head_ref, &buf);`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`send_strbuf(hdr, "text/plain", &buf);`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`strbuf_release(&buf);`
			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void get_info_packs(struct strbuf hdr, char arg)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
			`size_t objdirlen = strlen(get_object_directory());`
			`struct strbuf buf = STRBUF_INIT;`
			`struct packed_git *p;`
			`size_t cnt = 0;`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`select_getanyfile(hdr);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`prepare_packed_git();`
			`for (p = packed_git; p; p = p->next) {`
			`if (p->pack_local)`
			`cnt++;`
			`}`

			`strbuf_grow(&buf, cnt * 53 + 2);`
			`for (p = packed_git; p; p = p->next) {`
			`if (p->pack_local)`
			`strbuf_addf(&buf, "P %s\n", p->pack_name + objdirlen + 6);`
			`}`
			`strbuf_addch(&buf, '\n');`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`hdr_nocache(hdr);`
			`send_strbuf(hdr, "text/plain; charset=utf-8", &buf);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`strbuf_release(&buf);`
			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void check_content_type(struct strbuf hdr, const char accepted_type)`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`{`
			`const char *actual_type = getenv("CONTENT_TYPE");`

			`if (!actual_type)`
			`actual_type = "";`

			`if (strcmp(actual_type, accepted_type)) {`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`http_status(hdr, 415, "Unsupported Media Type");`
			`hdr_nocache(hdr);`
			`end_headers(hdr);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`format_write(1,`
			`"Expected POST with Content-Type '%s',"`
			`" but received '%s' instead.\n",`
			`accepted_type, actual_type);`
			`exit(0);`
			`}`
			`}`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static void service_rpc(struct strbuf hdr, char service_name)`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`{`
			`const char *argv[] = {NULL, "--stateless-rpc", ".", NULL};`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`struct rpc_service *svc = select_service(hdr, service_name);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`struct strbuf buf = STRBUF_INIT;`

			`strbuf_reset(&buf);`
			`strbuf_addf(&buf, "application/x-git-%s-request", svc->name);`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`check_content_type(hdr, buf.buf);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`hdr_nocache(hdr);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`strbuf_reset(&buf);`
			`strbuf_addf(&buf, "application/x-git-%s-result", svc->name);`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`hdr_str(hdr, content_type, buf.buf);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`end_headers(hdr);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00
			`argv[0] = svc->name;`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`run_service(argv, svc->buffer_input);`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`strbuf_release(&buf);`
			`}`

http-backend: fix die recursion with custom handler When we die() in http-backend, we call a custom handler that writes an HTTP 500 response to stdout, then reports the error to stderr. Our routines for writing out the HTTP response may themselves die, leading to us entering die() again. When it was originally written, that was OK; our custom handler keeps a variable to notice this and does not recurse. However, since cd163d4 (usage.c: detect recursion in die routines and bail out immediately, 2012-11-14), the main die() implementation detects recursion before we even get to our custom handler, and bails without printing anything useful. We can handle this case by doing two things: 1. Installing a custom die_is_recursing handler that allows us to enter up to one level of recursion. Only the first call to our custom handler will try to write out the error response. So if we die again, that is OK. If we end up dying more than that, it is a sign that we are in an infinite recursion. 2. Reporting the error to stderr before trying to write out the HTTP response. In the current code, if we do die() trying to write out the response, we'll exit immediately from this second die(), and never get a chance to output the original error (which is almost certainly the more interesting one; the second die is just going to be along the lines of "I tried to write to stdout but it was closed"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-15 08:29:27 +02:00			`static int dead;`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`static NORETURN void die_webcgi(const char *err, va_list params)`
			`{`
http-backend: fix die recursion with custom handler When we die() in http-backend, we call a custom handler that writes an HTTP 500 response to stdout, then reports the error to stderr. Our routines for writing out the HTTP response may themselves die, leading to us entering die() again. When it was originally written, that was OK; our custom handler keeps a variable to notice this and does not recurse. However, since cd163d4 (usage.c: detect recursion in die routines and bail out immediately, 2012-11-14), the main die() implementation detects recursion before we even get to our custom handler, and bails without printing anything useful. We can handle this case by doing two things: 1. Installing a custom die_is_recursing handler that allows us to enter up to one level of recursion. Only the first call to our custom handler will try to write out the error response. So if we die again, that is OK. If we end up dying more than that, it is a sign that we are in an infinite recursion. 2. Reporting the error to stderr before trying to write out the HTTP response. In the current code, if we do die() trying to write out the response, we'll exit immediately from this second die(), and never get a chance to output the original error (which is almost certainly the more interesting one; the second die is just going to be along the lines of "I tried to write to stdout but it was closed"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-15 08:29:27 +02:00			`if (dead <= 1) {`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`struct strbuf hdr = STRBUF_INIT;`

http-backend: fix die recursion with custom handler When we die() in http-backend, we call a custom handler that writes an HTTP 500 response to stdout, then reports the error to stderr. Our routines for writing out the HTTP response may themselves die, leading to us entering die() again. When it was originally written, that was OK; our custom handler keeps a variable to notice this and does not recurse. However, since cd163d4 (usage.c: detect recursion in die routines and bail out immediately, 2012-11-14), the main die() implementation detects recursion before we even get to our custom handler, and bails without printing anything useful. We can handle this case by doing two things: 1. Installing a custom die_is_recursing handler that allows us to enter up to one level of recursion. Only the first call to our custom handler will try to write out the error response. So if we die again, that is OK. If we end up dying more than that, it is a sign that we are in an infinite recursion. 2. Reporting the error to stderr before trying to write out the HTTP response. In the current code, if we do die() trying to write out the response, we'll exit immediately from this second die(), and never get a chance to output the original error (which is almost certainly the more interesting one; the second die is just going to be along the lines of "I tried to write to stdout but it was closed"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-15 08:29:27 +02:00			`vreportf("fatal: ", err, params);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`http_status(&hdr, 500, "Internal Server Error");`
			`hdr_nocache(&hdr);`
			`end_headers(&hdr);`
http-backend: Don't infinite loop during die() If stdout has already been closed by the CGI and die() gets called, the CGI will fail to write the "Status: 500 Internal Server Error" to the pipe, which results in die() being called again (via safe_write). This goes on in an infinite loop until the stack overflows and the process is killed by SIGSEGV. Instead set a flag on the first die() invocation and if we came back to the handler, just die silently, as it only means we failed to report the failure---we cannot report anything anyway in such a case. This way failures to write the error messages to the stdout pipe do not result in an infinite loop. We also now report on the death to stderr before we report to stdout, to increase the chances that the cause of the die() invocation will appear in the server's error log. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> fixup! http-backend.c: Don't infinite loop Now die_webcgi() actually can return during a recursive call into it, causing http-backend.c:554: error: 'noreturn' function does return The only reason we would come back to the die handler is because we failed during it, so we cannot report anything anyway. Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-03-22 15:22:04 +01:00			`}`
			`exit(0); /* we successfully reported a failure ;-) */`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`}`

http-backend: fix die recursion with custom handler When we die() in http-backend, we call a custom handler that writes an HTTP 500 response to stdout, then reports the error to stderr. Our routines for writing out the HTTP response may themselves die, leading to us entering die() again. When it was originally written, that was OK; our custom handler keeps a variable to notice this and does not recurse. However, since cd163d4 (usage.c: detect recursion in die routines and bail out immediately, 2012-11-14), the main die() implementation detects recursion before we even get to our custom handler, and bails without printing anything useful. We can handle this case by doing two things: 1. Installing a custom die_is_recursing handler that allows us to enter up to one level of recursion. Only the first call to our custom handler will try to write out the error response. So if we die again, that is OK. If we end up dying more than that, it is a sign that we are in an infinite recursion. 2. Reporting the error to stderr before trying to write out the HTTP response. In the current code, if we do die() trying to write out the response, we'll exit immediately from this second die(), and never get a chance to output the original error (which is almost certainly the more interesting one; the second die is just going to be along the lines of "I tried to write to stdout but it was closed"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-15 08:29:27 +02:00			`static int die_webcgi_recursing(void)`
			`{`
			`return dead++ > 1;`
			`}`

http-backend: add GIT_PROJECT_ROOT environment var Add a new environment variable, GIT_PROJECT_ROOT, to override the method of using PATH_TRANSLATED to find the git repository on disk. This makes it much easier to configure the web server, especially when the web server's DocumentRoot does not contain the git repositories, which is the usual case. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:35 +01:00			`static char* getdir(void)`
			`{`
			`struct strbuf buf = STRBUF_INIT;`
			`char *pathinfo = getenv("PATH_INFO");`
			`char *root = getenv("GIT_PROJECT_ROOT");`
			`char *path = getenv("PATH_TRANSLATED");`

			`if (root && *root) {`
			`if (!pathinfo \|\| !*pathinfo)`
			`die("GIT_PROJECT_ROOT is set but PATH_INFO is not");`
http-backend: Protect GIT_PROJECT_ROOT from /../ requests Eons ago HPA taught git-daemon how to protect itself from /../ attacks, which Junio brought back into service in d79374c7b58d ("daemon.c and path.enter_repo(): revamp path validation"). I did not carry this into git-http-backend as originally we relied only upon PATH_TRANSLATED, and assumed the HTTP server had done its access control checks to validate the resolved path was within a directory permitting access from the remote client. This would usually be sufficient to protect a server from requests for its /etc/passwd file by http://host/smart/../etc/passwd sorts of URLs. However in 917adc036086 Mark Lodato added GIT_PROJECT_ROOT as an additional method of configuring the CGI. When this environment variable is used the web server does not generate the final access path and therefore may blindly pass through "/../etc/passwd" in PATH_INFO under the assumption that "/../" might have special meaning to the invoked CGI. Instead of permitting these sorts of malformed path requests, we now reject them back at the client, with an error message for the server log. This matches git-daemon behavior. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-09 20:26:43 +01:00			`if (daemon_avoid_alias(pathinfo))`
			`die("'%s': aliased", pathinfo);`
http-backend: use end_url_with_slash() Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2010-11-25 09:21:06 +01:00			`end_url_with_slash(&buf, root);`
http-backend: Protect GIT_PROJECT_ROOT from /../ requests Eons ago HPA taught git-daemon how to protect itself from /../ attacks, which Junio brought back into service in d79374c7b58d ("daemon.c and path.enter_repo(): revamp path validation"). I did not carry this into git-http-backend as originally we relied only upon PATH_TRANSLATED, and assumed the HTTP server had done its access control checks to validate the resolved path was within a directory permitting access from the remote client. This would usually be sufficient to protect a server from requests for its /etc/passwd file by http://host/smart/../etc/passwd sorts of URLs. However in 917adc036086 Mark Lodato added GIT_PROJECT_ROOT as an additional method of configuring the CGI. When this environment variable is used the web server does not generate the final access path and therefore may blindly pass through "/../etc/passwd" in PATH_INFO under the assumption that "/../" might have special meaning to the invoked CGI. Instead of permitting these sorts of malformed path requests, we now reject them back at the client, with an error message for the server log. This matches git-daemon behavior. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-09 20:26:43 +01:00			`if (pathinfo[0] == '/')`
			`pathinfo++;`
http-backend: add GIT_PROJECT_ROOT environment var Add a new environment variable, GIT_PROJECT_ROOT, to override the method of using PATH_TRANSLATED to find the git repository on disk. This makes it much easier to configure the web server, especially when the web server's DocumentRoot does not contain the git repositories, which is the usual case. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:35 +01:00			`strbuf_addstr(&buf, pathinfo);`
			`return strbuf_detach(&buf, NULL);`
			`} else if (path && *path) {`
			`return xstrdup(path);`
			`} else`
			`die("No GIT_PROJECT_ROOT or PATH_TRANSLATED from server");`
			`return NULL;`
			`}`

Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`static struct service_cmd {`
			`const char *method;`
			`const char *pattern;`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`void (imp)(struct strbuf , char *);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`} services[] = {`
http-backend: respect GIT_NAMESPACE with dumb clients Filter the list of refs returned via the dumb HTTP protocol according to the active namespace, consistent with other clients of the upload-pack service. Signed-off-by: John Koleszar <jkoleszar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2013-04-10 02:55:08 +02:00			`{"GET", "/HEAD$", get_head},`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{"GET", "/info/refs$", get_info_refs},`
			`{"GET", "/objects/info/alternates$", get_text_file},`
			`{"GET", "/objects/info/http-alternates$", get_text_file},`
			`{"GET", "/objects/info/packs$", get_info_packs},`
			`{"GET", "/objects/[0-9a-f]{2}/[0-9a-f]{38}$", get_loose_object},`
			`{"GET", "/objects/pack/pack-[0-9a-f]{40}\\.pack$", get_pack_file},`
Smart fetch and push over HTTP: server side Requests for $GIT_URL/git-receive-pack and $GIT_URL/git-upload-pack are forwarded to the corresponding backend process by directly executing it and leaving stdin and stdout connected to the invoking web server. Prior to starting the backend process the HTTP response headers are sent, thereby freeing the backend from needing to know about the HTTP protocol. Requests that are encoded with Content-Encoding: gzip are automatically inflated before being streamed into the backend. This is primarily useful for the git-upload-pack backend, which receives highly repetitive text data from clients that easily compresses to 50% of its original size. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:34 +01:00			`{"GET", "/objects/pack/pack-[0-9a-f]{40}\\.idx$", get_idx_file},`

			`{"POST", "/git-upload-pack$", service_rpc},`
			`{"POST", "/git-receive-pack$", service_rpc}`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`};`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`static int bad_request(struct strbuf hdr, const struct service_cmd c)`
			`{`
			`const char *proto = getenv("SERVER_PROTOCOL");`

			`if (proto && !strcmp(proto, "HTTP/1.1")) {`
			`http_status(hdr, 405, "Method Not Allowed");`
			`hdr_str(hdr, "Allow",`
			`!strcmp(c->method, "GET") ? "GET, HEAD" : c->method);`
			`} else`
			`http_status(hdr, 400, "Bad Request");`
			`hdr_nocache(hdr);`
			`end_headers(hdr);`
			`return 0;`
			`}`

add an extra level of indirection to main() There are certain startup tasks that we expect every git process to do. In some cases this is just to improve the quality of the program (e.g., setting up gettext()). In others it is a requirement for using certain functions in libgit.a (e.g., system_path() expects that you have called git_extract_argv0_path()). Most commands are builtins and are covered by the git.c version of main(). However, there are still a few external commands that use their own main(). Each of these has to remember to include the correct startup sequence, and we are not always consistent. Rather than just fix the inconsistencies, let's make this harder to get wrong by providing a common main() that can run this standard startup. We basically have two options to do this: - the compat/mingw.h file already does something like this by adding a #define that replaces the definition of main with a wrapper that calls mingw_startup(). The upside is that the code in each program doesn't need to be changed at all; it's rewritten on the fly by the preprocessor. The downside is that it may make debugging of the startup sequence a bit more confusing, as the preprocessor is quietly inserting new code. - the builtin functions are all of the form cmd_foo(), and git.c's main() calls them. This is much more explicit, which may make things more obvious to somebody reading the code. It's also more flexible (because of course we have to figure out _which_ cmd_foo() to call). The downside is that each of the builtins must define cmd_foo(), instead of just main(). This patch chooses the latter option, preferring the more explicit approach, even though it is more invasive. We introduce a new file common-main.c, with the "real" main. It expects to call cmd_main() from whatever other objects it is linked against. We link common-main.o against anything that links against libgit.a, since we know that such programs will need to do this setup. Note that common-main.o can't actually go inside libgit.a, as the linker would not pick up its main() function automatically (it has no callers). The rest of the patch is just adjusting all of the various external programs (mostly in t/helper) to use cmd_main(). I've provided a global declaration for cmd_main(), which means that all of the programs also need to match its signature. In particular, many functions need to switch to "const char " instead of "char " for argv. This effect ripples out to a few other variables and functions, as well. This makes the patch even more invasive, but the end result is much better. We should be treating argv strings as const anyway, and now all programs conform to the same signature (which also matches the way builtins are defined). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-07-01 07:58:58 +02:00			`int cmd_main(int argc, const char **argv)`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`{`
			`char *method = getenv("REQUEST_METHOD");`
http-backend: add GIT_PROJECT_ROOT environment var Add a new environment variable, GIT_PROJECT_ROOT, to override the method of using PATH_TRANSLATED to find the git repository on disk. This makes it much easier to configure the web server, especially when the web server's DocumentRoot does not contain the git repositories, which is the usual case. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:35 +01:00			`char *dir;`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`struct service_cmd *cmd = NULL;`
			`char *cmd_arg = NULL;`
			`int i;`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`struct strbuf hdr = STRBUF_INIT;`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
			`set_die_routine(die_webcgi);`
http-backend: fix die recursion with custom handler When we die() in http-backend, we call a custom handler that writes an HTTP 500 response to stdout, then reports the error to stderr. Our routines for writing out the HTTP response may themselves die, leading to us entering die() again. When it was originally written, that was OK; our custom handler keeps a variable to notice this and does not recurse. However, since cd163d4 (usage.c: detect recursion in die routines and bail out immediately, 2012-11-14), the main die() implementation detects recursion before we even get to our custom handler, and bails without printing anything useful. We can handle this case by doing two things: 1. Installing a custom die_is_recursing handler that allows us to enter up to one level of recursion. Only the first call to our custom handler will try to write out the error response. So if we die again, that is OK. If we end up dying more than that, it is a sign that we are in an infinite recursion. 2. Reporting the error to stderr before trying to write out the HTTP response. In the current code, if we do die() trying to write out the response, we'll exit immediately from this second die(), and never get a chance to output the original error (which is almost certainly the more interesting one; the second die is just going to be along the lines of "I tried to write to stdout but it was closed"). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-15 08:29:27 +02:00			`set_die_is_recursing_routine(die_webcgi_recursing);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
			`if (!method)`
			`die("No REQUEST_METHOD from server");`
			`if (!strcmp(method, "HEAD"))`
			`method = "GET";`
http-backend: add GIT_PROJECT_ROOT environment var Add a new environment variable, GIT_PROJECT_ROOT, to override the method of using PATH_TRANSLATED to find the git repository on disk. This makes it much easier to configure the web server, especially when the web server's DocumentRoot does not contain the git repositories, which is the usual case. Signed-off-by: Mark Lodato <lodatom@gmail.com> Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:35 +01:00			`dir = getdir();`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
			`for (i = 0; i < ARRAY_SIZE(services); i++) {`
			`struct service_cmd *c = &services[i];`
			`regex_t re;`
			`regmatch_t out[1];`

			`if (regcomp(&re, c->pattern, REG_EXTENDED))`
			`die("Bogus regex in service table: %s", c->pattern);`
			`if (!regexec(&re, dir, 1, out, 0)) {`
http-backend: Fix access beyond end of string. Found with valgrind while looking for Content-Length corruption in smart http. Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-14 22:10:57 +01:00			`size_t n;`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`if (strcmp(method, c->method))`
			`return bad_request(&hdr, c);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
			`cmd = c;`
http-backend: Fix access beyond end of string. Found with valgrind while looking for Content-Length corruption in smart http. Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-11-14 22:10:57 +01:00			`n = out[0].rm_eo - out[0].rm_so;`
use xmemdupz() to allocate copies of strings given by start and length Use xmemdupz() to allocate the memory, copy the data and make sure to NUL-terminate the result, all in one step. The resulting code is shorter, doesn't contain the constants 1 and '\0', and avoids duplicating function parameters. For blame, the last copied byte (o->file.ptr[o->file.size]) is always set to NUL by fake_working_tree_commit() or read_sha1_file(), so no information is lost by the conversion to using xmemdupz(). Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-07-19 17:35:34 +02:00			`cmd_arg = xmemdupz(dir + out[0].rm_so + 1, n - 1);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`dir[out[0].rm_so] = 0;`
			`break;`
			`}`
			`regfree(&re);`
			`}`

			`if (!cmd)`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`not_found(&hdr, "Request not supported: '%s'", dir);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
			`setup_path();`
			`if (!enter_repo(dir, 0))`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`not_found(&hdr, "Not a git repository: '%s'", dir);`
Smart-http: check if repository is OK to export before serving it Similar to how git-daemon checks whether a repository is OK to be exported, smart-http should also check. This check can be satisfied in two different ways: the environmental variable GIT_HTTP_EXPORT_ALL may be set to export all repositories, or the individual repository may have the file git-daemon-export-ok. Acked-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Tarmigan Casebolt <tarmigan+git@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-12-28 22:49:00 +01:00			`if (!getenv("GIT_HTTP_EXPORT_ALL") &&`
			`access("git-daemon-export-ok", F_OK) )`
http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`not_found(&hdr, "Repository not exported: '%s'", dir);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00
http-backend.c: replace `git_config()` with `git_config_get_bool()` family Use `git_config_get_bool()` family instead of `git_config()` to take advantage of the config-set API which provides a cleaner control flow. Signed-off-by: Tanay Abhra <tanayabh@gmail.com> Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2014-08-07 18:21:17 +02:00			`http_config();`
http-backend: spool ref negotiation requests to buffer When http-backend spawns "upload-pack" to do ref negotiation, it streams the http request body to upload-pack, who then streams the http response back to the client as it reads. In theory, git can go full-duplex; the client can consume our response while it is still sending the request. In practice, however, HTTP is a half-duplex protocol. Even if our client is ready to read and write simultaneously, we may have other HTTP infrastructure in the way, including the webserver that spawns our CGI, or any intermediate proxies. In at least one documented case[1], this leads to deadlock when trying a fetch over http. What happens is basically: 1. Apache proxies the request to the CGI, http-backend. 2. http-backend gzip-inflates the data and sends the result to upload-pack. 3. upload-pack acts on the data and generates output over the pipe back to Apache. Apache isn't reading because it's busy writing (step 1). This works fine most of the time, because the upload-pack output ends up in a system pipe buffer, and Apache reads it as soon as it finishes writing. But if both the request and the response exceed the system pipe buffer size, then we deadlock (Apache blocks writing to http-backend, http-backend blocks writing to upload-pack, and upload-pack blocks writing to Apache). We need to break the deadlock by spooling either the input or the output. In this case, it's ideal to spool the input, because Apache does not start reading either stdout _or_ stderr until we have consumed all of the input. So until we do so, we cannot even get an error message out to the client. The solution is fairly straight-forward: we read the request body into an in-memory buffer in http-backend, freeing up Apache, and then feed the data ourselves to upload-pack. But there are a few important things to note: 1. We limit the in-memory buffer to prevent an obvious denial-of-service attack. This is a new hard limit on requests, but it's unlikely to come into play. The default value is 10MB, which covers even the ridiculous 100,000-ref negotation in the included test (that actually caps out just over 5MB). But it's configurable on the off chance that you don't mind spending some extra memory to make even ridiculous requests work. 2. We must take care only to buffer when we have to. For pushes, the incoming packfile may be of arbitrary size, and we should connect the input directly to receive-pack. There's no deadlock problem here, though, because we do not produce any output until the whole packfile has been read. For upload-pack's initial ref advertisement, we similarly do not need to buffer. Even though we may generate a lot of output, there is no request body at all (i.e., it is a GET, not a POST). [1] http://article.gmane.org/gmane.comp.version-control.git/269020 Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2015-05-20 09:37:09 +02:00			`max_request_buffer = git_env_ulong("GIT_HTTP_MAX_REQUEST_BUFFER",`
			`max_request_buffer);`

http-backend: buffer headers before sending Avoid waking up the readers for unnecessary context switches for each line of header data being written, as all the headers are written in short succession. It is unlikely any HTTP/1.x server would want to read a CGI response one-line-at-a-time and trickle each to the client. Instead, I'd expect HTTP servers want to minimize syscall and TCP/IP framing overhead by trying to send all of its response headers in a single syscall or even combining the headers and first chunk of the body with MSG_MORE or writev. Verified by strace-ing response parsing on the CGI side. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2016-08-10 01:47:31 +02:00			`cmd->imp(&hdr, cmd_arg);`
Git-aware CGI to provide dumb HTTP transport The git-http-backend CGI can be configured into any Apache server using ScriptAlias, such as with the following configuration: LoadModule cgi_module /usr/libexec/apache2/mod_cgi.so LoadModule alias_module /usr/libexec/apache2/mod_alias.so ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/ Repositories are accessed via the translated PATH_INFO. The CGI is backwards compatible with the dumb client, allowing all older HTTP clients to continue to download repositories which are managed by the CGI. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com> 2009-10-31 01:47:32 +01:00			`return 0;`
			`}`